Hi I have a dataset of freshwater fish in a range of sites, with repeat monthly visits over many years. Each row has the species found, the total and the status (i.e. positive or negative for a test result).
sample_ID site coll_date species TOT inf_status
382870 site 1 27/10/2007 Species B 1 positive
382872 site 2 27/10/2007 Species D 1 positive
487405 site 3 28/10/2007 Species A 1 positive
487405 site 3 28/10/2007 Species A 1 positive
382899 site 4 03/11/2007 Species A 1 positive
382900 site 5 03/11/2007 Species A 1 positive
382901 site 5 03/11/2007 Species A 1 positive
382902 site 6 03/11/2007 Species A 1 positive
382903 site 7 09/12/2007 Species B 1 positive
382904 site 8 05/02/2008 Species C 9 negative
382905 site 8 05/02/2008 Species A 13 negative
382906 site 9 14/02/2008 Species A 1 positive
382907 site 9 14/02/2008 Species A 1 positive
I need to reformat the data so that there is just one row per site visit (i.e. in a given site name and date combo) with columns for total found by species and the fish status (i.e. speciesA_pos, SpeciesA_neg, Sp_B_pos.. etc).
site coll_date SP_A_pos SP_A_neg SP_B_pos SP_B_neg SP_C_pos SP_C_neg SP_D_pos SP_D_neg
site 1 27/10/2007 0 0 1 0 0 0 0 0
site 2 27/10/2007 0 0 0 0 0 0 1 0
site 3 28/10/2007 3 0 0 0 0 0 0 0
site 4 03/11/2007 1 0 0 0 0 0 0 0
site 5 03/11/2007 2 0 0 0 0 0 0 0
site 6 03/11/2007 1 0 0 0 0 0 0 0
site 7 09/12/2007 0 0 1 0 0 0 0 0
site 8 05/02/2008 0 13 0 0 0 9 0 0
site 9 14/02/2008 2 0 0 0 0 0 0 0
figured I could use the reshape function but still need to sum within site visits as reshape will take the first row. My thoughts were to use split/apply/aggregate/for loops etc but tried various combinations and not getting anywhere. apologies I'm not familiar with R. any comments appreciated!