Consider me a n00b, but I have searched for my specific query here and haven't found an answer yet. My problem is as follows. Consider the following simplified CSV file, r_split.csv, which represents my dataset:
id,v1,v2,v3,v4,str
1,2.4,2.4,345.5,234.2,gbbc
2,4.5,2.56,7.45,34.6,ebird
3,3.4,5.6,4.45,6.3,ebird_can
The first row contains the header names. The column str contains 3 different string values: gbbc, ebird, and ebird_can. My objective is to split this big dataset into 2 datasets. The first will contain only the rows where str is gbbc, and the second will contain the rows where str is ebird or ebird_can, combined under the name allebird.
I can split the dataset into 3 distinct datasets by using the following command:
splitted<-split(rsplit,rsplit$str)
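For reference, here is a self-contained sketch of that step. It assumes the file is read with read.csv into a data frame called rsplit; the inline text= argument only stands in for r_split.csv so the snippet runs on its own.

```r
# Inline copy of the sample data; in practice this would be read.csv("r_split.csv")
rsplit <- read.csv(text = "id,v1,v2,v3,v4,str
1,2.4,2.4,345.5,234.2,gbbc
2,4.5,2.56,7.45,34.6,ebird
3,3.4,5.6,4.45,6.3,ebird_can", stringsAsFactors = FALSE)

# split() returns a named list holding one data frame per distinct value of str
splitted <- split(rsplit, rsplit$str)

# The list has three elements, named after the three str values
names(splitted)
```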
However, I cannot figure out how to take 2 of the distinct values of the str column and combine them into a single dataset. Can someone help me out please?
Thanks.
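One possible approach, as a minimal sketch rather than a definitive answer: build a separate grouping vector that maps both ebird and ebird_can to allebird, then pass that vector to split() instead of the str column. The names group and splitted2 are made up for illustration, and the inline data again stands in for r_split.csv.

```r
# Inline copy of the sample data; in practice this would be read.csv("r_split.csv")
rsplit <- read.csv(text = "id,v1,v2,v3,v4,str
1,2.4,2.4,345.5,234.2,gbbc
2,4.5,2.56,7.45,34.6,ebird
3,3.4,5.6,4.45,6.3,ebird_can", stringsAsFactors = FALSE)

# Hypothetical grouping vector: ebird and ebird_can both map to "allebird",
# every other value (here only gbbc) keeps its own name.
# as.character() guards against str having been read in as a factor.
str_chr <- as.character(rsplit$str)
group <- ifelse(str_chr %in% c("ebird", "ebird_can"), "allebird", str_chr)

# Split on the grouping vector instead of on str itself
splitted2 <- split(rsplit, group)

splitted2$gbbc      # rows where str is gbbc
splitted2$allebird  # rows where str is ebird or ebird_can
```

Note that the str column inside splitted2$allebird still holds the original ebird and ebird_can values. If the values themselves should be renamed, assigning rsplit$str <- group before calling split() would do that.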