I have a dataframe in R with this structure:
df1<-data.frame(SiteID=c("A","A","A","B","B","C"),Unrelated=c("dog","cat","catamount","bird","horse","monkey"),AirMonitor=c(1,0,0,0,0,1),WaterMonitor=c(0,1,0,1,0,0),SoilMonitor=c(0,0,1,0,1,0))
with output like this:
SiteID Unrelated AirMonitor WaterMonitor SoilMonitor
1 A dog 1 0 0
2 A cat 0 1 0
3 A catamount 0 0 1
4 B bird 0 1 0
5 B horse 0 0 1
6 C monkey 1 0 0
1
indicates the presence of that type of monitor at that site (I could change the monitor columns to factors with "Y"
and "N"
values instead if needed). Basically, I want to aggregate monitor presence by SiteID
but then keep all the rows in order not to lose the data in the Unrelated
column. I want 1
values to win over 0
values. For example, if any row with SiteID
A has a 1
for AirMonitor
, I want all rows with SiteID
A to have a 1
for AirMonitor
. Same for the other two monitor types.
Desired output:
SiteID Unrelated AirMonitor WaterMonitor SoilMonitor
1 A dog 1 1 1
2 A cat 1 1 1
3 A catamount 1 1 1
4 B bird 0 1 1
5 B horse 0 1 1
6 C monkey 1 0 0
The real dataset has several more unrelated columns and thousands of rows. There must be some simple way to do this (maybe with aggregate
?).