I have the following problem: in a csv-file I have a column for species, one for transect, one for the year and one for the AUC. In another csv-file I have a column for transect, one for the year, one for precipitation and one for temperature. Now I would like to join the files in R in a way, that I can have the columns for species and AUC from the second csv and the columns for the rest from the first csv. In the end I'd like to get a file with transect_id, year, day, month, species, regional_gam(=AUC), precipitation and LST(=temperature). So basically the precipitation-/ LST-values from TR001 for every day in 2008 need to be assigned to every species which has an AUC-value for 2008 and TR001. Thanks!
Asked
Active
Viewed 2.7k times
4
-
1Look into `read.csv` and `merge` -- those two functions should do the trick. – Eric Brooks Jul 15 '15 at 12:04
-
3In addition to @EricBrooks' comment, it would be helpful if you could provide a (clearly formatted) sample of the data of your current input file and the expected output. – Juliën Jul 15 '15 at 12:05
-
transect_id,species,year,regional_gam,prop_pheno_sampled TR002,Allancastria cerisyi,2010,0,0.594080041 TR014,Allancastria cerisyi,2010,27,0.119491204 TR012,Allancastria cerisyi,2011,22.4,0.575799283 that's the AUC-data and that: transect_id,year,day,month,LST,precipitation TR001,2008,1,1,11.47578335,0 TR002,2008,1,1,11.47576523,0 TR007,2008,1,1,6.603372097,0 TR009,2008,1,1,13.25433731,0 is the weather data. Thanks for the tipp, I know the two functions, but I didn't get it so far. But then I know I have to try further in that direction. – jewish.cowboy Jul 15 '15 at 12:40
-
Sorry for the crappy formatting, it's my first time using the "ask a question"-part of this page. – jewish.cowboy Jul 15 '15 at 12:41
-
1Then just use the edit button and add your data to your original post and use the code highlighting {} :-) – drmariod Jul 15 '15 at 13:03
-
use dput(head(data)) for each data set and provide both. – Ricardo Oliveros-Ramos Jul 15 '15 at 13:24
3 Answers
4
Use read.csv
and then merge
.
Load the two csv files into R. (Don't forget to make sure their common variables share the same name!).
df1<-read.csv(dat1,head=T)
df2<-read.csv(dat2,head=T)
Merge the dataframes together by their shared variables and add argument all.x=T (the default) to ensure all rows are kept from your database containing species.
merge(df1,df2,by=c('transect_id','year'),all.x=T)
To see this in action using test data:
test<-data.frame(sp=c(rep(letters[1:10],2)),t=c(rep(1:3,2,20)),y=c(rep(2000:2008,len=20)),AUC=1:20)
test2<-data.frame(t=c(rep(1:3,2,9)),y=c(rep(2000:2008,len=9)),ppt=c(1:9),temp=c(11:19))
merge(test,test2,by=c('t','y'),all.x=T)

theforestecologist
- 4,667
- 5
- 54
- 91
-
Hei, thanks, I tried it and after some changes in my dataset it actually worked out with merge as I imagined :) Thanks a lot! – jewish.cowboy Jul 17 '15 at 13:40
-
Sure thing! Also, so you're aware: By default the data frames are merged on the columns with names they both have, but separate specifications of the columns can be given by `by.x` and `by.y`. If you can, please accept this answer :). – theforestecologist Jul 17 '15 at 17:31
0
Please use
library(dplyr)
df1<- read.csv("F:\\Test_Anything\\Merge\\1.csv" , head(T))
df2<-read.csv("F:\\Test_Anything\\Merge\\2.csv" , head(T))
r <- merge(df1,df2,by=c('NAME','NAME'),all.x=T)
write.csv(r,"F:\\Test_Anything\\Merge\\DF.csv" , all(T) )

B. Go
- 1,436
- 4
- 15
- 22
0
In general, to merge .csv
files, you can simply use this code snip:
path <- rstudioapi::getActiveDocumentContext()$path
Encoding(path) <- "UTF-8"
setwd(dirname(path))
datap1 = read.csv("file1.csv", header=TRUE)
datap2 = read.csv("file2.csv", header=TRUE)
data <- rbind(datap1, datap2)
write.csv(data,"merged.csv")
Note: First 3 lines of code set the working directory to where the R file is located and are not related to the question.

Khalil Youssefi
- 385
- 6
- 10