4

I have the following problem: in a csv-file I have a column for species, one for transect, one for the year and one for the AUC. In another csv-file I have a column for transect, one for the year, one for precipitation and one for temperature. Now I would like to join the files in R in a way, that I can have the columns for species and AUC from the second csv and the columns for the rest from the first csv. In the end I'd like to get a file with transect_id, year, day, month, species, regional_gam(=AUC), precipitation and LST(=temperature). So basically the precipitation-/ LST-values from TR001 for every day in 2008 need to be assigned to every species which has an AUC-value for 2008 and TR001. Thanks!

costebk08
  • 1,299
  • 4
  • 17
  • 42
jewish.cowboy
  • 119
  • 2
  • 3
  • 9
  • 1
    Look into `read.csv` and `merge` -- those two functions should do the trick. – Eric Brooks Jul 15 '15 at 12:04
  • 3
    In addition to @EricBrooks' comment, it would be helpful if you could provide a (clearly formatted) sample of the data of your current input file and the expected output. – Juliën Jul 15 '15 at 12:05
  • transect_id,species,year,regional_gam,prop_pheno_sampled TR002,Allancastria cerisyi,2010,0,0.594080041 TR014,Allancastria cerisyi,2010,27,0.119491204 TR012,Allancastria cerisyi,2011,22.4,0.575799283 that's the AUC-data and that: transect_id,year,day,month,LST,precipitation TR001,2008,1,1,11.47578335,0 TR002,2008,1,1,11.47576523,0 TR007,2008,1,1,6.603372097,0 TR009,2008,1,1,13.25433731,0 is the weather data. Thanks for the tipp, I know the two functions, but I didn't get it so far. But then I know I have to try further in that direction. – jewish.cowboy Jul 15 '15 at 12:40
  • Sorry for the crappy formatting, it's my first time using the "ask a question"-part of this page. – jewish.cowboy Jul 15 '15 at 12:41
  • 1
    Then just use the edit button and add your data to your original post and use the code highlighting {} :-) – drmariod Jul 15 '15 at 13:03
  • use dput(head(data)) for each data set and provide both. – Ricardo Oliveros-Ramos Jul 15 '15 at 13:24

3 Answers3

4

Use read.csv and then merge.

Load the two csv files into R. (Don't forget to make sure their common variables share the same name!).

df1<-read.csv(dat1,head=T)
df2<-read.csv(dat2,head=T)

Merge the dataframes together by their shared variables and add argument all.x=T (the default) to ensure all rows are kept from your database containing species.

merge(df1,df2,by=c('transect_id','year'),all.x=T)

To see this in action using test data:

test<-data.frame(sp=c(rep(letters[1:10],2)),t=c(rep(1:3,2,20)),y=c(rep(2000:2008,len=20)),AUC=1:20)
test2<-data.frame(t=c(rep(1:3,2,9)),y=c(rep(2000:2008,len=9)),ppt=c(1:9),temp=c(11:19))

merge(test,test2,by=c('t','y'),all.x=T)
theforestecologist
  • 4,667
  • 5
  • 54
  • 91
  • Hei, thanks, I tried it and after some changes in my dataset it actually worked out with merge as I imagined :) Thanks a lot! – jewish.cowboy Jul 17 '15 at 13:40
  • Sure thing! Also, so you're aware: By default the data frames are merged on the columns with names they both have, but separate specifications of the columns can be given by `by.x` and `by.y`. If you can, please accept this answer :). – theforestecologist Jul 17 '15 at 17:31
0

Please use

library(dplyr)

df1<- read.csv("F:\\Test_Anything\\Merge\\1.csv" , head(T))
df2<-read.csv("F:\\Test_Anything\\Merge\\2.csv" , head(T))
r <- merge(df1,df2,by=c('NAME','NAME'),all.x=T)
write.csv(r,"F:\\Test_Anything\\Merge\\DF.csv" , all(T) )
B. Go
  • 1,436
  • 4
  • 15
  • 22
0

In general, to merge .csv files, you can simply use this code snip:

path <- rstudioapi::getActiveDocumentContext()$path
Encoding(path) <- "UTF-8"
setwd(dirname(path))

datap1 = read.csv("file1.csv", header=TRUE)
datap2 = read.csv("file2.csv", header=TRUE)
data <- rbind(datap1, datap2)
write.csv(data,"merged.csv")

Note: First 3 lines of code set the working directory to where the R file is located and are not related to the question.

Khalil Youssefi
  • 385
  • 6
  • 10