I am binding a list of files together in a for loop; however, the wording and placement of columns can differ between files. For example, id (in column 1) and itemID (in column 3) are the same thing but in different positions with different names. There are multiple columns like this spread among the files. Is there a way to give a list of column names I want to change and what I want the new column names to be? I tried using 'setnames', but it does not seem to be working. I assume that is because one file could have my desired name (itemId) alongside an undesired name (tester), and another file could have it the other way around (the undesired 'ID' and the desired 'test').

Here is an example of what I am doing:

  #change names of columns from old files
  tryCatch({
    setnames(tempPull,
             old = c("ID", "tester"),
             new = c( "ItemId", "Test"))},
    error = function(e){})

Here is a further example. The `file#` objects show how the files look, and `DesiredFormat` is how I would like the result to look at the end. I have also provided a list of the names that need to be changed and what they should be changed to:

file1 <- data.frame(ItemId = 1:3, Test = letters[1:3])
file2 <- data.frame(ItemId = 4:7, Tester = letters[4:7])
file3 <- data.frame(ID = 7:10, Tester = letters[7:10])
file4 <- data.frame(ID = 11:12, Test = letters[11:12])
file5 <- data.frame(ID = 12:15, Testx = letters[12:15])


DesiredFormat <- data.frame(ItemId = 1:15, Test = letters[1:15])

oldnames <- c("ID", "Tester", "Testx")
newnames <- c("ItemId", "Test", "Test")
alexb523
  • When asking for help, you should include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. What package is `setnames` from? – MrFlick Jun 14 '18 at 16:40
  • @MrFlick it is in the `data.table` package – alexb523 Jun 14 '18 at 17:29

2 Answers

One solution I can think of is using the `rename` function from the dplyr package:

df %>% select(a,b,c) %>% rename(d = a, e = b, f = c)

Or matching the column names against a named lookup vector:

main_col <- c('a','b','c')
df.rename <- df %>% 
             dplyr::select(one_of(main_col))
namekey <- c(a = 'd', b = 'e', c = 'f')
names(df.rename) <- namekey[names(df.rename)]
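One caveat with the lookup above: any column whose name is not in `namekey` comes back as `NA`. Below is a minimal sketch of the same lookup idea in base R that renames only the columns found in the key and leaves the rest alone. The helper name `rename_known` is hypothetical (it is not part of dplyr or data.table), and the sample data and key are taken from the question.

```r
# Hypothetical helper (not from any package): rename only the columns that
# appear in the lookup vector and leave every other column untouched.
rename_known <- function(df, namekey) {
  hit <- names(df) %in% names(namekey)               # columns that have an entry in the key
  names(df)[hit] <- unname(namekey[names(df)[hit]])  # look up the new name for each of them
  df
}

# Lookup from the question: old name = new name
namekey <- c(ID = "ItemId", Tester = "Test", Testx = "Test")

# Two of the sample files from the question
file1 <- data.frame(ItemId = 1:3, Test = letters[1:3])
file3 <- data.frame(ID = 7:10, Tester = letters[7:10])

names(rename_known(file1, namekey))  # "ItemId" "Test" (already correct, unchanged)
names(rename_known(file3, namekey))  # "ItemId" "Test"
```

Because the helper only touches columns present in the key, it should be safe to call on every file regardless of which mix of old and new names that file has.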

Hope it helps. By the way, as @MrFlick mentions, you should have included a reproducible example :)

Blue Phoenix
  • I have added some more notes about what I'm trying to do. Basically, one `file` could have a header `'ItemID'` and the next could have 'ID', but I would like them both to be labeled `'ItemId'`. The examples provided throw an error for me: 'Error: `ItemId` contains unknown variables'. – alexb523 Jun 25 '18 at 19:47
  • Also note that it's not just one column; it's a whole list of them that should be relabeled if they exist in the file. – alexb523 Jun 25 '18 at 19:51

Please let me know if there is a more elegant way to do this. I am using part of @Blue Phoenix's response but needed some extra steps to get all the columns in there.

  library(data.table)  #rbindlist
  library(dplyr)       #rename, %>%

  #a list of columns to be renamed
  #throughout the files
  chgCols <- c("ID", "Tester", "Testx")

  #the names the columns will be changed to (old name = new name)
  namekey <- c(ID = "ItemId", Tester = "Test", Testx = "Test")

  chgCols <- intersect(chgCols, colnames(tempPullList_2018)) #keep only the unwanted names present in this file
  namekey <- namekey[chgCols]                                #look up the new name for each of them

  #rename() expects new = old, so flip the lookup before splicing it in
  tempPullList_2018 <- tempPullList_2018 %>%
    rename(!!!setNames(names(namekey), unname(namekey)))

  PullList_2018 <- rbindlist(list(PullList_2018, tempPullList_2018), fill = TRUE)
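For reference, here is a self-contained sketch of the same rename-then-bind idea run over the five sample files from the question, using base R only. The list `files` and the intermediate names `renamed` and `combined` are assumptions made for this illustration; in the real loop, `rbindlist(..., fill = TRUE)` would take the place of `do.call(rbind, ...)`.

```r
# Sample files copied from the question
file1 <- data.frame(ItemId = 1:3, Test = letters[1:3])
file2 <- data.frame(ItemId = 4:7, Tester = letters[4:7])
file3 <- data.frame(ID = 7:10, Tester = letters[7:10])
file4 <- data.frame(ID = 11:12, Test = letters[11:12])
file5 <- data.frame(ID = 12:15, Testx = letters[12:15])
files <- list(file1, file2, file3, file4, file5)

# Lookup: old name = new name
namekey <- c(ID = "ItemId", Tester = "Test", Testx = "Test")

# Rename whichever old names are present in each file
renamed <- lapply(files, function(df) {
  present <- intersect(names(namekey), names(df))
  names(df)[match(present, names(df))] <- unname(namekey[present])
  df
})

# Stack the files now that they all share the same column names
combined <- do.call(rbind, renamed)

names(combined)  # "ItemId" "Test"
nrow(combined)   # 17 (ids 7 and 12 each appear in two of the sample files)
```

If the files are data.tables, recent versions of data.table also accept `setnames(x, old, new, skip_absent = TRUE)`, which ignores any `old` names that are missing from `x` instead of raising an error; that may remove the need for the manual intersect step.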
alexb523