
I have 5 data files, named xxx2014.dat, xxx2015.dat, ..., xxx2018.dat, and I merged these files together.

Now I want to add each row's year back into my new merged dataset. How can I do that in R?

I tried to use append(), but I have no idea how R would know which rows are from 2014 or 2015...

yearslist = c("2014","2015","2016","2017","2018")

for (year in yearslist) {
    filename = paste0("xxx", year, ".dat")
    dataname <- assign(
        paste0("xxx", year),
        read_fwf(
            file = filename,
            fwf_positions(...),
            col_names = c("relationship","age","race","sex")
        )
    )
}
newmergedata <- bind_rows(fileA,fileB)

I expected my new merged dataset to have a column with the corresponding year.

For example:
Year     Sex     Region
2014     001       1
2014     002       1
2015     001       2
2018     002       1

However, what I currently have is only:
Sex     Region
001       1
002       1
001       2
002       1 

How can I find each row's corresponding year in the new merged data?

Wendy
  • I think it might be easier to find help if you provide some sample data with the current and expected output (for two of the datasets, for instance). You can [see this post for reproducibility help](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – NelsonGon Sep 20 '19 at 03:49
  • What are `x` and `y`? AND you should NOT combine the use of `<-` with `assign`. They do the same thing. – IRTFM Sep 20 '19 at 03:56
  • @Wendy, was one of the posted answers helpful? If so, you may consider to take one of the actions proposed in the SO help center [What should I do when someone answers my question?](https://stackoverflow.com/help/someone-answers). Otherwise, please, let us know what is missing or unclear. Thank you. – Uwe Sep 28 '19 at 06:43
  • 1
  • Please don't make more work for others by vandalizing your posts. By posting on the Stack Exchange (SE) network, you've granted a non-revocable right, under a [CC BY-SA license](//creativecommons.org/licenses/by-sa/4.0), for SE to distribute the content (i.e. regardless of your future choices). By SE policy, the non-vandalized version is distributed. Thus, any vandalism will be reverted. Please see: [How does deleting work? …](//meta.stackexchange.com/q/5221). If permitted to delete, there's a "delete" button below the post, on the left, but it's only in browsers, not the mobile app. – Makyen Oct 15 '19 at 00:09

2 Answers


The rbindlist() function has an idcol parameter which

Creates a column in the result showing which list item those rows came from. [...] If the input list has names, those names are the values placed in this id column, [...]

(from help("rbindlist", "data.table"))
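
To see what idcol does in isolation, here is a minimal self-contained example with made-up data (the column names only mirror your expected output, not your actual files):

library(data.table)

dt_list <- list(
  `2014` = data.table(Sex = c("001", "002"), Region = c(1, 1)),
  `2015` = data.table(Sex = "001", Region = 2)
)

rbindlist(dt_list, idcol = "Year")
#>    Year Sex Region
#> 1: 2014 001      1
#> 2: 2014 002      1
#> 3: 2015 001      2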

So, perhaps you may try something along the following lines:

library(data.table)  # for rbindlist()
library(magrittr)    # for %>% and set_names()
library(readr)       # for read_fwf()

years <- 2014:2018
filenames <- sprintf("xxx%i.dat", years)

newmergedata <- lapply(
  filenames, 
  read_fwf, 
  col_positions = fwf_positions(...),
  col_names = c("relationship", "age", "race", "sex")
) %>% 
  set_names(years) %>%         # name each list element by its year
  rbindlist(idcol = "Year")    # the names become the "Year" column

Note that the code is untested due to the lack of a reproducible example and may require amendments for your specific datasets.
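
As a tidyverse-only alternative (equally untested, and reusing the same read_fwf arguments as in your question), dplyr::bind_rows() accepts an .id argument that works like idcol, taking its values from the names of the list:

library(dplyr)
library(purrr)
library(readr)

newmergedata <- filenames %>% 
  set_names(years) %>% 
  map(
    read_fwf, 
    col_positions = fwf_positions(...),
    col_names = c("relationship", "age", "race", "sex")
  ) %>% 
  bind_rows(.id = "Year")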

Uwe

Here's what I used:

library(readr)  # for read_fwf()

files <- Sys.glob("*.dat")  # assumes these are all of your .dat files; adjust the pattern otherwise

result <- do.call(rbind, lapply(files, function(x) {
    # extract the year from the file name, e.g. "xxx2014.dat" -> "2014"
    year <- sub("x+([0-9]+).*", "\\1", x)
    df <- read_fwf(
        file = x,
        fwf_positions(...),
        col_names = c("relationship", "age", "race", "sex")
    )
    df$year <- year  # tag every row of this file with its year
    return(df)
}))
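
The regular expression keeps just the run of digits that follows the leading x's, which recovers the year from each file name. A quick sanity check on a sample name:

sub("x+([0-9]+).*", "\\1", "xxx2014.dat")
#> [1] "2014"
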
jake2389
  • Actually, I already combined them together and wrote the result to a new file. I just want to add the year to my new file. – Wendy Sep 20 '19 at 04:28
  • 1
  • @Wendy If you have already created a new, merged dataset, then the information on the origin of each row, i.e., which row comes from which file (year), has been lost and cannot be recovered. – Uwe Sep 20 '19 at 06:23