I combined 12 different txt files into one data frame (shown below).

Each file contains a different number of rows, and each file is named with its date, e.g. "Student-Score-2010-10.txt".

Each file represents one month.

How can I add back the date to each row?

   id  dep  score
id511   10     34
id512   10     32
id512   10     34

I need to add the date back to each row, like this:

   id  dep  score      date
id511   10     34   2010-10
id511   10     34   2010-10
id511   10     34   2010-10
id511   10     34   2010-11
id511   10     34   2010-11
id511   10     34   2010-12
id511   10     34    2011-1

I made up the dates above; they are not the real data.

Original data:

"Monthly report"
"University of XXXXX"
"+--------+------+-----+"
"| id | dep | scores |
"+-------+-----+------+
"| id593 | 2 | 233 |

Emilia311

1 Answer

You could try:

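 # List the monthly files; "^Student" matches names that start with "Student"
 files <- list.files(pattern = "^Student")
 files
 #[1] "Student-Score-2010-10.txt" "Student-Score-2010-11.txt"

 dat <- do.call(rbind, lapply(files, function(x) {
   al  <- readLines(x)
   # keep only the data rows (lines containing an id such as "id593")
   al2 <- grep("id[0-9]", al, value = TRUE)
   # strip quotes, pipes and other punctuation, then trim surrounding spaces
   al3 <- gsub("^ +| +$", "", gsub("[[:punct:]]+", "", al2))
   al4 <- read.table(text = al3, header = FALSE, sep = "", stringsAsFactors = FALSE)
   colnames(al4) <- c("id", "dep", "scores")
   # take the date from the file name: "Student-Score-2010-10.txt" -> "2010-10"
   transform(al4, date = gsub("[[:alpha:]]+\\-[[:alpha:]]+\\-(.*)\\.txt", "\\1", x))
 }))

 dat
 #     id dep scores    date
 #1 id592   2    235 2010-10
 #2 id593   2    233 2010-11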
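
If you want to try the idea without touching your real files, here is a minimal self-contained sketch. It assumes the report layout shown in the question; the two sample files, the temporary directory, and the helper names `file_date` and `read_one` are made up for illustration and are not part of the original data or code.

```r
# Hypothetical sample files, written to a temporary directory for the demo
tmp <- tempfile("scores")
dir.create(tmp)
writeLines(c('"Monthly report"', '"| id | dep | scores |"', '"| id592 | 2 | 235 |"'),
           file.path(tmp, "Student-Score-2010-10.txt"))
writeLines(c('"Monthly report"', '"| id | dep | scores |"', '"| id593 | 2 | 233 |"'),
           file.path(tmp, "Student-Score-2010-11.txt"))

files <- list.files(tmp, pattern = "^Student-Score-.*\\.txt$", full.names = TRUE)

# Pull "YYYY-M" or "YYYY-MM" out of a file name; basename() drops the directory
file_date <- function(path) {
  sub("^Student-Score-([0-9]{4}-[0-9]{1,2})\\.txt$", "\\1", basename(path))
}

# Read one report and tag every row with the date taken from its file name
read_one <- function(path) {
  rows <- grep("id[0-9]", readLines(path), value = TRUE)  # data rows only
  rows <- trimws(gsub("[[:punct:]]+", " ", rows))         # drop quotes and pipes
  df <- read.table(text = rows, col.names = c("id", "dep", "scores"),
                   stringsAsFactors = FALSE)
  df$date <- file_date(path)
  df
}

dat <- do.call(rbind, lapply(files, read_one))
dat
```

Note that removing all punctuation would also strip decimal points and minus signs, so this only suits integer, non-negative scores like the ones in the sample.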
akrun