
So far, I have this:

file         responses
file1.csv    {"Q0":2, "Q1":2, "Q2":2, "Q3":2, .... "Q15":2}
file2.csv    {"Q0":2, "Q1":2, "Q2":2, "Q3":2, .... "Q15":2}

But all the data from each file ends up in a single cell.

I want this:

 Item    responses    file
 Q0      2            file1.csv
 Q1      2            file1.csv
 Q2      2            file1.csv
 ...
 Q15     2            file1.csv
 Q0      2            file2.csv
 Q1      2            file2.csv
 Q2      2            file2.csv
 ...
 Q15     2            file2.csv

Thanks a lot!

Gianluca
  • Please show us the script you used to parse the CSV files so we can help you improve it, and avoid linking to external pictures. – nicolallias Mar 29 '18 at 13:51
  • you mean the html script? thanks a lot! – Gianluca Mar 29 '18 at 13:54
  • That doesn't look like a CSV file. It looks like you have JSON data embedded in a text file. When asking for help, you should include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. Make it very clear exactly what "this" is (what type of object it is). – MrFlick Mar 29 '18 at 13:54
  • html script? I thought you were using R script. Nice point @MrFlick. – nicolallias Mar 29 '18 at 13:56
  • https://cran.r-project.org/web/packages/rjson/rjson.pdf I could probably use this package, but I don't know the syntax for this operation – Gianluca Mar 29 '18 at 14:26

1 Answer


1) Read each file, convert its JSON content into the desired form, and finally combine the pieces using rbind.

# create test data
cat('{"Q0":1, "Q1":2, "Q2":3, "Q3":4, "Q15":5}\n', file = "file1.csv")
cat('{"Q0":11, "Q1":12, "Q2":13, "Q3":14, "Q15":15}\n', file = "file2.csv")
Files <- c("file1.csv", "file2.csv")

library(rjson)

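# for each file: parse the JSON, build a 3-column piece, then stack the pieces
m <- do.call("rbind", lapply(Files, function(f) {
  x <- fromJSON(file = f)  # named list: Q0, Q1, ...
  cbind(Item = names(x), responses = unname(unlist(x)), file = f)
}))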

giving this character matrix:

> m
      Item  responses file       
 [1,] "Q0"  "1"       "file1.csv"
 [2,] "Q1"  "2"       "file1.csv"
 [3,] "Q2"  "3"       "file1.csv"
 [4,] "Q3"  "4"       "file1.csv"
 [5,] "Q15" "5"       "file1.csv"
 [6,] "Q0"  "11"      "file2.csv"
 [7,] "Q1"  "12"      "file2.csv"
 [8,] "Q2"  "13"      "file2.csv"
 [9,] "Q3"  "14"      "file2.csv"
[10,] "Q15" "15"      "file2.csv"
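
Because cbind combines character and numeric vectors into a single matrix, the responses column above holds strings. If numeric responses are needed, one possible follow-up is sketched below using base R only. The small matrix here is just a stand-in for `m`, and the name `m_df` is a hypothetical choice, not part of the original code.

```r
# Stand-in for the character matrix `m` shown above (a few rows only)
m <- cbind(Item = c("Q0", "Q1", "Q0", "Q1"),
           responses = c("1", "2", "11", "12"),
           file = rep(c("file1.csv", "file2.csv"), each = 2))

# Convert the matrix to a data frame, keeping strings as character columns
m_df <- as.data.frame(m, stringsAsFactors = FALSE)

# Restore the numeric type of the responses column
m_df$responses <- as.numeric(m_df$responses)

str(m_df)
```

The same two conversion lines should work on the full `m` from the answer, assuming every response is a number.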

2) If what you meant was that your starting point is not the files themselves but a data frame DF with file and responses columns then:

# form input data frame -- this is the two columns shown in the question
DF <- data.frame(file = Files, responses = sapply(Files, readLines))

dd <- do.call("rbind", by(DF, DF$file, function(d) {
  f <- as.character(d$file)
  x <- fromJSON(json_str = as.character(d$responses))
  data.frame(Item = names(x), responses = unname(unlist(x)), file = f,
             stringsAsFactors = FALSE)
}))
rownames(dd) <- NULL

giving this data frame:

> dd
   Item responses      file
1    Q0         1 file1.csv
2    Q1         2 file1.csv
3    Q2         3 file1.csv
4    Q3         4 file1.csv
5   Q15         5 file1.csv
6    Q0        11 file2.csv
7    Q1        12 file2.csv
8    Q2        13 file2.csv
9    Q3        14 file2.csv
10  Q15        15 file2.csv
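
In the code above, `d` is simply the name of the anonymous function's argument: `by` splits DF by file and passes each piece in as `d`, so it should not be replaced with the name of your own data frame. The same logic can be sketched with a named helper, which may make this easier to see. The names `expand_row`, `one_row` and `dd2` are hypothetical, and the two-item JSON strings are shortened test data.

```r
library(rjson)

# Shortened stand-in for the data frame described in the question
DF <- data.frame(file = c("file1.csv", "file2.csv"),
                 responses = c('{"Q0":1, "Q1":2}', '{"Q0":11, "Q1":12}'),
                 stringsAsFactors = FALSE)

# `one_row` is whatever single-row piece of DF gets passed in
expand_row <- function(one_row) {
  x <- fromJSON(json_str = as.character(one_row$responses))
  data.frame(Item = names(x), responses = unname(unlist(x)),
             file = as.character(one_row$file), stringsAsFactors = FALSE)
}

# Apply the helper to each row of DF, then stack the pieces
pieces <- lapply(seq_len(nrow(DF)), function(i) expand_row(DF[i, ]))
dd2 <- do.call("rbind", pieces)
dd2
```

This variant assumes one row per file in DF. It loops over rows instead of using `by`, so the result follows the row order of DF.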
G. Grothendieck
  • Sorry, but it doesn't work. I've tried the function you gave, but it assigns, for example, "Q0" to one file four times (the number of files is 4). – Gianluca May 16 '18 at 13:33
  • I have improved the input data from the data in the question which used all the same values so you can see that it does indeed work. Have also changed the second code (starting from `DF`) to produce a data frame so there is an example of both matrix and data frame output. – G. Grothendieck May 16 '18 at 14:11
  • Thank you very much for your kind help! Unfortunately it doesn't work. I start from an R data frame. In `f <- as.character(d$file)` and `x <- fromJSON(json_str = as.character(d$responses))`, what does `d` refer to? Is it a misspelling of `DF`? Anyway, I've tried it with `DF` but it doesn't work – Gianluca May 16 '18 at 17:07
  • Maybe there is a problem in the dataframe (I've not done any "hard" operation to convert from json to R) – Gianluca May 16 '18 at 17:09
  • `d` is the argument to the anonymous function. If you are having problems you will need to provide a reproducible example that results in the error since the reproducible example in the answer does work as you can easily verify by copying and pasting it to your R session. – G. Grothendieck May 16 '18 at 19:00
  • It works!! I was also replacing the "d" with the name of the database. Thanks a lot, I'd invite you to dinner! – Gianluca May 16 '18 at 19:45