
I have some survey data that came in via a Google Form. Google generates a spreadsheet of the responses, but what I need to do is split this data into individual responses, so that a human could read each one as though it were an interview published on a blog or something.

So let's say I've got something like this:

1st Question      2nd Question      3rd Question
"Response1 q1"    "Response1 q2"    "Response1 q3"
"Response2 q1"    "Response2 q2"    "Response2 q3"
"Response3 q1"    "Response3 q2"    "Response3 q3"

The first row (the column headers) holds the questions, and each subsequent row holds one respondent's answers to those questions. What I want to produce is something like this:

1st Question
-------
Response1 q1

2nd Question
-------
Response1 q2

3rd Question
-------
Response1 q3

Essentially, for each respondent, I want to make 1 individual file showing their question responses in a linear fashion.

I've given you the specifics of the problem I'm trying to solve in case there's a shortcut for my particular case. But in general, if you've got a data.frame in R that, for whatever reason, you need to traverse row-by-row and then column-by-column, how would you accomplish that short of just writing some for loops?

Zelbinian

2 Answers


Assuming your data is in a data frame (with strings, not factors), like this:

qdata = structure(list(Q1.text = c("1r.text", "2r.text", "3r.text"), 
    Q2.text = c("1r.text", "2r.text", "3r.text"), Q3.text = c("1r.text", 
    "2r.text", "3r.text"), Q4.text = c("1r.text", "2r.text", 
    "3r.text")), .Names = c("Q1.text", "Q2.text", "Q3.text", 
"Q4.text"), class = "data.frame", row.names = c(NA, -3L))

(Next time, share your data with dput to make its structure easily reproducible.)

I would go for a vectorized solution. Here, I convert the data frame to a matrix and then paste the column names onto the entries, separated by newlines ("\n") and a line of dashes, as in your example.

qdata.m = as.matrix(qdata)
# Next, we take advantage of "recycling" of the column names,
# pasting them to the matrix values with a newline "\n" separator.
qdata.m = paste(colnames(qdata.m), "-------", t(qdata.m), sep = "\n")
# Note that R stores matrices column-wise, so I transpose with t()
# to get the values row by row (respondent by respondent) instead.

# cat is good for putting text into a file. We'll separate each
# element with two line breaks.
cat(qdata.m, sep = "\n\n")

# Q1.text
# -------
# 1r.text
# 
# Q2.text
# -------
# 1r.text
# 
# Q3.text
# -------
# 1r.text
# etc.

One advantage of using cat here is that it can write directly to a file through its file argument (or you can redirect output to a file with sink first; see their respective help pages for more details).
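Since the question asks for one file per respondent, here is a minimal sketch of that last step. It rebuilds the same dummy data so it runs on its own. The file names (response_1.txt and so on) and the use of tempdir() as the output directory are my own choices for illustration, not something from the question, so swap in whatever suits you.

```r
answers <- c("1r.text", "2r.text", "3r.text")
qdata <- data.frame(Q1.text = answers, Q2.text = answers,
                    Q3.text = answers, Q4.text = answers,
                    stringsAsFactors = FALSE)

# Same pasting as above: one "question / dashes / answer" block per cell,
# ordered respondent by respondent thanks to t().
blocks <- paste(colnames(qdata), "-------", t(as.matrix(qdata)), sep = "\n")

# Label each block with the respondent (row) it belongs to, then split.
respondent <- rep(seq_len(nrow(qdata)), each = ncol(qdata))
per_respondent <- split(blocks, respondent)

# Hypothetical output location and file names, chosen for illustration.
out_dir <- tempdir()
out_files <- file.path(out_dir,
                       paste0("response_", names(per_respondent), ".txt"))

for (k in seq_along(per_respondent)) {
  cat(per_respondent[[k]], sep = "\n\n", file = out_files[k])
  cat("\n", file = out_files[k], append = TRUE)  # end the file with a newline
}
```

Each file then holds one respondent's answers in the question / dashes / answer layout, with a blank line between questions.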

In the more general case, if you need to go row-by-row and then column-by-column, you could do it with nested for loops. It also seems like you're not really using the data frame structure at that point, so you could just turn it into a vector with unlist(). In fact, in this case, that's probably easier than what I did above:

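qvect = unlist(qdata, use.names = FALSE)
# unlist() goes column by column, and its automatic names get the row
# number appended ("Q1.text1", "Q1.text2", ...), so build the question
# labels and a row index explicitly instead
qnames = rep(names(qdata), each = nrow(qdata))
qrow = rep(seq_len(nrow(qdata)), times = ncol(qdata))
# pasting much as above, then ordering by row to group each respondent
qvect = paste(qnames, "-------", qvect, sep = "\n")[order(qrow)]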

Then you can proceed with cat as above.
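For the general row-by-row question, another loop-free option is apply() over the rows. This is a sketch of an alternative rather than what the code above does. format_row is a hypothetical helper name, and the sketch assumes every column is character: apply() converts the data frame to a matrix first, so mixed column types would be coerced to strings.

```r
answers <- c("1r.text", "2r.text", "3r.text")
qdata <- data.frame(Q1.text = answers, Q2.text = answers,
                    Q3.text = answers, Q4.text = answers,
                    stringsAsFactors = FALSE)

# Hypothetical helper: turn one row (a named character vector) into
# a single interview-style string.
format_row <- function(row) {
  paste(names(row), "-------", row, sep = "\n", collapse = "\n\n")
}

# MARGIN = 1 calls format_row once per row; the result is a character
# vector with one element per respondent.
interviews <- apply(qdata, 1, format_row)

cat(interviews[1], "\n", sep = "")
```

Because each element of interviews is already a complete response, it can be passed to cat one respondent at a time.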

Gregor Thomas
  • I apologize that the question seems unclear (but I assure you the purpose is realistic). I have chosen not to share the real data I am using because it is proprietary, and I did the best I could at providing an example by typing text into a box. I've attempted to update the question, perhaps it's a bit clearer now? – Zelbinian May 14 '15 at 23:40
  • It's not clear whether your data is stored in a matrix or data frame, if the first row in your box is column names or just a first row of data, if it's a data frame whether the data type is a factor or character. For your output, do you really just want to print that output to the console? Should it be stored in an R object (as a single string, or a character vector or a list?), written to a text file? – Gregor Thomas May 14 '15 at 23:43
  • Not sharing real data is just fine, but sharing fake data in a way that makes the underlying structure clear is nice. `dput(fake_data)` is a great way to create copy-pasteable R objects, which is what I did for `qdata` in the answer. [See here for lots of other reproducibility tips](http://stackoverflow.com/q/5963269/903061). – Gregor Thomas May 14 '15 at 23:45
  • Thanks for the tips. Sometimes it's hard to judge which details will add clarity and which will make it harder to see the problem I'm trying to solve, especially because I'm a little new at this. Learning how to ask good questions is, unfortunately, part of the learning curve for this stuff. – Zelbinian May 14 '15 at 23:50
  • ... but I removed the snarky first line, sorry if I offended ;) Thanks for clearing things up! – Gregor Thomas May 14 '15 at 23:51
  • No, no, that's alright. My skin's pretty thick. :) I must not have been *that* unclear because your solution worked perfectly! ;) If you've got the time, you might add some explanations for how you got there. I looked up all of the stuff you used and worked it out eventually, but if another n00b comes across this, they might not think to do that. – Zelbinian May 14 '15 at 23:56

This is the standard way of doing it with loops:

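for (i in seq_len(nrow(df))) {      # traverse rows
  for (j in seq_len(ncol(df))) {    # traverse columns
    # do whatever with df[i, j]
  }
}

where df is your data frame. (seq_len() is safer than 1:nrow(df), which would loop over c(1, 0) if df had no rows.)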
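To make the skeleton concrete, here is one way the loop body might look for the interview-style layout in the question. It's only a sketch: the small df below is made-up stand-in data, and output and parts are names I've picked for illustration.

```r
df <- data.frame(
  Q1 = c("Response1 q1", "Response2 q1"),
  Q2 = c("Response1 q2", "Response2 q2"),
  stringsAsFactors = FALSE
)

output <- character(nrow(df))  # one formatted string per respondent

for (i in seq_len(nrow(df))) {       # traverse rows
  parts <- character(ncol(df))
  for (j in seq_len(ncol(df))) {     # traverse columns
    # question, a line of dashes, then this respondent's answer
    parts[j] <- paste(names(df)[j], "-------", df[i, j], sep = "\n")
  }
  output[i] <- paste(parts, collapse = "\n\n")
}

cat(output[1], "\n", sep = "")
```

Preallocating output and parts avoids growing vectors inside the loops, which is usually what makes R loops slow.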

maRtin
  • I did think of doing it that way (and currently am, as we speak), but it just seemed so un-R-like. Pretty much every time I've written a loop to do a thing in R, someone else has come along and said "No, vectorize it" or "Use an *apply function" or something. So I thought I'd ask. – Zelbinian May 14 '15 at 23:21