R: automatically copy large amounts of data

Question

I'm new to R. I have some data like this:

Yes No age    color  place
12  5  12-18  red    right
2   33 19-30  yellow left
...

I need to create a new data set like this:

answer   age    color  place
Yes      12-18  red    right 
Yes      12-18  red    right
...                          (12 times)

No       12-18  red    right
No       12-18  red    right
...                          (5 times)

How can I do this?

@RichardErickson, how is that a possible duplicate? The linked question is about taking an `ftable` object and converting it to a `data.frame`. It has nothing to do with expanding the values out. Each part of the solution might be a duplicate, but I don't recall this particular combination as a question. — A5C1D2H2I1M1N2O1R2T1, Apr 27 '15 at 02:54
@AnandaMahto, You're correct. I'm sorry for messing up the flag. — Richard Erickson, Apr 27 '15 at 03:02

score 4 · Accepted Answer · answered Apr 27 '15 at 02:39


I would use a combination of expandRows from my "splitstackshape" package and melt from "reshape2".

Assuming your data are called "mydf", try:

library(splitstackshape)
library(reshape2)
dfLong <- expandRows(
  melt(mydf, measure.vars = c("Yes", "No"), 
       variable.name = "answer"), "value")

Here are the first 20 rows:

head(dfLong, 20)
#        age  color place answer
# 1    12-18    red right    Yes
# 1.1  12-18    red right    Yes
# 1.2  12-18    red right    Yes
# 1.3  12-18    red right    Yes
# 1.4  12-18    red right    Yes
# 1.5  12-18    red right    Yes
# 1.6  12-18    red right    Yes
# 1.7  12-18    red right    Yes
# 1.8  12-18    red right    Yes
# 1.9  12-18    red right    Yes
# 1.10 12-18    red right    Yes
# 1.11 12-18    red right    Yes
# 2    19-30 yellow  left    Yes
# 2.1  19-30 yellow  left    Yes
# 3    12-18    red right     No
# 3.1  12-18    red right     No
# 3.2  12-18    red right     No
# 3.3  12-18    red right     No
# 3.4  12-18    red right     No
# 4    19-30 yellow  left     No

## Confirm that there are the correct number of combinations
table(dfLong$age, dfLong$answer)
#        
#         Yes No
#   12-18  12  5
#   19-30   2 33

The order is a little different from what you posted: all of the "Yes" rows come first, followed by all of the "No" rows, instead of alternating between them for each original row.
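If installing extra packages is not an option, the same expansion can be sketched in base R by repeating row indices with rep(). This is only a sketch under assumptions: "mydf" is rebuilt here from the two rows shown in the question, the helper columns "n" and "src" and the output file name are made up for illustration, and the rows are ordered as in the question (each group's "Yes" rows followed by its "No" rows). The last line writes the result to a CSV file, which is asked about in the comments.

```r
# Sample data reconstructed from the two rows shown in the question
mydf <- data.frame(
  Yes   = c(12, 2),
  No    = c(5, 33),
  age   = c("12-18", "19-30"),
  color = c("red", "yellow"),
  place = c("right", "left"),
  stringsAsFactors = FALSE
)

id_cols <- c("age", "color", "place")

# Stack the two count columns: one row per (original row, answer) pair.
# "n" holds the count and "src" the original row number (helper columns).
long <- rbind(
  data.frame(answer = "Yes", n = mydf$Yes, src = seq_len(nrow(mydf)),
             mydf[id_cols], stringsAsFactors = FALSE),
  data.frame(answer = "No", n = mydf$No, src = seq_len(nrow(mydf)),
             mydf[id_cols], stringsAsFactors = FALSE)
)

# Put each original row's "Yes" entry directly before its "No" entry
long <- long[order(long$src), ]

# Repeat every row index n times, then keep only the wanted columns
dfLong2 <- long[rep(seq_len(nrow(long)), times = long$n),
                c("answer", id_cols)]
rownames(dfLong2) <- NULL

# Write the result to a CSV file (hypothetical file name in a temp directory)
out_file <- file.path(tempdir(), "dfLong.csv")
write.csv(dfLong2, out_file, row.names = FALSE)
```

Sorting on "src" before expanding is what produces the interleaved order; dropping that line gives the same "all Yes, then all No" order as the expandRows result.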


A5C1D2H2I1M1N2O1R2T1


Or with `gather()` from `tidyr`: `expandRows(tidyr::gather(df, answer, value, -age, -color, -place), "value")` – Steven Beaupré Apr 27 '15 at 03:04
Thanks. If I want to save the new data set as a .csv file, how should I do that? – Jeffery Chen Apr 27 '15 at 03:05
@JefferyChen: Take a look at `?write.table()` – Steven Beaupré Apr 27 '15 at 03:09
@JefferyChen, no problem. If this answered your question, do consider accepting it by clicking on the hollow check-mark to the left of the answer area. – A5C1D2H2I1M1N2O1R2T1 Apr 27 '15 at 03:43
@StevenBeaupré, isn't `tidyr::gather` essentially `reshape2::melt`? See `tidyr:::gather_.data.frame`. – A5C1D2H2I1M1N2O1R2T1 Apr 27 '15 at 04:41
@AnandaMahto Yes, pretty similar but as mentioned in http://blog.rstudio.org/2014/07/22/introducing-tidyr/: "*just as reshape2 did less than reshape, tidyr does less than reshape2. It’s designed specifically for tidying data, not general reshaping. In particular, existing methods only work for data frames, and tidyr never aggregates. This makes each function in tidyr simpler: each function does one thing well.*" – Steven Beaupré Apr 27 '15 at 12:08
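
For readers following the tidyr route discussed in these comments, the whole task can probably be done with tidyr alone. The sketch below is not from the thread: it assumes tidyr 1.0.0 or later (for pivot_longer()), rebuilds "mydf" from the two rows shown in the question, and uses uncount() in place of expandRows().

```r
library(tidyr)

# Sample data reconstructed from the two rows shown in the question
mydf <- data.frame(
  Yes   = c(12, 2),
  No    = c(5, 33),
  age   = c("12-18", "19-30"),
  color = c("red", "yellow"),
  place = c("right", "left"),
  stringsAsFactors = FALSE
)

# Reshape the two count columns to long form, then repeat each row
# "value" times; uncount() drops the "value" column by default
dfLong3 <- uncount(
  pivot_longer(mydf, cols = c(Yes, No),
               names_to = "answer", values_to = "value"),
  weights = value
)
```

Because pivot_longer() works through the data one original row at a time, each group's "Yes" rows are followed directly by its "No" rows, which is the order shown in the question.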
