
I have a data frame like the following. It contains 100 rows; only the first 6 are shown here.

CTFIP   Hispanic    Non Hispanic
6001    323307  1154673
6003    63      1113
6005    4566    33761
6007    29512   189123
6009    4595    41399
6011    11136   10029

I want to generate a separate data frame for each row, which should look like this (shown here for the first row, CTFIP 6001):

HISPn   Freq
1      323307
2     1154673

where 1 = Hispanic and 2 = Non-Hispanic.

How can I generate these data frames in R?

SKM

3 Answers


Here's one possible solution using tidyr and dplyr. First, the sample data.frame

dd<-read.table(text="CTFIP   Hispanic    NonHispanic
6001    323307  1154673
6003    63  1113
6005    4566    33761
6007    29512   189123
6009    4595    41399
6011    11136   10029", header=TRUE)

then

library(tidyr)
library(dplyr)

dd %>% gather(ethnicity, freq, -CTFIP) %>% 
    mutate(HISPn=ifelse(ethnicity=="Hispanic", 1,2)) %>%
    select(HISPn, freq)

which returns

   HISPn    freq
1      1  323307
2      1      63
3      1    4566
4      1   29512
5      1    4595
6      1   11136
7      2 1154673
8      2    1113
9      2   33761
10     2  189123
11     2   41399
12     2   10029
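
In tidyr 1.0.0 and later, gather() is superseded by pivot_longer(). The following is a rough sketch of the same reshaping with the newer function, assuming a recent tidyr and dplyr are installed; the object name `long` is only illustrative. Note that pivot_longer() emits the rows grouped by CTFIP rather than by ethnicity, so the row order differs from the output above.

```r
library(tidyr)
library(dplyr)

# Sample data as in the read.table() call above
dd <- data.frame(
  CTFIP       = c(6001L, 6003L, 6005L, 6007L, 6009L, 6011L),
  Hispanic    = c(323307L, 63L, 4566L, 29512L, 4595L, 11136L),
  NonHispanic = c(1154673L, 1113L, 33761L, 189123L, 41399L, 10029L)
)

# Stack every column except CTFIP into ethnicity/freq pairs,
# then recode ethnicity as 1 (Hispanic) or 2 (Non-Hispanic)
long <- dd %>%
  pivot_longer(-CTFIP, names_to = "ethnicity", values_to = "freq") %>%
  mutate(HISPn = ifelse(ethnicity == "Hispanic", 1, 2)) %>%
  select(HISPn, freq)

long
```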

If you want a list of a bunch of different data frames (which sounds like a bad idea), you can do

dd %>% gather(ethnicity, freq, -CTFIP) %>% 
    mutate(HISPn=ifelse(ethnicity=="Hispanic", 1,2)) %>%
    select(CTFIP , freq, HISPn) %>%
    {split(., .$CTFIP)}

which returns

$`6001`
  CTFIP    freq HISPn
1  6001  323307     1
7  6001 1154673     2
$`6003`
  CTFIP freq HISPn
2  6003   63     1
8  6003 1113     2
...
MrFlick
  • I need a data frame for each row, not a summation of all rows. Thank you – SKM Dec 09 '14 at 01:39
  • Make sure you have `tidyr` installed. – MrFlick Dec 09 '14 at 01:40
  • For some reason I can't install tidyr. BTW, the results given here are based on the summation of each column, but I need a different data frame for each row. – SKM Dec 09 '14 at 01:46
  • I've updated my response which creates a list of a bunch of data.frames. However that seems like a very unfriendly data format. – MrFlick Dec 09 '14 at 01:51

The tidyr solution creates one large data.frame in which the columns have been transformed into rows. If you really need a separate data.frame for each row, then I think you want a loop. But are you sure you need separate data.frames for each row? In my experience, every time I thought I did, there was an easier way to do it.
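
For what it's worth, a loop along those lines might look like the sketch below. It assumes the data frame is called `df` with columns `Hispanic` and `Non.Hispanic`; the list name `per_row` and the `df<CTFIP>` naming scheme are illustrative choices, not something from the question.

```r
# Sample data, copied from the question (first 6 of the 100 rows)
df <- data.frame(
  CTFIP        = c(6001L, 6003L, 6005L, 6007L, 6009L, 6011L),
  Hispanic     = c(323307L, 63L, 4566L, 29512L, 4595L, 11136L),
  Non.Hispanic = c(1154673L, 1113L, 33761L, 189123L, 41399L, 10029L)
)

# Pre-allocate a list with one slot per row of df
per_row <- vector("list", nrow(df))
names(per_row) <- paste0("df", df$CTFIP)  # illustrative naming scheme

for (i in seq_len(nrow(df))) {
  per_row[[i]] <- data.frame(
    HISPn = 1:2,  # 1 = Hispanic, 2 = Non-Hispanic
    Freq  = c(df$Hispanic[i], df$Non.Hispanic[i])
  )
}

# Look at the frame built from the first row
per_row[["df6001"]]
```

The frames are stored in a list rather than as loose objects, so they can still be processed together later.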

Bob

You could do this in base R if you have only 100 rows

lst <- setNames(
    lapply(seq_len(nrow(df)), function(i)
        data.frame(HISPn=1:2, Freq=unlist(df[i,-1], use.names=FALSE))),
    paste0('df', df$CTFIP))

It is better to keep them as a list of data.frames. But if you need separate data.frame objects in the global environment, use list2env:

list2env(lst, envir=.GlobalEnv)
#<environment: R_GlobalEnv>

df6001
# HISPn    Freq
#1     1  323307
#2     2 1154673
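
As a rough illustration of why the list form tends to be more convenient, the sketch below rebuilds `lst` and then works on all frames at once. The Hispanic-share calculation and the name `hisp_share` are made up for the example and are not part of the question.

```r
# Sample data (same as the 'data' block in this answer)
df <- data.frame(
  CTFIP        = c(6001L, 6003L, 6005L, 6007L, 6009L, 6011L),
  Hispanic     = c(323307L, 63L, 4566L, 29512L, 4595L, 11136L),
  Non.Hispanic = c(1154673L, 1113L, 33761L, 189123L, 41399L, 10029L)
)

# One two-row data.frame per row of df, named df<CTFIP>
lst <- setNames(
  lapply(seq_len(nrow(df)), function(i)
    data.frame(HISPn = 1:2, Freq = unlist(df[i, -1], use.names = FALSE))),
  paste0("df", df$CTFIP)
)

# Pull out a single frame by name
lst[["df6003"]]

# Apply the same computation to every frame in one call:
# here, the Hispanic share of the total count (illustrative only)
hisp_share <- sapply(lst, function(d) d$Freq[d$HISPn == 1] / sum(d$Freq))
round(hisp_share, 3)
```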

Or you could use reshape from base R

colnames(df)[-1] <- paste('Freq', 1:2, sep='.')
dfL <- reshape(df, direction='long', idvar='CTFIP',
               varying=2:3, timevar='HISPn')
row.names(dfL) <- NULL
# split on the long frame's own CTFIP column instead of relying on recycling of df$CTFIP
lst1 <- split(dfL[,-1], dfL$CTFIP)
names(lst1) <- paste0('df', names(lst1))
list2env(lst1, envir=.GlobalEnv)
df6001
#  HISPn    Freq
#1     1  323307
#7     2 1154673

data

df <- structure(list(CTFIP = c(6001L, 6003L, 6005L, 6007L, 6009L, 6011L
), Hispanic = c(323307L, 63L, 4566L, 29512L, 4595L, 11136L), 
Non.Hispanic = c(1154673L, 1113L, 33761L, 189123L, 41399L, 
10029L)), .Names = c("CTFIP", "Hispanic", "Non.Hispanic"), class = "data.frame",
row.names = c(NA, -6L))
akrun