R: Creating multiple dataframes based on row filter

Question

I have a data frame (df) that looks like this:

  Date           Group       Value
01-04-2029      Saffron      62.78
01-04-2029      Green        75.65
01-05-2019      Saffron      67.89
01-06-2019      Saffron      54.56
01-06-2019      Green        77.00
01-07-2019      Green        71.22

Objective: I want to create two separate data frames based on Group. Essentially, I am looking for the following:

df_saffron: 
    Date           Group       Value
01-04-2029        Saffron      62.78
01-05-2019        Saffron      67.89
01-06-2019        Saffron      54.56

df_green:
   Date           Group        Value
01-04-2029        Green        75.65
01-06-2019        Green        77.00
01-07-2019        Green        71.22

Specifically, if I use the following code snippet (from this thread):

for(i in unique(as.character(df$Group))) {
    nam <- paste("df", i, sep = ".")
    assign(nam, df[df$Group==i,])
    }

I do not get any data frame like df.Green or df.Saffron. Instead, I get:

<0 rows> (or 0-length row.names)

I have also looked at this SO thread, but I am getting errors:

Error in assign(as.character(v, data %>% filter(data$Group == v), envir = .GlobalEnv)) :
argument "value" is missing, with no default
In addition: Warning message:
In data.matrix(data) : NAs introduced by coercion

I am a novice in R, so I would appreciate any clue about what I am missing.

Your code is working for me, and it gives the right `df.Green` and `df.Saffron`. It might help if you turn the `Group` column into strings with `as.character()` first (I guess they are factors). The best way to share data on StackOverflow is to use `dput()`, so we get exactly the same data frame you are looking at. – Bas Apr 30 '20 at 06:10
Are you sure you have a data frame and not a matrix? The error message gives a hint. What is `class(df)`? – Ronak Shah Apr 30 '20 at 06:21
@RonakShah: The class of the data is `data.frame`. Sorry, I forgot to mention that I had converted the data with `as.data.frame(df)`. – pythondumb Apr 30 '20 at 06:46
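
The checks suggested in these comments can be sketched roughly as follows. This assumes the data frame is rebuilt by hand from the table in the question; the asker's real `df` may differ (for example in column types or stray whitespace), which is exactly why `dput()` output is requested.

```r
# Sample data copied from the table in the question (an assumption about
# how the real df looks; the actual object may differ).
df <- data.frame(
  Date  = c("01-04-2029", "01-04-2029", "01-05-2019",
            "01-06-2019", "01-06-2019", "01-07-2019"),
  Group = c("Saffron", "Green", "Saffron", "Saffron", "Green", "Green"),
  Value = c(62.78, 75.65, 67.89, 54.56, 77.00, 71.22),
  stringsAsFactors = FALSE
)

class(df)         # a data frame reports "data.frame"; a matrix does not
str(df)           # shows whether Group is character or factor
unique(df$Group)  # reveals stray spaces or unexpected spellings
dput(df)          # prints code that recreates df exactly, ready to paste
```

If `unique(df$Group)` shows values that do not match `"Green"` or `"Saffron"` exactly, a filter such as `df$Group == i` can return zero rows.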

Ronak Shah · Answer 1 · 2020-04-30T06:25:51.650

4

Use `split`:

list_data <- split(df, df$Group)

This will give you a list of data frames. If you need separate data frames:

names(list_data) <- paste0("df_", names(list_data))
list2env(list_data, .GlobalEnv)
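
A self-contained sketch of the two steps above, assuming the sample data from the question. Two details are my own choices rather than part of the answer: the names are lower-cased with `tolower()` to get `df_green`/`df_saffron` (without it, `split()` keeps the original capitalisation, giving `df_Green`/`df_Saffron`), and the data frames are written into a fresh environment called `env` instead of `.GlobalEnv` so nothing in the workspace is overwritten.

```r
# Sample data copied from the table in the question.
df <- data.frame(
  Date  = c("01-04-2029", "01-04-2029", "01-05-2019",
            "01-06-2019", "01-06-2019", "01-07-2019"),
  Group = c("Saffron", "Green", "Saffron", "Saffron", "Green", "Green"),
  Value = c(62.78, 75.65, 67.89, 54.56, 77.00, 71.22),
  stringsAsFactors = FALSE
)

# One data frame per distinct value of Group, returned as a named list.
list_data <- split(df, df$Group)

# Prefix the names; tolower() is an addition to match df_green / df_saffron.
names(list_data) <- paste0("df_", tolower(names(list_data)))

# Copy each list element into an environment as its own variable.
# 'env' is a hypothetical scratch environment; use .GlobalEnv to get
# top-level variables as in the answer.
env <- new.env()
list2env(list_data, envir = env)

env$df_saffron
env$df_green
```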

Here is how you can convert `for` loop code to `lapply`.

This is the `for` loop code:

for(i in unique(as.character(df$Group))) {
   nam <- paste("df", i, sep = ".")
   assign(nam, df[df$Group==i,])
   #More code
   #More code
   #More code
}

To change it to `lapply`:

lapply(split(df, df$Group), function(x) {
   #More code
   #More code
   #More code
})

You can in fact also use `by`, which does not require the data to be split beforehand.

by(df, df$Group, function(x) {
    #More code
    #More code
    #More code
})

Instead of accessing the data through df_green and df_saffron in the `for` loop, you can access each group's data as `x` inside `lapply`/`by`.
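
As a rough illustration of the `lapply` and `by` patterns, the sketch below fills in the "More code" placeholder with a simple per-group mean of `Value`. The mean is only a stand-in chosen for this example; any per-group computation (such as fitting a model) would go in the same place. The sample data is copied from the question.

```r
# Sample data copied from the table in the question.
df <- data.frame(
  Date  = c("01-04-2029", "01-04-2029", "01-05-2019",
            "01-06-2019", "01-06-2019", "01-07-2019"),
  Group = c("Saffron", "Green", "Saffron", "Saffron", "Green", "Green"),
  Value = c(62.78, 75.65, 67.89, 54.56, 77.00, 71.22),
  stringsAsFactors = FALSE
)

# lapply over the split data: x is the data frame for one group.
means_lapply <- lapply(split(df, df$Group), function(x) {
  mean(x$Value)  # stand-in for the real per-group work
})

# by() does the splitting itself and calls the function once per group.
means_by <- by(df, df$Group, function(x) {
  mean(x$Value)  # same stand-in computation
})

means_lapply  # named list with one element per group
means_by      # same values, printed group by group
```

Both calls return one result per group, so no `df_green` or `df_saffron` variables are needed.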

edited Apr 30 '20 at 06:25

answered Apr 30 '20 at 06:11

Ronak Shah

Actually, the idea is to perform some operations on each data frame on the fly. Hence, maybe a `for` loop is needed? – pythondumb Apr 30 '20 at 06:12
There are other (better) ways to perform operations on a list of data frames (e.g. `lapply`), but it is difficult to comment without knowing the details of what you want to do with each data frame. – Ronak Shah Apr 30 '20 at 06:15
@pythondumb No, use the split data frames and `lapply` the data transformation. – Rui Barradas Apr 30 '20 at 06:15
@Rui: Can you please elaborate a little? – pythondumb Apr 30 '20 at 06:19
@RonakShah: I am trying to run a time series model on each of the data frames. – pythondumb Apr 30 '20 at 06:20
I have updated the answer to show how you can change `for` loop code to `lapply` or `by`. – Ronak Shah Apr 30 '20 at 06:41
@RonakShah: Thanks. I have already used the one you showed in the second block. But when I try to view `df.Green` or `df.Saffron`, I get `<0 rows> (or 0-length row.names)`. – pythondumb Apr 30 '20 at 06:44
@RonakShah: `class(df)` is `data.frame` – pythondumb Apr 30 '20 at 06:48

score 3 · Accepted Answer · answered Apr 30 '20 at 06:06

This should do it:

for (v in unique(df$Group)){

  tmp <- subset(df, Group == v)
  assign(paste0('df_', tolower(v)), tmp)

}

I always find it easier to create a temporary dataset first rather than squash it all into a single `assign` step.

answered Apr 30 '20 at 06:06

morgan121

I am getting the error `Error in assign(paste0("data.", tolower(v), tmp)) : argument "value" is missing, with no default` – pythondumb Apr 30 '20 at 06:09
Your bracket is in the wrong spot. Use `assign(paste0("data.", tolower(v)), tmp)` – morgan121 Apr 30 '20 at 06:10
Well, that bracket was a typo, sorry. Even with the corrected version, I am not able to get `df.green` or `df.saffron`. – pythondumb Apr 30 '20 at 06:16
Well, there must be something else going on with your data structure then, because it works fine for me, sorry. Please add the `dput()` of your data to the question, along with the exact code you are using (copy/paste it so there are no typos). – morgan121 Apr 30 '20 at 06:21
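
The bracket issue discussed in these comments can be reproduced with a small sketch, assuming the sample data from the question. The `tryCatch()` wrapper and the variable `err` are additions for illustration, so the failing call does not stop the script.

```r
# Sample data copied from the table in the question.
df <- data.frame(
  Date  = c("01-04-2029", "01-04-2029", "01-05-2019",
            "01-06-2019", "01-06-2019", "01-07-2019"),
  Group = c("Saffron", "Green", "Saffron", "Saffron", "Green", "Green"),
  Value = c(62.78, 75.65, 67.89, 54.56, 77.00, 71.22),
  stringsAsFactors = FALSE
)

v   <- "Saffron"
tmp <- subset(df, Group == v)

# Misplaced parenthesis: tmp ends up inside paste0(), so assign()
# receives only a name and no value. The error message is captured
# instead of aborting.
err <- tryCatch(
  assign(paste0("df_", tolower(v), tmp)),
  error = function(e) conditionMessage(e)
)
err

# Correct placement: close paste0() first, then pass tmp as the value.
assign(paste0("df_", tolower(v)), tmp)
df_saffron
```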

score 0 · Answer 3 · answered May 07 '20 at 08:02

As suggested by RonakShah, I have tried the following:

temp <- NULL
for (i in unique(as.character(Group))) {
    nam <- paste("df", i, sep = ".")
    assign(nam, df[df$Group == i, ])
    # more code
    result <- data.frame(Date = dates_all,
                         Group = i,
                         Value = all_values,
                         Derived = der_vals)
}
temp <- result
final <- rbind(temp, result)

But the final data frame looks like this:

      Date           Group       Value     Derived
    01-04-2029      Saffron      62.78      22
    01-04-2029      Saffron      75.65      34.46
    01-05-2019      Saffron      67.89      54
    01-06-2019      Saffron      54.56      78
    01-06-2019      Saffron      77.00      29.85
    01-07-2019      Saffron      71.22      45.67

In other words, only Saffron is repeated as the Group, although the derived values are the correct ones. Can anybody help with this?
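
One possible way to collect per-group results, sketched under assumptions: each group's result is built inside `lapply` and the pieces are stacked once at the end with `do.call(rbind, ...)`, so every row keeps its own `Group` label. The `Derived` column here (value minus the group mean) is purely a placeholder, since the real derivation is not shown in the post, and the sample data is copied from the question.

```r
# Sample data copied from the table in the question.
df <- data.frame(
  Date  = c("01-04-2029", "01-04-2029", "01-05-2019",
            "01-06-2019", "01-06-2019", "01-07-2019"),
  Group = c("Saffron", "Green", "Saffron", "Saffron", "Green", "Green"),
  Value = c(62.78, 75.65, 67.89, 54.56, 77.00, 71.22),
  stringsAsFactors = FALSE
)

# Build one result data frame per group; x holds the rows of one group.
results <- lapply(split(df, df$Group), function(x) {
  data.frame(
    Date    = x$Date,
    Group   = x$Group,                  # taken from the group's own rows
    Value   = x$Value,
    Derived = x$Value - mean(x$Value),  # placeholder for the real derivation
    stringsAsFactors = FALSE
  )
})

# Stack all per-group results into a single data frame.
final <- do.call(rbind, results)
rownames(final) <- NULL
final
```

If a `for` loop is preferred, the equivalent change is to append inside the loop, for example `temp <- rbind(temp, result)`, rather than combining once after the loop has finished.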
