
I have a data frame (df) that looks like this:

  Date           Group       Value
01-04-2029      Saffron      62.78
01-04-2029      Green        75.65
01-05-2019      Saffron      67.89
01-06-2019      Saffron      54.56
01-06-2019      Green        77.00
01-07-2019      Green        71.22

Objective: I want to create two separate data frames based on Group. Essentially, I am looking for the following:

df_saffron: 
    Date           Group       Value
01-04-2029        Saffron      62.78
01-05-2019        Saffron      67.89
01-06-2019        Saffron      54.56

df_green:
   Date           Group        Value
01-04-2029        Green        75.65
01-06-2019        Green        77.00
01-07-2019        Green        71.22

Specifically, if I use the following code snippet (from this thread):

for(i in unique(as.character(df$Group))) {
    nam <- paste("df", i, sep = ".")
    assign(nam, df[df$Group==i,])
    }

I am not getting any dataframe like df.Green or df.Saffron. I mean I am getting

<0 rows> (or 0-length row.names)

I have also taken a look at this SO thread, but I am getting errors.

Error in assign(as.character(v, data %>% filter(data$Group == v), envir = .GlobalEnv)) :
argument "value" is missing, with no default
In addition: Warning message:
In data.matrix(data) : NAs introduced by coercion

I am a novice in R, so any clue about what I am missing would be appreciated.

pythondumb
  • Your code is working for me, and it gives the right `df.Green` and `df.Saffron`. It might help if you turn the `Group` column into strings with `as.character()` first (I guess they are factors). The best way to share data on StackOverflow is to use `dput()`, so we get exactly the same data frame you are looking at. – Bas Apr 30 '20 at 06:10
  • Are you sure you have a data frame and not a matrix? The error message gives a hint. What is `class(df)`? – Ronak Shah Apr 30 '20 at 06:21
  • @RonakShah: The class of the data is dataframe. Sorry I forgot to mention that I had converted the data to `as.data.frame(df)` – pythondumb Apr 30 '20 at 06:46

3 Answers


Use split:

list_data <- split(df, df$Group)

This will give you a list of data frames, one per group. If you need them as separate data frames in the global environment:

names(list_data) <- paste0("df_", names(list_data))
list2env(list_data, .GlobalEnv)
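
As a minimal sketch of the two steps above, assuming a small data frame rebuilt by hand from the sample in the question (the construction of `df` and the `tolower()` call are assumptions added here so that the names come out as `df_green` and `df_saffron`):

```r
# Sample data rebuilt from the question (an assumption; the real df may differ)
df <- data.frame(
  Date  = c("01-04-2029", "01-04-2029", "01-05-2019",
            "01-06-2019", "01-06-2019", "01-07-2019"),
  Group = c("Saffron", "Green", "Saffron", "Saffron", "Green", "Green"),
  Value = c(62.78, 75.65, 67.89, 54.56, 77.00, 71.22),
  stringsAsFactors = FALSE
)

# One data frame per distinct value of Group, returned as a named list
list_data <- split(df, df$Group)

# Prefix and lower-case the names so they read df_green / df_saffron
names(list_data) <- paste0("df_", tolower(names(list_data)))

# Copy each list element into the global environment as its own object
invisible(list2env(list_data, envir = .GlobalEnv))

nrow(df_green)    # 3
nrow(df_saffron)  # 3
```

Keeping the list around is usually more convenient than the separate objects, since it can be looped over directly.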

Here is how you can transform the for loop code to use lapply.

This is the for loop code:

for(i in unique(as.character(df$Group))) {
   nam <- paste("df", i, sep = ".")
   assign(nam, df[df$Group==i,])
   #More code
   #More code
   #More code
}

To change it to lapply:

lapply(split(df, df$Group), function(x) {
   #More code
   #More code
   #More code
})

You can in fact also use by, which does not require the data to be split beforehand.

by(df, df$Group, function(x) {
    #More code
    #More code
    #More code
})

Instead of accessing the data through df_green and df_saffron as in the for loop, you access each group's data through x inside lapply/by.
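
To make the placeholders concrete, here is a small sketch in which the "more code" step is just the mean of Value per group. The sample `df` is rebuilt by hand from the question and the mean is only an illustrative stand-in, not what the asker actually computes:

```r
# Sample data rebuilt from the question (an assumption; the real df may differ)
df <- data.frame(
  Date  = c("01-04-2029", "01-04-2029", "01-05-2019",
            "01-06-2019", "01-06-2019", "01-07-2019"),
  Group = c("Saffron", "Green", "Saffron", "Saffron", "Green", "Green"),
  Value = c(62.78, 75.65, 67.89, 54.56, 77.00, 71.22),
  stringsAsFactors = FALSE
)

# lapply over the split list: x is the sub-data-frame for one group
group_means <- lapply(split(df, df$Group), function(x) {
  mean(x$Value)  # illustrative stand-in for the per-group work
})

# by() does the splitting itself and passes each subset as x
group_means_by <- by(df, df$Group, function(x) mean(x$Value))

names(group_means)  # "Green" "Saffron"
```

Both calls return one result per group, so nothing has to be assigned into the global environment.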

Ronak Shah

This should do it:

for (v in unique(df$Group)){

  tmp <- subset(df, Group == v)
  assign(paste0('df_', tolower(v)), tmp)

}

I always find it easier to create a temporary dataset first rather than squashing everything into a single assign step.

morgan121
  • I am getting error as `Error in assign(paste0("data.", tolower(v), tmp)) : argument "value" is missing, with no default` – pythondumb Apr 30 '20 at 06:09
  • your bracket is in the wrong spot. Use assign(paste0("data.", tolower(v)), tmp) – morgan121 Apr 30 '20 at 06:10
  • Well, that bracket was a typo. Sorry for the same. Even if I use the same, I am not able to get `df.green` or `df.saffron` – pythondumb Apr 30 '20 at 06:16
  • Well, there must be something else going on with your data structure then, because it works fine for me, sorry. Please add the `dput()` of your data to the question, along with the exact code you are using (copy/paste it so there are no typos). – morgan121 Apr 30 '20 at 06:21

As suggested by RonakShah, I have tried the following:

temp <- NULL
for (i in unique(as.character(df$Group))) {
    nam <- paste("df", i, sep = ".")
    assign(nam, df[df$Group==i,])
   # more code
   result <- data.frame(Date = dates_all,
                        Group = i,
                        Value = all_values,
                        Derived = der_vals) 
   }
 temp <-result
 final <-rbind(temp,result)

But the final dataframe looks like

      Date           Group       Value     Derived
    01-04-2029      Saffron      62.78      22
    01-04-2029      Saffron      75.65      34.46
    01-05-2019      Saffron      67.89      54
    01-06-2019      Saffron      54.56      78
    01-06-2019      Saffron      77.00      29.85
    01-07-2019      Saffron      71.22      45.67

In other words, only Saffron is repeated in the Group column, although the derived values are correct. Can anybody help with this?
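
One possible reading of the snippet, judging only from what is shown, is that `result` is overwritten on every iteration and `final` is built once, after the loop has ended, so the per-group labels never get stacked. A minimal sketch that builds one result per group and binds them a single time could look like this. The sample `df` is rebuilt by hand from the question, and `Derived = x$Value * 2` is a purely hypothetical placeholder for the real derived values:

```r
# Sample data rebuilt from the question (an assumption; the real df may differ)
df <- data.frame(
  Date  = c("01-04-2029", "01-04-2029", "01-05-2019",
            "01-06-2019", "01-06-2019", "01-07-2019"),
  Group = c("Saffron", "Green", "Saffron", "Saffron", "Green", "Green"),
  Value = c(62.78, 75.65, 67.89, 54.56, 77.00, 71.22),
  stringsAsFactors = FALSE
)

# Build one result data frame per group; x holds only that group's rows
results <- lapply(split(df, df$Group), function(x) {
  data.frame(
    Date    = x$Date,
    Group   = x$Group,      # taken from the subset, so it stays per-group
    Value   = x$Value,
    Derived = x$Value * 2   # hypothetical placeholder, not the real calculation
  )
})

# Stack all per-group results once, after they have all been computed
final <- do.call(rbind, results)
rownames(final) <- NULL

table(final$Group)
```

Because every column is taken from the per-group subset `x`, the Group label and the values stay aligned row by row.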

pythondumb