0

I have a dataset that has information on power plants that are operating (currently functioning) or planned (will be opened in the future). However, for a few cases, there are power plants with the status "operational & planned". These power plants already exist but will expand at a given date. I need to correct this, by doubling the entries of this outlier, i.e., when a power plant is "operational & planned" I need to keep the current row (which I will rename to operational) and create a new identical row in which the status is planned. <br />

Simply put, what I need to do is, when there is an "operational & planned" power plant, I need to split it into operational and planned by creating a new row with the same data points. I am not being able to do this with ifelse. Could someone suggest an approach, please?



Example<br />

Firm   Status   Production<br />
A     Operating      1000<br />
B     Planned        1030<br />
C     Operating      1000<br />
D     Operating & Planned   1000<br />

Output<br />
Firm   Status   Production<br />
A     Operating      1000<br />
B     Planned        1030<br />
C     Operating      1000<br />
D     Operating   1000<br />
D     Planned   1000<br />
Nad Pat
  • 3,129
  • 3
  • 10
  • 20
PCRL
  • 111
  • 5
  • Welcome to SO! Please provide an example dataset and if possible an expected output. You can take look [here](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) to improve your question. – Maël Mar 11 '22 at 09:51
  • @Maël , updated – PCRL Mar 11 '22 at 09:58

2 Answers2

2

You can use the package splitstackshape. You can use this code:

First your data:

df <- data.frame(Firm = c("A", "B", "C", "D"),
                 Status = c("Operating", "Planned", "Operating", " Operating & Planned"),
                 Production = c(1000, 1030, 1000, 1000))

Output:

  Firm               Status Production
1    A            Operating       1000
2    B              Planned       1030
3    C            Operating       1000
4    D  Operating & Planned       1000

Next use this code to split the row in two rows with same value:

library(splitstackshape)
df2 <- cSplit(df, "Status", sep = "&", direction = "long")
df2

Output:

   Firm    Status Production
1:    A Operating       1000
2:    B   Planned       1030
3:    C Operating       1000
4:    D Operating       1000
5:    D   Planned       1000
Quinten
  • 35,235
  • 5
  • 20
  • 53
  • I was not able to install that package, maybe my corporate computer is blocking it. But thanks a lot anyway. – PCRL Mar 11 '22 at 10:32
1

You're looking for tidyr::separate_rows:

library(tidyr)
separate_rows(df, Status, sep = " & ")

Using @Quinten's dataset:

df <- data.frame(Firm = c("A", "B", "C", "D"),
                 Status = c("Operating", "Planned", "Operating", "Operating & Planned"),
                 Production = c(1000, 1030, 1000, 1000))
df %>% 
  separate_rows(Status, sep = " & ")

# A tibble: 5 x 3
  Firm  Status    Production
  <chr> <chr>          <dbl>
1 A     Operating       1000
2 B     Planned         1030
3 C     Operating       1000
4 D     Operating       1000
5 D     Planned         1000
Maël
  • 45,206
  • 3
  • 29
  • 67