-1

I have a dataframe of 5 columns: address (the location of a bus stop), line1, line2, line3 and line4 (the buses lines that pass through that stop)

Address     line1       line2      line3      line4 
string1     10Ida        12Reg     13Ida      15Reg
string2     19Ida        12Reg      NA         NA
string3     10Ida        12Reg     13Ida       NA
string4     24Reg        12Reg      NA         NA
string5     14Ida         NA        NA         NA 
string6     10Ida        12Reg     13Ida      15Reg

What I want to do is to transform this dataframe into some kind of "long" format with only two variables: Address and line. I mean, for the "string1" address for example, instead of it appearing only one time (only one row), I want it to appear 4 times (in four rows, since there are four line buses that go through that stop) and in each row I want it to have in the "line" column one of the lines that goes through it. Then the same for the "string2" address stop, but since this stop has only two lines that goes through it, I want it to be repeated only two times, the first one with "19Ida" in the line column and the second one with "12Reg" in the line column.

So what I want as a result is something like this:

Address      line       
string1     10Ida        
string1     12Reg
string1     13Ida
string1     15Reg
string2     19Ida        
string2     12Reg        
string3     10Ida       
string3     12Reg         
string3     13Ida   
string4     24Reg
string4     12Reg
string5     14Ida
string6     10Ida
string6     12Reg
string6     13Ida
string6     15Reg     

How can I do these ? Note: the real dataframe has over 1.500 diferent addresses and some addresses has more than 5 lines that goes through it so I need an efficient method.

Thank you.

zx8754
  • 52,746
  • 12
  • 114
  • 209
TomasC8
  • 161
  • 1
  • 7

1 Answers1

1

We can use pivot_longer

library(tidyr)
pivot_longer(df1, cols = -Address, values_to = 'line', values_drop_na = TRUE)

Or with melt

library(data.table)
melt(setDT(df1), id.var = 'Address', value.name = 'line', na.rm = TRUE)
akrun
  • 874,273
  • 37
  • 540
  • 662