I have a dataframe of 5 columns: address (the location of a bus stop), line1, line2, line3 and line4 (the buses lines that pass through that stop)
Address line1 line2 line3 line4
string1 10Ida 12Reg 13Ida 15Reg
string2 19Ida 12Reg NA NA
string3 10Ida 12Reg 13Ida NA
string4 24Reg 12Reg NA NA
string5 14Ida NA NA NA
string6 10Ida 12Reg 13Ida 15Reg
What I want to do is to transform this dataframe into some kind of "long" format with only two variables: Address and line. I mean, for the "string1" address for example, instead of it appearing only one time (only one row), I want it to appear 4 times (in four rows, since there are four line buses that go through that stop) and in each row I want it to have in the "line" column one of the lines that goes through it. Then the same for the "string2" address stop, but since this stop has only two lines that goes through it, I want it to be repeated only two times, the first one with "19Ida" in the line column and the second one with "12Reg" in the line column.
So what I want as a result is something like this:
Address line
string1 10Ida
string1 12Reg
string1 13Ida
string1 15Reg
string2 19Ida
string2 12Reg
string3 10Ida
string3 12Reg
string3 13Ida
string4 24Reg
string4 12Reg
string5 14Ida
string6 10Ida
string6 12Reg
string6 13Ida
string6 15Reg
How can I do these ? Note: the real dataframe has over 1.500 diferent addresses and some addresses has more than 5 lines that goes through it so I need an efficient method.
Thank you.