I have a data file (.csv) that looks like this (values only):
| First Column | Second Column | Third Column | Fourth Column |
|---|---|---|---|
| Yes | 20 | | 0.35 |
| No | 6 | happy | 4.01 |
| Yes | 13 | Okay | 3.1 |
| | 2 | | 1 |
| No | 9 | Hello world | 0.5 |
| Yes | 50 | Puppies | |
Now I want to append the values from the second, third, and fourth columns to the first column, so that the final output looks like the table below. Basically, I just want to stack the values from each column on top of one another. Note that there are NULLs, and they should be kept.
| First Column |
|---|
| Yes |
| No |
| Yes |
| |
| No |
| Yes |
| 20 |
| 6 |
| 13 |
| 2 |
| 9 |
| 50 |
| |
| happy |
| Okay |
| |
| Hello world |
| Puppies |
| 0.35 |
| 4.01 |
| 3.1 |
| 1 |
| 0.5 |
| |
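For reference, the target layout above might be expressible without an explicit loop. This is only a sketch under assumptions: the file has no header row, every value is read as text via `dtype=str` so that `1` stays `1`, the inline `CSV_TEXT` stands in for the real file, and the output column name is simply taken from the table above. It relies on `DataFrame.melt`, which stacks the columns one after another and keeps missing values as `NaN`.

```python
import io

import pandas as pd

# Inline stand-in for the real .csv file (assumption: no header row).
CSV_TEXT = """Yes,20,,0.35
No,6,happy,4.01
Yes,13,Okay,3.1
,2,,1
No,9,Hello world,0.5
Yes,50,Puppies,
"""

# dtype=str keeps values exactly as written; empty fields become NaN.
df = pd.read_csv(io.StringIO(CSV_TEXT), header=None, dtype=str)

# melt() unpivots every column in order (column 0, then 1, ...),
# so the "value" column is the column-by-column stack, NaNs included.
melted = df.melt(value_name="First Column")[["First Column"]]

print(melted)
```

With a real file, `io.StringIO(CSV_TEXT)` would be replaced by the file path.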
I'd like to do this in Python with a pandas DataFrame, iterating over the columns, because the data file has a few hundred columns. My initial thoughts on the logic are:
- Count the total number of columns `N`.
- Count the total number of rows `R`.
- Copy `R` rows of values from the `[a + 1]`-th column, where `a` is 1 initially and increments by 1.
- Append the copied values to the end of the first column.
- Iterate this `[N - 1]` times.
- Drop all columns except the first column.
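
The steps above could be sketched roughly as follows. This is a minimal sketch, not a tested solution for the real file: it assumes the file has no header row, reads everything as text with `dtype=str`, uses an inline `CSV_TEXT` in place of the actual .csv, and collects the columns in a list before a single `pd.concat` instead of appending to the first column on every pass. The names `n_cols`, `n_rows`, `pieces`, and `stacked` are placeholders.

```python
import io

import pandas as pd

# Inline stand-in for the real .csv file (assumption: no header row).
CSV_TEXT = """Yes,20,,0.35
No,6,happy,4.01
Yes,13,Okay,3.1
,2,,1
No,9,Hello world,0.5
Yes,50,Puppies,
"""

# dtype=str keeps values exactly as written; empty fields become NaN.
df = pd.read_csv(io.StringIO(CSV_TEXT), header=None, dtype=str)

n_rows, n_cols = df.shape  # R and N from the plan above

# Start with the first column, then visit the remaining N - 1 columns.
pieces = [df.iloc[:, 0]]
for a in range(1, n_cols):
    # df.iloc[:, a] is the (a + 1)-th column, all R rows of it.
    pieces.append(df.iloc[:, a])

# One concat at the end; ignore_index=True renumbers rows 0..N*R-1.
stacked = pd.concat(pieces, ignore_index=True).to_frame(name="First Column")

print(stacked)
```

Only the stacked column is kept, so no separate step is needed to drop the other columns, and the missing values are carried through as `NaN`.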
If you can help me with the core coding part or if you have any better suggestions, it would be greatly appreciated.