
I have a data frame like this:

0   2   5   12.0    1861.0  2230.0  NaN NaN     NaN     NaN     NaN     IMG_0083.JPG
1   2   5   9.0     1201.0  1500.0  1.0 1156.0  1612.0  1584.0  1935.0  IMG_0124.JPG
2   2   5   3.0     1159.0  1391.0  2.0 2957.0  3249.0  1317.0  1588.0  IMG_0352.JPG

I need to export it to CSV in the following format:

0   2   5   12.0    1861.0  2230.0  IMG_0083.JPG
1   2   5   9.0     1201.0  1500.0  1.0 1156.0  1612.0  1584.0  1935.0  IMG_0124.JPG
2   2   5   3.0     1159.0  1391.0  2.0 2957.0  3249.0  1317.0  1588.0  IMG_0352.JPG

I don't just want to replace NaN with a blank or some other value; I want to skip the NaN values entirely, so the first row above ends up shorter than the others. It could be done in the data frame if you know a way, but really it only needs to happen on export to CSV. Any help?

EDIT:

For those curious, I'm trying to get my data into .lst format so I can convert it to .rec format using MXNet. I'm trying to use this as a guide to the formatting. I'm trying to train on this data with AWS SageMaker. I'm getting all kinds of errors, see my question here if you know more about that particular topic. I'm guessing on all this, but per the docs I linked to, I think I need the format in my question above.

Frankie

1 Answer

Here is one way: sort each row so the non-null values come first, then fill the trailing NaN with empty strings.

pd.DataFrame(df.apply(lambda x: sorted(x, key=pd.isnull), axis=1).tolist()).fillna('')
   0   1   2     3       4   ...    7     8     9     10            11
0   0   2   5  12.0  1861.0  ...                                      
1   1   2   5   9.0  1201.0  ...  1156  1612  1584  1935  IMG_0124.JPG
2   2   2   5   3.0  1159.0  ...  2957  3249  1317  1588  IMG_0352.JPG
[3 rows x 12 columns]

Or we can use justify
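The justify helper itself is not shown in this answer. As a rough stand-in only, and not necessarily the function meant here, a left-justify step could be sketched with NumPy as below. The name left_justify and the small example frame are hypothetical, made up for illustration.

```python
import numpy as np
import pandas as pd


def left_justify(frame):
    """Hypothetical helper: move non-null values in each row to the left.

    The relative order of the non-null values is preserved; NaN cells
    end up at the right-hand side of the row.
    """
    values = frame.to_numpy(dtype=object)
    is_null = pd.isna(values)
    # A stable sort on the boolean mask puts False (non-null) before True
    # (null) without reordering the non-null values among themselves.
    order = np.argsort(is_null, axis=1, kind="stable")
    shifted = np.take_along_axis(values, order, axis=1)
    return pd.DataFrame(shifted, index=frame.index)


# Tiny made-up example, not the data from the question.
example = pd.DataFrame([[1.0, np.nan, "a"], [np.nan, 2.0, "b"]])
result = left_justify(example)
```

The result still has the same number of columns in every row, so trailing NaN would need to be blanked out (for example with fillna('')) before export.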

BENY
  • I think OP wants an uneven output file. For example, first line with only 5 values, next line with 10, next with 8, etc etc depending on how many non-na it has. – rafaelc Jul 18 '19 at 01:08
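If an uneven file like the one described in the comment is what's needed, one possible approach is to skip DataFrame.to_csv and write the rows with the standard csv module, dropping the NaN cells row by row. This is only a sketch under assumptions: the frame below is rebuilt by hand from the question's sample, the leading 0/1/2 is treated as an ordinary column, and write_ragged is a made-up helper name. The comma delimiter is also an assumption; a different delimiter can be passed if the target format needs one.

```python
import csv
import io

import numpy as np
import pandas as pd

# Sample frame rebuilt from the question (no column names, as in the post).
df = pd.DataFrame(
    [
        [0, 2, 5, 12.0, 1861.0, 2230.0, np.nan, np.nan, np.nan, np.nan, np.nan, "IMG_0083.JPG"],
        [1, 2, 5, 9.0, 1201.0, 1500.0, 1.0, 1156.0, 1612.0, 1584.0, 1935.0, "IMG_0124.JPG"],
        [2, 2, 5, 3.0, 1159.0, 1391.0, 2.0, 2957.0, 3249.0, 1317.0, 1588.0, "IMG_0352.JPG"],
    ]
)


def write_ragged(frame, handle, delimiter=","):
    """Hypothetical helper: write each row without its NaN cells.

    Rows may therefore have different numbers of fields.
    """
    writer = csv.writer(handle, delimiter=delimiter, lineterminator="\n")
    for row in frame.itertuples(index=False, name=None):
        writer.writerow([value for value in row if not pd.isna(value)])


# Write to an in-memory buffer here; an open file handle works the same way.
buffer = io.StringIO()
write_ragged(df, buffer)
lines = buffer.getvalue().splitlines()
```

To write to disk instead, open the file with open(path, "w", newline="") and pass that handle to write_ragged.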