
I have a DataFrame d1 with strings and missing values, such as

import numpy as np
import pandas as pd

d1 = pd.DataFrame([["A", "B", "C"],
                   ["D", np.nan, "F"],
                   ["G", "H", "I"],],
                  columns=[1, 2, 3])


whose columns I would like to aggregate into a single-row DataFrame d2:

     1   2    3
0  ADG  BH  CFI

Following suggestions in a previous post, I tried the following code:

d2 = d1.agg(''.join).to_frame().T

However, because one of the values in d1 is missing (and is therefore a float), I got the following error:

TypeError: sequence item 1: expected str instance, float found

Would you know how to change missing values in DataFrames to another data type such as string?

blackraven
Incognito
  • Does this answer your question? [Pandas dataframe fillna() only some columns in place](https://stackoverflow.com/questions/38134012/pandas-dataframe-fillna-only-some-columns-in-place) – Vladimir Fokow Aug 27 '22 at 00:04

3 Answers


You can fill the missing value with an empty string:

d1.fillna('')

So the overall code becomes

d1.fillna('').agg(''.join).to_frame().T
     1   2    3
0  ADG  BH  CFI
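
A self-contained version of the above, as a minimal sketch. It relies on `fillna('')` returning a new DataFrame and leaving `d1` unchanged, so the result has to be chained or assigned. The imports, the intermediate name `filled`, and the final check are additions for illustration only.

```python
import numpy as np
import pandas as pd

d1 = pd.DataFrame([["A", "B", "C"],
                   ["D", np.nan, "F"],
                   ["G", "H", "I"]],
                  columns=[1, 2, 3])

# fillna('') returns a copy in which NaN is replaced by an empty string
filled = d1.fillna('')

# join the strings down each column, then turn the resulting Series into one row
d2 = filled.agg(''.join).to_frame().T
print(d2)

# d1 itself still holds its one missing value
print(d1.isna().sum().sum())
```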
Vladimir Fokow

You can replace the NaN values with an empty string (''):

d1 = pd.DataFrame([["A", "B", "C"],
                   ["D", np.nan, "F"],
                   ["G", "H", "I"],],
                  columns=['1', '2', '3'])
d1.replace(np.nan, '', inplace=True)
# join down each column (the default axis=0); axis=1 would join across each row instead
d2 = d1.agg(''.join).to_frame().T
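
Note that the `inplace=True` call above overwrites `d1`. If the original frame should keep its missing values, one possible variant (a sketch, not something proposed in this thread) is to drop them only while joining, using `Series.dropna` inside a lambda:

```python
import numpy as np
import pandas as pd

d1 = pd.DataFrame([["A", "B", "C"],
                   ["D", np.nan, "F"],
                   ["G", "H", "I"]],
                  columns=['1', '2', '3'])

# Drop NaN from each column just before joining; d1 is not modified
d2 = d1.agg(lambda col: ''.join(col.dropna())).to_frame().T
print(d2)
```

This gives the same single-row result as filling with an empty string, since an empty string contributes nothing to the joined text.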

The null value is causing the error, so fill it with an empty string. You could try this:

d2 = pd.DataFrame(d1.fillna('').agg(''.join)).T
print(d2)

     1   2    3
0  ADG  BH  CFI
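
To see why the null value triggers the error, here is a small sketch that reproduces it without pandas: `np.nan` is a plain Python float, and `str.join` only accepts strings. The list below mirrors column 2 of `d1`, and the variable name `message` is just for illustration.

```python
import numpy as np

# NaN is represented as a float, not as a string
print(type(np.nan))

try:
    ''.join(["B", np.nan, "H"])
except TypeError as exc:
    message = str(exc)
    print(message)
```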
danPho