6

I am trying to export a Pandas dataframe to Excel where all columns are of text format. By default, the pandas.to_excel() function lets Excel decide the data type. Exporting a column with [1,2,'w'] results in the cells containing 1 and 2 to be numeric, and the cell containing 'w' to be text. I'd like all rows in the column to be text (i.e. ['1','2','w']).

I was able to solve the problem by assigning the column I need to be text using the .astype(str). However, if the data is large, I am concerned that I will run into performance issues. If I understand correctly, df[col] = df[col].astype(str) makes a copy of the data, which is not efficient.

import pandas as pd

df = pd.DataFrame({'a':[1,2,'w'], 'b':['x','y','z']})

df['a'] = df['a'].astype(str)

df.to_excel(r'c:\tmp\test.xlsx')

Is there a more efficient way to do this?

I searched SO several times and didn't see anything on this. Forgive me if this has been answered before. This is my first post, and I'm really happy to participate in this cool forum.

Edit: Thanks to the comments I've received, I see that Converting a series of ints to strings - Why is apply much faster than astype? gives me other options to astype(str). This is really useful. I also wanted to know if astype(str) was inefficient because it made a copy of the data, which I now see that it does not.

mherzog
  • 1,085
  • 1
  • 12
  • 24
  • Possible duplicate of [Converting a series of ints to strings - Why is apply much faster than astype?](https://stackoverflow.com/questions/49371629/converting-a-series-of-ints-to-strings-why-is-apply-much-faster-than-astype) – help-ukraine-now Aug 23 '19 at 21:11
  • You could use `copy=False` in the `DataFrame.astype()` function. Then just loop over all your columns with this. Not very elegant, but it does get the job done. – Todd Burus Aug 23 '19 at 21:14
  • The cited "Converting a series of ints to strings" is very helpful. I appreciate all the comments. – mherzog Aug 25 '19 at 02:33

1 Answers1

3

I don't think that you'll not have performance issues with that approach since data is not copied but replaced. You may also convert the whole dataframe into string type using

df = df.astype(str)
aesari
  • 207
  • 1
  • 6
  • I did not realize that this command does a replace, not a copy. Thank you for helping me understand this! – mherzog Aug 25 '19 at 02:32