25

I see that Pandas has read_fwf, but does it have something like DataFrame.to_fwf? I'm looking for support for field width, numerical precision, and string justification. It seems that DataFrame.to_csv doesn't do this. numpy.savetxt does, but I wouldn't want to do:

numpy.savetxt('myfile.txt', mydataframe.to_records(), fmt='some format')

That just seems wrong. Your ideas are much appreciated.

jkmacc
  • 6,125
  • 3
  • 30
  • 27
  • 5
    take a look at the `to_string` method so see if you can do what you want. – zach May 13 '13 at 01:02
  • This looks close. It seems that I'd have to give a formatter function for each column if any two float or string columns had different formats. It would do the trick, it just looks a little unwieldy. I'd hoped I was missing something. Thanks! – jkmacc May 13 '13 at 18:35
  • 1
    pandas *df.to_csv* has a *sep =" "* parameter that changes the comma to anything else in this case a space or empty string. That in conjunction with the formatter of the method should do it. – Joop Jun 14 '13 at 10:52
  • 1
    @Joop Actually with the `df.to_csv()` method using empty string as delimiter gives the error `TypeError: delimiter must be set`. – pbreach Jan 16 '15 at 21:48
  • 1
    true.. passing empty string to method would create a mess do ignore my reference to empty string. maybe trying pandas "to_string" method would help. if has formatters parameter that is pretty good – Joop Feb 02 '15 at 11:00

7 Answers7

26

Until someone implements this in pandas, you can use the tabulate package:

import pandas as pd
from tabulate import tabulate

def to_fwf(df, fname):
    content = tabulate(df.values.tolist(), list(df.columns), tablefmt="plain")
    open(fname, "w").write(content)

pd.DataFrame.to_fwf = to_fwf
Matt Kramer
  • 724
  • 9
  • 9
9

For custom format for each column you can set format for whole line. fmt param provides formatting for each line

with open('output.dat') as ofile:
     fmt = '%.0f %02.0f %4.1f %3.0f %4.0f %4.1f %4.0f %4.1f %4.0f'
     np.savetxt(ofile, df.values, fmt=fmt)
Brian Burns
  • 20,575
  • 8
  • 83
  • 77
Amir Uteuov
  • 116
  • 1
  • 3
7

Python, Pandas : write content of DataFrame into text File

The question aboves answer helped me. It is not the best, but until to_fwf exists this will do the trick for me...

np.savetxt(r'c:\data\np.txt', df.values, fmt='%d')

or

np.savetxt(r'c:\data\np.txt', df.values, fmt='%10.5f')
Community
  • 1
  • 1
brandog
  • 1,497
  • 5
  • 20
  • 28
  • IMO this is better than `tabulate` since numpy is included with pandas so doesn't require an additional library – maxymoo Mar 09 '17 at 03:55
7

pandas.DataFrame.to_string() is all you need. The only trick is how to manage the index.

# Write
# df.reset_index(inplace=True)  # uncomment if the index matters
df.to_string(filepath, index=False)

# Read
df = pd.read_fwf(filepath)
# df.set_index(index_names, inplace=True)  # uncomment if the index matters

If the index is a pandas.Index that has no name, reset_index() should assign it to column "index". If it is a pandas.MultiIndex that has no names, it should be assigned to columns ["level_0", "level_1", ...].

Alexandre Huat
  • 806
  • 10
  • 16
  • 1
    Note that `Dataframe.to_string()` has no option to remove space between output columns. – Jaime M. Feb 23 '23 at 11:58
  • @JaimeM. I’m sorry but I don’t understand how it relates to the question. – Alexandre Huat Feb 24 '23 at 13:52
  • 1
    `Dataframe.to_string()` is not able to write an string output without spaces between columns so is not valid to get a fixed-width file row. For example, when the column length is 1 character and left aligned, `to_string()` will output: `x yyy...` but you want `xyyy...`. – Jaime M. Feb 27 '23 at 18:04
  • Oh, okay. I haven’t thought about that use case because it’s very rare. When someone exports a table in CSV, TSV, Excel, fixed-width, or any other format, columns separators are usually expected. It makes the file simpler to read and process. Yet, if all columns have 1 character, we can remove the spaces with `df.to_string().replace(" ", "")`, and then write that string into a text file. – Alexandre Huat Mar 01 '23 at 14:51
  • 1
    Fixed-width format doesn't use columns separator. Because each column has fixed width, so you know where is the begining and the end of each column. Padding with white spaces is used between the value and the limit column, so you can use `str.replace()`. – Jaime M. Mar 02 '23 at 14:42
  • Check [this answer](https://stackoverflow.com/a/9040531/403616) for an overview of the fixed-widhth format advantages. – Jaime M. Mar 02 '23 at 14:47
  • "Fixed-width format doesn't use columns separator": if you want to visualise a table in FWF, it's easier with spaces, even if all columns have only one character. – Alexandre Huat Mar 03 '23 at 12:37
4

I'm sure you found a workaround for this issue but for anyone else who is curious... If you write the DF into a list, you can write it out to a file by giving the 'format as a string'.format(list indices) eg:

df=df.fillna('')
outF = 'output.txt'      
dbOut = open(temp, 'w')
v = df.values.T.tolist()        
for i in range(0,dfRows):       
    dbOut.write(( \
    '{:7.2f}{:>6.2f}{:>2.0f}{:>4.0f}{:>5.0f}{:6.2f}{:6.2f}{:6.2f}{:6.1f {:>15}{:>60}'\
    .format(v[0][i],v[1][i],v[2][i],v[3][i],v[4][i],v[5][i],v[6][i],v[7][i],v[8][i],\
    v[9][i],v[10][i]) ))
    dbOut.write("\n")
dbOut.close

Just make sure to match up each index with the correct format :)

Hope that helps!

leon yin
  • 829
  • 3
  • 10
  • 23
1

found a very simple solution! (Python). In the code snapped I am trying to write a DataFrame to a positional File. "finalDataFrame.values.tolist()" will return u a list in which each row of the DataFrame is turn into an another list just a [['Camry',2019,'Toyota'],['Mustang','2016','Ford']]. after that with the help of for loop and if statement i am trying to set its fix length. rest is obvious!

 with open (FilePath,'w') as f:
    for i in finalDataFrame.values.tolist():
        widths=(0,0,0,0,0,0,0)
        if i[2] == 'nan':
            i[2]=''
            for h in range(7):
                i[2]= i[2] + ' '
        else:
            x=7-len(str(i[2]))
            a=''
            for k in range(x):
               a=a+' '
            i[2]=str(i[2])+a

        if i[3] == '':
            i[3]=''
            for h in range(25):
                i[3]=i[3]+' '
        else:
            x = 25 - len(i[3])
            print(x)
            a = ''
            for k in range(x):
                a = a + ' '
            print(a)
            i[3] = i[3] + a


        i[4] = str(i[4])[:10]

        q="".join("%*s" % i for i in zip(widths, i))
        f.write(q+'\n')
0

Based on others' answer, here is the snippet I wrote, not the best in coding and performance:

import pandas as pd
import pickle
import numpy as np
from tabulate import tabulate


left_align_gen = lambda length, value: eval(r"'{:<<<length>>}'.format('''<<value>>'''[0:<<length>>])".replace('<<length>>', str(length)).replace('<<value>>', str(value)))
right_align_gen = lambda length, value: eval(r"'{:><<length>>}'.format('''<<value>>'''[0:<<length>>])".replace('<<length>>', str(length)).replace('<<value>>', str(value)))

# df = pd.read_pickle("dummy.pkl")
with open("df.pkl", 'rb') as f:
    df = pickle.load(f)

# field width defines here, width of each field
widths=(22, 255, 14, 255, 14, 255, 255, 255, 255, 255, 255, 22, 255, 22, 255, 255, 255, 22, 14, 14, 255, 255, 255, 2, )

# format datetime
df['CREATED_DATE'] = df['CREATED_DATE'].apply(lambda x: x.to_pydatetime().strftime('%Y%m%d%H%M%S'))
df['LAST_MODIFIED_DATE'] = df['LAST_MODIFIED_DATE'].apply(lambda x: x.to_pydatetime().strftime('%Y%m%d%H%M%S'))
df['TERMS_ACCEPTED_DATE'] = df['TERMS_ACCEPTED_DATE'].apply(lambda x: x.to_pydatetime().strftime('%Y%m%d%H%M%S'))
df['PRIVACY_ACCEPTED_DATE'] = df['PRIVACY_ACCEPTED_DATE'].apply(lambda x: x.to_pydatetime().strftime('%Y%m%d%H%M%S'))


# print(type(df.iloc[0]['CREATED_DATE']))
# print(df.iloc[0])
record_line_list = []
# for row in df.iloc[:10].itertuples():
for row in [tuple(x) for x in df.to_records(index=False)]:
    record_line_list.append("".join(left_align_gen(length, value) for length, value in zip(widths, row)))

with open('output.txt', 'w') as f:
    f.write('\n'.join(record_line_list))

Github gist

Chen Du
  • 130
  • 2
  • 10