I have a dataset that I extracted from ABAQUS as an '.rpt' file. I've previously used Excel to sort the data, which can be quite tedious, although the Text to Columns button is incredibly helpful.

My issue: I am trying to use pandas to clean my data. I have found a way to import the data into Python using:

import pandas as pd

df = pd.read_csv('.rpt', delim_whitespace=True)

This runs, but the result is 3000 rows x 1 column. Is there a way to split the data across multiple columns, like Text to Columns does in Excel? Thanks, I'm fairly new to this, so I would massively appreciate the help.

apang
  • Have you tried using the `read_fwf()` function instead of `read_csv()`? This answer explains how you can load a .rpt file with `read_fwf()`: https://stackoverflow.com/questions/42208832/how-to-load-space-separate-file-into-pandas-dataframe – sagar1025 Mar 17 '20 at 22:42
  • Is your data a valid csv file? – Stuart Mar 17 '20 at 22:43
  • What does an example row look like? It seems like there isn't any whitespace in the row to use as a delimiter between columns. – Code-Apprentice Mar 17 '20 at 22:45
  • Does this answer your question? [How to load space separate file into pandas dataframe?](https://stackoverflow.com/questions/42208832/how-to-load-space-separate-file-into-pandas-dataframe) – Code-Apprentice Mar 17 '20 at 22:46
  • @sagar1025 No, I haven't used that before, but I will give it a try. Cheers. An example row looks like this: '1007 0. 0. -0. NaN NaN NaN NaN NaN' – apang Mar 17 '20 at 22:48
  • I just tried the `read_fwf()` function. It does pretty much the same thing: it puts everything into one column, i.e. leaving 3000 rows x 1 column. I guess my question is really: can I split this data, which currently exists in one column, across several columns? Just like the button in Excel. – apang Mar 17 '20 at 22:52
  • A few people here have asked this as well, but can you please provide an example of what a row looks like? What's the output if you print `df.head()`? – sagar1025 Mar 17 '20 at 23:00
  • @sagar1025 I posted an example row above ^. – apang Mar 17 '20 at 23:24

1 Answer


Take the simple pandas DataFrame below, with 2 rows and 1 column:

import pandas as pd

df = pd.DataFrame({'A': ['A1, A2, A3', 'B1, B2, B3']})
df

The following code splits the column into three separate columns, like 'Text to Columns' in Excel.

# split on comma + space so the pieces carry no leading whitespace
df['A'].str.split(', ', expand=True)

The output is a pandas DataFrame with 2 rows and 3 columns.

In this case the values are separated by commas; for your file, you could try the same call with the appropriate delimiter.
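For whitespace-separated rows like the one quoted in the comments, the same idea might look like the sketch below. It assumes the data really did load as a single column of strings; the column name `raw` and the second row are made up for illustration, and only the first row comes from the question. Calling `str.split` without a separator splits on runs of whitespace.

```python
import pandas as pd

# Hypothetical stand-in for the one-column frame described in the question.
# The first row is the example row from the comments; the second is invented.
df = pd.DataFrame({'raw': ['1007 0. 0. -0. NaN NaN NaN NaN NaN',
                           '1008 1.5 0. -2. NaN NaN NaN NaN NaN']})

# No separator given, so the split happens on runs of whitespace.
parts = df['raw'].str.strip().str.split(expand=True)

# The pieces are still strings; convert each column to numbers,
# turning anything unparseable into NaN.
parts = parts.apply(pd.to_numeric, errors='coerce')

print(parts.shape)
```

If the real file has header or footer lines around the table, those rows would probably need to be dropped before or after the split.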