I'm relatively new to Python and am trying to split some data from a CSV file. My data is structured as follows:

Time| Signature
--------------------
0   | Class1#Method1
1   | Class4#Method5
2   | Class5# <--note that Class 5 has no method

What I want to accomplish is to transform the data set so that it becomes:

Time| Class  | Method
--------------------
0   | Class1 | Method1
1   | Class4 | Method5

Class 5 is removed in the splitting process since it has no method.

I've tried iterating over the whole dataset. That works, but it is very slow when dealing with a 5 GB CSV file. Does anyone have a faster approach? Speed is all that counts.

S3DEV
  • We'll need a bit more explanation: are you using pandas to deal with your CSV rows? Is your Signature attribute a string like "Class1#Method1" that you first have to separate, or has the separation already been done some other way? – Adept Sep 03 '20 at 09:23
  • Yes, I use a pandas DataFrame to deal with the data. My Signature attribute is a string like "Class1#Method1". What I want to accomplish is to split Class1#Method1 into Class1 and Method1 (so the delimiter is #) and discard signatures with no method @BeamsAdept –  Sep 03 '20 at 09:30
  • Does this answer your question? [splitting a column by delimiter pandas python](https://stackoverflow.com/questions/37333299/splitting-a-column-by-delimiter-pandas-python) – Stas Buzuluk Sep 03 '20 at 10:44

1 Answer

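You can probably use something like `df[['Class','Method']] = df['Signature'].str.split('#', expand=True)`. This splits the whole column at once instead of looping over rows. Note that a signature with no method, such as `Class5#`, ends up with an empty string in `Method`, so you would still need to filter those rows out afterwards.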

(from [splitting a column by delimiter pandas python](https://stackoverflow.com/questions/37333299/splitting-a-column-by-delimiter-pandas-python))
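For a 5 GB file it may also help to read the CSV in chunks, so the whole thing never has to sit in memory at once. Here is a rough sketch of the split, the filter for empty methods, and chunked reading. It assumes a comma-separated file with `Time` and `Signature` columns; the helper names `split_signature` and `split_csv_in_chunks` and the chunk size are made up for illustration, and I have not benchmarked it on a file that large.

```python
import io

import pandas as pd


def split_signature(df: pd.DataFrame) -> pd.DataFrame:
    """Split 'Signature' into 'Class' and 'Method'; drop rows with no method.

    Hypothetical helper: the column names are assumed from the question.
    """
    # Vectorised split on the first '#'. reindex guarantees columns 0 and 1
    # exist even if no row in this chunk contains a '#'.
    parts = (
        df["Signature"]
        .str.split("#", n=1, expand=True)
        .reindex(columns=[0, 1])
    )
    out = df.drop(columns="Signature")
    out["Class"] = parts[0]
    out["Method"] = parts[1]
    # "Class5#" gives an empty string; a signature without '#' gives a missing value.
    keep = parts[1].notna() & (parts[1] != "")
    return out[keep].reset_index(drop=True)


def split_csv_in_chunks(source, chunksize=1_000_000):
    """Read the CSV piece by piece and split each piece (chunk size is a guess)."""
    chunks = pd.read_csv(source, chunksize=chunksize)
    return pd.concat([split_signature(chunk) for chunk in chunks], ignore_index=True)


# Small in-memory stand-in for the real file, using the rows from the question.
sample = io.StringIO(
    "Time,Signature\n"
    "0,Class1#Method1\n"
    "1,Class4#Method5\n"
    "2,Class5#\n"
)
result = split_csv_in_chunks(sample, chunksize=2)
print(result)
```

For the real file you would pass the path instead of the `StringIO` object. If even the filtered result is too big for memory, each processed chunk could be appended to an output file rather than concatenated.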