I'm trying to create a dataframe from the following code:

import pandas as pd
from io import StringIO

df = """
 b_id          Rejected                   Remediation                        user
 366           NaN                        38 days 22:05:06.807430            Test
 367           0 days 00:00:05.285239     NaN                                Test
 368           NaN                        NaN                                Test
 371           NaN                        NaN                                Test
 378           NaN                        451 days 14:59:28.830482           Test
 384           28 days 21:05:16.141263    0 days 00:00:44.999706             Test

"""
df= pd.read_csv(StringIO(df.strip()), sep='|')
df.set_index("b_id", inplace = True)

But I get this error:

"None of ['b_id'] are in the columns"

Can anyone help?

William

1 Answer

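Your data contains no | characters, so with sep='|' each line is read as one single column and no separate b_id column exists. Change the separator in read_csv; here the regex \s\s+ (two or more whitespace characters) seems appropriate, because the timedelta values themselves contain single spaces: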

# raw string so the regex backslashes are not treated as string escapes
df = pd.read_csv(StringIO(df.strip()), sep=r'\s\s+', engine='python')
df.set_index("b_id", inplace=True)

Output:

                     Rejected               Remediation  user
b_id                                                         
366                       NaN   38 days 22:05:06.807430  Test
367    0 days 00:00:05.285239                       NaN  Test
368                       NaN                       NaN  Test
371                       NaN                       NaN  Test
378                       NaN  451 days 14:59:28.830482  Test
384   28 days 21:05:16.141263    0 days 00:00:44.999706  Test
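
To see why the original call failed, here is a minimal self-contained sketch that parses the same text both ways. The names raw, wrong and fixed are only illustrative and are not part of the question's code; the data is the table from the question.

```python
import pandas as pd
from io import StringIO

# Same text as in the question (the variable name `raw` is illustrative)
raw = """
 b_id          Rejected                   Remediation                        user
 366           NaN                        38 days 22:05:06.807430            Test
 367           0 days 00:00:05.285239     NaN                                Test
 368           NaN                        NaN                                Test
 371           NaN                        NaN                                Test
 378           NaN                        451 days 14:59:28.830482           Test
 384           28 days 21:05:16.141263    0 days 00:00:44.999706             Test
"""

# sep='|' never matches, so every line ends up in one wide column
wrong = pd.read_csv(StringIO(raw.strip()), sep='|')
print(wrong.shape[1])       # number of columns parsed
print(list(wrong.columns))  # the whole header line is a single column name

# Splitting on two or more whitespace characters recovers the four columns;
# a regex separator requires the python engine
fixed = pd.read_csv(StringIO(raw.strip()), sep=r'\s\s+', engine='python')
fixed = fixed.set_index("b_id")
print(fixed)
```

Printing the column list before calling set_index is a quick way to check whether the separator matched the data.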
mozway
  • https://stackoverflow.com/questions/74880905/pandas-how-to-get-mean-value-of-datetime-timestamp-with-some-conditions Hi, could you help me with this one too? It is much harder. – William Dec 21 '22 at 19:27
  • @William sure, I answered – mozway Dec 21 '22 at 20:06