I have a dataframe column whose values are wrapped in square brackets. I would like to keep only the string inside them.

df:
ID  col1
1   [2023/01/06:12:00:00 AM]
2   [2023/01/06:12:00:00 AM]
3   [2023/01/06:12:00:00 AM] 

Expected:

ID  col1
1   2023/01/06:12:00:00 AM
2   2023/01/06:12:00:00 AM
3   2023/01/06:12:00:00 AM 

I tried `str.findall(r"(?<=[)([^]]+)(?=])")` and several other regexes, but none of them work.

Can anyone please help me?

unicorn
  • Just for clarity, in the original DF is `df['col1'][0]` the string literal `'[2023/01/06:12:00:00 AM]'` or is it a single-element list containing a string/timestamp `['2023/01/06:12:00:00 AM']`? – G. Anderson Jan 13 '23 at 20:37
  • single-element list containing a string/timestamp – unicorn Jan 13 '23 at 20:38
  • Then just extract the first element from the list? – MatBailie Jan 13 '23 at 20:50
  • Does this answer your question? [Accessing every 1st element of Pandas DataFrame column containing lists](https://stackoverflow.com/questions/37125174/accessing-every-1st-element-of-pandas-dataframe-column-containing-lists) – G. Anderson Jan 13 '23 at 21:38

3 Answers

You can use pandas.Series.astype with pandas.Series.str.strip:

df["col1"] = df["col1"].astype(str).str.strip("['']")

Output:

print(df)
   ID                    col1
0   1  2023/01/06:12:00:00 AM
1   2  2023/01/06:12:00:00 AM
2   3  2023/01/06:12:00:00 AM
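
For reference, here is a self-contained sketch of the same idea, assuming each cell is a single-element list as clarified in the comments. `astype(str)` turns each list into its printed form (for example `"['2023/01/06:12:00:00 AM']"`), and `str.strip("[']")` then removes any leading or trailing `[`, `]`, or `'` characters; the argument is treated as a set of characters, not as a substring. The sample frame is made up to mirror the question.

```python
import pandas as pd

# Hypothetical sample frame: each cell in col1 is a one-element list.
df = pd.DataFrame({
    "ID": [1, 2, 3],
    "col1": [["2023/01/06:12:00:00 AM"] for _ in range(3)],
})

# Convert each list to its string representation,
# e.g. "['2023/01/06:12:00:00 AM']".
as_text = df["col1"].astype(str)

# Strip bracket and quote characters from both ends of every string.
df["col1"] = as_text.str.strip("[']")

print(df)
```

Because this goes through the string representation of the list, it is best suited to cells that hold exactly one string; a list with several elements would keep its inner quotes and commas.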
Timeless

If it is a "single-element list containing a string/timestamp", this is how to extract the first element, as MatBailie said in the comments:

df['col1'] = df['col1'].str[0]
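
A minimal runnable sketch of this approach, using a made-up frame shaped like the one in the question (the data is an assumption, not the asker's actual frame). The `.str[0]` accessor indexes into each cell, so on list cells it returns the first element and leaves the value as a plain string without going through a string conversion.

```python
import pandas as pd

# Hypothetical sample frame: each cell in col1 is a one-element list.
df = pd.DataFrame({
    "ID": [1, 2, 3],
    "col1": [["2023/01/06:12:00:00 AM"] for _ in range(3)],
})

# .str[0] indexes into each cell; for a list cell this is its first element.
df["col1"] = df["col1"].str[0]

print(df)
```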
Hanna

Source

Suppose we have the following dataframe:

import pandas as pd

df1 = pd.DataFrame({'ID': [1, 2, 3],
                   'Year': [['2023/01/06:12:00:00 AM'], ['2023/01/06:12:00:00 AM'], ['2023/01/06:12:00:00 AM']]
                   }).set_index('ID')

Source visualization:

                        Year
ID                          
1   [2023/01/06:12:00:00 AM]
2   [2023/01/06:12:00:00 AM]
3   [2023/01/06:12:00:00 AM]

Proposed command

df1 = df1.explode(column='Year')

Result

                      Year
ID                        
1   2023/01/06:12:00:00 AM
2   2023/01/06:12:00:00 AM
3   2023/01/06:12:00:00 AM
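
One design point worth checking before relying on `explode`: it emits one row per list element, so it only leaves the row count unchanged when every list has exactly one element. The sketch below uses made-up data (a two-element list and an empty list, neither of which appears in the question) to show how the shape changes.

```python
import pandas as pd

# Hypothetical frame with lists of different lengths.
df1 = pd.DataFrame({
    "ID": [1, 2, 3],
    "Year": [
        ["2023/01/06:12:00:00 AM"],                            # one element
        ["2023/01/06:12:00:00 AM", "2023/01/07:12:00:00 AM"],  # two elements
        [],                                                    # empty list
    ],
}).set_index("ID")

# Each list element becomes its own row; the index label is repeated.
exploded = df1.explode(column="Year")

print(exploded)
```

Here ID 2 appears twice in the result and ID 3 gets a missing value, so for data that may contain such lists, taking the first element with `.str[0]` may be the safer choice.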
Laurent B.