import pandas as pd

Dates = [['None', '11/04/1911', '03/06/1919'],
         ['None'],
         ['01/26/1912', '01/15/1918', '02/06/1917'],
         ['None']]
df = pd.DataFrame({'Text': ['Hey 10.31.11  22|1|13 03-02-1919 d',
                            'things here 01-23-18 or 03-23-1984 then ',
                            'stuff 1-22-12 01.11.18 or 2.2.17 so so ',
                            'nothing much'],
                   'ID': ['E1', 'E2', 'E3', 'E4'],
                   'Dates': Dates})

which looks like

                             Dates         ID   Text
0   [None, 11/04/1911, 03/06/1919]          E1  Hey 10.31.11 22|1|13 03-02-1919 d
1   [None]                                  E2  things here 01-23-18 or 03-23-1984 then
2   [01/26/1912, 01/15/1918, 02/06/1917]    E3  stuff 1-22-12 01.11.18 or 2.2.17 so so
3   [None]                                  E4  nothing much

This is my df. My goal is to replace every ['None'] list (e.g. rows 1 and 3) with an empty list []. My desired output is

   Dates  ID  Text  New_Date
0                   [None, 11/04/1911, 03/06/1919]
1                   []
2                   [01/26/1912, 01/15/1918, 02/06/1917]
3                   []

I have looked at "Check for None in pandas dataframe", "Python: most idiomatic way to convert None to empty string?" and "How to replace None only with empty string using pandas?"

I have also tried

df['New_Date'] = df['Dates'].replace('None', list())

How do I achieve my desired output?

ChiBay

2 Answers

1

You can try the code below. It first joins each list into a comma-separated string, so a list containing only 'None' becomes the string "None", and then replaces those rows with an empty list.

cond = df.Dates.str.join(",") == "None"
df.Dates.loc[cond] = [[] for _ in range(sum(cond))]
df
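
If you would rather write the result to a new column and avoid the chained `df.Dates.loc[...]` assignment (which may not update `df` in newer pandas versions), here is a minimal sketch. It reuses the same `cond` test; the small `df` is rebuilt from the question's data so the snippet runs on its own, and the list comprehension is just one possible way to do it, not the only one.

```python
import pandas as pd

# Stand-in for the question's frame (values copied from the post).
df = pd.DataFrame({
    "ID": ["E1", "E2", "E3", "E4"],
    "Dates": [
        ["None", "11/04/1911", "03/06/1919"],
        ["None"],
        ["01/26/1912", "01/15/1918", "02/06/1917"],
        ["None"],
    ],
})

# A list holding only the string 'None' joins to exactly "None".
cond = df["Dates"].str.join(",") == "None"

# Build the new column element by element; each flagged row gets its own fresh [].
df["New_Date"] = [[] if flagged else dates
                  for flagged, dates in zip(cond, df["Dates"])]

print(df[["ID", "New_Date"]])
```

The original `Dates` column is left untouched, so you can compare the two columns side by side.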
Dev Khadka
  • how would I create a new column with the code above? `df['New_Dates'].loc[cond] = [[] for _ in range(sum(cond))]` ? – ChiBay Oct 06 '19 at 20:40
  • You can make a copy of the Dates column with `df["new_dates"] = df.Dates.copy()` and then do the operation on the new column. – Dev Khadka Oct 07 '19 at 02:06
1

Use explode in pandas 0.25.1:

df['New_Date'] = df['Dates'].explode().groupby(level=0).apply(lambda x: [] if x.tolist() == ['None'] else x.tolist())

0          [None, 11/04/1911, 03/06/1919]
1                                      []
2    [01/26/1912, 01/15/1918, 02/06/1917]
3                                      []
Name: New_Date, dtype: object
ansev