Say I have a dataframe:

import pandas as pd

df = pd.DataFrame({
    'column_1': ['ABC DEF', 'JKL', 'GHI  ABC', 'ABC ABC', 'DEF GHI', 'DEF', 'DEF DEF', 'ABC GHI DEF ABC'],
    'column_2': [9, 2, 3, 4, 6, 2, 7, 1]})
          column_1  column_2
0          ABC DEF         9
1              JKL         2
2         GHI  ABC         3
3          ABC ABC         4
4          DEF GHI         6
5              DEF         2
6          DEF DEF         7
7  ABC GHI DEF ABC         1

I am using str.extractall to get the matched patterns from column_1, keeping the first match per row for each capture group.

df['column_1'].str.extractall('(ABC)|(DEF)').groupby(level=0).first()

I get

      0     1
0   ABC   DEF
2   ABC  None
3   ABC  None
4  None   DEF
5  None   DEF
6  None   DEF
7   ABC   DEF

However, the expected output keeps the row with no match (see index 1):

      0     1
0   ABC   DEF
1  None  None
2   ABC  None
3   ABC  None
4  None   DEF
5  None   DEF
6  None   DEF
7   ABC   DEF
Himanshu Poddar
  • Does this answer your question? [Filling the missing index and filling its value with 0](https://stackoverflow.com/questions/50690963/filling-the-missing-index-and-filling-its-value-with-0) – Ynjxsjmh Aug 03 '22 at 15:32
  • Hello, please read the question again. Could it be done in one line, preferably as part of the same code I wrote? – Himanshu Poddar Aug 03 '22 at 15:35

2 Answers

You can just reindex the new dataframe with the original dataframe's index:

out = df['column_1'].str.extractall('(ABC)|(DEF)').groupby(level=0).first().reindex(df.index, fill_value="None")
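
To see why this works, here is a minimal sketch using the dataframe from the question. extractall only returns rows for strings that contain at least one match, so index 1 ('JKL') never reaches the groupby; reindexing against df.index puts it back. The intermediate names (matches, first, missing) are just for illustration. Note that fill_value="None" inserts the string "None", not the Python None object; leaving fill_value out would give NaN instead.

```python
import pandas as pd

df = pd.DataFrame({
    'column_1': ['ABC DEF', 'JKL', 'GHI  ABC', 'ABC ABC', 'DEF GHI', 'DEF', 'DEF DEF', 'ABC GHI DEF ABC'],
    'column_2': [9, 2, 3, 4, 6, 2, 7, 1]})

# One row per match, indexed by (original row label, match number).
# Rows without any match (here: 'JKL') do not appear at all.
matches = df['column_1'].str.extractall('(ABC)|(DEF)')

# First non-null value per capture group for each original row.
first = matches.groupby(level=0).first()

# Row labels that were dropped because nothing matched.
missing = df.index.difference(first.index)

# Restore the dropped labels, filling them with the string "None".
out = first.reindex(df.index, fill_value="None")
print(out)
```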
Himanshu Poddar
Quang Hoang

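A simple solution might be to reindex the result over the full integer range of its index, filling the missing rows with "None" (here df is assumed to be the dataframe returned by the extractall/groupby call), like so: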

df.reindex(list(range(df.index.min(), df.index.max()+1)), fill_value="None")

Output:

    0       1
0   ABC     DEF
1   None    None
2   ABC     None
3   ABC     None
4   None    DEF
5   None    DEF
6   None    DEF
7   ABC     DEF
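
One caveat worth checking, sketched below on a small hypothetical Series (not the data from the question): a range built from the result's own min and max index only fills gaps between matched rows. If the first or last row has no match, it stays missing, whereas reindexing against the original index restores it. This also assumes an integer index.

```python
import pandas as pd

# Hypothetical data: the last row ('XYZ') has no match.
s = pd.Series(['ABC', 'JKL', 'DEF', 'XYZ'])
res = s.str.extractall('(ABC)|(DEF)').groupby(level=0).first()

# Range built from the result's own index: covers 0..2 only.
by_range = res.reindex(list(range(res.index.min(), res.index.max() + 1)),
                       fill_value="None")

# Original index: covers 0..3, including the unmatched last row.
by_index = res.reindex(s.index, fill_value="None")

print(by_range)
print(by_index)
```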
sander