I have the following dataframe:

col1    col2
B20.0   B20 | B20-B20
B20.0   B20
A16     A15-A20

I would like to filter the rows so that, for each 'col1' value, the row whose col2 holds a single value (no '|') is chosen if one exists; otherwise another row for that value is chosen. Would a regex work here, or is there a better approach?

The expected output is:

col1    col2
B20.0   B20
A16     A15-A20
rshar
  • Does `-` matter? – doneforaiur Jun 20 '23 at 13:10
  • Can you give the expected results you seek? Is `A15-A20` a range to match on? – JonSG Jun 20 '23 at 13:22
  • Your question needs a minimal reproducible example consisting of sample input, expected output, actual output, and only the relevant code necessary to reproduce the problem. See [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) for best practices related to Pandas questions. – itprorh66 Jun 20 '23 at 13:45
  • What happens if there are either/both A) Two col1 entries with single col2 values or B) Only one col1 entry and that has a multiple value col2? – user19077881 Jun 20 '23 at 14:10

1 Answer

You can do the following:

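import pandas as pd

data = {'col1': ['B20.0', 'B20.0', 'A16'],
        'col2': ['B20 | B20-B20', 'B20', 'A15-A20']}
df = pd.DataFrame(data)
# Count the '|'-separated values in each col2 entry.
df['len'] = df['col2'].str.split('|').str.len()
# Keep, per col1, the whole row with the fewest values. idxmin selects rows,
# whereas agg('min') would take each column's minimum independently.
result = df.loc[df.groupby('col1', sort=False)['len'].idxmin()].drop(columns='len')
print(result)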
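
On the regex part of the question: a regex is probably not needed, since a literal substring test for '|' is enough. The sketch below is one possible alternative, not the only way to do it. It assumes two tie-breaking rules that the question does not specify: if a col1 value has several single-value rows, the first one is kept, and if it has only multi-value rows, the first of those is kept. The helper column name `has_pipe` is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'col1': ['B20.0', 'B20.0', 'A16'],
    'col2': ['B20 | B20-B20', 'B20', 'A15-A20'],
})

# True where col2 holds more than one value; regex=False treats '|' literally.
has_pipe = df['col2'].str.contains('|', regex=False)

result = (
    df.assign(has_pipe=has_pipe)              # hypothetical helper column
      .sort_values('has_pipe', kind='stable') # single-value rows first, ties keep input order
      .drop_duplicates('col1')                # keep the first row per col1
      .drop(columns='has_pipe')
      .sort_index()                           # restore the original row order
)
print(result)
```

Sorting on a boolean flag instead of a value count keeps the rule close to the wording of the question: any row without '|' wins over any row with it.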
TanjiroLL