
I have a data set in which I want to find all rows that contain -JP-. The columns hold values like this one: DE-JP-20438082/2066/A2@qwinfhcaer.cu/68849.

I tried the .iloc and .isin methods, as shown below.

[Screenshot from Jupyter showing the data structure]

Matadora

2 Answers

filtered_data = df[df.column_name.str.contains('-JP-')]

This returns a DataFrame containing only the rows whose column_name value contains "-JP-".
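For a single column, a slightly more defensive variant might look like the sketch below. The sample values and the column name column_name are made up for illustration. Passing regex=False treats '-JP-' as a literal substring, and na=False makes missing values count as "no match" instead of producing NaN in the mask.

```python
import pandas as pd

# Hypothetical sample data; the real column name and values will differ.
df = pd.DataFrame(
    {'column_name': ['DE-JP-20438082', 'DE-US-123', None, 'FR-JP-9']}
)

# regex=False: match '-JP-' literally; na=False: missing values are not a match.
mask = df['column_name'].str.contains('-JP-', regex=False, na=False)
filtered_data = df[mask]
print(filtered_data)
```

Without na=False, a missing value can leave NaN in the mask, and indexing with such a mask typically raises an error.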

Kaibo

If you want to look for the substring in all columns, you can do it as follows. Let me demonstrate on a sample data set called df.

import pandas as pd

df = pd.DataFrame(
    [
        {'var_1': '23', 'var_2': '-JP-', 'var_3':'23'},
        {'var_1': '24', 'var_2': '26', 'var_3':'3'},
        {'var_1': 'ua', 'var_2': 'C', 'var_3':'ABDC'},
        {'var_1': '26', 'var_2': '28', 'var_3':'Aaaa-JP-AAA'},
    ]
)

print(df)
  var_1 var_2        var_3
0    23  -JP-           23
1    24    26            3
2    ua     C         ABDC
3    26    28  Aaaa-JP-AAA

Now I define a function and apply it to the data frame row by row. It creates a new column that shows whether the chosen substring occurs in any column of that row.

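def is_sign_anywhere(row, sign, cols):
    # True if the substring occurs in any of the given columns of this row;
    # str() guards against non-string values such as numbers or NaN.
    return any(sign in str(row[col]) for col in cols)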

df['is_sign_in_row'] = df.apply(lambda row: is_sign_anywhere(row, '-JP-', df.columns), axis=1)
print(df)
  var_1 var_2        var_3  is_sign_in_row
0    23  -JP-           23            True
1    24    26            3           False
2    ua     C         ABDC           False
3    26    28  Aaaa-JP-AAA            True
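If the goal is to keep only the matching rows rather than add a flag column, a vectorized alternative could look like the following sketch. It is not part of the approach above, just one possible variant: it runs str.contains on each column and combines the results per row with any(axis=1). The astype(str) call is an assumption that converting every value to text is acceptable for your data.

```python
import pandas as pd

# Same sample data as above, written column-wise.
df = pd.DataFrame({
    'var_1': ['23', '24', 'ua', '26'],
    'var_2': ['-JP-', '26', 'C', '28'],
    'var_3': ['23', '3', 'ABDC', 'Aaaa-JP-AAA'],
})

# Build a boolean frame (one column per original column), then reduce per row.
mask = df.apply(
    lambda col: col.astype(str).str.contains('-JP-', regex=False)
).any(axis=1)

filtered = df[mask]  # keeps only rows where some column contains '-JP-'
print(filtered)
```

On large frames this usually performs better than a row-wise apply, because the string matching is done one whole column at a time.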

I hope this helps.

Jaroslav Bezděk