1

input data:

obj  number
1    433
2    342
3    111
4    345

output data:

True

tried:

df[df['number'].isin([111,433])]
df.number.isin([111,433])
df.number.any() == 111 or 433

But none of them gives me the result I'm looking for.

I'm trying to parse a file, and any time a given number is in a dataframe I would like to run a special algorithm to reformat it. For example, if 111 is in the number column, I would like to add a column named layout-name in which the value 'layout1' should appear.

J. Doe
  • Duplicate of what? How exactly is this helping anyone? I searched on Google for an hour and didn't find an answer. It may be that the other question uses different terminology, so some of us cannot find it – J. Doe Sep 05 '19 at 11:04
  • Reopened, because this question asks more than what is answered in the dupe. – jezrael Sep 05 '19 at 11:08

2 Answers

1

You are close. To test for a scalar value, compare the column with it and use Series.any to check for at least one True:

print ((df.number == 111).any())
True

To test multiple values with OR (at least one of them is present), use Series.isin with any:

df.number.isin([111, 222]).any()
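
As a rough illustration using the question's data, the sketch below shows why `isin` alone is not enough and why the third attempt never yields a boolean. The variable names `mask`, `found` and `attempt` are only for this example.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({"obj": [1, 2, 3, 4], "number": [433, 342, 111, 345]})

# isin returns one boolean per row, not a single answer
mask = df["number"].isin([111, 433])

# any() collapses that boolean Series into a single True/False
found = bool(mask.any())

# The third attempt is parsed as (df.number.any() == 111) or 433.
# The comparison is False, so `or` falls through to the always-truthy 433.
attempt = df.number.any() == 111 or 433

print(mask.tolist())
print(found)
print(attempt)
```

So the first two attempts return rows or a mask, and only adding `.any()` turns the mask into a single yes/no answer.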

And if you need to test for consecutive values (111 and, in the next row, 222):

print (df)
   obj  number
0    1     433
1    2     342
2    3     111
3    4     222

print (((df['number'] == 111) & (df['number'].shift(-1) == 222)).any())
True
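
For the goal stated in the question (adding a layout-name column once 111 is found), one possible sketch follows. It assumes two readings of the requirement, since the question does not say whether every row or only the matching rows should get 'layout1'. The `layouts` dict is a hypothetical lookup table, not something from the question.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({"obj": [1, 2, 3, 4], "number": [433, 342, 111, 345]})

# Variant A (assumption): fill the whole column when 111 occurs anywhere
if (df["number"] == 111).any():
    df["layout-name"] = "layout1"

# Variant B (assumption): label only the matching rows; others become NaN.
# `layouts` is a hypothetical mapping from number to layout name.
layouts = {111: "layout1"}
per_row = df["number"].map(layouts)

print(df)
print(per_row)
```

Variant A uses the scalar result of `.any()` in a plain `if`, which is fine because it is a single boolean rather than a Series.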
jezrael
  • @DeepSpace - It works fine for me; not sure what the problem is – jezrael Sep 05 '19 at 11:06
  • What if I need to search for a pattern? Let's say both 111 and 222 need to be in it, not just one of them, so an AND instead of your recommended OR – J. Doe Sep 05 '19 at 11:54
  • @J.Doe - it is more complicated; a general solution is [here](https://stackoverflow.com/a/49005205/2901002) – jezrael Sep 05 '19 at 11:55
  • Is there no simple way if I just need to know whether 111 and 222 are both in a dataframe? No sequencing or anything, just whether each is in the dataframe column or not – J. Doe Sep 05 '19 at 12:00
  • @J.Doe - it is `print ((df.number == 111).any() and (df.number == 222).any())`. Plain `and` works here because each `.any()` returns a single scalar, so one scalar is being combined with another scalar – jezrael Sep 05 '19 at 12:02
  • Thank you. So if I have 100 values to check, do I need to write 100 `and` statements? – J. Doe Sep 05 '19 at 12:03
  • @J.Doe - no, then use another solution, give me a sec – jezrael Sep 05 '19 at 12:04
  • @J.Doe - use sets and the `issubset` method, like `print(set([111, 222]).issubset(set(df['number'])))` – jezrael Sep 05 '19 at 12:06
  • @J.Doe - Also very nice explanation about sets is [here](https://www.programiz.com/python-programming/set) – jezrael Sep 05 '19 at 12:11
1

You are making it too complicated. You can check whether any of the values is 111 with:

(df['number'] == 111).any()

or shorter:

df['number'].eq(111).any()

If you want to check that two (or more) values all occur in a series, you can use:

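>>> import numpy as np
>>> # view the column as an (n, 1) array so it broadcasts against the (1, 2) targets
>>> np.any(df['number'].to_numpy()[:, None] == np.array([[111, 222]]), axis=0).all()
False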

If the number of items to check against is relatively small, this should do the trick.
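
When the list of values grows, a set-based check may be easier to read than the broadcast comparison. The sketch below is one way to do it with the question's data; `wanted`, `present`, `all_present` and `missing` are names chosen for this example only.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({"obj": [1, 2, 3, 4], "number": [433, 342, 111, 345]})

# Values that must ALL occur somewhere in the column
wanted = [111, 222]

# Distinct values actually present in the column
present = set(df["number"].tolist())

# True only if every wanted value is present
all_present = set(wanted).issubset(present)

# Which wanted values are absent, handy for reporting
missing = set(wanted) - present

print(all_present)
print(missing)
```

This avoids building an n-by-k comparison array, at the cost of materialising the column's distinct values as a Python set.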

Willem Van Onsem