
Let's say I have a dataframe with N rows. I want to pick the rows where the row location modulo P gives Q. So for concreteness, let's say P = 7 and Q = 5.

Row 0: 0 mod 7 = 0 (not satisfied)
Row 1: 1 mod 7 = 1 (not satisfied)
...
Row 5: 5 mod 7 = 5 (satisfied)
...
Row 12: 12 mod 7 = 5 (satisfied)

So the rows that are selected will be 5, 12, 19, 26 ....

If Q = 0, you can use the slice df.iloc[::P]. How does one do it for a general Q, i.e. row position mod P = Q?

Spinor8

3 Answers


Use df.iloc[Q::P]: this starts at row Q and then steps in increments of P.

A slice has the form start:stop:step. When the start is omitted, as in .iloc[::P], it defaults to 0, and an omitted stop defaults to the end of the dataframe. To pick the rows where position mod P = Q, just set the start to Q instead.
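A minimal sketch of this slice on a toy frame. The 25-row frame and its column names are made up for illustration, and it assumes 0 <= Q < P:

```python
import numpy as np
import pandas as pd

P, Q = 7, 5

# Hypothetical 25-row frame, used only for illustration
df = pd.DataFrame(np.arange(100).reshape(25, 4), columns=["A", "B", "C", "D"])

# Start at position Q, run to the end of the frame, step by P
selected = df.iloc[Q::P]

print(selected.index.tolist())  # [5, 12, 19]
```

Because .iloc is positional, the same slice should pick the same rows even if the frame has a non-default index.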

Tadhg McDonald-Jensen

Using the numpy package:

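    import numpy as np

    # Label each row by whether its position modulo P equals Q.
    # np.arange(len(df)) gives row positions, so this also works when
    # the index is not the default 0..N-1 (df.index % P would test labels).
    df["satisfied"] = np.where(np.arange(len(df)) % P == Q, "(satisfied)", "(not satisfied)")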
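The same modulus condition can also be used as a boolean mask to select the rows rather than label them. The sketch below is one way to do that; the frame, its letter index, and the names positions, mask and picked are hypothetical, chosen to show that the test is on row position and not on index labels:

```python
import numpy as np
import pandas as pd

P, Q = 7, 5

# Hypothetical frame with a non-default index, for illustration only
df = pd.DataFrame({"value": range(20)}, index=list("abcdefghijklmnopqrst"))

# Row positions 0..N-1, independent of the index labels
positions = np.arange(len(df))
mask = positions % P == Q

# Boolean indexing keeps only the rows where the mask is True
picked = df[mask]

print(picked["value"].tolist())  # [5, 12, 19]
```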
ldren

code:

import pandas as pd
import numpy as np

df = pd.DataFrame(np.arange(100).reshape(25,4), columns = ['A','B','C','D'])
p = 7
q = 5
a = []

# Collect the positions i for which i % p == q
for i in range(df.shape[0]):
    if i % p == q:
        a.append(i)

# Print the rows at those positions
print(df.iloc[a, :])

Output:

================
     A   B   C   D
5   20  21  22  23
12  48  49  50  51
19  76  77  78  79
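As a possible alternative to the explicit loop, the list of positions could be built in one step with np.arange. This is a sketch on the same example frame, not a claim about which approach is faster; the names positions and result are illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(100).reshape(25, 4), columns=["A", "B", "C", "D"])
p, q = 7, 5

# Positions q, q + p, q + 2p, ... that are below len(df)
positions = np.arange(q, len(df), p)

# Positional lookup with the array of row numbers
result = df.iloc[positions]

print(result)
```

For this frame, positions holds 5, 12 and 19, so result contains the same three rows as the loop version.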
    
Alpha Green