Assign value to column and reset after nth row

Question

I have a pandas dataframe that looks like this...

index	my_column
0
1
2
3
4
5
6

What I need to do is conditionally assign values to 'my_column' depending on the index. The first three rows should have the values 'dog', 'cat', 'bird'. Then, the next three rows should also have 'dog', 'cat', 'bird'. That pattern should apply until the end of the dataset.

index	my_column
0	dog
1	cat
2	bird
3	dog
4	cat
5	bird
6	dog

I've tried the following code to no avail.

for index, row in df.iterrows():
    counter=3
    my_column='dog'
    if counter>3
    break
    else 
    counter+=1
    my_column='cat'
    counter+=1
    if counter>3
    break
    else 
    counter+=1
    my_column='bird'
    if counter>3
    break

As mentioned in other answers, your code has numerous logical and syntactical errors. The `break` keyword is used to exit a loop. You can't ever go back into a loop after breaking out of it, so if you want to repeat over the elements of an iterable, you need to stay in the loop but find a way to reset your counter. This is most easily achieved with the modulo operator: `%`. It's a really nifty operator with a ton of fascinating mathematical properties known collectively as modular arithmetic. — ddejohn, Nov 23 '22 at 03:55
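To illustrate the modulo idea from the comment above, here is a minimal sketch in plain Python, with no dataframe involved. The names `values` and `labels` are only illustrative.

```python
# Cycle through a short list by position using the modulo operator.
values = ["dog", "cat", "bird"]

# i % len(values) runs 0, 1, 2, 0, 1, 2, ... so the "counter" resets itself
# and there is no need to break out of the loop.
labels = [values[i % len(values)] for i in range(7)]

print(labels)
```

The remainder is always smaller than `len(values)`, so it is always a valid position in the list.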

score 0 · Answer 1 · answered Nov 23 '22 at 03:29

Several problems:

Your if syntax is incorrect: you are missing colons and proper indentation.
You are breaking out of your loop, which terminates it early, instead of using an if/elif/else structure.
You are trying to update your dataframe while iterating over it.

See this question about why you shouldn't update while you iterate.

Instead, you could do

values = ["dog", "cat", "bird"]

num_values = len(values)

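# df.index is an attribute, not a method, so it must not be called with ()
for index in df.index:
    # assumes the index labels are the integers 0, 1, 2, ...
    df.at[index, "my_column"] = values[index % num_values]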

I tried this solution but was getting the following error: 'Int64Index' object is not callable. I'm running your code inside a function. — ealfons1, Nov 23 '22 at 04:22
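The loop in this answer picks values by index label, so it assumes the labels are the integers 0, 1, 2, and so on. As a hedged alternative, the sketch below works from row positions instead, so it should behave the same whatever the index looks like. The sample index labels and the name `positions` are made up for illustration, and this is a NumPy variant rather than the answer's own code.

```python
import numpy as np
import pandas as pd

values = np.array(["dog", "cat", "bird"])

# Hypothetical frame whose index is NOT 0..n-1 (for example after filtering rows).
df = pd.DataFrame({"my_column": [""] * 7}, index=[10, 11, 12, 20, 21, 22, 30])

# Row positions 0..n-1, independent of the index labels.
positions = np.arange(len(df))

# NumPy integer-array indexing picks values[0], values[1], values[2], values[0], ...
df["my_column"] = values[positions % len(values)]

print(df)
```

Because the right-hand side is a plain NumPy array, pandas assigns it row by row without trying to align on index labels.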

ddejohn · Answer 2 · 2022-11-23T03:44:53.250

Advanced indexing

One solution would be to turn dog-cat-bird into a pd.Series and use advanced indexing:

dcb = pd.Series(["dog", "cat", "bird"])

df["my_column"] = dcb[df.index % len(dcb)].reset_index(drop=True)

This works by first creating an index array from df.index % len(dcb):

In [8]: df.index % len(dcb)
Out[8]: Int64Index([0, 1, 2, 0, 1, 2, 0], dtype='int64')

Then, by using advanced indexing, you can select the elements from dcb with that index array:

In [9]: dcb[df.index % len(dcb)]
Out[9]:
0     dog
1     cat
2    bird
0     dog
1     cat
2    bird
0     dog
dtype: object

Notice that the index of the resulting Series repeats (0, 1, 2, 0, 1, 2, 0). Reset it and drop the old index with .reset_index(drop=True) so that it lines up with df.index, then assign the result to your dataframe.
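For anyone who wants to run the steps above end to end, here is a self-contained sketch. It assumes a seven-row frame with the default 0..6 index, as in the question. The intermediate names `picked` and `realigned` are illustrative, and `.loc` is used to make the label-based lookup explicit.

```python
import pandas as pd

df = pd.DataFrame({"my_column": [""] * 7})
dcb = pd.Series(["dog", "cat", "bird"])

# Step 1: select from dcb by label, using the remainders as labels.
picked = dcb.loc[df.index % len(dcb)]
print(picked.index.tolist())      # the labels repeat: 0, 1, 2, 0, 1, 2, 0

# Step 2: replace the repeating labels with a fresh 0..n-1 index.
realigned = picked.reset_index(drop=True)
print(realigned.index.tolist())

# Step 3: assign; the fresh index now matches df.index row for row.
df["my_column"] = realigned
print(df)
```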

Using a generator

Here's an alternate solution using an infinite dog-cat-bird generator:

In [2]: df
Out[2]:
  my_column
0
1
2
3
4
5
6

In [3]: def dog_cat_bird():
   ...:     while True:
   ...:         yield from ("dog", "cat", "bird")
   ...:

In [4]: dcb = dog_cat_bird()

In [5]: df["my_column"].apply(lambda _: next(dcb))
Out[5]:
0     dog
1     cat
2    bird
3     dog
4     cat
5    bird
6     dog
Name: my_column, dtype: object

I tried both methods, but the result was that the assignment of new values skipped a row for some reason. — ealfons1, Nov 23 '22 at 04:23
Sounds like you copy-pasted something incorrectly. No offense, but the code above is proof that the solution does exactly what you asked for, which means that there's something about your specific dataframe that doesn't match the question. It's not entirely clear what you mean by "skipped a row". — ddejohn, Nov 23 '22 at 04:27
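Note that the generator session above displays the result of `apply` without storing it back in the dataframe. As a hedged sketch of the same idea, the standard library's `itertools.cycle` can stand in for the hand-written generator, and `islice` takes exactly as many items as there are rows. The seven-row frame is assumed from the question.

```python
from itertools import cycle, islice

import pandas as pd

df = pd.DataFrame({"my_column": [""] * 7})

# cycle() repeats dog, cat, bird forever; islice() stops after len(df) items.
pattern = list(islice(cycle(["dog", "cat", "bird"]), len(df)))

# Assigning a plain list fills the column row by row.
df["my_column"] = pattern

print(df)
```

Building the list first also avoids sharing one generator between calls, which would otherwise continue from wherever the previous call left off.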

score 0 · Accepted Answer · answered Nov 23 '22 at 03:39

Create a dictionary:

pet_dict = {0:'dog',
            1:'cat',
            2:'bird'}

You can get each row's index value with .name and take it modulo 3 (the % operator) to look up the desired value in the dictionary:

df.apply(lambda x: pet_dict[x.name % 3], axis=1)
0     dog
1     cat
2    bird
3     dog
4     cat
5    bird
6     dog
7     cat
8    bird
9     dog


gputrain


Thank you. I was able to get the result I need with this solution! – ealfons1 Nov 23 '22 at 04:18
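The accepted answer's snippet shows the resulting Series but does not write it into the column. Below is a hedged sketch of one way to store it, using `Series.map` with the same dictionary rather than a row-wise `apply`. This is a variant, not the answer's own code. The ten-row frame mirrors the output shown in the answer, and the name `remainders` is illustrative.

```python
import pandas as pd

pet_dict = {0: "dog", 1: "cat", 2: "bird"}

df = pd.DataFrame({"my_column": [""] * 10})

# Remainders 0, 1, 2, 0, 1, 2, ... stored on the same index as df,
# so the later assignment lines up row for row.
remainders = pd.Series(df.index % len(pet_dict), index=df.index)

# Series.map looks each remainder up in the dictionary.
df["my_column"] = remainders.map(pet_dict)

print(df)
```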
