I have a pandas dataframe and I want to loop over the last column "n" times based on a condition.

import random
import pandas as pd

p = 0.5
df = pd.DataFrame()
start = []
for i in range(5):
  # "0" with probability p, otherwise "1"
  if random.random() < p:
    start.append("0")
  else:
    start.append("1")
df['start'] = start
print(df['start'])

Essentially, I want to loop "n" times: on each pass, look at the final column and, wherever the value is 0, change it to 1 with probability p. The results then become the new final column. (I am simulating an on-off switch at every time unit with probability p.)

e.g. after one iteration, the dataframe would look something like:

0 0
0 1
1 1
0 0
0 1

after two:

0 0 1
0 1 1
1 1 1
0 0 0
0 1 1

What is the best way to do this?

Sorry if I am asking this wrong, I have been trying to google for a solution for hours and coming up empty.

PharmDataSci

1 Answer

Like this: each pass appends a new column named 1, 2, ..., computed from the current final column.

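# continue from question code ...
# new columns are named "1", "2", ...
for col in range(1, 5):
    tmp = []
    for i in range(5):
        # check the current final column (position col-1);
        # the one-column slice gives a Series, .iloc[0] extracts its value
        # (plain [0] on that Series relies on deprecated positional fallback)
        if df.iloc[i, col-1:col].iloc[0] == "0":
            # same convention as the question code: "0" with probability p
            if random.random() < p:
                tmp.append("0")
            else:
                tmp.append("1")
        else:  # == "1", stays on
            tmp.append("1")
    # append new col
    df[str(col)] = tmp
print(df)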

# initial
    s
0   0
1   1
2   0
3   0
4   0

# result
    s   1   2   3   4
0   0   0   1   1   1
1   0   0   0   0   1
2   0   0   1   1   1
3   1   1   1   1   1
4   0   0   0   0   0
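
If the row-by-row loop gets slow on a bigger frame, the same idea can be sketched with NumPy doing a whole column at once. This is only a sketch under a few assumptions that are not in the code above: states are stored as integers 0/1 rather than the strings "0"/"1", the names `rng`, `n_rows` and `n_steps` are made up for illustration, and the seed is there only so repeated runs match. It follows the question's wording, i.e. a 0 becomes 1 with probability p and a 1 stays 1.

```python
import numpy as np
import pandas as pd

p = 0.5        # probability that an "off" (0) unit switches on in one step
n_rows = 5
n_steps = 4

rng = np.random.default_rng(seed=0)  # seeded only to make the sketch repeatable

# Integer states instead of "0"/"1" strings (an assumption of this sketch).
df = pd.DataFrame({"start": rng.integers(0, 2, size=n_rows)})

for step in range(1, n_steps + 1):
    prev = df.iloc[:, -1].to_numpy()    # the current final column
    flips = rng.random(n_rows) < p      # True where a 0 would switch to 1
    # Rows already at 1 stay 1; rows at 0 take the result of their draw.
    df[str(step)] = np.where(prev == 1, 1, flips.astype(int))

print(df)
```

`df.iloc[:, -1]` always picks whatever the last column currently is, so there is no need to track the column position by hand.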
shimo
  • This is doing exactly what I was after, thanks. What does this line do (how is it selecting the final column)? `if df.iloc[i,col-1:col][0] == "0":` – PharmDataSci Jan 24 '20 at 12:53
  • `df.iloc[row, col]` selects by row and column position. The trailing `[0]` just extracts the value from the result, which here is a str. – shimo Jan 24 '20 at 12:57
  • What does col-1:col do? – PharmDataSci Jan 24 '20 at 13:06
  • `col-1:col` is a slice (note that col is an integer here). When col = 1 it is `0:1`, which selects only column 0, i.e. the column just before the one being added. I used it to get at a single value in that column. – shimo Jan 25 '20 at 00:59
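
To make the slicing discussed in the comments concrete, here is a small sketch comparing the one-column slice with a plain integer position. The tiny frame and the names `as_slice` and `as_scalar` are made up for illustration and are not from the answer.

```python
import pandas as pd

# A toy frame shaped like the one in the answer (values are invented).
df = pd.DataFrame({"start": ["0", "1", "0"], "1": ["1", "1", "0"]})

row, col = 0, 1

as_slice = df.iloc[row, col - 1:col]   # slice of width 1 -> a one-element Series
as_scalar = df.iloc[row, col - 1]      # single integer position -> the value itself

print(type(as_slice).__name__)   # Series
print(as_slice.iloc[0])          # 0
print(as_scalar)                 # 0
```

Both routes reach the same value; the integer position just skips the intermediate Series.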