I have a dataframe with a bunch of Q&A sessions. Each time the speaker changes, the dataframe has a new row. I'm trying to assign question characteristics to the answers, so I want to create an ID for each question-answer group. In the example below, I want to increment the ID each time a new question is asked (speakertype_id == 3 => questions; speakertype_id == 4 => answers). I currently loop through the dataframe like so:
import pandas as pd

Q_A = pd.DataFrame({'qna_id': [9]*10,
                    'qnacomponentid': [3,4,5,6,7,8,9,10,11,12],
                    'speakertype_id': [3,4,3,4,4,4,3,4,3,4]})

group = [0]*len(Q_A)
j = 1
for index, row in enumerate(Q_A.itertuples()):
    # row[0] is the dataframe index, so row[3] is speakertype_id
    if row[3] == 3:  # a new question starts a new group
        j += 1
    group[index] = j
Q_A['group'] = group
This gives me the desired output and is much faster than I expected, but this post makes me question whether I should ever iterate over a pandas dataframe. Any thoughts on a better method? Thanks.
**Edit:** Expected output:
qna_id  qnacomponentid  speakertype_id  group
     9               3               3      2
     9               4               4      2
     9               5               3      3
     9               6               4      3
     9               7               4      3
     9               8               4      3
     9               9               3      4
     9              10               4      4
     9              11               3      5
     9              12               4      5
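For comparison, here is a minimal sketch of a loop-free version that should reproduce the same group column. It assumes, as in the example, that every group begins with a question row (speakertype_id == 3) and that the rows are already in speaking order. The names is_question and group_in_session are only illustrative and are not part of the original data.

```python
import pandas as pd

Q_A = pd.DataFrame({'qna_id': [9]*10,
                    'qnacomponentid': [3, 4, 5, 6, 7, 8, 9, 10, 11, 12],
                    'speakertype_id': [3, 4, 3, 4, 4, 4, 3, 4, 3, 4]})

# True on every row where a new question starts (illustrative name)
is_question = Q_A['speakertype_id'].eq(3)

# A running count of questions labels each question-answer group.
# The +1 mirrors the loop above, which starts j at 1 and increments
# before the first assignment, so the first group is numbered 2.
Q_A['group'] = is_question.cumsum() + 1

# Assumption: if numbering should restart within each session,
# the same running count can be taken per qna_id instead.
Q_A['group_in_session'] = (
    is_question.astype(int).groupby(Q_A['qna_id']).cumsum() + 1
)

print(Q_A)
```

With a single qna_id, as in the example, both columns come out the same. The per-session variant only matters once the dataframe holds several sessions and the IDs should not carry over from one to the next.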