Python Pandas DataFrame Manipulation

Asked May 03 '18 at 18:31

Active May 03 '18 at 18:43

Viewed 38 times

I am working with a medical data set in Python where each row represents a patient visit. Each visit belongs to a parent case labeled with a unique Case ID#. I need to create a new DataFrame column that holds, for each row, the visit number within its Case ID#. The visit date determines the order in which the visits are numbered. Sample data below:

Visit Date      Case ID#          NEW_COL
1/1/18          1111              Visit 1
1/15/18         1111              Visit 2
1/16/18         2222              Visit 1
1/29/18         1111              Visit 3
2/1/18          2222              Visit 2
.
. 
.
5/3/18          3434              Visit 1

I'll be back shortly to post my code.

My code:

unique_IDs = list(df['Case ID#'].unique())

for i in unique_IDs:
    count = 0
    for j in df['Case ID#']:
        if i == j:
            count = count + 1
            df['NEW_COL'] = 'Visit ' + str(count)

I think my problem is that I don't quite understand how to iterate through a pandas Series. I'm treating it like a regular Python list above, and I have a feeling that is my mistake.
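For reference, here is a minimal sketch of one way the numbering could be done without an explicit loop, using `groupby` with `cumcount`. It assumes the dates are strings in month/day/two-digit-year form, as in the sample, and it rebuilds only the sample rows shown above; it is an illustration, not a tested solution for the full data set.

```python
import pandas as pd

# Sample rows copied from the table in the question (the elided rows are omitted).
df = pd.DataFrame({
    "Visit Date": ["1/1/18", "1/15/18", "1/16/18", "1/29/18", "2/1/18", "5/3/18"],
    "Case ID#": [1111, 1111, 2222, 1111, 2222, 3434],
})

# Assumption: dates are month/day/two-digit-year strings. Parsing them makes
# the sort chronological rather than alphabetical.
df["Visit Date"] = pd.to_datetime(df["Visit Date"], format="%m/%d/%y")

# Order the rows by visit date; a stable sort keeps same-day rows in their
# original relative order.
df = df.sort_values("Visit Date", kind="mergesort")

# cumcount() numbers the rows inside each Case ID# group 0, 1, 2, ... in the
# current row order, so adding 1 gives the visit number.
visit_number = df.groupby("Case ID#").cumcount() + 1
df["NEW_COL"] = "Visit " + visit_number.astype(str)

print(df)
```

The loop above assigns to `df['NEW_COL']` as a whole, which overwrites every row on each pass; the grouped count sidesteps that by computing one value per row.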

Thanks in advance!

edited May 03 '18 at 18:43

Andy

"I'll be back shortly to post my code." ... where are you going? ;) – cs95 May 03 '18 at 18:32
looks like a pivot problem – BENY May 03 '18 at 18:33

0 Answers