
I have a pandas DataFrame like the sample below (sample dataset image attached).

Based on the Course code column, I have to come up with the Result. My code works fine for a small set of data (about 10 thousand rows), but when I run it on 4 to 5 million records, the result never comes back.

I am iterating through each row of the DataFrame and applying the result logic based on the Course code. I am not sure whether there is a more efficient way to achieve the same result.

Below is my code snippet:

    # df_var is assumed to be an (initially empty) DataFrame created beforehand
    for index, row in df_report.iterrows():
        # Only the first dict in the 'Course code' list is inspected
        if 'AB' in row['Course code'][0]:
            if 100 in row['Course code'][0].values():
                row['CoursePrecedence'] = list(row['Course code'][0].keys())[0]
                row['Result'] = 'AB code has Grade A'
                df_var = df_var.append(row, ignore_index=True)
            else:
                row['CoursePrecedence'] = list(row['Course code'][0].keys())[0]
                row['Result'] = 'Pass with lower grade'
                df_var = df_var.append(row, ignore_index=True)
        elif 'AC' in row['Course code'][0]:
            if 100 in row['Course code'][0].values():
                row['CoursePrecedence'] = list(row['Course code'][0].keys())[0]
                row['Result'] = 'A in all subjects'
                df_var = df_var.append(row, ignore_index=True)
            else:
                row['CoursePrecedence'] = list(row['Course code'][0].keys())[0]
                row['Result'] = 'something'
                df_var = df_var.append(row, ignore_index=True)
        elif 'AA' in row['Course code'][0]:
            if 100 in row['Course code'][0].values():
                row['CoursePrecedence'] = list(row['Course code'][0].keys())[0]
                row['Result'] = 'Pass with Grade A'
                df_var = df_var.append(row, ignore_index=True)
            else:
                row['CoursePrecedence'] = list(row['Course code'][0].keys())[0]
                row['Result'] = 'Student did not show up for the exam'
                df_var = df_var.append(row, ignore_index=True)
        else:
            row['Result'] = 'ERROR'
            row['CoursePrecedence'] = row['Course code'][0]
            df_var = df_var.append(row, ignore_index=True)


Sample data:

    student  Course code
    JohnD    [{AA:100}]
    JohnB    [{AA:100},{AA:100}]
    Tom      [{AA:100}]
    Matt     [{AC:100},{AB:100}]
    Susan    [{AC:100},{AB:100},{21120:100}]
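
For anyone who wants to reproduce this without the image, the sample might be built roughly as follows. This is only a sketch: it assumes the course codes are string keys (with `21120` as an integer key) and that the scores are integers, which the table itself does not confirm.

```python
import pandas as pd

# Assumed reconstruction of the sample table.
# Key and value types are guesses based on how the data is displayed.
df_report = pd.DataFrame({
    "student": ["JohnD", "JohnB", "Tom", "Matt", "Susan"],
    "Course code": [
        [{"AA": 100}],
        [{"AA": 100}, {"AA": 100}],
        [{"AA": 100}],
        [{"AC": 100}, {"AB": 100}],
        [{"AC": 100}, {"AB": 100}, {21120: 100}],
    ],
})
```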

Expected output:

    student  Course code                      Result               CoursePrecedence
    JohnD    [{AA:100}]                       Pass with Grade A    AA
    JohnB    [{AA:100},{AA:100}]              AB code has Grade A  AB
    Tom      [{AA:100}]                       Pass with Grade A    AA
    Matt     [{AC:100},{AB:100}]              A in all subject     AC
    Susan    [{AC:100},{AB:100},{21120:100}]  A in all subject     AC

Any help will be much appreciated.
  • kindly share sample data with expected output – sammywemmy Jun 03 '20 at 04:59
  • hi, the sample dataset image is attached ("Sample dataset image attached"). The Result column is the output, and Course code is the column used to calculate the Result – happycoding Jun 03 '20 at 08:21
  • no. no pics. data. [guide](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) – sammywemmy Jun 03 '20 at 08:23
  • student Course code JohnD [{AA:100}] JohnB [{AA:100},{AA:100}] Tom [{AA:100}] Matt [{AC:100},{AB:100}] Susan [{AC:100},{AB:100},{21120:100}] – happycoding Jun 03 '20 at 08:33
  • student Course code JohnD [{AA:100}] JohnB [{AA:100},{AA:100}] Tom [{AA:100}] Matt [{AC:100},{AB:100}] Susan [{AC:100},{AB:100},{21120:100}] Output will be student Course code Result CoursePrecedence JohnD [{AA:100}] Pass A AA JohnB [{AA:100},{AA:100}] AB code A AB Tom [{AA:100}] Pass A AA Matt [{AC:100},{AB:100}] A in all AC Susan [{AC:100},{AB:100},{21120:100}] A in all AC – happycoding Jun 03 '20 at 08:40
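
One possible direction, sketched under assumptions rather than offered as a definitive answer: much of the slowdown likely comes from calling `df_var.append(...)` inside the loop, since each call copies the whole accumulated frame (and `DataFrame.append` was removed in pandas 2.0). The sketch below computes both new columns in a single pass over the `Course code` column and assigns them at once. The names `RESULTS`, `classify`, and `add_results` are hypothetical, the course codes are assumed to be string keys, and the logic mirrors the loop in the question (only the first dict of each list is inspected), which may not match every row of the expected output.

```python
import pandas as pd

# (course code, "100 is among the values") -> Result text, taken from the question's loop
RESULTS = {
    ("AB", True): "AB code has Grade A",
    ("AB", False): "Pass with lower grade",
    ("AC", True): "A in all subjects",
    ("AC", False): "something",
    ("AA", True): "Pass with Grade A",
    ("AA", False): "Student did not show up for the exam",
}


def classify(courses):
    """Return (Result, CoursePrecedence) for one 'Course code' cell."""
    first = courses[0]  # like the original loop, only the first dict is used
    for code in ("AB", "AC", "AA"):  # same precedence as the if/elif chain
        if code in first:
            result = RESULTS[(code, 100 in first.values())]
            return result, next(iter(first))  # first key of the first dict
    return "ERROR", first  # the original stores the whole dict in this case


def add_results(df):
    """Return a copy of df with Result and CoursePrecedence columns added."""
    out = df.copy()
    pairs = [classify(cell) for cell in out["Course code"]]
    out["Result"] = [result for result, _ in pairs]
    out["CoursePrecedence"] = [precedence for _, precedence in pairs]
    return out
```

With this in place, `df_var = add_results(df_report)` would replace the whole loop. It is still a Python-level pass over the rows, but it avoids both `iterrows()` and the repeated appends.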
