1

I'm trying to compare a list and a dataframe. If an item in the list equals a value from the first column in the dataframe's row, I would like to print out that list's item with the dataframe's second column value after it.

If a list item matches none of the values in the dataframe's first column, I would like to print out just the list's item. I thought a good way to go about this would be to iterate through the whole list and dataframe, and if we get to the last row of the dataframe and none of the rows match, print out just the list's item instead of the list's item plus the dataframe's second column.

What I need help with is determining the syntax needed to find the last row in the dataframe. Please see my code below.

The dataframe I'm using is 1003 rows × 2 columns. The row labels are the numbers 0-1002, and the column labels are col1 and col2.

#compare items from List against items from dataframe to find matches
for item in List:
    for idx, row in df.iterrows():
        if item in row['col1']:
            print str(count) + " " + str(item) + " " + str(row['col2'])
            count=count+1

        #if it's the last row in dataframe:
            if item not in row['col1']:
                print str(count) + " " + str(item) 
someGuy45
    Please provide sample data and expected output. See: http://stackoverflow.com/help/mcve and [How to make good reproducible pandas examples](http://stackoverflow.com/a/20159305/3339965). – root Oct 28 '16 at 17:43

2 Answers

1

I found out I could use the following line to find the last row in the dataframe:

if count==len(df):
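
One caveat: count in the question's loop only advances when a match is printed, so it is not guaranteed to equal len(df) on the final row. A minimal sketch of a position-based check follows, assuming a separate row counter taken from enumerate. The toy dataframe and the names pos and last_rows are made up for illustration.

```python
import pandas as pd

# Hypothetical toy data standing in for the real 1003-row dataframe.
df = pd.DataFrame({"col1": ["apple", "banana", "cherry"],
                   "col2": ["red", "yellow", "dark red"]})

last_rows = []
# enumerate(..., start=1) numbers the rows 1..len(df), whatever the index labels are.
for pos, (idx, row) in enumerate(df.iterrows(), start=1):
    if pos == len(df):
        # Only the final row reaches this branch.
        last_rows.append(row["col1"])
        print("last row:", row["col1"], row["col2"])
```

Because pos counts positions rather than labels, this should also work when the index is not 0..n-1.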
someGuy45
1
# compare items from List against items from dataframe to find matches
last_idx = df.iloc[-1].name  # index label of the last row, looked up once
for item in List:
    for idx, row in df.iterrows():
        if item in row['col1']:
            print(str(count) + " " + str(item) + " " + str(row['col2']))
            count = count + 1

        # on the last row, print the item alone if it is not in that row
        if last_idx == idx:
            if item not in row['col1']:
                print(str(count) + " " + str(item))
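
Note that the check above looks only at the last row, so an item that matched an earlier row but not the last one would be printed a second time on its own. A possible variation, sketched under assumptions, tracks a flag per item and prints the bare item once the inner loop has finished. The sample List and df, and the names report and matched, are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; the real List and df are not shown in the question.
List = ["apple", "kiwi"]
df = pd.DataFrame({"col1": ["apple pie", "banana", "cherry"],
                   "col2": ["dessert", "fruit", "fruit"]})

def report(items, frame):
    """Return one line per match, or the bare item when no row matched."""
    lines = []
    count = 0
    for item in items:
        matched = False
        for idx, row in frame.iterrows():
            if item in row["col1"]:  # substring test, as in the question's code
                lines.append(str(count) + " " + str(item) + " " + str(row["col2"]))
                count += 1
                matched = True
        if not matched:  # reached only after every row has been checked
            lines.append(str(count) + " " + str(item))
    return lines

result = report(List, df)
print("\n".join(result))
```

With the flag, there is no need to detect the last row at all: the fallback runs once per item, after the row loop ends.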

Consider df:

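import numpy as np
import pandas as pd

# 4x4 frame with a two-level row index: (X, 2), (X, 5), (Y, 2), (Y, 5)
df = pd.DataFrame(np.arange(16).reshape(-1, 4),
                  pd.MultiIndex.from_product([list('XY'), [2, 5]]),
                  list('ABCD'))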

df

      A   B   C   D
X 2   0   1   2   3
  5   4   5   6   7
Y 2   8   9  10  11
  5  12  13  14  15

last index

df.iloc[-1].name

('Y', 5)
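
As a possible alternative, the same label can be read directly from the index with df.index[-1], which avoids building the last row as a Series first. The sketch below rebuilds the example frame so it runs on its own; the names last_via_iloc and last_via_index are only illustrative.

```python
import numpy as np
import pandas as pd

# Same example frame as above, rebuilt so this snippet is self-contained.
df = pd.DataFrame(np.arange(16).reshape(-1, 4),
                  pd.MultiIndex.from_product([list('XY'), [2, 5]]),
                  list('ABCD'))

last_via_iloc = df.iloc[-1].name  # take the last row as a Series, then read its label
last_via_index = df.index[-1]     # read the last label straight from the index

print(last_via_iloc == last_via_index)
```

Both expressions should give the same label for any non-empty dataframe.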

demo

last_idx = df.iloc[-1].name  # look up the last label once, outside the loop
for idx, row in df.iterrows():
    if last_idx == idx:
        print(row)

A    12
B    13
C    14
D    15
Name: (Y, 5), dtype: int64
piRSquared