List index out of range in a Python list

Question

I don't know why I'm getting this error. My list has a length of 21, but when the loop gets to index 18 I get a "list index out of range" error. Help, please.

import pandas as pd
import os

mainpath = r"D:\Epoca de Cambio\Curso Python Machine Learning\python-ml-course-master\datasets"
filename = r"customer-churn-model\Customer Churn Model.csv"

fullpath = os.path.join(mainpath,filename)

data = pd.read_csv(fullpath,sep=",")

col_desired = ["Account Length","Phone","Eve Charge","Day Calls"]
columns = data.columns.values.tolist()
print(len(columns))

for i in range(len(columns)):
    print(i)
    if (columns[i] in col_desired):
        columns.pop(i)


Some rules on [How to Ask Questions on SO](https://stackoverflow.com/help/how-to-ask). Don't post an image of the output or error, post the actual text. Also, don't post your paths and CSV-reading code, you could just generate a sample dataframe that illustrates the issue ([mcve]). — smci, Mar 06 '22 at 21:40
[How to remove items from a list while iterating?](https://stackoverflow.com/questions/1207406/how-to-remove-items-from-a-list-while-iterating) — smci, Mar 06 '22 at 21:43
...answer: don't iterate, use a list comprehension to create a new list containing only the elements you don't want to remove. — smci, Mar 07 '22 at 23:48

smci · Answer 1 · 2022-03-06T21:50:28.843

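Because your for-loop is shrinking the list (columns.pop(i)) while it is still indexing into it. This is not the right way to do it; see [How to remove items from a list while iterating?](https://stackoverflow.com/questions/1207406/how-to-remove-items-from-a-list-while-iterating).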

Here's why your code is not doing what you intended:

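range(len(columns)) is evaluated once, before the loop starts, so i counts from 0 to 20 no matter what happens to the list
every time columns[i] is in col_desired, pop(i) shortens the list by one (21, then 20, then 19, ...) and shifts the remaining items left, so the item right after the removed one is never checked
... eventually i reaches an index that no longer exists, before the for-loop gets to i=20, and columns[i] raises IndexError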
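
A minimal sketch of that failure, using a made-up seven-item list instead of the real CSV columns. The names columns and col_desired come from the question; the list contents and the visited/error helpers are placeholders for illustration only:

```python
# Placeholder data standing in for data.columns.values.tolist()
columns = ["State", "Account Length", "Area Code", "Phone",
           "Day Calls", "Eve Charge", "Churn?"]
col_desired = ["Account Length", "Phone", "Eve Charge", "Day Calls"]

visited = []   # indices the loop actually reached
error = None   # will hold the IndexError, if one is raised

try:
    for i in range(len(columns)):      # range(7) is fixed before the loop runs
        visited.append(i)
        if columns[i] in col_desired:
            columns.pop(i)             # list shrinks, later items shift left
except IndexError as exc:
    error = exc

print(visited)   # indices tried before the failure
print(columns)   # what is left of the list
print(error)     # the IndexError message
```

In this sketch "Day Calls" survives even though it is in col_desired: it slid into the slot the loop had just finished checking, so it was skipped.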

Anyway, don't write code like that. What are you trying to achieve? If you want a list of all columns that are not in col_desired, you don't even need any loop, just use a list comprehension:

[col for col in data.columns if col not in col_desired]

or, if the order of the columns doesn't matter, you could use:

set(data.columns) - set(col_desired)
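
To compare the two options without loading the CSV, here is a small sketch on placeholder column names (all_columns is a hypothetical stand-in for data.columns). It assumes the goal is to keep the columns that are not in col_desired:

```python
# Placeholder column names; in the question these come from data.columns
all_columns = ["State", "Account Length", "Area Code", "Phone", "Day Calls"]
col_desired = ["Account Length", "Phone", "Eve Charge", "Day Calls"]

# List comprehension: builds a new list and keeps the original order
kept_in_order = [col for col in all_columns if col not in col_desired]

# Set difference: same members, but a set has no guaranteed order
kept_as_set = set(all_columns) - set(col_desired)

print(kept_in_order)
print(kept_as_set)
```

Neither version modifies the original list, so there is no index to run past. Prefer the comprehension when the result will be used to select DataFrame columns in a predictable order.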

But tell us what your code is trying to do, and then rewrite it accordingly.

score 0 · Answer 2 · answered Mar 06 '22 at 21:35

This is because you are calling pop(). Whenever columns[i] is present in col_desired, pop() reduces the length of the list by one, but the loop still counts up to the original length.

Instead, you should do this:

for col in col_desired:
    if col in columns:
        columns.remove(col)
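
A quick check of this loop on a placeholder list (the column values are made up for illustration). It is safe because it iterates over col_desired, which is never modified, rather than over columns:

```python
# Placeholder data standing in for the real column list
columns = ["State", "Account Length", "Area Code", "Phone",
           "Day Calls", "Eve Charge", "Churn?"]
col_desired = ["Account Length", "Phone", "Eve Charge", "Day Calls"]

for col in col_desired:          # iterate over the list that stays unchanged
    if col in columns:           # guard: remove() raises ValueError if absent
        columns.remove(col)      # removes the first matching item only

print(columns)
```

Note that list.remove() deletes only the first occurrence of a value, which is fine here as long as the column names are unique.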

Corralien · Answer 3 · 2022-03-06T23:08:18.770

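You can really simplify your code. If what you want is a DataFrame containing only the columns in col_desired, let read_csv select them with usecols: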

col_desired = ['Account Length', 'Phone', 'Eve Charge', 'Day Calls']
data = pd.read_csv(fullpath, usecols=col_desired)

edited Mar 06 '22 at 23:08

answered Mar 06 '22 at 21:47


You mean `usecols=col_desired` (without the quotes) – rachwa Mar 06 '22 at 22:42
