-2

I don't know why I'm getting this error. My list has a length of 21, but when the loop gets to 18 I get the "list index out of range" error. Help please.

import pandas as pd
import os

mainpath = r"D:\Epoca de Cambio\Curso Python Machine Learning\python-ml-course-master\datasets"
filename = r"customer-churn-model\Customer Churn Model.csv"

fullpath = os.path.join(mainpath,filename)

data = pd.read_csv(fullpath,sep=",")

col_desired = ["Account Length","Phone","Eve Charge","Day Calls"]
columns = data.columns.values.tolist()
print(len(columns))

for i in range(len(columns)):
    print(i)
    if (columns[i] in col_desired):
        columns.pop(i)


warped
  • Some rules on [How to Ask Questions on SO](https://stackoverflow.com/help/how-to-ask). Don't post an image of the output or error, post the actual text. Also, don't post your paths and CSV-reading code, you could just generate a sample dataframe that illustrates the issue ([mcve]). – smci Mar 06 '22 at 21:40
  • [How to remove items from a list while iterating?](https://stackoverflow.com/questions/1207406/how-to-remove-items-from-a-list-while-iterating) – smci Mar 06 '22 at 21:43
  • ...answer: don't iterate, use a list comprehension to create a new list containing only the elements you don't want to remove. – smci Mar 07 '22 at 23:48

3 Answers

0

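Because your for-loop is shrinking the list (columns.pop(i)) while it is still indexing into it. This is not the right way to do it, see [How to remove items from a list while iterating?](https://stackoverflow.com/questions/1207406/how-to-remove-items-from-a-list-while-iterating).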

Here's why your code is not doing what you intended:

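  • range(len(columns)) is evaluated once, before the loop starts, so i runs from 0 to 20 no matter what happens to the list afterwards
  • every time columns[i] is in col_desired, columns.pop(i) shortens the list by one and shifts all later items one position to the left, so the item right after a removed one is never checked
  • ... eventually i reaches the current, shorter length of the list before the for-loop gets to i=20, and columns[i] raises the list index out of range error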
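The failure can be reproduced without the CSV file. The following is a minimal sketch that replays the question's loop on a short, made-up list of column names (the names and the helper pop_while_indexing are illustrative assumptions, not the asker's real data):

```python
# Hypothetical column names, not the real header of the asker's CSV file.
columns = ["State", "Account Length", "Area Code", "Phone", "Day Mins", "Day Calls"]
col_desired = ["Account Length", "Phone", "Day Calls"]


def pop_while_indexing(cols, unwanted):
    """Replay the question's loop on a copy of cols.

    Returns (index at which IndexError was raised, or None; remaining list).
    """
    cols = list(cols)
    failed_at = None
    try:
        for i in range(len(cols)):  # the range is fixed at the original length
            if cols[i] in unwanted:
                cols.pop(i)  # the list shrinks and later items shift left
    except IndexError:
        failed_at = i
    return failed_at, cols


failed_at, remaining = pop_while_indexing(columns, col_desired)
print(failed_at, remaining)
```

Here the list starts with 6 items and three of them are popped, so only indices 0 to 2 remain valid while the loop still tries to go on to 5.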

Anyway, don't write code like that. What are you trying to achieve? If you want a list of all columns that are not in col_desired, you don't even need any loop, just use a list comprehension:

[col for col in data.columns if col not in col_desired]

or, if the order of the columns does not matter, you could use:

set(data.columns) - set(col_desired)
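As a rough comparison of the two approaches, here is a small sketch on a plain list that stands in for data.columns (the column names are made up for illustration). The comprehension keeps the original order, while the set difference gives an unordered set:

```python
# Stand-in for data.columns; the names are hypothetical.
all_columns = ["State", "Account Length", "Area Code", "Phone", "Day Mins"]
col_desired = ["Account Length", "Phone"]

# Builds a new list, so nothing is removed from a list that is being iterated.
kept_in_order = [col for col in all_columns if col not in col_desired]

# Same elements, but a set has no guaranteed order.
kept_as_set = set(all_columns) - set(col_desired)

print(kept_in_order)
print(sorted(kept_as_set))
```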

But, tell us what your code is trying to do, then rewrite it.

smci
0

This is because you are calling pop() inside the loop. Whenever columns[i] is present in col_desired, pop() shortens the list by one, but the loop still counts up to the original length.

Instead, you should do this:

for col in col_desired:
    if col in columns:
        columns.remove(col)
user19930511
0

If what you actually want is a DataFrame with only the columns in col_desired, you can really simplify your code:

col_desired = ['Account Length', 'Phone', 'Eve Charge', 'Day Calls']
data = pd.read_csv(fullpath, usecols=col_desired)
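
Since the loop in the question removes the listed columns rather than keeping them, the opposite selection may be what is needed. A possible sketch, assuming a tiny in-memory CSV in place of the real file (csv_text and its contents are made up): usecols also accepts a callable that is applied to each column name.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV file; the real file has 21 columns.
csv_text = (
    "State,Account Length,Phone,Day Calls,Eve Charge,Churn\n"
    "KS,128,382-4657,110,16.78,False\n"
)

col_desired = ["Account Length", "Phone", "Eve Charge", "Day Calls"]

# Keep only the listed columns, as in the answer above.
only_listed = pd.read_csv(io.StringIO(csv_text), usecols=col_desired)

# Keep everything except the listed columns: the callable is evaluated
# against each column name and the column is read when it returns True.
without_listed = pd.read_csv(
    io.StringIO(csv_text),
    usecols=lambda name: name not in col_desired,
)

print(only_listed.columns.tolist())
print(without_listed.columns.tolist())
```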
Corralien