3

I'm trying to import datasets that have the following filenames (phone1, phone2, etc.):

df1 = pd.read_csv(r'C:\Users\...\phone1.csv')
df2 = pd.read_csv(r'C:\Users\...\phone2.csv')
df3 = pd.read_csv(r'C:\Users\...\phone3.csv')
df4 = pd.read_csv(r'C:\Users\...\phone4.csv')
df5 = pd.read_csv(r'C:\Users\...\phone5.csv')
df6 = pd.read_csv(r'C:\Users\...\phone6.csv')

I tried the following code:

for i in range(1, 7):
    'df'+i = pd.read_csv(r'C:\Users\siddhn\Desktop\phone'+str(i)+'.csv', engine = 'python')

But I get an error saying "cannot assign to operator".

How can I import the datasets using a loop?

siiddd
  • 5
    Don't do that. Instead, create a list `dfs = []` and use `dfs.append( pd.read_csv(...) )`. Now, you can use `dfs[0]` to refer to one specifically, but you can loop through them easily using `for df in dfs:`. – Tim Roberts Jun 19 '21 at 04:54
  • 2
    As @TimRoberts mentioned, you can store your DataFrames in a list, but if it is essential for your DataFrames to have specific names, using a dict in a loop is also possible. Something like this could be done inside your for loop: `dfs[f"df_{i}"] = pd.read_csv(...)` – Antonio Margaretti Jun 19 '21 at 04:58

5 Answers

2

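As @TimRoberts mentioned, you should use a list or a dict to store your dataframes, but if you really want variables named df1, df2, ..., df6, you can write them into globals(). Assigning through locals() has the same effect only at module level, where it is the same dictionary as globals(); inside a function, such assignments are not reliably reflected in the local variables.

for i in range(1, 7):
    globals()[f'df{i}'] = pd.read_csv(fr'C:\Users\siddhn\Desktop\phone{i}.csv')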

print(df1)
print(df2)
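
For comparison, here is a minimal sketch of the dict approach suggested in the comments, which keeps the df1 ... df6 names as keys without touching globals(). The helper name `load_phone_frames`, the `model`/`price` columns, and the temporary demo files are assumptions made up for illustration; only the phone<N>.csv naming comes from the question.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_phone_frames(folder, first=1, last=6):
    """Read phone<first>.csv .. phone<last>.csv from folder into a dict.

    Hypothetical helper: keys are 'df1', 'df2', ... to mirror the
    variable names asked for in the question.
    """
    folder = Path(folder)
    return {
        f"df{i}": pd.read_csv(folder / f"phone{i}.csv")
        for i in range(first, last + 1)
    }


# Demo with throwaway files (invented contents) so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    for i in range(1, 7):
        Path(tmp, f"phone{i}.csv").write_text(f"model,price\nphone{i},{i * 100}\n")
    frames = load_phone_frames(tmp)

print(sorted(frames))
print(frames["df3"])
```

Each frame is then available as frames["df1"], frames["df2"], and so on, and `for name, df in frames.items():` loops over all of them.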
Corralien
1

You can store them in a list. Here is the idea, with the loop counter standing in for each dataframe:

var = []
for i in range(1, 7):
    var.append(i)

print(var[0])
print(var[2])

From the list, you can access each value by its index.

cbrr09
1

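Use the built-in glob module

from glob import glob

# Raw string, so the backslashes (including \U) are not treated as escapes
fullpath = r'C:\Users\siddhn\Desktop\phone[1-6].csv'
# glob returns matches in arbitrary order, so sort them
dfs = [pd.read_csv(file) for file in sorted(glob(fullpath))]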

print(dfs[0])
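
Two caveats with the pattern above: `[1-6]` matches exactly one character, so a file such as phone10.csv would be skipped, and a plain string sort would put phone10 before phone2. A rough sketch of one way around both, using only the standard library, follows. The helper name `phone_files` and the demo files are hypothetical; each returned path could then be passed to pd.read_csv.

```python
import re
import tempfile
from pathlib import Path


def phone_files(folder):
    """Return the phone<N>.csv paths in folder, ordered by the number N."""
    numbered = []
    for path in Path(folder).glob("phone*.csv"):
        # Keep only names of the exact form phone<digits>.csv
        match = re.fullmatch(r"phone(\d+)", path.stem)
        if match:
            numbered.append((int(match.group(1)), path))
    return [path for _, path in sorted(numbered)]


# Demo with throwaway files (invented contents) so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    for n in (10, 2, 1):
        Path(tmp, f"phone{n}.csv").write_text("a,b\n1,2\n")
    Path(tmp, "phones.csv").write_text("a,b\n")  # skipped: no number in the name
    ordered_names = [p.name for p in phone_files(tmp)]

print(ordered_names)
```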
Vishnudev Krishnadas
0

'df'+i is an expression, not a variable name, so it cannot be the target of an assignment; Python rejects the line with a syntax error before running it. Instead of using

for i in range(1, 7):
    'df'+i = pd.read_csv(r'C:\Users\siddhn\Desktop\phone'+str(i)+'.csv', engine = 'python')

create a list of data frames with df = [], then append each data frame to it:

df = []
for i in range(1, 7):
    df.append(pd.read_csv(r'C:\Users\siddhn\Desktop\phone'+str(i)+'.csv', engine = 'python'))

You can then access the data frames by index, such as df[0] (phone1) or df[1] (phone2).
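
To see that the problem is the assignment target rather than read_csv, the following small sketch compiles a line of the same shape and captures the error, then shows that a dict subscript is a valid target. The exact message wording differs between Python versions, so it is printed rather than assumed; the names `source`, `error_message` and `names` are only illustrative.

```python
# A line with the same shape as the one in the question.
source = "'df' + i = 1"

try:
    # compile() only parses the code; nothing is executed.
    compile(source, "<example>", "exec")
except SyntaxError as exc:
    error_message = exc.msg
else:
    error_message = None

print(error_message)

# A subscript such as names[...] is a valid assignment target,
# so computed names work as dict keys.
names = {}
for i in range(1, 4):
    names["df" + str(i)] = i  # placeholder value instead of a data frame

print(names)
```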

0

You can create a list of data frames and then iterate over it or access by index.

df_list = [pd.read_csv(r'C:\Users\siddhn\Desktop\phone'+str(i)+'.csv', engine = 'python') for i in range(1, 7)]
df_list[1]

The left-hand side of an assignment must be a variable name, not an expression such as 'df'+i; that is why you get the error.

Felix K Jose