How do I write a for loop in Python that creates a series of variables like the ones below?

file1 = '160321-PCU1.csv'
file2 = '160321-PCU2.csv'
file3 = '160321-PCU3.csv'

etc

file24 = '160321-PCU24.csv'

So I wrote the loop below, but it raises a SyntaxError:

file = '160321-PCU'

for i in range(1,25):
 'file'+str(i) = '160321-PCU'+str(i)+'.csv'

I'd then also like to loop over the following reads. How do I compress this code into a proper loop?

# Read each CSV file and assign column names (file1, file2, ... already end in '.csv')
df1 = pd.read_csv(gdrive_url+file1, sep=';', names=['Date','Time_Decimal','Parameter','Value'])
df2 = pd.read_csv(gdrive_url+file2, sep=';', names=['Date','Time_Decimal','Parameter','Value'])
df3 = pd.read_csv(gdrive_url+file3, sep=';', names=['Date','Time_Decimal','Parameter','Value'])

etc until it arrives at df24:

df24 = pd.read_csv(gdrive_url+file24, sep=';', names=['Date','Time_Decimal','Parameter','Value'])

UPDATE! Hi all, I tried the following and it works, but then I'm still trying to figure out the next step which is to actually combine all dfs...

file = '160321-PCU'

for i in range(1,25):
  filename = file+str(i)+'.csv'
  df = pd.read_csv(gdrive_url+filename, sep=';', names=['Date','Time_Decimal','Parameter','Value'])

Next step is to concatenate df1, df2, ... df24?

Thank you all.

  • Use a dictionary: https://docs.python.org/3/tutorial/datastructures.html#dictionaries – Maurice Meyer Mar 23 '21 at 23:20
  • do you need as many dataframes as files OR you need to combine all of them into a unique dataframe at the end? – DevLounge Mar 23 '21 at 23:36
  • **don't do this**. Don't dynamically create variables, use a *container* like a `list` or a `dict`. A list would be perfectly reasonable here. – juanpa.arrivillaga Mar 23 '21 at 23:41
  • Hi @DevLounge, I updated my question, eventually I need to concat all the dfs... any thoughts on better ways to do this? I also manage to run a for loop. But I'm not sure how to concat all dfs now. Thank you. – Psychefelic Mar 24 '21 at 00:04

1 Answer

You're close, but you've got the wrong idea about variable creation. You don't need to create n variables to store n values. You'd be much better off using a container, like a list or dict. It is easier to loop over and scales without extra code. For example, if you suddenly had 50 files, you wouldn't want to type out 50 x = y lines.

List Approach

file_names = []
for i in range(1, 25):
    f_name = f'160321-PCU{i}.csv'
    file_names.append(f_name)
>>> file_names
['160321-PCU1.csv', '160321-PCU2.csv', '160321-PCU3.csv', ..., '160321-PCU24.csv']
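
The same list can also be built in one step with a list comprehension. A minimal sketch, assuming the file names follow the pattern from the question (`prefix` and `n_files` are names introduced here for illustration, not from the original post):

```python
# Hypothetical names: prefix and n_files are introduced for this sketch only.
prefix = '160321-PCU'
n_files = 24

# range(1, n_files + 1) yields 1..24, matching file1..file24 in the question.
file_names = [f'{prefix}{i}.csv' for i in range(1, n_files + 1)]

print(file_names[0])    # first file name in the list
print(file_names[-1])   # last file name in the list
```

If the number of files changes, only `n_files` needs to be updated.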

You could then do something similar when creating the DataFrames:

dfs = []
for file in file_names:
    df = pd.read_csv(gdrive_url+file, sep=';', names=['Date','Time_Decimal','Parameter','Value'])
    dfs.append(df)
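
Since the end goal in the question is one combined DataFrame, the list of DataFrames can be handed to `pd.concat` once every file has been read. This is a sketch that assumes all files share the same four columns. `load_all` and `COLUMNS` are illustrative names, and the final call is commented out because `gdrive_url` exists only in the asker's environment:

```python
import pandas as pd

# Column names taken from the question; the files themselves have no header row.
COLUMNS = ['Date', 'Time_Decimal', 'Parameter', 'Value']

def load_all(base, file_names):
    """Read every file under `base` and stack the rows into one DataFrame."""
    frames = []
    for name in file_names:
        frames.append(pd.read_csv(base + name, sep=';', names=COLUMNS))
    # ignore_index=True renumbers the rows 0..n-1 instead of
    # repeating each file's own row numbers.
    return pd.concat(frames, ignore_index=True)

file_names = [f'160321-PCU{i}.csv' for i in range(1, 25)]

# Assumption: gdrive_url is the base path/URL defined in the question.
# combined = load_all(gdrive_url, file_names)
```

Collecting the DataFrames in a list and concatenating once at the end is usually preferable to calling `pd.concat` inside the loop, because each concatenation copies the data.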

Dictionary Approach

A dict stores key/value pairs. Below, each key is filen (file1, file2, ...) and each value is the matching file name, 160321-PCUn.csv.

file_names = {}
for i in range(1, 25):
    f_name = f'160321-PCU{i}.csv'
    file_names[f'file{i}'] = f_name
>>> file_names
{
    'file1': '160321-PCU1.csv',
    'file2': '160321-PCU2.csv', 
    'file3': '160321-PCU3.csv', 
    ..., 
    'file24': '160321-PCU24.csv'
}

Then, when you create the DataFrames, you need to iterate through the .values() of file_names, like so:

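for file in file_names.values():
    dfs.append(pd.read_csv(gdrive_url+file, sep=';', names=['Date','Time_Decimal','Parameter','Value']))  # dfs = [] as in the List Approach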
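
The reason for `.values()`: looping over a dict directly yields its keys, not the file names. A small sketch, using three entries instead of 24 to keep the output short:

```python
# Three entries instead of 24, built with a dict comprehension.
file_names = {f'file{i}': f'160321-PCU{i}.csv' for i in range(1, 4)}

keys = [k for k in file_names]             # iterating the dict itself -> keys
values = [v for v in file_names.values()]  # .values() -> the actual file names

print(keys)    # ['file1', 'file2', 'file3']
print(values)  # ['160321-PCU1.csv', '160321-PCU2.csv', '160321-PCU3.csv']

# .items() gives both at once, handy if the key should label each DataFrame.
for key, name in file_names.items():
    print(key, '->', name)
```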
DevLounge
gmdev
  • @Psychefelic are you using a `list` or `dict` to store the files? – gmdev Mar 23 '21 at 23:39
  • @Psychefelic if you're using a `dict`, make sure you're looping through the `values`, not the `dict` itself. It seems as if you're looping through the `dict` itself, which yields the `keys`, not `values`. – gmdev Mar 23 '21 at 23:48
  • if your gdrive is mounted locally then you need to use `file://` as stated in the documentation, and for a url it needs to start with the scheme. – DevLounge Mar 23 '21 at 23:53