I recently posted about how to create multiple variables from a CSV file. The code works in that the variables are created, but they are all equal to the first row. I need the code to make one variable for each row in the DataFrame.

I need 208000 variables labeled A1:A20800

The code I currently have:

    df = pandas.read_csv(file_name)
    for i in range(1,207999):
        for c in df:
            exec("%s = %s" % ('A' + str(i), c))
            i += 1

I have tried adding additional quotation marks around the second %s (this gives a syntax error), and I have tried selecting all the rows of the df and using that. I'm not sure why it isn't working. Every time I print a variable to test whether it worked, it prints the same value (i.e. A1 = A2 = A3 = ... = A207999). What I actually want is:

A1 = row 1
A2 = row 2
...

Thank you in advance for any assistance!

  • You almost never want lots of variables. 20,800 variables, not a chance! A list of 20,800 values, every time! – quamrana May 17 '19 at 20:25
  • I'm trying to do a form of web scraping with selenium which is selecting multiple drop downs and then download the excel files. The only way I've been able to figure out how to do it is to create a variable for each selection of drop downs and then have it go through my for loop. – Candice Sessa May 17 '19 at 20:30
  • Do you have any suggestions on a way to handle a webdriver going through 208000 combinations? – Candice Sessa May 17 '19 at 20:31
  • as @quamrana said you can use a list. Just do a for loop with the list to go through 20800 combinations, instead of having 20800 different variables. Then you can nest your own for loop in it. – Ben Pap May 17 '19 at 20:51
  • When you say you want A1 = row 1, etc, do you want each variable to be a tuple representing the values in the rows? – rcriii May 17 '19 at 20:56
  • @rcriii I would like it to be a tuple and not a list – Candice Sessa May 21 '19 at 19:06
  • Then I'd use @foglerit's answer, but with itertuples: https://stackoverflow.com/questions/9758450/pandas-convert-dataframe-to-array-of-tuples – rcriii May 21 '19 at 19:19

3 Answers

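pandas.read_csv returns a DataFrame, and iterating over a DataFrame directly yields its column labels rather than its rows. Iterating over df.itertuples() instead and wrapping it in islice should allow just the first 20800 rows to be collected:

from itertools import islice

import pandas

df = pandas.read_csv(file_name)
# itertuples(index=False, name=None) yields one plain tuple per row
A = list(islice(df.itertuples(index=False, name=None), 20800))

# now access rows: A[index]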
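
If the goal is only to cap the number of rows, read_csv also accepts an nrows argument. The following is a minimal sketch, assuming a small in-memory CSV (the hypothetical csv_text) in place of file_name and a limit of 2 rows in place of 20800:

```python
import io

import pandas as pd

# Hypothetical in-memory CSV standing in for file_name.
csv_text = "a,b\n1,42\n2,42\n3,42\n"

# nrows limits how many data rows read_csv parses.
df = pd.read_csv(io.StringIO(csv_text), nrows=2)

# One plain tuple per row, collected into a list.
A = list(df.itertuples(index=False, name=None))

print(len(A))
print(A[0])
```
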
quamrana

If you want to create a list containing the values of each row from your DataFrame, you can use the method df.iterrows():

[row[1].to_list() for row in df.iterrows()]

If you still want to create a large number of variables, you can do so in a loop as:

for row in df.iterrows():
    # iterrows() yields (index, Series) pairs, so the row values are in row[1]
    list_with_row_values = row[1].to_list()
    # create your variables here...
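
Since the comments ask for tuples rather than lists, here is a minimal sketch of the same idea with itertuples. The sample frame, the names rows and named_rows, and the use of a dictionary keyed by 'A1', 'A2', ... in place of separate variables are all assumptions for illustration:

```python
import pandas as pd

# Hypothetical stand-in for the frame loaded with pandas.read_csv(file_name).
df = pd.DataFrame({'a': [1, 2, 3], 'b': [42, 42, 42]})

# One plain tuple per row: index=False drops the index,
# name=None returns regular tuples instead of namedtuples.
rows = list(df.itertuples(index=False, name=None))

# Optional: keep the labels 'A1', 'A2', ... as dictionary keys
# rather than creating one variable per row.
named_rows = {'A%d' % n: row for n, row in enumerate(rows, start=1)}

print(rows[0])
print(named_rows['A2'])
```

A loop over rows (or over named_rows.items()) can then process each row in turn without any exec call.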
foglerit

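You are getting the same value for all the variables for two reasons. First, for c in df iterates over the column labels, not the rows. Second, you increment i inside the inner for loop while the outer loop also sets it, so each Annnn variable is overwritten on a later pass and ends up set to the first column label.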

So you want something more like:

In [2]: df = pd.DataFrame({'a':[1,2,3], 'b':[42, 42, 42]})

In [3]: df
Out[3]:
   a   b
0  1  42
1  2  42
2  3  42

In [28]: for i, c in enumerate(df.iterrows(), start=1):
...:     exec("%s = c" % ('A' + str(i)))
...:

In [29]: A1
Out[29]:
(0L, a     1
 b    42
 Name: 0, dtype: int64)

In [30]: A1[0]
Out[30]: 0L

In [32]: A1[1]
Out[32]:
a     1
b    42
Name: 0, dtype: int64
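
The overwriting in the original nested loop can be reproduced on a small scale without exec. This is only a sketch: the dictionary assigned (a hypothetical name) stands in for the variables exec would create, and the list columns stands in for the labels that for c in df yields on the two-column frame above:

```python
# Column labels stand in for what `for c in df` yields on a two-column frame.
columns = ['a', 'b']

assigned = {}  # plays the role of the variables created by exec

for i in range(1, 6):
    for c in columns:
        assigned['A' + str(i)] = c
        i += 1  # only affects this pass; the outer loop resets i next time

# A1 to A5 all end up holding the first column label.
print(assigned)
```

Each outer pass writes the first label to A<i> and the second to A<i+1>, and the next pass then overwrites A<i+1> with the first label again.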
rcriii