Suppose I want to generate the indexes for a large header row automatically with a for loop, so that I don't have to write an index for each header by hand.
I have a file whose header contains lots of fruit names. Each column holds data that I have to access by index for downstream parsing. Rather than preparing an index for each fruit name manually, I want to run a for loop that creates the index variables on the fly, to save time.
data =
apple banana orange
genus:x,species:b genus:x,species:b genus:x,species:b
genus:x,species:b genus:x,species:b genus:x,species:b
variety:gala,pinklady,... variety:wild,hybrid... variety:florida,venz,
flavors:tangy,tart,sweet..
global_consumption:....
pricePerUnit:...
seedstocks:.....
insect_resistance:.....
producer:....
# first I convert the header into a list like this:
for lines in data:
    if 'apple' in lines:
        fruits = lines.rstrip('\n').split('\t')
        # this gives me the header as a list:
        # ['apple', 'banana', 'orange']
        # then I try to create the indexes as:
        for x in fruits:
            str(x) + '_idx' = fruits.index(x)
            # this is where the problem is for me .. !??
            # .. because this is not valid Python syntax
            print(x)
        # if this were possible, new variables would be created as
        # apple_idx = 0, banana_idx = 1 ... and so on
    # Now, start mining the data for the fruits of interest
    fields = lines.split('\t')
    apple_values = fields[apple_idx]
    for values in apple_values:
        # do something ......
The same goes for the other fruits. I also need to do several other things with the values.
Does that make sense?
How can this be done in a very simple way?
Post Edit: After a lot of reading, I realized that it is possible to create a variable_name from the value (a string) of another variable in bash: how to use a variable's value as other variable's name in bash
But it is not possible in Python in the way I had thought, at least not with ordinary assignment syntax. My gut feeling is that this capability could have been built into the Python language (by a hack, or if the author had decided to), but it is also possible that the author of Python thought about it and knew the possible dangers of using this method.
- The danger is that you always want the variable_name to be visible in the written Python script. Creating dynamic variable_names would have been nice, but it could make tracing back difficult if any problem arose.
- Since the variable name was never typed in, it would be a nightmare to track and debug any problem (especially in a large program), say when the variable_value was something like 2BetaTheta or *ping^pong, which is not a valid variable_name. This is my thought. Other people are welcome to chime in as to why this capability was not introduced in Python.
- The dict method overcomes this issue, since we have a record of the origin of the variable_name, but the issue of valid vs. invalid variable_name still doesn't go away.
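To make the valid vs. invalid point concrete, here is a small check I put together. It is only a sketch: the helper name is_usable_variable_name and the three header strings are made up for illustration. str.isidentifier() and keyword.iskeyword() are standard Python, and as far as I can tell a dict accepts any of these strings as a key, even the ones that could never be typed as a variable name.

```python
import keyword


def is_usable_variable_name(name):
    """Return True if `name` could be typed as a plain Python variable name."""
    return name.isidentifier() and not keyword.iskeyword(name)


# Hypothetical header values, including the two awkward ones mentioned above.
headers = ["apple", "2BetaTheta", "*ping^pong"]

# Which of these could ever become real variable names?
usable = {name: is_usable_variable_name(name) for name in headers}

# A dict does not care: any string can be a key.
index_by_name = {name: position for position, name in enumerate(headers)}

print(usable)
print(index_by_name["*ping^pong"])
```

So the restriction seems to apply only to real variable names, not to strings used as dictionary keys.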
I am going to take some of the provided answers that use the dict method
and see if I can work out a simple yet comprehensive way of making this possible.
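Here is a rough sketch of what I have in mind with the dict method. The assumptions are mine: the file is tab-separated with the header on the first line, the sample rows are a shortened, made-up stand-in for the real data, and build_index and column_values are hypothetical helper names.

```python
import io

# Shortened, made-up stand-in for the real tab-separated file.
sample = io.StringIO(
    "apple\tbanana\torange\n"
    "genus:x,species:b\tgenus:x,species:b\tgenus:x,species:b\n"
    "variety:gala,pinklady\tvariety:wild,hybrid\tvariety:florida,venz\n"
)


def build_index(header_line):
    """Map each header name to its column position."""
    names = header_line.rstrip("\n").split("\t")
    return {name: position for position, name in enumerate(names)}


def column_values(lines, fruit):
    """Collect the values in the column named `fruit`, skipping short rows."""
    lines = iter(lines)
    idx = build_index(next(lines))  # first line is assumed to be the header
    column = idx[fruit]             # replaces the apple_idx variable
    values = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if column < len(fields):
            values.append(fields[column])
    return values


apple_values = column_values(sample, "apple")
print(apple_values)
```

Instead of apple_idx, banana_idx and so on, every index lives in one dict and is looked up by the fruit name, so the same loop works for any number of headers.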
Thanks, everyone!