loops in python for different combinations

Question

I have seven dataframes tbl1851, tbl1861, tbl1871, tbl1881, tbl1891, tbl1901, tbl1911.

Each dataframe has the same fields 'Sex', 'Age', 'Num'.

I want to select a subset from each dataframe by first creating a boolean Series.

My code looks like this:

AM1851 = ((tbl1851.Sex=="M") & (tbl1851.Age>=15) & (tbl1851.Age<999))
AM1861 = ((tbl1861.Sex=="M") & (tbl1861.Age>=15) & (tbl1861.Age<999))
AM1871 = ((tbl1871.Sex=="M") & (tbl1871.Age>=15) & (tbl1871.Age<999))
AM1881 = ((tbl1881.Sex=="M") & (tbl1881.Age>=15) & (tbl1881.Age<999))
AM1891 = ((tbl1891.Sex=="M") & (tbl1891.Age>=15) & (tbl1891.Age<999))
AM1901 = ((tbl1901.Sex=="M") & (tbl1901.Age>=15) & (tbl1901.Age<999))
AM1911 = ((tbl1911.Sex=="M") & (tbl1911.Age>=15) & (tbl1911.Age<999))

Is there a way to write a loop that achieves the same result as the code above?

There are many different selection combinations, so I don't really want to copy and paste and then search and replace lots of times.

Possible duplicate of [How do I create a variable number of variables?](https://stackoverflow.com/questions/1373164/how-do-i-create-a-variable-number-of-variables) — G. Anderson, May 10 '19 at 15:35
Your best bet may be to create a function that takes in a dataframe and returns the subset of that dataframe as desired, then apply it to all your DFs according to the link above — G. Anderson, May 10 '19 at 15:37

score 1 · Answer 1 · answered May 10 '19 at 15:38

Instead of having each dataframe as a separate variable, put them in a list:

frames = [
    # dataframe 1,
    # dataframe 2,
    # etc.
]

Then you can easily loop through them to create another list:

AMs = []
for frame in frames:
    AMs.append((frame.Sex=="M") & (frame.Age>=15) & (frame.Age<999))
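
As a minimal sketch of how the resulting masks might then be used, each mask can be paired with its dataframe via zip to select the matching rows. The two small dataframes below are made-up toy data standing in for the real tables, and the name subsets is only illustrative:

```python
import pandas as pd

# Hypothetical toy data standing in for the real census tables.
tbl1851 = pd.DataFrame({"Sex": ["M", "F", "M"], "Age": [20, 30, 10], "Num": [5, 7, 9]})
tbl1861 = pd.DataFrame({"Sex": ["M", "M", "F"], "Age": [999, 40, 50], "Num": [1, 2, 3]})

frames = [tbl1851, tbl1861]

# One boolean Series per dataframe, in the same order as frames.
AMs = []
for frame in frames:
    AMs.append((frame.Sex == "M") & (frame.Age >= 15) & (frame.Age < 999))

# Apply each mask to its own dataframe to get the selected rows.
subsets = [frame[mask] for frame, mask in zip(frames, AMs)]
```

Because the two lists share the same order, AMs[0] always belongs to frames[0], and so on.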

John Gordon


I agree, a list seems most applicable here. – Reedinationer May 10 '19 at 15:41

score 0 · Answer 2 · answered May 10 '19 at 15:40

I think a function would do that, since each line uses the same tblxxxx object 3 times. I would try something like:

def build_my_data_set(input_data_frame):
    return ((input_data_frame.Sex=="M") & (input_data_frame.Age>=15) & (input_data_frame.Age<999))

my_data_frames = [build_my_data_set(data_item) for data_item in [tbl1851, tbl1861, tbl1871]] # but you would fill the list with every item you want to include

The resulting my_data_frames is a list holding all the AMxxxx objects you have defined (one boolean Series per dataframe), condensing them into a single variable that you index to find the appropriate item. If you need to associate each item with its xxxx year, use a dictionary instead, with the year as the key.
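
A rough sketch of the dictionary idea, which also parameterises the function so that other selection combinations do not need copy and paste. The name build_mask, its parameters, and the toy data are assumptions made for illustration, not part of the original code:

```python
import pandas as pd

def build_mask(df, sex, min_age, max_age=999):
    """Return a boolean Series: rows with the given sex and min_age <= Age < max_age."""
    return (df.Sex == sex) & (df.Age >= min_age) & (df.Age < max_age)

# Hypothetical toy data; in practice these would be tbl1851, tbl1861, ...
tables = {
    1851: pd.DataFrame({"Sex": ["M", "F", "M"], "Age": [20, 30, 10], "Num": [5, 7, 9]}),
    1861: pd.DataFrame({"Sex": ["M", "M", "F"], "Age": [999, 40, 50], "Num": [1, 2, 3]}),
}

# One dictionary per selection, each keyed by year.
adult_males = {year: build_mask(df, "M", 15) for year, df in tables.items()}
adult_females = {year: build_mask(df, "F", 15) for year, df in tables.items()}

# Look up a mask by year and use it on the matching table.
total = tables[1851].loc[adult_males[1851], "Num"].sum()
```

Keeping the tables themselves in a dictionary keyed by year means a mask and its table can always be matched up through the same key.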

score 0 · Answer 3 · answered May 10 '19 at 15:56

You could group them into a list and loop through them:

tbls = [tbl1851, tbl1861, tbl1871, tbl1881, tbl1891, tbl1901, tbl1911]
my_func = lambda x: (x.Sex == "M") & (x.Age >= 15) & (x.Age < 999)
AMs = []
for df in tbls:
    # call the function on the whole dataframe; df.apply would pass it one column at a time
    AMs.append(my_func(df))

And if you want to access the elements by name, instead of creating a list you could create a dictionary, with the variable names as keys:

AM_names = ["AM1851", "AM1861", "AM1871", "AM1881", "AM1891", "AM1901", "AM1911"]
tbls = [tbl1851, tbl1861, tbl1871, tbl1881, tbl1891, tbl1901, tbl1911]
my_func = lambda x: (x.Sex == "M") & (x.Age >= 15) & (x.Age < 999)
AMs = {}
for idx, df in enumerate(tbls):
    # key each mask by its name, e.g. AMs["AM1851"]
    AMs[AM_names[idx]] = my_func(df)
