I have quite a few dataframes which I defined early in my script and I would like to iterate over them and modify them like so:

for df in [df_fap, df_spf, df_skin, ...]:
    df = df.filter(regex=(assay + r"[0-9]+"))

However this does not work. The values of the dataframes are not modified when the loop finishes. I stumbled upon this post which is slightly similar (except I define my variables beforehand) but it doesn't really offer a solution to my exact problem. Thanks!

quantik
  • No, I don't get an error. The dataframes are just not modified at all once the loop finishes. For instance, `df_fap` has a column `NC-1` which should be removed (and it is if I do `df_fap = df_fap.filter(regex=(assay + r"[0-9]+"))`) but isn't removed in the above script. – quantik Jun 01 '17 at 15:17
  • @Remolten No it shouldn't. – khelwood Jun 01 '17 at 15:22
  • Yes, because assignment doesn't modify anything. – juanpa.arrivillaga Jun 01 '17 at 16:10

3 Answers

2

The looping variable df is in turn assigned each element of your list. If you reassign df, then you've made df refer to something else. It doesn't affect the list.

Reassigning the looping variable when iterating through a list doesn't alter the list, let alone altering the variables that were used to populate the list.
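A minimal sketch of the difference between rebinding the loop variable and mutating the object it refers to. Plain lists stand in for the dataframes here (an assumption made to keep the example free of dependencies); the names `first` and `second` are made up for illustration.

```python
# Plain lists stand in for DataFrames; name binding works the same way.
first = [1, 2, 3]
second = [4, 5, 6]

for item in [first, second]:
    # Rebinds the loop name `item` to a brand-new list.
    # `first` and `second` still refer to their original objects.
    item = [x * 10 for x in item]

print(first, second)  # [1, 2, 3] [4, 5, 6]

for item in [first, second]:
    # Mutates the object in place, so every name that refers to it
    # (including `first` and `second`) sees the change.
    item.append(0)

print(first, second)  # [1, 2, 3, 0] [4, 5, 6, 0]
```

`DataFrame.filter` returns a new object rather than changing the existing one, so the loop in the question behaves like the first case.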

Try a list comprehension.

new_list = [df.filter(whatever) for df in (df_fap, df_spf, df_skin, ...)]

If you then also want to reassign your starting variables, you could use:

df_fap, df_spf, df_skin, ... = new_list

You could even do both those operations in one shot:

df_fap, df_spf, df_skin, ... = [df.filter(whatever) for df in (df_fap, df_spf, df_skin, ...)]
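As a runnable sketch of the one-shot version, assuming pandas is installed: the column names, the two small dataframes, and the value of `assay` below are all invented for illustration, since the question does not show the real data.

```python
import pandas as pd

# Hypothetical prefix; the question never shows the real value of `assay`.
assay = "FAP-"

# Hypothetical data: each frame has assay columns plus an unwanted NC-1 column.
df_fap = pd.DataFrame({"FAP-1": [1, 2], "FAP-2": [3, 4], "NC-1": [5, 6]})
df_spf = pd.DataFrame({"FAP-1": [7, 8], "NC-1": [9, 10]})

pattern = assay + r"[0-9]+"

# Build the filtered frames, then unpack them back onto the original names.
df_fap, df_spf = [df.filter(regex=pattern) for df in (df_fap, df_spf)]

print(list(df_fap.columns))  # ['FAP-1', 'FAP-2']
print(list(df_spf.columns))  # ['FAP-1']
```

The names on the left must appear in the same order as in the tuple on the right, otherwise the filtered frames end up bound to the wrong variables.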
khelwood
  • I'm confused about what exactly is happening in the way I am doing it. Are the elements in the list temporarily modified? Are they just placeholders? Then when the next element is called, what happens to the change that was made? – quantik Jun 01 '17 at 15:21
  • The way you're doing it, the list is not modified. Only the `df` variable is changed to refer to something different. – khelwood Jun 01 '17 at 15:24
1

So you have your list of variables:

[df_fap, df_spf, df_skin, ...]

When you loop, you're creating a new variable:

for df in [df_fap, df_spf, df_skin, ...]:
    df = value

Each iteration of your for loop rebinds df to a new value, which means none of your original variables change.

The answer khelwood gave reassigns all of your variables and applies the filter in one statement:

df_fap, df_spf, df_skin, ... = [df.filter(whatever) for df in (df_fap, df_spf, df_skin, ...)]

Try running something like

a, b = ["apple", "banana"]

in your console, and khelwood's explanation will make sense.

Tomos Williams
0

Try

for i in range(len(df_list)):
    df_list[i] = df_list[i].filter(...)
Jerry Zhao
  • Ah I wanted to do this without defining the list beforehand. Not sure if it's possible though. EDIT: Tried. This does not work unfortunately – quantik Jun 01 '17 at 15:18
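One possible reason the index-based loop appears not to work: assigning to `df_list[i]` replaces the element stored in the list, but the original variable names still refer to the old objects. A small sketch, with strings standing in for the dataframes and `str.upper` standing in for the filter call (both are assumptions made purely for illustration):

```python
# Strings stand in for DataFrames; upper() stands in for .filter(...).
df_fap = "fap-original"
df_spf = "spf-original"
df_list = [df_fap, df_spf]

for i in range(len(df_list)):
    # Replaces the list element; the names df_fap and df_spf are not rebound.
    df_list[i] = df_list[i].upper()

print(df_list)  # ['FAP-ORIGINAL', 'SPF-ORIGINAL']
print(df_fap)   # fap-original
```

So the modified objects are available as `df_list[0]`, `df_list[1]`, and so on, while `df_fap` and `df_spf` only change if they are reassigned explicitly.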