0

I have this code, which on each iteration assigns a different DataFrame from the list to the test set and the remaining ones to the train set. I am struggling to understand it.

df_list = [df1, df2, df3, df4, df5, df6]
for i in range(6):
    train = pd.concat(df_list[0:i] + df_list[i+1:])
    test = df_list[i]

Does it go from df1 (index 0) all the way to df6 (index 5), adding one more df each time?

Can you please help me understand the code?

deadshot
redplanet
  • https://towardsdatascience.com/the-basics-of-indexing-and-slicing-python-lists-2d12c90a94cf – Joe Apr 07 '20 at 11:04
  • https://stackoverflow.com/questions/509211/understanding-slice-notation – Joe Apr 07 '20 at 11:05

2 Answers

0

For example, when i == 1, train is pd.concat(df_list[0:1] + df_list[2:]), which becomes pd.concat([df1, df3, df4, df5, df6]), and test is df_list[1], which is df2.

So in general, on each iteration train contains every DataFrame in df_list except the one at index i, and test contains the DataFrame at index i.
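
The loop can be checked with a small runnable sketch. The six tiny frames and the "source" column are hypothetical stand-ins (the real contents of df1 to df6 are not shown in the question); the column exists only so each row can be traced back to its frame.

```python
import pandas as pd

# Hypothetical stand-ins for df1..df6: two rows each, labelled by origin.
df_list = [pd.DataFrame({"source": [f"df{k}"] * 2}) for k in range(1, 7)]

folds = []
for i in range(len(df_list)):
    # Frames before index i plus frames after it; df_list[i] itself is left out.
    train = pd.concat(df_list[:i] + df_list[i + 1:], ignore_index=True)
    test = df_list[i]
    # Record which frames ended up in train and which one is the test frame.
    folds.append((sorted(set(train["source"])), test["source"].iloc[0]))

for train_names, test_name in folds:
    print("test:", test_name, "| train:", train_names)
```

Each frame is used as the test set exactly once, and train always holds the other five.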

Raf
deadshot
  • Hi, thanks for the answer. Just to understand: when you say train will contain each DataFrame in df_list except the one at the ith index, do you mean both places where i appears in train = pd.concat(df_list[0:i] + df_list[i+1:])? – redplanet Apr 07 '20 at 10:33
  • `pd.concat(df_list[0:i] + df_list[i+1:])` never includes index `i`: in `df_list[0:i]` the end index `i` is exclusive, and `df_list[i+1:]` always starts from `i+1`. – deadshot Apr 07 '20 at 10:43
  • One last question, please. When i=2 we have df_list[0:2], so here we take df0 and exclude df3, which is at position 2, right? Then df_list[i+1:] is df_list[3:], which means we take df4, which is at position 3, and then df5 and df6, which are at positions 4 and 5. So we concat df0 + df4 + df5 + df6 for train, and df3 goes to the test set. Where is df2? I am confused. – redplanet Apr 07 '20 at 10:59
  • Oh, when we say df_list[0:2] we take 0 and 1 and exclude 2, right? – redplanet Apr 07 '20 at 11:01
  • @redplanet Yes, in `df_list[0:n]` the index `n` is always excluded. – deadshot Apr 07 '20 at 11:04
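
The i = 2 case from the comments can be traced with plain strings. The list `names` is a hypothetical stand-in for df_list, used only to make the slice boundaries visible; positions start at 0, so "df1" sits at position 0 and "df3" at position 2.

```python
# Strings standing in for the DataFrames (illustrative only).
names = ["df1", "df2", "df3", "df4", "df5", "df6"]

i = 2
before = names[0:i]     # positions 0 and 1; the end index 2 is excluded
after = names[i + 1:]   # position 3 through the end
held_out = names[i]     # position 2, which neither slice contains

print(before)     # ['df1', 'df2']
print(after)      # ['df4', 'df5', 'df6']
print(held_out)   # df3
```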
-1

Although I'm a beginner at Python, I think the line train = pd.concat(df_list[0:i] + df_list[i+1:]) needs to be indented so that it is part of the for loop over i.

The code means that, for i from 0 to 5 (the maximum, because of range(6)), pd.concat (a function that is presumably imported and called here) combines the items of df_list from 0 up to i with those from i+1 onward and stores the result in the variable train.

This means that train and test are overwritten every time the loop runs. I think it would be better to use test(i) = df_list[i]

AMRN
  • Please test before submitting and post working snippets of code. `test(i)` is clearly not ok since `test` is not a function. If you meant create a list and save values, it should be `test[i]`, or `test.append(df_list[i])`. – Raf Apr 07 '20 at 21:34
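
If the goal is to keep every split instead of overwriting train and test on each pass, the `append` approach from the comment above might look like the sketch below. The names `train_sets` and `test_sets` are hypothetical, and strings stand in for the DataFrames; with real frames, each train entry would presumably be wrapped in pd.concat as in the question.

```python
# Strings standing in for df1..df6 (illustrative only).
df_list = ["df1", "df2", "df3", "df4", "df5", "df6"]

train_sets = []  # one entry per iteration
test_sets = []

for i in range(len(df_list)):
    # With DataFrames this would be pd.concat(df_list[:i] + df_list[i + 1:]).
    train_sets.append(df_list[:i] + df_list[i + 1:])
    test_sets.append(df_list[i])

# train_sets[i] and test_sets[i] now belong to the same iteration.
print(train_sets[0], test_sets[0])
```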