Here is the code I've written. It steps three index variables through the data to compute p-values: i selects a row via data.loc, and j and k select the two columns being compared:
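import pandas as pd
from scipy.stats import ttest_ind_from_stats

i = 0  # first row of the current group of three rows
k = 2  # first column in the comparison
j = 2  # second column in the comparison
result = []
df = pd.DataFrame()
while j < data.shape[1]:
    # Rows i, i + 1 and i + 2 are passed as mean, std and nobs for each column.
    tstat, data_stat = ttest_ind_from_stats(data.loc[i][k], data.loc[i + 1][k], data.loc[i + 2][k],
                                            data.loc[i][j], data.loc[i + 1][j], data.loc[i + 2][j])
    result.append([data_stat])  # data_stat is the p-value
    j += 1
    if j == 8:
        # All columns compared for this group: move to the next three rows.
        j = 2
        i = i + 3
        if i == data.shape[0]:
            # All row groups done for this k: move to the next k column.
            k = k + 1
            i = 0
            if k > 7:
                break
data_result = pd.DataFrame(result)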
where data.shape[0] = 150 and data.shape[1] = 8.
This code produces the correct p-values, but as a single dataframe of 1800 rows x 1 column. I would like to split the result so that the code produces six separate dataframes, each with data.shape[1] - 2 columns (so 6 columns). Some examples:
1) The data_result dataframe from my current code:
1
0.658
0.1067
0.777
0.459
0.3307
1
0.622
0.4178
0.3158
0.7674
0.7426
2) What I want:
col1 col2 col3 col4 col5 col6
1 0.658 0.1067 0.777 0.459 0.3307
1 0.622 0.4178 0.3158 0.7674 0.7426
The code should produce six dataframes like the one above.
3) Optionally, I would then add a column to the left of each dataframe to hold a placeholder value for each row (screenshot omitted).
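For the optional step 3, here is a minimal sketch of adding a column on the left using DataFrame.insert. The column name "group" and the placeholder strings are hypothetical and only there for illustration; the two rows are the example values from step 2.

```python
import pandas as pd

# The two example rows from step 2.
frame = pd.DataFrame(
    [[1, 0.658, 0.1067, 0.777, 0.459, 0.3307],
     [1, 0.622, 0.4178, 0.3158, 0.7674, 0.7426]],
    columns=["col1", "col2", "col3", "col4", "col5", "col6"],
)

# Hypothetical placeholder labels, one per row.
labels = [f"row_{n}" for n in range(len(frame))]

# insert() modifies the frame in place; position 0 puts the new column first.
frame.insert(0, "group", labels)
print(frame.columns.tolist())
```

The same call could be repeated in a loop over the six dataframes once they exist.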
So basically, I want to split the resulting dataframe every 6 rows, turn each chunk from a single column into one row of six columns, then repeat for the next six values, and so on. I thought about building a Series or a new df until j = 8 and then appending it to the overall df as a row, but wasn't sure whether this would work or even be possible. Thanks!
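One possible way to do this split, sketched under the assumption that the 1800 values come out in the order the loop produces them (k outermost, then each group of three rows, then j innermost): reshape the single column into a 6 x 50 x 6 array and wrap each 50 x 6 slice in its own dataframe. The function name split_pvalues and the col1..col6 labels are made up for illustration, and the stand-in data is random numbers rather than real p-values.

```python
import numpy as np
import pandas as pd


def split_pvalues(data_result, n_frames=6, n_cols=6):
    """Split a single-column frame into n_frames frames of n_cols columns each.

    Assumes the values are ordered frame by frame, and row by row within a frame.
    """
    values = data_result.to_numpy().ravel()
    n_rows = values.size // (n_frames * n_cols)
    # Axes are (frame, row, column). reshape fills the last axis first,
    # so every run of n_cols consecutive values becomes one row.
    blocks = values.reshape(n_frames, n_rows, n_cols)
    columns = [f"col{c + 1}" for c in range(n_cols)]
    return [pd.DataFrame(block, columns=columns) for block in blocks]


# Stand-in for the 1800 x 1 data_result (random numbers, not real p-values).
rng = np.random.default_rng(0)
data_result = pd.DataFrame(rng.random(1800))

frames = split_pvalues(data_result)
print(len(frames), frames[0].shape)
```

Here frames[0] holds the first 300 values as 50 rows of 6, frames[1] the next 300, and so on. If the number of values is not an exact multiple of 6 x 6, reshape raises an error rather than silently dropping values.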
Edit: To summarize, I want to create six separate dataframes, each of shape 50 rows x 6 columns. My current dataframe is 1800 rows x 1 column.