I have a DataFrame of tweets and their authors; there are 45 authors in total. I want to split the DataFrame into groups of 2 authors at a time so that I can export each group to a CSV file later.
I tried the following (the authors are in a column named 'B' and the tweets are in a column named 'A').
I took this part from this question:
df.set_index(keys=['B'],drop=False,inplace=True)
authors = df['B'].unique().tolist()
Then, to split the rows into pairs of authors:
dgroups = []
for i in range(0, len(authors)-1, 2):
    dgroups.append(df.loc[df.B == authors[i]])
    dgroups.extend(df.loc[df.B == authors[i+1]])
But instead of pairs, it gives me sub-lists like this:
dgroups = [['A'],['B'],
[tweet,author],
['A'],['B'],
[tweet,author2]]
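The output above looks consistent with how `list.extend` treats its argument: it iterates over it, and iterating a DataFrame yields its column labels rather than its rows. A minimal sketch on a made-up three-row frame (the data is hypothetical; only the column names 'A' and 'B' come from the description above) contrasts `extend` with combining the two selections through `pd.concat` before appending:

```python
import pandas as pd

# Hypothetical toy data: column 'A' holds tweets, column 'B' holds authors.
df = pd.DataFrame({
    "A": ["tweet 1", "tweet 2", "tweet 3"],
    "B": ["author1", "author1", "author2"],
})

extended = []
# extend() iterates its argument; iterating a DataFrame yields column labels.
extended.extend(df.loc[df.B == "author1"])

paired = []
# Concatenate both authors' rows into one DataFrame, then append it as one item.
paired.append(pd.concat([df.loc[df.B == "author1"], df.loc[df.B == "author2"]]))

print(extended)        # the column labels, not the rows
print(len(paired[0]))  # number of rows for both authors together
```

So `append` adds the whole DataFrame as a single list element, while `extend` unpacks it into its column names.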
Before this, I was able to split them correctly into 45 sub-lists, one per author, using the approach from the question linked above:
groups = []
for i in authors:
    groups.append(df.loc[df.B == i])
So how would I do the same for groups of 2 authors, 3 authors, and so on?
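One way this might be generalized, offered as a sketch rather than a definitive answer: slice the list of unique authors into chunks of size `n` and select each chunk's rows with `Series.isin`. The helper name `split_by_authors` and the toy data are hypothetical; the column names 'A' and 'B' follow the description above.

```python
import pandas as pd

def split_by_authors(df, n, author_col="B"):
    """Return a list of DataFrames, each holding the rows of up to n authors."""
    authors = df[author_col].unique().tolist()
    groups = []
    for start in range(0, len(authors), n):
        chunk = authors[start:start + n]  # the last chunk may be shorter than n
        groups.append(df.loc[df[author_col].isin(chunk)])
    return groups

# Hypothetical toy data with three authors.
df = pd.DataFrame({
    "A": ["t1", "t2", "t3", "t4", "t5"],
    "B": ["a1", "a2", "a1", "a3", "a2"],
})

dgroups = split_by_authors(df, n=2)
print(len(dgroups))  # 2: one group for a1 and a2, one for the leftover a3
```

With 45 authors and `n=2`, this would give 23 groups, the last one containing a single author.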
EDIT: Based on @Jonathan Leon's answer, I tried the following. It worked, but it isn't a dynamic solution and I suspect it is inefficient, especially for n > 3:
dgroups = []
for i in range(2, len(authors)+1, 2):
    tempset1 = []
    tempset2 = []
    tempset1 = df.loc[df.B == authors[i-2]]
    if (i-1 != len(authors)):
        tempset2 = df.loc[df.B == authors[i-1]]
        dgroups.append(tempset1.append(tempset2))
    else:
        dgroups.append(tempset1)
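A possibly more dynamic variant of the loop above, sketched under the assumption that groups should follow the order in which authors first appear: `pd.factorize` numbers the authors 0, 1, 2, and so on, integer division by `n` turns those numbers into group ids, and `groupby` splits on them. The file name pattern `authors_group_{k}.csv`, the temporary output directory, and the toy data are made up for illustration.

```python
import os
import tempfile

import pandas as pd

# Hypothetical toy data: column 'A' holds tweets, column 'B' holds authors.
df = pd.DataFrame({
    "A": ["t1", "t2", "t3", "t4", "t5"],
    "B": ["a1", "a2", "a1", "a3", "a2"],
})

n = 2  # authors per group
# factorize() assigns 0, 1, 2, ... to authors in order of first appearance.
codes, _ = pd.factorize(df["B"])
group_ids = codes // n  # authors 0..n-1 -> group 0, n..2n-1 -> group 1, ...

out_dir = tempfile.mkdtemp()  # hypothetical output location
paths = []
for k, part in df.groupby(group_ids):
    path = os.path.join(out_dir, f"authors_group_{k}.csv")
    part.to_csv(path, index=False)  # one CSV file per group of n authors
    paths.append(path)
```

Changing `n` is then the only adjustment needed for groups of 3 or more authors, and no per-author temporary variables are required.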