So I have the following dataframe, but with a variable number of rows (100, 1000, etc.):
# | Person1 | Person2 | Age |
---|---|---|---|
1 | Alex | Maria | 20 |
2 | Paul | Peter | 20 |
3 | Klaus | Hans | 30 |
4 | Victor | Otto | 30 |
5 | Gerry | Justin | 30 |
Problem:
Now I want to print separate dataframes, each containing all the people who share the same age, so the output should look like this:
DF1:
# | Person1 | Person2 | Age |
---|---|---|---|
1 | Alex | Maria | 20 |
2 | Paul | Peter | 20 |
DF2:
# | Person1 | Person2 | Age |
---|---|---|---|
3 | Klaus | Hans | 30 |
4 | Victor | Otto | 30 |
5 | Gerry | Justin | 30 |
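
For reproducibility, the sample data above can be built roughly like this. This is a minimal sketch: the variable name `df` is my own choice, and I'm assuming the `#` column is only a row label, so the dataframe keeps the default 0-based index.

```python
import pandas as pd

# Sample data from the table above; "#" is assumed to be a display-only row label.
df = pd.DataFrame({
    "Person1": ["Alex", "Paul", "Klaus", "Victor", "Gerry"],
    "Person2": ["Maria", "Peter", "Hans", "Otto", "Justin"],
    "Age": [20, 20, 30, 30, 30],
})

print(df)
```

With this data, the expected result is one dataframe with the two rows for age 20 and another with the three rows for age 30.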
I've tried this with the following functions:
Try1:

    def groupAge(data):
        x = -1
        for x in range(len(data)):
            #q = len(data[data["Age"] == data.loc[x, "Age"]])
            b = data[data["Age"] == data.loc[x,"Age"]]
            x = x + 1
            print(b,x)
        return b
Try2:

    def groupAge(data):
        x = 0
        for x in range(len(data)):
            q = len(data[data["Age"] == data.loc[x, "Age"]])
            x = x + 1
        for k in range(0,q,q):
            b = data[data["Age"] == data.loc[k,"Age"]]
            print(b)
        return b
Neither of them produced the right output. Try1 prints a few groups, each of them twice, but doesn't go through the entire dataframe. Try2 only prints the first Age "group", also twice.
I can't figure out why the output is always printed twice, nor why the loop doesn't work through the entire dataframe.
Can anyone help?
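
One direction that might work is pandas' built-in `groupby`, which splits a dataframe by the values of a column without a manual loop over the rows. The following is only a sketch under assumptions: the helper name `group_by_age` and the sample data are illustrative, and returning a dict of dataframes keyed by age is one possible design, not necessarily the best one.

```python
import pandas as pd

# Illustrative sample data matching the table in the question.
data = pd.DataFrame({
    "Person1": ["Alex", "Paul", "Klaus", "Victor", "Gerry"],
    "Person2": ["Maria", "Peter", "Hans", "Otto", "Justin"],
    "Age": [20, 20, 30, 30, 30],
})

def group_by_age(data):
    """Return a dict mapping each distinct age to its sub-dataframe."""
    # Iterating over a groupby object yields (key, sub-dataframe) pairs.
    return {age: group for age, group in data.groupby("Age")}

groups = group_by_age(data)

# Print each group exactly once.
for age, group in groups.items():
    print(f"Age {age}:")
    print(group)
```

Since the function only returns the groups and the printing happens once in the calling code, each group should appear a single time. If the original function was called as `print(groupAge(df))`, the returned dataframe would be printed a second time on top of the `print` inside the function, which might explain the doubled output.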