
I have a data frame with a single column in which names and fathers' names alternate in blocks of three, as shown in the example below:

    name 1
    name 2
    name 3
    father's name 1
    father's name 2
    father's name 3
    name 4
    name 5
    name 6
    father's name 4
    father's name 5
    father's name 6

Now I want to create a data frame from this in which each name is paired with the corresponding father's name, as shown below:

    name 1 father's name 1
    name 2 father's name 2
    name 3 father's name 3
    name 4 father's name 4
    name 5 father's name 5
    name 6 father's name 6

I want to do this in pandas. I tried the approach from the linked question "Pandas every nth row", but it did not help much.

gis.rajan

2 Answers


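You can build a boolean mask from the index of the column and use it to select the name rows and the father's name rows separately. Integer-dividing the index by 3 numbers the blocks of three rows, and taking that result modulo 2 marks the even blocks (names) as True and the odd blocks (fathers' names) as False. This relies on the data frame having the default 0..n-1 index.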

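    # True for rows in even-numbered blocks of three (names),
    # False for rows in odd-numbered blocks (fathers' names).
    mask = df['col'].index.values // 3 % 2 == 0

    # Reset both halves to a 0..n-1 index so the concatenation aligns row by row.
    out = df.loc[mask, ['col']].reset_index(drop=True) + ' ' + df.loc[~mask, ['col']].reset_index(drop=True)
    print(out)

                          col
    0  name 1 father's name 1
    1  name 2 father's name 2
    2  name 3 father's name 3
    3  name 4 father's name 4
    4  name 5 father's name 5
    5  name 6 father's name 6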
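
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same mask idea. It is a variation, not the exact code above: it builds the mask from row positions with `np.arange` so it does not depend on the index labels, and it returns two columns instead of one joined string. The names `block`, `names`, `fathers` and the output column names `name` and `father` are my own assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"col": [
    "name 1", "name 2", "name 3",
    "father's name 1", "father's name 2", "father's name 3",
    "name 4", "name 5", "name 6",
    "father's name 4", "father's name 5", "father's name 6",
]})

block = 3  # assumed block size: 3 names, then 3 fathers' names

# Use row positions rather than index labels, so a non-default index still works.
mask = np.arange(len(df)) // block % 2 == 0

# Reset the index of each half so the two Series line up row by row.
names = df.loc[mask, "col"].reset_index(drop=True)
fathers = df.loc[~mask, "col"].reset_index(drop=True)

out = pd.DataFrame({"name": names, "father": fathers})
print(out)
```

If you need the single joined column from the question, `names + ' ' + fathers` gives that Series directly.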
Ynjxsjmh

Try grouping the rows by the number at the end of each string. Starting from this df:

    df
        col1
    0   name 1
    1   name 2
    2   name 3
    3   father's name 1
    4   father's name 2
    5   father's name 3
    6   name 4
    7   name 5
    8   name 6
    9   father's name 4
    10  father's name 5
    11  father's name 6

    # Take the last whitespace-separated token of each row as the grouping key.
    df['col2'] = df['col1'].str.split().str[-1]
    # sort=False keeps the keys in order of first appearance; the default sort
    # is lexicographic on these string keys and would put '10' before '2'.
    df1 = df.groupby('col2', sort=False)['col1'].apply(' '.join).reset_index()

    df1
        col2    col1
    0   1   name 1 father's name 1
    1   2   name 2 father's name 2
    2   3   name 3 father's name 3
    3   4   name 4 father's name 4
    4   5   name 5 father's name 5
    5   6   name 6 father's name 6

    # Drop the helper key column once the rows are joined.
    df1 = df1.drop(columns='col2')

    df1

        col1
    0   name 1 father's name 1
    1   name 2 father's name 2
    2   name 3 father's name 3
    3   name 4 father's name 4
    4   name 5 father's name 5
    5   name 6 father's name 6
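
If the strings do not end in a shared number you can group on, a purely positional alternative is to reshape the underlying array. This is a different technique from the groupby above, sketched under the assumption that the column always alternates 3 names and 3 fathers' names and that its length is a multiple of 6. The names `block`, `values`, `pairs` and the output columns `name` and `father` are assumptions of mine.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"col1": [
    "name 1", "name 2", "name 3",
    "father's name 1", "father's name 2", "father's name 3",
    "name 4", "name 5", "name 6",
    "father's name 4", "father's name 5", "father's name 6",
]})

block = 3  # assumed block size

values = df["col1"].to_numpy()

# Shape (groups, 2, block): axis 1 is 0 for names and 1 for fathers' names.
# Swap the last two axes so each name sits next to its father's name,
# then flatten the result to one (name, father) pair per row.
pairs = values.reshape(-1, 2, block).transpose(0, 2, 1).reshape(-1, 2)

df1 = pd.DataFrame(pairs, columns=["name", "father"])
print(df1)
```

The reshape raises a `ValueError` if the number of rows is not a multiple of `2 * block`, which is a useful sanity check on the input layout.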

khaled koubaa