2

I have a dataframe like:

    x1    y1    x2    y2
0  149  2653  2152  2656
1  149  2465  2152  2468
2  149  1403  2152  1406
3  149  1215  2152  1218
4  170  2692  2170  2695
5  170  2475  2170  2478
6  170  1413  2170  1416
7  170  1285  2170  1288

I need to pair up consecutive rows by DataFrame index, i.e. [0, 1], [2, 3], [4, 5], [6, 7], etc.,

and extract x1, y1 from the first row of each pair and x2, y2 from the second row of that pair.

Sample Output:

[[149,2653,2152,2468],[149,1403,2152,1218],[170,2692,2170,2478],[170,1413,2170,1288]]

Please feel free to ask if it's not clear.

So far I have tried grouping by pairs and using the shift operation, but I didn't manage to build the paired records.

Mohamed Thasin ah
  • @Rakesh - My question is completely different from the marked question. First, I need one result per pair, i.e. if len(df) is n, then my result list contains n/2 elements. Second, I need to take the first two elements from the first row of the pair and the last two elements from the second row of the pair. I think you didn't properly understand the question – Mohamed Thasin ah Jun 04 '18 at 08:19
  • 1
    **"extract x1,y1 from odd-numbered rows, x2,y2 from even-numbered"** And for *"pair by each two rows from data frame index"* => *"group the odd-numbered rows separate to the even-numbered rows"* – smci Jun 04 '18 at 08:21
  • What output format do you want? A list-of-lists? A dataframe also with columns *x1, y1, x2, y2* but half as many rows? – smci Jun 04 '18 at 08:25
  • @smci - list of list format would be enough – Mohamed Thasin ah Jun 04 '18 at 08:27

3 Answers

3

Python solution:

Select the column values by position and convert them to lists:

a = df[['x2', 'y2']].iloc[1::2].values.tolist()
b = df[['x1', 'y1']].iloc[0::2].values.tolist()

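Then zip the two lists and concatenate each pair in a list comprehension:

# b holds [x1, y1] from the first row of each pair, a holds [x2, y2] from the second
L = [first + second for first, second in zip(b, a)]
print(L)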
[[149, 2653, 2152, 2468], [149, 1403, 2152, 1218], 
 [170, 2692, 2170, 2478], [170, 1413, 2170, 1288]]

Thank you, @user2285236 for another solution:

import numpy as np

L = np.concatenate([df.loc[::2, ['x1', 'y1']], df.loc[1::2, ['x2', 'y2']]], axis=1).tolist()

Pure pandas solution:

First, apply DataFrameGroupBy.shift within each group of 2 rows:

df[['x2', 'y2']] = df.groupby(np.arange(len(df)) // 2)[['x2', 'y2']].shift(-1)
print (df)
    x1    y1      x2      y2
0  149  2653  2152.0  2468.0
1  149  2465     NaN     NaN
2  149  1403  2152.0  1218.0
3  149  1215     NaN     NaN
4  170  2692  2170.0  2478.0
5  170  2475     NaN     NaN
6  170  1413  2170.0  1288.0
7  170  1285     NaN     NaN

Then drop the rows containing NaN, convert back to int, and convert to a list:

print (df.dropna().astype(int).values.tolist())
[[149, 2653, 2152, 2468], [149, 1403, 2152, 1218], 
 [170, 2692, 2170, 2478], [170, 1413, 2170, 1288]]
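
The assignment above overwrites x2 and y2 in df itself. As a minimal sketch of the same groupby/shift idea that leaves the original frame untouched, assuming the sample data from the question (the names `pair_id`, `shifted` and `result` are illustrative only, not part of the answer):

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    [[149, 2653, 2152, 2656], [149, 2465, 2152, 2468],
     [149, 1403, 2152, 1406], [149, 1215, 2152, 1218],
     [170, 2692, 2170, 2695], [170, 2475, 2170, 2478],
     [170, 1413, 2170, 1416], [170, 1285, 2170, 1288]],
    columns=["x1", "y1", "x2", "y2"],
)

# Label rows 0,1 -> 0; rows 2,3 -> 1; and so on.
pair_id = np.arange(len(df)) // 2

# Within each pair, pull x2/y2 up from the second row; the second row gets NaN.
shifted = df.groupby(pair_id)[["x2", "y2"]].shift(-1)

# Build a new frame instead of assigning back into df.
result = (
    df.assign(x2=shifted["x2"], y2=shifted["y2"])
      .dropna()          # drop the second row of every pair
      .astype(int)       # NaN forced floats; convert back to int
      .to_numpy()
      .tolist()
)
print(result)
```

With an odd number of rows, the trailing unpaired row would also get NaN from the shift and be dropped by dropna().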
jezrael
2

Here's one solution via numpy.hstack. Note it is natural to feed numpy arrays directly to pd.DataFrame, since this is how Pandas stores data internally.

import numpy as np
import pandas as pd

arr = np.hstack((df[['x1', 'y1']].values[::2],
                 df[['x2', 'y2']].values[1::2]))

res = pd.DataFrame(arr)

print(res)

     0     1     2     3
0  149  2653  2152  2468
1  149  1403  2152  1218
2  170  2692  2170  2478
3  170  1413  2170  1288
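
Since the comments ask for a list of lists, the stacked array can be converted directly. A short self-contained sketch, assuming the sample data from the question; `pairs` is an illustrative name, and passing `columns=df.columns` is an optional addition to keep the original column labels:

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    [[149, 2653, 2152, 2656], [149, 2465, 2152, 2468],
     [149, 1403, 2152, 1406], [149, 1215, 2152, 1218],
     [170, 2692, 2170, 2695], [170, 2475, 2170, 2478],
     [170, 1413, 2170, 1416], [170, 1285, 2170, 1288]],
    columns=["x1", "y1", "x2", "y2"],
)

# x1, y1 from rows 0, 2, 4, ... next to x2, y2 from rows 1, 3, 5, ...
arr = np.hstack((df[["x1", "y1"]].to_numpy()[::2],
                 df[["x2", "y2"]].to_numpy()[1::2]))

pairs = arr.tolist()                          # plain list of lists
res = pd.DataFrame(arr, columns=df.columns)   # same data, labelled columns

print(pairs)
```

Note that np.hstack needs both slices to have the same number of rows, so this sketch assumes an even row count.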
jpp
0

Here's a solution using a custom iterator based on iterrows(), but it's a bit clunky:

import pandas as pd
df = pd.DataFrame( columns=['x1','y1','x2','y2'], data=
    [[149, 2653, 2152, 2656], [149, 2465, 2152, 2468], [149, 1403, 2152, 1406], [149, 1215, 2152, 1218],
    [170, 2692, 2170, 2695], [170, 2475, 2170, 2478], [170, 1413, 2170, 1416], [170, 1285, 2170, 1288]] )

def iter_oddeven_pairs(df):

    row_it = df.iterrows()

    try:
        while True:
            _,row = next(row_it)
            yield row[0:2]
            _,row = next(row_it)
            yield row[2:4]
    except StopIteration:
        pass

print(pd.concat(list(iter_oddeven_pairs(df))))
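
The concatenated result above is one long Series rather than the list of lists requested in the question. One possible variant, sketched here under the assumption of the same sample data, walks the rows two at a time by zipping the iterrows() iterator with itself (a swapped-in technique replacing the try/except loop) and yields one combined list per pair. The names `iter_row_pairs` and `pairs` are hypothetical:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    [[149, 2653, 2152, 2656], [149, 2465, 2152, 2468],
     [149, 1403, 2152, 1406], [149, 1215, 2152, 1218],
     [170, 2692, 2170, 2695], [170, 2475, 2170, 2478],
     [170, 1413, 2170, 1416], [170, 1285, 2170, 1288]],
    columns=["x1", "y1", "x2", "y2"],
)

def iter_row_pairs(frame):
    """Yield [x1, y1, x2, y2] built from each consecutive pair of rows."""
    rows = frame.iterrows()
    # zip over the same iterator twice consumes the rows two at a time.
    for (_, first), (_, second) in zip(rows, rows):
        # First two values from the first row, last two from the second row.
        yield first.iloc[0:2].tolist() + second.iloc[2:4].tolist()

pairs = list(iter_row_pairs(df))
print(pairs)
```

If the frame has an odd number of rows, zip stops early and the trailing unpaired row is silently ignored.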
smci