I have a NumPy array, shown below.

df=np.array([[None, 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', None],
 [None, None, None, None, None, None, None, None, None, None, None, None],
 ['E ', 'F ', 'D ', 'F ', 'D ', 'F ', 'D ', 'F ', 'D ', 'F ', 'D ', 'E '],
 ['E ', 'G ', None, 'H ', 'B ', 'H ', None, 'H ', None, 'H ', 'I ', 'E '],
 ['E ', None, 'B ', 'A ', None, 'G ', 'C ', None, 'C ', 'G ', None, 'E '],
 ['E ', 'C ', 'D ', None, 'H ', None, 'I ', 'D ', None, 'J ', 'G ', 'E '],
 ['E ', 'A ', None, 'I ', None, 'A ', 'B ', None, 'G ', 'H ', None, 'E '],
 ['E ', 'F ', 'C ', None, 'I ', None, None, 'F ', None, None, 'J ', 'E '],
 ['E ', 'B ', None, 'D ', None, 'C ', 'B ', None, 'J ', 'J ', None, 'E '],
 ['E ', 'H ', 'C ', None, 'G ', None, 'H ', 'A ', 'C ', None, 'H ', 'E '],
 ['E ', 'C ', None, 'A ', None, 'G ', None, None, 'I ', 'D ', None, 'E '],
 ['E ', None, 'G ', 'F ', 'B ', None, 'I ', None, 'G ', None, 'G ', 'E '],
 ['E ', 'B ', None, 'C ', None, 'H ', None, 'J ', None, 'I ', None, 'E '],
 ['E ', 'C ', 'D ', None, 'F ', 'C ', 'D ', None, 'B ', 'F ', 'G ', 'E ']])

Now I want to get a new DataFrame or NumPy array that contains the coordinates of each non-None value. For example:

id c x y
1  A 1 0
2  B 2 0
...
11 E 0 2  
12 F 1 2
...

How can I achieve this?

Thank you very much!

Rick

2 Answers

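You can use ndix_unique from the linked answer for a vectorized approach. Then construct a dataframe from the result, explode the (x, y) coordinate lists and assign the two coordinates back as separate columns. Note that in the output below the rows are grouped by value, and x is the row index while y is the column index: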

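import pandas as pd

# ndix_unique is defined in the linked answer; a is the array from the question
vals, ixs = ndix_unique(a)
# explode gives one row per coordinate pair
df = pd.DataFrame({'c':vals, 'xy':ixs}).explode('xy')
x, y = zip(*df.xy.values.tolist())
df = df[['c']].assign(x=x, y=y).reset_index(drop=True)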

print(df)
      c    x  y
0     A    0  1
1     A    6  1
2     B    0  2
3     B    8  1
4     B   12  1
5     B    4  2
6     C    5  1
7     C    9  2
....
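
Since ndix_unique lives in the linked answer and is not shown here, the following is a minimal self-contained sketch of a different vectorized route, not the method above. It assumes a small stand-in array `a` (hypothetical data) and uses `pd.notna` to mask out the None cells and `np.nonzero` to get their positions. Unlike the output above, it keeps the cells in row-major order and follows the question's convention of x for the column and y for the row.

```python
import numpy as np
import pandas as pd

# Small stand-in for the question's array (illustrative data only).
a = np.array([[None, 'A', 'B'],
              ['E ', None, 'A ']], dtype=object)

# pd.notna treats None as missing, so this is a boolean mask of filled cells.
mask = pd.notna(a)

# np.nonzero returns (row indices, column indices) in row-major order.
rows, cols = np.nonzero(mask)

out = pd.DataFrame({
    'c': [s.strip() for s in a[rows, cols]],  # drop the trailing spaces
    'x': cols,                                # x = column index
    'y': rows,                                # y = row index
})

# 1-based id, as in the question's example.
out.index = pd.RangeIndex(1, len(out) + 1, name='id')
print(out)
```

If the grouped-by-value layout is preferred, sorting the result on the c column afterwards should give a similar ordering.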
yatu

This is one straightforward way:

import pandas as pd
import numpy as np
data = np.array([[None, 'A', 'B'], ['E', 'A', 'B']])
values = []
for y, row in enumerate(data):
    for x, char in enumerate(row):
        if char is not None:
            values.append({
                "id": 1 + len(values),
                "c": char,
                "x": x,
                "y": y
            })
df = pd.DataFrame(values)
df.set_index('id', inplace=True)
df

Output:

    c   x   y
id          
1   A   1   0
2   B   2   0
3   E   0   1
4   A   1   1
5   B   2   1
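
The nested loop above can also be written more compactly. This is only a sketch of the same idea, assuming the same small `data` array: `np.ndenumerate` yields each cell together with its (row, column) index, so a single comprehension replaces both loops. The name `records` is introduced here for illustration.

```python
import numpy as np
import pandas as pd

data = np.array([[None, 'A', 'B'], ['E', 'A', 'B']], dtype=object)

# np.ndenumerate yields ((row, col), value) pairs in row-major order.
records = [
    {"c": char, "x": x, "y": y}
    for (y, x), char in np.ndenumerate(data)
    if char is not None
]

df = pd.DataFrame(records)
# 1-based id index, matching the loop version.
df.index = pd.RangeIndex(1, len(df) + 1, name="id")
print(df)
```

It should produce the same five rows as the loop version, in the same order.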
ExplodingGayFish
    It's a matter of perspective, but shouldn't `y` be the `row` loop variable here, so that `y` represents the "height" of the values? – Hampus Larsson May 25 '20 at 11:20