How to make each row of a DataFrame an array?

Question

If this is my DataFrame, how do I convert each row to an array?

            3        4        5        6       97       98       99      100
0         1.0      2.0      3.0      4.0     95.0     96.0     97.0     98.0
1     50699.0  16302.0  50700.0  16294.0  50735.0  16334.0  50737.0  16335.0
2     57530.0  33436.0  57531.0  33438.0      NaN      NaN      NaN      NaN
3     24014.0  24015.0  34630.0  24016.0      NaN      NaN      NaN      NaN
4     44933.0   2611.0  44936.0   2612.0  44982.0   2631.0  44972.0   2633.0
1792  46712.0  35340.0  46713.0  35341.0  46759.0  35387.0  46760.0  35388.0
1793  61283.0  40276.0  61284.0  40277.0  61330.0  40323.0  61331.0  40324.0
1794      0.0      0.0      0.0      0.0      0.0      0.0      0.0      0.0
1795      0.0      0.0      0.0      0.0      0.0      0.0      0.0      0.0
1796  27156.0  48331.0  27157.0  48332.0      NaN      NaN      NaN      NaN

For example, I want it to be [1.0, 2.0, 3.0, 4.0, 95.0, 96.0, 97.0, 98.0] for the first one.

Please post sample data and the code you have tried. – Dejene T. Jul 09 '22 at 04:58

7shoe · Answer 1 · 2022-07-09T06:52:17.453

0

You can loop over the DataFrame's rows and assign the NumPy arrays dynamically to the global symbol table dict. To loop over rows, you can loop over the columns of the transposed DataFrame.

import numpy as np
import pandas as pd

# sample frame
df = pd.DataFrame({'col1' : [np.nan, 1.0, 4.5, 1.3, np.nan, 6.7],
                   'col2' : [-0.4, 0.5, -2.3, np.nan, 1.2, 2.4]})

# transpose so that each original row becomes a column
df = df.transpose()

# dynamic assignment -> global symbol table (creates v1, v2, ...)
for i in df.columns:
    globals()['v{}'.format(i+1)] = np.array(df[i])

v1
>array([ nan, -0.4])

v2
>array([1. , 0.5])

EDIT: Added `transpose()` to provide rows rather than columns as in the initial answer. Thanks, BeRT2me.
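As a rough sketch of a variant of the loop above, the row arrays can also be collected in a plain dictionary instead of being written into `globals()`. This swaps in `DataFrame.iterrows()` for the transpose step; `row_arrays` is just an illustrative name, not something from the question.

```python
import numpy as np
import pandas as pd

# same sample frame as above, before transposing
df = pd.DataFrame({'col1': [np.nan, 1.0, 4.5, 1.3, np.nan, 6.7],
                   'col2': [-0.4, 0.5, -2.3, np.nan, 1.2, 2.4]})

# row_arrays (hypothetical name): index label -> 1-D NumPy array of that row
row_arrays = {idx: row.to_numpy() for idx, row in df.iterrows()}

print(row_arrays[1].tolist())  # [1.0, 0.5]
```

Keeping the arrays in a dict avoids creating an unknown number of global names and lets you look rows up by their index label.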

edited Jul 09 '22 at 06:52

answered Jul 09 '22 at 06:27

OP wants row-wise, not column-wise. Also, a better way to do what you're doing would be: `df.apply(np.array, result_type='reduce')` – BeRT2me Jul 09 '22 at 06:40

BeRT2me · Answer 2 · 2022-07-09T07:05:44.170

>>> import numpy as np
>>> out = df.apply(np.array, axis=1)  # or df.apply(list, axis=1) for lists
>>> print(out.to_frame('arrays'))
                                                 arrays
0          [1.0, 2.0, 3.0, 4.0, 95.0, 96.0, 97.0, 98.0]
1     [50699.0, 16302.0, 50700.0, 16294.0, 50735.0, ...
2     [57530.0, 33436.0, 57531.0, 33438.0, nan, nan,...
3     [24014.0, 24015.0, 34630.0, 24016.0, nan, nan,...
4     [44933.0, 2611.0, 44936.0, 2612.0, 44982.0, 26...
1792  [46712.0, 35340.0, 46713.0, 35341.0, 46759.0, ...
1793  [61283.0, 40276.0, 61284.0, 40277.0, 61330.0, ...
1794           [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
1795           [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
1796  [27156.0, 48331.0, 27157.0, 48332.0, nan, nan,...

>>> print(df.to_numpy().tolist())
[[1.0, 2.0, 3.0, 4.0, 95.0, 96.0, 97.0, 98.0],
 [50699.0, 16302.0, 50700.0, 16294.0, 50735.0, 16334.0, 50737.0, 16335.0],
 [57530.0, 33436.0, 57531.0, 33438.0, nan, nan, nan, nan],
 [24014.0, 24015.0, 34630.0, 24016.0, nan, nan, nan, nan],
 [44933.0, 2611.0, 44936.0, 2612.0, 44982.0, 2631.0, 44972.0, 2633.0],
 [46712.0, 35340.0, 46713.0, 35341.0, 46759.0, 35387.0, 46760.0, 35388.0],
 [61283.0, 40276.0, 61284.0, 40277.0, 61330.0, 40323.0, 61331.0, 40324.0],
 [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
 [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
 [27156.0, 48331.0, 27157.0, 48332.0, nan, nan, nan, nan]]

keepAlive · Answer 3 · 2022-07-09T07:11:37.547

0

What about

>>> rows = [*df.to_numpy()]  # list of arrays
>>> rows[0]
array([ 1.,  2.,  3.,  4., 95., 96., 97., 98.])

or since you seem to be using the words list and array interchangeably,

>>> [*rows] = map(list, df.to_numpy())  # list of lists
>>> rows[0]
[1.0, 2.0, 3.0, 4.0, 95.0, 96.0, 97.0, 98.0]

?

edited Jul 09 '22 at 07:11

answered Jul 09 '22 at 07:06

`rows = df.to_numpy().tolist()` is definitely better/more performant than `[*rows] = map(list, df.to_numpy())` – BeRT2me Jul 09 '22 at 07:11
@BeRT2me True. The OP seems not to be clear on what he wants though. `map(list|tuple|set, ...)`. – keepAlive Jul 09 '22 at 07:14
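
To illustrate the two variants compared in the comments above, here is a minimal sketch. The small frame and the names `rows_tolist` and `rows_map` are made up for illustration. Both give lists of lists with equal values, but `tolist()` converts the elements to plain Python floats, while `map(list, ...)` leaves them as `np.float64`.

```python
import numpy as np
import pandas as pd

# small stand-in frame (assumed values, loosely based on the question's rows)
df = pd.DataFrame([[1.0, 2.0, 3.0, 4.0],
                   [57530.0, 33436.0, np.nan, np.nan]])

rows_tolist = df.to_numpy().tolist()        # nested Python lists of Python floats
rows_map = list(map(list, df.to_numpy()))   # Python lists holding np.float64 scalars

print(rows_tolist[0])                       # [1.0, 2.0, 3.0, 4.0]
print(type(rows_tolist[0][0]).__name__)     # float
print(type(rows_map[0][0]).__name__)        # float64
```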

score 0 · Answer 4 · answered Jul 09 '22 at 08:10

Is this what you want? You can try `df.values` or `np.array(df)`.

import numpy as np
import pandas as pd

data = [[1, 10, 100],
        [2, 20, 200],
        [3, None, None]]
df = pd.DataFrame(data, columns=[1, 2, 3])
print(df)
print('------')
print(df.values)       # whole frame as a 2-D array
print('------')
print(df.values[0])    # first row as a 1-D array
print('------')
print(np.array(df))
print('------')
print(np.array(df)[0])

Output:

   1     2      3
0  1  10.0  100.0
1  2  20.0  200.0
2  3   NaN    NaN
------
[[  1.  10. 100.]
 [  2.  20. 200.]
 [  3.  nan  nan]]
------
[  1.  10. 100.]
------
[[  1.  10. 100.]
 [  2.  20. 200.]
 [  3.  nan  nan]]
------
[  1.  10. 100.]
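
For completeness, a minimal sketch of the same idea using `DataFrame.to_numpy()`, which the pandas documentation recommends over `.values`. The names `arr` and `first_row` are only illustrative. Because the frame mixes an integer column with float columns, the resulting array is upcast to `float64`.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 10, 100],
                   [2, 20, 200],
                   [3, None, None]], columns=[1, 2, 3])

arr = df.to_numpy()        # 2-D array, one row per DataFrame row
first_row = arr[0]         # 1-D array for the first row

print(arr.dtype)           # float64
print(first_row.tolist())  # [1.0, 10.0, 100.0]
```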
