1

If this is my DataFrame, how do I convert each row to an array?

            3        4        5        6       97       98       99      100
0         1.0      2.0      3.0      4.0     95.0     96.0     97.0     98.0
1     50699.0  16302.0  50700.0  16294.0  50735.0  16334.0  50737.0  16335.0
2     57530.0  33436.0  57531.0  33438.0      NaN      NaN      NaN      NaN
3     24014.0  24015.0  34630.0  24016.0      NaN      NaN      NaN      NaN
4     44933.0   2611.0  44936.0   2612.0  44982.0   2631.0  44972.0   2633.0
1792  46712.0  35340.0  46713.0  35341.0  46759.0  35387.0  46760.0  35388.0
1793  61283.0  40276.0  61284.0  40277.0  61330.0  40323.0  61331.0  40324.0
1794      0.0      0.0      0.0      0.0      0.0      0.0      0.0      0.0
1795      0.0      0.0      0.0      0.0      0.0      0.0      0.0      0.0
1796  27156.0  48331.0  27157.0  48332.0      NaN      NaN      NaN      NaN

For example, I want the first row to become [1.0, 2.0, 3.0, 4.0, 95.0, 96.0, 97.0, 98.0].

BeRT2me
SSS311

4 Answers

0

You can loop over the DataFrame's rows and assign each row's NumPy array to a dynamically named variable in the global symbol table (the dict returned by `globals()`). To loop over rows, transpose the DataFrame and loop over the columns of the result.

import numpy as np
import pandas as pd

# sample frame
df = pd.DataFrame({'col1' : [np.nan, 1.0, 4.5, 1.3, np.nan, 6.7],
                   'col2' : [-0.4, 0.5, -2.3, np.nan, 1.2, 2.4]})

# transpose so that each original row becomes a column
df = df.transpose()

# dynamic assignment -> global symbol table (v1 is the first row, v2 the second, ...)
for i in df.columns:
    globals()['v{}'.format(i+1)] = np.array(df[i])

v1
>array([ nan, -0.4])

v2
>array([1. , 0.5])

EDIT: Added `transpose()` so that the loop yields rows rather than columns, as it did in the initial answer. Thanks, BeRT2me.
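
If writing into `globals()` feels too implicit, a dict keyed by row label may be easier to work with. The following is only a minimal sketch of that variation, not part of the approach above; the name `row_arrays` is made up for illustration, and the sample frame is the same one used earlier in this answer.

```python
import numpy as np
import pandas as pd

# same sample frame as above
df = pd.DataFrame({'col1': [np.nan, 1.0, 4.5, 1.3, np.nan, 6.7],
                   'col2': [-0.4, 0.5, -2.3, np.nan, 1.2, 2.4]})

# one NumPy array per row, keyed by the row's index label
# (row_arrays is a hypothetical name chosen for this sketch)
row_arrays = {label: row.to_numpy() for label, row in df.iterrows()}

print(row_arrays[1])
```

Here `row_arrays[1]` holds the values of the row labelled 1, and no transpose is needed because `iterrows()` already walks the frame row by row.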

7shoe
  • 2
    OP wants row-wise, not column wise. Also, a better way to do what you're doing would be: `df.apply(np.array, result_type='reduce')` – BeRT2me Jul 09 '22 at 06:40
0
>>> import numpy as np
>>> out = df.apply(np.array, axis=1) # df.apply(list, axis=1)
>>> print(out.to_frame('arrays'))
                                                 arrays
0          [1.0, 2.0, 3.0, 4.0, 95.0, 96.0, 97.0, 98.0]
1     [50699.0, 16302.0, 50700.0, 16294.0, 50735.0, ...
2     [57530.0, 33436.0, 57531.0, 33438.0, nan, nan,...
3     [24014.0, 24015.0, 34630.0, 24016.0, nan, nan,...
4     [44933.0, 2611.0, 44936.0, 2612.0, 44982.0, 26...
1792  [46712.0, 35340.0, 46713.0, 35341.0, 46759.0, ...
1793  [61283.0, 40276.0, 61284.0, 40277.0, 61330.0, ...
1794           [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
1795           [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
1796  [27156.0, 48331.0, 27157.0, 48332.0, nan, nan,...

>>> print(df.to_numpy().tolist())
[[1.0, 2.0, 3.0, 4.0, 95.0, 96.0, 97.0, 98.0],
 [50699.0, 16302.0, 50700.0, 16294.0, 50735.0, 16334.0, 50737.0, 16335.0],
 [57530.0, 33436.0, 57531.0, 33438.0, nan, nan, nan, nan],
 [24014.0, 24015.0, 34630.0, 24016.0, nan, nan, nan, nan],
 [44933.0, 2611.0, 44936.0, 2612.0, 44982.0, 2631.0, 44972.0, 2633.0],
 [46712.0, 35340.0, 46713.0, 35341.0, 46759.0, 35387.0, 46760.0, 35388.0],
 [61283.0, 40276.0, 61284.0, 40277.0, 61330.0, 40323.0, 61331.0, 40324.0],
 [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
 [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
 [27156.0, 48331.0, 27157.0, 48332.0, nan, nan, nan, nan]]
BeRT2me
0

What about

>>> rows = [*df.to_numpy()]  # list of arrays
>>> rows[0]
array([ 1.,  2.,  3.,  4., 95., 96., 97., 98.])

or since you seem to be using the words list and array interchangeably,

>>> [*rows] = map(list, df.to_numpy())  # list of lists
>>> rows[0]
[1.0, 2.0, 3.0, 4.0, 95.0, 96.0, 97.0, 98.0]

?
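
To make the difference between the two variants concrete, here is a minimal sketch. It assumes a frame built by hand from the first three rows of the question, and `array_rows` and `list_rows` are illustrative names only.

```python
import numpy as np
import pandas as pd

# first three rows of the question's frame, retyped for illustration
df = pd.DataFrame(
    [[1.0, 2.0, 3.0, 4.0, 95.0, 96.0, 97.0, 98.0],
     [50699.0, 16302.0, 50700.0, 16294.0, 50735.0, 16334.0, 50737.0, 16335.0],
     [57530.0, 33436.0, 57531.0, 33438.0, np.nan, np.nan, np.nan, np.nan]],
    columns=[3, 4, 5, 6, 97, 98, 99, 100],
)

array_rows = list(df.to_numpy())    # list of 1-D NumPy arrays
list_rows = df.to_numpy().tolist()  # list of plain Python lists

print(type(array_rows[0]).__name__)  # ndarray
print(type(list_rows[0]).__name__)   # list
```

Both keep `NaN` entries in place; only the container type of each row differs.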

keepAlive
  • `rows = df.to_numpy().tolist()` is definitely better/more performant than `[*rows] = map(list, df.to_numpy())` – BeRT2me Jul 09 '22 at 07:11
  • @BeRT2me True. The OP seems not to be clear on what he wants though. `map(list|tuple|set, ...)`. – keepAlive Jul 09 '22 at 07:14
0

Is this what you want? You can get it with `df.values` or `np.array(df)`; both return a 2-D array that you can index by row.

import numpy as np
import pandas as pd

data = [[1, 10, 100],
        [2, 20, 200],
        [3, None, None]]
df = pd.DataFrame(data, columns=[1, 2, 3])
print(df)
print('------')
print(df.values)
print('------')
print(df.values[0])
print('------')
print(np.array(df))
print('------')
print(np.array(df)[0])

Output:

   1     2      3
0  1  10.0  100.0
1  2  20.0  200.0
2  3   NaN    NaN
------
[[  1.  10. 100.]
 [  2.  20. 200.]
 [  3.  nan  nan]]
------
[  1.  10. 100.]
------
[[  1.  10. 100.]
 [  2.  20. 200.]
 [  3.  nan  nan]]
------
[  1.  10. 100.]
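
As a side note, the pandas documentation recommends `DataFrame.to_numpy()` over `.values`. On a frame like the one above, both should give the same float array, since the integer column is upcast to match the float columns. A minimal sketch, assuming the same sample data (`arr` is just an illustrative name):

```python
import numpy as np
import pandas as pd

data = [[1, 10, 100],
        [2, 20, 200],
        [3, None, None]]
df = pd.DataFrame(data, columns=[1, 2, 3])

arr = df.to_numpy()  # 2-D array; arr[i] is the i-th row

print(arr.dtype)
print(arr[0])
```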
Almond