Python Remove brackets from arrays

Question

I have a list that contains many arrays.

coef

[array([[1.72158862]]),
 array([[3.28338167]]),
 array([[3.28004542]]),
 array([[6.04194548]])]

Putting it into a DataFrame gives:

azone = pd.DataFrame(
    {'zone': zone,
     'coef': coef
    })

    zone    coef
0   1   [[1.7215886175218464]]
1   2   [[3.283381665861124]]

I wonder if there is a way to remove the brackets. I tried tolist(), but that did not work.

Also for another list:

value

[array([8.46565297e-294, 1.63877641e-002]),
 array([1.46912451e-220, 2.44570170e-003]),
 array([3.80589351e-227, 4.41242801e-004])]

I want to keep only the second value of each array. The desired output is:

   value
0  1.63877641e-002
1  2.44570170e-003
2  4.41242801e-004

The brackets show us that the arrays have a (1,1) shape. They aren't just a pretty printing device. — hpaulj, Jul 04 '18 at 01:11
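
The shape point in the comment above can be checked directly. This is a minimal sketch that assumes the arrays shown in the question; the names `a` and `stacked` are illustrative and do not come from the original post.

```python
import numpy as np

# One element of the question's list: a 2-D array holding a single value.
a = np.array([[1.72158862]])
print(a.shape)  # (1, 1)

# Stacking four such arrays gives a 3-D array, not a flat one.
coef = [np.array([[1.72158862]]), np.array([[3.28338167]]),
        np.array([[3.28004542]]), np.array([[6.04194548]])]
stacked = np.array(coef)
print(stacked.shape)          # (4, 1, 1)

# Flattening removes the two length-1 axes.
print(stacked.ravel().shape)  # (4,)
```

Each DataFrame cell in the question therefore holds a whole (1, 1) array, which is why the brackets appear.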

score 9 · Answer 1 · answered Jul 04 '18 at 05:40


Using `ravel`:

import numpy as np

coef = [np.array([[1.72158862]]),
        np.array([[3.28338167]]),
        np.array([[3.28004542]]),
        np.array([[6.04194548]])]

coef = np.array(coef).ravel()

print(coef)

[1.72158862 3.28338167 3.28004542 6.04194548]

Furthermore, if you're not going to modify the returned 1-D array, I suggest using `numpy.ravel`: where possible it returns a view of the array instead of making a copy, which can be faster than `flatten`, which always copies.
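
To illustrate the view-versus-copy difference, here is a small sketch. The array `arr` and the names `flat_view` and `flat_copy` are made up for this example, and it assumes a contiguous array, which is the case where `ravel` can return a view.

```python
import numpy as np

# Illustrative contiguous array, so ravel can return a view.
arr = np.array([[[1.72158862]], [[3.28338167]]])

flat_view = arr.ravel()    # a view when the memory layout allows it
flat_copy = arr.flatten()  # always a new array

print(np.shares_memory(arr, flat_view))  # True
print(np.shares_memory(arr, flat_copy))  # False

# Writing through the view changes the original; the copy is unaffected.
flat_view[0] = 0.0
print(arr[0, 0, 0])  # 0.0
print(flat_copy[0])  # 1.72158862
```

This is also why `ravel` is best kept for cases where the result will not be modified: a change to the view can alter the source array.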

answered Jul 04 '18 at 05:40

min2bro


While I agree this is a better solution than mine and should probably be accepted, it is worth noting that the performance difference is marginal (copying an array is cheap). You will be feeding the array into `pd.DataFrame`, which means you'll always need a copy. For 4 million items, I see 2.88s vs 2.75s. – jpp Jul 04 '18 at 09:14

score 3 · Answer 2 · answered Jul 04 '18 at 00:42


You can use NumPy's flatten method to extract a one-dimensional array from your list of multi-dimensional arrays. For example:

import numpy as np

coef = [np.array([[1.72158862]]),
        np.array([[3.28338167]]),
        np.array([[3.28004542]]),
        np.array([[6.04194548]])]

coef = np.array(coef).flatten()

print(coef)

[1.72158862 3.28338167 3.28004542 6.04194548]

Since NumPy arrays underlie pandas DataFrames, your pandas coef series will now be of dtype float and contain only scalars.
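
As a rough end-to-end sketch, the flattened array can go straight into the DataFrame, and the second list from the question can be handled by selecting column 1 of the stacked array. The `zone` labels 3 and 4 and the name `second` are assumptions for illustration; the question shows only zones 1 and 2.

```python
import numpy as np
import pandas as pd

coef = [np.array([[1.72158862]]), np.array([[3.28338167]]),
        np.array([[3.28004542]]), np.array([[6.04194548]])]
zone = [1, 2, 3, 4]  # assumed labels; the question shows only 1 and 2

# Flatten before building the DataFrame so each cell holds a scalar.
azone = pd.DataFrame({'zone': zone, 'coef': np.array(coef).flatten()})
print(azone['coef'].dtype)  # float64

# Second list from the question: keep only the second entry of each array.
value = [np.array([8.46565297e-294, 1.63877641e-02]),
         np.array([1.46912451e-220, 2.44570170e-03]),
         np.array([3.80589351e-227, 4.41242801e-04])]
second = pd.DataFrame({'value': np.array(value)[:, 1]})
print(second)
```

Stacking with `np.array(value)` gives a (3, 2) array here, so `[:, 1]` picks the second entry of every row.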

answered Jul 04 '18 at 00:42

jpp



that works! Thank you! – Celine Jul 04 '18 at 00:45
@Celine if this answers your question, consider upvoting and accepting it. – Matthew Story Jul 04 '18 at 04:08
