How to L2 Normalize a list of lists in Python using Sklearn

Question

s2 = [[0.2, 0.2, 0.2, 0.3021651247531982, 0.24462871026284194], [0.2, 0.4892574205256839, 0.2, 0.2, 0.383258146374831], [0.3193817886456925, 0.16666666666666666, 0.16666666666666666, 0.16666666666666666, 0.3193817886456925, 0.3193817886456925], [0.2, 0.2, 0.2, 0.3021651247531982, 0.24462871026284194]]

from sklearn.preprocessing import normalize
X = normalize(s2)

This throws the following error:

ValueError: setting an array element with a sequence.

How can I L2-normalize a list of lists in Python using sklearn?

Your issue is your lists don't all have the same lengths, so it can't be converted to an array properly. What did you expect it to normalize to? — Paritosh Singh, Mar 27 '20 at 07:59
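A quick way to confirm the point made in this comment is to compare the lengths of the inner lists before handing the data to sklearn. The snippet below is only a minimal sketch; the names row_lengths and is_rectangular are illustrative and do not come from the question.

```python
# Data from the question: the third inner list has one more element than the others.
s2 = [
    [0.2, 0.2, 0.2, 0.3021651247531982, 0.24462871026284194],
    [0.2, 0.4892574205256839, 0.2, 0.2, 0.383258146374831],
    [0.3193817886456925, 0.16666666666666666, 0.16666666666666666,
     0.16666666666666666, 0.3193817886456925, 0.3193817886456925],
    [0.2, 0.2, 0.2, 0.3021651247531982, 0.24462871026284194],
]

# Length of every inner list (hypothetical helper names).
row_lengths = [len(row) for row in s2]

# The data can only form a 2-D array if all rows share one length.
is_rectangular = len(set(row_lengths)) == 1

print(row_lengths)     # [5, 5, 6, 5]
print(is_rectangular)  # False
```

If is_rectangular is False, normalize cannot convert the input into a 2-D array, which is consistent with the ValueError shown in the question.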

Saurabh · Accepted Answer · 2020-03-27T22:36:13.077

I don't have enough reputation to comment, so I am posting this as an answer.

Let's quickly look at your datapoint.

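I converted your data into a NumPy array. Because the inner lists do not all have the same length, NumPy cannot build a 2-D numeric array and instead produces a 1-D array of Python list objects. (On NumPy 1.24 and later, this call raises a ValueError unless dtype=object is passed explicitly.) The result looks like this: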

>>> import numpy as np
>>> n2 = np.array([[0.2, 0.2, 0.2, 0.3021651247531982, 0.24462871026284194], [0.2, 0.4892574205256839, 0.2, 0.2, 0.383258146374831], [0.3193817886456925, 0.16666666666666666, 0.16666666666666666, 0.16666666666666666, 0.3193817886456925, 0.3193817886456925], [0.2, 0.2, 0.2, 0.3021651247531982, 0.24462871026284194]])
>>> n2
array([list([0.2, 0.2, 0.2, 0.3021651247531982, 0.24462871026284194]),
       list([0.2, 0.4892574205256839, 0.2, 0.2, 0.383258146374831]),
       list([0.3193817886456925, 0.16666666666666666, 0.16666666666666666, 0.16666666666666666, 0.3193817886456925, 0.3193817886456925]),
       list([0.2, 0.2, 0.2, 0.3021651247531982, 0.24462871026284194])],
      dtype=object)

As you can see, the result is an array of list objects rather than a 2-D array of numbers. To get a numeric array, every inner list must have the same length. It looks like a value such as 0.16666666666666666 was copied one time too many into your third list; if that is not the case, fix the length in whatever way matches your data. With five elements in every row, it looks like this:

>>> n3 = np.array([[0.2, 0.2, 0.2, 0.3021651247531982, 0.24462871026284194], [0.2, 0.4892574205256839, 0.2, 0.2, 0.383258146374831], [0.3193817886456925, 0.16666666666666666, 0.16666666666666666, 0.16666666666666666, 0.319381788645692], [0.2, 0.2, 0.2, 0.3021651247531982, 0.24462871026284194]])
>>> n3
array([[0.2       , 0.2       , 0.2       , 0.30216512, 0.24462871],
       [0.2       , 0.48925742, 0.2       , 0.2       , 0.38325815],
       [0.31938179, 0.16666667, 0.16666667, 0.16666667, 0.31938179],
       [0.2       , 0.2       , 0.2       , 0.30216512, 0.24462871]])

As you can see, n3 is now a regular 2-D array of numbers.

If you now call normalize on it, it simply works:

>>> X = normalize(n3)
>>> X
array([[0.38408524, 0.38408524, 0.38408524, 0.58028582, 0.46979139],
       [0.28108867, 0.6876236 , 0.28108867, 0.28108867, 0.53864762],
       [0.59581303, 0.31091996, 0.31091996, 0.31091996, 0.59581303],
       [0.38408524, 0.38408524, 0.38408524, 0.58028582, 0.46979139]])
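To make clear what the output above means, here is a minimal sketch that reproduces the same row-wise L2 normalization with plain NumPy: each row is divided by its own Euclidean norm. The names row_norms and X_manual are illustrative, and the sketch assumes that no row consists only of zeros (otherwise the division would produce NaN).

```python
import numpy as np

# Same equal-length data as n3 above.
n3 = np.array([
    [0.2, 0.2, 0.2, 0.3021651247531982, 0.24462871026284194],
    [0.2, 0.4892574205256839, 0.2, 0.2, 0.383258146374831],
    [0.3193817886456925, 0.16666666666666666, 0.16666666666666666,
     0.16666666666666666, 0.319381788645692],
    [0.2, 0.2, 0.2, 0.3021651247531982, 0.24462871026284194],
])

# Euclidean (L2) norm of each row, kept as a column so it broadcasts per row.
row_norms = np.linalg.norm(n3, axis=1, keepdims=True)

# Divide every row by its own norm; assumes no all-zero rows.
X_manual = n3 / row_norms

print(X_manual)
# After normalization every row should have an L2 norm of (approximately) 1.
print(np.linalg.norm(X_manual, axis=1))
```

This should agree with normalize(n3), whose default norm is 'l2' and whose default axis is 1 (row-wise).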

For more on how to avoid this issue with NumPy arrays, have a look at the SO question "ValueError: setting an array element with a sequence".

score 0 · Answer 2 · answered Mar 27 '20 at 22:47

Important: I removed one element from the 3rd list so that all lists have the same length.

I did that because I believe it is a copy-paste error. If not, comment below and I will modify my answer.

import numpy as np
from sklearn.preprocessing import normalize

s2 = [[0.2, 0.2, 0.2, 0.3021651247531982, 0.24462871026284194], [0.2, 0.4892574205256839, 0.2, 0.2, 0.383258146374831], [0.3193817886456925, 0.16666666666666666, 0.16666666666666666, 0.3193817886456925, 0.3193817886456925], [0.2, 0.2, 0.2, 0.3021651247531982, 0.24462871026284194]]

X = normalize(np.array(s2))
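If the extra element is not a copy-paste error and the rows really do have different lengths, one possible workaround is to normalize each inner list on its own. This is only a sketch under that assumption, not something either answer proposes; the name normalized_rows is illustrative. Each row is reshaped to a 1 x n array because normalize expects 2-D input.

```python
import numpy as np
from sklearn.preprocessing import normalize

# Original data from the question, with the 6-element third list left untouched.
s2 = [
    [0.2, 0.2, 0.2, 0.3021651247531982, 0.24462871026284194],
    [0.2, 0.4892574205256839, 0.2, 0.2, 0.383258146374831],
    [0.3193817886456925, 0.16666666666666666, 0.16666666666666666,
     0.16666666666666666, 0.3193817886456925, 0.3193817886456925],
    [0.2, 0.2, 0.2, 0.3021651247531982, 0.24462871026284194],
]

# Normalize every inner list separately as a single-row 2-D array,
# then take the one resulting row back out as a plain list.
normalized_rows = [
    normalize(np.asarray(row, dtype=float).reshape(1, -1))[0].tolist()
    for row in s2
]

for row in normalized_rows:
    print(len(row), row)
```

The result stays a list of lists with the original lengths, so it still cannot be passed as one 2-D array to estimators that expect rectangular input.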
