I have a list retrieved from a database:
[{
'name': 'John',
'score': 30
}, {
'name': 'Jan',
'score': 23
}, {
'name': 'Mike',
'score': 34
}]
Can NumPy compute the sum of the scores (without looping through the items one by one using for ... in)?
You can do this by calling sum on a list comprehension that collects all the scores:
sum( [x['score'] for x in MyListOfDictionaries] )
(PS. Numpy is not necessary here)
Edit: as pointed out by @sebastian in the comments, the brackets around the list comprehension aren't necessary since we're plugging this directly into a function, i.e.:
sum(x['score'] for x in MyListOfDictionaries)
This is known as a generator expression; from a performance point of view it can be more efficient, because it avoids the extra step of allocating memory for the whole list before summing it.
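As a minimal sketch of the difference between the two forms (the name records is just a stand-in for MyListOfDictionaries, filled with the data from the question): both give the same total, but the generator expression yields values one at a time and can only be consumed once.

```python
# Sample data from the question; `records` is an illustrative name.
records = [
    {'name': 'John', 'score': 30},
    {'name': 'Jan', 'score': 23},
    {'name': 'Mike', 'score': 34},
]

# List comprehension: builds the full list of scores first, then sums it.
total_from_list = sum([r['score'] for r in records])

# Generator expression: values are produced lazily, no intermediate list.
score_gen = (r['score'] for r in records)
total_from_gen = sum(score_gen)

# A generator is exhausted after one pass, so summing it again gives 0.
total_second_pass = sum(score_gen)

print(total_from_list, total_from_gen, total_second_pass)  # 87 87 0
```

For a list of three items the difference is negligible; it only starts to matter when the list is large.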
In [1963]: ll=[{
...: 'name': 'John',
...: 'score': 30
...: }, {
...
...: }]
First, the obvious iterative solution:
In [1965]: sum([d['score'] for d in ll])
Out[1965]: 87
I can turn it into an object array with:
In [1966]: np.array(ll)
Out[1966]:
array([{'score': 30, 'name': 'John'}, {'score': 23, 'name': 'Jan'},
{'score': 34, 'name': 'Mike'}], dtype=object)
but applying sum directly to that won't help. However, this works:
In [1967]: from operator import itemgetter
In [1970]: np.frompyfunc(itemgetter('score'),1,1)(ll).sum()
Out[1970]: 87
See my recent answer https://stackoverflow.com/a/38936480/901925 for more on how to access attributes of objects in an array.
frompyfunc doesn't really get rid of the iteration; it just wraps it in a user-friendly manner. And itemgetter is still doing item['score'] for each dictionary in the list.
This use of itemgetter is basically the same as:
In [1974]: list(map(itemgetter('score'), ll))
Out[1974]: [30, 23, 34]
A list comprehension, map, and frompyfunc are all ways of iterating through the list and getting the score value from each dictionary.
pandas may be able to turn this whole list into a DataFrame, but don't be fooled by its ease of use: it is doing all of this, and more, under the covers.
NumPy is a library for processing numerical arrays. If you specifically want to use NumPy and its performance, convert your collection to a numeric array (a matrix), refer to the columns by number instead of by name, and do your calculations on that array.
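One way the conversion to a numeric array might look, as a rough sketch: np.fromiter builds a one-dimensional array from an iterable, and the array's sum method does the rest. The names records and scores are illustrative, and the dtype is an assumption that the scores are integers. Extracting the values still iterates over the dictionaries in Python; only the summation itself runs inside NumPy.

```python
import numpy as np

# Sample data from the question; `records` is an illustrative name.
records = [
    {'name': 'John', 'score': 30},
    {'name': 'Jan', 'score': 23},
    {'name': 'Mike', 'score': 34},
]

# Pull the numeric field out of each dictionary into a 1-D integer array.
# Passing `count` lets NumPy allocate the array once up front.
scores = np.fromiter(
    (r['score'] for r in records),
    dtype=np.int64,
    count=len(records),
)

total = scores.sum()
print(total)  # 87
```

This pays off mainly when the array is reused for several calculations, not for a single sum.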
I suggest you try the pandas library: it has a DataFrame type that was created to hold and process collections like yours, that is, tables with rows and columns (like data frames in R or tables in MATLAB). Its sum method solves your problem.
If, as I'd guess, this isn't the only thing you want to do with your data and speed is important, I'd recommend using this library.
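A short sketch of the pandas approach, assuming pandas is installed and using the data from the question (records and df are illustrative names): a list of dictionaries can be passed straight to the DataFrame constructor, and each key becomes a column.

```python
import pandas as pd

# Sample data from the question; `records` is an illustrative name.
records = [
    {'name': 'John', 'score': 30},
    {'name': 'Jan', 'score': 23},
    {'name': 'Mike', 'score': 34},
]

# Each dictionary becomes a row; the keys become the column names.
df = pd.DataFrame(records)

# Sum a single column by name.
total = df['score'].sum()
print(total)  # 87

# Other aggregates are available on the same column.
average = df['score'].mean()
print(average)  # 29.0
```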
lst = [{
'name': 'John',
'score': 30
}, {
'name': 'Jan',
'score': 23
}, {
'name': 'Mike',
'score': 34
}]
sum(map(lambda x: x['score'], lst))