Extract data from list and create dictionary

Question

Hello, I am new to Python. I have a list which contains:

[['year', 'month', 'date_of_month', 'day_of_week', 'births'], 
['1994', '1', '1', '6', '8096'], 
['1994', '1', '2', '7', '7772'], 
['1994', '1', '3', '1', '10142'], ......]

I want to create a dictionary like

days_counts = {
    0: 10000,
    1: 10000,
    2: 10000,
    ...
}

The key is the day_of_week value (from 1 to 7) and the value is the total number of births on that day of the week.

Update your question so the dictionary is actually what you expect and not all `10000` please, it's a bit unclear as is. — kabanus, Mar 31 '18 at 11:33
After reviewing some answers below I also suggest your example contains two lines at least of the same day. — kabanus, Mar 31 '18 at 11:44

kabanus · Answer 1 · 2018-03-31T12:06:21.267

One way using a defaultdict:

from collections import defaultdict
bdays = defaultdict(int)
for entry in mylist[1:]:
    bdays[int(entry[3])] += int(entry[4])

where mylist is the list you have. Another way avoids the import and uses the fact that you already know what the keys are: a short range of integers, so you don't need a dictionary at all:

bdays = [0] * 7  # index 0 holds day 1, index 6 holds day 7
for entry in mylist[1:]:  # skip the header row
    bdays[int(entry[3]) - 1] += int(entry[4])

Or in a more succinct, perhaps less readable fashion:

[sum(int(x[4]) for x in mylist[1:] if int(x[3]) == i) for i in range(1, 8)]

Or insisting on a dict:

{i: sum(int(x[4]) for x in mylist[1:] if int(x[3]) == i) for i in range(1, 8)}

All of these except the defaultdict version also list days with no births, with a value of 0 (perhaps a disadvantage?).

The first solution has the disadvantage (from one point of view at least) that any key is valid: looking up a missing key returns 0 and adds that key to the dictionary.

The final two are slower as they iterate mylist 7 times.
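As a rough sketch of a middle ground between the approaches above: a plain dict pre-filled with the keys 1 to 7 is built in a single pass, lists every weekday, and still raises KeyError for an invalid day. The sample rows (including the second day-1 row) are made up for illustration and are not from the question's data.

```python
# Hypothetical sample data in the same shape as the question's list.
mylist = [
    ['year', 'month', 'date_of_month', 'day_of_week', 'births'],
    ['1994', '1', '1', '6', '8096'],
    ['1994', '1', '2', '7', '7772'],
    ['1994', '1', '3', '1', '10142'],
    ['1994', '1', '10', '1', '10'],  # made-up second entry for day 1
]

# Pre-fill keys 1..7 with 0 so every weekday appears in the result.
bdays = dict.fromkeys(range(1, 8), 0)

for entry in mylist[1:]:  # skip the header row
    bdays[int(entry[3])] += int(entry[4])

print(bdays)
# {1: 10152, 2: 0, 3: 0, 4: 0, 5: 0, 6: 8096, 7: 7772}
```

Unlike the defaultdict version, bdays[8] fails loudly here instead of quietly creating a new key.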

score 1 · Answer 2 · answered Mar 31 '18 at 11:21

Like this?

lst=[['year', 'month', 'date_of_month', 'day_of_week', 'births'], 
['1994', '1', '1', '6', '8096'], 
['1994', '1', '2', '7', '7772'], 
['1994', '1', '3', '1', '10142'],
['1994', '1', '3', '1', '10']
]

d = {}
for e in lst:
    if e[3].isdigit():  # skips the header row, whose fourth field is not a number
        if e[3] in d:
            d[e[3]] += int(e[4])
        else:
            d[e[3]] = int(e[4])

for e in d:
    print(e, d[e])
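Note that the keys produced above are strings ('1', '6', ...). If integer keys are wanted, as in the question, a minimal variation might look like the following; it uses dict.get with a default of 0 to replace the if/else branch. The variable name totals is only illustrative.

```python
lst = [
    ['year', 'month', 'date_of_month', 'day_of_week', 'births'],
    ['1994', '1', '1', '6', '8096'],
    ['1994', '1', '2', '7', '7772'],
    ['1994', '1', '3', '1', '10142'],
    ['1994', '1', '3', '1', '10'],
]

totals = {}  # illustrative name, maps int day_of_week -> total births
for row in lst:
    if row[3].isdigit():  # skips the header row
        day = int(row[3])
        # get() returns 0 the first time a day is seen
        totals[day] = totals.get(day, 0) + int(row[4])

for day in sorted(totals):
    print(day, totals[day])
# 1 10152
# 6 8096
# 7 7772
```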

Vasilis G. · Answer 3 · 2018-03-31T11:34:00.337

You can also do it this way:

l = [['year', 'month', 'date_of_month', 'day_of_week', 'births'], 
['1994', '1', '1', '6', '8096'], 
['1994', '1', '2', '7', '7772'], 
['1994', '1', '3', '1', '10142']]

births = []
for i in range(1,8):
    births.append([i, sum(int(elem[4]) for elem in l if elem[3] == str(i))])

births = dict(births)

print(births)

Output:

{1: 10142, 2: 0, 3: 0, 4: 0, 5: 0, 6: 8096, 7: 7772}

Or a simplified version of the above:

births = dict([[i, sum(int(elem[4]) for elem in l if elem[3] == str(i))] for i in range(1,8)])

You can also do it using the map function:

births = dict(map(lambda i: [i, sum(int(elem[4]) for elem in l if elem[3] == str(i))], range(1,8)))

Austin · Answer 4 · 2018-03-31T11:52:29.967

If you need a 1-liner:

from itertools import islice

lst = [['year', 'month', 'date_of_month', 'day_of_week', 'births'], 
       ['1994', '1', '1', '6', '8096'], 
       ['1994', '1', '2', '7', '7772'], 
       ['1994', '1', '3', '1', '10142']]

print({sublst[3]: sublst[4] for sublst in islice(lst, 1, None)})
# {'6': '8096', '7': '7772', '1': '10142'}

This iterates over all elements of lst skipping the first sub-list, extracting day_of_week and births each time.

For skipping, itertools.islice is appropriate.

Or simply a slice, lst[1:], works here (thanks to @kabanus):

lst = [['year', 'month', 'date_of_month', 'day_of_week', 'births'], 
       ['1994', '1', '1', '6', '8096'], 
       ['1994', '1', '2', '7', '7772'], 
       ['1994', '1', '3', '1', '10142']]

print({sublst[3]: sublst[4] for sublst in lst[1:]})
# {'6': '8096', '7': '7772', '1': '10142'}
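One caveat worth illustrating: a dict comprehension like the one above keeps only the last value seen for each key and leaves the values as strings, so it does not total the births once a day_of_week appears more than once. The sketch below uses a made-up extra row to show this, with collections.Counter as one possible way to accumulate sums instead.

```python
from collections import Counter

lst = [
    ['year', 'month', 'date_of_month', 'day_of_week', 'births'],
    ['1994', '1', '1', '6', '8096'],
    ['1994', '1', '2', '7', '7772'],
    ['1994', '1', '3', '1', '10142'],
    ['1994', '1', '10', '1', '10'],  # made-up second entry for day 1
]

# The comprehension overwrites the earlier day-'1' value with the later one.
last_only = {sublst[3]: sublst[4] for sublst in lst[1:]}
print(last_only)
# {'6': '8096', '7': '7772', '1': '10'}

# Accumulating with a Counter sums the births per day instead.
totals = Counter()
for sublst in lst[1:]:
    totals[int(sublst[3])] += int(sublst[4])

print(dict(totals))
# {6: 8096, 7: 7772, 1: 10152}
```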

edited Mar 31 '18 at 11:52

answered Mar 31 '18 at 11:35


There is no benefit I can tell for using `islice` rather than normal slicing (`lst[1:]` ) if used on a regular list. Is there a particular reason you use it? – kabanus Mar 31 '18 at 11:42
@kabanus `lst[1:]` makes a copy and may not work for all sequences, iterators etc. It may work here though. – Austin Mar 31 '18 at 11:45
I don't think it does create a copy. You can read up on https://stackoverflow.com/questions/5131538/slicing-a-list-in-python-without-generating-a-copy. I agree `islice` is appropriate for `iterators` though, I was wondering if there was something else, thanks. – kabanus Mar 31 '18 at 11:47
@kabanus Thanks for the link :) – Austin Mar 31 '18 at 11:53
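For reference, a small sketch of the behaviour discussed in these comments: slicing a list builds a new list object, but it is a shallow copy, so the rows themselves are shared rather than duplicated. An iterator cannot be sliced with [1:], which is where islice is useful. The names tail, rows and skipped are only illustrative.

```python
from itertools import islice

lst = [
    ['year', 'month', 'date_of_month', 'day_of_week', 'births'],
    ['1994', '1', '1', '6', '8096'],
    ['1994', '1', '2', '7', '7772'],
]

tail = lst[1:]
print(tail is lst)        # False: the slice is a new list object
print(tail[0] is lst[1])  # True: the inner rows are shared, not copied

tail.append(['1994', '1', '3', '1', '10142'])
print(len(lst))           # 3: appending to the slice leaves lst unchanged

# An iterator does not support rows[1:], but islice can skip its first item.
rows = iter(lst)
skipped = list(islice(rows, 1, None))
print(len(skipped))       # 2
```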
