pairing up three different lists

Question

I have the following list of dicts:

 authorvals= [
        {
            "author": "author1",
            "year": [
                "2016"
            ],
            "value1": 4.0
        },
        {
            "author": "author2",
            "year": [
                "2016"
            ],
            "value1": 2.0
        },
        {
            "author": "author1",
            "year": [
                "2016"
            ],
            "value3": 1.0
        },
        {
            "author": "author1",
            "year": [
                "2016"
            ],
            "value2": 4.0
        },
        {
            "author": "author2",
            "year": [
                "2016"
            ],
            "value2": 2.0
        }]

Now I want lists from the dict as follows:

val_list=["value1","value2","value3"]
num_list=[[4,2],[4,2],[1,0]]
auth_list=["author1","author2"]

I want the dict as three separate lists.

The first list contains the "value"+x keys found in the dicts.
The second list contains the values of each of those keys for author1 and author2.
The third list is just the list of authors.

I have tried the following code:

num_list=[]
auth_list=[]
val_list=[]
for item in authorvals:
        if item['author'] not in auth_list:
            auth_list.append(item['author'])
            for k in item.keys():
                if k.startswith("value") and k not in val_list:
                    val_list.append(k)
                    val_list.sort()
                    for v in val_list:
                        temp_val_list = []
                        for i in authorvals:
                            try: 
                                val = i[v] 
                                temp_val_list.append(val) 
                            except: 
                                pass
                        if len(temp_val_list) > 0: 
                            num_list.append(temp_val_list) 
                            print(val_list) 
                            print(num_list) 
                            print(auth_list)

But this does not produce what I want. The 0 in the last list of num_list is there because author2 has no value3. If there is no value, then 0 should be printed.

kluvin · Accepted Answer · 2020-12-03T18:09:45.117

Collect the authors in a set.
Collect the keys and values in a defaultdict.
Post-process the values by padding each list up to the maximum length.

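from collections import defaultdict

# Position of the data key within each record (after 'author' and 'year').
# This relies on dicts preserving insertion order (Python 3.7+).
DATA_INDEX = 2

def collect(records):
    vals = defaultdict(list)
    authors = set()
    for record in records:
        for i, (k, v) in enumerate(record.items()):
            if k == 'author':
                authors.add(v)
            elif i == DATA_INDEX:
                vals[k].append(int(v))

    # A set does not preserve order, so the author order may vary.
    return (list(authors),
            list(vals.keys()),
            pad_by_max_len(vals.values()))


def pad_by_max_len(lol):
    # Pad every inner list with zeros up to the length of the longest one.
    lol = list(lol)
    # max(*lengths) fails when there is only one list, so pass the iterable itself.
    padlength = max(map(len, lol), default=0)
    return [pad(l, padlength) for l in lol]

def pad(l, padlength):
    return (l + [0] * padlength)[:padlength]

print(collect(authorvals))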

Giving (the order of the authors may vary, since they are collected in a set):

(
    ['author2', 'author1'],
    ['value1', 'value3', 'value2'],
    [[4, 2], [1, 0], [4, 2]]
)
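
The padding above is positional: zeros are appended at the end of each list, which matches the expected output here only because the author missing value3 is the last one. A minimal sketch of an author-aligned variant follows. It assumes that 'author' and 'year' are the only non-data keys; the names META_KEYS, collect_aligned and sample are hypothetical and not part of the original answer.

```python
META_KEYS = {"author", "year"}  # assumption: every other key holds data

def collect_aligned(records):
    authors = []  # authors in first-seen order
    table = {}    # data key -> {author: value}
    for record in records:
        author = record["author"]
        if author not in authors:
            authors.append(author)
        for key, value in record.items():
            if key not in META_KEYS:
                table.setdefault(key, {})[author] = value
    keys = sorted(table)
    # One row per key, one column per author; 0 where an author has no value.
    nums = [[table[k].get(a, 0) for a in authors] for k in keys]
    return keys, nums, authors

sample = [
    {"author": "author1", "year": ["2016"], "value1": 4.0},
    {"author": "author2", "year": ["2016"], "value1": 2.0},
    {"author": "author1", "year": ["2016"], "value3": 1.0},
    {"author": "author1", "year": ["2016"], "value2": 4.0},
    {"author": "author2", "year": ["2016"], "value2": 2.0},
]
print(collect_aligned(sample))
```

If the same author has the same key in more than one record, this sketch keeps only the last value seen; whether that is acceptable depends on the real data.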

There is no guarantee that a key starts with 'value'. It is just an example. It could be data or process or anything else @kluvin — Sam, Dec 03 '20 at 14:47
@Sam, please see the revised answer. I am afraid there could be a problem with ordering, since dictionaries don't normally guarantee this, however. Edit: https://stackoverflow.com/questions/39980323/are-dictionaries-ordered-in-python-3-6, looks like it isn't a problem :) — kluvin, Dec 03 '20 at 14:50

score 0 · Answer 2 · answered Dec 03 '20 at 13:12

Wasn't super clear on two things so I made assumptions:

Ordering of values doesn't matter
All values should appear as many times as the maximum occurring value. If not, add zeros to the num_list for that value.

The following code should work to that end:

val_list=[]
num_list=[]
auth_list=[]
max_values = 0

for d in authorvals:
    if d["author"] not in auth_list:
        auth_list.append(d["author"])
    for key in d:
        if key.startswith("value"):
            if key not in val_list:
                val_list.append(key)
                num_list.append([d[key]])
                max_values = max(max_values, 1)
            else:
                idx = val_list.index(key)
                num_list[idx].append(d[key])
                max_values = max(max_values, len(num_list[idx]))

for sublist in num_list:
    if len(sublist) != max_values:
        padding = [0] * (max_values - len(sublist))
        sublist.extend(padding)

print(val_list)  # ['value1', 'value3', 'value2']
print(num_list)  # [[4.0, 2.0], [1.0, 0], [4.0, 2.0]]
print(auth_list) # ['author1', 'author2']
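
The key.startswith("value") test above assumes the data keys share a common prefix. If they do not, one possible alternative is to select keys by exclusion. This is only a sketch, assuming 'author' and 'year' are the only non-data keys; data_keys and its skip parameter are hypothetical names.

```python
def data_keys(record, skip=("author", "year")):
    """Return the keys of record that are not listed in skip, in dict order."""
    return [key for key in record if key not in skip]

print(data_keys({"author": "author1", "year": ["2016"], "process": 4.0}))
# prints ['process']
```

In the loop above, iterating over data_keys(d) instead of d would then make the startswith check unnecessary.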

There is no guarantee that a key starts with 'value'. It is just an example. It could be data or process or anything else @Saad Hussain — Sam, Dec 03 '20 at 14:48

darkash · Answer 3 · 2020-12-08T08:49:23.737

auth_list = {x['author'] for x in authorvals}  # a set; wrap it in list() if you need to access it by index
indexed = {}  # maps each data key to its list of values

for auth in authorvals:
    # drop the 'author' and 'year' keys and take the first remaining key
    filtered = next(iter(auth.keys() - {'author', 'year'}))
    if filtered not in indexed:
        indexed[filtered] = []  # initialize the first time this key is seen
    indexed[filtered].append(auth[filtered])  # append the value under its key

val_list = list(indexed.keys())
num_list = [indexed[key] for key in val_list]

Note that num_list may differ from the example provided: the inner lists are not padded to a fixed length, but you can always post-process them afterwards.
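
As a rough illustration of that post-processing step, the inner lists can be padded with zeros up to the length of the longest one. The helper name pad_rows and the sample data are hypothetical. Note that the zeros are appended at the end of each list, so they are not tied to a particular author.

```python
def pad_rows(rows, fill=0):
    """Return copies of rows, right-padded with fill to a common length."""
    width = max(map(len, rows), default=0)
    return [row + [fill] * (width - len(row)) for row in rows]

ragged = [[4.0, 2.0], [1.0], [4.0, 2.0]]
print(pad_rows(ragged))
# prints [[4.0, 2.0], [1.0, 0], [4.0, 2.0]]
```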

@darkash I am getting this error: TypeError: 'set' object is not subscriptable, on the line filtered = keys.__sub__(['author', 'year'])[0] — Sam, Dec 03 '20 at 14:43
