  • How can I convert a dictionary whose values are lists of lists into a dictionary whose values are flat lists, dropping duplicates in the newly created list?
  • I tried running the following code and got a 'memory error':
from collections import defaultdict

my_dict = defaultdict(list)    
for k, v in zip(df_Group.guest_group, df_Group.list_guest):
  for item in v:
    v.append(item)    
  my_dict[k].append(set(v))
  • My original dictionary is created from 2 columns of one dataframe and looks like: {Group1: [[1,2,3,4], [1, 2, 5, 6 ]]}
  • I want my dictionary to look like: {Group1: [1,2,3,4,5,6]}
Praveen
astonle

1 Answer

From what I understand, you essentially want to flatten each list of lists while keeping only the unique items. You can get the unique items by converting the list into a set and then back into a list. The flattening (unpacking) part is explained really well in this post. Here's working code for you:

df_dict = {
    'Group1': [[1,2,3,4], [1, 2, 5, 6 ]],
    'Group2': [[1,2], [2,3],[2,3,4]]
}

final_dict = {}

for k, v in df_dict.items():
    # flatten the list using double list comprehension
    flat_list = [item for sublist in v for item in sublist]
    final_dict[k] = list(set(flat_list))

This gives the final_dict as:

{'Group1': [1, 2, 3, 4, 5, 6], 'Group2': [1, 2, 3, 4]}
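One caveat: a set is unordered, so list(set(...)) is not guaranteed to return the values in any particular order. If the first-seen order matters, one possible variant is to flatten with itertools.chain.from_iterable and de-duplicate with dict.fromkeys, which keeps insertion order. This is only a sketch; the name ordered_dict is made up for illustration.

```python
from itertools import chain

df_dict = {
    'Group1': [[1, 2, 3, 4], [1, 2, 5, 6]],
    'Group2': [[1, 2], [2, 3], [2, 3, 4]],
}

# chain.from_iterable flattens one level of nesting;
# dict.fromkeys drops duplicates while keeping first-seen order
ordered_dict = {
    k: list(dict.fromkeys(chain.from_iterable(v)))
    for k, v in df_dict.items()
}

print(ordered_dict)
```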

Please tell me if this answers your query.
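About the memory error in the question: the inner loop appends to v while it is iterating over v, so the list keeps growing and the loop never finishes, until memory runs out. Below is a minimal sketch of building the de-duplicated dictionary directly from the two columns. Plain lists stand in for df_Group.guest_group and df_Group.list_guest, and the sample values are made up.

```python
from collections import defaultdict

# Hypothetical stand-ins for df_Group.guest_group and df_Group.list_guest
guest_group = ['Group1', 'Group1', 'Group2']
list_guest = [[1, 2, 3, 4], [1, 2, 5, 6], [1, 2]]

# One set per group; update() adds every id from the row's list
unique_by_group = defaultdict(set)
for k, v in zip(guest_group, list_guest):
    unique_by_group[k].update(v)

# Convert the sets back to (sorted) lists
my_dict = {k: sorted(ids) for k, ids in unique_by_group.items()}

print(my_dict)
```

Because each row is merged into a set as it is read, no intermediate list of lists is kept, and v itself is never modified.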

Edit for integer values in between the lists: if a value contains bare integers alongside the lists, you will get the 'int' object is not iterable error. To solve it, we can check whether the item is an int and wrap it in a one-element list ourselves.

Working code:

df_dict = {
    'Group1': [[1,2,3,4], 3, [1, 2, 5, 6 ]],
    'Group2': [[1,2], [2,3],[2,3,4]],
}

final_dict = {}

for k, v in df_dict.items():
    # making a new list 
    new_list = []
    for item in v:
        # for int we will convert it to 1 length list
        if isinstance(item, int):
            item = [item]
        for ele in item:
            new_list.append(ele)
    final_dict[k] = list(set(new_list))

final_dict

Final dict:

{'Group1': [1, 2, 3, 4, 5, 6], 'Group2': [1, 2, 3, 4]}

As expected
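A possible pitfall with the isinstance(item, int) check: values that come out of a pandas column are often NumPy integers (for example numpy.int64), and those are not instances of Python's int, so the check would miss them and the same error would come back. A sketch that avoids this tests for containers instead of for int. The helper name flatten_unique is hypothetical, and treating only list, tuple and set as containers is an assumption about the data.

```python
def flatten_unique(values):
    """Collect the unique items from a mix of containers and bare scalars."""
    unique = set()
    for item in values:
        if isinstance(item, (list, tuple, set)):
            # a container: add each of its elements
            unique.update(item)
        else:
            # anything else is treated as a single value
            unique.add(item)
    return sorted(unique)


df_dict = {
    'Group1': [[1, 2, 3, 4], 3, [1, 2, 5, 6]],
    'Group2': [[1, 2], [2, 3], [2, 3, 4]],
}

final_dict = {k: flatten_unique(v) for k, v in df_dict.items()}

print(final_dict)
```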

Varun
  • Thanks for your help. I have tried this way, but got the error 'int' object is not iterable and can't solve it with range(len(v)) – astonle Nov 03 '20 at 05:26
  • If you used the dictionary that I have given in the code, then you shouldn't get that error. Can I know what type of dictionary you are using? – Varun Nov 03 '20 at 05:56
  • I have added an edit for your problem, tell me if this solves your problem. – Varun Nov 03 '20 at 06:08
  • My dictionary type is 'defaultdict', and its value type is 'defaultdict object of collections module'. When I print the dict, it looks like: 'defaultdict(<class 'list'>, {'Group1': [[170323, 38785, 43158, ...], [170323, 38785, 43158...]], 'Group2':[[42276, 67349, 56879,...], [422763, 673490, 588879,...],...]]}) – astonle Nov 03 '20 at 06:13
  • Thanks for your effort, but I still get the error 'int' object is not iterable. This dictionary contains all keys and values from 2 columns in a dataframe, where the keys are 'group_name' and the values are 'list of guest_id'. – astonle Nov 03 '20 at 06:30
  • If I do it this way, the dictionary will not contain all NON-duplicated values. Each group (key) only contains a fixed number of values, 11 (one list_guest_id)! – astonle Nov 03 '20 at 07:02
  • I am not sure I understand you well. By "not contain all NON-duplicated values", do you mean it doesn't have the values that appear in more than one list (in a single group)? Then I think you are mistaken. In the above example, 4 is a NON-duplicated value and it appears in the final answer. – Varun Nov 03 '20 at 07:09
  • I don't know... it seems like the list_guest_id values that are not duplicates of previous ids get replaced when they have the same group_name (key) and the same list_id_guest length (11 ids). – astonle Nov 03 '20 at 07:13
  • Can you provide me the data too? Without it, whatever I answer will seem vague or won't work for you. – Varun Nov 03 '20 at 07:20
  • When I create the dictionary as a 'defaultdict' from the 2 columns, it has a total of 284 group_name keys, with each group containing 19 to 152 list_id entries, each a list of the same length (11 ids). For each row in the dataframe, each group_name changes the list of ids in it, but the list length remains the same. – astonle Nov 03 '20 at 07:30