I have a dictionary users with 1748 elements (only the first 12 are shown):
defaultdict(int,
{'470520068': 1,
'2176120173': 1,
'145087572': 3,
'23047147': 1,
'526506000': 1,
'326311693': 1,
'851106379': 4,
'161900469': 1,
'3222966471': 1,
'2562842034': 1,
'18658617': 1,
'73654065': 4,})
and another dictionary partition with 452743 elements (only the first 42 are shown):
{'609232972': 4,
'975151075': 4,
'14247572': 4,
'2987788788': 4,
'3064695250': 2,
'54097674': 3,
'510333371': 0,
'34150587': 4,
'26170001': 0,
'1339755391': 3,
'419536996': 4,
'2558131184': 2,
'23068646': 6,
'2781517567': 3,
'701206260771905541': 4,
'754263126': 4,
'33799684': 0,
'1625984816': 4,
'4893416104': 3,
'263520530': 3,
'60625681': 4,
'470528618': 3,
'4512063372': 6,
'933683112': 3,
'402379005': 4,
'1015823005': 2,
'244673821': 0,
'3279677882': 4,
'16206240': 4,
'3243924564': 6,
'2438275574': 6,
'205941266': 3,
'330723222': 1,
'3037002897': 0,
'75454729': 0,
'3033154947': 6,
'67475302': 3,
'922914019': 6,
'2598199242': 6,
'2382444216': 3,
'1388012203': 4,
'3950452641': 5,}
All keys in users are unique, and all of them also appear in partition, where they are repeated with different values (partition also contains some extra keys that are of no use here). What I want is a new dictionary final that pairs each key of users with the matching values from partition. For example, if '145087572' is a key in users and the same key appears two or three times in partition with different values, such as {'145087572':2, '145087572':3, '145087572':7}, then all three of these elements should end up in final. I also have to store this dictionary as a key:value RDD.
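To make the goal concrete, here is a minimal sketch on made-up sample data. It assumes the repeated entries are available as a list of (key, value) pairs, because a plain Python dict keeps only one value per key, so a literal with a repeated key collapses to a single entry. The names partition_pairs and final_pairs are hypothetical and used only for illustration.

```python
# Made-up sample data; partition_pairs and final_pairs are hypothetical names.
users = {'145087572': 3, '470520068': 1}

# A dict literal with a repeated key keeps only the last value.
collapsed = {'145087572': 2, '145087572': 3, '145087572': 7}

# Repeated keys survive only in a structure that allows duplicates,
# such as a list of (key, value) pairs.
partition_pairs = [
    ('145087572', 2),
    ('145087572', 3),
    ('609232972', 4),   # extra key that is not in users
    ('145087572', 7),
    ('470520068', 0),
]

# Keep every pair whose key is in users; a membership test on a dict
# takes constant time on average, so this is one pass over the pairs.
final_pairs = [(k, v) for k, v in partition_pairs if k in users]

print(collapsed)
print(final_pairs)
```

Given a SparkContext sc, a list of pairs like this could be passed to sc.parallelize(final_pairs) to get a key:value RDD. That step is left out so the sketch runs without Spark.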
Here's what I tried:
user_key=list(users.keys())
final=[]
for x in user_key:
    s={x:partition.get(x) for x in partition}
    final.append(s)
After running this code my laptop stops responding (the code still shows [*]) and I have to restart it. Is there a problem with my code, and is there a more efficient way to do this?
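A minimal sketch of what the loop above does, run on made-up four-entry data. The x inside the comprehension shadows the loop variable, so every s is a full copy of partition. At the real sizes that would be 1748 copies of a 452743-entry dict, roughly 791 million entries held in memory. The single-pass alternative below assumes partition is a plain dict with one value per key. The names copies_total and final_dict are hypothetical.

```python
# Made-up sample data standing in for the real dictionaries.
users = {'a': 1, 'b': 3}
partition = {'a': 4, 'b': 2, 'c': 0, 'd': 6}

# The original loop: the comprehension iterates over all of partition,
# and its own x hides the loop's x, so s is a copy of partition each time.
final = []
for x in list(users.keys()):
    s = {x: partition.get(x) for x in partition}
    final.append(s)

# Total entries stored: len(users) * len(partition).
copies_total = sum(len(s) for s in final)

# Alternative: one pass over users with a direct lookup in partition.
final_dict = {k: partition[k] for k in users if k in partition}

print(copies_total)
print(final_dict)
```

The alternative does one dictionary lookup per key in users, so its cost grows with the size of users rather than with the product of the two sizes.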