
I have three dictionaries with the same keys, but their values are on very different scales. I want to normalize/standardize the values so that I can sum them into an overall combined score for each key (with equal weights for the three inputs).

Current:
page_score = {'andrew.lewis': 6.599, 'jack.redmond': 4.28, ..., 'geoff.storey': 2.345}
eigen_score = {'andrew.lewis': 4.97, 'jack.redmond': 2.28, ..., 'geoff.storey': 3.927}
(1 more)


Normalized:
page_score = {'andrew.lewis': 0.672, 'jack.redmond': 0.437, ..., 'geoff.storey': 0.276}
hub_score = {'andrew.lewis': 0.432, 'jack.redmond': 0.762, ..., 'geoff.storey': 0.117}
(1 more)

End Output:
overall_score = {'andrew.lewis': 2.738, ...}  # combination of values across the three standardized dictionaries

How can I achieve this? I know how to do this for a list, but I'm not sure how to do it for a dictionary. I've already tried the solutions provided here and here, but they produced various errors. Any help would be appreciated. Code so far:

import networkx as nx

G = nx.read_weighted_edgelist('Only_50_Employees1.csv', delimiter=',', create_using=nx.DiGraph(), nodetype=str)

between_score = dict(nx.betweenness_centrality(G))
eigen_score = dict(nx.eigenvector_centrality(G))
page_score = nx.pagerank(G)

Tried Already

factor=1.0/sum(page_score.values())
normalised_d = {k: v*factor for k, v in page_score.items()}

def normalize(page_score, target=1.0):
    raw = sum(page_score.values())
    factor = target/raw
    return {key:value*factor for key,value in page_score.items()}

def really_safe_normalise_in_place(page_score):
    factor=1.0/math.fsum(page_score.values())
    for k in page_score:
        page_score[k] = page_score[k]*factor
    key_for_max = max(page_score.items(), key=operator.itemgetter(1))[0]
    diff = 1.0 - math.fsum(page_score.values())
    #print "discrepancy = " + str(diff)
    page_score[key_for_max] += diff

d={v: v+1.0/v for v in xrange(1, 1000001)}
really_safe_normalise_in_place(d)
print math.fsum(page_score.values())
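
One thing that may explain why the first two snippets appear to change nothing: both build a new dictionary and leave page_score itself untouched, so the result has to be assigned to a name. In addition, as far as I know, the values returned by nx.pagerank already sum to roughly 1, so dividing by the sum would leave them almost unchanged anyway. A minimal sketch of the first point, with made-up keys and values (not taken from the real graph):

```python
def normalize(scores, target=1.0):
    """Return a NEW dict whose values sum to `target`; the input is not modified."""
    factor = target / sum(scores.values())
    return {key: value * factor for key, value in scores.items()}

# Hypothetical values, chosen only for illustration
page_score = {'a': 2.0, 'b': 6.0}

normalize(page_score)               # result is discarded, page_score is unchanged
normalised = normalize(page_score)  # keep the returned dict instead

print(page_score)
print(normalised)
```

Here page_score still holds the original values after both calls, while normalised holds the scaled copy.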

[Screenshot of the page_score dictionary]

Laurie
  • Various errors? – timgeb Jun 26 '18 at 10:16
  • Willem's solution in the former link resulted in an invalid syntax error on the last bracket (no idea what could be causing that, I tried a variety of different structurings), while Ajax's solution in the latter link computes without error however no values have changed in the target dictionary. – Laurie Jun 26 '18 at 10:25
  • Three blocks of code I've tried listed above, the first two run without error (however don't change values), whereas the third returns an invalid syntax. I've tried applying the second block to my dictionary (using normalize(hub_score)), however still nothing. – Laurie Jun 26 '18 at 10:49
  • One last query, can you post the output of `print hub_score`? Also, can you please indent the updated code, or you can post a screenshot of the code if you want. – Gambit1614 Jun 26 '18 at 11:17
  • Do you mean for the last function? – Laurie Jun 26 '18 at 11:18
  • No, I mean just the original values of any one of the dictionary like `hub_score` – Gambit1614 Jun 26 '18 at 11:29
  • See screenshot above. – Laurie Jun 26 '18 at 11:34
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/173811/discussion-between-mohammed-kashif-and-laurie-bamber). – Gambit1614 Jun 26 '18 at 11:37

1 Answer


For anyone interested, I found a (very unpythonic) way to achieve this using a pandas DataFrame:

# Libraries
import networkx as nx
import pandas as pd
import operator

# Loading files and node metrics
G = nx.read_weighted_edgelist('Only_50_Employees1.csv', delimiter=',', create_using = nx.DiGraph(), nodetype=str)
page_score = dict(nx.pagerank(G))
eigen_score = dict(nx.eigenvector_centrality(G))
betweenness_score = dict(nx.betweenness_centrality(G))
mydicts = [page_score, betweenness_score, eigen_score]

# Creating pandas dataframe (one row per node, one column per metric)
df = pd.concat([pd.Series(d) for d in mydicts], axis=1).fillna(0)
df.columns = ['page_score', 'betweenness_score', 'eigen_score']
del page_score, eigen_score, betweenness_score, mydicts

# Scaling: mean normalisation per column, i.e. (x - mean) / (max - min),
# then shift by 1 so that every value is positive
df = (df - df.mean()) / (df.max() - df.min())
score_columns = ['page_score', 'betweenness_score', 'eigen_score']
df = df[score_columns] + 1

# Creating new column with overall score
df['score'] = df['page_score'] + df['betweenness_score'] + df['eigen_score']
del df['page_score'], df['betweenness_score'], df['eigen_score']

# Reverting df back to dict
score_dict = df['score'].to_dict()
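
The same scaling can also be sketched with plain dictionaries, without pandas. This is only a sketch under a few assumptions: the helper names mean_normalize and combine are made up for illustration, the sample values are invented, a key missing from one dictionary is treated as 0 before scaling (mirroring fillna(0) above), and each dictionary must contain at least two distinct values so that max - min is not zero.

```python
def mean_normalize(scores):
    """Scale each value to (value - mean) / (max - min), then shift by +1."""
    values = list(scores.values())
    mean = sum(values) / len(values)
    spread = max(values) - min(values)
    if spread == 0:
        raise ValueError("all values are equal; cannot scale")
    return {key: (value - mean) / spread + 1 for key, value in scores.items()}

def combine(*score_dicts):
    """Scale every dict separately, then sum the scaled values per key."""
    keys = set().union(*score_dicts)
    # Keys missing from a dict count as 0 before scaling
    filled = [{k: d.get(k, 0.0) for k in keys} for d in score_dicts]
    scaled = [mean_normalize(d) for d in filled]
    return {k: sum(d[k] for d in scaled) for k in keys}

# Hypothetical inputs on very different scales
page_score = {'a': 1.0, 'b': 3.0}
eigen_score = {'a': 10.0, 'b': 30.0}

overall = combine(page_score, eigen_score)
print(overall)
```

Because each dictionary is scaled on its own before summing, the two inputs contribute equally even though their raw values differ by a factor of ten.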
Laurie