I have a question about implementing PageRank with MapReduce. I'll quote the code from https://stackoverflow.com/a/5029780/1117436 to describe the problem.
    map((url, PR), out_links)  // PR = random at start
        for link in out_links
            emit(link, ((PR / size(out_links)), url))

    reduce(url, List[(weight, url)]):
        PR = 0
        for v in weights
            PR = PR + v
        Set urls = all urls from list
        emit((url, PR), urls)
In the process above, the second parameter of the map input is clearly the out-links of url, but the second parameter of the reduce output appears to be the in-links of url. So how can this code work iteratively, when the reduce output is supposed to be fed back in as the next map input?
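To make the mismatch concrete, here is a small Python simulation of the pseudocode above on a made-up three-page graph. The graph, the starting rank of 1.0, and the names `map_step` and `reduce_step` are my own assumptions for illustration; the grouping by key stands in for the shuffle phase.

```python
from collections import defaultdict

def map_step(url, pr, out_links):
    # Emit an equal share of this page's rank to every page it links to.
    for link in out_links:
        yield link, (pr / len(out_links), url)

def reduce_step(url, values):
    # Sum the incoming shares; the collected urls are the pages linking TO url.
    pr = sum(weight for weight, _ in values)
    urls = sorted({src for _, src in values})
    return (url, pr), urls

# Hypothetical graph: A -> B, A -> C, B -> C, C -> A
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}

# Simulated shuffle: group the mapper output by key.
grouped = defaultdict(list)
for url, out_links in graph.items():
    for key, value in map_step(url, 1.0, out_links):
        grouped[key].append(value)

result = {url: reduce_step(url, values) for url, values in grouped.items()}
in_links_of_c = result["C"][1]  # pages that link to C, not pages C links to
```

For page C the reducer returns the pages A and B, while the next map call would need C's out-links, which is only A.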
So my question is: how should the code be written so that the PageRank algorithm works properly?
UPDATE: I think this answer solves my problem. https://stackoverflow.com/a/13568286/1117436
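For reference, here is a minimal Python sketch of one common way to close the loop: the mapper also emits each page's own out-link list, and the reducer passes it through unchanged. This is my own sketch and not copied from either linked answer. The tags `"links"` and `"rank"`, the damping factor 0.85, the helper `iterate`, and the three-page graph are all assumptions, and it assumes every page appears as a key and has at least one out-link.

```python
from collections import defaultdict

DAMPING = 0.85  # assumed damping factor; not part of the quoted pseudocode

def map_step(url, pr, out_links):
    # Pass the graph structure through so the reducer can re-emit it.
    yield url, ("links", out_links)
    # Distribute this page's rank evenly over its out-links.
    for link in out_links:
        yield link, ("rank", pr / len(out_links))

def reduce_step(url, values, n_pages):
    out_links, total = [], 0.0
    for kind, payload in values:
        if kind == "links":
            out_links = payload   # the page's own out-links, unchanged
        else:
            total += payload      # a rank share from a page linking here
    pr = (1 - DAMPING) / n_pages + DAMPING * total
    return url, pr, out_links

def iterate(state):
    # One MapReduce round; the output has the same shape as the input.
    grouped = defaultdict(list)
    for url, (pr, out_links) in state.items():
        for key, value in map_step(url, pr, out_links):
            grouped[key].append(value)
    new_state = {}
    for url, values in grouped.items():
        key, pr, links = reduce_step(url, values, len(state))
        new_state[key] = (pr, links)
    return new_state

# Hypothetical graph: A -> B, A -> C, B -> C, C -> A, uniform starting rank.
state = {"A": (1 / 3, ["B", "C"]), "B": (1 / 3, ["C"]), "C": (1 / 3, ["A"])}
for _ in range(20):
    state = iterate(state)
```

Because the reducer output again maps each url to (PR, out_links), it can be fed straight into the next round. Pages without out-links would need extra handling that this sketch leaves out.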