I have a dictionary that maps each ID to a list of URLs belonging to that ID, in the following format:

{
 "abc123":["https://www.url.com/", "https://www.url2.com"],
 "def456": ["https://www.url3.com/"],
    .
    .
    .
 "xyz890": ["https://www.anotherurl.com/", "https://anotherurl2.com/", "https://www.anotherurl3.com/"]
}

I have an external API that requires all URLs to be passed as a single list, so I want to concatenate the URLs under every key into one list and pass that list to the API. (I could send each list separately, but I want to avoid multiple API calls because network costs are a deciding factor.) In its response, the API sends back another list containing the URLs it finds malicious. I then want to find out which ID in my dictionary each of those URLs belongs to.
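For concreteness, here is a minimal sketch of the concatenation step. The name `id_to_urls` and the sample data are placeholders for illustration only, and dropping duplicate URLs before sending is an assumption about what the API would accept, not something it is known to require.

```python
from itertools import chain

# Placeholder data in the same shape as the dictionary above.
id_to_urls = {
    "abc123": ["https://www.url.com/", "https://www.url2.com"],
    "def456": ["https://www.url3.com/"],
}

# Flatten every per-ID list into one list. dict.fromkeys drops duplicate
# URLs while keeping first-seen order, so each URL is sent only once.
all_urls = list(dict.fromkeys(chain.from_iterable(id_to_urls.values())))
```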

The problem is that I would have to look up each malicious URL against every list in the dictionary to find its original ID, which I believe is a crude and slow approach. Can someone please suggest a simple technique to achieve this?

Let me know if anyone needs additional context. Thanks for taking the time to read this.

Marry35
  • While creating the first dictionary of key->URLs, you could also create a URL->keys dictionary. It will double space requirements, but greatly reduce cost of lookups. – Carcigenicate Jan 10 '21 at 15:40
  • Great Idea! Also, since one URL could be part of multiple IDs, a URL->List of keys structure would do the job. – Marry35 Jan 10 '21 at 16:29
  • Yep. See [here](https://stackoverflow.com/questions/3318625/how-to-implement-an-efficient-bidirectional-hash-table) for examples of how this can be done. – Carcigenicate Jan 10 '21 at 16:32
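
The URL->list of keys idea from the comments could be sketched roughly as follows. The function names `build_url_index` and `ids_for_malicious`, the sample data, and the `flagged` list standing in for the API response are all hypothetical. The sketch also assumes the API returns each URL exactly as it was sent (no normalization such as adding or removing a trailing slash).

```python
from collections import defaultdict


def build_url_index(id_to_urls):
    """Invert {id: [urls]} into {url: [ids]} in a single pass."""
    url_to_ids = defaultdict(list)
    for key, urls in id_to_urls.items():
        # set() avoids recording the same ID twice if a URL repeats in one list.
        for url in set(urls):
            url_to_ids[url].append(key)
    return dict(url_to_ids)


def ids_for_malicious(url_to_ids, malicious_urls):
    """Map each flagged URL back to the IDs whose lists contain it."""
    return {url: url_to_ids.get(url, []) for url in malicious_urls}


# Hypothetical sample data; one URL is shared by two IDs.
id_to_urls = {
    "abc123": ["https://www.url.com/", "https://www.url2.com"],
    "def456": ["https://www.url3.com/"],
    "xyz890": ["https://www.url3.com/", "https://www.anotherurl.com/"],
}

url_to_ids = build_url_index(id_to_urls)

# Stand-in for the list of malicious URLs returned by the external API.
flagged = ["https://www.url3.com/", "https://www.url.com/"]
result = ids_for_malicious(url_to_ids, flagged)
```

Building the index costs one pass over all URLs and roughly doubles memory use, but each lookup afterwards is an average constant-time dictionary access instead of a scan over every list.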
