
import requests
from bs4 import BeautifulSoup

areaDict = {
    "id": "",
    "name": "",
    "urls": [["", ""], ["", ""]],
}
areas = []

def getCouncils():
    page = requests.get('https://profile.id.com.au/')
    soup = BeautifulSoup(page.content, 'html.parser') #Parsing content

    links = soup.select("a.councilCard")
    
    id = 0
    for anchor in links:
        areaInfo = dict(areaDict)
        areaInfo['id'] = id
        id += 1
        areaInfo['name'] = anchor.text.strip() #Strip gets rid of new line characters
        areaInfo['urls'][0][0] = "profileID"
        areaInfo['urls'][0][1] = anchor['href']
        areas.append(areaInfo)
        #print(anchor['href'])

When I uncomment that print statement, it prints the correct href for each anchor. However, when I print the entire `areas` list, every entry's URL is the href from the last anchor in `links`. I assume this has something to do with how Python stores objects in memory, but I am unsure how to fix it. Thank you.
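
The symptom can be reproduced without the network request. This is a minimal sketch, assuming two made-up hrefs in a hypothetical `hrefs` list in place of the scraped anchors; the template dict and the copy step are the same as above.

```python
# Template dict as in the question; 'urls' is a nested list.
areaDict = {"id": "", "name": "", "urls": [["", ""], ["", ""]]}
areas = []

# Hypothetical hrefs standing in for anchor['href'] from the scraped page.
hrefs = ["/first-council", "/second-council"]

for i, href in enumerate(hrefs):
    areaInfo = dict(areaDict)            # same copy step as in getCouncils()
    areaInfo["id"] = i                   # rebinds a key in the new dict
    areaInfo["urls"][0][0] = "profileID" # writes into the nested list
    areaInfo["urls"][0][1] = href
    areas.append(areaInfo)

print([a["id"] for a in areas])              # [0, 1]
print([a["urls"][0][1] for a in areas])      # ['/second-council', '/second-council']
print(areas[0]["urls"] is areas[1]["urls"])  # True
```

The ids differ per entry, but both entries report the last href, and the final line shows that both entries hold the very same `urls` list object.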

  • See: https://stackoverflow.com/questions/5105517/deep-copy-of-a-dict-in-python. `areaInfo = dict(areaDict)` does a shallow copy, i.e. `areaInfo` is a new object, but `areaInfo['urls']` refers to the same list object as `areaDict['urls']`. So you're looking for a deep copy instead. – slothrop Jun 03 '23 at 11:04
  • 1
    Why don't you see this problem with the `id` and `name` elements? Because (for example) `areaInfo['id'] = id` **replaces** `areaInfo['id']`. Whereas `areaInfo['urls'][0][0] = "profileID"` **modifies** `areaInfo['urls"]`. – slothrop Jun 03 '23 at 11:24
  • Does this answer your question? [Deep copy of a dict in python](https://stackoverflow.com/questions/5105517/deep-copy-of-a-dict-in-python) – slothrop Jun 04 '23 at 19:44
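
The deep-copy suggestion from the comments can be sketched as follows. This is an illustration only, not a tested replacement for `getCouncils()`: the `build_areas` helper and the sample hrefs are hypothetical, and the scraping step is left out so the example runs on its own. `copy.deepcopy` is from the standard library.

```python
import copy

# Template dict as in the question; 'urls' is a nested list.
areaDict = {"id": "", "name": "", "urls": [["", ""], ["", ""]]}

def build_areas(hrefs):
    """Hypothetical helper: build one independent dict per href."""
    areas = []
    for i, href in enumerate(hrefs):
        # deepcopy also copies the nested lists, so each entry
        # gets its own 'urls' list instead of sharing the template's.
        areaInfo = copy.deepcopy(areaDict)
        areaInfo["id"] = i
        areaInfo["urls"][0][0] = "profileID"
        areaInfo["urls"][0][1] = href
        areas.append(areaInfo)
    return areas

# Made-up hrefs standing in for anchor['href'] values.
areas = build_areas(["/first-council", "/second-council"])

print([a["urls"][0][1] for a in areas])      # ['/first-council', '/second-council']
print(areas[0]["urls"] is areas[1]["urls"])  # False
print(areaDict["urls"])                      # [['', ''], ['', '']]
```

Each entry now keeps its own href, and the template `areaDict` is left untouched. Building a fresh dict literal inside the loop would likely work just as well if the template is not needed elsewhere.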
