I found an XHR request whose response contains all the street address information I want to scrape.

However, I do not know how to extract it into a pandas DataFrame or a Python list. Any ideas? Thank you very much!

  • Can you provide what you have tried? You can provide an example with desired input and output. What programming languages are you using? – Danizavtz Jul 19 '20 at 20:20

2 Answers

You can replicate the XHR with Python's requests library, using the information shown in the Headers tab of your browser's developer tools. Then parse the response with the json library and extract the fields you need.
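A minimal sketch of that approach, assuming the XHR is a JSON POST. The URL, headers, and payload passed to `replay_xhr` would be copied from the browser's developer tools; `replay_xhr` and `parse_body` are illustrative helper names, not part of any library, and the sample body is made up so the parsing step can be tried offline.

```python
import json


def replay_xhr(url, headers, payload):
    """Re-send the request the browser made and return the raw response body."""
    import requests  # third-party: pip install requests

    response = requests.post(url, headers=headers, json=payload, timeout=30)
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error page
    return response.text


def parse_body(body):
    """Decode the JSON text of the response into plain Python dicts and lists."""
    return json.loads(body)


# Offline demonstration with a made-up body; a real one would come from replay_xhr(...).
sample_body = '{"data": {"items": [{"address": "1 Example St"}, {"address": "2 Sample Ave"}]}}'
parsed = parse_body(sample_body)
addresses = [item["address"] for item in parsed["data"]["items"]]
print(addresses)
```

The keys used to reach the records (`"data"`, `"items"`) are placeholders; the Response or Preview tab for the XHR shows the actual nesting.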

sqz

Since the endpoint is GraphQL, you can formulate the query however you like (for example, requesting only the fields you need), but here I've written it the same way the browser sends it:

import requests


def main():

    url = "https://api-endpoint.cons-prod-us-central1.kw.com/graphql"

    headers = {
        "x-shared-secret": "MjFydHQ0dndjM3ZAI0ZHQCQkI0BHIyM="
    }

    query = """{
  ListOfficeQuery {
    id
    name
    address
    subAddress
    phone
    fax
    lat
    lng
    url
    contacts {
      name
      email
      phone
      __typename
    }
    __typename
  }
}
"""

    payload = {
        "operationName": None,
        "variables": {},
        "query": query
    }

    response = requests.post(url, headers=headers, json=payload)
    response.raise_for_status()

    offices = response.json()["data"]["ListOfficeQuery"]

    print(f"There are {len(offices)} offices, and the first one's address is \"{offices[0]['address']}\"")

    return 0


if __name__ == "__main__":
    import sys
    sys.exit(main())

Output:

There are 1173 offices, and the first one's address is "1801 South Mo-Pac Expressway, Suite 100"
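To get from `offices` to a Python list or a pandas DataFrame, something along these lines should work. It is a sketch under assumptions: the two sample records are invented and only mirror the field names in the query above, `office_rows` and `to_dataframe` are hypothetical helper names, and pandas has to be installed for the last step.

```python
def office_rows(offices):
    """Reduce each office to a flat dict holding the address-related fields."""
    fields = ("name", "address", "subAddress", "lat", "lng")
    return [{field: office.get(field) for field in fields} for office in offices]


def to_dataframe(rows):
    """Build a DataFrame from a list of dicts (one row per office)."""
    import pandas as pd  # third-party: pip install pandas

    return pd.DataFrame(rows)


# Invented sample data shaped like response.json()["data"]["ListOfficeQuery"].
sample_offices = [
    {"name": "Office A", "address": "1 Example St", "subAddress": "Springfield, XX 00000",
     "lat": 1.0, "lng": 2.0, "contacts": []},
    {"name": "Office B", "address": "2 Sample Ave", "subAddress": None,
     "lat": 3.0, "lng": 4.0, "contacts": []},
]

rows = office_rows(sample_offices)
addresses = [row["address"] for row in rows]  # plain Python list of street addresses
print(addresses)
# With pandas installed: df = to_dataframe(rows)
```

In the script above, `office_rows(offices)` would be called right after the `offices = ...` line. The nested `contacts` list is left out here because it does not fit a single flat row per office.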
Paul M.