
Can anyone tell me how I can extract the Highcharts data from the following link into Python?

https://www.ree.es/en/datos/generation/generation-structure

mxs

1 Answer

Try the approach below using Python's requests library. It is simple, fast, and needs little code. I found the API URL on the website itself by inspecting the Network tab of the Google Chrome developer tools.

What the script below does:

  1. First, it builds the API URL from the dynamic parameters (the variables in all caps) and sends a GET request. You can pass any valid value for these parameters, and the URL is rebuilt every time you want to fetch something from the chart.

  2. After getting the response, the script parses the JSON data using json.loads.

  3. Finally, it iterates over every series in the chart and prints its attributes and values, e.g. title, type, color, last update, percentage. You can modify these attributes as needed.
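
Step 1 can also be done without manual string concatenation. This is a minimal sketch, assuming the same endpoint and parameter names as the script that follows; `build_chart_url` is a hypothetical helper name, not part of the original answer.

```python
from urllib.parse import urlencode

BASE_URL = 'https://apidatos.ree.es/en/datos/generacion/estructura-generacion'

def build_chart_url(start_date, end_date, time_trunc='day',
                    cached='true', system_electric='nacional'):
    """Hypothetical helper: build the query URL from its parameters."""
    params = {
        'start_date': start_date,
        'end_date': end_date,
        'time_trunc': time_trunc,
        'cached': cached,
        'systemElectric': system_electric,
    }
    # safe=':' leaves the colon in timestamps such as 2020-10-22T00:00 unescaped
    return BASE_URL + '?' + urlencode(params, safe=':')

url = build_chart_url('2020-10-22T00:00', '2020-10-29T23:59')
print(url)
```

Changing the date range or `time_trunc` then only means changing the arguments, and the query string is encoded for you.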

    import json
    import requests
    from urllib3.exceptions import InsecureRequestWarning

    # Silence the warning that verify=False would otherwise trigger
    requests.packages.urllib3.disable_warnings(InsecureRequestWarning)

    def scrape_chart_data():
        #### Dynamic parameters ####
        START_DATE = '2020-10-22T00:00'
        END_DATE = '2020-10-29T23:59'
        TIME_TRUNC = 'day'
        CACHED = 'true'
        SYSTEM_ELECTRIC = 'nacional'

        # Dynamic URL created from the parameters above
        URL = ('https://apidatos.ree.es/en/datos/generacion/estructura-generacion'
               '?start_date=' + START_DATE + '&end_date=' + END_DATE +
               '&time_trunc=' + TIME_TRUNC + '&cached=' + CACHED +
               '&systemElectric=' + SYSTEM_ELECTRIC)

        # GET request; verify=False skips TLS certificate verification
        response = requests.get(URL, verify=False)
        result = json.loads(response.text)  # parse the JSON data
        extracted_chart_data = result['included']  # one entry per chart series

        # Iterate over the series and print their attributes and values
        for series in extracted_chart_data:
            print('-' * 100)
            attributes = series['attributes']
            values = attributes['values']
            print('Type : ', attributes['type'])
            print('Title : ', attributes['title'])
            print('Color : ', attributes['color'])
            print('Last Update : ', attributes['last-update'])
            print('Magnitude : ', attributes['magnitude'])
            print('-' * 50 + ' Values of ' + attributes['title'] + ' ' + '-' * 50)
            for point in values:
                print('Date and Time : ', point['datetime'])
                print('Percentage : ', point['percentage'])
                print('Value : ', point['value'])
            print('-' * 100)

    scrape_chart_data()
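
If you would rather keep the data than print it, the same fields can be collected into flat rows. This is a rough sketch, assuming the response has the `included` / `attributes` / `values` layout that the script reads. `flatten_chart_data` and `rows_to_csv` are hypothetical helper names, and the sample payload is made up to show the shape; it is not real data from the site.

```python
import csv
import io

FIELDS = ['title', 'type', 'datetime', 'value', 'percentage']

def flatten_chart_data(result):
    """Hypothetical helper: one flat dict per data point of every series."""
    rows = []
    for series in result['included']:
        attributes = series['attributes']
        for point in attributes['values']:
            rows.append({
                'title': attributes['title'],
                'type': attributes['type'],
                'datetime': point['datetime'],
                'value': point['value'],
                'percentage': point['percentage'],
            })
    return rows

def rows_to_csv(rows):
    """Hypothetical helper: serialize the rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

# Made-up payload with the assumed shape; the numbers are placeholders.
sample = {
    'included': [
        {'attributes': {'title': 'Wind', 'type': 'Renewable', 'values': [
            {'datetime': '2020-10-22T00:00', 'value': 100.0, 'percentage': 0.25},
            {'datetime': '2020-10-23T00:00', 'value': 120.0, 'percentage': 0.30},
        ]}},
        {'attributes': {'title': 'Nuclear', 'type': 'Non-renewable', 'values': [
            {'datetime': '2020-10-22T00:00', 'value': 300.0, 'percentage': 0.75},
        ]}},
    ]
}

rows = flatten_chart_data(sample)
print(rows_to_csv(rows))
```

In the script above you would pass `result` instead of `sample`. The list of dicts can also be handed to other tools that accept row-shaped data.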
    
Vin
  • Thanks a lot, it is perfect. Is it possible to do the same with Power BI? Like the following link: https://www.terna.it/en/electric-system/transparency-report/renewable-generation – mxs Oct 29 '20 at 16:11
  • No, scraping Power BI is totally different: certain things are needed before you can fetch the data from BI dashboards. Tableau is more difficult. Power BI is achievable, but not as straightforward as this. – Vin Oct 29 '20 at 16:26
  • But can you help me with this link? I just need to extract the total actual generation value. https://www.terna.it/en/electric-system/transparency-report/actual-generation – mxs Nov 03 '20 at 12:22