1

I am trying to scrape data from https://www.eex.com/en/market-data/power/futures for german futures and even though I specified xpath it returns data for Austrian prices as its default when loading the page. I already tried with a more specific URL ( https://www.eex.com/en/market-data/power/futures#%7B%22snippetpicker%22%3A%2228%22%7D) but then it returns nothing.

code:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options

options = Options()
driver = webdriver.Chrome(options=options)

def main():
    driver.get('https://www.eex.com/en/market-data/power/futures')
    
    tbody = driver.find_element(By.XPATH, '//*[@id="baseloadwidget_pfpde"]/table/tbody')
    
    data = []
    
    for tr in tbody.find_elements(By.XPATH, '//tr'):
        row = [item.text for item in tr.find_elements(By.XPATH,'.//td')]
        data.append(row)
    print(data)
    
if __name__ == '__main__':
    main()

another question I have. I would need data for D0 and D-1, D-2, D-3. any idea how to achieve that?

I already tried a more specific URL and improving XPATH but it didn't work.

Ajeet Verma
  • 2,938
  • 3
  • 13
  • 24

2 Answers2

0

Perhaps you can try using their Ajax API (the data is loaded from there using JavaScript). You can simulate this in Python:

import requests
import pandas as pd


api_url = "https://webservice-eex.gvsi.com/query/json/getChain/gv.pricesymbol/gv.displaydate/gv.expirationdate/tradedatetimegmt/gv.eexdeliverystart/ontradeprice/close/onexchsingletradevolume/onexchtradevolumeeex/offexchtradevolumeeex/openinterest/"
params = {"optionroot": '"/E.DEBY"', "expirationdate": "2023/08/01"}

headers = {
    "Referer": "https://www.eex.com/",
}

base_table_data = requests.get(api_url, params=params, headers=headers).json()
df_base = pd.DataFrame(base_table_data["results"]["items"])
print(df_base)


params = {"optionroot": '"/E.DEPY"', "expirationdate": "2023/08/01"}
peak_table_data = requests.get(api_url, params=params, headers=headers).json()
df_peak = pd.DataFrame(peak_table_data["results"]["items"])
print()
print(df_peak)

Prints:

  gv.pricesymbol gv.displaydate gv.expirationdate      tradedatetimegmt   gv.eexdeliverystart  ontradeprice   close  onexchsingletradevolume  onexchtradevolumeeex  offexchtradevolumeeex  openinterest
0     /E.DEBYF24       1/1/2024        12/27/2023   8/2/2023 3:59:56 PM  1/1/2024 11:00:00 PM        134.00  134.52                   8784.0             2714256.0              2643984.0       72459.0
1     /E.DEBYF25       1/1/2025        12/27/2024   8/2/2023 3:14:51 PM  1/1/2025 11:00:00 PM        124.75  124.67                   8760.0              604440.0               289080.0       17377.0
2     /E.DEBYF26       1/1/2026        12/29/2025   8/2/2023 3:16:32 PM  1/1/2026 11:00:00 PM        106.00  105.59                   8760.0               87600.0               350400.0        4072.0
3     /E.DEBYF27       1/1/2027        12/29/2026  8/2/2023 10:24:43 AM  1/1/2027 11:00:00 PM         90.25   90.23                   8760.0               17520.0               113880.0         787.0
4     /E.DEBYF28       1/1/2028        12/29/2027                  None  1/1/2028 11:00:00 PM           NaN   84.18                      NaN                   NaN                    NaN         111.0
5     /E.DEBYF29       1/1/2029        12/27/2028                  None  1/1/2029 11:00:00 PM           NaN   82.65                      NaN                   NaN                    NaN          13.0
6     /E.DEBYF30       1/1/2030        12/27/2029                  None  1/1/2030 11:00:00 PM           NaN   83.11                      NaN                   NaN                    NaN           7.0
7     /E.DEBYF31       1/1/2031        12/27/2030                  None  1/1/2031 11:00:00 PM           NaN   82.93                      NaN                   NaN                    NaN           2.0
8     /E.DEBYF32       1/1/2032        12/29/2031                  None  1/1/2032 11:00:00 PM           NaN   82.78                      NaN                   NaN                    NaN           2.0
9     /E.DEBYF33       1/1/2033        12/29/2032                  None  1/1/2033 11:00:00 PM           NaN   81.93                      NaN                   NaN                    NaN           NaN

  gv.pricesymbol gv.displaydate gv.expirationdate     tradedatetimegmt   gv.eexdeliverystart  ontradeprice   close  onexchsingletradevolume  onexchtradevolumeeex  offexchtradevolumeeex  openinterest
0     /E.DEPYF24       1/1/2024        12/27/2023  8/2/2023 3:20:02 PM  1/1/2024 11:00:00 PM        159.35  159.26                   3144.0               37728.0                56592.0        6776.0
1     /E.DEPYF25       1/1/2025        12/27/2024  8/2/2023 2:35:11 PM  1/1/2025 11:00:00 PM        149.25  149.38                   3132.0               31320.0                34452.0        2075.0
2     /E.DEPYF26       1/1/2026        12/29/2025  8/2/2023 3:17:30 PM  1/1/2026 11:00:00 PM           NaN  129.58                      NaN                   NaN                 3132.0         260.0
3     /E.DEPYF27       1/1/2027        12/29/2026                 None  1/1/2027 11:00:00 PM           NaN  109.42                      NaN                   NaN                    NaN           8.0
4     /E.DEPYF28       1/1/2028        12/29/2027                 None  1/3/2028 11:00:00 PM           NaN  104.42                      NaN                   NaN                    NaN           NaN
5     /E.DEPYF29       1/1/2029        12/27/2028                 None  1/1/2029 11:00:00 PM           NaN  100.75                      NaN                   NaN                    NaN           NaN

EDIT: If you need daily data:

params = {"optionroot": '"/E.DB_DAILY"', "expirationdate": "2023/08/01"}
base_daily_data = requests.get(api_url, params=params, headers=headers).json()
df_daily = pd.DataFrame(base_daily_data["results"]["items"])
print()
print(df_daily)

Prints:

   gv.pricesymbol gv.displaydate gv.expirationdate      tradedatetimegmt    gv.eexdeliverystart  ontradeprice  close  onexchsingletradevolume  onexchtradevolumeeex  offexchtradevolumeeex  openinterest
0      /E.DB02Q23       8/2/2023          8/2/2023                  None   8/2/2023 10:00:00 PM           NaN  68.29                      NaN                   NaN                    NaN        5481.0
1      /E.DB03Q23       8/3/2023          8/3/2023  8/2/2023 10:19:26 AM   8/3/2023 10:00:00 PM          44.0  45.41                     48.0               20112.0                48456.0        4740.0
2      /E.DB04Q23       8/4/2023          8/4/2023   8/2/2023 3:59:03 PM   8/4/2023 10:00:00 PM          90.5  90.95                    120.0                3216.0                 7200.0         569.0
3      /E.DB05Q23       8/5/2023          8/7/2023   8/2/2023 2:43:15 PM   8/5/2023 10:00:00 PM           NaN  74.78                      NaN                   NaN                 1200.0          50.0
4      /E.DB06Q23       8/6/2023          8/7/2023   8/2/2023 2:58:37 PM   8/6/2023 10:00:00 PM           NaN  38.58                      NaN                   NaN                  600.0         125.0
5      /E.DB07Q23       8/7/2023          8/7/2023                  None   8/7/2023 10:00:00 PM           NaN  50.28                      NaN                   NaN                    NaN           NaN
6      /E.DB08Q23       8/8/2023          8/8/2023                  None   8/8/2023 10:00:00 PM           NaN  69.40                      NaN                   NaN                    NaN           NaN
7      /E.DB09Q23       8/9/2023          8/9/2023                  None   8/9/2023 10:00:00 PM           NaN    NaN                      NaN                   NaN                    NaN           NaN
8      /E.DB10Q23      8/10/2023         8/10/2023                  None  8/10/2023 10:00:00 PM           NaN    NaN                      NaN                   NaN                    NaN           NaN
9      /E.DB11Q23      8/11/2023         8/11/2023                  None  8/11/2023 10:00:00 PM           NaN    NaN                      NaN                   NaN                    NaN           NaN
10     /E.DB12Q23      8/12/2023         8/14/2023                  None  8/12/2023 10:00:00 PM           NaN    NaN                      NaN                   NaN                    NaN           NaN
11     /E.DB13Q23      8/13/2023         8/14/2023                  None  8/13/2023 10:00:00 PM           NaN    NaN                      NaN                   NaN                    NaN           NaN
12     /E.DB22U23      9/22/2023         9/22/2023                  None  9/22/2023 10:00:00 PM           NaN    NaN                      NaN                   NaN                    NaN           NaN
13     /E.DB23U23      9/23/2023         9/25/2023                  None  9/23/2023 10:00:00 PM           NaN    NaN                      NaN                   NaN                    NaN           NaN
14     /E.DB24U23      9/24/2023         9/25/2023                  None  9/24/2023 10:00:00 PM           NaN    NaN                      NaN                   NaN                    NaN           NaN
15     /E.DB25U23      9/25/2023         9/25/2023                  None  9/25/2023 10:00:00 PM           NaN    NaN                      NaN                   NaN                    NaN           NaN
16     /E.DB26U23      9/26/2023         9/26/2023                  None  9/26/2023 10:00:00 PM           NaN    NaN                      NaN                   NaN                    NaN           NaN
17     /E.DB27U23      9/27/2023         9/27/2023                  None  9/27/2023 10:00:00 PM           NaN    NaN                      NaN                   NaN                    NaN           NaN
18     /E.DB28U23      9/28/2023         9/28/2023                  None  9/28/2023 10:00:00 PM           NaN    NaN                      NaN                   NaN                    NaN           NaN
Andrej Kesely
  • 168,389
  • 15
  • 48
  • 91
  • worked like a charm, thank you so much. I have one more question. How can I add tradedatetime as one of the params? is it possible to use this param to query three different dates on this page? – Nikolina9113 Aug 02 '23 at 19:19
  • @Nikolina9113 If I understand you correctly, you can always do 3 different requests (each with different time) – Andrej Kesely Aug 02 '23 at 19:22
  • yes, but if I change the expirationdate param, i get the same data, since expiration date in general is the expiry date of the product. So for cal24 expiration date is 1/1/2024 and it's not the same as the trade date. – Nikolina9113 Aug 02 '23 at 19:33
  • @Nikolina9113 I don't understand... if I change the `params = {"optionroot": '"/E.DEPY"', "expirationdate": "2022/08/01"}` I get row for year `23` for example (With expiration data `"2023/08/01"` I didn't receive it) – Andrej Kesely Aug 02 '23 at 19:36
  • so the table represents auction prices for electricity and they happen every day. The yearly products you can buy in 2023 are Cal24 (whole year of electricity for 2024), which expires on 1/1/2024, Cal25 (expires 1/1/2025). Prices change daily so I would need settlement prices for 4 consecutive days. That is why I would need one of the params to be tradedatetimegmt – Nikolina9113 Aug 02 '23 at 19:47
  • @Nikolina9113 I've edited the answer and put the params to get daily data – Andrej Kesely Aug 02 '23 at 19:52
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/254765/discussion-between-nikolina9113-and-andrej-kesely). – Nikolina9113 Aug 02 '23 at 19:55
0

To scrape data for german futures from the website you need to select the option with text EEX German Power Futures from the dropdown and then inducing WebDriverWait for the visibility_of_element_located() for the <table> element and using DataFrame from Pandas you can use the following locator strategy:

Code Block:

driver.get(url='https://www.eex.com/en/market-data/power/futures')
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input[value='I accept all cookies.']"))).click()
time.sleep(3)
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button.btn.dropdown-toggle.form.input-select div.filter-option-inner"))).click()
WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[@class='dropdown-menu show']//li/a[@class='dropdown-item']/span[contains(., 'EEX German Power Futures')]"))).click()
table_data = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div#baseloadwidget_pfpde > table.mv-quote"))).get_attribute("outerHTML")
df = pd.read_html(table_data)
print(df)
driver.quit()

Console Output:

[         0           1            2                 3                4                          5              6   7
0   Cal-24      134.00         8784            134.52          2714256                    2643984          72459 NaN
1   Cal-25      124.75         8760            124.67           604440                     289080          17377 NaN
2   Cal-26      106.00         8760            105.59            87600                     350400           4072 NaN
3   Cal-27       90.25         8760             90.23            17520                     113880            787 NaN
4   Cal-28           -            -             84.18                -                          -            111 NaN
5   Cal-29           -            -             82.65                -                          -             13 NaN
6   Cal-30           -            -             83.11                -                          -              7 NaN
7   Cal-31           -            -             82.93                -                          -              2 NaN
8   Cal-32           -            -             82.78                -                          -              2 NaN
9   Cal-33           -            -             81.93                -                          -              - NaN
10  Future  Last Price  Last Volume  Settlement Price  Volume Exchange  Volume Trade Registration  Open Interest NaN
11     NaN         NaN          NaN               NaN              NaN                        NaN            NaN NaN]
undetected Selenium
  • 183,867
  • 41
  • 278
  • 352
  • yes, also works. but is there also a way to get data for different selected dates in the dropdown menu? I would need this to be an input in my code. so for example I would like to get data for 2/8/2023, 31/7/2023... – Nikolina9113 Aug 03 '23 at 07:40
  • @Nikolina9113 yes, that can also be done easily. Feel free to raise a new ticket with all the relevant details. – undetected Selenium Aug 03 '23 at 08:55
  • @Nikolina9113 Glad to be able to help you. [What should I do when someone answers my question?](https://stackoverflow.com/help/someone-answers) – undetected Selenium Aug 03 '23 at 20:15