I wish to create a script that can interact with any webpage with as little delay as possible, excepting network delay. This is to be used for arbitrage trading so any tips in this direction are welcome.
My current approach has been to use Selenium with Python as I am a beginner in web scraping. However, I have investigated some approaches and from my findings it seems that Selenium has a considerable delay (See benchmark results below) I mention other things besides it in this question but the focus is on Selenium.
In general, I need to send a BUY or SELL request to whatever broker I am currently working with. Using the web browser client, I have found a few approaches to do this:
1. Actually clicking the button
a. Use Selenium to Click the button with the corresponding request
b. Use Puppeteer to Click
c. Use Pynput or other direct mouse input manipulation
2. Reverse-engineering the request and sending it directly without clicking any buttons
Now, the delay in sending this request needs to be minimized as much as possible. Network delay is out of our control for the purpose of this optimization.
I have benchmarked the approaches for 1.
by opening a page in the standard way you would, either with puppeteer and selenium then waiting for a few seconds. While the script waits, I injected into the browser the following code:
$x('//*[@id="id_demo_webfront"]/iframe')[0].contentDocument.body.addEventListener('click', (data => console.log(new Date().getTime())), true);
The purpose of this code is to log in console the current time when the click is registered by the browser.
Then, I log in my python(Selenium, pynput
)/javascript(Puppeteer
) script the current time right before issuing the click. I am running on Ubunutu 18.04 as opposed to Windows so my system clock should have good resolution. Additional reading about this here and here.
Actual results, all were run on the same webpage multiple times, for roughly 10 clicks each run:
1. a. ~80ms
1. b. ~10-30ms
1. c. ~5-10ms
For 2.
, I haven't reliably tested the delay. The idea behind this approach is to inject a function that when fired will send a request that exactly resembles a request that would be sent when the button is clicked. I have not tested the delay between issuing the command to run such an injected function and it actually being run, however I expect this approach to be even faster, basically creating my own client to interact with whatever API the broker has on the backend.
From the results, it is pretty clear that issuing mouse commands seems to be the quickest, but it also is the hardest for me to implement reliably. Also, I seem to find puppeteer running faster across the board, but I prefer selenium in python for ease of development and was wondering if there are any tips and ideas to speed up those delays I am experiencing.
Summary:
Why does Selenium with Python have such a delay to issue commands and can it be improved?
Why does Puppeteer seem to have a lower delay when interacting with the same browser?
Some code snippets of my approach:
class DMMPageInterface:
# These are part of the interface script, self.page is the webpage after initialization
def __init__(self, config):
self.bid_price = self.page.get_element('xpath goes here')
...
# Mostly only the speed of this operation matters, sending the trade request
def bid_click(self):
logger.debug("Clicking bid")
if USE_MOUSE:
mouse.position = (720,390)
mouse.click(Button.left, 1)
else:
self.bid_price.click()