3

I want to write a program that can use a particular website. I want it to be able to recognize fairly trivial things (text), click links, and submit forms.

I want the server logs to look no different from an actual user's activity, so I don't want to operate outside of a browser, as I normally would. I want things like JavaScript to run as expected on the page, so I don't want to just fake the user-agent being sent.

What should I be looking at to accomplish this? It would run on Windows. If I had to pick a single browser, it would be Chrome, with Firefox as my second choice. If it weren't much more complicated, I'd love to have it work with Chrome, Firefox, IE, and Edge, but picking just one is also OK.

I'm very familiar with C++, and would prefer to use that for this project. (Yes, I know other languages might be quicker development for someone familiar with them, but it's what I want to stick with.)

I need it to be able to also interact outside of the browser, with a database. I'm fine either having a browser add-on that is capable of interprocess communication to handle this, or having a fully external program that's able to effectively scrape the browser and create user-looking input.

user1902689
  • Does this answer your question? [Web automation from C++](https://stackoverflow.com/questions/17345551/web-automation-from-c) – ggorlen Nov 25 '20 at 15:35

2 Answers

2

Selenium seems like it would be a good fit for you. It's generally used for automated testing of webapps, but there's no reason it couldn't interact with any site. It can be used to drive any of the major browsers (not sure about Edge; it's been a while since I used Selenium) in a fully automated fashion.

Selenium has no official C++ bindings, but it does have official bindings for Java, C#, Python, Ruby, and JavaScript, plus community-maintained ones for languages such as PHP and Perl.
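Since the asker prefers C++: the browser drivers Selenium talks to (chromedriver, geckodriver, and so on) accept plain HTTP requests with JSON bodies, as described by the W3C WebDriver specification, so a C++ program can in principle speak to them without any bindings. Below is a minimal sketch of building those requests. The `WebDriverRequest` struct and the helper names are made up for illustration, the JSON escaping is deliberately simplistic, and actually sending the request is left to whichever HTTP client library you choose.

```cpp
#include <cassert>
#include <string>

// One HTTP request to a WebDriver server such as chromedriver.
// Hypothetical helper type; sending it is left to your HTTP client.
struct WebDriverRequest {
    std::string method;  // "GET" or "POST"
    std::string path;    // relative to the driver's base URL
    std::string body;    // JSON payload; empty for GET
};

// Escape the characters that would break a JSON string literal.
// Only quotes, backslashes and a few control characters are handled.
std::string jsonEscape(const std::string& s) {
    std::string out;
    for (char c : s) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            default:   out += c;
        }
    }
    return out;
}

// POST /session: ask the driver to start a browser.
WebDriverRequest newSession(const std::string& browserName) {
    return {"POST", "/session",
            "{\"capabilities\":{\"alwaysMatch\":{\"browserName\":\"" +
                jsonEscape(browserName) + "\"}}}"};
}

// POST /session/{id}/url: load a page.
WebDriverRequest navigateTo(const std::string& sessionId,
                            const std::string& url) {
    assert(!sessionId.empty());
    return {"POST", "/session/" + sessionId + "/url",
            "{\"url\":\"" + jsonEscape(url) + "\"}"};
}

// POST /session/{id}/element: locate an element by CSS selector.
WebDriverRequest findElement(const std::string& sessionId,
                             const std::string& cssSelector) {
    assert(!sessionId.empty());
    return {"POST", "/session/" + sessionId + "/element",
            "{\"using\":\"css selector\",\"value\":\"" +
                jsonEscape(cssSelector) + "\"}"};
}

// POST /session/{id}/element/{element id}/click: click a link or button.
WebDriverRequest clickElement(const std::string& sessionId,
                              const std::string& elementId) {
    assert(!sessionId.empty() && !elementId.empty());
    return {"POST",
            "/session/" + sessionId + "/element/" + elementId + "/click",
            "{}"};
}

// GET /session/{id}/element/{element id}/text: read the visible text.
WebDriverRequest elementText(const std::string& sessionId,
                             const std::string& elementId) {
    assert(!sessionId.empty() && !elementId.empty());
    return {"GET",
            "/session/" + sessionId + "/element/" + elementId + "/text",
            ""};
}
```

The session id and element id come back in the JSON responses to `newSession` and `findElement`, so a real client also needs a JSON parser. Because the driver controls a real browser, page scripts run as they would for a person using it.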

Miles Budnek
0

Because you're using Windows, Microsoft provides APIs that simulate mouse clicks, mouse movement, and keyboard input. For the mouse:

    VOID WINAPI mouse_event(
      _In_ DWORD     dwFlags,
      _In_ DWORD     dx,
      _In_ DWORD     dy,
      _In_ DWORD     dwData,
      _In_ ULONG_PTR dwExtraInfo
    );

And for the keyboard, likewise exported from User32.dll and declared in winuser.h:

    VOID WINAPI keybd_event(
      _In_ BYTE      bVk,
      _In_ BYTE      bScan,
      _In_ DWORD     dwFlags,
      _In_ ULONG_PTR dwExtraInfo
    );

Both of these are obsolete, but their replacements work in essentially the same way while adding a security-level mechanism.
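As a rough illustration of the replacement API, here is a sketch of a key tap sent through `SendInput`, the function Microsoft documents as superseding `keybd_event` and `mouse_event`. The `KeyEvent` struct and the `makeKeyTap` and `sendKeyEvents` helpers are hypothetical names invented for this example. The Windows-specific part is guarded by `_WIN32` so the event-building logic can be compiled anywhere. Whether synthesized input is the right tool at all is debated in the comments on this answer.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>
#ifdef _WIN32
#include <windows.h>
#endif

// Same value as KEYEVENTF_KEYUP in winuser.h, repeated here so the
// portable part of the sketch does not need the Windows headers.
constexpr std::uint32_t kKeyUp = 0x0002;

// Platform-neutral description of one synthetic key event
// (hypothetical helper type, not part of the Windows API).
struct KeyEvent {
    std::uint16_t virtualKey;  // e.g. 0x0D for VK_RETURN
    std::uint32_t flags;       // 0 = key press, kKeyUp = key release
};

// A key tap is a press followed by a release of the same virtual key.
std::vector<KeyEvent> makeKeyTap(std::uint16_t virtualKey) {
    return {{virtualKey, 0}, {virtualKey, kKeyUp}};
}

#ifdef _WIN32
// Hands all events to the system in a single SendInput call.
// Returns true when every event was accepted.
bool sendKeyEvents(const std::vector<KeyEvent>& events) {
    if (events.empty()) return true;  // nothing to send

    std::vector<INPUT> inputs(events.size());  // zero-initialized
    for (std::size_t i = 0; i < events.size(); ++i) {
        inputs[i].type = INPUT_KEYBOARD;
        inputs[i].ki.wVk = events[i].virtualKey;
        inputs[i].ki.dwFlags = events[i].flags;
    }
    const UINT sent = SendInput(static_cast<UINT>(inputs.size()),
                                inputs.data(), sizeof(INPUT));
    return sent == inputs.size();
}
#endif
```

On Windows, `sendKeyEvents(makeKeyTap(0x0D))` would tap Enter in whichever window currently has keyboard focus, so the browser must be in the foreground and focused on the right control before the call.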

marshal craft
  • So much uninformed opinion, it hurts. The replacement for those API calls is [SendInput](https://msdn.microsoft.com/en-us/library/windows/desktop/ms646310.aspx). It differs from `mouse_event`/`keybd_event` with respect to reliability, not security. Worse, still, faking user input is the wrong approach altogether. [UI Automation](https://msdn.microsoft.com/en-us/library/windows/desktop/ee684009.aspx) is the correct way to automate UIs. Unless you are dealing with Internet Explorer. IE provides its own [automation interface](https://msdn.microsoft.com/en-us/library/hh995096.aspx). – IInspectable May 21 '16 at 12:39
  • Funny, please point out the opinions. `mouse_event` and `keybd_event` aren't opinions; they're APIs from Microsoft that send authentic user input, as the asker requested. Please provide documentation of the reliability improvements; I did not read about that in the MSDN documentation. These APIs do provide authentic user input that is not distinguishable to the server. Also, per MSDN: 'This function is subject to UIPI. Applications are permitted to inject input only into applications that are at an equal or lesser integrity level.' I do not find such statements about the obsolete APIs. – marshal craft May 21 '16 at 17:12
  • [Reliability improvements](https://msdn.microsoft.com/en-us/library/windows/desktop/ms646310.aspx): *"These events are not interspersed with other keyboard or mouse input events inserted either by the user (with the keyboard or mouse) or by calls to keybd_event, mouse_event, or other calls to SendInput."* The input goes into the same buffer, that `keybd_event` writes to. Input synthesized through `keybd_event` is equally subject to UIPI. Synthesized input is distinguishable from authentic input, using a low-level mouse/keyboard hook, for example. – IInspectable May 21 '16 at 17:22