
I am writing a small app that will, among other things, expand shortcuts into full text while typing. Example: the user types "BNN" somewhere and presses the relevant key combination, and the app replaces "BNN" with "Hi I am Banana".
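To make the intended behaviour concrete, here is a minimal sketch of the expansion step on its own, assuming the text and caret position of the focused control have already been obtained somehow. The class name `ShortcutExpander`, the method `Expand`, and the idea of treating whitespace as the shortcut boundary are assumptions for illustration; only the "BNN" entry comes from the example above.

```csharp
using System;
using System.Collections.Generic;

public static class ShortcutExpander
{
    // Hypothetical shortcut table; only the "BNN" entry comes from the question.
    private static readonly Dictionary<string, string> Shortcuts =
        new Dictionary<string, string>
        {
            { "BNN", "Hi I am Banana" }
        };

    // If the word that ends at caretIndex is a known shortcut, returns the text
    // with that word replaced and reports where the caret should end up.
    // Otherwise returns the text unchanged.
    public static string Expand(string text, int caretIndex, out int newCaretIndex)
    {
        if (text == null)
            throw new ArgumentNullException(nameof(text));
        if (caretIndex < 0 || caretIndex > text.Length)
            throw new ArgumentOutOfRangeException(nameof(caretIndex));

        // Walk back from the caret to the previous whitespace (assumed boundary).
        int start = caretIndex;
        while (start > 0 && !char.IsWhiteSpace(text[start - 1]))
            start--;

        string word = text.Substring(start, caretIndex - start);
        string expansion;
        if (word.Length > 0 && Shortcuts.TryGetValue(word, out expansion))
        {
            newCaretIndex = start + expansion.Length;
            return text.Substring(0, start) + expansion + text.Substring(caretIndex);
        }

        newCaretIndex = caretIndex;
        return text;
    }
}
```

Calling `ShortcutExpander.Expand("Hello BNN there", 9, out caret)` would replace the "BNN" just before the caret and move the caret to the end of the inserted text. Reading the text from the target control and writing the result back are the hard parts, and this sketch does not cover them.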

After some research I learned that it can be done using user32.dll, and that the process is as follows:

1) get the active window handle
2) get the thread ID of the active window
3) attach input to active thread
4) get focused control handle (+caret position but that is not the issue)
5) detach input from active thread
6) get the text from the focused control using its handle

Here is my code so far:

try
{
    IntPtr activeWindowHandle = GetForegroundWindow();

    // Thread IDs are plain 32-bit values, not handles, so they are stored as uint.
    uint activeWindowThread = GetWindowThreadProcessId(activeWindowHandle, IntPtr.Zero);
    uint thisWindowThread = GetWindowThreadProcessId(this.Handle, IntPtr.Zero);

    // Share input state with the foreground thread so that GetFocus() can see its focused control.
    AttachThreadInput(activeWindowThread, thisWindowThread, true);
    IntPtr focusedControlHandle = GetFocus();

    AttachThreadInput(activeWindowThread, thisWindowThread, false);
    if (focusedControlHandle != IntPtr.Zero)
    {
        // GetText (not shown here) reads the text of the control with the given handle.
        TB_Output.Text += focusedControlHandle + " , " + GetText(focusedControlHandle) + Environment.NewLine;
    }
}
catch (Exception exp)
{
    MessageBox.Show(exp.Message);
}

//...
//...

[DllImport("user32.dll", CharSet = CharSet.Auto, ExactSpelling = true)]
internal static extern IntPtr GetForegroundWindow();

// Passing IntPtr.Zero for lpdwProcessId means "do not return the process ID".
[DllImport("user32.dll", CharSet = CharSet.Auto, SetLastError = true)]
internal static extern uint GetWindowThreadProcessId(IntPtr hWnd, IntPtr lpdwProcessId);

[DllImport("user32.dll", CharSet = CharSet.Ansi, SetLastError = true, ExactSpelling = true)]
internal static extern bool AttachThreadInput(uint idAttach, uint idAttachTo, bool fAttach);

[DllImport("user32.dll", CharSet = CharSet.Auto, ExactSpelling = true)]
internal static extern IntPtr GetFocus();

This works perfectly for some Windows Forms apps, but it doesn't work with WPF apps or browsers: it just gives me the title of the WPF app or the title of the tab in Chrome.

If I run the app on this page while typing this question, for instance, then instead of the content of the question the text I get is:

Get text from inside google chrome using my c# app - Stack Overflow - Google

This is probably because they use graphics to render the elements, and I'm not sure how I can get to the active element and read its text.

I only referred to web browsers in the question's title because this tool will be mostly used with web browsers.

Thank you in advance for any feedback.

Banana
  • Not sure if it is the best approach; I would go with a Chrome extension: https://developer.chrome.com/extensions/devguide It is doable IMHO, but hooking into the web browser could trigger AV software like hell. – Cleptus Apr 24 '18 at 13:47
  • @bradbury9 I considered making an extension, but it causes too many problems. The main one is that this tool will be used mostly with Chrome but not only with Chrome, so I can't restrict it to a Chrome extension, or to any other browser extension. It is also easier to maintain and update as an app if I install it for my whole company... – Banana Apr 24 '18 at 13:50
  • @bradbury9 Arranging an exception in our overly protective antivirus is not a problem. – Banana Apr 24 '18 at 13:51
  • If you want to do that in web browsers and WPF apps, you will have to create a keylogger that constantly monitors the keyboard and replaces the text by simulating keyboard input. WPF controls have no Windows handles, so WinAPI is useless for them. The same goes for the controls rendered in web browsers. – dymanoid May 29 '18 at 16:49
  • @dymanoid Thanks for the input. Technically my app already is a keylogger, as it monitors for the key combination that triggers the expansion. I am unfortunately aware that browser and WPF window controls have no handles (since they are technically graphical objects), but maybe there is a creative way of achieving this? Spell checkers manage to do it somehow, so why can't we? – Banana May 29 '18 at 16:54
  • Maybe try to create a google chrome extension for this purpose. Hope it helps! – vCillusion Jun 02 '18 at 21:12
  • As @dymanoid suggested, for WinForms and WPF apps we can try creating a keylogger and monitoring the keyboard. It will handle the web browser case as well. – vCillusion Jun 02 '18 at 21:13
  • @Banana Please share which spell checkers do it for all browsers as well as WPF and WinForms. – vCillusion Jun 02 '18 at 21:14
  • Major browsers now support WebExtensions, so you can develop an extension for, let's say, Chrome first, and then port it to Mozilla, Opera or IE quite easily. – m.qayyum Jun 05 '18 at 15:18

2 Answers


I would personally try a library that works well with Chrome. There are many available, such as Kantu, which is specialized for Chrome.

Examples: TestCafe, Watir, SlimerJS

Alex Skorkin
foyss

I don't think that library is the optimal way to do what you want. I would use a library better suited to browser DOM manipulation, such as Selenium.
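As a rough sketch of what that could look like in C#, the snippet below reads the value of whichever element has focus. It assumes the Selenium.WebDriver NuGet package (version 4 or later) and a matching ChromeDriver are installed. The URL is a placeholder and the class name `ActiveElementDemo` is made up for illustration. Note that Selenium drives a browser instance it started itself, so this may not fit the case where the user is typing in their own, already running Chrome.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

public static class ActiveElementDemo
{
    public static void Main()
    {
        // Selenium launches and controls its own Chrome window here.
        using (IWebDriver driver = new ChromeDriver())
        {
            // Placeholder URL: replace with a page that contains a text field.
            driver.Navigate().GoToUrl("https://example.com/");

            Console.WriteLine("Click into a text field in the browser, then press Enter here.");
            Console.ReadLine();

            // The element that currently has keyboard focus in the page.
            IWebElement focused = driver.SwitchTo().ActiveElement();

            // For <input> and <textarea>, the typed text lives in the "value" property.
            string value = focused.GetDomProperty("value");
            Console.WriteLine("Focused <" + focused.TagName + "> contains: " + value);
        }
    }
}
```

For elements that are not form fields (for example a `contenteditable` div), the `value` property will most likely be empty, and `focused.Text` would be the thing to inspect instead.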

Dan Csharpster