I'm trying to build a lightweight Text-to-Speech GUI where I can select text in, say, Word or Chrome, and then press a button in my GUI and have it read the selection aloud.
I've already figured out all the pieces to create the GUI and get the TTS to work, but I can't get the form factor right. I'm trying to mimic the form factor of Dragon Naturally Speaking's text-to-speech because, well, it's simple and it's what I'm used to.
Here are the missing steps in the user story I can't get to work, in order:
1) The user highlights text in an application (Word, Chrome, Notepad, whatever) with the mouse and presses the GUI button.
2) Data from the external application is pulled in as UTF-8 and stored in a variable called "text".
I know there's a problem in that several windows can have selected text at once. My solution is to pull the selected text from the window that was active most recently before the GUI button was pressed.
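To make the "most recent window" idea concrete, here is a minimal sketch of how it might be tracked: poll the foreground window and remember the last handle that does not belong to the GUI itself. `LastWindowTracker`, `get_foreground` and `own_handles` are hypothetical names. On Windows, `get_foreground` would presumably be backed by `win32gui.GetForegroundWindow`, but that is an assumption; the demonstration below uses fake handles instead.

```python
class LastWindowTracker:
    """Remember the most recent foreground window that is not our own GUI."""

    def __init__(self, get_foreground, own_handles=()):
        # get_foreground: zero-argument callable returning a window handle
        # (assumed to be something like win32gui.GetForegroundWindow).
        self._get_foreground = get_foreground
        self._own = set(own_handles)
        self.last_external = None

    def poll(self):
        """Call periodically, e.g. from a GUI timer; returns the remembered handle."""
        handle = self._get_foreground()
        # Skip "no window" (0 or None) and windows that belong to the GUI.
        if handle and handle not in self._own:
            self.last_external = handle
        return self.last_external


# Demonstration with fake handles: 100 is the GUI, 7 and 9 are other apps.
_sequence = iter([7, 9, 100, 0])
tracker = LastWindowTracker(lambda: next(_sequence), own_handles={100})
history = [tracker.poll() for _ in range(4)]
```

When the button is pressed, `tracker.last_external` would then identify the window to pull the selection from, even though the GUI itself has taken focus by that point.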
Right now the kludgy workaround is to press Ctrl-C on whatever text I want read and then press the button, because I can pull the data from the clipboard, but this is a terrible and confusing user experience. I tried using pyperclip to have the button itself put the selected text on the clipboard, but it doesn't seem to work, so I'm not sure whether the clipboard idea is a dead end.
    def select_text(self):
        # copy
        pyperclip.copy()  # doesn't work
        # get text
        win32clipboard.OpenClipboard()
        text = win32clipboard.GetClipboardData()
        win32clipboard.CloseClipboard()
        # say it!
        self.say_text(text)
I can't seem to find anything like this anywhere, and I have no idea where to start. Any help would be appreciated.