How can I get selected text in pdf in Javascript?

Question

I'm writing a Chrome extension to manipulate PDF files, so I want to get the text selected in the PDF. How can I do that?

Something like this:

maybe this will help you: https://stackoverflow.com/questions/5379120/get-the-highlighted-selected-text — GoranLegenda, Apr 07 '20 at 09:03
It's possible, because some chrome extensions can definitely do this. For example, the Google Scholar extension. — MaudPieTheRocktorate, Jan 18 '22 at 00:14

wOxxOm · Accepted Answer · 2021-10-17T10:08:08.370

1

You can use the internal undocumented commands of the built-in PDF viewer.

Here's an example of a content script:

function getPdfSelectedText() {
  return new Promise(resolve => {
    window.addEventListener('message', function onMessage(e) {
      if (e.origin === 'chrome-extension://mhjfbmdgcfjbbpaeojofohoefgiehjai' &&
          e.data && e.data.type === 'getSelectedTextReply') {
        window.removeEventListener('message', onMessage);
        resolve(e.data.selectedText);
      }
    });
    // runs code in page context to access postMessage of the embedded plugin
    const script = document.createElement('script');
    if (chrome.runtime.getManifest().manifest_version > 2) {
      script.src = chrome.runtime.getURL('query-pdf.js');
    } else {
      script.textContent = `(${() => {
        document.querySelector('embed').postMessage({type: 'getSelectedText'}, '*');
      }})()`;
    }
    document.documentElement.appendChild(script);
    script.remove();
  });
}

chrome.runtime.onMessage.addListener((msg, sender, sendResponse) => {
  if (msg === 'getPdfSelection') {
    getPdfSelectedText().then(sendResponse);
    return true;
  }
});
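
One caveat with the content script above: if the page has no embed element, or the viewer never replies, the promise returned by getPdfSelectedText stays pending and sendResponse is never called. Below is a minimal sketch of a timeout guard. The helper name withTimeout, its fallback parameter, and the 1000 ms value in the note after it are assumptions for illustration, not part of the viewer's API.

```javascript
// Hypothetical helper (not part of the original answer): settles with `fallback`
// if `promise` has not settled within `ms` milliseconds.
function withTimeout(promise, ms, fallback = '') {
  let timer;
  const timeout = new Promise(resolve => {
    timer = setTimeout(() => resolve(fallback), ms);
  });
  // Whichever settles first wins; the timer is cleared in either case.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

In the onMessage listener this could be used as withTimeout(getPdfSelectedText(), 1000).then(sendResponse). The sketch does not remove the pending 'message' listener when the timeout fires, so that cleanup would still need to be added.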

This example assumes you send a message from the popup or background script:

chrome.tabs.query({active: true, currentWindow: true}, ([tab]) => {
  chrome.tabs.sendMessage(tab.id, 'getPdfSelection', sel => {
    // do something
  });
});
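
If promises are more convenient than nested callbacks, the sender side can be wrapped as sketched below. requestPdfSelection and its api parameter are hypothetical names. The parameter defaults to the global chrome object and exists only so the function can be exercised with a stub.

```javascript
// Hypothetical wrapper (not part of the original answer) around the
// chrome.tabs.query + chrome.tabs.sendMessage calls shown above.
function requestPdfSelection(api = chrome) {
  return new Promise((resolve, reject) => {
    api.tabs.query({active: true, currentWindow: true}, ([tab]) => {
      if (!tab) {
        reject(new Error('No active tab'));
        return;
      }
      api.tabs.sendMessage(tab.id, 'getPdfSelection', sel => {
        // runtime.lastError is set when nothing in the tab received the message,
        // for example when the content script is not injected there.
        const err = api.runtime.lastError;
        if (err) reject(new Error(err.message));
        else resolve(sel);
      });
    });
  });
}
```

From a popup or background script it could then be called as requestPdfSelection().then(sel => { /* do something */ }).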

See also How to open the correct devtools console to see output from an extension script?

ManifestV3 extensions also need this:

manifest.json should expose query-pdf.js

  "web_accessible_resources": [{
    "resources": ["query-pdf.js"],
    "matches": ["<all_urls>"],
    "use_dynamic_url": true
  }]

query-pdf.js

document.querySelector('embed').postMessage({type: 'getSelectedText'}, '*')

edited Oct 17 '21 at 10:08

answered Apr 07 '20 at 09:34

wOxxOm


This did not work for me. The message listener did not intercept any events from the pdf viewer, unfortunately. – Yao Oct 17 '21 at 00:16
@AlexZhong, this is still working so if you can post a new question with an [MCVE](/help/mcve) that describes all the specifics of your case someone (or I) might be able to help. Note that this answer only works with the built-in viewer and only in the main page, so for an iframe you would need to make a couple of changes. – wOxxOm Oct 17 '21 at 05:28
Hey, I tried it with the built-in viewer. What I did was I copied your code in my CRX, tried in both the background and content script separately -> the message listener is registered -> I cannot observe any messages received from the listener when I select the text in the pdf viewer. – Yao Oct 17 '21 at 07:03
Also, I could not find any "getPdfSelection" message being sent in the source code that you linked – Yao Oct 17 '21 at 07:04
You are supposed to send that message yourself, of course. – wOxxOm Oct 17 '21 at 10:01
Is there a way to listen for a text selection and trigger it this way? – Andrew Feb 20 '23 at 17:33
This works for a pdf file served on the web ( https:// blabla ), but I couldn't make it work for a local file. It says: "Failed to execute 'postMessage' on 'DOMWindow': The target origin provided ('file://') does not match the recipient window's origin ('null')." (manifest file is configured as instructed) – Alperen Belgiç Jun 20 '23 at 15:03

score 0 · Answer 2 · edited Nov 09 '22 at 21:37

There is no single generic solution for all PDF extensions. Every extension has its own API. If you are working with a Google Chrome extension, I believe it's impossible.

How to get the selected text from an embedded pdf in a web page?

How extension get the text selected in chrome pdf viewer？

edited Nov 09 '22 at 21:37

General Grievance


answered Apr 07 '20 at 09:21

אברימי פרידמן

