I'm having issues finding a way to copy text with Puppeteer.

In my research I found this post, but it covers copying text from an input, which isn't what I'm trying to do. I want to copy text from a website and then paste it into a Google Doc file. My main goal is to keep the formatting.

I have been able to get the HTML text with:

let html_content = await page.evaluate(el => el.innerHTML, await page.$('#sites-canvas-main-content > table > tbody > tr > td > div'));

Unfortunately, this does not preserve the formatting.

Is this even possible to do with Puppeteer?

Joshua Jones
  • Can you show what you've attempted for pasting HTML into a Google Doc? What you may be missing is that the HTML is styled by the CSS behind it, so to "copy" the styling you need to include that CSS. That may mean applying the CSS to the HTML element you are copying, so that when you paste it into Google Docs it carries the correct styles in its `style` attribute rather than being bare HTML. – Cody G Sep 14 '18 at 17:13
  • When I paste the scraped HTML into the Google Doc, it is just treated as plain text. I'm trying to get it to keep bold, links, etc. I can get the HTML and the text inside the HTML; it just loses its formatting when it is put into the document. – Joshua Jones Sep 14 '18 at 18:03
  • Can you post the code you're using to paste into google docs? :) It would help make your question more complete. – Cody G Sep 14 '18 at 18:03
  • Absolutely! I'll update that! When it comes to the part that gets the HTML, I have – Joshua Jones Sep 14 '18 at 18:05
  • I have ``` for (let link in links) { let html_content = await page.evaluate(el => el.innerHTML, await page.$('#sites-canvas-main-content > table > tbody > tr > td > div')); console.log(html_content); } ``` I'm on mobile, sorry for any poor formatting – Joshua Jones Sep 14 '18 at 18:12
  • I saw that in your post, no need to put it in the comments. You can also edit your question to add the pasting code when you get the chance. The first question I think you should work out is "Can I paste HTML code with styles into Google Docs?" – Cody G Sep 14 '18 at 18:24
  • Okay! I will try that out and report back! :) – Joshua Jones Sep 14 '18 at 18:29

1 Answer

The first part of the solution is to get the computed styles, as discussed in this Puppeteer issue:

https://github.com/GoogleChrome/puppeteer/issues/696

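// Read the computed styles of the target element inside the page context
const button = await page.evaluate(() => {
    const btn = document.querySelector('.button');
    // Round-trip through JSON so the result is a plain, serializable object
    return JSON.parse(JSON.stringify(getComputedStyle(btn)));
});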

The second part of the solution is to apply those styles to an element and paste it into Google Docs...
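
As a rough sketch of what "applying" the styles could look like, the helper below turns a plain object of style values into an inline `style` attribute and wraps a piece of HTML with it. The function names (`toInlineStyle`, `wrapWithStyle`), the wrapping `<span>`, and the sample values are all hypothetical. It also assumes the styles arrive as a plain object keyed by camelCase property names, which may not match what the serialized `getComputedStyle` result looks like in every Chrome version. Whether Google Docs honors the inline styles on paste is not something this sketch confirms.

```javascript
// Hypothetical helpers, not part of Puppeteer or any library.

// Convert a camelCase property name (fontWeight) to CSS form (font-weight).
function toKebabCase(prop) {
  return prop.replace(/[A-Z]/g, (ch) => `-${ch.toLowerCase()}`);
}

// Build "prop: value; prop: value" from the properties listed in `keep`,
// skipping any that are missing or empty in `styles`.
function toInlineStyle(styles, keep) {
  return keep
    .filter((prop) => styles[prop] !== undefined && styles[prop] !== '')
    .map((prop) => `${toKebabCase(prop)}: ${styles[prop]}`)
    .join('; ');
}

// Wrap an HTML fragment in a <span> carrying the chosen styles inline.
// Ampersands and double quotes are escaped so the attribute stays valid.
function wrapWithStyle(html, styles, keep) {
  const style = toInlineStyle(styles, keep)
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;');
  return `<span style="${style}">${html}</span>`;
}

// Example with made-up values standing in for the scraped styles.
const sampleStyles = { fontWeight: '700', color: 'rgb(255, 0, 0)' };
console.log(wrapWithStyle('<a href="#">link</a>', sampleStyles, ['fontWeight', 'color']));
```

Limiting the output to a short list of properties (font weight, color, size and so on) keeps the attribute readable; dumping every computed property would produce hundreds of declarations per element.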

Cody G
  • Thank you for pointing me in the right direction! I was really lost. – Joshua Jones Sep 14 '18 at 23:05
  • By the way, I have found this library: [inline-css](https://www.npmjs.com/package/inline-css) – Joshua Jones Sep 17 '18 at 13:21
  • I still have not been able to solve the overall problem. It just pastes the HTML with styles. It never changes the font size/color of the text to match the site. It might just be an issue with Google Docs. – Joshua Jones Sep 17 '18 at 13:42
  • Just to let anyone who sees this post in the future know: this site was a Google Site. I ended up using Google's scripting language to get everything I needed! – Joshua Jones Sep 18 '18 at 21:45