I need to grab the URL or the page content from a site that is redirected and then rejected. Everything works fine with headless: false, but as soon as I set headless to true, the page has an empty body.

url: http://localhost/#code=GcPNMvfOMTVMnbhWfF-r4ob74zAqvzNJ8uzzwqun8C8.C1EEBPygJToI1K5b9hlcxCcxIOCr7dLF54f44CNkdTQ&id_token=eyJhbGciOiJSUzI1NiIsImtpZCI6InB1YmxpYzpmYjc3NDA4OS02NzljLTQ2OGItOTE0MS04NzFhYWI5MTBkZWMiLCJ0eXAiOiJKV1QifQ.eyJhbXIiOlsibG9naW46c2lsZW50Il0sImF1ZCI6WyIxZmRkZWU0ZS1iMTAwLTRmNGUtYjJiMC0wOTdmOTA4OGY5ZDIiXSwiYXV0aF90aW1lIjoxNjkyNjM5Njg5LCJjX2hhc2giOiJjV1d4YndCVWYwM3ZUbEVaY3BYbkVRIiwiZXhwIjoxNjkyNjQzMjkyLCJpYXQiOjE2OTI2Mzk2OTIsImlzcyI6Imh0dHBzOi8vYWNjb3VudC5qYWdleC5jb20vIiwianRpIjoiYTcyMTRlYjQtZmMzMS00MTQzLWIwM2YtOGYzZDA4MTQzMzdjIiwibm9uY2UiOiJTWjBPcHVORENEWUtzc1JBanNxZDVoNUFTSE5aa1BwbWRmT0pYNW9BNHlERnVOZlQiLCJyYXQiOjE2OTI2Mzk2OTEsInNpZCI6IjYyYjRjZDQyLTYxYTYtNDlmOS1hMThiLTRlZjdhNzZhNjkzYyIsInN1YiI6IjdEU2pnQTFwVXpMWkdhYjRZYlQya3QifQ.lr1z9DSjmMc5PKN0qSxFym79i7-yMtVrNKsZdQZfgeBY-2KVku5Z5Kt6bVbAXl0DQznJeFSdZrxBp2U8h5TpX6H5s0aueKY28L_4eSJLVGVWhm5t3srLFrozJtmX_KT1ckh4ebRBQR9-lJAZ5yILtwJ64W_55LZBnwHLiMRVpxac355rJXpbDSSfX6xotT7uH5WWUxbSVPRiYLXM8X6GKTdfMYTBUvbAPg88RsbbXgYpl1QUEs6h9lbCCFh8SkuvQyU9Sp3nexTnGo1_QSic6RazRmXZ7hgoUHlgXod2JXTk5I86l2LvqiVgzpdXtvaSl_aCPvt_t1TUCUl0sFVXM_8TFeCAWoO3vKv1Rp8p65EAUhcsC3fDoJJlwMBqB-YXQ-VQqHOgsDu843btUTgx4-5CuW_rZswfrZGad9b2MtYSjCX9netZtGudOnxW-Z7dCT45MEQbp34EQDcbIP1ocvDoAa4IYXXasMwUBiVnZHRrte2iL9LV9UnxgWqlEX5dqnBhthJYaqCi2Yd2xjIVEj8aQgD1NxV35VlRCMTiiLvgLcWSu3vnnBDby9JBi7zxJpPzFJdJE0qfQ5A_i-0p9LZx9gkVGCdvDJhbugw8T_yr_6WM1X5_3woUUDbT3rQPX-I7zIPyrovmsOnqmiDkDSDXwfpYRyqvPPjja48Ijek&state=ffGBa3X2GO1c

// headless = false
await page.waitForNavigation({'waitUntil': 'domcontentloaded', timeout: 60000});
const content = await page.content();
console.log(content)

content: <!DOCTYPE html><html dir="ltr" lang="en"><head>
  <meta charset="utf-8">
  <meta name="color-scheme" content="light dark">
  <meta name="theme-color" content="#fff">
  <meta name="viewport" content="width=device-width, initial-scale=1.0,
                                 maximum-scale=1.0, user-scalable=no">
  <title>localhost</title>

// headless = true
await page.waitForNavigation({'waitUntil': 'domcontentloaded', timeout: 60000});
const content = await page.content();
console.log(content)

content: <html><head></head><body></body></html>

Is there a way to fix this? Alternatively, is there a way to grab the URL, which is what I actually need? It appears in the page body but not in await page.url(), which returns "chrome-error://chromewebdata/" for both headless false and true. I also tried await page.evaluate(() => document.location.href), but that gives the same result.
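For what it's worth, "chrome-error://chromewebdata/" is the URL Chrome reports for its own error page, so by the time page.url() is read the original target is already gone. One possible workaround, sketched below, is to read the target from the redirect response itself. This assumes the redirect is an HTTP 3xx with a Location header (a JavaScript redirect would not show up this way). The names redirectLocation and captureRedirectLocation, and the 'http://localhost/' prefix, are hypothetical and only for illustration; page is expected to be a Puppeteer Page.

```javascript
// Return the Location header of a redirect response if it points at `prefix`,
// otherwise null. `response` is expected to look like a Puppeteer HTTPResponse
// (status() and headers(), with lower-cased header names).
function redirectLocation(response, prefix) {
  const status = response.status();
  const location = response.headers()['location'];
  if (status >= 300 && status < 400 && location && location.startsWith(prefix)) {
    return location;
  }
  return null;
}

// Resolve with the first redirect target that starts with `prefix`.
// Register this BEFORE triggering the navigation, so the event is not missed.
function captureRedirectLocation(page, prefix) {
  return new Promise((resolve) => {
    const onResponse = (response) => {
      const location = redirectLocation(response, prefix);
      if (location !== null) {
        page.off('response', onResponse); // stop listening once found
        resolve(location);
      }
    };
    page.on('response', onResponse);
  });
}

// Possible usage (untested against the site in question):
//   const pending = captureRedirectLocation(page, 'http://localhost/');
//   ... perform the login / navigation that triggers the redirect ...
//   const redirectUrl = await pending;
```

Since the value comes from the response header rather than from the loaded page, it should not matter that nothing answers on localhost.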

I looked around on Stack Overflow and found one similar question, but it was never answered.

  • The canonical is [Why does headless need to be false for Puppeteer to work?](https://stackoverflow.com/questions/63818869/why-does-headless-need-to-be-false-for-puppeteer-to-work). Did you try adding a user agent header? – ggorlen Aug 21 '23 at 17:57
  • Did you try setting headless to "new"? – Chris Aug 21 '23 at 18:22
  • Setting headless to "new" seems to work! The body is no longer empty. The URL is still "chrome-error://chromewebdata/", but I can work with that. Thanks, Chris! – CurBStone Aug 21 '23 at 18:28
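
Once the redirect URL has been recovered as a string (for example from the page body, as mentioned in the last comment), the values after the "#" can be split out with Node's built-in URL and URLSearchParams. This is a minimal sketch: parseFragmentParams is a hypothetical helper name, and the URL in the usage comment is a shortened placeholder, not the real one.

```javascript
// Split the fragment of a redirect URL (the part after '#') into key/value
// pairs, e.g. code, id_token and state for a URL shaped like the one above.
function parseFragmentParams(redirectUrl) {
  const { hash } = new URL(redirectUrl); // hash includes the leading '#'
  const fragment = hash.startsWith('#') ? hash.slice(1) : hash;
  // URLSearchParams applies form decoding (percent-escapes, '+' as space).
  return Object.fromEntries(new URLSearchParams(fragment));
}

// Possible usage with a placeholder URL:
//   const { code, id_token, state } =
//     parseFragmentParams('http://localhost/#code=abc&id_token=x.y.z&state=s1');
```

Note that the fragment never reaches a server, so it has to be read on the client side like this.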