I'm trying to log into a page. It works when headless mode is false, but not when headless mode is true. This is my code:

const puppeteer = require("puppeteer")
const sqlite3 = require("sqlite3").verbose()
const db = new sqlite3.Database('./db.db')
const fs = require("fs")

const login = async () => {
  db.serialize(function() {
    db.all("SELECT * from usuarios", async function(err, rows) {
      // A failed query cannot recover inside the loop, so report it once and stop
      if (err) {
        console.log(err)
        return
      }
      try {
        console.log("Sleeping ...")
        while (true) {
          for (const logins of rows) {
            const data = Date.now()
            const browser = await puppeteer.launch({
              headless: true, // the login only succeeds with headless: false
              executablePath: '/usr/bin/chromium',
              // the user agent has to be passed through the --user-agent switch
              args: ['--no-sandbox', '--user-agent=Mozilla/5.0 (X11; Linux aarch64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36', '--lang=en-US,en;q=0.5']
            })
            const page = await browser.newPage()
            await page.goto("https://unite.nike.com.br/oauth.html?client_id=QLegGiUU042XMAUWE4qWL3fPUIrpQTnq&redirect_uri=https%3A%2F%2Fwww.nike.com.br%2Fapi%2Fv2%2Fauth%2Fnike-unite%2Fset&response_type=code&locale=pt_BR&state=", { waitUntil: 'load', timeout: 0 })
            await page.waitForSelector('input[name="emailAddress"]')
            await page.type('input[name="emailAddress"]', logins.email)
            await page.type('input[name="password"]', logins.password)
            await page.click("input[value='ENTRAR']")
            await page.waitForTimeout(6000)
            await page.screenshot({ fullPage: true, path: 'screenshot.png' })
            console.log(Date.now() - data)
            console.log("User cookie stored!")
            await browser.close()
          }
        }
      }
      catch (err) {
        console.log("Error during login!", err)
      }
    })
  })
}

I want to be able to log in with headless mode set to true. If you can help me, I'd appreciate it; I can't solve this at all. From what I see, when I open the page with headless set to false and open DevTools, the browser console shows a message like this: INFO Fri Jul 16 2021 23:42:16 GMT-0300 (Brasilia Standard Time) OPTIMIZELY: Skipping JSON schema validation.

My theory is that the site somehow manages to detect whether I'm in headless mode, but I don't know how to get around this, since I already send a user agent and set the language in `args:`. Could someone please help me with this?

  • your `args:` part has invalid syntax in the above snippet, can you confirm that is a typo only in the example and fix it also? at this point, it cannot be told if your issue is related to this error or the headless browser detection (the latter is more probable) – theDavidBarton Jul 17 '21 at 19:01
  • @theDavidBarton Yes, it was a typo. I had seen it but couldn't find where it was. Can you tell me why it works with headless false but not with headless true? Is it something related to headers? If you can help me, I would appreciate it, young man. –  Jul 17 '21 at 20:54
  • @theDavidBarton It does some checking: when headless is false it displays "Skipping json validity" in the Chromium console and I can log in without a problem, but when headless is true I can't –  Jul 17 '21 at 21:00

1 Answer

Warning: Nike's page detects headless mode and blocks logins. This is not a surprise: according to their Terms and Conditions, scraping their content is forbidden.

In theory, you could

  1. make your login permanent by reusing a logged-in session (there are plenty of other answers about this already, e.g. [1] [2]) [ᴍᴏsᴛ ʀᴇʟɪᴀʙʟᴇ sᴏʟᴜᴛɪᴏɴ],
  2. try using puppeteer-extra with plugin-stealth [ᴡᴏʀᴛʜ ᴀ ᴛʀʏ] or
  3. POST an authentication request to the right endpoint to log in in headless mode [ᴜɴᴛᴇsᴛᴇᴅ, ᴅᴇᴘᴇɴᴅᴇɴᴄʏ-ғʀᴇᴇ sᴏʟᴜᴛɪᴏɴ].

The latter could be tried like this. Please note that it requires further reverse engineering to make it work (hence I don't encourage this approach, see the warning above):

const puppeteer = require('puppeteer')

// Placeholders: fill these in before running
const username = 'YOUR_USERNAME' // from real registration
const password = 'YOUR_PASSWORD' // from real registration
const clientId = 'QLegGiUU042XMAUWE4qWL3fPUIrpQTnq' // from your existing login URL

const data = {
  username: username,
  password: password,
  client_id: clientId,
  ux_id: 'com.nike.commerce.nikedotcom.web',
  grant_type: 'password',
  gzip: true
}
const headers = {
  'DNT': '1',
  'Accept-Encoding': 'gzip, deflate, br',
  'Accept-Language': 'en-US,en;q=0.9',
  'User-Agent': 'Mozilla/5.0 (X11; Linux aarch64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.212 Safari/537.36',
  'Accept': '*/*',
  'Cache-Control': 'max-age=0',
  'Connection': 'keep-alive',
  'Content-type': 'application/json'
}
const options = {
  method: 'POST',
  url: 'https://unite.nike.com.br/login?appVersion=900&experienceVersion=900&uxid=com.nike.commerce.nikedotcom.web&locale=pt_BR&backendEnvironment=identity&browser=Google%20Inc.&os=undefined&mobile=false&native=false&visit=1&visitor=ec83f58d-0bd0-44a5-8ccd-a17f5efc3333',
  json: data
}

;(async () => {
  const browser = await puppeteer.launch({ headless: true, args: ['--no-sandbox', '--disable-setuid-sandbox', '--disable-web-security'] })
  const page = await browser.newPage()
  // open the login page first so the request below runs on the same origin
  await page.goto('https://unite.nike.com.br/oauth.html?client_id=' + data.client_id)

  // fetch() expects the payload in `body` (it has no `postData` option)
  const status = await page.evaluate(async (options, headers) => {
    const response = await fetch(options.url, {
      method: options.method,
      body: JSON.stringify(options.json),
      headers: headers
    })
    return response.status
  }, options, headers)
  console.log('login response status:', status)
  // refresh page
  // ...
})()
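
Option 1 from the list above (reusing a logged-in session) could be sketched with a persistent Chromium profile, assuming the site keeps the session in cookies or local storage. `userDataDir` is a standard `puppeteer.launch` option; the `./nike-profile` directory and the helper names `profileLaunchOptions` and `openWithProfile` are hypothetical. The idea is to log in once by hand with `headless: false`, then start later runs with `headless: true` against the same profile. Whether the site accepts that session in headless mode is untested.

```javascript
const path = require('path')

// Hypothetical profile location; any writable directory should do.
const PROFILE_DIR = path.resolve('./nike-profile')

// Build launch options that point Chromium at the persistent profile.
function profileLaunchOptions(headless) {
  return {
    headless: headless,
    userDataDir: PROFILE_DIR, // cookies and local storage are kept here between runs
    args: ['--no-sandbox']
  }
}

// Open the site with the stored profile and hand back the browser and page.
async function openWithProfile(headless) {
  const puppeteer = require('puppeteer') // loaded lazily, only when a browser is needed
  const browser = await puppeteer.launch(profileLaunchOptions(headless))
  const page = await browser.newPage()
  await page.goto('https://www.nike.com.br/', { waitUntil: 'load' })
  return { browser, page }
}
```

A first run would call `openWithProfile(false)` and complete the login manually; later runs would call `openWithProfile(true)` and check whether the page still shows the logged-in state.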

You might get ideas by checking how others are solving this issue: https://github.com/search?q=unite.nike.com+puppeteer&type=Code
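
Option 2 from the list above could look roughly like this. It assumes the `puppeteer-extra` and `puppeteer-extra-plugin-stealth` packages are installed next to `puppeteer`; `checkFingerprint` and the output path are hypothetical names. The sketch takes a screenshot of https://bot.sannysoft.com, a page that reports common headless fingerprints, so the result can be compared with and without the plugin. It is not guaranteed to get past Nike's detection.

```javascript
// Launch headless Chromium with the stealth plugin and save a screenshot
// of a bot-detection test page for manual inspection.
async function checkFingerprint(outputPath) {
  // Loaded lazily so that defining this function needs no dependency.
  const puppeteer = require('puppeteer-extra')
  const StealthPlugin = require('puppeteer-extra-plugin-stealth')
  puppeteer.use(StealthPlugin())

  const browser = await puppeteer.launch({ headless: true, args: ['--no-sandbox'] })
  try {
    const page = await browser.newPage()
    await page.goto('https://bot.sannysoft.com', { waitUntil: 'networkidle2' })
    await page.screenshot({ path: outputPath, fullPage: true })
  } finally {
    // Close the browser even if navigation or the screenshot fails.
    await browser.close()
  }
}
```

Calling `checkFingerprint('stealth.png')` and inspecting the image shows which checks still flag the browser before the same launch code is pointed at the login page.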

Note: the Optimizely console message you mentioned is not related to the login prevention.
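
To check this yourself, you could collect the page's console messages in Node and compare a headless run with a non-headless one. The sketch below uses Puppeteer's `page.on('console', ...)` event; `collectOptimizelyMessages` and `isOptimizelyMessage` are hypothetical helper names, and matching on the `OPTIMIZELY:` prefix is an assumption based on the message quoted in the question.

```javascript
// True for console lines that look like Optimizely output,
// judged only by the prefix seen in the question.
function isOptimizelyMessage(text) {
  return text.includes('OPTIMIZELY:')
}

// Load a page, record its console output, and return the Optimizely lines.
async function collectOptimizelyMessages(url, headless) {
  const puppeteer = require('puppeteer') // loaded lazily, only when a browser is needed
  const browser = await puppeteer.launch({ headless: headless, args: ['--no-sandbox'] })
  const messages = []
  try {
    const page = await browser.newPage()
    page.on('console', msg => messages.push(msg.text()))
    await page.goto(url, { waitUntil: 'load' })
  } finally {
    await browser.close()
  }
  return messages.filter(isOptimizelyMessage)
}
```

If both runs return the same Optimizely lines, the message cannot be what distinguishes the failing headless login.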

theDavidBarton
  • Okay, the problem with this solution is that I would have to use headless false mode. I'll try to make headless true mode undetectable –  Jul 18 '21 at 19:44
  • I'll try to go through the login prevention on this page: https://bot.sannysoft.com –  Jul 18 '21 at 19:46
  • actually, I just forgot to change it to `headless: true` in the snippet, all three suggestions are headless solutions ;) – theDavidBarton Jul 18 '21 at 19:47
  • Sorry to ask, but what is this client id? What should I put there? I didn't quite understand –  Jul 18 '21 at 20:03
  • you can also log client-side console messages (from the 'Console' tab) in Node.js using [dumpio](https://pptr.dev/#?product=Puppeteer&show=api-puppeteerlaunchoptions) like: `await puppeteer.launch({ headless: true, dumpio: true })`. I can assure you the same Optimizely output is there. It is related to product management rather than bot detection; for that purpose they use different technologies. – theDavidBarton Jul 18 '21 at 20:03
  • `clientId` is in the URL you also gave in your sample code: `https://unite.nike.com.br/oauth.html?client_id=QLegGiUU042XMAUWE4qWL3fPUIrpQTnq`. It is related to a user, and you need it as a GET parameter to be able to connect the login with the actual session – theDavidBarton Jul 18 '21 at 20:06
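
As a small illustration of the last comment, the `client_id` value can be read from the login URL with the standard `URL` API instead of being copied by hand. `getClientId` is a hypothetical helper name; the URL is the one from the question.

```javascript
// Login URL taken from the question.
const loginUrl = 'https://unite.nike.com.br/oauth.html?client_id=QLegGiUU042XMAUWE4qWL3fPUIrpQTnq&redirect_uri=https%3A%2F%2Fwww.nike.com.br%2Fapi%2Fv2%2Fauth%2Fnike-unite%2Fset&response_type=code&locale=pt_BR&state='

// Return the client_id query parameter, or null if the URL has none.
function getClientId(url) {
  return new URL(url).searchParams.get('client_id')
}

console.log(getClientId(loginUrl)) // prints QLegGiUU042XMAUWE4qWL3fPUIrpQTnq
```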