-1

I have a Google spreadsheet with links to questionnaires. I wanted to know how to get the questions from each of the questionnaires.

introducir la descripción de la imagen aquí

I guess I have to do: at best, use the script editor and iterate on the lines, and at worst, do webscraping.

const puppeteer = require('puppeteer');

function appendString() {
  var range = SpreadsheetApp.getActiveSheet().getActiveRange();
  var numRows = range.getNumRows();
  var numCols = 0;
  for (var i = 1; i <= numRows; i++) {
    for (var j = 1; j <= numCols; j++) {
      var currentValue = range.getCell(i,j).getValue();
      await page.goto(currentValue);

      const pollFrame = page.frames().find() # From there I have some difficulties
      
    }
  }
}

But I get the following error:

SyntaxError: await is only valid in async function (ligne 10, fichier "Code.gs")

Not to mention the async problem or the buttonthat I still have to click, the selection looks like this:

<div class="freebirdFormviewerViewItemsItemItemTitle exportItemTitle freebirdCustomFont" id="i1" role="heading" aria-level="3" aria-describedby="i.desc.310938276">How often did you fly before the Covid-19 epidemic? </div>

But the IDs don't follow a logical numerical order, so I don't know how to extract them automatically.

Then I don't know how to do it. I wonder if it's simpler because they're products from the same supplier.

Here is the equivalent in csv format:

https://docs.google.com/forms/d/e/1FAIpQLSfzocEm6IEDKVzVGOlg8ijysWZyAvQur0NheJb_I_xozgKusA/viewform?usp=sf_link
https://docs.google.com/forms/d/e/1FAIpQLScrm0ZTrvlONf5MX37N93H_FajNzfbNy9ZtitX-Vq9PPuLPHA/viewform?usp=sf_link

https://docs.google.com/forms/d/e/1FAIpQLSeolFSh3OyS_XpX1lRIJP-8CH8WG0X0hL98SM9d85LqC22Bow/viewform?usp=sf_link

Update

So I tried the anwer kindly posted by Neven Subotic's:

// this array will store forms and their questions
let formAndQuestions = [];

let formIds = ["https://docs.google.com/forms/d/e/1FAIpQLSfzocEm6IEDKVzVGOlg8ijysWZyAvQur0NheJb_I_xozgKusA/viewform?usp=sf_link",
        "https://docs.google.com/forms/d/e/1FAIpQLScrm0ZTrvlONf5MX37N93H_FajNzfbNy9ZtitX-Vq9PPuLPHA/viewform?usp=sf_link",
        "https://docs.google.com/forms/d/e/1FAIpQLSeolFSh3OyS_XpX1lRIJP-8CH8WG0X0hL98SM9d85LqC22Bow/viewform?usp=sf_link"]

formIds.forEach( formId => {
  const form = FormApp.openById( formId );
  // lets get the name
  const formName = form.getTitle();
  // first we get all items
  const allItemsInThisForm = form.getItems();

  // then we get filter out anything that is not a questions
  const allQuestionsInThisForm = allItemsInThisForm.filter( item => {
      return isThisItemAQuestion( item )
  });

  // now we store them in our object
  formAndQuestions.push( {
    formId: formId,
    formName: formName,
    questions: allQuestionsInThisForm
  })
});

// this function is used to only get the itemTypes you want
// see reference for more information
function isThisItemAQuestion( item ){
  const itemType = item.getType();
  const validQuestionItemTypes = [ FormApp.ItemType.TEXT, "add others here" ]
  let isValid = false;

  validQuestionItemsTypes.forEach( validItemType => {
    if( itemType == validItemType ) {
      isValid = true;         
    }
  });
  return isValid
}

Unfortunately I obtain the following error message with the following details Exception: No item with the given ID could be found, or you do not have permission to access it. (line 9, "const form = FormApp.openById( formId );"). I don't understand. As you can see in the gif, I can open these links, so I should have the permission to access them isn't it?

I also tried Ruben's ideas with:

// this array will store forms and their questions
let formAndQuestions = [];

let formIds = ["https://docs.google.com/forms/d/e/1FAIpQLSfzocEm6IEDKVzVGOlg8ijysWZyAvQur0NheJb_I_xozgKusA/viewform?usp=sf_link"]//,
        //"https://docs.google.com/forms/d/e/1FAIpQLScrm0ZTrvlONf5MX37N93H_FajNzfbNy9ZtitX-Vq9PPuLPHA/viewform?usp=sf_link",
        //"https://docs.google.com/forms/d/e/1FAIpQLSeolFSh3OyS_XpX1lRIJP-8CH8WG0X0hL98SM9d85LqC22Bow/viewform?usp=sf_link"]


function scrapeForms(){
  formIds.forEach( formId => {
                  // The code below logs the HTML code of the Google home page.
                  var response = UrlFetchApp.fetch(formId);
                  results = response.getElementsByClassName("freebirdFormviewerViewItemsItemItemTitleContainer");
                  Logger.log(results.getContentText())
  });
}

But got back:

TypeError: response.getElementsByClassName is not a function (ligne 13, fichier "Code")
Revolucion for Monica
  • 2,848
  • 8
  • 39
  • 78
  • 1
    Why don't you use [FormApp](https://developers.google.com/apps-script/reference/forms/form)? – Rafa Guillermo May 22 '20 at 13:16
  • @RafaGuillermo Thanks for the reference! I didn't knew it. Unfortunately I have received the file as such and have to work my way around. Actually the actual file has a lot more rows. – Revolucion for Monica May 22 '20 at 13:27
  • Does the code you posted work? Especially the `require(puppet)`? – TheMaster May 22 '20 at 13:41
  • @TheMaster puppeteer? It seems to be alright, the main error is `await page.goto`. But I think it's related. – Revolucion for Monica May 22 '20 at 14:05
  • 1
    That's not alright. That's a major error, which would require hours of research and weeks more to invest in a solution. As far as the reported error is concerned, as the error states , `await` is only valid inside `async` functions: Add `async` before `function`. Not that it matters anyway. – TheMaster May 22 '20 at 14:49
  • 1
    @Rubén line 10 is `await` :) – Revolucion for Monica May 22 '20 at 14:50
  • Cross-posted in [es.so] https://es.stackoverflow.com/q/358209/65 (note: it's valid to make cross-posting through sites of different languages, just mentioning this for those who speak Spanish too) – Rubén May 24 '20 at 23:30

2 Answers2

1

According to What is this Javascript "require"? require is not part of the standard JavaScript an AFAIK it's not supported by Google Apps Script.

By the other hand, the error message can't be easily solved as Google Apps Script Chrome V8 engine doesn't support async functions. Related Is google apps script synchronous?


If you will be using Google Apps Script, and you are the form owner or a form editor, instead of trying to web scraping a Google Form use the Forms Service of Google Apps Script. For this you will need the form ../edit URLs instead of the ../viewform URLs. On the official docs there is a quickstart that might help you https://developers.google.com/apps-script/quickstart/forms.

You could use openByUrl to "open" a form. It will not be actually opened in your web browser, it will be opened on the server side. Then you could use getItems to get all the questions, sections, images, videos, etc.

If you aren't the form owner or a form editor then you should use UrlFetchApp service and somehow parse the web page source code of each form based on the position of the questions. Related question: Google Sheets: How to import the following data?

Also, if the form has several sections you should do a post request to emulate clicking on the next button in order to get the second and following sections. There are more "also if the form has..." but I will stop here as the main part of question was already answered, I think.

Rubén
  • 34,714
  • 9
  • 70
  • 166
0

You first want to get all the forms, so place those in an array:

const formIds = ["someId", "anotherId", "andSoOn"]

Then, lets use the FormApp to get the form and all items. Items can be of different types, see documentation.

// this array will store forms and their questions
let formAndQuestions = [];

formIds.forEach( formId => {
  const form = FormApp.openById( formId );
  // lets get the name
  const formName = form.getTitle();
  // first we get all items
  const allItemsInThisForm = form.getItems();

  // then we get filter out anything that is not a questions
  const allQuestionsInThisForm = allItemsInThisForm.filter( item => {
      return isThisItemAQuestion( item )
  });

  // now we store them in our object
  formAndQuestions.push( {
    formId: formId,
    formName: formName,
    questions: allQuestionsInThisForm
  }
});

// this function is used to only get the itemTypes you want
// see reference for more information
function isThisItemAQuestion( item ){
  const itemType = item.getType();
  const validQuestionItemTypes = [ FormApp.ItemType.TEXT, "add others here" ]
  let isValid = false;

  validQuestionItemsTypes.forEach( validItemType => {
    if( itemType == validItemType ) {
      isValid = true;         
    }
  });
  return isValid
}

Then you can initially log out the results and see what it looks like:

Logger.log( formAndQuestions )
Neven Subotic
  • 1,399
  • 1
  • 6
  • 18
  • 2
    if op only has the published links `/viewform`, this won't work – TheMaster May 22 '20 at 14:46
  • true, but not sure if that implies. He states in his description "I guess I have to do: at best, use the script editor and iterate on the lines, and…" and I assume he has access to those files. If this is not the case, my approach is not suitable. – Neven Subotic May 22 '20 at 14:49
  • Thanks for your answer! I will try it tonight. Yes, I do have access to this file full of links. I shared it at the top. – Revolucion for Monica May 25 '20 at 15:52
  • Unfortunately I obtain the following error message with the following details `Exception: No item with the given ID could be found, or you do not have permission to access it. (line 9, "const form = FormApp.openById( formId );")`. I don't understand as I can open these links, so I should have the permission to access them isn't it? – Revolucion for Monica May 27 '20 at 13:04