0

I've been attempting to import a table of events from the following website:  https://scpgajt.bluegolf.com/bluegolf/scpgajt23/schedule/index.htm?type=2&display=champ (and others similar in structure).

I am attempting to reproduce the example website table on a Google Sheet where I would later add a check-box column and then select the events I need (which would copy the selection to another sheet for personalized planning).

So far, I have been able to use copied/pasted Apps Script coding found on Stack Overflow (see my Example Sheet HERE) and this =ImportTableHTML(A1,1) formula on the sheet to pull the table from the site into the sheet.

This Apps Script method has finally produced a complete list of events, however, the results are horribly formatted incorrectly (see Example Sheet 1 - Scrape Import / Raw). The result I am looking for should format close to the the original columns and rows as the original table, or filter and distribute the pulled data into certain specified cells (see Example Sheet 2 - Model Result).

This is the farthest I have been able to get, thanks to the scripts found on Stack Overflow, combining scripts posted in Replacing =ImportHTML with URLFetchApp) and Creating a UrlFetchApp script to replace the Google Sheet importHTML function.

Unfortunately, now I cannot figure out the options in the script to affect formatting / distributing of the results into the proper cells.

Is it possible to reproduce the table in my example sheet with proper or modifiable formatting?


The site I am attempting to capture table data from

The resulting import using =ImportTableHTML(A1,1)

The way the imported data should be parsed and distributed


App Script Code I am currently using:

    function importTableHTML(url,n){
      var html = '<table' + UrlFetchApp.fetch(url, {muteHttpExceptions: true}).getContentText().replace(/(\r\n|\n|\r|\t|  )/gm,"").match(/(?<=\<table).*(?=\<\/table)/g) + '</table>';
      var trs = [...html.matchAll(/<tr[\s\S\w]+?<\/tr>/g)];
      var data = [];
      for (var i=0;i<trs.length;i++){
        var tds = [...trs[i][0].matchAll(/<(td|th)[\s\S\w]+?<\/(td|th)>/g)];
        var prov = [];
        for (var j=0;j<tds.length;j++){
          donnee=tds[j][0].match(/(?<=\>).*(?=\<\/)/g)[0];
          prov.push(stripTags(donnee));
        }
        data.push(prov);
      }
      return(data);
    }
    function stripTags(body) {
      var regex = /(<([^>]+)>)/ig;
      return body.replace(regex,"");
    }
Rubén
  • 34,714
  • 9
  • 70
  • 166
onsitenow
  • 9
  • 2
  • 2
    Welcome to [so]. While links to external resources and posts from this site might be helpful, posts on this site should be self-contained. Considering this, all all the relevant details, as a [mcve], directly into this question body. – Rubén Jan 05 '23 at 05:41
  • Rueben, Thank you for your comment, but this question was written in a "minimal" way with links to the sample "Live" spreadhsheets so those who can and are able to work on it, or answer it, can do so within a "reproducible example" environment. The live spreadsheet pages that I have linked to are just for this question and collaboration, and is a representative example of the actual spreadsheets the eventual answers will be applied to. I have provided the example sheet and links to the membership just as others have always done. Thank you for your comment. – onsitenow Jan 05 '23 at 18:21
  • You are free to write your question as you wish but this question is not a good fit for this site and it might be closed. Ideally questions on this site should be helpful for a large audience during a long time not only for the post author, not only to solve and transitional problem. The problem with relying only on "live spreadsheets" is that could be modified / deleted at any time making the question hard to understand and unhelpful for future readers. – Rubén Jan 05 '23 at 18:34

0 Answers0