
How do I extract array data from the object <div class="gwt-Label">Some Data</div>?

@BrockAdams provided an excellent answer to the general problem of extracting data from a DIV object that is identified by class name only (no id). My web page, however, is made of 100+ DIV objects that share the same class name, mainly "gwt-Label". The earlier question was:

How do I extract the text from a (dynamic) div, by class name, using a userscript?

I am looking for a way to reduce the console output from 100+ lines to just a few by modifying the code by @BrockAdams below:

waitForKeyElements (".gwt-Label", printNodeText);

function printNodeText (jNode) {
    console.log("gwt-Label value: ", jNode.text().trim());
}

The output I read in the console is 100+ lines long, but all I need is a few lines selected by array index.

Do you know how to handle jNode so that the output is saved to an array first, and only the selected array elements are then read back and sent to the console?

I would prefer pseudocode like this:

jNode.text().trim()[0]
jNode.text().trim()[5]
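
One point about the pseudocode above: `jNode.text().trim()` is a single string, so indexing it with `[0]` yields a character, not a label. A minimal sketch of selecting by index, assuming the label texts have already been collected into a plain array (the helper name `pickByIndex` and the sample values are hypothetical):

```javascript
// Hypothetical helper: return only the entries at the requested indexes.
// Indexes outside the array are skipped instead of yielding undefined.
function pickByIndex(values, indexes) {
    return indexes
        .filter(function (i) { return i >= 0 && i < values.length; })
        .map(function (i) { return values[i]; });
}

// Made-up label texts standing in for the page's 100+ labels.
var labelTexts = ["first", "b", "c", "d", "e", "sixth"];
console.log(pickByIndex(labelTexts, [0, 5]));
```

Skipping out-of-range indexes is a design choice for this sketch; it avoids printing `undefined` while the array is still filling up.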

The script runs in Greasemonkey or Tampermonkey.

In addition, I need to loop the script over a numeric query string by setting a dynamic @match URL in the script.
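
A rough sketch of stepping through a numeric value in the URL, assuming the number in question is the one after `height:` in the site's URL (that reading, and the helper name `nextHeightUrl`, are assumptions):

```javascript
// Hypothetical helper: bump the number after "height:" in the URL by one.
// Returns null when the URL carries no such number.
function nextHeightUrl(href) {
    var match = /height:(\d+)/.exec(href);
    if (!match) {
        return null;
    }
    var next = parseInt(match[1], 10) + 1;
    return href.replace(/height:\d+/, "height:" + next);
}

console.log(nextHeightUrl("http://srv1.yogh.io/#mine:height:0"));
```

In a userscript, the result could be assigned to `location.href`. Because only the fragment after `#` changes, whether the page loads new data without a full reload depends on the site and is not verified here.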

Peter Mortensen
darius
  • It's not too hard to select the `[0]`th or `[5]`th, etc. value. But that approach is fraught with pitfalls and [I can tell that it will not work well for you](https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem). You need to change the question to show the steps you would use to manually get what you're really after, along with snippets of the page code. See [this answer](https://stackoverflow.com/a/15052467) for an example of the kind of screenshots and code snips needed. **OR**, you could link to the actual target page and explain your goal. – Brock Adams Dec 26 '17 at 07:00
  • Reading your Nike shop expertise, you really deserve a Nobel Prize in WWW. You are exactly right: I don't need an array to save data from all the DIV nodes with the same class name, gwt-Label, since I know the XPath of the DIV node I am interested in, calculated by Firebug to be: /html/body/div[3]/div[3]/div/div[6]/div[2]/div/div/div. I have read a number of XPath query examples that work for XML, but none for a remote web page via GM. I really don't know how to use my XPath to query the targeted DIV node for some data. – darius Dec 26 '17 at 07:56
  • Follow-up: I just installed the FirePath plugin for Firefox and can easily get the XPath calculated, saved, and run: html/body/div[3]/div[3]/div/div[6]/div[2]/div/div/div. There is also a nice tutorial on how to query HTML code with XPath: https://www.script-tutorials.com/how-to-parse-web-pages-using-xpath/ I hope to learn how to use XPath in GM scripts by examples. – darius Dec 26 '17 at 08:27
  • Thank you @BrockAdams. Today I experimented more with `labels = document.getElementsByClassName("gwt-Label")`, and what is returned in the console is not an array but an HTMLCollection like this: nonc2.user.js:16:2 gwt-Label value: 1 nonc2.user.js:15:4 HTMLCollection [ , , , , , , , ... – darius Dec 26 '17 at 20:19
  • Uh, ok... Note that depending on your actual page structure you may be able to use something like: `waitForKeyElements(".gwt-Label:eq(0),.gwt-Label:eq(5)", printNodeText)` etc. I do not recommend such an approach, nor XPath, but we don't have enough info to provide a sensible alternative. – Brock Adams Dec 26 '17 at 20:24
  • Thank you @BrockAdams. Today I experimented more with `labels = document.getElementsByClassName("gwt-Label")`, and what is returned in the console is not an array but an HTMLCollection like this: nonc2.user.js:16:2 gwt-Label value: 1 nonc2.user.js:15:4 HTMLCollection [ , , , , , , , ... How do I turn `jNode.text ().trim ()` into a single array element, since 235 labels are sent to the console? – darius Dec 26 '17 at 20:27
  • I replaced `// waitForKeyElements (".gwt-Label", printNodeText);` with `waitForKeyElements(".gwt-Label:eq(2)", printNodeText);` and got "unreachable code after return statement" in the console. – darius Dec 26 '17 at 20:46
  • It has just stopped sending any gwt-Label data to the console, maybe due to: downloadable font: download failed (font-family: ... The website is http://srv1.yogh.io/#mine:height:0 – darius Dec 26 '17 at 20:57
  • I am back with `var HTMLCollection = document.getElementsByClassName("gwt-Label"); var element = HTMLCollection.length;` but I get 0 elements for this collection: 0 Nonc.user.js:45:4 HTMLCollection [ , , , , , , .. There is no way to access an individual item to read innerHTML: "Insert anything, press enter" or innerText: "Insert anything, press enter". – darius Dec 26 '17 at 21:15

2 Answers


Okay, if you have lots of elements with class gwt-Label and, assuming that they are AJAX'd in separate batches, you can put them into an array with code like this:

var valArry     = [];
var ajxFinshTmr = 0;

waitForKeyElements (".gwt-Label", storeValue);

function storeValue (jNode) {
    valArry.push (jNode.text ().trim () );
    if (ajxFinshTmr)  clearTimeout (ajxFinshTmr);

    //-- Let all initial AJAX finish, so we know array is complete.
    ajxFinshTmr = setTimeout (printFinalArray, 200);
}

function printFinalArray () {
    console.log ("The final values are: ", valArry);
}

Note that there are almost certainly more robust/efficient/sensible alternatives, but we need to see more of the true page, and the true goal, to engineer those.

ETA: I see you've just linked to the site. Now describe in detail what you want to do. The possibilities can be quite messy.
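
If only a few positions of the array matter, one possible variation is to fire a callback once, as soon as the highest wanted index has been stored, instead of relying on a timer. This is only a sketch under the assumption that the labels arrive in a stable order, which has not been verified for this page; `makeIndexWatcher` is a hypothetical name.

```javascript
// Hypothetical factory: returns a store(text) function that collects
// values and calls onReady exactly once, when every wanted index is filled.
function makeIndexWatcher(wantedIndexes, onReady) {
    var values = [];
    var fired = false;
    var highest = Math.max.apply(null, wantedIndexes);

    return function store(text) {
        values.push(text);
        if (!fired && values.length > highest) {
            fired = true;
            onReady(wantedIndexes.map(function (i) { return values[i]; }));
        }
    };
}

// In the userscript, store would be fed from waitForKeyElements, e.g.:
//   waitForKeyElements (".gwt-Label", function (jNode) {
//       store (jNode.text ().trim () );
//   });
var store = makeIndexWatcher([0, 3], function (picked) {
    console.log("Selected values: ", picked);
});

// Made-up values standing in for labels arriving one by one.
["a", "b", "c", "d", "e"].forEach(function (text) { store(text); });
```

Because the callback is guarded by the `fired` flag, later labels do not trigger a second print.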

Brock Adams
  • There is a nice fiddle that works the way I have planned at http://jsfiddle.net/jfriend00/FzZ2H/ All I need is to save block header data, made of 6 variables, to a file (log). – darius Dec 26 '17 at 21:37
  • I just copied and pasted your code as-is into a Greasemonkey script, but there is no output in the console. – darius Dec 26 '17 at 21:47
  • Did you include all of the metadata block from [the previous script](https://stackoverflow.com/a/47973580/)? The code won't run without it. – Brock Adams Dec 26 '17 at 21:55
  • Excellent, you did it!!! I modified the function to print a single element of valArry only: `function printFinalArray () { console.log ("The final values are: ", valArry[3]); }`. I get 2 lines sent to the console for a single array index value: "The final values are: undefined" and "The final values are: 4A5E1E4B...". Since I need an index range, I tried `function printFinalArray () { console.log ("The final values are: ", valArry[0], valArry[1]); }`, which worked but printed the first label twice: "The final values are: Insert anything, press enter undefined" and "The final values are: Insert anything, press enter 1". – darius Dec 26 '17 at 22:27
  • You did it. Now it deserves some cleaning, since `function printFinalArray () { console.log (valArry[0]); }` prints the first array element three times. Web Console output after cleaning: Insert anything, press enter Insert anything, press enter Insert anything, press enter. What comes next is looping over the query index, incremented by 1. @match can support the query index replaced with *, but I am not sure how to reload the web page from a GM script. Thank you. Just another question. – darius Dec 26 '17 at 23:19
  • Hence why I warned repeatedly, that this was not a good approach. It's difficult to guess what you're really trying to accomplish and a critical factor in what a good solution would be. This is a classic [X Y problem](https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem) and I won't answer another such. – Brock Adams Dec 26 '17 at 23:38
  • Nothing special to guess. I was just interested in scripting a remote web page to extract some data and save it to a file for review, as a proof of concept and a test of GM scripting. ASCII is great for general discussions; drawings or images work better. Thanks for your great help and a job well done. VB for Windows came with a parser, syntax checker and online help, making code elements clickable and self-explanatory. HTML is easy in theory, but the tools are available to insiders only, so people ask millions of questions to get just a few answers. I was really lucky to meet you. What can I offer you? – darius Dec 26 '17 at 23:50
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/162004/discussion-between-brock-adams-and-darius). – Brock Adams Dec 27 '17 at 00:28

It took me days to understand the problem.

// ==UserScript==
// @name        Important test1
// @namespace   bbb
// @include     http://srv1.yogh.io/#mine:height:0
// @version     1
// ==/UserScript==
console.log("sdfasg");

// Logged immediately at script start; the collection may still be empty here.
console.log(document.getElementsByClassName("gwt-Label"));

// Wait 15 seconds so that the page has time to load its labels.
window.setTimeout(function () {
    var labels = document.getElementsByClassName("gwt-Label");
    console.log(labels);
    console.log(labels.length);

    // Pick the label of interest by its index in the collection.
    var element = labels[6];
    console.log(element);

    // Guard against an empty collection: labels[6] is undefined in that case.
    if (element) {
        console.log(element.innerHTML);
    }

    console.log("15 seconds elapsed");
}, 15000);

@BrockAdams was very helpful. Because the above site takes a varying amount of time to load, the GM script worked fine one day, producing the full HTMLCollection, but at another time produced an empty one, so that innerHTML yielded an undefined value. The breakthrough came with the "353 more… ]" link of the HTMLCollection in the console, which opens it in the Variables View. That showed exactly what I had tried to accomplish via XPath: an indexed (numbered) list of all DIV objects in my HTMLCollection, which lets me select the DIV object of interest by its known number.
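
A minimal sketch of turning such a collection into a plain array of texts and reading one entry safely, so that an empty collection gives `null` rather than an error. The helper names `collectTexts` and `textAt` are hypothetical, and a small stand-in object replaces the real HTMLCollection so that the sketch also runs outside a browser:

```javascript
// Hypothetical helper: copy the text of every element in an array-like
// collection (such as an HTMLCollection) into a plain array.
function collectTexts(collection) {
    return Array.from(collection, function (el) {
        return (el.textContent || "").trim();
    });
}

// Hypothetical helper: text at the given index, or null if the collection
// does not (yet) contain that many elements.
function textAt(collection, index) {
    var texts = collectTexts(collection);
    return index >= 0 && index < texts.length ? texts[index] : null;
}

// Stand-in for document.getElementsByClassName("gwt-Label").
var fakeCollection = {
    length: 2,
    0: { textContent: " Insert anything, press enter " },
    1: { textContent: "1" }
};

console.log(textAt(fakeCollection, 0));
console.log(textAt(fakeCollection, 6));
```

In the userscript, `fakeCollection` would be replaced by the real collection; a `null` result can then be used as a signal to wait and try again.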

I am still looking for a similar solution based on DOM Parsing and Serialization

  https://w3c.github.io/DOM-Parsing/#

that would append a natural number (an index) to every DOM object of the HTML document, so that I can use it in parallel with the XPath generated by Firebug

example

html/body/div[3]/div[3]/div/div[6]/div[2]/div/div/div

followed by

html1/body2/div[3]5or-higher/div[3]/div/div[6]/div[2]/div/div/div

That is, just serializing the HTML DOM and appending a unique index to each object, so that I can access it via HTMLCollection[index] or, better, HTMLDOMCollection[index].

I am sure such an approach is known and already used by HTML DOM serializers and parsers, but I don't know how to access it.
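
A rough sketch of the numbering idea: walk the element tree in document order and give every element a running index together with an XPath-like path. The helper name `indexElements` is hypothetical, and the small object tree below only stands in for a real document so that the sketch runs outside a browser.

```javascript
// Hypothetical helper: walk a tree depth-first (document order) and return
// a flat list of { index, path, node } entries, one per element.
function indexElements(root) {
    var result = [];

    function visit(node, path) {
        result.push({ index: result.length, path: path, node: node });
        var counts = {};   // per-tag position among siblings, 1-based as in XPath
        Array.from(node.children || []).forEach(function (child) {
            var tag = child.tagName.toLowerCase();
            counts[tag] = (counts[tag] || 0) + 1;
            visit(child, path + "/" + tag + "[" + counts[tag] + "]");
        });
    }

    visit(root, "/" + root.tagName.toLowerCase());
    return result;
}

// Made-up tree standing in for document.documentElement.
var fakeRoot = {
    tagName: "HTML",
    children: [{
        tagName: "BODY",
        children: [
            { tagName: "DIV", children: [] },
            { tagName: "P", children: [] },
            { tagName: "DIV", children: [] }
        ]
    }]
};

indexElements(fakeRoot).forEach(function (entry) {
    console.log(entry.index, entry.path);
});
```

In a browser, `indexElements(document.documentElement)` would be the corresponding call. The built-in `document.getElementsByTagName("*")` also returns all elements in document order, so its positions should correspond to the indexes produced here.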

Thank you all.

darius