
I'm going to start by pasting my current code.

var array1 = [741, 451, 54188, 5847, 5418, 54944, 310, 541, 7451, 10211, 113, 9115, 62, 2841, 52482481, 24];
var array2 = [15, 418, 488, 130000, 8482, 55, 16, 14, 2546, 651, 4521, 11, 54, 659, 542, 1152];

// Map each ID in array1 to the value at the same index in array2.
var myObj = {};
array1.forEach(function(item, i) {
    myObj[item] = array2[i];
});

I want to write a function that builds a URL for each element of array1. Each URL would look like this:

http://blahblahblah.com/blah/EachElementToBeScanned/cost

My intention is for it to go through each of those URLs in turn, and then...

I have a regex to run on each page it fetches (see below).

Finally, I want to see if the value that the webpage gives for each element in array1 is less than or equal to the corresponding one in array2.

This is the purpose of myObj.
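
To make the intent concrete, here is a rough sketch of the two pieces that don't involve fetching anything: building the URL for an element and comparing a cost against the stored limit. The names `BASE_URL`, `buildCostUrl` and `isWithinLimit` are made up for illustration, and the URL pattern is just the placeholder from above.

```javascript
// Placeholder base URL from the question, not a real endpoint.
var BASE_URL = "http://blahblahblah.com/blah/";

// Build the cost-page URL for one element of array1.
function buildCostUrl(id) {
    return BASE_URL + encodeURIComponent(id) + "/cost";
}

// True when `cost` is less than or equal to the limit stored for `id`
// in a lookup object shaped like myObj ({ id: limit }).
function isWithinLimit(limits, id, cost) {
    return Object.prototype.hasOwnProperty.call(limits, id) &&
        cost <= limits[id];
}

// Small sample using the first two pairs from array1 / array2.
var sampleLimits = { 741: 15, 451: 418 };
var sampleUrls = [741, 451].map(buildCostUrl);
```

The part still missing is fetching each URL and feeding the page text into the regex.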


Here is my Regex, which I believe is already fine:

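// `data` is assumed to hold the fetched page's HTML as a string.
var someRegex = /<span class="cost-in-usd">([\d,]+)<\/span>/;
var costOfIt = data.match(someRegex)[1];
// Strip every thousands separator, not just the first, before converting.
costOfIt = Number(costOfIt.replace(/,/g, ""));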
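
One thing to watch: `match` returns `null` when the span isn't on the page, so indexing `[1]` directly would throw. A guarded version might look like the sketch below. `parseCost` and `costPattern` are hypothetical names, and the input is assumed to be the page's HTML as a string.

```javascript
// Same pattern as above: digits and commas inside the cost span.
var costPattern = /<span class="cost-in-usd">([\d,]+)<\/span>/;

// Return the cost as a number, or null when the span is not found.
function parseCost(html) {
    var match = html.match(costPattern);
    if (!match) {
        return null;
    }
    // Remove all thousands separators before converting.
    return Number(match[1].replace(/,/g, ""));
}

var sampleCost = parseCost('<p><span class="cost-in-usd">1,234,567</span></p>');
var missingCost = parseCost("<p>no cost here</p>");
```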
  • Why are you parsing HTML using a regular expression? If this is in a browser you can always use DOM tools to extract the data you want. – tadman Jul 07 '14 at 17:36
  • To get all of the `<span>` tags, use jQuery. – Jonathan M Jul 07 '14 at 17:38
  • @JonathanM: Loading jQuery just to get `span` elements is bad advice. – cookie monster Jul 07 '14 at 17:39
  • I'd rather just focus on the Javascript element of it, and leave the Regex to stay. I didn't see it necessary to use jQuery, as I only want 1 span tag for each page it gives. I added a random link, which doesn't exist at all, but it should give the idea of what I desire to change in it ("EachElementToBeScanned" to the element in array1). And then, I'd want to see if the value that I receive after the Regex is equal to the corresponding value for each one (see array2), which is why _myObj_ exists. – user3810560 Jul 07 '14 at 17:42
  • @user3810560, yes, but regex + HTML = bad. http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags – Jonathan M Jul 07 '14 at 17:45
  • @user3810560: It isn't a question of regex vs jQuery. It's a question of regex vs DOM manipulation. If your URL represents a reliable pattern, then regex would be an OK solution, depending upon the nature of the document. Do you really need to target `span.cost-in-usd` elements, or do the URLs have a distinct pattern that can be matched instead? – cookie monster Jul 07 '14 at 17:48
  • You mean to say the text your regex will look for resides at a remote source - so you probably don't own ***bla.com*** (why else wouldn't there be an API)? I spy with my little eye... a cross-origin spanner in the works. – Emissary Jul 07 '14 at 17:49
  • You do not want to use regex for HTML. You don't need to use jQuery either for this. Just use `document.querySelector('span.class')` which will return the first element it finds that matches that selector if there is only one element you are after. You can use `document.querySelectorAll('span.class')` if you want multiple elements returned. This is supported by IE8+ so you can count on it being broadly available. – pseudosavant Jul 07 '14 at 18:01
  • @Emissary The site concerned is one that I'm not willing to mention, currently. "blahblahblah" is a placeholder, if you like. The site has an API, which I find tricky to work with (as you need to access a separate page where the items in array1 are listed, and then scan that to find the items in that, and their corresponding costs). – user3810560 Jul 07 '14 at 18:18
  • @pseudosavant I'm not using HTML at all, and don't intend to. – user3810560 Jul 07 '14 at 18:25
  • ah ha, so it's an XY problem... seriously - opting to scrape a web page just because you don't understand an API is just downright stupid. If you don't want to reveal the page in question fine - but you will have to provide an example of a response from the API. – Emissary Jul 07 '14 at 18:26
  • @Emissary Ok, this is an incredibly cut-down response from one of the JSON "API" pages. It'd need to find each array1 ID within the response of the API, and then see if the LowestCost is less than or equal to the corresponding value in array2. `[{"ID":741,"LowestCost":19,"Exist":12532},{"ID":48748,"LowestCost":874,"Exist":14}]` – user3810560 Jul 07 '14 at 18:32
  • @Emissary It's just that the API pages in question cannot be sorted by individual items, but only by pages of items (on which the order of the items changes frequently). I'd rather locate the individual ID page and use Regex to scrape the bit I want from it, instead of crawling through 300 pages of 20 items each to find the items I wish to find the cost of. – user3810560 Jul 07 '14 at 19:22
  • So instead of 300 concurrent requests you'd want 6000? That's madness! If that wasn't enough of a reason, as previously mentioned any request not served by a properly configured API will be subject to a same-origin policy. The only way around that is via a CORS proxy which significantly increases the round-trip time of each request. What you're asking is realistically infeasible without server-side intervention. Depending on request frequency, the volume of data you're polling warrants a caching mechanism (or you'll likely be blocked). The scope of this question is now too broad to answer :/ – Emissary Jul 07 '14 at 19:44
  • @Emissary To start, all I really want to do is shove the element in array1 into the URL (as marked above), and loop all of the elements indefinitely, to see if it's selling for a low amount. I won't bother going through the API pages, and it'd just be easier for me to just scrape the element of the page with Regex. – user3810560 Jul 07 '14 at 20:12
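
For comparison, the API route discussed in the comments could be sketched roughly as follows, using only the cut-down sample response quoted above. `findBargains` is a hypothetical helper, the response shape (`ID`, `LowestCost`) is assumed from that one sample, and `limits` is an object shaped like `myObj`.

```javascript
// Cut-down sample response quoted in the comments above.
var sampleResponse = '[{"ID":741,"LowestCost":19,"Exist":12532},' +
    '{"ID":48748,"LowestCost":874,"Exist":14}]';

// Return the IDs whose LowestCost is at or below the limit stored for them.
// Entries whose ID is not a key of `limits` are ignored.
function findBargains(responseText, limits) {
    return JSON.parse(responseText).filter(function(entry) {
        return Object.prototype.hasOwnProperty.call(limits, entry.ID) &&
            entry.LowestCost <= limits[entry.ID];
    }).map(function(entry) {
        return entry.ID;
    });
}

// With the limit from the question (741 -> 15) nothing qualifies, since 19 > 15.
var noBargains = findBargains(sampleResponse, { 741: 15 });
// With a looser, made-up limit of 20 the item qualifies.
var oneBargain = findBargains(sampleResponse, { 741: 20 });
```

This only covers one page of results. Paging through the API and the cross-origin restrictions raised in the comments are not addressed here.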
