I need to extract URLs belonging to the https://twitter.com domain from a JS string of HTML code and store them in an array. I know the RegEx I'm looking for is (https?:\/\/(.+?\.)?twitter\.com(\/[A-Za-z0-9\-\._~:\/\?#\[\]@!$&'\(\)\*\+,;\=]*)?). My problem is that I don't know which JS function applies it to a string, although I have looked for it.
My project partner is populating a Google Sheets table, which I have saved locally as an HTML file. I fetch that file from a separate HTML page and print it to the console, as shown below. My end goal is to collect the Twitter profile links he put in multiple columns into a JS array for later use.
fetch('Directory.html').then(function (response) {
    return response.text();
}).then(function (html) {
    console.log(html);
}).catch(function (err) {
    console.warn('Ooga booga.', err);
});
Any insight is appreciated. I love this community, blessings to you all.
Edit
Following a comment below, I've implemented this code, yet the Chromium console prints the entire document, as if nothing is being filtered. Why is this? I initially tried it without the forward slash / before and after the regex content, but the Chromium console complained of an unexpected : (colon) token. Why does that happen?
fetch('Directory.html').then(function (response) {
    // The API call was successful!
    return response.text();
}).then(function (html) {
    // This is the HTML from our response as a text string
    console.log(html);
}).catch(function (err) {
    // There was an error
    // console.warn('Something went wrong.', err);
});
const paragraph = html;
const regex = /(https?:\/\/(.+?\.)?twitter\.com(\/[A-Za-z0-9\-\._~:\/\?#\[\]@!$&'\(\)\*\+,;\=]*)?)/;
const found = paragraph.match(regex);
console.log(found);
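
For reference, here is a minimal sketch of one way the extraction could work. It is not a definitive fix. It rests on a few assumptions: the helper names extractTwitterUrls and loadTwitterUrls are hypothetical, the inline sample string merely stands in for the real contents of Directory.html, and the pattern is a simplified variant of the regex above that stops at quotes, whitespace, and angle brackets so that a match stays inside a single attribute value. The two points it illustrates are that String.prototype.match needs the g flag to return every match rather than only the first, and that the matching has to run inside the callback that receives html, since that variable does not exist outside it.

```javascript
// Hypothetical helper: returns every twitter.com URL found in an HTML string.
function extractTwitterUrls(html) {
  // The g flag makes match() return all matches instead of only the first one.
  // Simplified pattern (an assumption, not the original regex): optional
  // subdomains, then twitter.com, then an optional path that ends at the
  // first quote, whitespace character, or angle bracket.
  const regex = /https?:\/\/(?:[A-Za-z0-9-]+\.)*twitter\.com(?:\/[^\s"'<>]*)?/g;
  // match() returns null when nothing matches, so fall back to an empty array.
  return html.match(regex) || [];
}

// Hypothetical wrapper for use in the page: the extraction runs inside the
// promise chain, where the fetched HTML text is actually available.
function loadTwitterUrls() {
  return fetch('Directory.html')
    .then(function (response) { return response.text(); })
    .then(extractTwitterUrls);
}

// Small inline sample standing in for the contents of Directory.html.
const sample =
  '<td><a href="https://twitter.com/alice">a</a></td>' +
  '<td><a href="http://mobile.twitter.com/bob?lang=en">b</a></td>' +
  '<td><a href="https://example.com/">c</a></td>';
const sampleUrls = extractTwitterUrls(sample);
console.log(sampleUrls);
// → [ 'https://twitter.com/alice', 'http://mobile.twitter.com/bob?lang=en' ]
```

In the page, loadTwitterUrls().then(function (urls) { console.log(urls); }) would log the array once the file has loaded. As for the colon error: without the surrounding slashes the pattern is not a regex literal, so the parser most likely reads https?: as the start of a conditional (ternary) expression and trips over the colon.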