Trying to automate scraping by finding exact string matches
I am trying to scrape image links and their unique product numbers (SKUs) from jewellery websites hosted on platforms such as Shopify, WooCommerce and Magento. From one jewellery website to the next, the class names of the div tags change, but the way the links and the SKUs start and end is almost the same. So I have to find a match in the text of the whole HTML document, get its index position, move some positions past that index and grab the string. The problem I am facing is matching the exact string in the whole HTML webpage and getting its index position exactly.
This is the string I have to find a match for. The first set of characters and the last 3 characters are always the same, so I need to match on those. I am trying to extract this link: "//cdn.shopify.com/s/files/1/2237/1833/products/16_b903a3fc-5529-4937-91ef-98568f965182_490x@3x"
I need to find a match for the above string in the below code from the HTML webpage.
Or is there any way to automatically download all the images and their SKUs from the aforementioned jewellery websites? If so, do let me know!
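
For reference, here is a minimal Python sketch of the kind of matching I am attempting. It assumes the link always starts with "//cdn.shopify.com/s/files/" and ends with "@3x", and that it never contains quotes, whitespace or angle brackets. The names `find_links` and `find_first_link_by_index` and the `sample_html` fragment are made up for illustration; the real page markup will differ, and SKU extraction is not covered here.

```python
import re

# Assumed fixed start and end of the link, taken from the example string above.
PREFIX = "//cdn.shopify.com/s/files/"
SUFFIX = "@3x"

# Prefix, then the shortest run of characters that cannot leave an
# HTML attribute value (no quotes, whitespace or angle brackets), then suffix.
LINK_PATTERN = re.compile(re.escape(PREFIX) + r"[^\"'\s<>]*?" + re.escape(SUFFIX))


def find_links(html):
    """Return a list of (start_index, link) for every match in the HTML text."""
    return [(m.start(), m.group(0)) for m in LINK_PATTERN.finditer(html)]


def find_first_link_by_index(html):
    """Index-based variant: locate the prefix, then the suffix after it, and slice."""
    start = html.find(PREFIX)
    if start == -1:
        return None
    end = html.find(SUFFIX, start + len(PREFIX))
    if end == -1:
        return None
    return html[start:end + len(SUFFIX)]


# Hypothetical HTML fragment, invented for this sketch only.
sample_html = (
    '<div class="abc123"><img src="//cdn.shopify.com/s/files/1/2237/1833/products/'
    '16_b903a3fc-5529-4937-91ef-98568f965182_490x@3x"></div>'
)

if __name__ == "__main__":
    for index, link in find_links(sample_html):
        print(index, link)
```

The regular expression version returns every match together with its index position, while the `str.find` version mirrors the "find the index and move forward" idea but only returns the first link and does not check what lies between the prefix and the suffix.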