Kimono desktop source URLs to csv

Question

I have a list of multiple URLs for use with a Kimono desktop API I created, but for the life of me I can't figure out how to make it clear in the data output (csv) which rows of results come from which source URL.

Is there a way to pull in the source URL as another column to easily distinguish rows of data when there are 100+ URLs? Thanks!

score 0 · Answer 1 · answered Mar 25 '16 at 03:55

The scrape is based on the HTML and CSS within the generated source code, so unless the source contains a dependable value that explicitly states the URL (such as Wikipedia's link canonical tags), you are left with using the scrape index values.

If a scrape is unsuccessful for one page, it isn't skipped; a row with an index number is still created. The rows also come out in the order the page values were entered, so if you're using a predetermined list of URLs, you can number that list yourself and then match the two indexes against each other like IDs.
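The index matching described above could be done after export with a small script. This is only a rough sketch under assumptions: the column name `index`, the 1-based numbering, the `source_url` column, and the sample data are all hypothetical and need to be adjusted to whatever your exported csv actually contains.

```python
import csv
import io


def attach_source_urls(csv_text, urls, index_column="index", first_index=1):
    """Append a source_url column by matching each row's scrape index
    to its position in the URL list.

    index_column and first_index are assumptions about the export format.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames or []) + ["source_url"]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        position = int(row[index_column]) - first_index
        # Leave the cell empty if the index falls outside the URL list.
        row["source_url"] = urls[position] if 0 <= position < len(urls) else ""
        writer.writerow(row)
    return out.getvalue()


# Hypothetical data: the URL list in the order it was entered,
# and an export where the second page produced two rows.
urls = ["http://example.com/a", "http://example.com/b"]
scraped = "index,title\n1,First\n2,Second\n2,Third\n"

result = attach_source_urls(scraped, urls)
print(result)
```

For a real export, you would read the csv file and the URL list from disk instead of using inline strings. The lookup only works if the URL list is in exactly the same order as it was entered into the API.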

Otherwise, use a value on the page that you already know, such as an ID number, product number or any other identifying data, to confirm which content each row belongs to.
