Questions tagged [parsehub]

ParseHub is a data-extraction platform for static and dynamic websites. It offers free and paid tiers of service. Templates can select elements, extract data to CSV/JSON, and interact with elements on the page, including multi-page navigation. An API gives user projects access to the platform's capabilities.

28 questions
2
votes
2 answers

Is there a general selector for a specific nth-of-type and all nths after?

I am using a data scraper and I know just enough code to not know what I'm doing. The scraper is free and has a crawl limit. There is no option to resume the crawl from the end point. I need to scrape all the li elements on the page. The program will only let me…
2
votes
2 answers

Regex: Keep last word after text

Tommy Hilfiger Men Teal Blue Analogue Watch TH1791293_BBD: I need to split this and keep the last part, i.e. TH1791293_BBD. The issue is that the part before the target string, i.e. "Tommy Hilfiger Men Teal Blue Analogue Watch", can be of varying…
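One possible approach to the question above, sketched in Python rather than in ParseHub's own expression syntax: anchor the pattern at the end of the string and match the final run of non-whitespace characters. The helper name `last_token` is hypothetical, and the sketch assumes the target code never contains spaces.

```python
import re
from typing import Optional


def last_token(text: str) -> Optional[str]:
    """Return the last whitespace-delimited token of text, or None if there is none."""
    # \S+$ matches the final run of non-whitespace characters.
    match = re.search(r"\S+$", text.strip())
    return match.group(0) if match else None


title = "Tommy Hilfiger Men Teal Blue Analogue Watch TH1791293_BBD"
print(last_token(title))
```

Because the pattern is anchored with `$`, the length of the text before the code does not matter.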
2
votes
2 answers

Parsehub API PHP

How can I dump the results into a MySQL db? Specifically, decode the gzip, parse it into a PHP array, then dump it into a db.
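The decode, parse, and store steps asked about above can be sketched as follows. This is an illustration in Python, not PHP, and it uses an in-memory SQLite database as a stand-in for MySQL. The table name `items`, the JSON key `items`, and the fields `name` and `url` are all hypothetical; a real project's output has whatever keys its template defines.

```python
import gzip
import json
import sqlite3


def decode_payload(raw: bytes):
    """Decompress a gzip-encoded body and parse it as JSON."""
    return json.loads(gzip.decompress(raw).decode("utf-8"))


def store_rows(conn: sqlite3.Connection, rows) -> None:
    """Insert a list of dicts with 'name' and 'url' keys (hypothetical schema)."""
    conn.execute("CREATE TABLE IF NOT EXISTS items (name TEXT, url TEXT)")
    conn.executemany("INSERT INTO items (name, url) VALUES (:name, :url)", rows)
    conn.commit()


# Hypothetical sample standing in for a gzip-compressed API response body.
sample = gzip.compress(json.dumps({
    "items": [
        {"name": "A", "url": "https://example.com/a"},
        {"name": "B", "url": "https://example.com/b"},
    ]
}).encode("utf-8"))

data = decode_payload(sample)
conn = sqlite3.connect(":memory:")
store_rows(conn, data["items"])
row_count = conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
print(row_count)
```

The same three steps (gunzip, JSON decode, parameterized insert) carry over to PHP and MySQL; only the library calls differ.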
2
votes
0 answers

Parsehub: Extract data from the event url

I need to extract data from the URL that is provided as a link while parsing a template. I have created a template for a website and it is working fine. But for the link that is provided with each extracted object, I need to go to that particular…
Syed Asad Abbas Zaidi
  • 1,006
  • 1
  • 17
  • 32
2
votes
2 answers

ParseHub Webhook with RoR

Parsehub provides a webhook feature, but currently I'm testing my Rails app locally. How can I provide a webhook URL for a project on Parsehub that points to my local server or to a specific method in my controller? Parsehub Doc…
Syed Asad Abbas Zaidi
  • 1,006
  • 1
  • 17
  • 32
2
votes
1 answer

Parsehub main_template and renaming

So using the parsehub tool to experiment with data-scraping and wondering if there is a rule to keeping the main_template name which is automatically given to all projects. Is it possible to change it and what is the significance of the name and the…
Shawn Mehan
  • 4,513
  • 9
  • 31
  • 51
1
vote
5 answers

regex convert text to date

I'm using ParseHub to extract data. There is a date in the format "thursday 22 december 2022", but I need it as a date string in dd/mm/yyyy format. Is there a way to do that with a JavaScript regex? I can't find info about a way to solve this.
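The conversion asked about above can be sketched as a regex capture plus a month-name lookup. The question asks about JavaScript; this illustration is in Python, and the same pattern and lookup table translate directly. The names `MONTHS`, `DATE_RE`, and `to_ddmmyyyy` are hypothetical, and English month names are assumed.

```python
import re

# Map lowercase English month names to month numbers (1-12).
MONTHS = {
    name: number
    for number, name in enumerate(
        ["january", "february", "march", "april", "may", "june", "july",
         "august", "september", "october", "november", "december"],
        start=1,
    )
}

# Optional weekday word, then day, month name, and four-digit year.
DATE_RE = re.compile(r"(?:[a-z]+\s+)?(\d{1,2})\s+([a-z]+)\s+(\d{4})", re.IGNORECASE)


def to_ddmmyyyy(text: str) -> str:
    """Convert e.g. 'thursday 22 december 2022' to '22/12/2022'."""
    match = DATE_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognised date: {text!r}")
    day, month_name, year = match.groups()
    month = MONTHS.get(month_name.lower())
    if month is None:
        raise ValueError(f"unknown month: {month_name!r}")
    return f"{int(day):02d}/{month:02d}/{year}"


print(to_ddmmyyyy("thursday 22 december 2022"))
```

A regex alone cannot turn "december" into "12", which is why the sketch pairs the capture groups with a lookup table.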
1
vote
0 answers

Parsehub - extract list of URLS to a single column in spread sheet/json

When extracting data over multiple pages, each page's results are placed in a separate column. Saving as CSV/Excel for example it will look like this: urls urls urls urls page2_urls page2_urls page2_urls page3_urls page3_urls page4_urls and so…
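If the export cannot be changed at the template level, one workaround is to merge the per-page columns after downloading the results. The sketch below assumes the JSON export is a dict whose keys (`urls`, `page2_urls`, and so on) each map to a list of values. That shape is an assumption for illustration, and `merge_url_columns` is a hypothetical helper.

```python
def merge_url_columns(result: dict) -> list:
    """Concatenate every list stored under a key ending in 'urls', in key order."""
    merged = []
    for key, values in result.items():
        if key.endswith("urls"):
            merged.extend(values)
    return merged


# Hypothetical export with one column per page.
export = {
    "urls": ["https://example.com/1", "https://example.com/2"],
    "page2_urls": ["https://example.com/3"],
    "page3_urls": ["https://example.com/4"],
}
all_urls = merge_url_columns(export)
print(all_urls)
```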
1
vote
0 answers

Setting up Callback Function for ParseHub API

I am currently trying to set up ParseHub's API in order to feed data from the web scraping software into a MongoDB database. I am currently attempting to take the last ready scraping run and write it to a JSON file. However, I receive the following…
1
vote
1 answer

Parsehub website elements only display on a certain date

I'm using Parsehub to scrape certain data from certain pages on a website into a google spreadsheet. The issue I'm having is that a certain html element only displays on a certain date and I'm wondering if there is a way to set it up so Parsehub…
1
vote
1 answer

Parsehub: Pagination does not work on http://eservices.dubaitrade.ae

I'm trying to scrape data across paginated results, but pagination does not work on the Next button. Below you can see the code I applied, but it does not get the data from the following pages. Below is the URL; please…
0
votes
1 answer

Parsehub "This Stencil app is disabled for this browser."

I need to scrape some data from transfermarkt.com using ParseHub, but when I try to load the website with ParseHub I'm only met with: This Stencil app is disabled for this browser. Developers: ES5 builds are disabled during development to take…
0
votes
1 answer

How to scrape changing survey data of an ajax web survey

I have a site where I want to scrape the contents of a survey. The survey cannot be exported; it is a LimeSurvey service with a PHP backend. I tried using ParseHub but I am stuck. The survey uses an ajax next button which I can happily loop…
0
votes
1 answer

How to use an if control statement in a Jinja2 embedded html file

I'm using ParseHub to scrape a bunch of movie names and have a Python script export them to an HTML file, and that is working fine. However, I want to use an if statement to only print titles that have "The" in them. The structure is fine and it is…
0
votes
3 answers

How do I grab the string on the next line in HTML code following tag with specific class and specific text?

I'm trying to scrape some product specifications from an e-commerce website. I have a list of URLs to various products, and I need my code to go to each one (this part is easy) and scrape the product specs I need. I have been trying to use…
hkm
  • 342
  • 1
  • 2
  • 10