I previously took a short 3-month Java course at university (the only project was coding a Sudoku game).

I'd like to learn the programming language that is most popular for general automation tasks. So far, I've picked up AHK, and it has helped me with a lot of text expansion, app shortcuts and more.

I would like to advance further and accomplish something like the following:

1. Go to this site: https://carousell.com/search/products/?query=12-35mm

2. Scrape all data that contains "Panasonic" and "12-35mm". (Will it be difficult to scrape every page of these results?)

3. Grab the price for the price column. If the price is misleading (e.g. $1 or below $X 4 value), search for a price value ("$) within the item description.

4. Tabulate the results in Excel.

5. Compare the latest result to the average price.

6. If the latest price is lower than the average price, alert me via email.

Most of my automation projects will be something like that. What would be the best programming language, and which are the paid tutorials that can guide me to do exactly that?

I have narrowed it down to Import.IO and Python, but I might be wrong.

This course seems useful but I'm not sure whether it will teach me ALL that I need to complete this personal project.

https://www.udemy.com/automate/

Please advise, thanks!

Dave2e
  • The items in the search results have the same html structure. This simplifies the scraping process. I am sure that you can accomplish the scraping, the price comparison and the automatic mail using R or Python. It seems that the course will cover the most important part which is the scraping. – R. Schifini Feb 04 '17 at 07:10
  • I'm not sure it's allowed under their Terms of Service. They ban a form of automated scraping and it's likely the intent of it is to ban all scraping. Tread carefully since LinkedIn sued folks last year for scraping. Reading the ToS/T&C should be the first thing you do when deciding to scrape something. – hrbrmstr Feb 04 '17 at 16:30
  • @hrbrmstr this depends on where you live. Some countries have laws which supersede LinkedIn's ToS/T&C. For the actual programming, however, it will be difficult to adapt to every change on the site. – Roy Holzem Feb 10 '17 at 08:03
  • Fine, but ethics > laws. Just because you can do something doesn't mean you should @Rizzit. You may not have ethics, but many folks do. – hrbrmstr Feb 10 '17 at 14:21

1 Answer

Python is great for this kind of web scraping and data processing. You will need several modules for the job:

Get the page via HTTP(S): Since the page you want to scrape does not rely on JavaScript to output the information you need, the great requests library should be enough.
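A minimal sketch of the download step with requests. The helper names `search_url` and `fetch_search_page`, the User-Agent string and the 10-second timeout are assumptions made for illustration; only the search URL comes from the question. Check the site's terms before running it against the real page.

```python
import requests

# Search endpoint taken from the question; the query goes in as a URL parameter.
SEARCH_URL = "https://carousell.com/search/products/"


def search_url(query):
    """Return the full URL that would be requested, without touching the network."""
    prepared = requests.Request("GET", SEARCH_URL, params={"query": query}).prepare()
    return prepared.url


def fetch_search_page(query, timeout=10):
    """Download the search results page and return its HTML as text."""
    response = requests.get(
        SEARCH_URL,
        params={"query": query},
        # Hypothetical User-Agent; identify your script however you see fit.
        headers={"User-Agent": "price-watch-script (personal use)"},
        timeout=timeout,
    )
    # Raise an exception on 4xx/5xx instead of silently parsing an error page.
    response.raise_for_status()
    return response.text
```

Calling `fetch_search_page("12-35mm")` would return the HTML of the first results page, which can then be handed to a parser.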

Parse the HTML and extract information: There are many choices here; my personal favorite is BeautifulSoup. If you want to dig deeper, there is a question about this.
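A small sketch of the parsing step with BeautifulSoup. The sample HTML and its class names (`listing`, `title`, `price`) are made up for illustration; the real page will use different markup, so inspect it in your browser and adjust the selectors.

```python
from bs4 import BeautifulSoup

# Made-up markup standing in for the real search results page.
SAMPLE_HTML = """
<div class="listing"><p class="title">Panasonic 12-35mm f/2.8</p><p class="price">$650</p></div>
<div class="listing"><p class="title">Olympus 12-40mm f/2.8</p><p class="price">$700</p></div>
<div class="listing"><p class="title">Panasonic Lumix 12-35mm II</p><p class="price">$1</p></div>
"""


def extract_listings(html, keywords=("Panasonic", "12-35mm")):
    """Return title/price dicts for listings whose title contains every keyword."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for card in soup.select("div.listing"):
        title = card.select_one("p.title").get_text(strip=True)
        price = card.select_one("p.price").get_text(strip=True)
        # Case-insensitive match on all keywords.
        if all(keyword.lower() in title.lower() for keyword in keywords):
            results.append({"title": title, "price": price})
    return results
```

With the sample above, `extract_listings(SAMPLE_HTML)` keeps the two Panasonic listings and drops the Olympus one. The `$1` listing is still included, so misleading prices need a separate check.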

Save the results into Excel: Again, several modules will do the job; my favorite is openpyxl. If you don't need to save very large files, this should be just fine.
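A minimal sketch of writing results to an .xlsx file with openpyxl and reading the prices back. The sheet name, column layout and function names are assumptions for illustration.

```python
from openpyxl import Workbook, load_workbook

SHEET_NAME = "Listings"  # hypothetical sheet name


def save_listings(rows, path):
    """Write (title, price) pairs to a new workbook with a header row."""
    workbook = Workbook()
    sheet = workbook.active
    sheet.title = SHEET_NAME
    sheet.append(["Title", "Price"])
    for title, price in rows:
        sheet.append([title, price])
    workbook.save(path)


def read_prices(path):
    """Return the values of the Price column, skipping the header row."""
    sheet = load_workbook(path)[SHEET_NAME]
    return [row[1] for row in sheet.iter_rows(min_row=2, values_only=True)]
```

For example, `save_listings([("Panasonic 12-35mm", 650.0)], "prices.xlsx")` creates the file, and `read_prices("prices.xlsx")` returns the stored prices for later comparison.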

Price comparison: You can compare the prices directly in your script, with the Excel sheet working as "a database".
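The comparison itself needs only the standard library. A sketch, assuming the earlier prices have already been read into a list of numbers; the function name `is_below_average` is made up for illustration.

```python
from statistics import mean


def is_below_average(latest_price, previous_prices):
    """Return True when the latest price is strictly below the average of earlier prices."""
    if not previous_prices:
        # Nothing to compare against yet, so never trigger an alert.
        return False
    return latest_price < mean(previous_prices)
```

For example, with earlier prices of 650, 700 and 675 the average is 675, so a new listing at 600 would qualify and one at 700 would not.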

Emailing: You can send emails with the Python standard library. This is a fine tutorial on how to do it.
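A sketch of the alert email using only `smtplib` and `email.message` from the standard library. The function names, subject wording and the STARTTLS login flow are assumptions; your mail provider may require different settings (for example SSL on connect or an app-specific password).

```python
import smtplib
from email.message import EmailMessage


def build_alert(latest_price, average_price, sender, recipient):
    """Create the alert message; nothing is sent here."""
    message = EmailMessage()
    message["Subject"] = (
        f"Price alert: ${latest_price:.2f} is below average ${average_price:.2f}"
    )
    message["From"] = sender
    message["To"] = recipient
    message.set_content(
        f"The latest listing costs ${latest_price:.2f}, "
        f"while the average so far is ${average_price:.2f}."
    )
    return message


def send_alert(message, host, port, user, password):
    """Send the message through an SMTP server that supports STARTTLS."""
    with smtplib.SMTP(host, port) as server:
        server.starttls()
        server.login(user, password)
        server.send_message(message)
```

Building the message separately from sending it lets you check the subject and body before any mail server is involved.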


Now try to write the script, and if you need a hint on any particular phase, please come back with the specific code that is not working for that part, and then we can help you.

petr
  • Is there 1 particular paid tutorial that can teach me Python from ground zero? I'll probably need to learn from scratch, and even the linked tutorials are a bit too complex for my current level. Are the Udemy tutorials good enough for this purpose? – Automator_Junkie Feb 11 '17 at 11:53