How can I use Talend to extract data from websites such as the ones below, so that I can do some data analysis?

Airbnb, change.org, monster.com, eBay

I am new to Talend Open Studio (TOS) and not familiar with its internet components. I think I am confused about which connectors to use (tREST, tSOAP, ...). If anyone could help me understand which kinds of connectors are needed, that would be great.

Blue
  • What do you mean by "extract"? Parsing a web page, consuming a web service, or something else? – 54l3d Jun 11 '15 at 09:11
  • @54l3d I mean consuming a web service. – Blue Jun 11 '15 at 09:29
  • You have to know the difference between [REST and SOAP](http://stackoverflow.com/questions/2131965/main-differences-between-soap-and-restful-web-services-in-java) web services, then check the API docs of the target site to find out which type of web service it offers. After that you will be able to choose the right Talend connector. – 54l3d Jun 11 '15 at 10:58
  • @54l3d thanks for the link. I understand REST and SOAP are two architectures. But how can I tell which architecture an API uses if it is not specified? For example, the change.org documentation does not mention the API architecture => https://github.com/change/api_docs/blob/master/v1/documentation/index.md. Is there a way to tell? – Blue Jun 12 '15 at 05:33
  • As far as I know, if you are talking about JSON responses, then it's a REST web service; if XML, then it's SOAP. For change.org it's a REST web service. – 54l3d Jun 12 '15 at 08:23
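
The rule of thumb from the comments can be sketched as a small check on a response body. This is only a heuristic written for illustration: a REST service can also return XML, so the sketch looks for a SOAP `Envelope` root element rather than treating all XML as SOAP. The function name `guess_service_style` and its return labels are made up for this example.

```python
import json
import xml.etree.ElementTree as ET

# Namespaces of the SOAP envelope element.
SOAP_NAMESPACES = {
    "http://schemas.xmlsoap.org/soap/envelope/",  # SOAP 1.1
    "http://www.w3.org/2003/05/soap-envelope",    # SOAP 1.2
}


def guess_service_style(body):
    """Guess the service style from a response body (hypothetical helper)."""
    # A body that parses as JSON almost certainly comes from a REST-style API.
    try:
        json.loads(body)
        return "json (likely REST)"
    except ValueError:
        pass

    # Otherwise try XML; anything that is neither JSON nor XML is left undecided.
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return "unknown"

    # ElementTree represents namespaced tags as "{namespace}localname".
    if root.tag.startswith("{"):
        namespace, local_name = root.tag[1:].split("}", 1)
        if local_name == "Envelope" and namespace in SOAP_NAMESPACES:
            return "soap"
    return "xml (REST or other)"


print(guess_service_style('{"petition_id": 1}'))
```

When the API documentation is available, it remains the more reliable source; a check like this only helps when the docs are silent.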

1 Answer

You can use the following architecture:

tREST --> tExtractJSONFields (for JSON responses) or tExtractXMLField (for XML responses), depending on your requirement.
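
To make the two steps concrete outside Talend, here is a minimal Python sketch of the same flow. `fetch_body` stands in for tREST with the GET method, and `extract_fields` stands in for the JSON extraction step. Both function names, the `results` key, and the sample payload are assumptions made up for illustration; they are not taken from any of the sites' APIs.

```python
import json
import urllib.request


def fetch_body(url):
    """Rough stand-in for tREST (GET): return the response body as text."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


def extract_fields(body, loop_key, fields):
    """Rough stand-in for the JSON extraction step: loop over one array
    in the payload and keep only the selected fields of each record."""
    payload = json.loads(body)
    rows = []
    for record in payload.get(loop_key, []):
        # Missing fields become None instead of raising an error.
        rows.append({name: record.get(name) for name in fields})
    return rows


# Hypothetical response body, standing in for what fetch_body() would return.
sample_body = (
    '{"results": ['
    '{"id": 1, "title": "First", "extra": "x"}, '
    '{"id": 2, "title": "Second"}]}'
)

rows = extract_fields(sample_body, "results", ["id", "title"])
print(rows)
```

In a real job the loop key and field list correspond to what you would configure in the extraction component's mapping, and most of these APIs also require an API key passed as a query parameter or header.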