
I am using Jsoup in my app to scrape data from a site. Everything was fine until I reached the 'download' part of the app. It would be easy if the download link were in the href value, but this site uses a JavaScript function instead.

Here's how the site is laid out:

This is the link to the file:

<a href="javascript:download(11848,'d915f46123');">Ai Ai Ai ni Utarete Bye Bye Bye</a>

Below is the JavaScript download function. It accepts a song ID and a key, builds a path string from those arguments, sets it as the form's action attribute, and calls the form's submit method:

function download(songId, key) {
    var form = document.getElementById('dlForm');
    form.action = '/download/zephzeph/' + key + '/' + songId + '.mp3';
    form.submit();
}

Below is the form:

<form id="dlForm" action="/amusic/download.php" method="POST"></form>
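Since the function above only concatenates its two arguments into a path, the same path can in principle be rebuilt from the href text without executing any JavaScript. Here is a minimal sketch, assuming the href always has the `javascript:download(id,'key');` shape shown above. The class name `DownloadLinkParser` and the method `buildDownloadPath` are hypothetical names for illustration, and the `zephzeph` segment is copied verbatim from the posted function (it may well differ per user or session).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DownloadLinkParser {
    // Matches href values such as: javascript:download(11848,'d915f46123');
    private static final Pattern DOWNLOAD_CALL =
            Pattern.compile("javascript:download\\((\\d+),\\s*'([^']+)'\\)");

    /**
     * Rebuilds the path that the page's download() function assigns to form.action.
     * Returns null when the href is not a download(...) call.
     */
    static String buildDownloadPath(String href) {
        Matcher m = DOWNLOAD_CALL.matcher(href);
        if (!m.find()) {
            return null;
        }
        String songId = m.group(1); // first argument: numeric song id
        String key = m.group(2);    // second argument: quoted key
        // Same concatenation as the JavaScript; "zephzeph" is taken as-is from the question.
        return "/download/zephzeph/" + key + "/" + songId + ".mp3";
    }

    public static void main(String[] args) {
        String href = "javascript:download(11848,'d915f46123');";
        System.out.println(buildDownloadPath(href));
    }
}
```

With Jsoup, the href would come from `element.attr("href")`, and the resulting path (prefixed with the site's host) could then be requested as a POST, for example via `Jsoup.connect(url).cookies(loginCookies).method(Connection.Method.POST).ignoreContentType(true).execute().bodyAsBytes()`. That part is untested against this site and assumes the login cookies from an earlier request are available as `loginCookies`.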

2 Answers


Hopefully I understand your question correctly; if not, please elaborate.

I would try right-clicking the download link and opening it in a new tab. The new link is what you need to emulate in your scraper.

My experience with net scraping is very limited, but I would be more than happy to help you find a solution. :)

Sipty
  • Yep, I already tried that, but it only gives me the function call, not the link: `javascript:download(11848,'d915f46123');` – Vladymyr Vi Britannia Feb 10 '15 at 15:49
  • Could you share the website in question? I will try to hack something together. – Sipty Feb 10 '15 at 15:53
  • Sure, though you need to create an account in order to generate the href value; otherwise it will just generate a link to the login/register page. The site is gendou.com/amusic/ – Vladymyr Vi Britannia Feb 10 '15 at 15:54
  • By the way, you can download the file by clicking the name of the song. – Vladymyr Vi Britannia Feb 10 '15 at 15:58
  • It's hard for me to get the context of your last comment. – Sipty Feb 10 '15 at 15:59
  • I apologize for not being clear. You know how the data is presented in a table? You need to click on the name of the song under the 'Song(Link)' column to download it. – Vladymyr Vi Britannia Feb 10 '15 at 16:03
  • Could you try this answer, please? http://stackoverflow.com/a/8645576/2796939 I used to use XMLHttpRequest for image scraping, maybe the same methodology can be applied to your problem as well? Especially since the link to the php script is exposed. – Sipty Feb 10 '15 at 16:09
  • I'm not quite sure about the linked thread, though I found a similar scenario: http://stackoverflow.com/questions/17621778/htmlunit-to-invoke-javascript-from-href-to-download-a-file But I am wondering how to implement this on Android, since it uses the third-party HtmlUnit library. – Vladymyr Vi Britannia Feb 11 '15 at 01:24
  • I, personally, am not a big proponent of third party libraries, especially in Android. – Sipty Feb 11 '15 at 09:13

It looks like the only way I can achieve this is to use a WebView (see http://android-er.blogspot.com/2011/10/call-javascript-inside-webview-from.html?m=1). I'll try it and see if it works.