
I want to get the HTML source of a website. In this case it's a TikTok page.

My code looks like this:

    try {
        URL url = new URL("https://www.tiktok.com/@petyyyyyy?lang=de");
        BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));

        String inputLine;

        while ((inputLine = in.readLine()) != null) {
            System.out.println(inputLine);
        }

    } catch (MalformedURLException me) {
        System.out.println(me);
    } catch (IOException ioe) {
        System.out.println(ioe);
    }

This works perfectly for any normal website. The problem is that it doesn't work with this TikTok page: nothing is printed to the console.

I think it's because there is an @ in the URL and Java has a problem with it.

Any help appreciated

WhySoToxic
  • Try encoding it, i.e. replace the '@' in the URL with '%40'. – cry0genic Aug 21 '20 at 18:47
  • Unfortunately this didn't help. It still returns nothing – WhySoToxic Aug 21 '20 at 18:54
  • "Doesn't work" is not a problem we can debug. _Most likely_, this is a JavaScript-based site. – chrylis -cautiouslyoptimistic- Aug 21 '20 at 19:02
  • Okay, I'll try to describe it better. All I know is that it returns nothing: the console output is empty. Let me know what other information I can provide. Also, how does the site being based on JavaScript affect this? Could you explain? – WhySoToxic Aug 21 '20 at 19:04
  • `URL url = new URL("https://www.tiktok.com/notfound");` For this URL it's working fine. – Sagar Gangwal Aug 21 '20 at 19:04
  • The site can look at your user agent, and if it sees it's not a recognized modern browser, it can send back an empty response, probably to stop crawlers. If you're not getting an exception, then this is likely the problem and there's not really anything you can do. There is a big incentive for TikTok and other social media sites to stop crawlers, and you probably won't be able to bypass this. – cry0genic Aug 21 '20 at 19:07
  • You can use a `URLConnection` and [set the user agent](https://stackoverflow.com/questions/2529682/setting-user-agent-of-a-java-urlconnection) to whatever string your browser uses. But that would only give you the page source (basic HTML) - which probably does not contain anything you want/need/expect (because the page relies heavily on JavaScript to load the "interesting" stuff). – andrewJames Aug 21 '20 at 19:25
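
The '@' hypothesis from the question can be checked offline, without contacting the site. The following is a minimal sketch, assuming `java.net.URI` is used for parsing; the class name `AtSignCheck` and the helper `describe` are made up for illustration. It prints how the original and the percent-encoded address are split into components, which shows whether the '@' ends up in the path or is mistaken for a user-info separator.

```java
import java.net.URI;

public class AtSignCheck {

    // Hypothetical helper: parses the address (no connection is opened)
    // and reports the components that matter for the '@' question.
    static String describe(String address) {
        URI uri = URI.create(address);
        return "host=" + uri.getHost()
                + " userInfo=" + uri.getUserInfo()
                + " rawPath=" + uri.getRawPath()
                + " path=" + uri.getPath();
    }

    public static void main(String[] args) {
        // Address exactly as written in the question.
        System.out.println(describe("https://www.tiktok.com/@petyyyyyy?lang=de"));
        // Same address with the '@' percent-encoded as suggested in the comments.
        System.out.println(describe("https://www.tiktok.com/%40petyyyyyy?lang=de"));
    }
}
```

Both lines report the same host, no user info, and the same decoded path, so the '@' by itself does not appear to be what Java stumbles over; that is consistent with the encoding suggestion not changing the result.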

2 Answers


You can try something like this, which percent-encodes the '@' and sends a browser-like User-Agent header:

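    // Needs: java.io.BufferedReader, java.io.IOException, java.io.InputStreamReader,
    //        java.net.HttpURLConnection, java.net.MalformedURLException, java.net.URL
    try {
        // '@' percent-encoded as %40, as suggested in the comments
        URL url = new URL("https://www.tiktok.com/%40petyyyyyy?lang=de");
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        // Send a browser-like User-Agent instead of Java's default one
        connection.setRequestProperty("User-Agent", "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:79.0) Gecko/20100101 Firefox/79.0");

        // try-with-resources closes the reader even if reading fails
        try (BufferedReader in = new BufferedReader(new InputStreamReader(connection.getInputStream()))) {
            String inputLine;

            while ((inputLine = in.readLine()) != null) {
                System.out.println(inputLine);
            }
        }

    } catch (MalformedURLException me) {
        System.out.println(me);
    } catch (IOException ioe) {
        System.out.println(ioe);
    }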
0

It looks like TikTok is blocking requests to the user page.
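
One way to tell a blocked request from an empty page is to look at the HTTP status code and the size of the body rather than only printing the lines. The following is a rough sketch, not a confirmed description of how the site behaves: the class name `BlockCheck` and the helpers `verdict` and `probe` are made up for illustration, and the User-Agent string is simply the one from the other answer.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class BlockCheck {

    // Hypothetical helper: turns a status code and a body size into a rough verdict.
    static String verdict(int status, long bodyBytes) {
        if (status < 200) return "unexpected status (HTTP " + status + ")";
        if (status >= 400) return "rejected (HTTP " + status + ")";
        if (status >= 300) return "redirected (HTTP " + status + ")";
        if (bodyBytes == 0) return "accepted but empty body (HTTP " + status + ")";
        return "accepted, " + bodyBytes + " bytes (HTTP " + status + ")";
    }

    // Hypothetical helper: requests the address with the given User-Agent
    // and counts the bytes of whatever body comes back.
    static String probe(String address, String userAgent) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) new URL(address).openConnection();
        connection.setRequestProperty("User-Agent", userAgent);
        connection.setConnectTimeout(10_000);
        connection.setReadTimeout(10_000);
        try {
            int status = connection.getResponseCode();
            // For 4xx/5xx getInputStream() throws, so read the error stream instead.
            InputStream body = status >= 400 ? connection.getErrorStream() : connection.getInputStream();
            long bytes = 0;
            if (body != null) {
                try (InputStream in = body) {
                    byte[] buffer = new byte[8192];
                    int n;
                    while ((n = in.read(buffer)) != -1) {
                        bytes += n;
                    }
                }
            }
            return verdict(status, bytes);
        } finally {
            connection.disconnect();
        }
    }

    public static void main(String[] args) {
        String address = args.length > 0 ? args[0] : "https://www.tiktok.com/@petyyyyyy?lang=de";
        String userAgent = "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:79.0) Gecko/20100101 Firefox/79.0";
        try {
            System.out.println(probe(address, userAgent));
        } catch (IOException ioe) {
            System.out.println("request failed: " + ioe);
        }
    }
}
```

Running it once with the browser-like User-Agent and once with a different string makes it easier to see whether the header changes the outcome at all.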

Workaround:

In my case, I just wanted to get the follower amount from the site.

I simply use another website instead, e.g. a TikTok follower counter.

WhySoToxic