
I need to get the HTML markup of a YouTube video page. Here is the code:

async function get_subtitles(video_id: string): Promise<string> {
    // Reject full URLs: only the bare video id is expected here.
    if (video_id.includes('https://') || video_id.includes('http://')) {
        throw new Error("You provided an invalid video id. Make sure you are using the video id and NOT the url!")
    }

    const WATCH_LINK: string = `https://www.youtube.com/watch?v=${video_id}`

    // In the browser, this cross-origin request is where the error is raised.
    const ytRES = await fetch(WATCH_LINK);
    console.log(ytRES)
    const ytHTML = await ytRES.text();

    //extract_captions_json(ytHTML, video_id);
    return ytHTML;
}

When I try to load the page, I get this error:

Cross-Origin error

The front end is written in React, in case that helps.

I tried custom headers, axios, and so on. Nothing works.

wineT
  • Does this answer your question? [No 'Access-Control-Allow-Origin' header is present on the requested resource—when trying to get data from a REST API](https://stackoverflow.com/questions/43871637/no-access-control-allow-origin-header-is-present-on-the-requested-resource-whe) – jub0bs Jun 11 '23 at 09:01

2 Answers

1

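Why you get a CORS error with React:
Your React app is most probably set up for client-side rendering (the default for CRA).
In that case, the browser loads your app from http://yourhome.com, and the app then asks it to fetch data from https://youtube.com.
The browser sees that the two URLs have different origins (a different scheme, host, or port; even a subdomain counts as a different host). Because YouTube's response does not include an Access-Control-Allow-Origin header that permits your origin, the browser blocks your code from reading the response.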
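To make the comparison concrete, here is a minimal TypeScript sketch of what "same origin" means. Note that isSameOrigin is a hypothetical helper written for illustration, not a browser API; it only mirrors the scheme, host, and port comparison that the browser performs, and http://yourhome.com is the placeholder app URL from the text above.

```typescript
// Two URLs share an origin only if scheme, host, and port all match.
// isSameOrigin is a hypothetical helper for illustration only.
function isSameOrigin(a: string, b: string): boolean {
  return new URL(a).origin === new URL(b).origin;
}

// Same scheme, host, and port: same origin.
console.log(isSameOrigin("http://yourhome.com/app", "http://yourhome.com/other")); // true

// Different host: cross-origin, so the response needs CORS headers.
console.log(isSameOrigin("http://yourhome.com", "https://www.youtube.com/watch")); // false

// A subdomain is a different host, so it is also a different origin.
console.log(isSameOrigin("https://youtube.com", "https://www.youtube.com")); // false

// A different scheme alone is enough to change the origin.
console.log(isSameOrigin("http://yourhome.com", "https://yourhome.com")); // false
```

Any pair that yields false here is subject to the cross-origin restriction unless the target server explicitly allows the requesting origin.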
What you can do to deal with it:

  • set up a proxy that maps your youtube.com requests to a path on your own origin, such as yourhome/youtube.com
    • with a proxy setting in the config.json in your project root, see the related answer (it can proxy either directly to YouTube or to a custom server, an Express server for example); with Create React App, the equivalent setting is the "proxy" field in package.json
    • with a reverse proxy server
  • use server-side rendering (probably; I didn't test it), with Next.js for example
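As a rough sketch of the proxy option on the client side: once a proxy forwards some same-origin path to https://www.youtube.com, the React code requests that path instead of YouTube directly. The /youtube prefix and both function names below are assumptions made for illustration; the real prefix depends on how the proxy is configured.

```typescript
// Hypothetical same-origin prefix that the proxy forwards to https://www.youtube.com.
// The actual value depends on your proxy configuration.
const PROXY_PREFIX = "/youtube";

// Build the same-origin URL for a watch page (hypothetical helper).
function buildProxiedWatchUrl(videoId: string): string {
  return `${PROXY_PREFIX}/watch?v=${encodeURIComponent(videoId)}`;
}

// Fetch the watch page HTML through the proxy. Intended to run in the browser,
// where the relative URL resolves against the app's own origin.
async function fetchWatchPageHtml(videoId: string): Promise<string> {
  const res = await fetch(buildProxiedWatchUrl(videoId));
  if (!res.ok) {
    throw new Error(`Proxy request failed with status ${res.status}`);
  }
  return res.text();
}
```

Because the request now targets the app's own origin, the browser does not apply the cross-origin check; the proxy makes the actual request to YouTube server-side.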
safir
  • Maybe there is a workaround? I have seen the proxy approach before, but I thought it was overkill. I just need to get the HTML markup on the client, and that's it. Maybe there is something like requests from Python? – wineT Jun 11 '23 at 13:44
  • Creating a proxy setting in your config.json is probably the simplest way to tackle this issue. Any other way I can think of would need either a server service running or prefetching the data into a file added to your React app (but that only applies to predictable or common data). – safir Jun 11 '23 at 22:34
0

You can use a separate server for this. For example, a Flask implementation:

from flask import Flask, jsonify
from youtube_transcript_api import YouTubeTranscriptApi
from youtube_transcript_api.formatters import TextFormatter

app = Flask(__name__)

@app.route("/api/subtitles/<video_id>")
def get_subtitles(video_id: str):
    # Flask passes URL path parameters as strings, so this is only a safety check.
    if not isinstance(video_id, str):
        return jsonify({
            "status": "error",
            "reason": "Video ID is not a string value o_0"
        })

    try:
        # Language codes are tried in order: Russian first, then English.
        transcription = YouTubeTranscriptApi.get_transcript(video_id, languages=("ru", "en"))
        # Flatten the transcript entries into plain text.
        transcription = TextFormatter().format_transcript(transcription)

    except Exception as e:
        # Log the failure and report it to the client in the JSON body.
        print(e)
        return jsonify({
            "status": "error",
            "reason": str(e)
        })

    return jsonify({
        "status": "success",
        "data": transcription
    })
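On the React side, a client for this endpoint could look roughly like the sketch below. The type and function names are hypothetical, and it assumes the Flask app is reachable on the same origin as the front end (for example through a dev-server proxy); if it runs on a different origin, the Flask server itself must send CORS headers, or the same error comes back.

```typescript
// Shape of the JSON body returned by the Flask route above.
type SubtitlesResponse =
  | { status: "success"; data: string }
  | { status: "error"; reason: string };

// Turn the JSON body into plain text, or throw with the server's reason.
function unwrapSubtitles(body: SubtitlesResponse): string {
  if (body.status === "error") {
    throw new Error(`Subtitle server error: ${body.reason}`);
  }
  return body.data;
}

// Assumes /api/subtitles/... is served from the same origin as the React app.
async function getSubtitles(videoId: string): Promise<string> {
  const res = await fetch(`/api/subtitles/${encodeURIComponent(videoId)}`);
  const body = (await res.json()) as SubtitlesResponse;
  return unwrapSubtitles(body);
}
```

Keeping the unwrapping separate from the network call makes the error handling easy to check without a running server.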
wineT