
So I have working code that retrieves the data fields I want. However, I have run into an issue where the code returns seemingly random videos within my limited results.

I would just scrape all video objects, but for my channels of interest this would exceed the API's daily quota of 10,000 units. My best solution would be to pick a specific date and then retrieve all videos dated from then until now. I have briefly looked into this and seen that there are apparently long-standing issues with ordering these results on the API backend, and I am not sure how to implement ordering in the code I am working with. Is this possible, and how should I do it? Thanks

EDIT: Actually, I just realized that this code block is irrelevant to the issue, since it is the prior code block that parses through videos to retrieve video IDs. This is the code block where I need to implement the ordering, although I am unsure how to add the relevant parameters to the GET request:

import json
import requests

limit = 5
video_Ids = []
nextPageToken = ""  # empty for the first request

for i in range(limit):
    url = (
        f"https://www.googleapis.com/youtube/v3/search?key={api_key}"
        f"&part=snippet&channelId={channel_Id}&maxResults=50&pageToken={nextPageToken}"
    )
    data = json.loads(requests.get(url).text)
    for item in data['items']:
        if 'videoId' in item['id']:
            video_Ids.append(item['id']['videoId'])

    # the last page has no nextPageToken, so stop instead of raising a KeyError
    nextPageToken = data.get('nextPageToken')
    if nextPageToken is None:
        break

EDIT2: I've just been adding `&publishedAfter=` and `&order=` to the URL, and it appears to be a limited success. Unfortunately it orders in reverse chronological order, and the documentation doesn't say how to reverse it.

It could be fine, if I can figure out how to make this run without the limitations and collect all data objects within the temporal parameters.
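One way to sketch the approach from EDIT2: since `search.list` with `order=date` only returns newest-first and the API offers no ascending option, you can collect the pages as they come and reverse the list client-side. This is a minimal sketch, assuming `api_key` and `channel_Id` are already defined; the helper name and `max_pages` cap are hypothetical:

```python
import requests

def fetch_video_ids_since(api_key, channel_id, published_after, max_pages=20):
    """Collect videoIds for a channel from `published_after` until now.

    `published_after` must be an RFC 3339 timestamp, e.g. "2022-01-01T00:00:00Z".
    order=date returns newest-first, so the list is reversed at the end to
    yield chronological order. Hypothetical helper; names are placeholders.
    """
    video_ids = []
    page_token = ""
    for _ in range(max_pages):
        params = {
            "key": api_key,
            "part": "snippet",
            "channelId": channel_id,
            "maxResults": 50,
            "order": "date",
            "publishedAfter": published_after,
            "type": "video",  # skip playlist/channel results
            "pageToken": page_token,
        }
        data = requests.get(
            "https://www.googleapis.com/youtube/v3/search", params=params
        ).json()
        for item in data.get("items", []):
            if "videoId" in item["id"]:
                video_ids.append(item["id"]["videoId"])
        page_token = data.get("nextPageToken")
        if not page_token:
            break
    video_ids.reverse()  # oldest-first
    return video_ids
```

Passing the query string via `params` also lets `requests` handle URL encoding of the timestamp's colons, which string formatting does not.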

  • Have you tried YouTube Data API v3 [Search: list](https://developers.google.com/youtube/v3/docs/search/list) endpoint with filters [`publishedBefore`](https://developers.google.com/youtube/v3/docs/search/list#publishedBefore), [`publishedAfter`](https://developers.google.com/youtube/v3/docs/search/list#publishedAfter) and [`channelId`](https://developers.google.com/youtube/v3/docs/search/list#channelId)? Note that this endpoint usage costs 100 of quota. – Benjamin Loison Oct 01 '22 at 11:36
  • No I haven't, and I don't know how to implement that into my current code. I want what my code is currently doing, but in order. – BeaverFever Oct 02 '22 at 05:12
  • Maybe I misunderstood and you are just looking for [`order=date`](https://developers.google.com/youtube/v3/docs/search/list#order). Otherwise, as far as I understand, [this alternative](https://stackoverflow.com/a/27872244/7123660) would also help you while using far less quota. – Benjamin Loison Oct 02 '22 at 10:37
  • Check my edits for my updated status. My issue now is that, using tokens, I can apparently only retrieve 500 results at a time via submitting videoIDs for video data in the API. Some channels would have 1000+ results in my time period of interest. I have a way to just rip all video IDs from a channel, BUT, API token only gives me 10,000 quota and some channels will exceed this. hmmmm... – BeaverFever Oct 02 '22 at 11:41
  • Then you can request [a quota increase](https://developers.google.com/youtube/v3/getting-started#quota) or use [my no-key service](https://stackoverflow.com/a/73792461/7123660) at https://yt.lemnoslife.com. – Benjamin Loison Oct 02 '22 at 11:48
  • No point in requesting a quota increase for a single project. The 500-result token issue was for extracting video IDs, but I found a third-party package that does all that for me. Processing these video IDs into video data still has the 10,000 daily quota via the YT API, so the workaround is to break the video list into smaller chunks and process it piecemeal over a few days. Not an efficient solution, but it works for a smaller project. – BeaverFever Oct 04 '22 at 13:23
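The piecemeal workaround from the last comment can be sketched as follows. This is an illustrative helper (names are placeholders, not from the original post); it assumes the IDs are later passed to `videos.list`, which accepts up to 50 comma-separated IDs per call, and that unprocessed batches are saved to disk so the run can resume after the daily quota resets:

```python
import json

def chunk(ids, size=50):
    """Split a list of video IDs into batches of at most `size`
    (videos.list accepts up to 50 comma-separated IDs per request)."""
    return [ids[i:i + size] for i in range(0, len(ids), size)]

def save_remaining(ids, path="remaining_ids.json"):
    """Persist unprocessed IDs so the job can resume the next day,
    after the quota resets. Hypothetical helper for illustration."""
    with open(path, "w") as f:
        json.dump(ids, f)
```

Each `videos.list` call then receives `id=",".join(batch)` for one batch at a time, spreading the quota cost over multiple days.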

0 Answers