I am doing a mini-project that applies clustering/retrieval ML methods to YouTube video data, with the goal of returning the specific videos I want from a large dataset of videos.
I am having trouble figuring out how to get the dataset of YouTube videos in the first place. The end goal is something like:
video_id | video_title | category | likes | dislikes | views | words_comment |
for a bunch of videos (maybe ~10,000 rows?) in CSV format that I could apply Python machine learning algorithms to.
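To make the target concrete, here is a minimal sketch of the file layout I have in mind, written with Python's standard `csv` module. The helper name `rows_to_csv` and the row values are made-up placeholders, not real YouTube data, and I am assuming `words_comment` would be a single string of comment text per video.

```python
import csv
import io

# Column names taken from the layout above.
COLUMNS = ["video_id", "video_title", "category", "likes",
           "dislikes", "views", "words_comment"]


def rows_to_csv(rows):
    """Serialize a list of dicts (one per video) to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up placeholder row, only to show the shape of the data.
example_rows = [{
    "video_id": "abc123",
    "video_title": "Example title",
    "category": "Music",
    "likes": 10,
    "dislikes": 1,
    "views": 500,
    "words_comment": "great song love it",
}]

csv_text = rows_to_csv(example_rows)
print(csv_text)
```

Whatever method ends up collecting the data would only need to produce one such dict per video.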
What's the best way of going about this? I've tried the YouTube API, but I'm not familiar with how it works and I'm stuck with errors. Is scraping directly from the YouTube website easier?
Thanks!