1

I downloaded the Yelp dataset from https://www.yelp.com/dataset_challenge. The downloaded file is called yelp_dataset_challenge_round9.tar.

However, the file extracted from the tar file has no extension. I checked https://github.com/Yelp/dataset-examples, but it assumes that the file is a JSON file called yelp_academic_dataset.

I have both the downloaded tar file and its extracted contents. I'm using Windows 10, and I used WinRAR to extract the contents. I would really appreciate any help with opening and viewing the dataset.

Bjafri5
  • See the answer to this question: http://stackoverflow.com/questions/159521/text-editor-to-open-big-giant-huge-large-text-files – eli-bd May 01 '17 at 07:25

2 Answers

5

It turns out that the file inside the tar (the one without an extension) is itself a tar file, so the download is a tar file inside a tar file. After extracting the original archive, add the .tar extension to the extracted file and then extract that file as well. You will then have all the separate JSON files for the dataset.
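If you would rather confirm this before renaming anything, here is a minimal Python sketch using the standard tarfile module. The function name list_tar_members and the example path are hypothetical, chosen for illustration only. tarfile looks at the file's contents rather than its name, so the missing extension does not matter.

```python
import tarfile
from pathlib import Path


def list_tar_members(path):
    """Return the member names if `path` is a tar archive, otherwise None.

    `list_tar_members` is a hypothetical helper name used for this sketch.
    """
    path = Path(path)
    # is_tarfile inspects the file contents, so a missing extension is fine.
    if not tarfile.is_tarfile(str(path)):
        return None
    with tarfile.open(str(path)) as archive:
        return archive.getnames()


if __name__ == "__main__":
    # Hypothetical location of the extension-less file that WinRAR produced.
    candidate = Path("yelp_dataset_challenge_round9")
    if candidate.exists():
        print(list_tar_members(candidate))
```

A result of None means the file is not a tar archive; a list of names means it is, and those names are what the second extraction will produce.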

Bjafri5
0

Extract the downloaded file, rename the extracted file so that it has a .tar extension, and then extract it again. That gives you access to the dataset.

Step 1: Extract yelp_dataset_challenge_round9.tar, the file that you downloaded.

Step 2: Rename the extracted file so that it has a .tar extension.

Step 3: Extract the renamed file. The dataset files are inside it.
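The three steps can also be scripted. Below is a rough Python sketch, not a definitive implementation: the function name extract_nested_tar and the "outer" and "inner" folder names are my own assumptions. The rename only mirrors the manual steps, since Python's tarfile module does not need the extension.

```python
import tarfile
from pathlib import Path


def extract_nested_tar(outer_tar, work_dir):
    """Extract a tar, then extract any extension-less tar found inside it.

    Hypothetical helper: the name and the "outer"/"inner" folders are
    assumptions made for this sketch. Only run it on archives you trust.
    """
    outer_dir = Path(work_dir) / "outer"
    inner_dir = Path(work_dir) / "inner"
    outer_dir.mkdir(parents=True, exist_ok=True)
    inner_dir.mkdir(parents=True, exist_ok=True)

    # Step 1: extract the downloaded archive.
    with tarfile.open(str(outer_tar)) as outer:
        outer.extractall(str(outer_dir))

    # sorted() materialises the listing before any file is renamed.
    for candidate in sorted(outer_dir.rglob("*")):
        if not candidate.is_file() or candidate.suffix != "":
            continue
        if not tarfile.is_tarfile(str(candidate)):
            continue
        # Step 2: give the extension-less file a .tar extension.
        renamed = candidate.with_name(candidate.name + ".tar")
        candidate.rename(renamed)
        # Step 3: extract the renamed file.
        with tarfile.open(str(renamed)) as inner:
            inner.extractall(str(inner_dir))

    # Every file that came out of the inner archive(s).
    return sorted(p for p in inner_dir.rglob("*") if p.is_file())


if __name__ == "__main__":
    downloaded = Path("yelp_dataset_challenge_round9.tar")
    if downloaded.exists():
        for path in extract_nested_tar(downloaded, "yelp_extracted"):
            print(path)
```

The returned list contains the paths of the extracted dataset files, which you can then open in an editor that handles large text files.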

I hope this helps.
