I would like to load into R the data from some text files I downloaded. The zip file is in the same folder as the RStudio project, and it has two sublevels; the three files I am interested in are in temp.zip/final/en_US.

I went through the zip function documentation and this brilliant post, without success.

Here is my latest attempt:

temp <- tempfile()
download.file("https://d396qusza40orc.cloudfront.net/dsscapstone/dataset/Coursera-SwiftKey.zip", temp)

The temp.zip file has 10 subfolders. Its text files are:

 [1] "./final/de_DE/de_DE.twitter.txt" "./final/de_DE/de_DE.blogs.txt"  
 [3] "./final/de_DE/de_DE.news.txt"    "./final/ru_RU/ru_RU.blogs.txt"  
 [5] "./final/ru_RU/ru_RU.news.txt"    "./final/ru_RU/ru_RU.twitter.txt"
 [7] "./final/en_US/en_US.twitter.txt" "./final/en_US/en_US.news.txt"   
 [9] "./final/en_US/en_US.blogs.txt"   "./final/fi_FI/fi_FI.news.txt"   
[11] "./final/fi_FI/fi_FI.blogs.txt"   "./final/fi_FI/fi_FI.twitter.txt"
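
One way to inspect an archive like this without extracting anything is `unzip(..., list = TRUE)`, which only reads the zip directory. The sketch below is an illustration, not part of the original attempt: `list_zip_members` is a hypothetical helper name, and it assumes `temp` holds the path of the downloaded archive.

```r
# Hypothetical helper: list the members of a zip archive without extracting them.
# `zip_path` is the path to the archive; `pattern` is an optional regular
# expression used to keep only matching member names.
list_zip_members <- function(zip_path, pattern = NULL) {
  # With list = TRUE, unzip() returns a data frame (Name, Length, Date)
  # describing the archive contents instead of writing files to disk.
  members <- utils::unzip(zip_path, list = TRUE)$Name
  if (!is.null(pattern)) {
    members <- grep(pattern, members, value = TRUE)
  }
  members
}

# Usage, assuming `temp` is the archive downloaded above:
# list_zip_members(temp, pattern = "en_US.*\\.txt$")
```

Note that names reported this way are paths inside the archive, so they would normally not carry the leading `./` shown in the listing above.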

Since temp.zip is quite big, I would like to open a connection to, or extract the data from, only the 7th, 8th and 9th elements, without unzipping or loading the whole temp.zip.
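
For reference, a minimal sketch of what reading single members through a connection could look like, using base R's `unz()` together with `readLines()`. The helper name `read_zip_member` is hypothetical, the member paths are assumed to be stored in the archive without the leading `./`, and the files are assumed to be UTF-8 text with one entry per line.

```r
# Hypothetical helper: read one text file from inside a zip archive
# without extracting the rest of the archive to disk.
# `member` is the path of the file inside the archive;
# `n = -1L` reads every line, a positive `n` reads only the first n lines.
read_zip_member <- function(zip_path, member, n = -1L) {
  con <- unz(zip_path, member)   # connection to a single archive member
  on.exit(close(con))            # release the connection when done
  # encoding = "UTF-8" marks the strings as UTF-8 (an assumption about the data)
  readLines(con, n = n, encoding = "UTF-8", warn = FALSE)
}

# Usage, assuming `temp` is the archive downloaded above:
# members <- c("final/en_US/en_US.twitter.txt",
#              "final/en_US/en_US.news.txt",
#              "final/en_US/en_US.blogs.txt")
# en_us <- lapply(members, function(m) read_zip_member(temp, m))
# names(en_us) <- basename(members)
```

Passing a small `n` first (for example `n = 5L`) is a cheap way to check the member path and the file format before reading a large file in full.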

Worice
  • It is not clear. You have unzipped your file, then you have multiple subfolders? – Soheil Apr 23 '16 at 15:23
  • That is way too big of a file to download. If you want help reading a particular file into R, include a snippet of the text file. – bramtayl Apr 23 '16 at 15:28
  • Yes it is confusing. I would like to load in memory only the data in the subfolders 7, 8 and 9 of the `temp.zip`. Thank you for the suggestion. A snippet is part of one of the text files? – Worice Apr 23 '16 at 15:29
  • You can try read.table(unz("temp.zip", "final/en_US/en_US.blogs.txt")) – chinsoon12 Apr 23 '16 at 22:21

0 Answers