
I'm working in R with two large .txt data files (each over 5 GB, with more than 5 million observations), and I was wondering whether there is an easy way to randomly sample about 20,000 rows from each while reading the data in.

At present I cannot read the full data in and then sample from it, because the files are too large and I run into an error.
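For illustration, here is a minimal sketch of one way this could be done in base R. It is not a tested solution for these particular files. It assumes the file is tab-delimited with a single header line and no quoted fields spanning multiple lines; `sample_rows` and `chunk_size` are hypothetical names introduced here. The idea is to make two passes over the file in chunks: the first only counts rows, the second keeps just the randomly chosen line numbers, so the full file never has to fit in memory.

```r
# Hypothetical helper: draw n random data rows from a delimited text file
# without loading the whole file. Assumes one header line and one record per line.
sample_rows <- function(path, n, chunk_size = 100000L, sep = "\t") {
  # Pass 1: count data rows (start at -1 so the header line is not counted).
  con <- file(path, open = "r")
  total <- -1
  repeat {
    k <- length(readLines(con, n = chunk_size))
    if (k == 0L) break
    total <- total + k
  }
  close(con)

  # Choose which data rows to keep, sorted so they can be matched per chunk.
  keep <- sort(sample.int(total, min(n, total)))

  # Pass 2: re-read in chunks and retain only the chosen lines.
  con <- file(path, open = "r")
  on.exit(close(con))
  header_line <- readLines(con, n = 1L)
  picked <- list()
  offset <- 0
  repeat {
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0L) break
    idx <- keep[keep > offset & keep <= offset + length(lines)] - offset
    picked[[length(picked) + 1L]] <- lines[idx]
    offset <- offset + length(lines)
  }

  # Parse only the sampled lines; quoting and comments are disabled (an assumption).
  read.table(text = c(header_line, unlist(picked)), header = TRUE, sep = sep,
             quote = "", comment.char = "", stringsAsFactors = FALSE)
}
```

A call might look like `voters <- sample_rows("voters.txt", 20000)`, where `voters.txt` is a placeholder file name. Reading the file twice costs extra time, but memory use stays bounded by `chunk_size` lines plus the sampled rows.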

Phil
Jake C
  • How is the data formatted in the text file? As a dataframe? – mark_1985 Nov 30 '21 at 03:16
  • Yes, it is a large voter data file with ~7 million observations and 12 columns. – Jake C Nov 30 '21 at 03:18
  • [See here](https://stackoverflow.com/q/5963269/5325862) on making a reproducible example that is easier for folks to help with. There's very little information here—what you've tried, what the error is, what type of tools you're using—so it's hard to know how those other related posts do or don't answer your question – camille Nov 30 '21 at 15:24

0 Answers