Questions tagged [vroom]

24 questions
6 votes, 1 answer

Vroom/fread won't read LARGE .csv file - cannot memory map it

I have a .csv file that is 112 GB in size, but neither vroom nor data.table::fread will open it. Even if I ask to read in only 10 rows or just a couple of columns, it fails with a mapping error: Cannot allocate memory. …
HCAI • 2,213 • 8 • 33 • 65

6 votes, 1 answer

Partially read really large csv.gz in R using vroom

I have a csv.gz file that (from what I've been told) was 70 GB in size before compression. My machine has 50 GB of RAM, so I will never be able to open it as a whole in R anyway. I can load, for example, the first 10m rows as follows: library(vroom) df…
Martin • 1,141 • 14 • 24

4 votes, 1 answer

vroom id argument - use filenames instead of archive name

I'd like to read a remote archive file with vroom and get an additional column with the filenames instead of the archive name. Is this possible with vroom without the local archive_extract step as shown in the example below? Thank…
ckluss • 1,477 • 4 • 21 • 33

3 votes, 0 answers

Is there an easy way to pre-filter data into vroom?

In R, I have switched to using vroom due to its speed at reading in large delimited files, but I cannot find a simple way to pre-filter large datasets as I could do with, say, the sqldf package or through using SQLite and dplyr as described here. The…
kam • 31 • 1

3 votes, 1 answer

Using vroom to read in Date column and all other columns as double in R

I have csv files with over 10000 variables in them. I want to use vroom to read them in, and want to identify column 1 as a date, column 2 as character, columns 3 and 4 as integer, and all the rest of the columns as double. How do I do this? My…
user8229029 • 883 • 9 • 21

2 votes, 2 answers

Pre-filtering with pipe connections and vroom

I want to read a large .txt file into R using the vroom package, because it is fast and supports pipe connections for pre-filtering. For reproducibility, let's read this UK cats csv file from the Tidy Tuesday project and pre-filter for id == "Ares".…
dzegpi • 554 • 5 • 14

2 votes, 2 answers

`read_fwf` and `vroom_fwf` accidentally skipping first lines?

I'm sure I'm doing something silly, but I can't quite figure it out. Both read_fwf and vroom_fwf are producing files that lack one line (the first line, to be precise) when importing fixed-width files. There are two…
Kim • 4,080 • 2 • 30 • 51

2 votes, 0 answers

Define the linebreak character importing a csv in R

I am wondering whether there is really no way to import this type of csv file into R. One can download the csv file from…
hannes101 • 2,410 • 1 • 17 • 40

2 votes, 1 answer

Define decimal separator with vroom

I often encounter csv files that were saved with a German locale and are therefore separated with semicolons rather than commas. This is of course easily solved by defining the separator. But vroom, in contrast to for…
hannes101 • 2,410 • 1 • 17 • 40

1 vote, 1 answer

vroom_write writes negative zeros to file

I am trying to use vroom::vroom_write to write a tibble to a text file. Within my R session, I see that the third column of my tibble has some zeros. When I examine the text file, I see that some of the zeros are written as negative zero. Here is…
Fred Boehm • 656 • 4 • 11

1 vote, 0 answers

How to write POSIXct with milliseconds using vroom_write()?

How can I write POSIXct columns with milliseconds using vroom::vroom_write()? I can use format() before saving to "render" the time as character (see below), but I wonder if there's a neater way, e.g., by setting some option? # Example data df =…
Jonas Lindeløv • 5,442 • 6 • 31 • 54

1 vote, 0 answers

Wrong results in osrm self hosted API

Happy new year 2022! This is my first question. I've implemented a setup of OSRM + Vroom in GCP (Google Cloud Platform) following the instructions described in this tutorial:…
Rafael C. • 11 • 1

1 vote, 0 answers

R readr read_csv skip error with VROOM_CONNECTION_SIZE

I have a large (~18 GB) csv file that I would like to read in chunks. The chunks are separately processed (filtered) and concatenated. Since I'm iterating through several chunks, I'm using the skip parameter of the read_csv function. Here is an…

1 vote, 0 answers

Cannot install vroom under R.4.0.5 on HPC node

To install another package that depends on vroom (which failed) on an HPC node, I tried to install vroom manually, but that fails too. My code: install.packages("vroom") Error message: installing to…

1 vote, 1 answer

R: Reading specific columns from txt files with slightly different column headers (differing spaces) and binding them?

I have many txt files that contain the same type of numerical data in columns separated by ;. But some files have column headers with spaces and some don't (created by different people). Some have extra columns that I don't want. e.g. one file…
HCAI • 2,213 • 8 • 33 • 65