Questions tagged [ffbase]

Basic statistical functions for package ff

The R package ffbase provides basic data operations for ff structures. The same functionality exists in base R for in-RAM objects; ffbase supplies the corresponding methods for ff structures so that switching between in-RAM objects and on-disk ff objects is more straightforward.

64 questions
7
votes
3 answers

R ff package ffsave 'zip' not found

Reproducible example: library("ff") m <- matrix(1:12, 3, 4, dimnames=list(c("r1","r2","r3"), c("m1","m2","m3","m4"))) v <- 1:3 ffm <- as.ff(m) ffv <- as.ff(v) d <- data.frame(m, v) ffd <- ffdf(ffm, v=ffv,…
TongZZZ
  • 756
  • 2
  • 8
  • 20
5
votes
1 answer

delete rows ff package

For a while now I've been using the ff package to work with big data. The R object I've worked with has about 130,000,000 rows and 14 columns. Two of those columns, Temperature and Precipitation, have missing values (“NA”), so I need to delete…
lpchaparro
  • 129
  • 2
  • 6
3
votes
1 answer

Grow a ffdf data frame on disk gradually

From documentation of save.ffdf: Using ‘save.ffdf’ automagically sets the ‘finalizer’s of the ‘ff’ vectors to ‘"close"’. This means that the data will be preserved on disk when the object is removed or the R sessions is closed. Data can be…
qed
  • 22,298
  • 21
  • 125
  • 196
3
votes
2 answers

How to convert a factor vector to POSIXct in ff or ffbase

After reading in a large data set with read.csv.ffdf, one of the columns is time, with values such as 2014-10-18 00:01:02, for 1 million rows. That column is a factor. How do I convert it to POSIXct supported by ff? Simply using as.POSIXct() just…
MM Cui
  • 51
  • 6
3
votes
0 answers

How to do matrix multiplication with ff objects

Suppose I have ff_matrix (also doesn't work with ffdf) objects called x and y. x is a 100*10 matrix and y is a 10*1 matrix. library(ffbase) x <- as.ffdf(data.frame(matrix(rnorm(100*10),ncol=10))) y <- as.ffdf(data.frame(matrix(rnorm(10)))) x <-…
user2763361
  • 3,789
  • 11
  • 45
  • 81
3
votes
4 answers

How can I perform full outer joins of large data sets in R?

I am trying to do data analysis in R on a group of medium-sized datasets. One of the analyses requires a full outer join across around 24-48 files, each of which has about 60 columns and up to 450,000 lines. So I've been…
Drew75
  • 277
  • 1
  • 3
  • 11
2
votes
1 answer

Efficient Combination and Operating on Large Data Frames

I have 2 relatively large data frames in R. I'm attempting to merge / find all combos, as efficiently as possible. The resulting df turns out to be huge (the length is dim(myDF1)[1]*dim(myDF2)[1]), so I'm attempting to implement a solution using ff.…
ch-pub
  • 1,664
  • 6
  • 29
  • 52
2
votes
2 answers

What does the "by" argument in ffbase::as.character do?

In the post below, aggregation using ffdfdply function in R, there is a line like this: splitby <- as.character(data$Date, by = 250000). Just out of curiosity, I wonder what the by argument means. It seems to be related to ff data frames, but I'm not sure.…
dixhom
  • 2,419
  • 4
  • 20
  • 36
2
votes
1 answer

R - ff package: find the most frequent element in an ffdf and delete the rows where it is located

I need a suggestion for finding the most frequent element in an ffdf and then deleting the rows where it is located. I decided to try the ff package because I'm working with very big data and I am running out of memory with base R. Here is a little…
pshls
  • 137
  • 10
2
votes
1 answer

Data.table setDT functionality in ff/ffbase R packages

Calculate a column of conditional means in the ff/ffbase packages. I'm searching for functionality in the ff/ffbase packages that allows data manipulation similar to what is carried out below with the data.table package: library(data.table) irisdf <-…
Qbik
  • 5,885
  • 14
  • 62
  • 93
2
votes
0 answers

R ffdfappend SIGBUS error

I have an R script which uses the ffbase and ff packages. In Windows the script runs fine. In Linux (different box, higher RAM though) it crashes with a bus (SIGBUS) error. Windows (Version 6.1.7601) session info: R version 3.1.0…
user444628
2
votes
2 answers

Functions for creating and reshaping big data in R using the FF package

I'm new to R and the FF package, and am trying to better understand how FF allows users to work with large datasets (>4Gb). I have spent a considerable amount of time trawling the web for tutorials, but the ones I could find generally go over my…
Luke23
  • 33
  • 5
2
votes
1 answer

How to subset a large data frame (ffdf) in R by date?

I am trying to subset an FFDF by a date. Below, I have successfully created such a subset using a normal data frame. But I needed some help in applying this to an FFDF. My attempt, along with the error message, is listed in the code comment. Many…
Tyler Durden
  • 303
  • 5
  • 12
2
votes
1 answer

ffdfdply, splitting and memory limit in R

I'm getting an "Error: cannot allocate vector of size ...MB" problem when using ff/ffdf and the ffdfdply function. I'm trying to use the ff and ffdf packages to process a large amount of data that has been keyed into groups. The data (in ffdf table format) looks like…
tanvach
  • 389
  • 1
  • 5
  • 12
2
votes
3 answers

ff package write error

I'm trying to work with a 1909x139352 dataset using R. Since my computer only has 2GB of RAM, the dataset turns out to be too big (500MB) for the conventional methods. So I decided to use the ff package. However, I've been having some trouble. The…