I have a data.table that needs to grow by appending rows. I have been trying to combine the results of two questions, add-a-row-by-reference-at-the-end-of-a-data-table-object and why-is-rbindlist-better-than-rbind, and came up with the following idea: I use rbindlist() to expand the number of rows, but only when there are no empty rows left in the data.table. As long as empty rows are available, I just use set() to fill in values.

library(data.table)

dt <- data.table(
    OilWellID = 1:5,
    Invest.Period = c(1L, 1L, 1L, 1L, 1L),
    Abandon.Period = c(1568L, 1833L, 2529L, 3384L, 1559L),
    Initial.Production = c(942.430661758408, 1354.41458085552, 674.456247827038, 770.618930924684, 922.09160188213),
    D = c(0.02, 0.02, 0.02, 0.02, 0.02)
)

n.oil.well.portfolio <-  dt[,.N]

# The table is full when its last row already holds a value
if (!is.na(dt[.N, Invest.Period])) {
    # Double the size by appending rows of typed NAs, so column types stay unchanged
    dt <- rbindlist(
        list(
            dt,  # original data.table
            data.table(
                OilWellID = (n.oil.well.portfolio + 1L):(n.oil.well.portfolio * 2L),
                Invest.Period = rep(NA_integer_, n.oil.well.portfolio),
                Abandon.Period = rep(NA_integer_, n.oil.well.portfolio),
                Initial.Production = rep(NA_real_, n.oil.well.portfolio),
                D = rep(NA_real_, n.oil.well.portfolio)
            )
        )
    )
}
# Otherwise nothing to do: there is enough space to add rows with set()

# Example of the set() call: fill Invest.Period (column 2) in the first empty row
set(dt, i = n.oil.well.portfolio + 1L, j = 2L, value = 4L)

This combination seems to be one of the most efficient ways to do this until data.table supports adding rows by reference. A nice comparison of different ways of expanding a data.table can be found in this answer to another question.
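To make the idea easier to reuse, the grow-then-fill logic can be wrapped in a function. The following is only a sketch under my own assumptions: `append_row()` is a hypothetical helper, not part of data.table, and instead of testing the last row for NA it relies on a counter `n.used` that the caller keeps up to date.

```r
library(data.table)

# Hypothetical helper (not a data.table function): write `values`, a named
# list, into the first unused row. When the table is full, it is doubled
# first by appending rows of typed NAs via rbindlist().
# `n.used` is the number of rows already filled; the caller tracks it.
append_row <- function(dt, n.used, values) {
    if (n.used >= nrow(dt)) {
        n.new <- max(nrow(dt), 1L)
        # col[NA_integer_] yields an NA of the column's own type
        padding <- as.data.table(
            lapply(dt, function(col) col[rep(NA_integer_, n.new)])
        )
        dt <- rbindlist(list(dt, padding))
    }
    for (col in names(values)) {
        set(dt, i = as.integer(n.used + 1L), j = col, value = values[[col]])
    }
    dt
}

# Usage on a small example table
wells <- data.table(OilWellID = 1:2, Invest.Period = c(1L, 1L))
n.used <- 2L

# Table is full, so this call grows it from 2 to 4 rows before filling row 3
wells <- append_row(wells, n.used, list(OilWellID = 3L, Invest.Period = 4L))
n.used <- n.used + 1L

# Row 4 is still empty, so this call only uses set()
wells <- append_row(wells, n.used, list(OilWellID = 4L, Invest.Period = 7L))
n.used <- n.used + 1L
```

The result has to be assigned back because rbindlist() returns a new object whenever the table grows; in the other case set() has already modified the table in place.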

Glorfindel
hannes101
  • Why do you have to append to a data.table repeatedly? – Roland Oct 21 '16 at 12:17
  • I would like to keep track of a simulated portfolio and additional rows, i.e. assets can be added to the data.table. – hannes101 Oct 21 '16 at 12:18
  • I don't think it's a duplicate, since I am not only interested in the difference between rbindlist() and set() but would rather like to know whether a combination of both is better. In particular, this question addresses the problem that set() can't be used when there are no empty rows left. – hannes101 Oct 21 '16 at 12:27
  • I'm not convinced that you can't avoid this. I would look for alternatives. Maybe you can restructure so that you can at least append columns (thereby making use of over-allocation) instead of rows? – Roland Oct 21 '16 at 12:34
  • If I append columns and thereby transpose the data, I can't calculate the necessary results, which are already implemented. A more closely related question is https://stackoverflow.com/questions/10790204/how-to-delete-a-row-by-reference-in-data-table where Matt Dowle talks about an efficient insert() command, although I have no idea whether anything like it exists in the development tree. – hannes101 Oct 21 '16 at 13:03
  • It's also rather a duplicate of this http://stackoverflow.com/questions/20689650/how-to-append-rows-to-an-r-data-frame/38052208#38052208 – hannes101 Oct 21 '16 at 13:39
  • Yeah, that's a good duplicate, though I went with this one because the question posed is data.table specific. Re Matt Dowle's post, there's an open FR as well: https://github.com/Rdatatable/data.table/issues/635 I'm not sure why there isn't a similar FR for insertion... – Frank Oct 21 '16 at 16:15
  • Should I delete the question or add some more information from the linked answer? – hannes101 Oct 25 '16 at 09:54
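
For readers unfamiliar with the over-allocation mentioned in the comments: data.table reserves spare column slots when a table is created, which is why columns, unlike rows, can be added by reference. A small sketch of how this can be inspected with `truelength()`; the table and column names are made up for illustration, and the exact number of spare slots depends on the `datatable.alloccol` option.

```r
library(data.table)

# Made-up example table
dt.cols <- data.table(a = 1:3)

n.cols.used <- length(dt.cols)       # columns currently in use
n.cols.alloc <- truelength(dt.cols)  # column slots allocated, including spares

# := places the new column into a spare slot, without copying the table
dt.cols[, b := a * 2L]
```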

0 Answers