
The variable tProcRows is defined in main.R as tProcRows <<- 0, which makes it a 'double' by default. Then, in a function in a separate R file, I try to do:

tProcRows <<- as.double(row.names(rawData)[nrow(rawData)]) + tProcRows 

which, surprisingly, results in:

> tProcRows 
numeric(0)

I read rawData from a CSV file, and it passes through 2-3 filters after being read. I want to keep a running count of the processed rows so that I can skip that many rows the next time I read rawData, but it seems I'm not able to do that.

This is main.R:

rm(list=ls())
cat("\014")  

# ffName    <<- "BTPdata004.csv"  # data file full Name
ffName    <<- "BUND009.csv"  # data file full Name
tProcRows <<- 0        # total processed rows so far from the file 
cProcRows <<- 0             # total processed rows from currently loaded chunk
chSize    <<- 100000            # Chunk size
lastFlag  <<- 0              # flag for indicating last chunk from the file 
opData    <<- data.frame()
TimeFrame <<- 5
fileName  <<- "R_OHCL_lite_DD.csv" # Output file

maxLen <<- 0
minLen <<- 1000000

if (file.exists(fileName)) file.remove(fileName) # delete old one

StartTimeG <<- Sys.time()


Open   <<- vector(mode = "numeric",length = 0)
Close  <<- vector(mode = "numeric",length = 0)
High   <<- vector(mode = "numeric",length = 0)
Low    <<- vector(mode = "numeric",length = 0)
Volume <<- vector(mode = "numeric",length = 0)
Time   <<-   vector(mode = "character",length = 0)
Date   <<-   vector(mode = "character",length = 0)

source("loadData.R")
source("processData.R")
source("saveData.R")




############################### Repeat until complete data is processed ###########################

# while(lastFlag != 1) {
# load data
 #print("#####   main: Let's Load some data")
# loadData()

# process data
  print("#####   main: Let's process the data")
processData()

# append the processed data frame to Storage file

# }

if(length(Open) < 100){
  opData <<- cbind(Date,Time,Open,High,Low,Close,Volume)
  saveData()  # saveToFile
}



print("#####   End From Main Function, Total time taken-->")
time.taken = Sys.time() - StartTimeG
print(time.taken)

############################### Repeat until complete data is processed ###########################

loadData.R

# loads the file in chunks

library("iotools")
library("chron")
library("lubridate")


loadData <- function(){



   # if(tProcRows != 0)  tProcRows <<- as.numeric(row.names(rawData)[nrow(rawData)],length=1) + tProcRows


      cProcRows <<- 0 
      nskip     <<- tProcRows
     # rawData   <<- NULL




      rawData <<- read.csv.raw(file = ffName,sep=",",skip=nskip, nrows = chSize,nrowsClasses = 5000)   



      if(nrow(rawData) < chSize){
        lastFlag <<- 1  # this chunk is the last from the file 
      }



      rawData <<- subset.data.frame(rawData,rawData$Type=="Trade")
      rawData$Date <<- as.Date(rawData$'Date[G]',format = "%d-%b-%Y")
      rawData$Time <<- lubridate::hms(rawData$"Time[G]")

      if(lastFlag!=1){
      lastDay <<- rawData$Date[nrow(rawData)]  # last complete day
      rawData <<- subset.data.frame(rawData,rawData$Date < lastDay)
      }

      ############################## this is the line #########

       tProcRows <<- tProcRows + as.numeric(row.names(rawData)[nrow(rawData)]) 
       print(tProcRows)

      ###########################################################

      rawData$`#RIC`        <<-   NULL
      rawData$Type          <<-   NULL
      rawData$`GMT Offset`  <<-   NULL
      rawData$`Bid Price`   <<-   NULL
      rawData$`Bid Size`    <<-   NULL
      rawData$`Ask Price`   <<-   NULL
      rawData$`Ask Size`    <<-   NULL
      rawData$Qualifiers    <<-   NULL
      rawData$'Date[G]'     <<-   NULL



}#function

Output:

[1] "#####   main: Let's process the data"
[1] 88230
[1] "##### File Saved-------> "
Time difference of 2.2081 secs
numeric(0)
numeric(0)
[1] "##### File Saved-------> "
Time difference of 5.0582 secs
numeric(0)
[1] "##### File Saved-------> "
Time difference of 7.1483 secs
numeric(0)
[1] "##### File Saved-------> "
Time difference of 9.4814 secs
numeric(0)
[1] "##### File Saved-------> "
Time difference of 11.5785 secs

So it works on the first iteration, but on every iteration after that I get numeric(0).

P.S. There are other files as well, but they have nothing to do with this variable.

Abhinav Rawat
  • It's easier to help you if you provide a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input data, so we can run the code and see what's going on. Also, I'm not very clear on exactly what your question is here. Are you confused about the data type? – MrFlick Aug 18 '17 at 15:19
  • Avoid using `<<-`. – M-- Aug 18 '17 at 15:24
  • @Masoud Here I am defining a global variable using `<<-`. – Abhinav Rawat Aug 18 '17 at 15:29
  • @MrFlick The code is here. – Abhinav Rawat Aug 18 '17 at 15:30

1 Answer

It seems your code needs to be cleaned up a bit:

  if(lastFlag!=1){
  lastDay <<- rawData$Date[nrow(rawData)]  # last complete day
  rawData <<- subset.data.frame(rawData,rawData$Date < lastDay)
  }

I think rawData could be an empty data frame at this point, but that is never checked.

Let's suppose:

rawData <- data.frame(x=c(), y=c())
tProcRows <- 100

So:

tProcRows <- tProcRows + as.numeric(row.names(rawData)[nrow(rawData)]) 
print(tProcRows)  

Output:

numeric(0)
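
One way to avoid this is to check for an empty chunk before updating the counter. The following is only a minimal sketch under assumptions: `update_processed` is a hypothetical helper that does not exist in the original code, and it returns the new total instead of assigning it with `<<-`.

```r
# Hypothetical helper, not part of the original code: returns the updated
# running total rather than assigning a global with <<-.
update_processed <- function(total, chunk) {
  # An empty chunk has no last row name, so leave the total unchanged.
  if (nrow(chunk) == 0) {
    return(total)
  }
  total + as.numeric(row.names(chunk)[nrow(chunk)])
}

empty_chunk    <- data.frame(x = numeric(0))
filtered_chunk <- data.frame(x = 1:5)[c(2, 4), , drop = FALSE]  # row names "2" and "4"

total <- 100
total <- update_processed(total, empty_chunk)     # unchanged: still 100
total <- update_processed(total, filtered_chunk)  # adds 4, the last surviving row name
print(total)
```

With the guard in place, an empty chunk leaves the counter at its previous value, so a later read can still skip the right number of rows.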
Satie
  • The cause was these two lines: `lastDay <<- rawData$Date[nrow(rawData)]  # last complete day` and `rawData <<- subset.data.frame(rawData,rawData$Date < lastDay)`. There was data for only one day, so the second line produced an empty data frame on the 2nd iteration. On the third iteration, although tProcRows was empty, nskip treated it as 0 and the source data file was read from the beginning again. So whenever I stopped the process, the rawData data frame was never empty. – Abhinav Rawat Aug 19 '17 at 13:12
  • numeric(0) is a numeric vector of length zero. In R, adding any number to numeric(0) always returns numeric(0). – Satie Aug 19 '17 at 16:54
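
The chain of zero-length values described in these comments can be reproduced in a few lines. This is a small illustration using an empty data frame as a stand-in for the filtered rawData; the variable names `empty`, `last_row`, and `total` are made up for the example.

```r
# Stand-in for a chunk that lost all its rows during filtering.
empty <- data.frame()

# Indexing with nrow(empty), which is 0, selects nothing.
last_row <- as.numeric(row.names(empty)[nrow(empty)])

# Arithmetic with a zero-length vector yields a zero-length vector.
total <- 100 + last_row

print(length(last_row))
print(length(total))
print(identical(total, numeric(0)))
```

Both lengths are 0 and the final comparison is TRUE, which is why the counter stays at numeric(0) once a single empty chunk has been added to it.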