I'm pretty sure what I'm looking for is a regular expression in R for reading scientific notation. Below is what I have done, along with the specifics. I very much appreciate any help.
I have a text file where some numbers are in scientific notation and some are just decimals or integers. I'm trying to read them into R using regular expressions. I wrote a program to do this, and it worked as long as the numbers were non-negative and not in scientific notation.
The program I wrote was:
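getBig <- function(fileName, rows, columns)
{
  # read the whole file into a single string
  dat <- readChar(fileName, file.info(fileName)$size)
  # locate every run that starts with a digit
  m <- gregexpr('[0-9][/.0-9]+', dat, perl = TRUE)
  s <- regmatches(dat, m)
  s <- s[[1]]
  s <- s[-1] # the first element is the list size
  # fill column-wise, then transpose so the values end up row by row
  S <- matrix(s, ncol = rows, nrow = columns)
  S <- t(S)
  return(S)
}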
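To make the limitation concrete, here is a small check of the same pattern on a made-up string (the string is a hypothetical stand-in, not taken from the real file). It shows the sign being dropped and the scientific-notation value being split at the `e`:

```r
# Hypothetical sample string, not taken from the real file
dat <- "12 3.5 -2.5e-05"

# Same pattern as in getBig
m <- gregexpr('[0-9][/.0-9]+', dat, perl = TRUE)
s <- regmatches(dat, m)[[1]]

print(s)
# "12"  "3.5" "2.5" "05"
```

The pattern also needs at least two characters per match, so a single-digit integer on its own would be skipped.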
I tried to extend the regular expression to cover negative numbers and scientific notation by replacing the gregexpr call in the program above with the one below, but it did not work. Does anyone have an idea where I am going wrong? Any help is appreciated, and I have included an example of the file format below as well.
m <- gregexpr(' [-+]?[0-9]*(/.?[0-9]*([eE][-+]?[0-9]?))?',dat,perl = TRUE)
Here is how I read each part of the expression:
[-+]? an optional + or -
[0-9]* a digit 0-9, zero or more times
( start a block: /.? an optional decimal point, then [0-9]* a digit 0-9, zero or more times
( start another block: [eE] an e or E, [-+]? an optional + or -, [0-9]* a digit 0-9, one or more times )?)? close both blocks, making each one optional
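For comparison, here is a minimal sketch of one pattern that handles signs and exponents. It differs from the attempt above in three ways: the decimal point is escaped with a backslash (in a regular expression `/.` is a literal slash followed by any character), the exponent digits use `+` so the whole exponent is consumed, and the leading space is dropped. The sample string is a shortened, hypothetical version of the data, and the pattern assumes every number has at least one digit before the decimal point, so a value written as `.5` would lose its leading point.

```r
# Shortened, hypothetical version of one data line
dat <- "(1,1,-1),-2.5426e-05,8.5861e-05"

# optional sign, digits, optional fraction, optional exponent
pattern <- '[-+]?[0-9]+(\\.[0-9]*)?([eE][-+]?[0-9]+)?'

m <- gregexpr(pattern, dat, perl = TRUE)
nums_chr <- regmatches(dat, m)[[1]]
print(nums_chr)
# "1" "1" "-1" "-2.5426e-05" "8.5861e-05"

# as.numeric converts to double precision, so very long mantissas are rounded
nums <- as.numeric(nums_chr)
```

If the full precision of the long mantissas matters, the matches can be kept as character strings instead of being converted with `as.numeric`.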
The file format is shown below. It starts with rows,columns,
and (rowN,rowN,rowN) refers to columns 1-3 of the Nth row, e.g.
[3,1] ((1,1,-1),-2.542611418857958448210085379141884323299379672715620518130686999531487002844642281770330354890802745e-05,8.586192002176000052697976968885158408090751670240233300961472896241959822732337130019333683974778635e-05))