
I have to use SparkR for part of a project (I typically use Scala). I'm writing out a file using the following code:

# Raise scipen so R itself prints full numbers rather than scientific notation
options(scipen = 999)
# Create a Spark DataFrame from the local data frame and write it out
sdf <- SparkR::as.DataFrame(df)
SparkR::head(sdf)  # all looks good
SparkR::write.json(sdf, path = somePath, mode = "append")  # does not look good

However, when I view the written output, one of my variables (timestamp in this case) is written in scientific notation, e.g. 1.4262E12, when I would rather have it as a long, e.g. 1426256000000. I can't figure out why write.json is writing the file out this way. Before writing the file, I view my Spark DataFrame and the timestamp is printed in full. Can anyone help/advise on a workaround for this problem?

Here is the schema, which must be kept this way:

root
 |-- price: integer (nullable = true)
 |-- timestamp: double (nullable = true)
  • This may help https://stackoverflow.com/questions/40206592/how-to-turn-off-scientific-notation-in-pyspark – David May 17 '18 at 17:55
  • You could also try casting the timestamp to a `long` before writing it out. – nate May 17 '18 at 18:01
  • Okay, long worked! sdf$timestamp <- SparkR::cast(sdf$timestamp, "long") Thank you @nate – fletchr May 17 '18 at 18:07

1 Answer


Thank you to @nate; this solved my problem, and it works with the schema I have to use anyway:

sdf$timestamp <- SparkR::cast(sdf$timestamp, "long")
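For context, here is a minimal end-to-end sketch of the fix. The small local data frame is made up to stand in for the question's df, and somePath is assumed to already be defined:

library(SparkR)

# Hypothetical local data frame standing in for the question's df
df <- data.frame(price = c(100L, 250L),
                 timestamp = c(1426256000000, 1426256100000))

sdf <- as.DataFrame(df)

# Cast the double timestamp to long so write.json emits 1426256000000
# instead of 1.426256E12
sdf$timestamp <- cast(sdf$timestamp, "long")

printSchema(sdf)  # timestamp is now long
write.json(sdf, path = somePath, mode = "append")

The cast changes the column's type in the schema from double to long, so the JSON output contains plain integer values rather than scientific notation.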