How to define the number of digits after the seconds in timestamps of Spark streaming data?

Asked Nov 15 '18 at 21:47

Active Nov 16 '18 at 08:47

Viewed 69 times

The timestamps in my real data look like this:

2018-02-28T00:05:20.3717898Z 
2018-02-28T00:05:23.6589778Z 
2018-02-28T00:05:23.9119922Z 
2018-02-28T00:05:25.4230787Z 
2018-02-28T00:05:25.6710929Z 
2018-02-28T00:05:26.4271361Z

I use this code to read the data:

from pyspark.sql.types import StructType
# Read the single 'time' column as a Spark timestamp
userSchema = StructType().add('time', 'timestamp')
s = spark.readStream.schema(userSchema).csv('xxxx')

The result is like this

I have no idea how this happened.

edited Nov 16 '18 at 08:47 by user238607

asked Nov 15 '18 at 21:47 by ellie

I think Spark might be reading it in the correct format, and what you are seeing is just a truncated display. Try s.show(10, truncate=False). Here is a question with exactly the same problem as yours: https://stackoverflow.com/questions/33742895/how-to-show-full-column-content-in-a-spark-dataframe – user238607 Nov 16 '18 at 07:17
Thanks, your comment was very helpful. But a streaming DataFrame doesn't support the show() function. I tried modifying the timestamp format when reading the data and using option("truncate", False) on writeStream, and the results look much better. – ellie Nov 16 '18 at 15:27
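
A rough illustration of the precision issue behind the question: the sample timestamps carry seven fractional digits, while Spark's TimestampType, like Python's datetime, stores microseconds, i.e. six. The sketch below is plain Python and not Spark API; `trim_fraction` and `parse_utc` are hypothetical helper names introduced here. It only shows how the fractional part could be cut to a chosen number of digits before parsing, and it is an assumption that this resembles the format change described in the comment above.

```python
import re
from datetime import datetime, timezone

# Matches the fractional-second part of an ISO-8601 timestamp ending in 'Z'.
_FRACTION = re.compile(r"\.(\d+)(?=Z$)")


def trim_fraction(ts: str, digits: int = 6) -> str:
    """Keep at most `digits` fractional-second digits (hypothetical helper)."""
    digits = max(digits, 0)

    def _cut(match):
        kept = match.group(1)[:digits]
        # Drop the decimal point too when no digits are kept.
        return "." + kept if kept else ""

    return _FRACTION.sub(_cut, ts)


def parse_utc(ts: str) -> datetime:
    """Parse a 'Z'-suffixed timestamp after trimming it to microseconds."""
    trimmed = trim_fraction(ts, 6)  # %f accepts at most six digits
    fmt = "%Y-%m-%dT%H:%M:%S.%fZ" if "." in trimmed else "%Y-%m-%dT%H:%M:%SZ"
    return datetime.strptime(trimmed, fmt).replace(tzinfo=timezone.utc)


if __name__ == "__main__":
    sample = "2018-02-28T00:05:20.3717898Z"
    print(trim_fraction(sample, 3))  # millisecond precision
    print(parse_utc(sample).isoformat())
```

Trimming rather than rounding keeps the helper simple; the seventh digit is discarded, so values can differ from the source by less than one microsecond.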

0 Answers