
I have a very large CSV file (1 million+ rows) with four columns of data: time, id, x, and y. Here is a sample:

t    id   x        y
434  84   0        0
435  84   28.22    -4.5
435  611  1895.13  755.17
435  872  2401.08  159.12
435  65   0        226.39
436  84   50.44    -4.5
436  611  1890.63  732.5
436  872  2373.9   151.04
436  990  2614.97  372.74
...

In my simulation, as time elapses I need to do one of three things:

  1. If it's the first time an id has appeared, create an object with that id at the x,y coordinates.

  2. If an object with that id already exists, update that object's x,y coordinates.

  3. If an id no longer appears, delete that object.
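To make the three cases concrete, here is a minimal, language-neutral sketch in Python (a Unity script would be C#, but the structure should carry over). The name `apply_step` and the dictionary of positions are hypothetical stand-ins for real scene objects, and the sketch assumes that "does not appear anymore" means "is missing from the current timestamp".

```python
from typing import Dict, Iterable, List, Tuple

Position = Tuple[float, float]
Row = Tuple[int, float, float]  # (id, x, y)


def apply_step(
    objects: Dict[int, Position], rows: Iterable[Row]
) -> Tuple[List[int], List[int], List[int]]:
    """Make `objects` match the rows of ONE timestamp.

    Returns the ids that were (created, updated, deleted).
    Assumption: an id missing from this timestamp is treated as gone.
    """
    # Collect the ids present at this timestamp and their positions.
    seen: Dict[int, Position] = {obj_id: (x, y) for obj_id, x, y in rows}

    created: List[int] = []
    updated: List[int] = []
    for obj_id, pos in seen.items():
        if obj_id in objects:
            updated.append(obj_id)   # case 2: already exists, move it
        else:
            created.append(obj_id)   # case 1: first appearance, create it
        objects[obj_id] = pos

    # Case 3: anything we are tracking that was not seen this step.
    deleted = [obj_id for obj_id in objects if obj_id not in seen]
    for obj_id in deleted:
        del objects[obj_id]

    return created, updated, deleted
```

With the sample above, the step for t = 436 would create 990, update 84, 611 and 872, and delete 65. Each step costs time proportional to the number of rows at that timestamp plus the number of live objects, not the size of the whole file.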

I'm guessing it's very intensive to keep a running timer, check the CSV every second, locate all rows with the current time and execute one of the above steps. Is there a more efficient way of dealing with time-series data in Unity simulations?

rafvasq
  • Why not read the file once and load everything into memory? With 2 integers and 2 floats per row, 1MM rows should be approx. 15MB. – mayo Jun 01 '18 at 18:24
  • @mayo, I have over a million rows. I tried to read all of the data into a 2D array on start but unfortunately this doesn't help the fact that I need to parse the array for the currentTime every update. As expected I'm experiencing horrible framerate and I haven't even begun to activate any objects yet – rafvasq Jun 01 '18 at 22:53
  • Oh, I see. In that case I think Nick pointed out some good tips. Now, if your CSV will be updated by an external source, maybe you could read the file using a FileStream. That way you can keep the file open, avoid reading the whole file at once, and keep going through the stream whenever new data is available to be consumed (you stop reading when there is no more data, but you keep checking whether more arrives later). – mayo Jun 01 '18 at 23:07
  • Here is an example of reading a text file 'forever': https://stackoverflow.com/a/23306151/4848859. You can do something similar until you decide to stop the simulation. – mayo Jun 01 '18 at 23:17

1 Answer


With files that large, you should start looking for alternatives. Here are some ideas, but the best option depends on what specifically you're doing.

  • Can the component updating this CSV communicate directly with Unity3d instead? Using a socket connection, for example, would avoid the need to constantly save this information to disk and read it back. Obviously, this depends on how the CSV data is being created.
  • Could you split the CSV into smaller files? One for each timestamp, for instance? This way, there's less overhead to update the simulation at each step.
  • Can you reduce the frequency of your updates? Is it essential to update every second?
  • Or, can you read from the CSV only every, say, 10 seconds, load all the data for the next 10 seconds (all timestamps in that range), store it in memory, and then for those 10 seconds update from memory instead of reading the file again? This will reduce your calls to the disk.
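The last idea can be sketched roughly as follows. This is a language-neutral illustration in Python rather than Unity C#, and it rests on several assumptions that are not stated in the question: the file is comma-delimited with a header line `t,id,x,y`, rows are already sorted by `t`, and simulation time only moves forward. The names `frames` and `WindowedFrames` are made up for the example.

```python
import csv
from itertools import groupby
from typing import Dict, Iterable, Iterator, List, Optional, Tuple

Row = Tuple[int, float, float]   # (id, x, y)
Frame = Tuple[int, List[Row]]    # (t, rows at that t)


def frames(lines: Iterable[str]) -> Iterator[Frame]:
    """Lazily yield one (t, rows) pair per timestamp.

    Assumes a `t,id,x,y` header and rows sorted by t.
    """
    reader = csv.DictReader(lines)
    for t, group in groupby(reader, key=lambda r: int(r["t"])):
        yield t, [(int(r["id"]), float(r["x"]), float(r["y"])) for r in group]


class WindowedFrames:
    """Keep only the next `window` timestamps in memory at a time."""

    def __init__(self, lines: Iterable[str], window: int = 10) -> None:
        self._frames = frames(lines)
        self._window = window
        self._buffer: Dict[int, List[Row]] = {}   # t -> rows, current window
        self._lookahead: Optional[Frame] = None   # first frame past the window
        self._window_end: Optional[int] = None

    def rows_at(self, t: int) -> List[Row]:
        """Return the rows for timestamp t, refilling the window if needed."""
        if self._window_end is None or t >= self._window_end:
            self._fill(t)
        return self._buffer.get(t, [])            # dictionary lookup, no scan

    def _fill(self, start: int) -> None:
        # Drop the old window and read ahead until we pass the new one.
        self._buffer = {}
        self._window_end = start + self._window
        if self._lookahead is None:
            self._lookahead = next(self._frames, None)
        while self._lookahead is not None and self._lookahead[0] < self._window_end:
            ft, rows = self._lookahead
            if ft >= start:                        # skip frames already in the past
                self._buffer[ft] = rows
            self._lookahead = next(self._frames, None)
```

For a real file you would pass an open file object, e.g. `WindowedFrames(open(path, newline=""), window=10)`, where `path` is your CSV. The key point is that the file is touched once per window, and each simulation step is a dictionary lookup by timestamp rather than a scan over a million rows.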
N.D.C.