The first priority is performance (speed, measured in ms). What data structure should I use to store the contents of a text file quickly? The text file can have a variable number of lines, and each line has a variable number of co-ordinates. Each of those co-ordinates needs to go through calculations. I also want to access the points sequentially to do the calculations, and remove some of the co-ordinates if necessary. There will be plenty of memory available, a thousand times more than the file size.

x1,y1 x2,y2 ....
x6,y6 x7,y7 .....
......

To be precise, the file looks like this:

7866.777,505.821 -7866.773,508.291 -786.8402,500.845 -7864835.125147422,5084020.882938482
-7865228.42,508.491642 -7864114.999361482,5081606.040795522
-8865228.42,508.4642 -7864.999361,5081.040795522

Now, how do I store each element quickly? Can I store it in a vector? It is flexible, but it's slow. Can I store it in a 2D array? Is that the fastest? But the file has a variable number of lines and a different number of elements in each line. Also, the number of elements in an array needs to be constant, so is there another way to use a dynamically growing array?

Update: Since details were asked for, I'm updating the question with details and trying to be as precise as possible.

csrockstar
  • _'Can I store it in vector? It is flexible, but its slow.'_ How did you measure this? – πάντα ῥεῖ Jul 20 '14 at 12:50
  • What you should really do first, is to profile and measure your program, to see where the actual bottlenecks are. Also, do you *really need* it to be faster? What are your requirements? What are your use-cases? What *are you doing*? Please read about [the XY problem](http://meta.stackoverflow.com/questions/66377/what-is-the-xy-problem). – Some programmer dude Jul 20 '14 at 12:54
  • For any sane situation, most of the time is spent on text formatting, not on storing the data. If you really need to significantly boost performance, use a binary serialized representation and skip formatting. – Non-maskable Interrupt Jul 20 '14 at 13:00
  • @πάντα ῥεῖ - In this case, we don't know the number of lines and sub-elements ahead of time. So I'm using a vector with push_back. But, I read pushback is expensive compared to an array. http://stackoverflow.com/questions/3664272/stdvector-is-so-much-slower-than-plain-arrays – csrockstar Jul 20 '14 at 13:04
  • @Joachim - Speed is a requirement besides correctness, that's why I mentioned it in the question. – csrockstar Jul 20 '14 at 13:07
  • _'But, I read pushback is expensive'_ depends on initial/subsequent alloc size. Also you don't need the line numbers, if it's all integer data in the file you're well off with `while(instream >> number) { datavec.push_back(number); }` IMHO (there are various methods to just skip the `','` characters). – πάντα ῥεῖ Jul 20 '14 at 13:09
  • You are going to need to be more precise about your requirements. What operations do you need (exhaustively)? What kind of measurements are you looking at (memory, time)? *fast*, *expensive*, etc. mean nothing; `< 1ms` or `O(1)` is meaningful. – Matthieu M. Jul 20 '14 at 13:24
  • @MatthieuM. I have updated the question. I have to do spatial operations on those co-ordinates. The main concern is to improve the time, since enough memory is provided. – csrockstar Jul 21 '14 at 01:31
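
The approach discussed in the comments above (a growing `std::vector` filled with `push_back`, with capacity reserved up front to limit reallocations) could be sketched roughly as follows. This is only an illustration under assumptions: `Point`, `Line`, `read_points` and the `expected_lines` hint are hypothetical names that do not come from the question, and the code assumes the default "C" locale so that a comma is never read as part of a number.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical types for illustration; the names are not from the question.
struct Point {
    double x;
    double y;
};
using Line = std::vector<Point>;

// Reads lines of "x,y x,y ..." pairs from any input stream.
// expected_lines is only a capacity hint; the vector still grows if it is too small.
std::vector<Line> read_points(std::istream& in, std::size_t expected_lines = 1024) {
    std::vector<Line> lines;
    lines.reserve(expected_lines);  // avoids repeated reallocation of the outer vector

    std::string text;
    while (std::getline(in, text)) {
        std::istringstream fields(text);
        Line line;
        Point p;
        char comma;
        // operator>> skips the whitespace between pairs; the comma is read explicitly.
        while (fields >> p.x >> comma >> p.y && comma == ',') {
            line.push_back(p);
        }
        if (!line.empty()) {
            lines.push_back(std::move(line));  // move, so the points are not copied
        }
    }
    return lines;
}
```

Each row keeps its own length, so the variable number of co-ordinates per line needs no special handling, and `lines[i][j]` gives array-style sequential access. Whether this is fast enough can only be settled by measuring on the real file.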

2 Answers

"Store it quickly" doesn't make sense.
The fastest way to literally store the data in memory is a plain string/vector.
The fastest way to retrieve the data is also to read it from a plain string/vector.
The fastest way to manipulate the data in-place is probably to use a data structure called a "rope", but that's not part of the C++ standard library, so you'll have to find it elsewhere.

At the moment I feel like you're unclear on what you actually want to do, though, so it's hard to give an accurate answer.
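
For the sequential access and occasional removal described in the question, a plain `std::vector` can be filtered in a single linear pass with the standard erase-remove idiom. The sketch below is only an assumption about what "remove some of the co-ordinates" might look like; `Point` and `drop_points` are hypothetical names, and the removal condition is left to the caller.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical point type for illustration.
struct Point {
    double x;
    double y;
};

// Removes every point for which should_drop returns true.
// std::remove_if shifts the kept points to the front in one pass,
// then erase trims the leftover tail, so the whole call is O(n).
template <typename Pred>
void drop_points(std::vector<Point>& points, Pred should_drop) {
    points.erase(std::remove_if(points.begin(), points.end(), should_drop),
                 points.end());
}
```

Calling `drop_points(pts, [](const Point& p) { return p.x < 0.0; });` would, for example, discard all points with a negative x while keeping the remaining points in their original order.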

user541686

One way could be to read the whole file into a std::istringstream, then use that instead of the file. Using std::istringstream means that the whole file will be in memory which is much faster than a disk.
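
A minimal sketch of that idea, assuming the file fits comfortably in memory (as the question states). `load_file` is a hypothetical helper name; the returned stream can then be parsed with the usual `>>` operators instead of reading from the file directly.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Loads the entire file into memory and returns a stream over that buffer.
// If the file cannot be opened, the returned stream is simply empty.
std::istringstream load_file(const std::string& path) {
    std::ifstream file(path, std::ios::binary);
    std::ostringstream buffer;
    buffer << file.rdbuf();  // copies the whole file contents in one call
    return std::istringstream(buffer.str());
}
```

A pair such as `7866.777,505.821` could then be read with `double x, y; char comma; in >> x >> comma >> y;`, where `in` is the stream returned by `load_file`.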

Some programmer dude
  • Did you really just recommend using I/O streams to a question about high-performance data structures?! – user541686 Jul 20 '14 at 12:53
  • @Mehrdad At least it's a way to get the data into memory. The main question is, I think, does the OP really *need* to have the arcane, cryptic and unreadable (and therefore hard to maintain) code that comes from the most optimized low-level stuff that's needed to shave off a couple of milliseconds? – Some programmer dude Jul 20 '14 at 12:54
  • @JoachimPileborg How can I use istringstream to read and store values in the above case? Do you have any example or reference link? – csrockstar Jul 20 '14 at 14:49