
Is there any known hierarchy of C++ data types and containers, in the sense that a byte is made up of bits, an int is made of chars (each a byte), and a string is a collection of chars?

I am interested in putting together a DataTable that will need to store many different data types in a DataRow. I would hate to choose a List if a List turns out to be made up of a Set and a Map, which is itself a multidimensional array of Vectors, when I could have just chosen Vectors to start with.

Performance is the main goal, and my belief is that choosing the most basic container that supports the data will give the best performance.

I am using a JSON definition file such as this:

[{
  "Column": "Column1",
  "StartingPosition": 15,
  "ColumnWidth": 3,
  "DataType": "Int"
},
{
  "Column": "Column2",
  "StartingPosition": 19,
  "ColumnWidth": 15,
  "DataType": "String"
},
{
  "Column": "Column3",
  "StartingPosition": 35,
  "ColumnWidth": 15,
  "DataType": "String"
},
{
  "Column": "Column4",
  "StartingPosition": 51,
  "ColumnWidth": 4,
  "DataType": "Double"
}]

The definition file is used to parse binary data files. The JSON is read at runtime and must create the containers that store the data. Currently everything parses to a vector<string*>, which works, but if I want to preserve the original data types, I need to expand my storage to use containers that support multiple data types. I have looked at std::any, tuples, and heterogeneous containers (https://gieseanw.wordpress.com/2017/05/03/a-true-heterogeneous-container-in-c/).
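As a rough illustration of creating typed containers at runtime from the "DataType" strings, here is a minimal sketch that stores each column in its own vector and wraps the three possible vector types in a std::variant (C++17). The names Column, makeColumn, addColumn, append and column are hypothetical, and only the three data types shown in the JSON are assumed to exist.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// One column is one contiguous vector of a single element type.
// The alternatives mirror the "DataType" values in the JSON definition.
using Column = std::variant<std::vector<int>,
                            std::vector<std::string>,
                            std::vector<double>>;

// Hypothetical helper: choose the vector type from the "DataType" string.
Column makeColumn(const std::string& dataType) {
    if (dataType == "Int")    return std::vector<int>{};
    if (dataType == "String") return std::vector<std::string>{};
    if (dataType == "Double") return std::vector<double>{};
    throw std::invalid_argument("unknown DataType: " + dataType);
}

// Hypothetical table whose columns are created at runtime, in definition order.
struct DataTable {
    std::vector<std::string> names;
    std::vector<Column> columns;

    void addColumn(const std::string& name, const std::string& dataType) {
        names.push_back(name);
        columns.push_back(makeColumn(dataType));
    }

    // Appends a value to a column.
    // Throws std::bad_variant_access if the column does not hold vector<T>.
    template <typename T>
    void append(std::size_t col, T value) {
        assert(col < columns.size());
        std::get<std::vector<T>>(columns[col]).push_back(std::move(value));
    }

    // Read-only typed view of a whole column.
    template <typename T>
    const std::vector<T>& column(std::size_t col) const {
        assert(col < columns.size());
        return std::get<std::vector<T>>(columns[col]);
    }
};
```

With this layout the type check happens once per column access rather than once per cell, and each column stays a plain contiguous vector. Whether that is actually faster for a given workload would still need to be measured.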

I was thinking it may end up as an array of a custom struct that has a member for each data type plus an extra member that says which one is in use. That seemed like it would eat up extra memory, and if each cell of data would need a nested multidimensional array of multiple types, I felt it was important to choose the right container from the start.
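To make the memory concern concrete, the sketch below puts the struct described above (called FatCell here, a hypothetical name) next to a std::variant cell, which stores only one alternative at a time together with a small index. It assumes C++17 and only the three data types from the JSON. Cell, DataRow, CellType and typeName are illustrative names, not part of any existing code.

```cpp
#include <string>
#include <type_traits>
#include <variant>
#include <vector>

// The struct described above: a member for every type, plus a tag.
enum class CellType { Int, String, Double };
struct FatCell {
    CellType type;
    int i;
    std::string s;
    double d;
};

// A variant holds exactly one alternative at a time and records which one.
using Cell = std::variant<int, std::string, double>;
using DataRow = std::vector<Cell>;

// Returns the "DataType" name of whatever the cell currently holds.
std::string typeName(const Cell& cell) {
    return std::visit([](const auto& value) -> std::string {
        using T = std::decay_t<decltype(value)>;
        if constexpr (std::is_same_v<T, int>) return "Int";
        else if constexpr (std::is_same_v<T, std::string>) return "String";
        else return "Double";
    }, cell);
}
```

On common standard library implementations sizeof(Cell) comes out no larger than sizeof(FatCell), because the variant only needs room for its largest alternative plus the index. The standard does not guarantee exact sizes, so it is worth printing both on the target compiler.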

Alan
  • 1
    There is no such hierarchy between the types at all. You choose the container type based on the operations that you are intending to perform on it. – walnut Oct 21 '19 at 13:09
  • 1
    Besides the stream library, standard C++ doesn't have a hierarchy. `int`, `long`, `long long` are all separate, distinct data types. Are you coming from Java? – NathanOliver Oct 21 '19 at 13:10
  • There is not really a hierarchy. There are a few container adaptors, but other than that containers are usually not built from containers. Make `std::vector` your default and look for `std container cheat sheet` or similar – 463035818_is_not_an_ai Oct 21 '19 at 13:11
  • Last year when I built my data parser in C# I chose a Dictionary instead of a multidimensional array. It ended up being a horrible choice that almost ended the project's usefulness. While it worked fine on 50 records at a time, parsing an entire file became impossible. Just trying to avoid making similar mistakes here. – Alan Oct 21 '19 at 13:14
  • 2
    The three "main" (by this I mean the ones with the best performance) are `std::vector`, `std::unordered_set` and `std::unordered_map`. Those generally should be the first thing you go to. Once you pick one, write the logic and then benchmark it. That way you won't get too far along before you see how it will really perform. If those aren't doing it for you then you need to start looking into alternatives. – NathanOliver Oct 21 '19 at 13:17
  • @NathanOliver your comment could be an answer. I see a vote to close this question, but it seems a legitimate question, with a valid answer. – Alan Oct 21 '19 at 13:20
  • 1
    Perhaps this will help https://stackoverflow.com/questions/471432/in-which-scenario-do-i-use-a-particular-stl-container – Support Ukraine Oct 21 '19 at 13:24
  • To further improve performance of the std containers, there are some [really fast custom allocators out there](https://github.com/search?q=c%2B%2B+allocator)... – nada Oct 21 '19 at 13:24
  • 2
    @Alan To me the question as is is too broad, POB or a resource request. If you are looking for advice on which data structure to use, I would edit the question to detail what you want to do with it, and then we can give you better, less opinionated answers to that. – NathanOliver Oct 21 '19 at 13:25
  • @Alan Using an interpreted language like C# doesn't seem a very smart choice for a _data parser_ – nada Oct 21 '19 at 13:27
  • @nada, you are right, sadly it was the language I knew at the time I did the project. It did work, and with about 10 columns of data will parse about 20,000 records per second, but with C++ I should be able to do better. – Alan Oct 21 '19 at 13:41

0 Answers