
I have a comma-delimited text file with a few million entries. After every 23 entries there is a newline. I will store each full line as an element of a vector, with its 23 fields as elements of a sub-vector. So the first line will be vec[0][0-22], followed by vec[1][0-22], etc.

This file is part of my program and needs to be compiled with it. In other words, I don't want to ship the file separately and use ifstream to read the data from it at run time.

I already can sort the data using ifstream, but now I need to integrate the raw data into the program so that I can compile it all together.

I cannot simply turn this large comma-delimited text file into one long string and then split it on commas, because some of the fields are quoted and contain commas inside the quotes.

example:

  `19891656,PLANTAE,TRACHEOPHYTA,MAGNOLIOPSIDA,FABALES,FABACEAE,Zygia,ampla,(Benth.) Pittier,,,,,Pithecellobium amplum  |Pithecolobium brevispicatum  ,Jarendeua de Sapo,,,LC,,3.1,2012,stable,N
   19891919,PLANTAE,TRACHEOPHYTA,MAGNOLIOPSIDA,FABALES,FABACEAE,Zygia,biflora,L.Rico,,,,,,,,,VU,B2ab(iii),3.1,2012,stable,N
   2060,ANIMALIA,CHORDATA,MAMMALIA,CARNIVORA,OTARIIDAE,Arctocephalus,pusillus,"(Schreber, 1775)",,,,,Phoca pusilla,"Afro-Australian Fur Seal, Australian Fur Seal, Brown Fur Seal, Cape Fur Seal, South African Fur Seal",Arctocphale d'Afrique du Sud,,LC,,3.1,2015,increasing,N`

When my program runs it will source data from this mass of text, and it will not need to use ifstream with a path to an external file. How can I include this text file in my program? Is there a way to "include" text files? If I need to make a massive array of strings, how do I do this with quoted fields with commas between the quotes? I would be happy to clarify any part of this question which seems vague as I am really curious as to how I can make this work.

Technically this text file is a csv, but I am hesitant to include csv as a tag because I think people will think I am looking for a csv parsing solution.

1 Answer


You may want to write a script that converts each line of your data file into an initializer of a record struct, with a trailing comma after each line (except the last line, if you don't want to use a terminator entry; see below). The script will probably be specific to your data type. For example:

12,Joe,,,YES -> MyType(12,"Joe",0,0,true),
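A conversion step along these lines could be sketched in C++ itself (any scripting language would do just as well). This is only a minimal sketch under some assumptions: the function names `splitCsvLine`, `escapeForLiteral` and `toInitializer` are hypothetical, no field contains an embedded newline, and every field is emitted as a string literal in a brace initializer rather than in the typed `MyType(...)` form above. It does handle the quoted fields with embedded commas from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Split one CSV line into fields. Commas inside double quotes do not split,
// and a doubled quote ("") inside a quoted field yields one literal quote.
std::vector<std::string> splitCsvLine(const std::string& line) {
    std::vector<std::string> fields;
    std::string current;
    bool inQuotes = false;
    for (std::size_t i = 0; i < line.size(); ++i) {
        const char c = line[i];
        if (inQuotes) {
            if (c == '"' && i + 1 < line.size() && line[i + 1] == '"') {
                current += '"';  // escaped quote inside a quoted field
                ++i;
            } else if (c == '"') {
                inQuotes = false;  // closing quote
            } else {
                current += c;
            }
        } else if (c == '"') {
            inQuotes = true;  // opening quote
        } else if (c == ',') {
            fields.push_back(current);  // field boundary
            current.clear();
        } else {
            current += c;
        }
    }
    fields.push_back(current);  // last field (possibly empty)
    return fields;
}

// Escape a field so it can be placed inside a C++ string literal.
std::string escapeForLiteral(const std::string& field) {
    std::string out;
    for (char c : field) {
        if (c == '"' || c == '\\') out += '\\';
        out += c;
    }
    return out;
}

// Turn one CSV line into a brace initializer with a trailing comma,
// e.g.  12,Joe,,,YES  ->  {"12","Joe","","","YES"},
std::string toInitializer(const std::string& line) {
    const std::vector<std::string> fields = splitCsvLine(line);
    assert(!fields.empty());  // splitCsvLine always yields at least one field
    std::string out = "{";
    for (std::size_t i = 0; i < fields.size(); ++i) {
        if (i != 0) out += ",";
        out += "\"" + escapeForLiteral(fields[i]) + "\"";
    }
    out += "},";
    return out;
}
```

Reading the data file with `std::getline` and writing `toInitializer(line)` for each line would produce a file that can be included between the braces of something like a `std::vector<std::vector<std::string>>` initializer.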

Then #include the entire converted file in place of the element initializers of your data array/vector, for example:

MyType myData [] = 
{
#include "my_data_file_converted"
   MyType() //an optional terminal entry
};

Of course, MyType should have constructor(s) that accept your initialization sequences.
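To make that concrete, here is a rough sketch of what such a type might look like. The member names, the `isTerminator` flag, and the helpers `countRecords` and `recordAt` are assumptions made for illustration. The rows are written inline so the snippet stands alone (the second row is made-up sample data); in the real program they would come from the `#include` line instead.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical record type; the member names are illustrative only.
struct MyType {
    int id = 0;
    std::string name;
    int a = 0;
    int b = 0;
    bool flag = false;
    bool isTerminator = true;  // stays true only for default-constructed entries

    MyType() = default;  // used for the optional terminal entry
    MyType(int id_, std::string name_, int a_, int b_, bool flag_)
        : id(id_), name(std::move(name_)), a(a_), b(b_), flag(flag_),
          isTerminator(false) {}
};

const MyType myData[] = {
    // In the real program these rows would be replaced by:
    //   #include "my_data_file_converted"
    MyType(12, "Joe", 0, 0, true),
    MyType(13, "Ann", 0, 0, false),  // made-up sample row
    MyType()  // optional terminal entry
};

// Count the records by walking the array up to the terminal entry.
std::size_t countRecords() {
    std::size_t n = 0;
    while (!myData[n].isTerminator) ++n;
    return n;
}

// Bounds-checked access to a record.
const MyType& recordAt(std::size_t i) {
    assert(i < countRecords());
    return myData[i];
}
```

With the terminal entry in place, code that walks the table does not need to know the number of rows in advance. Without it, `sizeof(myData) / sizeof(myData[0])` gives the count directly.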

AndreyS Scherbakov
  • Programming is still fairly new to me, however I understood your suggestion for the most part. I was planning to learn Perl anyway so I will write the script converting the data to vector initializer list format using Perl as my next step. I have never seen #include used like that. I will experiment with using files like that. – user9232598 Jan 18 '18 at 04:58