How do I read in a specified number of characters from a file while still iterating through it?

Question

I have a file of data like this:

    Judy Henn      2 Oaklyn Road       Saturday 2001
    Norman Malnark 15 Manor Drive      Saturday 2500
    Rita Fish      210 Sunbury Road    Friday   750

I need to assign the first 20 characters as the name, next 20 as address, next 10 as day, and the number as yardSize, using the istream::get() method. My professor is requiring the use of .get() to accomplish this.

I am having a really hard time figuring out how to assign the data from the file to the right variables while still looping.

struct Customer{
    char name[21];
    char address[21];
    char day[11];
    int yardSize;
};

int main(){
    const int arrSize = 50;
    Customer custArr[arrSize];
    int i = 0;
    
    //set up file
    ifstream dataFile;
    dataFile.open("Data.txt");
       
    //try to open file
    if(!dataFile){
        cout << "couldn't open file";
    }
       
    //while dataFile hasn't ended
    while(!dataFile.eof()){
        dataFile.get(custArr[i].name, 21);   
        cout << custArr[i].name;
        i++;
    }
}; //end

I would have thought that the while loop would assign the first 21 characters into custArr[i].name, then loop over and over until the end of file. However, when I print out custArr[i].name, I get this and ONLY this:

Judy Henn           2 Oaklyn Road       Saturday   2001

I'm not sure how to go about assigning a specified number of characters to a variable, while still iterating through the entire file.

First see [Why !.eof() inside a loop condition is always wrong.](https://stackoverflow.com/q/5605125/9254539) — David C. Rankin, Feb 24 '21 at 22:18
@DavidC.Rankin reading that, should I have: `while(dataFile)` or should I have declared a variable to hold data like `string line` and used it like so `while(dataFile >> line)` ? — Rene, Feb 24 '21 at 22:55

score 1 · Accepted Answer · answered Feb 24 '21 at 23:38

First off, the character counts you mentioned don't match the data file you have shown. There are only 19 characters available for the name, not 20. And only 9 characters available for the day, not 10.

After fixing that, your code is still broken, as it is reading only into the Customer::name field. So it will try to read Judy Henn into custArr[0].name, then 2 Oaklyn Road into custArr[1].name, then Saturday into custArr[2].name, and so on.

I would suggest something more like this instead:

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
using namespace std;

struct Customer
{
    char name[21];
    char address[21];
    char day[11];
    int yardSize;
};

int main()
{
    const int arrSize = 50;
    Customer custArr[arrSize];
    string line;
    int i = 0;
    
    //set up file
    ifstream dataFile("Data.txt");
    if (!dataFile)
    {
        cout << "couldn't open file\n";
        return 1;
    }
       
    //read a line at a time until the array is full or no line can be read
    while ((i < arrSize) && getline(dataFile, line))
    {
        istringstream iss(line);
        if (iss.get(custArr[i].name, 21) &&
            iss.get(custArr[i].address, 21) &&
            iss.get(custArr[i].day, 11) &&
            iss >> custArr[i].yardSize)
        {
            cout << custArr[i].name;
            ++i;
        }
    }

    return 0;
}

That is cleaner than using `getline()` and having to clear `failbit` after each extraction using a `count`. OP will still need to loop from the end of `name`, `address`, and `date` to trim trailing whitespace. Why not a `std::vector custArr{};`? — David C. Rankin, Feb 24 '21 at 23:59
So I messed around with the code a bit and noticed that if I used `dataFile.get(custArr[i].name, 21);` instead of using the `istringstream iss(line)` it worked then too. Is there a reason why we want to use the `std::istringstream` and `iss.get()` instead of just `dataFile.get()` ? I'm still trying to learn what's best practice. — Rene, Feb 25 '21 at 00:22
@Rene it is safer to read an entire line first and then parse the line, rather than trying to parse a line while reading the line. — Remy Lebeau, Feb 25 '21 at 00:27

David C. Rankin · Answer 2 · 2021-02-25T02:03:34.280

Reading fixed-width (mainframe type) records isn't something C++ was written to do specifically. While C++ provides a wealth of string manipulation functions, reading fixed-width records is still something you have to put together yourself using basic I/O functions.

Building on the great answer by @RemyLebeau, a similar approach that uses std::vector<Customer> instead of an array of customers eliminates bounds concerns. With a std::vector you can read as many records as needed (up to the limits of available memory) without the risk of writing past the end of an array.

Additionally, as currently written, the leading and trailing whitespace is left in each array. For example, your name array would hold " Judy Henn " instead of just "Judy Henn". You will generally want to trim leading and trailing whitespace from what you store in a variable. Otherwise, you will need some way to deal with the whitespace every time the stored characters are used. While std::string provides a number of member functions you can use to trim leading and trailing whitespace, your use of plain old char[] requires a manual removal.

Adding code to trim the excess leading and trailing whitespace from the character arrays in the collection of Customer could be written as follows.

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
#include <cstring>

#define NAMLEN  20      /* if you need a constant, #define one (or more)   */
#define ADDRLEN 21      /* (each is a field width + 1 for the nul-char, as */
#define DAYLEN  10      /*  get() stores at most count - 1 characters)     */

struct Customer {
   char name[21];
   char address[21];
   char day[11];
   int yardSize;
};

int main (int argc, char **argv) {
    
    if (argc < 2) { /* validate at least one argument given for filename */
        std::cerr << "error: insufficient no. of arguments\n"
                     "usage: " << argv[0] << " <filename>\n";
        return 1;
    }
    
    std::string line {};                    /* string to hold each line read from file */
    std::vector<Customer> customers {};     /* vector of Customer struct */
    std::ifstream f (argv[1]);              /* file stream (filename in 1st arg) */
    
    if (!f.is_open()) { /* validate file open for reading */
        std::cerr << "error: file open failed '" << argv[1] << "'.\n"
                  << "usage: " << argv[0] << " <filename>\n";
        return 1;
    }
    
    while (getline (f, line)) {             /* read each line into line */
        std::stringstream ss (line);        /* create stringstream from line */
        Customer tmp {};                    /* declare temporary instance */
        char *p;                            /* pointer to trim leading ws from name */
        size_t wslen;                       /* whitespace len to use in trim */
        
        ss.get (tmp.name, NAMLEN);          /* read up to NAMLEN - 1 chars from ss */
        if (ss.gcount() != NAMLEN - 1) {    /* validate NAMLEN - 1 chars read */
            std::cerr << "error: invalid format for name.\n";
            continue;
        }
        for (int i = NAMLEN - 2; i >= 0 && tmp.name[i] == ' '; i--) /* loop from end of name */
            tmp.name[i] = 0;                        /* overwrite spaces with nul-char */
        for (p = tmp.name; *p == ' '; p++) {}       /* skip leading spaces */
        wslen = strlen (p);                         /* get remaining length */
        memmove (tmp.name, p, wslen + 1);           /* move name to front of array */
        
        ss.get (tmp.address, ADDRLEN);      /* read up to ADDRLEN - 1 chars from ss */
        if (ss.gcount() != ADDRLEN - 1) {   /* validate ADDRLEN - 1 chars read */
            std::cerr << "error: invalid format for address.\n";
            continue;
        }
        for (int i = ADDRLEN - 2; i >= 0 && tmp.address[i] == ' '; i--) /* loop from end of address */
            tmp.address[i] = 0;                     /* overwrite spaces with nul-char */
        
        ss.get (tmp.day, DAYLEN);           /* read up to DAYLEN - 1 chars from ss */
        if (ss.gcount() != DAYLEN - 1) {    /* validate DAYLEN - 1 chars read */
            std::cerr << "error: invalid format for day.\n";
            continue;
        }
        for (int i = DAYLEN - 2; i >= 0 && tmp.day[i] == ' '; i--)  /* loop from end of day */
            tmp.day[i] = 0;                         /* overwrite spaces with nul-char */
        
        if (!(ss >> tmp.yardSize)) {        /* extract final int value from ss */
            std::cerr << "error: invalid format for yardSize.\n";
            continue;
        }
        
        customers.push_back(tmp);           /* add temp to vector */
    }
    
    for (const Customer& c : customers)    /* output information (no copies) */
        std::cout << "\n'" << c.name << "'\n'" << c.address << "'\n'" << 
                    c.day << "'\n'" << c.yardSize << "'\n";
}

(note: the program expects the filename to read to be provided on the command line as the first argument. You can change how you provide the filename to suit your needs, but you should not hardcode filenames or use magic numbers in your code. You shouldn't have to recompile your program just to read from another file.)

Also note that the for() loops trimming whitespace work with 0-based indexes rather than a 1-based count of characters. get() stores at most one character fewer than the count it is given, e.g. NAMLEN - 1, so the last character read sits at index NAMLEN - 2. That is where each loop starts before working from the last character back towards the beginning of the array.

The removal of trailing whitespace simply loops from the last character of each field back toward the beginning of the array, overwriting each space with a nul-terminating character. To trim leading whitespace from name, a pointer is advanced past the leading spaces and then C memmove() is used to move the name back to the beginning of the array.

Example Use/Output

$ ./bin/read_customer_day_get dat/customer_day_get.txt

'Judy Henn'
'2 Oaklyn Road'
'Saturday'
'2001'

'Norman Malnark'
'15 Manor Drive'
'Saturday'
'2500'

'Rita Fish'
'210 Sunbury Road'
'Friday'
'750'

The output of each value has been wrapped in single-quotes to provide visual confirmation that the name field has had both leading and trailing whitespace removed, while address and day have both had trailing whitespace removed.
