1

I am trying to create a function, readBooks, that opens an input file stream and reads a list of authors and book titles, with one comma-separated author and title pair on each line of the file (example: Douglas Adams,The Hitchhiker's Guide to the Galaxy). I am having trouble working out how to tokenize or split each line so that, using the comma as a delimiter, I can insert the author and the title into two separate arrays. Any help is appreciated.

The size of the arrays is defined by the capacity parameter of the function. The arrays are allocated before readBooks() is called, so there is no need to allocate them dynamically.

Here is the code I have so far:

int readBooks (string filename, string titles[], string authors[], int books, int capacity){
    ifstream file;
    file.open (filename);
    if (file.fail()){
        return -1;
    }
    else{
        int i = 0;
        int j = 0;
        while (i < capacity){
            string line;
            getline (file, line);
            if (line.length() > 0){

            }
        }
    }
}
Gardener
Jake

2 Answers

0

This would be a little simpler using the Boost libraries, where you can split on multiple delimiters at once. However, you can use getline() to read the file one line at a time and then use find() to look for the comma. Once you find the comma, be sure to advance past it before taking the title, and trim any whitespace from both fields.

#include <iostream>
#include <fstream>
#include <string>
#include "readBooks.h"

#include <algorithm>
#include <cctype>
#include <locale>

/* trim from start (in place) [Trim functions borrowed from 
 * https://stackoverflow.com/questions/216823/whats-the-best-way-to-trim-stdstring] 
 */

static inline void ltrim(std::string &s) {
    s.erase(s.begin(), std::find_if(s.begin(), s.end(), [](unsigned char ch) {
        return !std::isspace(ch);
    }));
}

// trim from end (in place)
static inline void rtrim(std::string &s) {
    s.erase(std::find_if(s.rbegin(), s.rend(), [](unsigned char ch) {
        return !std::isspace(ch);
    }).base(), s.end());
}

// trim from both ends (in place)
static inline void trim(std::string &s) {
    ltrim(s);
    rtrim(s);
}


using namespace std;

int readBooks (string filename, string titles[], string authors[], int books, int capacity){
    ifstream file;
    file.open (filename);
    if (file.fail()){
        return -1;
    }
    else{
        int i = 0;
        string line;

        while(  i < books && i < capacity && getline(file,line) ) {
            // Find the position of the comma; skip lines that have none,
            // since stepping past line.end() would be undefined behaviour
            auto comma = find(line.begin(), line.end(), ',');
            if (comma == line.end()) {
                continue;
            }
            // Everything before the comma is the author
            string author(line.begin(), comma);
            trim(author);
            authors[i] = author;
            // Everything after the comma is the title
            string title(comma + 1, line.end());
            trim(title);
            titles[i] = title;
            i++; // increment our index
        }
    }
    file.close();
    return 0;
}

Here is a sample main() to call it.

#include <iostream>
#include "readBooks.h"

int main() {

  const int capacity{1000};
  const int books{3};
  std::string authors[capacity];
  std::string titles[capacity];
  std::string filename{"booklist.txt"};

  int retval = readBooks(filename, titles, authors, books, capacity);

  return retval;
}
Nimantha
Gardener
  • I like this idea a lot with the trim functions. Could this idea work as well if I had the book name followed by multiple numbers that are separated by spaces and there is a single comma that separates the book title and numbers (example: book, 1 2 3 4 5 6 ...) and trying to separate the book title and each number following it? I am assuming it would look very similar but you would change the parameters for the find functions? – Jake Oct 21 '18 at 19:19
  • Yes. You could do that. However, once you add a third type of item you are searching for on the same line, it would probably be better to switch over to regular expressions. The find() solution is a quick and dirty solution where we had to do explicit substringing. If you have three fields, then regular expressions take care of the substringing more elegantly. – Gardener Oct 21 '18 at 21:12
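For the "title, then space-separated numbers" layout asked about in the comments above, here is a minimal sketch of one possible approach. It does not use regular expressions as the comment suggests; instead it splits at the comma with find() and lets a std::istringstream extract the numbers. The names BookRatings and parseRatingsLine are hypothetical, and the sketch assumes the numbers are integers and that the title needs no trimming.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type for illustration: a title plus its numbers.
struct BookRatings {
    std::string title;
    std::vector<int> numbers;
};

// Parses a line such as "book, 1 2 3 4 5 6".
// Returns false if the line has no comma; out is left unchanged in that case.
bool parseRatingsLine(const std::string& line, BookRatings& out) {
    const std::string::size_type comma = line.find(',');
    if (comma == std::string::npos) {
        return false;
    }
    BookRatings result;
    result.title = line.substr(0, comma);

    // operator>> skips leading whitespace, so the numbers need no explicit trim.
    std::istringstream rest(line.substr(comma + 1));
    int value = 0;
    while (rest >> value) {
        result.numbers.push_back(value);
    }
    out = result;
    return true;
}
```

If titles may carry surrounding whitespace, the trim() helper from the answer could be applied to result.title before it is stored.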
-1

First of all, why do you want to use arrays for the output data (std::string[]) if you aren't even sure about the sizes of the outputs? std::vector is the better solution here.

#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

void readBooks(std::string const& filename, std::vector<std::string> &titles, std::vector<std::string> &authors) {
    std::ifstream file;
    // .....
    // file is opened here
    // ....
    std::string author, title;
    // Read up to the comma; the loop ends cleanly once the file is exhausted.
    while (std::getline(file, author, ',')) {
        // An author with nothing after the comma means the file is malformed.
        if (!std::getline(file, title, '\n'))
            throw std::runtime_error("File is broken?");
        authors.push_back(author);
        titles.push_back(title); //make sure there is no space after ',', as it'd be included in the string.
        //To remove such a space title.substr(1) can be used.
    }
}

In short, it's based on the delimiter parameter of std::getline().

EDIT: Added a check for the case where the file ends in ','.

John Cvelth
  • Whenever you read something, the read function must check that the read succeeded. –  Oct 19 '18 at 17:29
  • @NeilButterworth , I understand, that if one isn't sure about topology of the file, it's better to be safe than sorry, but in this case, it looks like file was generated using similarly structured program, so I don't think of how something worse than an empty line being `push`ed into `titles` vector can happen. – John Cvelth Oct 19 '18 at 17:33
  • "so I don't think of how something worse than an empty line being pushed into titles vector can happen" - but that is wrong behaviour and it is trivial to prevent. –  Oct 19 '18 at 17:35
  • The size of the arrays should be defined by the input parameter "capacity" in the function. I would like to use vectors as well but I want some more experience with arrays and am pushing myself to use them in this program I am writing. – Jake Oct 19 '18 at 17:35
  • @Jake, then don't mind my usage of vectors and just add `temp` values into your arrays instead of `push_back`s. – John Cvelth Oct 19 '18 at 17:38
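Pulling the comment thread together, here is a rough sketch of how the delimiter form of std::getline() could fill the pre-allocated arrays instead of vectors, with every read checked before anything is stored. The helper name readPairs and its signature are hypothetical and not taken from either answer. It takes a std::istream so it can be fed an already opened std::ifstream or a string stream, and it assumes that lines without a comma, or with nothing after the comma, should simply be skipped.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads "author,title" pairs from any input stream into
// caller-allocated arrays. Returns the number of pairs actually stored.
int readPairs(std::istream& in, std::string titles[], std::string authors[], int capacity) {
    int count = 0;
    std::string line;
    while (count < capacity && std::getline(in, line)) {
        std::istringstream fields(line);
        std::string author, title;
        // Both reads must succeed; otherwise the line is skipped.
        if (std::getline(fields, author, ',') && std::getline(fields, title)) {
            authors[count] = author;
            titles[count] = title;
            ++count;
        }
    }
    return count;
}
```

Returning the count rather than 0 lets the caller know how many array slots hold valid data. Any space after the comma is kept in the title, so the caveat about trimming still applies.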