I need to extract specific values from a string that is read from a file. The string uses several different delimiters, and I can't figure out how to extract each individual value from it.

#include <vector>
#include <string>
#include <sstream>
#include <iostream>    
#include <fstream> 
using namespace std;


string ss = "[4, 90]-3-name";
// I need to extract the values 4, 90, 3 and name
// the numbers can have multiple digits

stringstream tr(ss);
vector<string> result;

string substr;
while (getline(tr, substr, '-'))   // read from the stream tr, not from the string ss
    result.push_back(substr);

for (size_t i = 0; i < result.size(); i++)
    cout << result[i] << endl;

output:
[4, 90]
3
name
Silent

3 Answers

If you know all the possible delimiters, you can replace each one in ss with a hyphen, and then your code above will work. Note that adjacent delimiters then produce empty tokens, which you will probably want to skip. See the reference for the replace function: http://www.cplusplus.com/reference/string/string/replace/
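
As a rough sketch of that idea (not necessarily what the answer had in mind): the helper name splitByReplacing and the delimiter list are my own assumptions, and it uses std::replace_if from <algorithm> on a copy of the string rather than std::string::replace, because every single delimiter character has to be rewritten.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (name and signature are assumptions): rewrite every
// delimiter in a copy of the input to '-', then split that copy on '-'.
std::vector<std::string> splitByReplacing(std::string input, const std::string& delims)
{
    // input is taken by value, so the caller's string is left untouched.
    std::replace_if(input.begin(), input.end(),
                    [&delims](char ch) { return delims.find(ch) != std::string::npos; },
                    '-');

    std::vector<std::string> result;
    std::istringstream stream(input);
    std::string token;
    while (std::getline(stream, token, '-'))
        if (!token.empty())              // adjacent delimiters yield empty tokens
            result.push_back(token);
    return result;
}
```

Calling splitByReplacing("[4, 90]-3-name", "[], -") first turns the string into "-4--90--3-name" and then keeps only the non-empty pieces.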

Paul McCarthy

Paul's answer is clever, but the string may be read-only. Here's a version that doesn't require modifying the string:

#include <iostream>
#include <string>
#include <vector>
using namespace std;

int main()
{
    string ss = "[4, 90]-3-name"; // extract the values 4, 90, 3 and name
    vector<string> results;
    size_t size = ss.size();
    size_t first = 0;
    size_t i = 0;
    while (i < size)
    {
        char ch = ss[i];
        if (ch == '[' || ch == ']' || ch == ' ' || ch == '-' || ch == ',') // delimiter check
        {
            if (i > first)
                results.push_back(ss.substr(first, i - first));
            first = i + 1;
        }
        ++i;
    }
    if (i > first)
        results.push_back(ss.substr(first, i - first));
    for (const auto& s : results)
        cout << s << '\n';
    return 0;
}

Hopefully that's reasonably clear. The trick is the first variable, which tracks the index where we expect the next value to start (i.e. one past the delimiter we've just found). The if (i > first) checks make sure that we don't add any zero-length strings to the results.
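
The same scan can also be written with the string's own search functions. This is only a sketch: the function name tokenize and the default delimiter set are assumptions of mine, and it swaps the manual index loop for std::string::find_first_not_of and std::string::find_first_of.

```cpp
#include <string>
#include <vector>

// Hypothetical helper; the name and the default delimiter set are assumptions.
std::vector<std::string> tokenize(const std::string& text, const std::string& delims = "[], -")
{
    std::vector<std::string> tokens;
    // Start of the next token: the first character that is not a delimiter.
    std::size_t first = text.find_first_not_of(delims);
    while (first != std::string::npos)
    {
        // End of the token: the next delimiter, or npos at the end of the string.
        const std::size_t last = text.find_first_of(delims, first);
        // substr clamps the count, so an npos-based length is safe here.
        tokens.push_back(text.substr(first, last - first));
        // Searching from npos simply returns npos, which ends the loop.
        first = text.find_first_not_of(delims, last);
    }
    return tokens;
}
```

Because each token starts at a non-delimiter character, no zero-length strings are ever produced, so the extra length checks are not needed in this form.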

john

And now a more idiomatic C++ approach, using object-oriented idioms and modern C++ algorithms.

We have data and methods that belong together. For this, C++ has classes (structs). You can define a class with member variables and methods that work on those variables, so that everything works as one object.

Additionally, the class knows how to read and print its own values, and only the class should know that. This knowledge is encapsulated.

Next, we want to find the interesting data embedded in a string. The string always follows a certain pattern. In your case, you have 3 integers and one string as interesting data, with some delimiters in between, whatever they are.

To match such patterns and find the interesting parts of a string, C++ has std::regex. Regular expressions are extremely powerful and hence a little complicated to define.

In the example below I will use const std::regex re(R"((\d+).*?(\d+).*?(\d+).*?([\w_]+))");. This defines 4 capture groups (in parentheses) with arbitrary text in between, so any delimiter, space or other character is accepted.
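
To see what the four groups capture on a single line, here is a minimal sketch. The helper name extractFields is an assumption for illustration; the pattern is the one quoted above.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the four captured fields, or an empty vector
// if the pattern is not found anywhere in the line.
std::vector<std::string> extractFields(const std::string& line)
{
    static const std::regex re(R"((\d+).*?(\d+).*?(\d+).*?([\w_]+))");
    std::smatch sm;
    std::vector<std::string> fields;
    if (std::regex_search(line, sm, re))
        for (std::size_t i = 1; i < sm.size(); ++i)   // sm[0] is the whole match
            fields.push_back(sm[i].str());
    return fields;
}
```

For "[4, 90]-3-name" this yields the fields 4, 90, 3 and name. The lazy .*? between the groups is what skips over the brackets, comma, spaces and hyphens.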

If you want to be stricter, you can simply change the pattern, and then you can detect errors in the source data. See const std::regex re(R"(\[(\d+)\,\ (\d+)\]\-(\d+)\-([\w_]+))");. Note that this pattern expects exactly the layout of your example, with no spaces around the hyphens. If a line does not match, reading stops, so the input file is either not read at all or only read up to the end of the leading valid data.
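
A sketch of how strict validation could look with std::regex_match, which requires the whole line to match. The type ParsedLine, the function parseStrict and the pattern itself are assumptions of mine: the pattern is a variant of the strict one above that additionally tolerates whitespace around the separators.

```cpp
#include <regex>
#include <string>

// Hypothetical record type for this sketch; the field names are assumptions.
struct ParsedLine
{
    int x{};
    int y{};
    int id{};
    std::string name;
};

// Returns true and fills out only if the whole line has the expected shape.
bool parseStrict(const std::string& line, ParsedLine& out)
{
    // Assumed variant of the strict pattern: optional whitespace is allowed
    // around the separators, everything else must match exactly.
    static const std::regex re(R"(\s*\[(\d+),\s*(\d+)\]\s*-\s*(\d+)\s*-\s*(\w+)\s*)");
    std::smatch sm;
    if (!std::regex_match(line, sm, re))
        return false;                    // malformed line, out is left unchanged
    out.x = std::stoi(sm[1].str());      // std::stoi throws if the number does not fit in an int
    out.y = std::stoi(sm[2].str());
    out.id = std::stoi(sm[3].str());
    out.name = sm[4].str();
    return true;
}
```

A caller can then decide what to do with a bad line (skip it, report it, or stop reading) instead of silently extracting whatever digits happen to be present.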

Please see the complete example below:

#include <string>
#include <regex>
#include <iterator>
#include <iostream>
#include <sstream>
#include <fstream>
#include <vector>
#include <algorithm>
#include <cmath>
#include <ios>
#include <iomanip>

std::istringstream testFile{ R"([1, 1]-3-Big_City
  [1, 2] - 3 - Big_City
  [1, 3] - 3 - Big_City
  [2, 1] - 3 - Big_City
  [2, 2] - 3 - Big_City
  [2, 3] - 3 - Big_City
  [2, 7] - 2 - Mid_City
  [2, 8] - 2 - Mid_City
  [3, 1] - 3 - Big_City
  [3, 2] - 3 - Big_City
  [3, 3] - 3 - Big_City
  [3, 7] - 2 - Mid_City
  [3, 8] - 2 - Mid_City
  [7, 7] - 1 - Small_City)" };



const std::regex re(R"((\d+).*?(\d+).*?(\d+).*?([\w_]+))");


struct CityData
{
    // Define the city's data
    int xCoordinate{};
    int yCoordinate{};
    int cityId{};
    std::string cityName{};

    // Overload the extractor operator >> to read and parse a line
    friend std::istream& operator >> (std::istream& is, CityData& cd) {

        // We will read the line in this variable
        std::string line{};

        // Read the line and check, if it is OK
        if (std::getline(is, line)) {

            // Find the matched substrings
            std::smatch sm{};
            if (std::regex_search(line, sm, re)) {
                // And convert them to the city record
                cd.xCoordinate = std::stoi(sm[1]);
                cd.yCoordinate = std::stoi(sm[2]);
                cd.cityId = std::stoi(sm[3]);
                cd.cityName = sm[4];
            }
            else {
                is.setstate(std::ios::failbit);
            }
        }
        return is;
    }

    friend std::ostream& operator << (std::ostream& os, const CityData& cd) {
        return os << cd.xCoordinate << ' ' << cd.yCoordinate << ' ' << cd.cityId;
    }
};

constexpr int MinimumArrayDimension = 8;

int main()
{
    // Define the variable cityData with the vectors range constructor. Read complete input file and parse data
    std::vector<CityData> cityData{ std::istream_iterator<CityData>(testFile),std::istream_iterator<CityData>() };

    // We need the following values to print everything with the correct width
    // Read the maximum y coordinate (rows are indexed by y below)
    const int maxRow = std::max(std::max_element(
        cityData.begin(),
        cityData.end(),
        [](const CityData & cd1, const CityData & cd2) { return cd1.yCoordinate < cd2.yCoordinate; }
    )->yCoordinate, MinimumArrayDimension);

    // Read the maximum x coordinate (columns are indexed by x below)
    const int maxColumn = std::max(std::max_element(
        cityData.begin(),
        cityData.end(),
        [](const CityData & cd1, const CityData & cd2) { return cd1.xCoordinate < cd2.xCoordinate; }
    )->xCoordinate, MinimumArrayDimension);

    // Read the maximum city ID
    const int maxCityID = std::max_element(
        cityData.begin(),
        cityData.end(),
        [](const CityData & cd1, const CityData & cd2) { return cd1.cityId < cd2.cityId; }
    )->cityId;

    // Get the number of digits that we have here
    const int digitSizeForRowNumber = maxRow > 0 ? (int)log10((double)maxRow) + 1 : 1;

    const int digitSizeForColumnNumber = std::max(maxColumn > 0 ? (int)log10((double)maxColumn) + 1 : 1,
                                                  maxCityID > 0 ? (int)log10((double)maxCityID) + 1 : 1);

    // Lambda function for printing the header and the footer
    auto printHeaderFooter = [&]() {
        std::cout << std::setw(digitSizeForColumnNumber) << "" << " #";
        for (int i = 0; i <= (maxColumn+1)* (digitSizeForColumnNumber+1); ++i)
            std::cout << '#';
        std::cout << "#\n";
    };


    // Print the complete map
    std::cout << "\n\n";
    printHeaderFooter();

    // Print all rows
    for (int row = maxRow; row >= 0; --row) {

        // Print the row number at the beginning of the line
        std::cout << std::setw(digitSizeForColumnNumber) << row << " # ";

        // Print all columns
        for (int col = 0; col <= maxColumn; ++col)
        {
            // Find the City ID for the given row (y) and column (x)
            std::vector<CityData>::iterator cdi = std::find_if(
                cityData.begin(),
                cityData.end(),
                [row, col](const CityData & cd) { return cd.yCoordinate == row && cd.xCoordinate == col; }
            );
            // If we could find nothing
            if (cdi == cityData.end()) {
                // Print empty space
                std::cout << std::setw(digitSizeForColumnNumber) << "" << ' ';
            }
            else {
                // Print the CityID
                std::cout << std::right << std::setw(digitSizeForColumnNumber) << cdi->cityId << ' ';
            }
        }
        // Print the end of the line
        std::cout <<  "#\n";
    }
    printHeaderFooter();
    // Print the column numbers
    std::cout << std::setw(digitSizeForColumnNumber) << "" << "   ";
    for (int col = 0; col <= maxColumn; ++col)
        std::cout << std::right << std::setw(digitSizeForColumnNumber) << col << ' ' ;
    // And, end
    std::cout << "\n\n\n";

    return 0;
}

Please note: main reads the input and displays the output.

Because I cannot use files on SO, I read the data from a std::istringstream. This works the same way as reading from a file with std::ifstream.
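
The reason the two are interchangeable is that both std::istringstream and std::ifstream are std::istream objects. A small sketch of that idea, with function names that are assumptions of mine and not part of the program above:

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Collects all non-empty lines from any input stream.
std::vector<std::string> readLines(std::istream& in)
{
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line))
        if (!line.empty())
            lines.push_back(line);
    return lines;
}

// Same reading code, two different sources.
std::vector<std::string> readLinesFromFile(const std::string& path)   // path is supplied by the caller
{
    std::ifstream file(path);
    return readLines(file);          // a file that failed to open yields no lines
}

std::vector<std::string> readLinesFromText(const std::string& text)
{
    std::istringstream stream(text);
    return readLines(stream);
}
```

Any parsing code written against std::istream&, such as an overloaded operator >>, can therefore be tested with in-memory text and later pointed at a real file without changes.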

A M