3

Suppose the input on stdin looks like this:

2014-01-23,  AA, 20
2014-05-30,  BB,2    //notice that I might have optional space
2015-03-24, CC,   5
//...
//... and so on 

How do I write a program in C++ that efficiently parses the year and month, and also the subsequent fields? I am really stuck on this parsing problem.

What I want to do with the subsequent fields is store each pair, such as AA and 20, in a map, so that map[AA]=20, and so on.

I can do that part myself, but I can't figure out how to read and parse the input. Please help.


Attempt:

#include <algorithm>
#include <iostream>
#include <map>
#include <string>
using namespace std;

int main() {
    int year, month;
    int num;
    string key;
    map<string, int> mapping;
    string s;
    getline(cin, s, '-');
    year = stoi(s);
    getline(cin, s, '-');
    month = stoi(s);
    getline(cin, s, ',');   // skip the day
    // reading the AA, BB, CC field
    getline(cin, s, ',');
    s.erase(remove(s.begin(), s.end(), ' '), s.end());   // drop every space
    key = s;
    // now, reading the number field following AA, BB, CC
    getline(cin, s, '\n');
    s.erase(remove(s.begin(), s.end(), ' '), s.end());   // drop every space
    num = stoi(s);
    mapping[key] = num;
}
phuclv
wrek
  • You need to take this one step at a time. First, write a program that reads the text one line at a time. Step two: parse each line of text into its individual fields. Step three: parse the first field into its components: year, month, and day. Problem solved. See how easy it was? – Sam Varshavchik Nov 01 '16 at 02:26
  • yes, that is easy. But my code is a little bit long. – wrek Nov 01 '16 at 02:40
  • There's an old Vulcan proverb: the longer the code, the likelier is that it has a bug. – Sam Varshavchik Nov 01 '16 at 02:42
  • Just need help to figure it out – wrek Nov 01 '16 at 05:07
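
The three steps from the first comment could be sketched roughly as follows. This is only one way to do it, not the commenter's own code. The names Record, parse_line, and read_all are made up for illustration, and the sketch assumes that keys contain no commas and that anything after the number (such as a trailing comment) can be ignored.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical record type holding the fields of one input line.
struct Record {
    int year = 0, month = 0, day = 0;
    std::string key;
    int value = 0;
};

// Steps two and three: split one line into fields and the date into parts.
// Returns false when the line does not have the expected layout.
bool parse_line(const std::string& line, Record& r) {
    std::istringstream in(line);
    char dash1 = 0, dash2 = 0, comma = 0;
    // operator>> skips leading whitespace before each number and char.
    if (!(in >> r.year >> dash1 >> r.month >> dash2 >> r.day >> comma)) return false;
    if (dash1 != '-' || dash2 != '-' || comma != ',') return false;
    // Skip the optional spaces, then read the key up to the next comma.
    if (!std::getline(in >> std::ws, r.key, ',')) return false;
    // Trim trailing spaces/tabs (npos + 1 wraps to 0, which clears the key).
    r.key.erase(r.key.find_last_not_of(" \t") + 1);
    if (r.key.empty()) return false;
    // Read the number; whatever follows it on the line is ignored.
    return static_cast<bool>(in >> r.value);
}

// Step one: read the stream one line at a time and fill the map.
// Lines that fail to parse are skipped silently (an assumption of this sketch).
std::map<std::string, int> read_all(std::istream& in) {
    std::map<std::string, int> mapping;
    std::string line;
    while (std::getline(in, line)) {
        Record r;
        if (parse_line(line, r)) mapping[r.key] = r.value;
    }
    return mapping;
}
```

Calling read_all(std::cin) from main would then give the map asked for in the question; the year and month of each line are available in the Record if they are needed as well.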

3 Answers

1

Another option is to use std::regex (or Boost.Regex if you're on an "ancient" compiler)

Match the line with this

(\d{4})\-(\d{2})\-(\d{2}),\s*(.+),\s*(.+)

then get the year, month, day, first field, and second field from capture groups 1 through 5 respectively (m[1] to m[5] when matching into a std::smatch)
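
A minimal sketch of how this could look with std::regex, under a few assumptions: ParsedLine and parse_with_regex are hypothetical names, and the pattern is tightened slightly compared to the one above (the key is assumed to consist of word characters and the last field of digits), so that optional spaces and trailing text do not end up in the captures.

```cpp
#include <regex>
#include <string>

// Hypothetical holder for the five captured fields.
struct ParsedLine {
    int year = 0, month = 0, day = 0;
    std::string key;
    int value = 0;
};

// Returns true and fills `out` when the line starts with a matching record.
bool parse_with_regex(const std::string& line, ParsedLine& out) {
    // Compiled once and reused on every call, since building a std::regex
    // is much more expensive than running it.
    static const std::regex pattern(
        R"(^\s*(\d{4})-(\d{2})-(\d{2})\s*,\s*(\w+)\s*,\s*(\d+))");
    std::smatch m;
    // regex_search (not regex_match) so that text after the number is allowed.
    if (!std::regex_search(line, m, pattern)) return false;
    out.year  = std::stoi(m[1].str());
    out.month = std::stoi(m[2].str());
    out.day   = std::stoi(m[3].str());
    out.key   = m[4].str();
    out.value = std::stoi(m[5].str());
    return true;
}
```

A caller would read lines with std::getline(std::cin, line) and, for each line where parse_with_regex returns true, store mapping[p.key] = p.value.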

phuclv
  • if I have a very large amount of data (many lines to read), do you think this method would still be efficient? – wrek Nov 01 '16 at 05:07
  • It depends; the only way to know is to benchmark it. A compiled regex can be reused, so it can perform quite well, and it is easier to change than a fixed parser. – phuclv Nov 01 '16 at 05:29
  • This might be a dumb question. How do I use `(\d{4})-(\d{2})-(\d{2}),\s*(.+),\s*(.+)` with regex? – wrek Nov 02 '16 at 02:48
  • [Learn](http://www.regular-expressions.info/) about regex first, then use [`regex_search` or `regex_match`](http://en.cppreference.com/w/cpp/regex) like in the example – phuclv Nov 02 '16 at 03:00
0

An answer to a similar problem was given here using std::basic_string::find. You can use "-" and "," as delimiters, skipping any spaces that follow a comma.
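
As a rough illustration of the find-based approach (not the code from the linked answer), the sketch below locates the two dashes and the two commas with std::string::find and cuts the fields out with substr. The helper names trim and split_line are hypothetical, and the sketch assumes the key itself contains no comma.

```cpp
#include <exception>
#include <string>

// Hypothetical helper: strip leading and trailing spaces/tabs.
std::string trim(const std::string& s) {
    const std::string::size_type first = s.find_first_not_of(" \t");
    if (first == std::string::npos) return "";
    const std::string::size_type last = s.find_last_not_of(" \t");
    return s.substr(first, last - first + 1);
}

// Splits "YYYY-MM-DD, KEY, N" using find; returns false on malformed lines.
bool split_line(const std::string& line, int& year, int& month,
                std::string& key, int& value) {
    const std::string::size_type npos = std::string::npos;
    const std::string::size_type dash1 = line.find('-');
    if (dash1 == npos) return false;
    const std::string::size_type dash2 = line.find('-', dash1 + 1);
    if (dash2 == npos) return false;
    const std::string::size_type comma1 = line.find(',', dash2 + 1);
    if (comma1 == npos) return false;
    const std::string::size_type comma2 = line.find(',', comma1 + 1);
    if (comma2 == npos) return false;
    try {
        year  = std::stoi(line.substr(0, dash1));
        month = std::stoi(line.substr(dash1 + 1, dash2 - dash1 - 1));
        // stoi skips leading whitespace and stops at the first non-digit,
        // so optional spaces and trailing text need no special handling.
        value = std::stoi(line.substr(comma2 + 1));
    } catch (const std::exception&) {
        return false;  // a field that should be numeric was not
    }
    key = trim(line.substr(comma1 + 1, comma2 - comma1 - 1));
    return !key.empty();
}
```

Each line read with std::getline can be passed to split_line, and on success mapping[key] = value stores the pair.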

ZeroPad
0

Try this:

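#include <bits/stdc++.h>
using namespace std;

int main(){
    string date, key;
    int x;
    // ws skips the optional spaces; getline stops at (and discards) each comma
    getline(cin >> ws, date, ',');
    getline(cin >> ws, key, ',');
    cin >> x;
    cout << date << " " << key << " " << x << endl;
    return 0;
}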
Genarito
  • [Why should I not #include <bits/stdc++.h>?](https://stackoverflow.com/q/31816095/995714) and [Why is “using namespace std;” considered bad practice?](https://stackoverflow.com/q/1452721/995714) – phuclv Oct 01 '20 at 02:53
  • Thank you for pointing it out. This answer was a few years ago and my knowledge was very limited – Genarito Oct 01 '20 at 12:41