6

I was trying to solve leetcode problem "929. Unique Email Addresses", the code works fine at my computer on Visual Studio Code but when I pasted it on the leetcode, I got address sanitizer heap-buffer-overflow error. The code is shown below:

class Solution {
 public:
  int numUniqueEmails(vector<string>& emails) {
    string::iterator it;
    for (int i = 0; i < emails.size(); i++) {
      for (it = emails[i].begin(); *it != '@'; it++) {
        if (*it == '.') {
          emails[i].erase(it);
          it--;
        }
        if (*it == '+') {
          while (*it != '@') {
            emails[i].erase(it);
          }
          break;
        }
      }
      sort(emails.begin(), emails.end());
      for (int j = 0; j < emails.size(); j++) {
        if (emails[j] == emails[j + 1]) {
          emails.erase(emails.begin() + j);
          j--;
        }
      }
    }
    return emails.size();
  }
};

Can someone let me know why the error occurs?

rsjaffe
  • 5,600
  • 7
  • 27
  • 39
Liu john
  • 79
  • 1
  • 2
  • 1
    `emails[j] == emails[j+1]` in a loop enumerating the entire container is not going to bode well when hitting up the last entry in the list at `emails[j]`. Ask yourself what is at `emails[j+1]` in that case. Frankly, since you already sorted the sequence, why you're not using [remove-erase idiom](https://stackoverflow.com/questions/347441/erasing-elements-from-a-vector) is the real question. – WhozCraig Feb 24 '19 at 21:47
  • 1
    What happens with the loop `for(it = emails[i].begin(); *it!='@'; it++)` is the string *doesn't* contain a `'@'` character? – Some programmer dude Feb 24 '19 at 21:49
  • Also what happens if the string starts with `.` ? – M.M Feb 24 '19 at 22:56

1 Answers1

1

Because the following loops:

for (it = emails[i].begin(); *it != '@'; it++)
   ...

and

while (*it != '@')
  ...

iterate with it over the string without never ever verifying if the end of the string was reached. If you have a single ill-formatted email address in the input data, you therefore go out of bounds. So UB.

In addition, you also go out of bounds here:

 for (int j = 0; j < emails.size(); j++) {
    if (emails[j] == emails[j + 1]) {  // what when j == size-1 ????

Finally a potentially nasty issue as well (but could affect the result): your first for loop should end before the sort(). Why? because with some bad luck, the sorting will change the order of the elements, causing an unprocessed address to move before the current i index and thus remain unprocessed.

Demo with some additional display of intermediary results and some extreme cases.

Christophe
  • 68,716
  • 7
  • 72
  • 138