0

I'm trying to understand this code block, I saw this in class and I still don't really get it.

I understand what and HOW a map works. It's a key-pair value object. In this case I just don't understand what is happening. I see that we have a char and int, but I don't understand how they interact in this instance.

class Solution {
public:
    bool isIsomorphic(string s, string t) {
        map<char, int> mapS;
        map<char, int> mapT;

        for(int i = 0; i < s.length(); i++)
        {
            if(mapS[s[i]] != mapT[t[i]]) return false;
            
            mapS[s[i]] = i+1;
            mapT[t[i]] = i+1;
        }
        return true;
    }
};

I tried printing out the results after each for, and I got 0 and 1 (not in the binary sense). I also know that we are taking a character at 'i' + 1 and placing it at the maps index. What am I missing?

Thanks!

Sorry I'm still not used to posting good questions here.

I am trying to understand the solution first and foremost. So instead, like someone suggested, I'll go through it with my reasoning.

First, we initiate two maps (mapS and mapT).

Second, iterate over the length of string s. (I guess we can assume string t is the same length? That part isn't clear.)

We check if the character at s[i] is equal to t[i], and that also has to exist in the map. (this is where it falls apart for me)

I'm treating the line after that as looking ahead one and adding it to the map.

We then return if we don't have an issue.

Now, excuse me if im wrong, but according to the documentation shouldn't we comparing the keys here? Or am missing something entirely.

Also, someone mentioned if I understand isomorphic. I don't entirely. I have a general idea.

Alan Birtles
  • 32,622
  • 4
  • 31
  • 60
  • what exactly do you not understand? What is the input? What is the code suppsoed to do? Btw the code invokes undefined behavior when `t.length() < s.length()`. – 463035818_is_not_an_ai Jan 11 '23 at 15:27
  • the key is the character in the string and the mapped value is the index+1 where that character appears in the string – 463035818_is_not_an_ai Jan 11 '23 at 15:28
  • 5
    Better approach to this question: State what you think it does, provide your reasoning, and ask if you're right. If you're right. Cool beans. You didn't need us. If you're wrong, we can examing the flaw in the reasoning and explain how to avoid similar problems in the future. If someone just barfs out an answer to the question as currently worded, "It, uhm, loops." you won't learn nearly as much. It also shows you're put some effort into the problem, and one should never underestimate the social importance of showing their work. – user4581301 Jan 11 '23 at 15:28
  • 4
    or different way to approach it: How would you write the code to check if two strings are isomorphic? If you have no clue, if you do not understand what it means for two strings to be isomorphic then its natural that you struggle to understand this code, thats to be clarified first – 463035818_is_not_an_ai Jan 11 '23 at 15:31
  • Unrelated: Pass `string s, string t` by `const` reference and you probably save the computer some work. The compiler loves hints that let it know stuff won't change and that it doesn't need to be copied. Also give things descriptive names. It makes the code easier to read and it REALLY sucks when you spend hours figuring out you used `i` where you should have used `j`. – user4581301 Jan 11 '23 at 15:38
  • ... dont forget to tell them about [Why is “using namespace std;” considered bad practice?](https://stackoverflow.com/questions/1452721/why-is-using-namespace-std-considered-bad-practice), that they should pass the strings by reference, that they should rather use a unordered map, and that not everything should be member of a class, and last but not least, that the function potentially invokes undefined behavior – 463035818_is_not_an_ai Jan 11 '23 at 15:38
  • *We check if the character at s[i] is equal to t[i], and that also has to exist in the map.* Not quite. It's comparing the position returned by looking up the character from each string in their respective maps . We're checking to see if we have the same structure with different contents. What we want to find is, for example that T and E are consistently in the same spots in the string. We're using the map to see if the last T and the last E were in the same spot. If they were not, we found a T where we didn't find an E, or vice versa, and the structure is not the same. – user4581301 Jan 11 '23 at 16:16
  • @user4581301 and since the key is the character, when we use s[i] (the char, aka the key) the value assigned would be I, so if the key has 2 values it’s not isomorphic? Am I getting closer or.. I promise I’m doing my best – Wayne Miles Jan 11 '23 at 16:23
  • Recommendation: Feed in two simple strings, 3-4 characters is enough, that you know are isomorphic. Step through the program with a debugger line by line and watch what happens. Do the same with input you know is not isomorphic. Once you have a grip on simple examples, Complicate things a bit. If you don't know how to use a debugger, good odds I'd say since it never seems to be taught in schools, [here's a link to how to use the Visual Studio debugger](https://learn.microsoft.com/en-us/visualstudio/debugger/debugger-feature-tour?view=vs-2022) that'll get you started. – user4581301 Jan 11 '23 at 16:26
  • All modern GUI-driven debuggers are pretty much the same, just different button placement and wording. Here's a [slightly deeper walkthrough on how to quickly move through the code](https://learn.microsoft.com/en-us/visualstudio/debugger/debugger-feature-tour?view=vs-2022) – user4581301 Jan 11 '23 at 16:26
  • In a `map` a key can only have one value, but only the most recent one matters. The function would have mismatched and exited if any of the previous values inspected didn't match, so you don't care about storing them. – user4581301 Jan 11 '23 at 16:27
  • 1
    Important thing to know about `map`: When you use [`operator []`](https://en.cppreference.com/w/cpp/container/map/operator_at) on a key that has not been stored, a new pair is created mapping the key to a [default-initialized](https://en.cppreference.com/w/cpp/language/default_initialization) value. So `mapS[s[i]]` if `s[i]i` hasn't been seen yet will map `s[i]` to 0 and then return 0. This is why the position are all `i+1`, to avoid false matches between with the key that actually otherwise would legitimately be at 0 and new keys. This is also why you can't use `[]` on a constant `map`. – user4581301 Jan 11 '23 at 16:36

2 Answers2

3

First I'll go over the explanation you provided.

First, we initiate two maps (mapS and mapT).

Pedantically, we initialize two maps.

Second, iterate over the length of string s. (I guess we can assume string t is the same length? That part isn't clear.)

We can assume for the sake of this problem that t is at least as long as s.

We check if the character at s[i] is equal to t[i],

No. That would be s[i] == t[i].

and that also has to exist in the map. (this is where it falls apart for me)

No. mapS[s[i]] is 0 if we never stored a value for s[i] in mapS.

Here's what the if condition actually does:

  • It looks up s[i] in mapS, getting back some int—either the int we most recently stored for key s[i], or 0 if we never stored an int for key s[i].

  • It looks up t[i] in mapT, getting back some int—either the int we most recently stored for key t[i], or 0 if we never stored an int for key t[i].

  • If the two ints are different, it makes the function return early with the value false.

I'm treating the line after that as looking ahead one and adding it to the map.

It doesn't look ahead. Looking ahead would be something like s[i+1] or t[i+1]. It does store i+1 in both maps, but it doesn't “look ahead”.

We then return if we don't have an issue.

We return true if we exit the loop body because i reached the length of s without ever returning early.

Now, excuse me if im wrong, but according to the documentation shouldn't we comparing the keys here?

What documentation?

Also, someone mentioned if I understand isomorphic. I don't entirely. I have a general idea.

It is not obvious what “isomorphic” means here, but we'll figure it out.

Now I'll walk through my analysis of this function.

One way to understand what a loop does is to start with simple input and walk through the loop “manually”, as many times as needed. (You start with a simple input so you don't have to walk through it too many times!)

Let's focus on the mapS variable and see what the loop does to it. Since mapS is indexed using elements from s, we need to pick some string for s. Let's use "abcab".

The program only updates mapS in one statement:

        mapS[s[i]] = i+1;

Now let's walk through the loop and see how mapS evolves. We'll assume for now that the if condition is always false.

i s[i] mapS before loop body mapS after loop body
0 'a' (empty) ['a']=1
1 'b' ['a']=1 ['a']=1 ['b']=2
2 'c' ['a']=1 ['b']=2 ['a']=1 ['b']=2 ['c']=3
3 'a' ['a']=1 ['b']=2 ['c']=3 ['a']=4 ['b']=2 ['c']=3
4 'b' ['a']=4 ['b']=2 ['c']=3 ['a']=4 ['b']=5 ['c']=3

So at the start of the loop body, we can say: for every character c in s.substr(0, i) (the first i characters of s), mapS[c] is one larger than the last index of c in s.substr(0, i).

Note that s.substr(0, i) is the empty string when i == 0.

As I mentioned earlier, mapS returns 0 for any key that isn't actually stored in the map. So let's think of 0 as meaning “s.substr(0, i) doesn't contain c”.

With that meaning for 0, we can say that, at the start of the loop body, for every c, mapS[c] encodes the last index of c in s.substr(0, i) as follows:

  • mapS[c] == 0 encodes the meaning “c is not in s.substr(0, i)”.

  • mapS[c] == x encodes the meaning “x - 1 < i and s[x - 1] == c and s.substr(0, i) does not contain c at any larger index than x - 1”.

Since mapT is updated exactly the same way as mapS, except using characters from t, we can say that it encodes the same meaning as mapS, but relative to t instead of s.

Now we're ready to analyze the if statement's condition: mapS[s[i]] != mapT[t[i]].

Remember that s.substr(0, i) does not reach s[i]; it stops just before s[i]. That is, "abcab"[2] is 'c', and "abcab".substr(0, 2) is "ab"; the substring does not end with 'c'.

So, in the if condition:

  • mapS[s[i]] is the encoded last index of s[i] in s.substr(0, i). (Remember that the “encoded last index” may encode the meaning “no such index”.)

  • mapT[t[i]] is the encoded last index of t[i] in t.substr(0, i).

The if statement thus makes the function return early, with value false, if these encoded last indexes different.

Now suppose the loop goes all the way to i == s.length(), with the if condition never being true. Then the loop exits normally, and the program has proven that every time it saw some character c in s, and some corresponding character d at the same index in t, it had previously seen c and d at identical earlier (encoded) indexes in s and t.

It's a little bit of a leap to get from that to a more holistic understanding of isIsomorphic, but here's what isIsomorphic(s, t) means:

Is there a way to uniformly map each character c that appears in s to a unique character d that appears in t? And vice versa? If so, isIsomorphic(s, t) returns true. Otherwise, it returns false.

Consider this example: isIsomorphic("abcab", "zyxzy"). It returns true, because there is a uniform, bidirectional mapping between the characters that occur in s and the characters that occur in t:

s t
a z
b y
c x

But isIsomorphic("abcab", "zyzzy") is false. Both 'a' and 'c' in s would have to map to 'z' in t—but 'z' in t can only map back to one character, not to both 'a' and 'c'.

This sort of uniform, unique, bidirectional mapping is called a bijection.

An isomorphism is always a bijection, but generally has additional requirements, depending on the kind of structures it's defined on. For example, in graph theory, you could define a bijection between the nodes of two graphs, but an isomorphism would also have to define a bijection between the edges of those graphs.

A better name for this function might be bijectionExists, or samePattern.

rob mayoff
  • 375,296
  • 67
  • 796
  • 848
  • I was wrong. My point is still valid. Somewhere earlier the answer should describe exactly what `mapS` stores. In code reviews at work, I almost always insist that each map has a comment explaining the key, value, and what the mapping represents. – Mooing Duck Jan 11 '23 at 17:43
  • 1
    This function should absolutely have comments if used outside of a toy context. – rob mayoff Jan 11 '23 at 17:45
  • Thank you! You also gave me information on what I actually should look at when I am studying these maps! I appreciate it. I'll have to look at both of the answers I see here a ton to get it. – Wayne Miles Jan 12 '23 at 00:24
  • Did you mean to have `'b'` in 4th row of first table? – sklott Jan 12 '23 at 15:16
  • I did. Thank you for pointing out the error. – rob mayoff Jan 12 '23 at 18:37
2

Getting started with understanding the code first requires understanding the problem it is attempting to solve.

Two strings are isomorphic if every character in string A can be mapped to every character in string B.

For example, ACAB and XCXY are isomorphic.

A -> X
C -> C
B -> Y

And another, ACAB and XCZY are not isomorphic.

A -> X, Z  # 'A' cannot map to multiple characters
C -> C
B -> Y

So, how does the code go about figuring this out? We'll look at the parts of the function body.

std::map<char, int> mapS;
std::map<char, int> mapT;

Declare two maps, one for each string. They are both empty.

for(int i = 0; i < s.length(); i++)

This loop only cares about the length of s, and we've already hit a road block. If we wanted to check for a one-way mapping where s -> t but t !-> s, this function doesn't do that correctly. I'd go further and say that if the lengths of s and t are different, the function should immediately return false. Because we are guaranteed to read out of bounds if t.length() < s.length().

Moving on:

{
    if(mapS[s[i]] != mapT[t[i]]) return false;

    mapS[s[i]] = i+1;
    mapT[t[i]] = i+1;
}

We have to take these together. The if check simply wants to know if the corresponding map elements have the same value. If they don't, we have a mismatch in our mapping and can immediately return false. If the values do match, we update both maps together. i + 1 is used because of how operator[] works with maps. If a key doesn't exist, attempting to index with that key will create an entry in the map, and initialize the int value to 0. Adding 1 is a cheap way to get around false positives.

Finally, if you make it through the loop without returning false, it is assumed that the strings are isomorphic and you return true.

But this function is bogus. It can almost be said that it checks for "one-way" isomorphism, when t is longer than s, but if s is longer, it can still return true and invoke undefined behavior while doing so.

The better solution, to me, is to store the actual character mapping. You also need to keep track of the unique characters in the second string. And the first check made should be for equal length.

#include <iostream>
#include <set>
#include <string>
#include <unordered_map>

class Solution {
 public:
  bool isIsomorphic(const std::string& s, const std::string& t) {
    if (s.length() != t.length()) return false;

    // The map stores the character mapping from s to t. Characters in s
    // are the key, and the value is the corresponding character in t.
    std::unordered_map<char, char> map;

    // The set stores the **unique** characters of t. This allows us to know
    // if we already have a mapping to a character in t, if it exists in
    // the set.
    std::set<char> trackedChars;

    for (std::size_t i = 0; i < s.length(); ++i) {
      char sChar = s[i];
      char tChar = t[i];

      // Have I mapped this character yet?
      if (map.find(sChar) != map.end()) {  // Yes
        // Is the mapping still correct?
        if (map[sChar] != tChar) {  // Map mismatch found
          return false;
        }
      } else {  // No
        // Has the corresponding character been mapped already?
        if (trackedChars.find(tChar) != trackedChars.end()) {
          // tChar is already mapped
          return false;
        } else {  // tChar is also not yet mapped
          map[sChar] = tChar;
          trackedChars.insert(tChar);
        }
      }
    }

    // Making it out of the loop implies every character mapped correctly.
    return true;
  }
};

int main() {
  std::string one{"ACAB"};
  std::string two{"XCXY"};

  Solution s;
  std::cout << std::boolalpha << s.isIsomorphic(one, two) << '\n';
}

Output:

true

I don't want to pile on to what's been said, but for a class, this code structure is garbage. It should just be a free function, or maybe inserted into a namespace. But that's beyond the scope of this question.

sweenish
  • 4,793
  • 3
  • 12
  • 23
  • I'll admit, that's a ton of info to take in. I'm going to need to read through these multiple times to understand it completely. BUT this does help so much. The prof was..less than helpful. I even dare to say they were leading me in the wrong direction. Thank you! – Wayne Miles Jan 12 '23 at 00:23