6

Let's say you have a string S and a sequence of digits in a list L such that len(S) = len(L).

What would be the cleanest way of checking if you can find a bijection between the characters of the string to the digits in the sequence such that each character matches to one and only one digit.

For example, "aabbcc" should match with 115522 but not 123456 or 111111.

I have a complex setup with two dicts and loop, but I'm wondering if there's a clean way of doing this, maybe by using some function from the Python libraries.

inspectorG4dget
  • 110,290
  • 27
  • 149
  • 241
Ehsan Kia
  • 1,475
  • 18
  • 26
  • if a="abcabc" and b="123127" what will be the expected output? True or False – raton Nov 21 '12 at 16:01
  • False, because 'c' maps to both 3 and 7 (or the other way, 3 and 7 both map to 'c'). In a bijection, each element has one and only one matching element in the other set. – Ehsan Kia Nov 21 '12 at 22:09

5 Answers5

9

I would use a set for this:

In [9]: set("aabbcc")
Out[9]: set(['a', 'c', 'b'])

In [10]: set(zip("aabbcc", [1, 1, 5, 5, 2, 2]))
Out[10]: set([('a', 1), ('c', 2), ('b', 5)])

The second set will have length equal to the first set if and only if the mapping is surjective. (if it is not, you will have two copies of a letter mapping to the same number in the second set, or vice versa)

Here is code that implements the idea

def is_bijection(seq1, seq2):
    distinct1 = set(seq1)
    distinct2 = set(seq2)
    distinctMappings = set(zip(seq1, seq2))
    return len(distinct1) == len(distinct2) == len(distinctMappings)

This will also return true if one sequence is shorter than the other, but a valid mapping has already been established. If the sequences must be the same length, you should add a check for that.

reynoldsnlp
  • 1,072
  • 1
  • 18
  • 45
acjay
  • 34,571
  • 6
  • 57
  • 100
  • Hmm, I don't believe this works? With [1,1,1,1,1,1] You end up with (a,1),(b,1),(c,1) which has 3 items, just like the other set. So this gives you a surjection, not a full bijection. – Ehsan Kia Nov 21 '12 at 08:40
  • True. I initially just provided the idea. The code in the edited version checks both sets. – acjay Nov 21 '12 at 08:42
  • Quick offtopic question, is `a == b == c` considered bad practice? – Ehsan Kia Nov 21 '12 at 08:46
  • 2
    Knowing the Python community, it's probably actually considered more Pythonic than what I did. – acjay Nov 21 '12 at 08:47
  • 3
    @EhsanKia Hardly so. It's explicit and concise. – Lev Levitsky Nov 21 '12 at 08:47
  • @EhsanKia: I would imagine so. If you accidentally typo'd `a = b == c`, `a` turns into a `bool`. Though, I do agree, `a == b == c` is more expressive – inspectorG4dget Nov 21 '12 at 08:48
  • How to extend this function to filter out a sublist from both the sequences to show the elements that are behind of being non-bijective ? – sajis997 Mar 14 '17 at 14:09
0
import itertools

a = 'aabbcc'
b = 112233

z = sorted(zip(str(a), str(b)))
x = all(
    gx == g0
    for k, g in itertools.groupby(z, key=lambda x: x[0])
    for gx in g for g0 in g
)
print x

or:

import itertools

a = 'aabbcc'
b = 112233

z = zip(str(a), str(b))
x = all(
    (z1[0] == z2[0]) == (z1[1] == z2[1]) for z1 in z for z2 in z
)
print x
Eric
  • 95,302
  • 53
  • 242
  • 374
0

There's a more elegant way to do this (with sorting and itertools.groupby), but I'm wayy to sleep-deproved to figure that out right now. But this should still work:

In [172]: S = "aabbcc"

In [173]: L = [1, 1, 5, 5, 2, 2]

In [174]: mapping = collections.defaultdict(list)

In [175]: reverseMapping = collections.defaultdict(list)

In [176]: for digit, char in zip(L, S):
    mapping[digit].append(char)
    reverseMapping[char].append(digit)
   .....:     

In [177]: all(len(set(v))==1 for v in mapping.values()) and all(len(set(v))==1 for v in reverseMapping.values())
Out[177]: True

In [181]: S = "aabbcc"

In [182]: L = [1, 2, 3, 4, 5, 6]

In [183]: mapping = collections.defaultdict(list)

In [184]: reverseMapping = collections.defaultdict(list)

In [185]: for digit, char in zip(L, S):                                                                         
    mapping[digit].append(char)
    reverseMapping[char].append(digit)
   .....:     

In [186]: all(len(set(v))==1 for v in mapping.values()) and all(len(set(v))==1 for v in reverseMapping.values())
Out[186]: False

Hope this helps

inspectorG4dget
  • 110,290
  • 27
  • 149
  • 241
0

This respects the order:

>>> s = "aabbcc"
>>> n = 115522
>>> l1 = dict(zip(s, str(n))).items()
>>> l2 = zip(s, str(n))
>>> l1
[('a', '1'), ('c', '2'), ('b', '5')]
>>> l2
[('a', '1'), ('a', '1'), ('b', '5'), ('b', '5'), ('c', '2'), ('c', '2')]
>>> not bool([i for i in l2 if i not in l1])
True
>>> n = 115225
>>> l1 = dict(zip(s, str(n))).items()
>>> l2 = zip(s, str(n))
>>> not bool([i for i in l2 if i not in l1])
False
Emmanuel
  • 13,935
  • 12
  • 50
  • 72
0

Since you normally only talk about bijections between sets, I assume, unlike the other answers, that the order of the digits need not match the order of the letters. If so, there's a short, elegant solution, but it requires the collections.Counter class, which was introduced in python 2.7. For those stuck with an older version, there's a backport for 2.5+.

from collections import Counter

def bijection_exists_between(a, b):
    return sorted(Counter(a).values()) == sorted(Counter(b).values())

Testing:

>>> bijection_exists_between("aabbcc", "123123")
True
>>> bijection_exists_between("aabbcc", "123124")
False

Your examples are rather light on the edge cases, because another way of reading your question allows for the number of digits and number of letters to be unequal (i.e. you look for a bijection from the set of unique characters to the set of unique digits, so e.g. "aabbcc" would biject onto "123333".). If this is what you meant, use this version instead:

def bijection_exists_between(a, b):
    return len(set(a)) == len(set(b))
Lauritz V. Thaulow
  • 49,139
  • 12
  • 73
  • 92
  • Maybe I wasn't clear, bit a bijection is a two way mapping. In your last example, 'a' maps to both 1 and 2, where as 3 maps to both 'b' and 'c', so not only it is not bijective, it's not even surjective or injective. – Ehsan Kia Nov 21 '12 at 22:12
  • @EhsanKia You're using the term *bijection* in a strange way. A bijection is a two way mapping, yes, but it only exists between [sets](http://en.wikipedia.org/wiki/Set_(mathematics)). A string is not a set because it may contain duplicate values. So to answer your question it needs to be interpreted, and I've presented two fully valid such interpretations. My last example is a bijection in the sense that there exists a bijection from the set of characters in `"aabbcc"` ({a, b, c}) to the set of digits in `123333` ({1, 2, 3}). – Lauritz V. Thaulow Nov 22 '12 at 08:53