0

I'm interested in an efficient Python-implementation of the so-called 'interleaving function' f which takes two numbers a, b in (0,1) and interleaves their decimal digits, i.e.

f(a,b) := 0.a1 b1 a2 b2 a3 b3 ... where a = 0.a1 a2 a3... and b = 0.b1 b2 b3... are the decimal representations of a,b.

Mathematically speaking, the function f is a one-to-one map from (0,1)x(0.1) to (0,1).

Can you suggest how to efficiently implement this map in Python so as to preserve it being one-to-one?

rmcerafl
  • 115
  • 3
  • Would this give you enough information to produce your own answer: [How to take the nth digit of a number in python](https://stackoverflow.com/questions/39644638/how-to-take-the-nth-digit-of-a-number-in-python) – mapto Oct 08 '20 at 14:19
  • Thanks, @mapto. Though I remain unsure about whether the naive implementation of f preserves injectivity: if I pass two 'decimal strings' a and b to f, it will return a decimal string c = f(a,b) of length |c| = |a| + |b| (with |a| the length of a). This procedure will be one-to-one only if the string c is the 'full (interleaved) concatenation of a and b' (i.e., only if none of the letters in a or b are lost after their interleaving); can this be guaranteed in Python? – rmcerafl Oct 08 '20 at 14:29
  • I was about to implement it for you, but to me it appears there's some ambiguity: for example if |a| = 2*|b|, the first 2*|a| digits are interleaved. How would you interleave the second half of a, when there are no more corresponding digits in b? – mapto Oct 10 '20 at 13:13
  • I presume that you speaking of |c| = |a| + |b| is an imprecision, because the '0.' part will not be duplicated so c will have less symbols than a and b combined. – mapto Oct 10 '20 at 13:14
  • You are right, @mapto, I was being imprecise about how the 'length' |a| of an argument a of f is actually defined, sorry for that. Strictly speaking, every number (a or b) has infinitely many decimal digits (making my above notion of 'length' meaningless), though if a and b are rationals (as is the case for machine numbers) only finitely many of those digits will be non-zero. In this case, I hope the following example illustrates how my above comment may be interpreted in a meaningful way: – rmcerafl Oct 10 '20 at 20:32
  • Say that a = 0.01200305 and b = 0.1004. Then, as we can fill b up with zeros up to the highest significant decimal power of a (which is 8), that is write b = 0.10040000, we can (unambigously) 'define' |a| = |b| = 8 and obtain that c = f(a,b) = 0.0110200400300050, which has length |c| = |a| + |b| = 16. – rmcerafl Oct 10 '20 at 20:52
  • (So if a and b are rationals, we may define their length relative to one another --- as you correctly pointed out I missed making that explicit in my first comment --- by setting |a| = |b| = max(highstdecpow(a), highstdecpow(b)), where 'highstdecpow(q)' is the highest (in terms of the absolute value of its exponent) decimal power of the rational number q whose decimal coefficient does not vanish.) Sorry for the confusion! – rmcerafl Oct 10 '20 at 20:53

1 Answers1

0

For an efficient implementation one needs to make sure to achieve two things: minimal asymptotic complexity in terms of big O notation and efficient computational operators, avoiding repeated or otherwise unnecessary calculation.

Given the problem, it is unlikely that it could be solved with an algorithm that is less than linear on the length of the input numbers. In terms of operators, given that we work with decimal formatting, it would be difficult that we could benefit from some bit-wise (binary) computations. Thus we're probably best with general mathematical operations.

Using float

A first naive implementation would attempt executing the function on floating point numbers:

def interleave_float(a: float, b: float) -> float:
    a_rest = a
    b_rest = b
    result = 0
    dst_pos = 1.0  # position of written digit
    while a_rest != 0 or b_rest != 0:
        dst_pos /= 10  # move decimal point of write
        a_rest *= 10  # move decimal point of read
        result += a_rest // 1 * dst_pos
        a_rest %= 1  # remove current digit

        dst_pos /= 10
        b_rest *= 10
        result += dst_pos * (b_rest // 1)
        b_rest %= 1

    return result

However, a simple test shows a problem - inherently limited precision of floating point arithmetic which distorts already at the 16-17th digit after the floating point:

>>> a = 0.987654321
>>> b = 0.1234567890123456789
>>> print(a)
0.987654321
>>> print(f"{b:.20}")  # formatted to show higher precision
0.12345678901234567737
>>> print(f"Float:  {interleave_float(a, b):.50}")
Float:  0.91827364554637280757987127799424342811107635498047

Using Decimal

A common way to overcome the precision problem is to use decimal.Decimal, the python implementation of fixed-point decimal arithmetic:

from decimal import Decimal, getcontext
getcontext().prec = 50  # increase number precision

def interleave_fixed(a: Decimal, b: Decimal) -> Decimal:
    a_rest = a
    b_rest = b
    result = 0
    dst_pos = Decimal(1)
    while a_rest != 0 or b_rest != 0:
        dst_pos *= Decimal(0.1)
        a_rest *= 10  # move decimal point
        result += a_rest // 1 * dst_pos
        a_rest %= 1  # remove current digit

        dst_pos *= Decimal(0.1)
        b_rest *= 10
        result += dst_pos * (b_rest // 1)
        b_rest %= 1

    return result

This seems to work better for b, but unfortunately, it also leads to imprecision at about the same digit in the result. This imprecision is also signalled by the Inexact flag in the context after the calculation:

>>> print(getcontext())
Context(prec=50, rounding=ROUND_HALF_EVEN, Emin=-999999, Emax=999999, capitals=1, clamp=0, flags=[], traps=[InvalidOperation, DivisionByZero, Overflow])
>>> a = Decimal(".987654321")
>>> b = Decimal(".1234567890123456789")
>>> print(a)
0.987654321
>>> print(b)
0.1234567890123456789
>>> print(f"Fixed:  {interleave_fixed(a, b)}")
Fixed:  0.91827364554637287146771953200668367263491993253785
>>> print(getcontext())
Context(prec=50, rounding=ROUND_HALF_EVEN, Emin=-999999, Emax=999999, capitals=1, clamp=0, flags=[Inexact, FloatOperation, Rounded], traps=[InvalidOperation, DivisionByZero, Overflow])

Using str

Another approach which should not impose limits due to precision (and which you have brought up yourself) is to do syntactical processing with strings:

def interleave_str(a: str, b: str) -> str:
    result = "0."
    src_pos = 2  # position of read digit
    while len(a) > src_pos or len(b) > src_pos:
        result += a[src_pos] if len(a) > src_pos else "0"

        result += b[src_pos] if len(b) > src_pos else "0"

        src_pos += 1

    return result[:-1] if result.endswith("0") else result

remove traling 0 if present

The algorithm doesn't do validation, so it remains to you to decide what you might want to add. Yet, testing this gives the desired precision:

>>> a = "0.987654321"
>>> b = "0.1234567890123456789"
>>> print(a)
0.987654321
>>> print(b)
0.1234567890123456789
>>> print(f"String: {interleave_str(a, b)}")
String: 0.91827364554637281900010203040506070809

...but what can one do with the resulting string? Maybe convert it into a Decimal again? Depends how you want to use the outcome.

mapto
  • 605
  • 9
  • 23
  • (As to the possible usage of this function: I originally hoped to, as you mentioned, convert the interleaved string into a decimal again and then use this decimal to perform some calulations with it. Though it seems rather difficult to not have rounding errors distort these calculations, not to speak of the overflow in memory that would result from simple operations such as (repeatedly) multiplying the decimals.) – rmcerafl Oct 11 '20 at 13:15