Where is nat base 10 converted to num base 2?

Question

For term "15::nat", the value 15 is automatically converted to the binary value (num.Bit1 (num.Bit1 (num.Bit1 num.One))). I would like to know where that's done, so I can know how it's done.

(Small update: I know that 15 is a type class numeral constant, which gets converted to binary Num.num, which gets mapped to nat, so maybe the nat is decimal, or maybe it's binary. However, my basic question remains the same. Where is the decimal to binary conversion done?)

I show below how I know about the conversion.

I define notation to show me that Num.numeral :: (num => 'a) is coercing 15 to Num.num.

abbreviation nat_of_numeral :: "num => nat" where
  "nat_of_numeral n == (numeral n)"

notation nat_of_numeral ("n@N|_" [1000] 1000)

Next, 15 gets coerced to binary in a term command:

term "15::nat"
(*The output:*)
term "n@N|(num.Bit1 (num.Bit1 (num.Bit1 num.One))) :: nat"

And next, 15 gets coerced before it gets used in a proof goal:

lemma "15 = n@N|(num.Bit1 (num.Bit1 (num.Bit1 num.One)))" (*
  goal (1 subgoal):
   1. n@N|(num.Bit1 (num.Bit1 (num.Bit1 num.One))) =
      n@N|(num.Bit1 (num.Bit1 (num.Bit1 num.One))) *)
by(rule refl)

The conversion seems to be decently fast, as shown by this:

(*140 digits: 40ms*)
term "12345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
  ::nat"

I also want to convert base 2 to base 10, but if I see how the above is done, it might show me how to do that.

score 0 · Accepted Answer · edited May 23 '17 at 10:26

Here's the overview of how I think it's done.

It starts in the Num.thy parse_translation, at the ML function numeral_tr.

In that function, there is the use of Lexicon.read_xnum of lexicon.ML, which takes a string argument.

I don't know the details, but string "15" is extracted from an expression like

  "15 = n@N|(num.Bit1 (num.Bit1 (num.Bit1 num.One)))",

and fed to read_xnum, where there is this equivalency:

  Lexicon.read_xnum "15" = {leading_zeros = 0, radix = 10, value = 15}

In read_xnum the function Library.read_radix_int of library.ML is used, and it takes as arguments an integer radix and a list of digits, such as shown in this equivalency:

  Library.read_radix_int 10 ["1","5"] = (15, []);

Next, the conversion from 15 to (num.Bit1 (num.Bit1 (num.Bit1 num.One))) is a result of IntInf.quotRem in the parse_translation.

This takes me out of convenient HTML linking for isabelle.in.tum.de/repos.

IntInf.quotRem is in Isabelle2013-2\contrib\polyml-5.5.1-1\src\basis\IntInf.sml, and is defined as

  val quotRem: int*int->int*int = RunCall.run_call2C2 POLY_SYS_quotrem,

which leads to Isabelle2013-2\contrib\polyml-5.5.1-1\src\basis\RuntimeCalls.ML line 83:

  val POLY_SYS_quotrem = 104 (* DCJM 05/03/10 *).

For Windows that takes me to Isabelle2013-2\contrib\polyml-5.5.1-1\src\libpolyml\x86asm.asm line 1660, though I could be leaving out some important details:

  quotrem_really_long:
    MOVL    Reax,Redi
  CALLMACRO   CALL_IO    POLY_SYS_quotrem
  CALLMACRO   RegMask quotrem,(M_Reax OR M_Redi OR M_Redx OR Mask_all).

I think that's enough of an overview to answer my question. A string "15" is converted to an ML integer, and then some assembly language level quotient/remainder division is used to convert 15 to binary Num.num.

I'm actually only interested in how "15" is converted to 15 in read_radix_int, and whether the details of that function will help me. I explain the application more below.

From here, I put in more detail for myself, to put in a nice form much of the information I collected.

What's sort of the deal

I start with a binary number as a bool list, something like [True, True, True, True] for binary 15, though here, I simplify it in some ways.

That then gets converted to something like [[True, False, True, False], [True, False, True]], which is decimal 15.

The real problem can be finding the right search phrase

Searching on something like "decimal to binary conversion" returns links to a lot of basic math algorithms, which aren't what I'm looking for.

Normal programming conversions aren't what I need either, which is just making explicit the underlying fact that integers are already in binary form:

Finally, other searches led me to the right word, "radix". Additionally, I resorted to doing searches on how things are done in C, where bit shifts are what I'm trying to tie into, though these may not be what I need:

Radix leading back to Num.thy

"Radix" took me back to Num.thy, which is where I thought the action might be, but hadn't seen anything that was obvious to me.

I include some source from Num.thy, lexicon.ML, and IntInf.sml:

(* THE TWO MAIN EXTERNAL FUNCTIONS IN THE TRANSLATIONS: read_xnum, quotRem *)
ML{* 
  Lexicon.read_xnum;       (* string -> 
                               {leading_zeros: int, radix: int, value: int}.*)
  Lexicon.read_xnum "15";  (* {leading_zeros = 0, radix = 10, value = 15}.*)
  Lexicon.read_xnum "15" = {leading_zeros = 0, radix = 10, value = 15};

  IntInf.quotRem;          (* int * int -> int * int.*)
  IntInf.quotRem (5,3);    (* (1, 2) *)
*}

parse_translation {* (* Num.thy(293) *)
  let
    fun num_of_int n =
      if n > 0 then
        (case IntInf.quotRem (n, 2) of
          (0, 1) => Syntax.const @{const_name One}
        | (n, 0) => Syntax.const @{const_name Bit0} $ num_of_int n
        | (n, 1) => Syntax.const @{const_name Bit1} $ num_of_int n)
      else raise Match
    val pos = Syntax.const @{const_name numeral}
    val neg = Syntax.const @{const_name neg_numeral}
    val one = Syntax.const @{const_name Groups.one}
    val zero = Syntax.const @{const_name Groups.zero}
    fun numeral_tr [(c as Const (@{syntax_const "_constrain"}, _)) $ t $ u] =
          c $ numeral_tr [t] $ u
      | numeral_tr [Const (num, _)] =
          let
            val {value, ...} = Lexicon.read_xnum num;
          in
            if value = 0 then zero else
            if value > 0
            then pos $ num_of_int value
            else neg $ num_of_int (~value)
          end
      | numeral_tr ts = raise TERM ("numeral_tr", ts);
  in [("_Numeral", K numeral_tr)] end
*}

ML{* (* lexicon.ML(367) *)    
  (* read_xnum: hex/bin/decimal *)      
  local     
  val ten = ord "0" + 10;
  val a = ord "a";
  val A = ord "A";
  val _ = a > A orelse raise Fail "Bad ASCII";

  fun remap_hex c =
    let val x = ord c in
      if x >= a then chr (x - a + ten)
      else if x >= A then chr (x - A + ten)
      else c
    end;      
  fun leading_zeros ["0"] = 0
    | leading_zeros ("0" :: cs) = 1 + leading_zeros cs
    | leading_zeros _ = 0;      
  in      
  fun read_xnum str =
    let
      val (sign, radix, digs) =
        (case Symbol.explode (perhaps (try (unprefix "#")) str) of
          "0" :: "x" :: cs => (1, 16, map remap_hex cs)
        | "0" :: "b" :: cs => (1, 2, cs)
        | "-" :: cs => (~1, 10, cs)
        | cs => (1, 10, cs));
    in
     {radix = radix,
      leading_zeros = leading_zeros digs,
      value = sign * #1 (Library.read_radix_int radix digs)}
    end;      
  end;  
*}

ML{* (* IntInf.sml(42) *)
  val quotRem: int*int->int*int = RunCall.run_call2C2 POLY_SYS_quotrem
*}

A big part of what I was looking for; radix again

(* THE FUNCTION WHICH TRANSLATES A LIST OF DIGITS/STRINGS TO A ML INTEGER *)
ML{* 
  Library.read_radix_int;     (* int -> string list -> int * string list *)
  Library.read_radix_int 10 ["1","5"];  (* (15, []): int * string list.*)
  Library.read_radix_int 10 ["1","5"] = (15, []);
*}

ML{* (* library.ML(670)  *)
  fun read_radix_int radix cs =
    let
      val zero = ord "0";
      val limit = zero + radix;
      fun scan (num, []) = (num, [])
        | scan (num, c :: cs) =
            if zero <= ord c andalso ord c < limit then
              scan (radix * num + (ord c - zero), cs)
            else (num, c :: cs);
    in scan (0, cs) end;
*}

The low-level division action

There's the high-level Integer.div_mod in integer.ML, which wasn't used for the translation above:

  fun div_mod x y = IntInf.divMod (x, y);

In Isabelle2013-2\contrib\polyml-5.5.1-1\src\basis\IntInf.sml, the higher-level divMod can be compared to the lower-level quotRem:

  ML{* (* IntInf.sml(42) *)
    val quotRem: int*int->int*int = RunCall.run_call2C2 POLY_SYS_quotrem

    (* This should really be defined in terms of quotRem. *)
    fun divMod(i, j) = (i div j, i mod j)
  *}

With the low-level action for quotRem apparently being done at the assembly language level:

  ML{* (* RuntimeCalls.ML(83) *)
    val POLY_SYS_quotrem = 104 (* DCJM 05/03/10 *)
  *}

  (* x86asm.asm(1660)
    quotrem_really_long:
        MOVL    Reax,Redi
    CALLMACRO   CALL_IO    POLY_SYS_quotrem
    CALLMACRO   RegMask quotrem,(M_Reax OR M_Redi OR M_Redx OR Mask_all)
  *)

These various forms of div and mod are important to me.

I'm thinking that div and mod are to be avoided if possible, but tying into quotRem would be the way to go, I think, if division is unavoidable.