-1

How can I get the Amount of digits of a number by using Delphi's inline assembler?

For example :

13452     should return 5
1344      should return 4
9721343   should return 7

etc

My attempt was this:

function CalculateLength(Number : integer) : Integer;
begin
asm
  PUSH Length(Number)
  MOV @Result, EAX
end;
end;
rkhb
  • 14,159
  • 7
  • 32
  • 60
yq8
  • 145
  • 1
  • 10
  • 2
    I'm voting to close this question as off-topic because it is simply asking for code. – Jonathon Reinhart May 03 '15 at 12:54
  • No, I've added my attempt. I am not just asking for code, I am askign for help. – yq8 May 03 '15 at 12:55
  • What are you looking for? Is your problem fully specified? How to cater for negative values? Why do you need to use asm? Do you know anything at all about asm? Do you have performance constraints? What are they? What will you do with the output of your function? Are you expecting us to write the code, or are keen to learn? – David Heffernan May 03 '15 at 12:57
  • 2
    Write it in Delphi first. Then look at disassembly (debug), and see what kind of code is generated. Try to understand how you can write it yourself. – Sedat Kapanoglu May 03 '15 at 12:59
  • @David Heffernan : I try to learn assembly, thats why I try to code this in ASM and not in Delphi – yq8 May 03 '15 at 13:00
  • Probably the most efficient is a series of if statements. Assume x positive, then `if x < 10 then len := 1 else if x < 100 then len := 2` etc. – David Heffernan May 03 '15 at 13:00
  • So first of all you need to work out the algorithm you want to use. You've not thought about that yet. I've given you a start above. Then code it in Delphi. Then look at code compiler produces. Then make your own. However, I suspect you don't know enough of the foundations to write asm yet, judging by your attempt in the question. You need to walk before you run. – David Heffernan May 03 '15 at 13:02
  • 1
    Problems with your asm include: 1. Use of asm inlined into Pascal. That's dangerous. Use pure asm functions. 2. Use of variable names in asm. That is brittle because it hides register use. 3. Pushing onto stack, but not popping. 4. Wishful thinking that `Length(...)` would magically do what you need. That indicates a huge disconnect between your understanding and reality. 5. Inexplicable MOV statement. Basically, you don't know enough foundation to have any hope of being successful. That might sound harsh but I'm trying to help. You need to gain that foundation. – David Heffernan May 03 '15 at 13:07
  • I assume Length will return the function address instead of my value? :/ – yq8 May 03 '15 at 13:09
  • Don't assume anything. You have zero hope of success if you base your work on assumptions. You cannot guess your way to success in a sensible time frame. – David Heffernan May 03 '15 at 13:11
  • Have a look at the IntToStr function in Sysutils. There you have your answer. If you cant read it, then you better start by buying a book – Jens Borrisholt May 03 '15 at 13:15
  • @Jens We don't need to convert to string though do we. What aspect of that function are you referring to? – David Heffernan May 03 '15 at 13:17
  • @yq8 I wonder if you can manage to write the function in Pascal. Post that here in an edit. – David Heffernan May 03 '15 at 13:17
  • @DavidHeffernan in order for converting an integer to a string you need to count the digits else you can not allocate the string. – Jens Borrisholt May 03 '15 at 13:20
  • @Jens Not really. You can write output into a fixed length buffer that you know is large enough, and count up as you are going along. – David Heffernan May 03 '15 at 13:22

2 Answers2

5

Here is a faster algorithm avoiding a division:

function CountDigits(anInt: Cardinal): Cardinal; inline;
var 
  cmp: Cardinal;
begin
  cmp := 10;
  Result := 1;
  while (Result < 10) and (cmp <= anInt) do
  begin
    cmp := cmp*10;
    Inc(Result);
  end;
end;

function CountDigitsAsm(anInt: Cardinal): Cardinal;
asm
   mov ecx,$a          // cmp := 10;
   mov edx,$1          // Result := 1;
   jmp @loop2
   cmp eax,edx         // while cmp <= anInt do
   jb @done
@loop1:
   add ecx,ecx         // cmp := cmp*10;
   lea ecx,[ecx+ecx*4]
   inc edx             // Inc(Result);
@loop2:
   cmp edx,$0a         // (Result < 10)
   jnb @done
   cmp eax,ecx
   jnb @loop1
@done:
   mov eax,edx
end;

begin
  WriteLn(CountDigitsAsm(10));
  WriteLn(CountDigitsAsm(99));
  WriteLn(CountDigitsAsm(999));
  WriteLn(CountDigitsAsm(9999));
  WriteLn(CountDigitsAsm(99999));
  ReadLn;
end.

Note that the pas version can be inlined and could possibly be faster than the asm version.


Ok, here is a lut (lookup table) solution to avoid any multiplication:

function CountDigitsLUT(anInt: Cardinal): Cardinal; inline;
const
  lut: array[1..10] of cardinal =
    (9,
     99,
     999,
     9999,
     99999,
     999999,
     9999999,
     99999999,
     999999999,
     $FFFFFFFF);
begin
  Result := 1;
  while anInt > lut[Result] do
    Inc(Result);
end;

And an unrolled version:

function CountDigitsUnrolled(anInt: Cardinal): Cardinal; inline;
begin
  if (anInt < 10) then Result := 1 else
  if (anInt < 100) then Result := 2 else
  if (anInt < 1000) then Result := 3 else
  if (anInt < 10000) then Result := 4 else
  if (anInt < 100000) then Result := 5 else
  if (anInt < 1000000) then Result := 6 else
  if (anInt < 10000000) then Result := 7 else
  if (anInt < 100000000) then Result := 8 else
  if (anInt < 1000000000) then Result := 9 else
    Result := 10;
end;

And @TLama's case contribution:

function CountDigitsCase(Value: Cardinal): Cardinal; inline;
begin
  case Value of
    0..9: Result := 1;
    10..99: Result := 2;
    100..999: Result := 3;
    1000..9999: Result := 4;
    10000..99999: Result := 5;
    100000..999999: Result := 6;
    1000000..9999999: Result := 7;
    10000000..99999999: Result := 8;
    100000000..999999999: Result := 9;
  else
    Result := 10;
  end;
end;

Timing the different solutions:

Unrolled: 4097 ms
Case:     1444 ms
LUT:      3233 ms
pas:      6199 ms
asm:      6747 ms

Test code:

  sw := TStopWatch.StartNew;
  for i := 1 to 1000000000 do
    j := CountDigitsXXX(i);
  WriteLn(sw.ElapsedMilliseconds,' ',j);

Addendum

Inspired by this answer, here is a Delphi implementation which is an O(1) solution:

function OpenBit(AValue: Cardinal): Cardinal; register;
asm // Highest bit set
  BSR EAX, EAX
end;

function CountDigitsO1(value: Cardinal): Cardinal; inline;
const
  Powers: array[0..9] of Cardinal = (
    0, 
    10, 
    100, 
    1000, 
    10000, 
    100000, 
    1000000,
    10000000, 
    100000000, 
    1000000000);
  MaxDigits: array[0..32] of cardinal =
    (1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5,
     6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 9, 9, 9, 10, 10, 10, 10);
begin
  Result := MaxDigits[OpenBit(value)];
  if (value < Powers[Result-1]) then
    Dec(Result);
end;

Compared to the CountDigitsCase() it has a more even time distribution in finding the solution for a given digit. But still overall a bit slower (on my machine).

Digit Case   O1
------------------  
  1   0.930  2.200  nanoseconds per call
  2   0.922  1.689
  3   0.944  1.500
  4   1.078  1.578
  5   1.878  1.522
  6   1.200  1.667
  7   1.356  1.567
  8   1.356  1.522
  9   1.502  1.664
 10   1.246  1.761

Test code:

procedure TestXXX(var Distribution: array of Double);
var
  sw: TStopWatch;
  i,j,k,m: Cardinal;
const 
  StartIx: array[0..9] of Cardinal = ( 0,10,100,1000,10000,100000,1000000,
    10000000,100000000,100000000);
  StopIx: array[0..9] of Cardinal = ( 9,99,999,9999,99999,999999,9999999,
    99999999,999999999,$FFFFFFFF);
  Repeats: array[0..9] of Cardinal = (10000000,1000000,100000,10000,1000,100,10,1,1,1);
begin
  for k := 0 to 9 do begin
    sw := TStopWatch.StartNew;
    for m := 1 to Repeats[k] do
     for i := StartIx[k] to StopIx[k] do
      j := CountDigitsXXX(i);
    Distribution[k] := sw.ElapsedMilliseconds*1000000.0/(1.0*Repeats[k]*(StopIx[k]- StartIx[k] + 1)); 
    WriteLn(sw.ElapsedMilliSeconds,' ',j);      
  end;
end;
Community
  • 1
  • 1
LU RD
  • 34,438
  • 5
  • 88
  • 296
  • The question asked for assembly. Answering in Pascal hides some assembly issues like preserving registers, handling stack frame or calling convention. – rkhb May 03 '15 at 18:57
  • @rkhb, my answer includes assembly, but in this case I would advice not to go for such a solution, since it is difficult to beat the compiler in this case (inlining asm is not possible) . – LU RD May 03 '15 at 19:01
  • Try to test also a case statement version it is faster bacause start comparision in the middle with value 10000. – GJ. May 03 '15 at 19:23
  • Or, you will get faster code if start your unroled version with comparison with the highist value. – GJ. May 03 '15 at 19:55
  • @GJ., ok you can tweak with where to start comparisons. But I'll leave that to the interested reader. I have one more solution which seems the fastest, but I'll stop here. – LU RD May 03 '15 at 20:08
  • @GJ. You cannot say that starting in the middle is faster. It depends on the distribution of the input data. Not necessarily uniformly distributed. – David Heffernan May 03 '15 at 22:34
  • @David Heffernan. Yes I can say that, and the fastest way is to start to the comparison with the biggest number. Comparison with number 1000000000 cover about 76.7% of all possible hits and comparission with 10 only 0.00000023% of all possible hits. Properly inverse unrolled LUT version is faster for about 50% against case version. – GJ. May 03 '15 at 23:40
  • @GJ. You cannot say that. What if the numbers are percentages? The distribution of the data is critical. – David Heffernan May 03 '15 at 23:56
  • @David Heffernan. In case that we want to cover all possible numbers, than starting comparison with the biggest number is the fastest way in any other case we can change the comparison way. – GJ. May 04 '15 at 00:08
  • @GJ. No. It depends on the distribution of the input. You need to understand some probability and statistics to reason about this. – David Heffernan May 04 '15 at 00:36
  • 1
    Woohoo! My `case` won this round :) @GJ. that's true for this loop test, but who knows the data used for this function... Out of curiosity, I've checked how long takes returning that unrolled loop constantly for value 1 (returning from the first code line). And even then the `case` was a bit faster on my machine (old i5, app. XE8 32-bit, 64-bit, release config, default settings). I haven't checked the generated code, but I would also expect the `case` to be slower. In any case, I won't dig into this deeper. This question was not about performance... – TLama May 04 '15 at 00:52
  • Wouldn't it be faster and much easier to convert the Integer to string and then simply check the length of that string? – SilverWarior May 04 '15 at 06:39
  • @David Heffernan: OP did'n talk about probability or statistics and we have only LU RDs test routune, :) – GJ. May 04 '15 at 06:40
  • @TLama: check the generated code :) I agree this is not a question about performance... – GJ. May 04 '15 at 06:42
  • 1
    @SilverWarior, yes that is easy, but does not perform so well (and is tricky to code in asm). – LU RD May 04 '15 at 08:39
  • @Silver No, it would not be faster to do that. – David Heffernan May 04 '15 at 10:31
  • @GJ. I'm talking about statistics and distributions. It's impossible to make statements about performance off this function without making a distributional assumption. Your assumption was uniform distribution of data over the whole domain. With that assumption testing for large number of digits first is fastest. But if the input is distributed differently, entirely likely, then a different strategy will be faster. – David Heffernan May 04 '15 at 10:34
  • @David Heffernan: Agree, timing the different solutions with LU RDs code doesn't mean a lot, if we do not know nothing about distribution of the input. – GJ. May 04 '15 at 11:06
  • @GJ., unless there is a solution that is insensitive to the distribution. I worked that out yesterday, but did not publish. – LU RD May 04 '15 at 11:10
  • @LU RD: It will be very kind of you to see that solution... :) – GJ. May 04 '15 at 11:19
  • @GJ., will do when I get home. But the case variant seems to be the winner. I will profile timing for the different number distributions. – LU RD May 04 '15 at 11:24
  • I guess the fastest way to get number of digits in an integer would be by calculating a Log10 and then adding 1 as it is recommended in this Math Question http://math.stackexchange.com/questions/469606/how-to-get-count-of-digits-of-a-number I haven't tested this in code yet but I'm willing to bet that it would outperform any logical solution to the problem. – SilverWarior May 04 '15 at 14:46
  • Well initial tests show that the approach mentioned in my comment above is not faster but in fact much slower and I'm puzzled why. – SilverWarior May 04 '15 at 14:58
  • @SilverWarior, how much were you willing to bet? The floating-point coprocessor is not so fast for logarithms. – LU RD May 04 '15 at 15:10
  • @LURD Yes I'm seeing that such approach is pretty bad in performance. Perhaps trying to implement this in OpenCL and thus using GPU hardware acceleration might get better performance since GPU's are much better at crunching numbers, but unfortunately this is out my knowledge reach to implement. Also I'm wondering if CPU might provide some hardware acceleration for this but the Delphi might not generate proper code to make use of that. – SilverWarior May 04 '15 at 15:17
  • @GJ., added the O(1) solution and some time statistics for the digit distribution. – LU RD May 04 '15 at 17:53
  • 1
    Very nice approach, using quasi Ln2 and LUT. – GJ. May 04 '15 at 18:26
3

My approach:

function count_of_digits (n:Integer) : Cardinal;    {No problems for `n:Cardinal`}
var cnt : Cardinal;
begin
    cnt := 0;
    repeat
        inc (cnt);
        n := n div 10;
    until (n=0);
    count_of_digits := cnt;
end;

function count_of_digits_asm_signed (n:Integer) : Cardinal;
begin
    asm
        push ebx                { An asm statement must preserve the EDI, ESI, ESP, EBP, and EBX registers }
        xor ecx, ecx
        mov ebx, 10             { Base 10 (decimal), just change it for another base }
        mov eax, n

        @L1:
        add ecx, 1
        cdq                     { Set EDX for `idiv` }
        idiv ebx                { Decimal shift right by one decimal digit }
        test eax, eax
        jne @L1

        mov @result, ecx
        pop ebx
    end;
end;

function count_of_digits_asm_unsigned (n:Cardinal) : Cardinal;
begin
    asm
        push ebx                { An asm statement must preserve the EDI, ESI, ESP, EBP, and EBX registers }
        xor ecx, ecx
        mov ebx, 10             { Base 10 (decimal), just change it for another base }
        mov eax, n

        @L1:
        add ecx, 1
        xor edx, edx            { Set EDX for `div` }
        div ebx                 { Decimal shift right by one decimal digit }
        test eax, eax
        jne @L1

        mov @result, ecx
        pop ebx
    end;
end;

VAR
    i: Integer;
    c1, c2, c3: Cardinal;

BEGIN
    i := 13452;
    c1 := count_of_digits (i);
    c2 := count_of_digits_asm_signed (i);
    c3 := count_of_digits_asm_unsigned (i);
    writeln (i:11, c1:3, c2:3, c3:3);

    i := 1344;
    c1 := count_of_digits (i);
    c2 := count_of_digits_asm_signed (i);
    c3 := count_of_digits_asm_unsigned (i);
    writeln (i:11, c1:3, c2:3, c3:3);

    i := 9721343;
    c1 := count_of_digits (i);
    c2 := count_of_digits_asm_signed (i);
    c3 := count_of_digits_asm_unsigned (i);
    writeln (i:11, c1:3, c2:3, c3:3);

    i := -13452;
    c1 := count_of_digits (i);
    c2 := count_of_digits_asm_signed (i);
    c3 := count_of_digits_asm_unsigned (i);
    writeln (i:11, c1:3, c2:3, c3:3);
END.
rkhb
  • 14,159
  • 7
  • 32
  • 60
  • That massively expensive, performing those divides where comparisons could be used instead. It is bad practise to inline asm into a Pascal function. And also you ask for trouble by using Pascal variable names. Put the result in eax rather than referring to a name. – David Heffernan May 03 '15 at 14:41
  • Some thing like this: http://pastebin.com/L1heePPt – Jens Borrisholt May 03 '15 at 14:51