5

Given an infinite sequence like so (commas inserted to make pattern more apparent):

1, 1 2, 1 2 3, 1 2 3 4, 1 2 3 4 5, 1 2 3 4 5 6 ,1 2 3 4 5 6 7, 1 2 3 4 5 6 7 8, 1 2 3 4 5 6 7 8 9, 1 2 3 4 5 6 7 8 9 1 0, 1 2 3 4 5 6 7 8 9 1 0 1 1, 1 2 3 4 5 6 7 8 9 1 0 1 1 1 2, 1 2 3 . . . . . . . . . .

I am given an index (1 <= index <= 10^10) and I need to find what digit is in that index.

I have wrote this working code but it is too slow. I have optimized it as much as I can but it's still not enough. Is there any other way I can make this run faster?

public class Foo {

    private static Scanner sc = new Scanner(System.in);
    private static long input;
    private static long inputCounter = 0;
    private static int numberOfInputs;


    public static void main(String[] args) {
        numberOfInputs = Integer.parseInt(sc.nextLine().trim());

        while (inputCounter != numberOfInputs) {
            input = Long.parseLong(sc.nextLine().trim());
            System.out.println(step());
            inputCounter++;
        }
    }

    public static char step() {
        int incrementor = 1;
        long _counter = 1L;

        while (true) {
            for (int i = 1; i <= incrementor; i++) {
                _counter += getNumberOfDigits(i);

                if (_counter > input) {
                    return ((i + "").charAt((int)(input - _counter
                            + getNumberOfDigits(i))));
                }
            }
            incrementor++;
        }
    }

    private static long getNumberOfDigits(int n) {
        // 5 or less
        if (n < 100) {
            // 1 or 2
            if (n < 10)
                return 1;
            else
                return 2;
        } else {
            // 3 or 4 or 5
            if (n < 1000)
                return 3;
            else {
                // 4 or 5
                if (n < 10000)
                    return 4;
                else
                    return 5;
            }
        }
    }
}

EDIT: Credit to Marian's method of getting the number of digits in a number. His divide and conquer method which I've named getNumberOfDigits(int n) sped up my program execution a lot. Initially I was converting the number to a String then calling length() and that was taking a lot longer than I expected

EDIT2: Some sample I/O:

1 : 1
2 : 1
3 : 2
4 : 1
5 : 2
6 : 3
7 : 1
8 : 2
9 : 3
10 : 4
11 : 1
12 : 2
13 : 3
14 : 4
15 : 5
16 : 1
17 : 2
18 : 3
19 : 4
20 : 5
21 : 6
22 : 1
23 : 2
24 : 3
25 : 4
26 : 5
27 : 6
28 : 7
29 : 1
30 : 2
31 : 3
32 : 4
33 : 5
34 : 6
35 : 7
36 : 8
37 : 1
38 : 2
39 : 3
40 : 4
41 : 5
42 : 6
43 : 7
44 : 8
45 : 9
46 : 1
47 : 2
48 : 3
49 : 4
50 : 5
51 : 6
52 : 7
53 : 8
54 : 9
55 : 1
56 : 0
57 : 1
58 : 2
59 : 3
60 : 4
61 : 5
62 : 6
63 : 7
64 : 8
65 : 9
66 : 1
67 : 0
68 : 1
69 : 1
70 : 1
71 : 2
72 : 3
73 : 4
74 : 5
75 : 6
76 : 7
77 : 8
78 : 9
79 : 1
80 : 0
81 : 1
82 : 1
83 : 1
84 : 2
85 : 1
86 : 2
87 : 3
88 : 4
89 : 5
90 : 6
91 : 7
92 : 8
93 : 9
94 : 1
95 : 0
96 : 1
97 : 1
98 : 1
99 : 2
Community
  • 1
  • 1
Ogen
  • 6,499
  • 7
  • 58
  • 124
  • 1
    if i give the input as 66 what would be the expected output? – OnePunchMan Aug 30 '14 at 14:26
  • 1
    It seems like your `step` function should be "read until next separator" and you shouldn't worry about the number of digits... Especially since your digits code does not seem to account for all the possibilities in an infinite space... – abiessu Aug 30 '14 at 14:27
  • @kaze You will get an answer of 1 – Ogen Aug 30 '14 at 14:27
  • 1
    so, you are saying the '1' and '0' which comes after 9 will be count as two numbers, not a single number '10'? – OnePunchMan Aug 30 '14 at 14:30
  • @kaze Yes, Please see my edit I uploaded some sample inputs and outputs – Ogen Aug 30 '14 at 14:31
  • 1
    Why do you recreate this sequene on every user input? It could be faster if you cache the latest sequence and extend it if you need a longer one. – Tom Aug 30 '14 at 14:51
  • 1
    I'm quite sure there is a closed form solution for this. Only my number theory books are in my attic! Consider reposing as a mathematics problem and put on the corresponding se site. – Bathsheba Aug 30 '14 at 14:52
  • How much pre-computation time do we have? – Mshnik Aug 30 '14 at 23:28
  • @Bathsheba I'm afraid the closed form would be too complicated, but it's possible to work through powers of ten. Anyway, binary search makes it damn fast (5 ms for index 4e18). – maaartinus Aug 30 '14 at 23:43

4 Answers4

5

I think the triangular numbers come into play here if we look at the positions of the digits:

Position: 1  2 3  4 5 6  7 8 9 10  11 12 13 14 15,  16 17 18 19 20 21,  22 23 24 25 26 27 28 
Number:   1, 1 2, 1 2 3, 1 2 3 4,  1   2  3  4 5,    1  2  3  4  5  6,  1   2  3  4  5  6  7, 

Call this sequence N(p).

Now look at the triangular numbers which have formula k(k+1)/2

k            :  1   2    3    4    5     6
k(k+1)/2     :  1   3    6   10   15    21      triangle numbers
k(k+1)/2+1   :  2   4    7   11   16    22      plus one
N(k(k+1)/2+1):  1   1    1    1    1     1      item at this position

so the item just after the n'th triangular number is always 1.

Give a position p we can find nearest k so that k(k+1)/2 +1 <= p. We can solve the quadratic x(x+1)/2+1=p by rearranging

0.5 x^2 + 0.5 x + 1 - p = 0.

So a=0.5, b=0.5 and c=1-p. Solving for x gives

x = -0.5 +/- sqrt( 0.25 - 2 * (1-p) )

take the positive sign this gives these values

1    0
2    1
3    1.5615528128
4    2
5    2.3722813233
6    2.7015621187
7    3
8    3.2749172176
9    3.5311288741
10   3.7720018727
11   4
12   4.216990566
13   4.4244289009
14   4.623475383
15   4.8150729064
16   5

So if we take k=floor(-0.5 +/- sqrt( 2 p - 1.75 ) ) we find the number k. Next find l = p-k(k+1)/2 which gives the digit in the p-th place.

As pointed out this fails as soon as we get two digit numbers. But we could make an adjustment. We can get a formula "triangular-digit-number" TD(k). Which behaves like triangular numbers, T(k), for k < 10, but adds the extra digits.

k     : 1  ...  9  10  11  12
T(k)  : 1      45  55  66  78
change              1   3   6
TD(k) : 2      45  56  69  84

We see that for 10 <= k <= 99 we just need to add T(k)+T(k-9). This should give us another quadratic we could be solved. Similar happens for 100<=k<=999 with T(k)+T(k-9)+T(k-99).

Now T(k)+T(k-9) + 1 = k(k+1)/2 +(k-9)(k-8)/2 + 1
                     = 0.5 k^2 + 0.5 k + 0.5 k^2 - 17/2 k + 72/2 + 1
                     = k^2 -8 k + 37 

Solve x^2 -8 k + 37 - p =0 gives

x = ( 8 +/- sqrt(64 - 4 *(37-p) ) ) /2
  = ( 8 +/- sqrt(4 p - 64) )/2
  =   4 +/- sqrt(p - 21) 

taking the floor of this gives us the k value.


We want to find the sum of triangles T(k) + T(k-9) + T(k-99) + .... To a first approximation T(k-n) = T(n) for any n. So the sum is simply d * T(k) where d in the number of digits of k. T(k) is approximately k^2/2 so the sum is approx d * k^2/2. This is easy to solve let d be the number of digits of the position p then k = sqrt(2*p/d). You could use this to get a rough guess for k.

Salix alba
  • 7,536
  • 2
  • 32
  • 38
  • Fails for p=55, p=56, p=57, (perhaps all inbetween), p=67 – rslemos Aug 30 '14 at 16:51
  • The idea of using triangular numbers is clever indeed, but things start to get messy when the inner sequence reaches 10, which occupies 2 positions (at most), and then when it reaches 100, which occupies 3 positions (at most again) and so on. Any closed formula should somehow consider the base in which the numbers are represented (in the OP case: the base 10). – rslemos Aug 30 '14 at 16:53
  • @Beefyhalo Agreed. Not the solution, but a valuable contribution (also see http://oeis.org/A002260 ), so +1 – Marco13 Aug 30 '14 at 17:36
  • 1
    @rslemos I'm bet that pure math would be too complicated here. Maybe with some special casing for powers of ten it could work. – maaartinus Aug 30 '14 at 21:41
4

The following code is a nearly direct calculation. It produces the exact same results as that of @maaartinus (see results below) but does it in < 1ms as opposed to 30ms.

See the code comments for details on how it works. Let me know if I need to explain a bit more.

    package com.test.www;

    import java.util.ArrayList;
    import java.util.List;

    public final class Test  {

        /** <p> 
         *  Finds digit at {@code digitAt} position. Runs in O(n) where n is the max
         *  digits of the 'full' number (see below), e.g. for {@code digitAt} = 10^10, 
         *  n ~ 5, for 10^20, n ~ 10. 
         *  <p>
         *  The algorithm is thus basically a direct 'closed form' calculation. 
         *  It finds the quadratic equation to calculate triangular numbers (x(x+1)/2) but also
         *  takes into a account the transitions from 9 to 10, from 99 to 100, etc. and 
         *  adjusts the quadratic equation accordingly. This finds the last 'full' number
         *  on each 'line' (see below). The rest follows from there.
         *     
         */
        public static char findDigitAt(long digitAt) {

            /* The line number where digitAt points to, where:
             *  1, 1 2, 1 2 3, 1 2 3 4, etc. ->
             *  1        <- line 1
             *  1 2      <- line 2
             *  1 2 3    <- line 3
             *  1 2 3 4  <- line 4
             */
            long line;

            // ---- Get number of digits of 'full' numbers where digitAt at points, e.g. 
            //      if digitAt = 55 or 56 then digits = the number of digits in 10 which is 2.

            long nines = 0L; // = 9 on first iteration, 99 on second, etc.
            long digits = 0;
            long cutoff = 0; // Cutoff of digitAt where number of digits change
            while (digitAt > cutoff) {
                digits++;
                nines = nines + Math.round(Math.pow(10L, digits-1L)) * 9L;

                long nines2 = 0L;
                cutoff = 0L;
                for (long i = 1L; i <= digits; i++) {
                    cutoff = cutoff + ((nines-nines2)*(nines-nines2+1)/2);
                    nines2 = nines2 + Math.round(Math.pow(10L, i-1L)) * 9L;
                }
            }

            /* We build a quadratic equation to take us from digitAt to line */

            double r = 0; // Result of solved quadratic equation 
                          // Must be double since we're using Sqrt()
                          // even though result is always an integer.

            // ---- Define the coefficients of the quadratic equation
            long xSquared = digits;
            long x = 0L;
            long c = 0L;
            nines = 0L; // = 9 on first iteration, 99 on second, etc.

            for (long i = 1L; i <= digits; i++) {
                x = x + (-2L*nines + 1L);
                c = c + (nines * (nines - 1L));
                nines = nines + Math.round(Math.pow(10L, i-1L)) * 9L;
            }
            // ---- Solve quadratic equation, i.e. y - ax^2 + bx + c  =>  x = [ -b +/- sqrt(b^2 - 4ac) ] / 2 
            r = (-x + Math.sqrt(x*x - 4L*xSquared*(c-2L*digitAt))) / (2L*xSquared); 

            // Make r an integer
            line = ((long) r) + 1L;
            if (r - Math.floor(r) == 0.0) { // Simply takes care of special case
                line = line - 1L;
            }

            /* Now we have the line number ! */

            // ---- Calculate the last number on the line
            long lastNum = 0; 
            nines = 0;
            for (int i = 1; i <= digits; i++) {
                long pline = line - nines; 
                lastNum = lastNum + (pline * (pline+1))/2;
                nines = nines + Math.round(Math.pow(10, i-1)) * 9;
            }

            /* The hard work is done now. The piece of cryptic code below simply counts
             * back from LastNum to digitAt to find first the 'full' number at that point
             * and then finally counts back in the string representation of 'full' number
             * to find the actual digit.
             */
            long fullNumber = 0L;
            long line_decs = 1 + (int) Math.log10(line);
            boolean done = false;
            long nb;
            long a1 = Math.round(Math.pow(10, line_decs-1));
            long count_back = 0;
            while (!done) {
                nb = lastNum - (line - a1) * line_decs; 
                if (nb-(line_decs-1) <= digitAt) {
                    fullNumber = line - (lastNum - digitAt) / line_decs;
                    count_back = (lastNum - digitAt) % line_decs; 
                    done = true;
                } else {
                    lastNum = nb-(line_decs); 
                    line = a1-1; 
                    line_decs--; 
                    a1 = a1 / 10; 
                }
            }

            String numStr = String.valueOf(fullNumber);
            char digit = numStr.charAt(numStr.length() - (int) count_back - 1);  

            //System.out.println("digitAt = " + digitAt + "  -  fullNumber =  " + fullNumber + "  -  digit = " + digit);
            System.out.println("Found " + digit + " at position " + digitAt);
            return digit;
        }

        public static void main(String... args) {
            long t = System.currentTimeMillis();

            List<Long> testList = new ArrayList<Long>();
            testList.add(1L); testList.add(2L); testList.add(3L); testList.add(9L);
            testList.add(2147483647L);
            for (int i = 1; i <= 18; i++) {
                testList.add( Math.round(Math.pow(10, i-1)) * 10);
            }
            //testList.add(4611686018427387903L); // OVERFLOW OCCURS

            for (Long testValue : testList) {
                char digit = findDigitAt(testValue);
            }

            long took = t = System.currentTimeMillis() - t;
            System.out.println("Calculation of all above took: " + t + "ms");
        }


    }

Results

    Found 1 at position 1
    Found 1 at position 2
    Found 2 at position 3
    Found 3 at position 9
    Found 2 at position 2147483647
    Found 4 at position 10
    Found 1 at position 100
    Found 4 at position 1000
    Found 9 at position 10000
    Found 2 at position 100000
    Found 6 at position 1000000
    Found 2 at position 10000000
    Found 6 at position 100000000
    Found 8 at position 1000000000
    Found 1 at position 10000000000
    Found 1 at position 100000000000
    Found 9 at position 1000000000000
    Found 8 at position 10000000000000
    Found 3 at position 100000000000000
    Found 7 at position 1000000000000000
    Found 6 at position 10000000000000000
    Found 1 at position 100000000000000000
    Found 1 at position 1000000000000000000
    Calculation of all above took: 0ms
Floris
  • 1,082
  • 10
  • 26
3

I've added some code that vastly improves the running time - skip to the bottom to see the examples.

A key insight I found is that you can skip sub-sequences if the input doesn't lie anywhere in it. For example, if you are looking for the 1,000,000,000th number, you know it isn't in the 5th subsequence {1,2,3,4,5}. So why iterate over it? This version seems to be much faster (try running it with an input of 1000000000 and see the time difference), and as far as I can tell it returns the same result in all cases.

Thus, my algorithm keeps track of the length of the subsequence (Add the number of digits on each iteration), and the subsequence we're on. If the input is larger than the subsequence's length, just subtract that length and iterate again. If it is smaller (or equal, since the problem is 1-indexed), start breaking down that sub-sequence.

A minor note: I also updated the getNumberOfDigits so that it can handle any number by doing it recursively, but both the new and old versions rely on this new method so it doesn't get credit for the time improvement.

public class Foo {

private static Scanner sc = new Scanner(System.in);
private static long input;
private static long inputCounter = 0;
private static int numberOfInputs;


/** Updated main method that calls both the new and old step() methods
 * to compare their outputs and their respective calculation times.
 * @param args
 */
public static void main(String[] args) {
    numberOfInputs = Integer.parseInt(sc.nextLine().trim());

    while (inputCounter != numberOfInputs) {
        long i = Long.parseLong(sc.nextLine().trim());
        input = i;
        System.out.println("Processing " + input);
        long t = System.currentTimeMillis();
        System.out.println("New Step result - " + newStep() + " in " + (System.currentTimeMillis() - t)+"ms");
        input = i;
        t = System.currentTimeMillis();
        System.out.println("Old Step result - " + step() + " in " + (System.currentTimeMillis() - t)+"ms");
        inputCounter++;
    }
}

/** Old version of step() method given in question. Used for time comparison */
public static char step() {
    int incrementor = 1;
    long _counter = 1L;

    while (true) {
        for (int i = 1; i <= incrementor; i++) {
            _counter += getNumberOfDigits(i);

            if (_counter > input) {
                return ((i + "").charAt((int)(input - _counter
                        + getNumberOfDigits(i))));
            }
        }
        incrementor++;
    }
}

/** New version of step() method.
 * Instead of iterating one index at a time, determines if the result lies within this
 * sub-sequence. If not, skips ahead the length of the subsequence.
 * If it does, iterate through this subsequence and return the correct digit
 */
public static int newStep() {
    long subSequenceLength = 0L;
    long subSequenceIndex = 1L;
    while(true){
        //Update to the next subsequence length
        subSequenceLength += getNumberOfDigits(subSequenceIndex);
        if(input <= subSequenceLength){
            //Input lies within this subsequence
            long element = 0L;
            do{
                element++;
                long numbDigits = getNumberOfDigits(element);
                if(input > numbDigits)
                    input -= numbDigits;
                else
                    break;
            }while(true);

            //Correct answer is one of the digits in element, 1-indexed.
            //Speed isn't that important on this step because it's only done on return
            return Integer.parseInt(("" + element).substring((int)input-1, (int)input));


        } else{
            //Input does not lie within this subsequence - move to next sequence
            input -= subSequenceLength;
            subSequenceIndex++;
        }

    }
}

/** Updated to handle any number - hopefully won't slow down too much.
 * Won't handle negative numbers correctly, but that's out of the scope of the problem */
private static long getNumberOfDigits(long n){
    return getNumberOfDigits(n, 1);
}

/** Helper to allow for tail recursion.
 * @param n - the number of check the number of digits for
 * @param i - the number of digits thus far. Accumulator. */
private static long getNumberOfDigits(long n, int i) {
    if(n < 10) return i;
    return getNumberOfDigits(n/10, i+1);
}
}

Sample output showing time improvement:

> 8
> 10000
Processing 10000
New Step result - 9 in 0ms
Old Step result - 9 in 2ms
> 100000
Processing 100000
New Step result - 2 in 0ms
Old Step result - 2 in 4ms
> 1000000
Processing 1000000
New Step result - 6 in 0ms
Old Step result - 6 in 3ms
> 10000000
Processing 10000000
New Step result - 2 in 1ms
Old Step result - 2 in 22ms
> 100000000
Processing 100000000
New Step result - 6 in 1ms
Old Step result - 6 in 178ms
> 1000000000
Processing 1000000000
New Step result - 8 in 4ms
Old Step result - 8 in 1765ms
> 10000000000
Processing 10000000000
New Step result - 1 in 11ms
Old Step result - 1 in 18109ms
> 100000000000
Processing 100000000000
New Step result - 1 in 5ms
Old Step result - 1 in 180704ms
Mshnik
  • 7,032
  • 1
  • 25
  • 38
2

I wrote a program without much thinking...

  • length(n) computes the number of digits of the decimal representation of n
  • cummulativLength(n) computes the total number of digits for a sequence ending with n
  • doublyCummulativLength(n) computes the total number of digits for all sequence ending with at most n
  • fullSequenceBefore(pos) computes the longest full sequence before the position pos using binary search
  • digitAt(n) computes the digit at position n by first computing fullSequenceBefore and subtracting its length; it then uses another binary search for the last sequence

I used long everywhere as it's damn fast. There's a rudimentary test and a demo producing the following results

Found 1 at position 1
Found 1 at position 2
Found 2 at position 3
Found 3 at position 9
Found 2 at position 2147483647
Found 4 at position 10
Found 1 at position 100
Found 4 at position 1000
Found 9 at position 10000
Found 2 at position 100000
Found 6 at position 1000000
Found 2 at position 10000000
Found 6 at position 100000000
Found 8 at position 1000000000
Found 1 at position 10000000000
Found 1 at position 100000000000
Found 9 at position 1000000000000
Found 8 at position 10000000000000
Found 3 at position 100000000000000
Found 7 at position 1000000000000000
Found 6 at position 10000000000000000
Found 1 at position 100000000000000000
Found 1 at position 1000000000000000000
Found 7 at position 4611686018427387903

Computed in 0.030 seconds.

The biggest number I tried it for is Long.MAX_VALUE/2. In theory it could work for Long.MAX_VALUE as well, but I'm getting overflow there.

maaartinus
  • 44,714
  • 32
  • 161
  • 320
  • @Floris [Not everyone](http://codereview.stackexchange.com/a/61597/14363) would agree (note that I improved the code after the review). – maaartinus Aug 31 '14 at 14:15
  • I cannot find an 'off by one' error for any input value using your code. Can you give me an example. (Obviously I cannot test all input values :) – Floris Sep 03 '14 at 10:08
  • @Floris I guess there are no errors there now. But there were quite a few, initially some off-by-ones coming mostly from the `pos` being 1-based and later there were overflows above `Long.MAX_VALUE/2`. – maaartinus Sep 03 '14 at 18:01