2

I am working with this problem: https://www.hackerrank.com/challenges/fraudulent-activity-notifications/

My code works almost fine, but for some test cases it fails, because of the large array (over 200000 items). I am spending hours trying to understand what I can do to improve the speed, but I cannot come out with a working solution, so 2 of my tests always fail for timeout and I am frustrated trying to pass this test. I think I cannot avoid the first loop and also the loop in sort, but cannot think of a faster way.

The problem described in the website is is this:

HackerLand National Bank has a simple policy for warning clients about possible fraudulent account activity. If the amount spent by a client on a particular day is greater than or equal to the client's median spending for a trailing number of days, they send the client a notification about potential fraud. The bank doesn't send the client any notifications until they have at least that trailing number of prior days' transaction data.

I solved it with this code

function getMedianNumber(arr) {
  arr.sort((a, b) => a - b);

  let medianNumber = 0;
  const middle = Math.floor(arr.length / 2);

  if (arr.length % 2 === 0) {
    // Is even we get the median number
    medianNumber = (arr[middle] + arr[middle - 1]) / 2;
  } else {
    const index = Math.floor(middle);
    medianNumber = arr[index];
  }

  return medianNumber;
}

function activityNotifications(expenditure, d) {
  let notifications = 0;
  let len = expenditure.length - 1;

  for (let i = len; i > d - 1; i--) {
    let trailingDays = expenditure.slice(i - d, i);
    let dayExpense = expenditure[i];
    let median = getMedianNumber(trailingDays);

    if (expenditure[i] >= median * 2) {
      notifications++;
    }
  }

  return notifications;
}

It only fails in 2 test cases because the passed array is huge and I get a timeout error.

Hitmands
  • 13,491
  • 4
  • 34
  • 69
Kaiser Soze
  • 1,362
  • 3
  • 14
  • 31
  • Compute the median similar to how you compute a rolling sum. See https://en.wikipedia.org/wiki/Moving_average – Bergi Mar 22 '19 at 19:20
  • @berig: could you do that ? Think twice. –  Mar 22 '19 at 19:37
  • 2
    You can do a rolling median, but it's trickier than a rolling sum: https://stackoverflow.com/a/5970314/10396. – AShelly Mar 22 '19 at 20:26
  • @YvesDaoust Yes. Sure, adding to the window, removing from the window and computing the new median is not as simple as `+`, `-` and `id` with the sum, but it's doable. And even if you do it relatively inefficiently, e.g. with a sorted array for the window, that's still much better than sorting each slice own its own like OP is currently doing it. – Bergi Mar 22 '19 at 22:19
  • 1
    @Bergi: I wanted to stress that it is not possible to achieve a rolling median in constant time like you do for a rolling sum. A constant-time rolling minimum is possible, though. –  Mar 22 '19 at 22:25
  • @YvesDaoust Ah OK, I agree there. I didn't mean to imply constant time, but definitely improvement from OP's `O(n log n)` to `O(n)` or even `O(log n)` (`n` being the window size). – Bergi Mar 22 '19 at 22:32

2 Answers2

2

expenditure.slice(i - d, i); is too expensive, you are making it O(n^2) by copying the array elements over each iteration. Use indexes over the original array to calculate median: getMedianNumber(arr, startIndex, endIndex).

juvian
  • 15,875
  • 2
  • 37
  • 38
  • But OP won't be able to do `arr.sort((a, b) => a - b)` then. How would you find the median? – Bergi Mar 22 '19 at 19:20
  • 1
    @Bergi by keeping the count of expenditures, MAX_EXPENDITURE is just up to 200. He can get the median in O(200) – juvian Mar 22 '19 at 19:25
  • I thought that the sort function was slowing down, can you explain why MAX_EXPENDITURE is just up to 200? what that means? Thanks – Kaiser Soze Mar 22 '19 at 21:26
  • 1
    @KaiserSoze its in the problem description, the numbers in the array expenditures are all between 0 and 200 – juvian Mar 22 '19 at 21:51
  • 1
    Thank you @juvian this is what I still have to learn, how to really read and understand the questions. I never look at constraints because I do not understand them, but now at least, I understand how important they are :) – Kaiser Soze Mar 22 '19 at 22:20
  • @juvian I was not arguing about the time complexity, I was wondering how you'd change the algorithm. OP's current `getMedianNumber(arr)` takes an array and sorts it, you can't do that if you just pass two indices. I don't think you want to mutate the range in `arr`, do you? – Bergi Mar 22 '19 at 22:22
  • @juvian If you consider `expenditure` to be limited by a maximum value, everything has `O(1)` complexity. But that's not interesting, and also it doesn't say anything about how fast the particular implementation really is. – Bergi Mar 22 '19 at 22:24
  • @Bergi indeed, but that will get an accepted on that website and that is what matters in this question. There are structures that can update median in log n but not needed in this case – juvian Mar 23 '19 at 05:02
0
function copy(a, ind) {
    b = [];
    for(var i = ind; i < a.length; ++i) {
        b.push(a[i]);
    }
    return b;
}
function processData(input) {
    //Enter your code here
    var inputArr = input.split("\n");
    var d = parseInt(inputArr[0].split(" ")[1]);
    var arr = inputArr[1].split(" ").map(Number);
    var countArray = [], sortedArray = [], tempArray = [], notifications = 0, median, count, middle;
    for(var i = 0; i <= 200; ++i) {
        countArray.push(0);
    }
    for(i = 0; i < d; ++i) {
        countArray[arr[i]]++;
    }
    for(var j = 0; i < arr.length; ++i, ++j) {
        tempArray = [], count = 0;
        for(var k = 0; k <= 200; ++k) {
            if(countArray[k] > 0) {
                count += countArray[k];
                tempArray.push({
                    no: k,
                    count: count
                });
            }
        }
        middle = {};
        if((d&1) === 0) {
            middle.index = count / 2;
        } else {
            middle.index = Math.ceil(count / 2);
        }
        var tempCount = 0;
        for(k = 0; k < tempArray.length; ++k) {
            if(tempArray[k].count === middle.index) {
                if((d&1) === 0) {
                    median = (tempArray[k].no + tempArray[k + 1].no) / 2;
                    break;
                } else {
                    median = tempArray[k].no;
                    break;
                }
            } else if(tempArray[k].count > middle.index) {
                median = tempArray[k].no;
                break;
            }
        }

        //console.log(tempArray, median, arr[i]);
        if(arr[i] >= (2 * median)) {
            notifications++;
        }
        countArray[arr[i]]++;
        countArray[arr[j]]--;
    }
    console.log(notifications);
} 

process.stdin.resume();
process.stdin.setEncoding("ascii");
_input = "";
process.stdin.on("data", function (input) {
    _input += input;
});

process.stdin.on("end", function () {
   processData(_input);
});
  • Can you explain how this code works, and how it is better than the original code? (btw, I didn't downvote this time - the answer seems fine, apart from lacking explanation) – Bergi Mar 22 '19 at 22:26