0

Given a sorted list such as:

[1, 2, 2, 3, 3, 4]

My goal is to check if there are any numbers repeated, and if so, shift the element and all the numbers before it by one to the left as such:

[1, 2, 2, 3, 3, 4]

[0, 1, 2, 3, 3, 4]

[-1, 0, 1, 2, 3, 4]

right now this is my approach:

def shifting(data):
    i = 0
    while i< len(data)-1:
        if data[i]==data[i+1]:
            j=i
            while j>=0:
                data[j]-=1
                j-=1
        i+=1
    return data

But this is an O(n^2) algorithm and takes a lot of time to run with very long lists. I want to find a more efficient approach. Any ideas?

Mohamad Moustafa
  • 479
  • 5
  • 19
  • 1
    Is "shifting" always decrementing by one? – Vulwsztyn Feb 22 '22 at 13:52
  • For `[1,2,2,6,6,7]` your solution returns `[-1, 0, 1, 5, 6, 7]`. Is that the desired result (removing `2` from the input)? – Michael Szczesny Feb 22 '22 at 13:53
  • yes, the goal is to remove any duplicates even if it eventually causes an element to be removed – Mohamad Moustafa Feb 22 '22 at 13:56
  • maybe I misundertood the logic, but would [this simple approach](https://stackoverflow.com/a/71222839/16343464) work? – mozway Feb 22 '22 at 14:14
  • 1
    When answering questions asking for clarification it is best to edit your question rather than elaborating in comments. You have not answered @Vulwsztyn's question. That question is important because if you want to shift a total of `n` times shifting once `n` times might not be the most efficient approach. Or perhaps you want to always continue until you can continue no more because there are no remaining duplicates. If the array were `[3, 3, 4]` after one shift would it be `[0, 3, 4]`, `[2, 3, 4]` (`2` computed `3-1`) or something else? – Cary Swoveland Feb 22 '22 at 18:12

3 Answers3

1

I would agree with Stef. The main idea is that you don't really need to have this nested while loop. All you need is a single pass to pinpoint where the duplications occur and apply a shift accordingly.

I'll propose something a little bit more complex but might be more compact:

import numpy as np

input_list = [1, 2, 2, 3, 3, 4]

# Convert to array for easier indexing
output_ary = np.array(input_list)

# Pinpoint the location at which duplications occur
duplication_indicator = output_ary[:-1] - output_ary[1:] == 0

# Compute the corresponding shift
shift = np.cumsum(duplication_indicator[::-1])[::-1]

# Apply the shift
output_ary[:-1] -= shift

# Convert back to list
output_list = output_ary.tolist()

The main idea is that after you've pinpointed the duplication locations, you can compute the corresponding shift by looking at how many more duplications occur to the right. This could be done by simply doing a reversed cumulative sum (summing from the right to left). Applying this shift to the original list then gives the desired output.

Remy Lau
  • 142
  • 1
  • 4
0

Iterate on the data from right to left. Keep a counter decrement that tells you how many duplicates you've encountered so far, and thus, by how much you want to decrement every element you see.

This is linear instead of quadratic: you only iterate on the data once.

When writing python code, I strongly suggest using for-loops rather than while-loops whenever you can, and in particular when you know the length of the loop by advance.

In your code, i = 0; while i < len(data) - 1: i += 1 can be replaced by for i in range(len(data)-1):.

To iterate from right to left: for i in range(len(data)-1, -1, -1):

Stef
  • 13,242
  • 2
  • 17
  • 28
0

The logic is no fully clear, but IIUC is seems easy to remove the duplicates with np.unique, then to left fill the array with a range to go to the initial length:

a2 = np.unique(a)
out = np.r_[np.arange(-(a.shape[0]-a2.shape[0]), 0)+1, a2]

output:

array([-1,  0,  1,  2,  3,  4])

on a = np.array([1,2,2,6,6,7]):

array([-1,  0,  1,  2,  6,  7])
mozway
  • 194,879
  • 13
  • 39
  • 75
  • The OP is struggling to write a `for`-loop, and you're suggesting some `np.r_` dark magic :-p – Stef Feb 22 '22 at 14:20
  • @Stef well, a for loop is not appropriate for numpy ;) `np.r_` is quite basic for the numpy tag, no? – mozway Feb 22 '22 at 14:23
  • with `a = [100, 200, 200, 300, 300, 400]`, the OP's version gives `[98, 198, 199, 299, 300, 400]`, but your version gives `[ -1, 0, 99, 200, 300, 400]` – Stef Feb 22 '22 at 14:25
  • @Stef thus my question on the (unclear) logic ;) – mozway Feb 22 '22 at 14:33