1

Given a 2D Array (Python List), I need to find a new 1D such that it contains elements unique in each column. For example:

[1, 1, -1, 1, 0]

[1, -1, -1, 3, 4]

[0, 0, 0, -2, -4]

should give me for example, [1,-1,0,3,4]. My trials so far: results is a 2D array of size 3 rows and n columns. last is the array which stores the end result, better said the unique values for each columns. From this array I need to find a 1D array of elements so that they are different.

last = []    
for i in range(len(results[0])):
    a = results[0][i]
    b = results[1][i]
    c = results[2][i]
    if (a not in last):
        last[i] = a
    elif (b not in last):
        last[i] = b
    elif (c not in last):
        last[i] = c

But it does not always give correct answer. This fails for input for example:

[1, 2, 3, 3]

[1, 0, -1, 1]

[0, 1, 2, 2]

The output should be for example [0,1,2,3] or [1,0,3,2] Any help and hints is appreciated.

mhhabib
  • 2,975
  • 1
  • 15
  • 29
ASPire
  • 13
  • 2
  • 1
    what do you mean by unique element in each row? also in text you mention `such that it contains elements unique in each column` So can you please describe the question better – jkhadka Dec 03 '20 at 11:42
  • Also, shouldn't -2 and -4 also be in your expected result? – sai Dec 03 '20 at 11:43
  • 3
    The items `[1,-1,0,3,4]` does not look unique for each column. For example `1` occurs twice in the first column. – Willem Van Onsem Dec 03 '20 at 11:44
  • Maybe all that you are looking for is a `set` of all lists, check it out if you are not aware of it – sai Dec 03 '20 at 11:47
  • Do you want to convert 2D list to single list and then showing the result is unique? – mhhabib Dec 03 '20 at 11:49
  • 1
    You should not assing the value like `last[i] = a` to a list. Your list is not guaranteed to haave enough entries. Use `last.append(a)` instead. – jottbe Dec 03 '20 at 12:12
  • Thank you for the replies. @WillemVanOnsem The problem ist to select elements from each column, such that the **last** Array contains all unique elements, The indexes must remain the same. – ASPire Dec 03 '20 at 12:13
  • @ASPire: well you are using a greedy approach here. For the first column you will always add the first row, but that might eventually result in no solution, since picking `1` (over `0`) might "block" other decisions. You need some sort of backtracking, where you can of course cut search from the moment that you add another element a second time. – Willem Van Onsem Dec 03 '20 at 12:19
  • @hadik the problem is like this: the give array has n columns. The resulting array (I defined it as **last** as an array in my code. The resulting array should also have n columns, each column represents each columns in the **result** array. Unique in the sense that, the resulting array has all unique elements. If it is not possible, it should return false. – ASPire Dec 03 '20 at 12:19

2 Answers2

0

The code in this answer returns a list made of one element from each column, such that this element had no duplicates in its column.

Using collections.Counter to get the least common element of every column. If the column contains an element which is unique, then the unique elements are the least commons (they occur once).

Code

from collections import Counter

def getUniqueFromColumns(table):
  nbRows = len(table)
  nbCol = len(table[0])
  assert(len(row) == nbCol for row in table)
  return [Counter(row[icol] for row in table).most_common()[-1][0] for icol in range(nbCol)]

Output

>>> getUniqueFromColumns([[1, 1, -1, 1, 0],
                          [1, -1, -1, 3, 4],
                          [0, 0, 0, -2, -4]])
[0, 0, 0, -2, -4]

>>> getUniqueFromColumns([[1, 1, -1, 5, 0],
                          [1, 3, 2, 4, 4],
                          [0, 1, 2, 5, -4]]) 
[0, 3, -1, 4, -4]

Code explanation

  • assert(len(row) == nbCol for row in table) ensures the list of lists is really a rectangle;
  • row[icol] for row in table returns the column at index icol;
  • Counter(...) counts the number of occurrences of each distinct element in the column;
  • .most_common() returns a list of pairs (element, count) sorted by count in reverse order;
  • .most_common()[-1][0] gets the least frequent element of the column.

References

Stef
  • 13,242
  • 2
  • 17
  • 28
  • getUniqueFromColumns([[1, 1, -1, 1, 0], [1, -1, -1, 3, 4], [0, 0, 0, -2, -4]]) should then return [0, 1, -1, -2, -4], so that every element is unique. Is it possible with built-in data structures? – ASPire Dec 03 '20 at 12:35
  • @ASPire Hmm. You should probably edit your question to make it clearer what you want. I thought you wanted to find, in each column, an element which has no duplicates in that column. Are you saying you want, instead, to get a list with one element from each column, such that the final list has no duplicate elements? Do you care about duplicate elements in columns or not? – Stef Dec 03 '20 at 12:39
  • Thank you for your well explained code. Duplicate is not a problem. Just the result should be unique. The thing is, you are selecting an element from each column, duplicate or not, but the resulting array must have unique elements. – ASPire Dec 03 '20 at 12:43
  • OK, I'll post a completely different answer, then. – Stef Dec 03 '20 at 12:44
0

The code in this answer returns a list made of one element from each column, such that this list contains no duplicates.

Bruteforce: Using itertools.product on the columns to iterate through every possible combination of elements from every column, and returning the first one which contains no duplicates.

Code

from itertools import product

def getCombinationWithoutDuplicates(table):
  columns = map(frozenset, zip(*table))
  for combination in product(*columns):
    if len(set(combination)) == len(combination):
      return combination
  return None

Output

>>> getCombinationWithoutDuplicates([[1, 1, -1, 1, 0],
                                     [1, -1, -1, 3, 4],
                                     [0, 0, 0, -2, -4]])
(0, 1, -1, 3, 4)

>>> getCombinationWithoutDuplicates([[1, 1, -1, 5, 0],
                                     [1, 3, 2, 4, 4],
                                     [0, 1, 2, 5, -4]])
(0, 1, 2, 4, -4)

Code explanation

  • zip(*table) transposes the list of lists (so it's the list of columns rather than the list of rows);
  • map(frozenset, ...) remove duplicates from each columns, which has no impact on the final solution but makes the bruteforce faster;
  • for combination in product(*columns): iterates over every possible combination of one element from each column;
  • len(set(combination)) == len(combination) tests whether there are duplicates in the combination;
  • In the possible situation where there is no solution, the for-loop will be executed in its entirety and flow will reach the last line return None.

References

Stef
  • 13,242
  • 2
  • 17
  • 28
  • Thank you! It is exactly what I was searching for. How is the time complexity of this algorithm? – ASPire Dec 07 '20 at 15:13
  • @ASPire In the worst-case, it's terrible. This is essentially bruteforce. The only "optimisation" is using frozensets to represent the columns instead of lists, which removes duplicates from columns and avoids iterating through redundant combinations. But in the worst-case this changes nothing. Worst-case time complexity is the number of distinct combinations. I'll let you calculate how many distinct combinations there are. Hint: it's a function of the numbers R of rows and C of columns. – Stef Dec 07 '20 at 17:28
  • @ASPire If the numbers of columns and rows are very high, you might want to consider using backtracking instead of naive bruteforce. Backtracking is a somewhat-less-naive version of bruteforce in which we abort as soon as we realise that a partial solution cannot be completed into a solution; as opposed to naive bruteforce which tests every candidate blindly. – Stef Dec 07 '20 at 17:34