lijst = [[], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [],
         [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [], [],
         [], [], [], [], [], [], [], [], [], [], [], [],
         ['/vacatures/oracle-plsql-ontwikkelaar-1/'], [], [], [], [],
         ['/vacatures/oracle-plsql-ontwikkelaar-1/'],
         ['/vacatures/business-intelligence-developer-1/'], [], [], [], [], [],
         ['/vacatures/business-intelligence-developer-1/'],
         ['/vacatures/oracle-dba/'], [], [], ['/vacatures/oracle-dba/'],
         ['/vacatures/database-beheerder/'], [], [], [],
         ['/vacatures/database-beheerder/'],
         ['/vacatures/sql-server-dba-powershell/'], [], [], [],
         ['/vacatures/sql-server-dba-powershell/'],
         ['/vacatures/junior-msbi-consultant/'], [], [], [], [], [],
         ['/vacatures/junior-msbi-consultant/'],
         ['/vacatures/senior-msbi-consultant/'], [], [], [], [], [],
         ['/vacatures/senior-msbi-consultant/'],
         ['/vacatures/medior-msbi-consultant/'], [], [], [], [],
         ['/vacatures/medior-msbi-consultant/'],
         ['/vacatures/zos-mainframe-specialist/'], [], [],
         ['/vacatures/zos-mainframe-specialist/'],
         ['/vacatures/junior-business-analyst/'], [], [], [], [],
         ['/vacatures/junior-business-analyst/'], [], [], [], [], [], [], [],
         [], ['/vacatures/oracle-plsql-ontwikkelaar-1/'], [], [],
         ['/vacatures/oracle-dba/'], [], [],
         ['/vacatures/business-intelligence-developer-1/'], [], [],
         ['/vacatures/database-beheerder/'], [], [],
         ['/vacatures/sql-server-dba-powershell/'], [], [], [], [], [], [], [],
         [], [], []]

How can I filter out the empty lists and remove the duplicate items from this 2-dimensional list?

Lex Scarisbrick

2 Answers


Filtering out the empty lists is as simple as

new_list0 = list(filter(len, lijst))

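To remove the duplicates, you can convert new_list0 to a set and then back to a list. Lists are not hashable, so each inner list is converted to a tuple first. Note that a set does not preserve the original order.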

new_list1 = list(set(tuple(x) for x in new_list0))

If you want to convert the elements of new_list1 (which are now tuples) back to lists, you can do

new_list2 = list(map(list, new_list1))
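
If the original order matters, the same tuple trick can be combined with dict.fromkeys instead of set. This is a minimal sketch, assuming Python 3.7+ (where dicts keep insertion order); the shortened lijst and the names non_empty and unique_tuples are made up for illustration.

```python
# Shortened stand-in for the list in the question (illustrative data only).
lijst = [[], ['/vacatures/oracle-dba/'], [], ['/vacatures/database-beheerder/'],
         ['/vacatures/oracle-dba/'], []]

# Drop the empty lists.
non_empty = [x for x in lijst if x]

# dict keys are unique and keep insertion order (Python 3.7+),
# so the first occurrence of each inner list wins.
unique_tuples = dict.fromkeys(tuple(x) for x in non_empty)

# Convert the tuples back to lists.
new_list = [list(t) for t in unique_tuples]
print(new_list)
```

Unlike the set-based version, the result here follows the order in which each entry first appears.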


However, given the number of conversions performed above (generator to set to list, then tuples back to lists), a plain loop is probably preferable, at least for a short list like this one. It also keeps the first occurrences in their original order:

new_list = []
for el in lijst:
    # Skip empty lists and anything already collected.
    if el and el not in new_list:
        new_list.append(el)
# print(new_list)

Finally, note that new_list is still 2-dimensional, like the original. If you want a 1-dimensional list, you can flatten it as follows

import itertools
new_list = list(itertools.chain.from_iterable(new_list))

or build it directly as a 1-dimensional list, which reduces the time complexity from O(n**2) to O(n) by replacing the in test on a list with a set (note that a set does not preserve the original order)

new_set = set()
for el in lijst:
    if el:
        new_set.update(el)        
new_list = list(new_set)
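
If a flat result in first-seen order is wanted, one possible variation is to flatten first and deduplicate with dict.fromkeys. This is only a sketch, assuming Python 3.7+; the shortened lijst and the name flat_unique are illustrative.

```python
import itertools

# Shortened stand-in for the list in the question (illustrative data only).
lijst = [[], ['/vacatures/oracle-dba/'], [], ['/vacatures/database-beheerder/'],
         ['/vacatures/oracle-dba/'], []]

# chain.from_iterable flattens one level; empty inner lists contribute nothing.
# dict.fromkeys then removes duplicates while keeping first-seen order.
flat_unique = list(dict.fromkeys(itertools.chain.from_iterable(lijst)))
print(flat_unique)
```

This stays O(n) because both the flattening and the dict insertions are linear in the total number of items.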


answer tested and functional
keepAlive

Your list isn't really 2-dimensional. Every inner list has either 0 or 1 element.

In that case, you could just extract the strings and put them into a set:

print({l[0] for l in lijst if l})

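It outputs (shown here as printed by Python 2; Python 3 prints the set with braces, and the element order may differ):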

set(['/vacatures/junior-msbi-consultant/', '/vacatures/junior-business-analyst/', '/vacatures/business-intelligence-developer-1/', '/vacatures/zos-mainframe-specialist/', '/vacatures/sql-server-dba-powershell/', '/vacatures/database-beheerder/', '/vacatures/medior-msbi-consultant/', '/vacatures/oracle-dba/', '/vacatures/oracle-plsql-ontwikkelaar-1/', '/vacatures/senior-msbi-consultant/'])

It's concise and fast (O(n)).
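
The set comprehension relies on every inner list having at most one element. A small sketch that checks this assumption first and sorts the result so the output is reproducible; the shortened lijst and the name unique_paths are illustrative, not part of the question.

```python
# Shortened stand-in for the list in the question (illustrative data only).
lijst = [[], ['/vacatures/oracle-dba/'], [], ['/vacatures/database-beheerder/'],
         ['/vacatures/oracle-dba/'], []]

# Guard the assumption: l[0] would silently drop data from longer inner lists.
assert all(len(l) <= 1 for l in lijst), "an inner list has more than one element"

# Sets are unordered, so sort to get a stable, repeatable result.
unique_paths = sorted({l[0] for l in lijst if l})
print(unique_paths)
```

Sorting adds an O(k log k) step for k unique paths, which is negligible for a list of this size.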

Eric Duminil
  • Clearly better than my answer, except that it does not work if the OP wants to keep the final output as a (not-really) 2d list, which is unlikely. – keepAlive Sep 19 '17 at 19:40