New list of not repeated elements

Question

I want to create a function that takes a list as an argument, for example:

list = ['a','b','a','d','e','f','a','b','g','b']

and returns a specific number of list elements (I choose the number) such that no element occurs twice. For example, if I choose 3:

 new_list = ['a','b','d']

I tried the following:

def func(j, list):
    new_list=[]
    for i in list:
        while(len(new_list)<j):
            for k in new_list:
                if i != k:
                    new_list.append(i)
                    
                    return new_list

But the function goes into an infinite loop.

[Remove the duplicates in the list](https://stackoverflow.com/questions/7961363/removing-duplicates-in-lists). Then [randomly select as many items as you want](https://stackoverflow.com/a/30488952/843953) — Pranav Hosangadi, May 11 '22 at 18:02
Please run the code you have posted here and make sure it actually runs and reproduces your problem [mre] — Pranav Hosangadi, May 11 '22 at 18:06
You need to remove your `list` variable. It is overriding the built-in list function — OneCricketeer, May 11 '22 at 18:08
@ely66: Is there any specific ordering needed? It looks like you want it in order by first appearance in the original `list`, but you might also be selecting by first alphabetically, or you might allow random selection, or whatever. — ShadowRanger, May 11 '22 at 18:13

codester_09 · Answer 1 · 2022-06-07T04:31:48.440

2

Try this.


lst = ['a','b','a','d','e','f','a','b','g','b']

j = 3

def func(j,list_):
    new_lst = []
    for a in list_:
        if a not in new_lst:
            new_lst.append(a)

    return new_lst[:j]

print(func(j,lst)) # ['a', 'b', 'd']

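I don't know why nobody has posted a numpy.unique solution.

Here is a memory-efficient way (I think). Note that np.unique returns the unique values sorted rather than in order of first appearance; the two orders happen to coincide for this input.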

import numpy as np
lst = ['a','b','a','d','e','f','a','b','g','b']

def func(j,list_):
    return np.unique(list_).tolist()[:j]

print(func(3,lst)) # ['a', 'b', 'd']

edited Jun 07 '22 at 04:31

answered May 11 '22 at 18:05

codester_09


Your second approach was actually better. `a not in new_lst` is slow for large lists. – Pranav Hosangadi May 11 '22 at 18:12
2

@ely66 you get that error because of what [OneCricketeer commented above](https://stackoverflow.com/questions/72205596/new-list-of-not-repeated-elements#comment127573842_72205596). Don't shadow python builtins with your variables. – Pranav Hosangadi May 11 '22 at 18:14

ShadowRanger · Accepted Answer · 2022-05-11T18:28:05.690

def func(j, mylist):
    # dedup, preserving order (dict is insertion-ordered as a language guarantee as of 3.7):
    deduped = list(dict.fromkeys(mylist))
    # Slice off all but the part you care about:
    return deduped[:j]

If performance for large inputs is a concern, that's suboptimal: it processes the whole input even when the first j unique elements appear near the start and j is much smaller than the input. In that case, a more complicated solution can be used for maximum efficiency. First, copy the itertools unique_everseen recipe:

from itertools import filterfalse, islice  # At top of file, filterfalse for recipe, islice for your function

def unique_everseen(iterable, key=None):
    "List unique elements, preserving order. Remember all elements ever seen."
    # unique_everseen('AAAABBBCCDAABBB') --> A B C D
    # unique_everseen('ABBCcAD', str.lower) --> A B C D
    seen = set()
    seen_add = seen.add
    if key is None:
        for element in filterfalse(seen.__contains__, iterable):
            seen_add(element)
            yield element
    else:
        for element in iterable:
            k = key(element)
            if k not in seen:
                seen_add(k)
                yield element

Now wrap it with islice to pull off only as many elements as required, exiting immediately once you have them (without processing the rest of the input at all):

def func(j, mylist):  # Note: Renamed list argument to mylist to avoid shadowing built-in
    return list(islice(unique_everseen(mylist), j))
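One way to see the early exit in action is to feed the function an input that never ends. The sketch below is self-contained, so it uses a cut-down stand-in for the recipe (the hypothetical `unique_in_order`, with no `key` argument) rather than the full `unique_everseen`:

```python
from itertools import cycle, islice


def unique_in_order(iterable):
    """Simplified stand-in for unique_everseen (hypothetical name, no key support)."""
    seen = set()
    for element in iterable:
        if element not in seen:
            seen.add(element)
            yield element


def func(j, iterable):
    # islice stops pulling from the generator once j elements have been produced
    return list(islice(unique_in_order(iterable), j))


endless = cycle(['a', 'b', 'a', 'd'])  # repeats forever, never exhausts
print(func(3, endless))  # ['a', 'b', 'd']
```

The call returns only because three distinct values exist in the stream. Asking for four from this particular input would never finish, since a fourth unique element never arrives.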

score 1 · Answer 3 · answered May 11 '22 at 18:15

1

list is the name of a built-in type in Python (not a reserved word, but using it as a variable name hides the built-in).

If the order of the elements is not a concern, then:

def func(j, user_list):
    return list(set(user_list))[:j]

answered May 11 '22 at 18:15

sam


2

This will work, but Python does not guarantee that the list will be in the order you would expect – d3javu999 May 11 '22 at 18:17
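A small sketch of the ordering point raised in the comment above, assuming Python 3.7+ (where dicts keep insertion order). The variable names are illustrative only:

```python
letters = ['a', 'b', 'a', 'd', 'e', 'f', 'a', 'b', 'g', 'b']

# A set removes duplicates but makes no promise about iteration order
via_set = list(set(letters))

# dict keys keep the order in which they were first inserted
via_dict = list(dict.fromkeys(letters))

print(sorted(via_set) == sorted(via_dict))  # True: same elements either way
print(via_dict[:3])                         # ['a', 'b', 'd']
```

Both approaches yield the same elements; only the `dict.fromkeys` version is guaranteed to return the first three in order of first appearance.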

score 1 · Answer 4 · answered May 11 '22 at 18:16

It's bad practice to use "list" as a variable name.

You can solve the problem by using the Counter class from Python's collections module:

from collections import Counter
a=['a','b','a','d','e','f','a','b','g','b']

b = list(Counter(a))

print(b[:3])

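So your function will be something like this:

def unique_slice(list_in, elements):
    # Counter is a dict subclass, so its keys keep first-appearance order (Python 3.7+)
    new_list = list(Counter(list_in))
    print("New list: {}".format(new_list))
    # Return the first `elements` items if there are enough, otherwise everything
    if int(elements) <= len(new_list):
        return new_list[:elements]
    return new_list

Hope it solves your question.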

No need to import/use `Counter`; `dict.fromkeys(list_in)` will get the same effect, but without counting them (since you're not using the counts here anyway). — ShadowRanger, May 11 '22 at 18:23
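To illustrate the comment above, here is a short comparison, assuming Python 3.7+ so that both `Counter` and plain dicts keep insertion order:

```python
from collections import Counter

letters = ['a', 'b', 'a', 'd', 'e', 'f', 'a', 'b', 'g', 'b']

counted = Counter(letters)        # maps each element to its number of occurrences
plain = dict.fromkeys(letters)    # maps each element to None, no counting

print(counted['a'])                    # 3
print(list(counted) == list(plain))    # True: same keys in the same order
```

The counts are the only extra information `Counter` provides, and this task never reads them.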

score 1 · Answer 5 · answered May 11 '22 at 18:27

As others have said, you should not shadow the built-in name `list`, because that can lead to many issues. This is a simple problem: add each element to a new list after checking that it has not already been added.

Slice notation (`[:]`) in Python lets you split a list at an index.

>>> l = [1, 2, 3, 4]
>>> l[:1]
[1]
>>> l[1:]
[2, 3, 4]

lst = ['a', 'b', 'a', 'd', 'e', 'f', 'a', 'b', 'g', 'b']


def func(number, _list):
    out = []
    for a in _list:
        if a not in out:
            out.append(a)

    return out[:number]


print(func(4, lst))  # ['a', 'b', 'd', 'e']
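One detail of the slicing used here: a slice whose stop value exceeds the list length is clamped rather than raising an IndexError, so asking for more unique elements than exist simply returns all of them. A minimal illustration (the `unique` list is just sample data):

```python
unique = ['a', 'b', 'd']

print(unique[:2])   # ['a', 'b']
print(unique[:10])  # ['a', 'b', 'd']  (stop is past the end, no error)
print(unique[:0])   # []
```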
