A list comprehension builds a single list. A basic conditional comprehension has the form:
[ expression for item in iterable if condition ]
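As a minimal illustration of that form (the names numbers and even_squares are made up for this example and are not from the question):

```python
# Hypothetical sample data, used only to illustrate the syntax.
numbers = [1, 2, 3, 4, 5, 6]

# expression: n * n | item: n | iterable: numbers | condition: n % 2 == 0
even_squares = [n * n for n in numbers if n % 2 == 0]

print(even_squares)  # [4, 16, 36]
```

The result is one new list; the comprehension has no way to route the rejected items into a second container.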
You can't (easily) update two objects with one comprehension. Also, there's not much point in declaring logstocrunch_finset and errorlist as empty containers and then populating them. Instead, how about something like:
pattern = re.compile(r"\d*F[IR]P", re.IGNORECASE)
logstocrunch_finset = {x for x in logstocrunch_set if pattern.search(x)}
errorlist = [f'{x} is not proper name' for x in logstocrunch_set.difference(logstocrunch_finset)]
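Here is the same idea as a self-contained sketch you can run. The contents of logstocrunch_set below are invented sample values, since the real set comes from your own code:

```python
import re

# Hypothetical input; substitute the real logstocrunch_set.
logstocrunch_set = {"12FRP", "longer_3frp_lower", "not valid", "something else"}

pattern = re.compile(r"\d*F[IR]P", re.IGNORECASE)

# First comprehension: keep the names that match the pattern.
logstocrunch_finset = {x for x in logstocrunch_set if pattern.search(x)}

# Second comprehension: everything that was not kept becomes an error message.
errorlist = [
    f"{x} is not proper name"
    for x in logstocrunch_set.difference(logstocrunch_finset)
]

print(sorted(logstocrunch_finset))
print(sorted(errorlist))
```

Sets are unordered, so the order of errorlist is not guaranteed; sort it if you need stable output.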
UPDATE BELOW - Performance comparison with for loop
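As @Barmar suggested, I benchmarked our two solutions. There's not a lot in it: the timings are essentially identical at range_limit = 10 (9.53 µs for the loop vs 9.58 µs for the comprehensions), and the two comprehensions come out roughly 7-8% faster at range_limit = 50 and 100. Changing the ratio of valid to invalid data didn't seem to make much difference.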
import re
range_limit = 10
logstocrunch_set = set(
[f'{i}FRP' for i in range(range_limit)] +
[f'longer_{i}frp_lower' for i in range(range_limit)] +
['not valid', 'something else']
)
pattern = re.compile(r"\d*F[IR]P", re.IGNORECASE)
%%timeit -n 100000 -r 20
logstocrunch_finset = set()
errorlist = []
for x in logstocrunch_set:
    if pattern.search(x):
        logstocrunch_finset.add(x)
    else:
        errorlist.append(f'{x} is not proper name')
- range_limit = 10 | 9.53 µs ± 34.2 ns per loop (mean ± std. dev. of 20 runs, 100000 loops each)
- range_limit = 50 | 45.5 µs ± 699 ns per loop (mean ± std. dev. of 20 runs, 100000 loops each)
- range_limit = 100 | 89.4 µs ± 1.2 µs per loop (mean ± std. dev. of 10 runs, 100000 loops each)
%%timeit -n 100000 -r 20
logstocrunch_finset = {x for x in logstocrunch_set if pattern.search(x)}
errorlist = [f'{x} is not proper name' for x in logstocrunch_set.difference(logstocrunch_finset)]
- range_limit = 10 | 9.58 µs ± 14.1 ns per loop (mean ± std. dev. of 20 runs, 100000 loops each)
- range_limit = 50 | 42.2 µs ± 24.7 ns per loop (mean ± std. dev. of 20 runs, 100000 loops each)
- range_limit = 100 | 82.2 µs ± 491 ns per loop (mean ± std. dev. of 10 runs, 100000 loops each)
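The timings above used the %%timeit cell magic, which only works in IPython/Jupyter. If you want to repeat the comparison in a plain script, a rough equivalent with the standard timeit module might look like the sketch below. The function names with_for_loop and with_comprehensions are my own, and the loop counts are smaller than the ones I used above, so expect different absolute numbers (the function call itself also adds a little overhead).

```python
import re
import timeit

range_limit = 10
logstocrunch_set = set(
    [f"{i}FRP" for i in range(range_limit)]
    + [f"longer_{i}frp_lower" for i in range(range_limit)]
    + ["not valid", "something else"]
)
pattern = re.compile(r"\d*F[IR]P", re.IGNORECASE)


def with_for_loop():
    """Single pass that fills both containers."""
    finset, errors = set(), []
    for x in logstocrunch_set:
        if pattern.search(x):
            finset.add(x)
        else:
            errors.append(f"{x} is not proper name")
    return finset, errors


def with_comprehensions():
    """Set comprehension, then a list comprehension over the difference."""
    finset = {x for x in logstocrunch_set if pattern.search(x)}
    errors = [f"{x} is not proper name" for x in logstocrunch_set.difference(finset)]
    return finset, errors


number = 2_000  # calls per timing run (assumed value, tune as needed)
for func in (with_for_loop, with_comprehensions):
    # Take the best of 5 runs to reduce noise from other processes.
    best = min(timeit.repeat(func, number=number, repeat=5))
    print(f"{func.__name__}: {best / number * 1e6:.2f} us per call")
```

Both functions return the same set and the same error messages (possibly in a different order), so the comparison is like for like.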