List comprehension vs. lambda + filter
I happened to find myself having a basic filtering need: I have a list and I have to filter it by an attribute of the items.
My code looked like this:
my_list = [x for x in my_list if x.attribute == value]
But then I thought, wouldn’t it be better to write it like this?
my_list = filter(lambda x: x.attribute == value, my_list)
It’s more readable, and if needed for performance the lambda could be taken out to gain something.
Question is: are there any caveats in using the second way? Any performance difference? Am I missing the Pythonic Way™ entirely and should do it in yet another way (such as using itemgetter instead of the lambda)?
This is a somewhat religious issue in Python. Even though Guido considered removing map, filter and reduce from Python 3, there was enough of a backlash that in the end only reduce was moved from built-ins to functools.reduce.
Personally I find list comprehensions easier to read. It is more explicit what is happening from the expression [i for i in list if i.attribute == value], as all the behaviour is on the surface, not inside the filter function.
I would not worry too much about the performance difference between the two approaches, as it is marginal. I would only optimise this if it proved to be the bottleneck in your application, which is unlikely.
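If you do want numbers before deciding, a rough timing sketch along these lines can help. The Item class, the list size and the repeat count are all made up for illustration, and the timings will vary by machine and Python version, so treat the output as indicative only.

```python
import timeit


class Item:
    """Hypothetical stand-in for the objects being filtered."""

    def __init__(self, attribute):
        self.attribute = attribute


my_list = [Item(i % 10) for i in range(1000)]
value = 3


def with_comprehension():
    return [x for x in my_list if x.attribute == value]


def with_filter():
    # list() is needed on Python 3, where filter returns a lazy iterator
    return list(filter(lambda x: x.attribute == value, my_list))


comp_time = timeit.timeit(with_comprehension, number=200)
filter_time = timeit.timeit(with_filter, number=200)
print(f"comprehension: {comp_time:.4f}s  filter+lambda: {filter_time:.4f}s")
```

Both functions return the same items, so whichever you pick, the choice only affects speed and readability, not the result.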
Also, since the BDFL wanted filter gone from the language, surely that automatically makes list comprehensions more Pythonic 😉
It is strange how much beauty varies for different people. I find the list comprehension much clearer than filter+lambda, but use whichever you find easier.
There are two things that may slow down your use of filter.
The first is the function call overhead: as soon as you use a Python function (whether created by def or lambda) it is likely that filter will be slower than the list comprehension. It almost certainly is not enough to matter, and you shouldn’t think much about performance until you’ve timed your code and found it to be a bottleneck, but the difference will be there.
The other overhead that might apply is that the lambda is being forced to access a scoped variable (value). That is slower than accessing a local variable, and in Python 2.x the list comprehension only accesses local variables. If you are using Python 3.x the list comprehension runs in a separate function, so it will also be accessing value through a closure and this difference won’t apply. (Python 3.12 and later inline comprehensions, so there the difference can reappear.)
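To see the scoped-variable access described above, you can inspect the lambda's code object. This is only an illustrative sketch: build_predicate is a hypothetical helper, and the attributes shown are introspection details rather than something you would use in real filtering code.

```python
def build_predicate(value):
    # `value` is local to build_predicate, so the lambda can only
    # reach it through a closure cell, not as one of its own locals
    return lambda x: x.attribute == value


pred = build_predicate(3)

# Names the lambda reads from an enclosing scope
print(pred.__code__.co_freevars)

# The captured value itself, stored in the closure cell
print(pred.__closure__[0].cell_contents)
```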
The other option to consider is to use a generator instead of a list comprehension:
def filterbyvalue(seq, value):
    for el in seq:
        if el.attribute == value:
            yield el
Then in your main code (which is where readability really matters) you’ve replaced both list comprehension and filter with a hopefully meaningful function name.
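As a rough sketch of how that reads at the call site, here is the generator used on some sample data. The Item record and the sample values are hypothetical; any object with an attribute field would work the same way.

```python
from collections import namedtuple

# Hypothetical record type used only for this example
Item = namedtuple("Item", ["name", "attribute"])


def filterbyvalue(seq, value):
    """Lazily yield the elements of seq whose attribute equals value."""
    for el in seq:
        if el.attribute == value:
            yield el


items = [Item("a", 1), Item("b", 2), Item("c", 1)]

# The generator is lazy, so wrap it in list() if you need a real list
matches = list(filterbyvalue(items, 1))
print([m.name for m in matches])
```

Because the function yields items one at a time, you can also loop over it directly without building the intermediate list.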
Although filter may be the “faster way”, the “Pythonic way” would be not to care about such things unless performance is absolutely critical (in which case you wouldn’t be using Python!).
Since any speed difference is bound to be minuscule, whether to use filters or list comprehensions comes down to a matter of taste. In general I’m inclined to use comprehensions (which seems to agree with most other answers here), but there is one case where I prefer filter.
A very frequent use case is pulling out the values of some iterable X subject to a predicate P(x):
[x for x in X if P(x)]
but sometimes you want to apply some function to the values first:
[f(x) for x in X if P(f(x))]
As a specific example, consider
primes_cubed = [x*x*x for x in range(1000) if prime(x)]
I think this looks slightly better than using filter. But now consider
prime_cubes = [x*x*x for x in range(1000) if prime(x*x*x)]
In this case we want to filter against the post-computed value. Besides the issue of computing the cube twice (imagine a more expensive calculation), there is the issue of writing the expression twice, violating the DRY aesthetic. In this case I’d be apt to use
prime_cubes = filter(prime, [x*x*x for x in range(1000)])
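For what it's worth, here is a sketch of that pattern next to two comprehension-based ways of computing the value only once. The prime helper is a naive, hypothetical implementation, and f(x) = x*x + 1 is substituted for the cube purely so that the result is not empty. Note that on Python 3 filter returns a lazy iterator, so it is wrapped in list() here.

```python
def prime(n):
    """Naive primality test by trial division (hypothetical helper)."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def f(x):
    # Stand-in for the transformation applied before filtering
    return x * x + 1


# filter over the precomputed values, as in the answer above
via_filter = list(filter(prime, [f(x) for x in range(11)]))

# Nested generator: f(x) is still computed once per element
via_nested = [y for y in (f(x) for x in range(11)) if prime(y)]

# Assignment expression (Python 3.8+): bind the value inside the condition
via_walrus = [y for x in range(11) if prime(y := f(x))]

print(via_filter)
```

All three produce the same list, so the choice again comes down to which form you find easiest to read.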