I have a list of booleans where occasionally I reset them all to false. After first writing the reset as:
for b in bool_list:
    b = False
I found it doesn't work. I spent a moment scratching my head, then remembered that of course it won't work since I'm only changing a reference to the bool, not its value. So I rewrote as:
for i in xrange(len(bool_list)):
    bool_list[i] = False
and everything works fine. But I found myself asking, "Is that really the most pythonic way to alter all elements of a list?" Are there other ways that manage to be either more efficient or clearer?
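To make the difference between the two loops concrete, here is a minimal sketch (written for Python 3, so range stands in for xrange; the after_* names are only illustrative). It suggests that assigning to the loop variable merely rebinds a local name, while assigning through an index changes what the list slot refers to:

```python
bool_list = [True, False, True]

# Rebinding the loop variable: b simply names a different object afterwards;
# the list slots still refer to their original values.
for b in bool_list:
    b = False
after_rebinding = list(bool_list)

# Assigning through the index replaces what each list slot refers to.
for i in range(len(bool_list)):
    bool_list[i] = False
after_index_assignment = list(bool_list)

print(after_rebinding)         # [True, False, True]
print(after_index_assignment)  # [False, False, False]
```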
-
If you only have one reference to the list, the following may be easier:
bool_list = [False] * len(bool_list)
This creates a new list populated with False elements. See my answer to Python dictionary clear for a similar example.
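As a rough illustration of the "only one reference" caveat (Python 3 syntax; alias, shared, and other_ref are names made up for this sketch), the following compares rebinding the name with assigning to a full slice when a second reference to the list exists:

```python
bool_list = [True, True, False]
alias = bool_list            # a second reference to the same list object

# Rebinding: bool_list now names a brand-new list; alias still sees the old one.
bool_list = [False] * len(bool_list)
alias_after_rebind = list(alias)

# Slice assignment: the existing list object is modified in place,
# so every reference to it sees the reset.
shared = [True, True, False]
other_ref = shared
shared[:] = [False] * len(shared)

print(alias_after_rebind)   # [True, True, False]
print(other_ref)            # [False, False, False]
print(other_ref is shared)  # True
```

If other code holds on to the list (a function argument, an attribute, a closure), the slice form is the one that keeps those holders in sync.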
Soviut : Voted down because he's indicated that he needs it to act in an existing list.
S.Lott : Voted up because "existing list" can easily be replaced with new list via assignment statement.
dF : Also if you want to change the existing list you can change it to bool_list[:] = [False] * len(bool_list)
Jason Baker : If this is a very large list, this code might be bad since you're essentially making two copies of the list. But it should work in about 95% of all situations.
-
I wouldn't use the range and len. It's a lot cleaner to use enumerate()
for i, v in enumerate(bool_list):  # i, v = index and value
    bool_list[i] = False
It's left with an unused variable in this case, but it still looks cleaner in my opinion. There's no noticeable change in performance either.
Algorias : Enumerate seems to be overkill in this case. You don't even need the v variable, a clear indicator of overcomplication. A simple list comprehension gets the job done.
Sir Oddfellow : For the sake of looking better, I'd still vote for it. There's no noticeable difference in speed. A dummy variable is worth it. I guess it comes down to personal taste.
ironfroggy : inefficient in all the separate setitem's it requires.
-
For value types such as int, bool and string, your 2nd example is about as pretty as it's going to get. Your first example will work on any reference types like classes, dicts, or other lists.
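A small sketch of where that distinction lies, using a hypothetical list of dicts (Python 3 syntax; records and the after_* names are invented for the example): calling a mutating method on each element changes the objects the list holds, whereas assigning to the loop variable does not, whatever the element type.

```python
records = [{"a": 1}, {"b": 2}]

# Mutating each element in place works, because the loop variable and the
# list slot refer to the same dict object.
for d in records:
    d.clear()
after_clear = [dict(d) for d in records]

# Rebinding the loop variable does not touch the list, even for mutable types.
records = [{"a": 1}, {"b": 2}]
for d in records:
    d = {}
after_rebind = [dict(d) for d in records]

print(after_clear)   # [{}, {}]
print(after_rebind)  # [{'a': 1}, {'b': 2}]
```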
Greg Hewgill : The first example will only "work" if you're not trying to change the item contained within the list. If you're doing for x in a: x.foo() then that's certainly okay. However, if a is a list of dicts then for x in a: x = {} does not do what is intended, for the same reason as the original question.
-
If you're willing to use numpy arrays, it's actually really easy to do things like this using array slices.
import numpy
bool_list = numpy.zeros((100,), dtype=bool)  # plain bool is accepted as a dtype; numpy.bool is missing from some NumPy releases
# do something interesting with bool_list as if it were a normal list
bool_list[:] = False
# all elements have been reset to False now
-
Here's another version:
bool_list = [False for item in bool_list]
-
I think
bool_list = [False for element in bool_list]
is as pythonic as it gets. Using a list comprehension like this should generally be faster than a for loop in Python too.
S.Lott : Why repeat a previous answer? Why not upvote the other answer? I'm not sure what's distinct about this answer. Perhaps I'm missing something. Could you revise or expand your answer to emphasize the unique information?
-
bool_list[:] = [False] * len(bool_list)
or
bool_list[:] = [False for item in bool_list]
Dan Homerick : Chosen because this style's effect is clear, doesn't suffer from the limitation that there be only one reference to the list, and the performance for the top one is superb.
-
Summary: performance-wise, numpy or list multiplication are the clear winners, at roughly 10-20x faster than the other approaches.
I did some performance testing on the various options proposed. I used Python 2.5.2 on Linux (Ubuntu 8.10) with a 1.5 GHz Pentium M.
Original:
python timeit.py -s 'bool_list = [True] * 1000' 'for x in xrange(len(bool_list)): bool_list[x] = False'
1000 loops, best of 3: 280 usec per loop
Slice-based replacement with a list comprehension:
python timeit.py -s 'bool_list = [True] * 1000' 'bool_list[:] = [False for element in bool_list]'
1000 loops, best of 3: 215 usec per loop
Slice-based replacement with a generator expression:
python timeit.py -s 'bool_list = [True] * 1000' 'bool_list[:] = (False for element in bool_list)'
1000 loops, best of 3: 265 usec per loop
Enumerate:
python timeit.py -s 'bool_list = [True] * 1000' 'for i, v in enumerate(bool_list): bool_list[i] = False'
1000 loops, best of 3: 385 usec per loop
Numpy:
python timeit.py -s 'import numpy' -s 'bool_list = numpy.zeros((1000,), dtype=numpy.bool)' 'bool_list[:] = False'
10000 loops, best of 3: 15.9 usec per loop
Slice-based replacement with list multiplication:
python timeit.py -s 'bool_list = [True] * 1000' 'bool_list[:] = [False] * len(bool_list)'
10000 loops, best of 3: 23.3 usec per loop
Reference replacement with list multiplication:
python timeit.py -s 'bool_list = [True] * 1000' 'bool_list = [False] * len(bool_list)'
10000 loops, best of 3: 11.3 usec per loop
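For anyone who wants to rerun the list-based comparisons on their own machine, here is a minimal sketch using the timeit module from a script instead of the command line. It is written for Python 3, so range replaces xrange; the CANDIDATES labels and the time_candidates helper are names invented for this sketch, and the numbers it prints will differ from the ones above depending on hardware and interpreter version.

```python
import timeit

SETUP = "bool_list = [True] * 1000"

# Statements mirror the list-based variants timed above; range replaces
# xrange because this sketch targets Python 3.
CANDIDATES = {
    "index loop": "for i in range(len(bool_list)): bool_list[i] = False",
    "enumerate": "for i, v in enumerate(bool_list): bool_list[i] = False",
    "slice + comprehension": "bool_list[:] = [False for element in bool_list]",
    "slice + generator": "bool_list[:] = (False for element in bool_list)",
    "slice + multiplication": "bool_list[:] = [False] * len(bool_list)",
    "rebind + multiplication": "bool_list = [False] * len(bool_list)",
}


def time_candidates(number=1000, repeat=3):
    """Return the best per-loop time in microseconds for each candidate."""
    results = {}
    for name, stmt in CANDIDATES.items():
        # timeit.repeat returns one total time per repetition; keep the best.
        best = min(timeit.repeat(stmt, setup=SETUP, number=number, repeat=repeat))
        results[name] = best / number * 1e6
    return results


if __name__ == "__main__":
    # Print the candidates from fastest to slowest.
    for name, usec in sorted(time_candidates().items(), key=lambda kv: kv[1]):
        print(f"{name:25s} {usec:8.2f} usec per loop")
```

Taking the minimum over the repetitions follows the "best of 3" convention in the output above, since the fastest run is the one least disturbed by other activity on the machine.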
Dan Homerick : I'm glad I timed them. Up until now, I had thought that list multiplications were neat, but probably slow.
Sir Oddfellow : That's a good summary! You should mark this as the answer. I had no idea numpy was THAT much faster. Dang