Every so often, a post pops up in my RSS reader along the lines of “20 optimisation tricks” or “10 techniques to make your code run faster”. Sure, everyone wants their code to run faster, but a number of these techniques only serve to make code run slower and become less readable. The problem is that people (myself included) blindly accept these posts as fact because optimisation is hard, and the next time the situation comes up they reach for the false optimisation, which makes their code run slower. I decided to investigate one of the tips I read recently.
The tip I read related to the use of if/else blocks; the article said that I would be better off without the else part by setting the else value of a variable before the if statement. I think code can explain this better than a sentence, so here it is (the for loops are to make the code run for a measurable amount of time):
    # speedtest1.py
    # testing the speed of two optimisations
    variable = 0
    for n in range(0, 10000000):
        i = 0
        if variable:
            i = 1
        variable ^= 1 # flip it!
Now the code I would normally write looks something like this:
    # speedtest2.py
    # testing the speed of two optimisations
    variable = 0
    for n in range(0, 10000000):
        if variable:
            i = 1
        else:
            i = 0
        variable ^= 1 # XOR flip!
So the next logical step is to run both scripts through the Python profiler, cProfile. My testing system is a stock Q6600 running Windows XP and Python 2.5.2 (r252:60911, Feb 21 2008, 13:11:45).
Their technique:
    5 function calls in 5.626 CPU seconds

    Ordered by: standard name

    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
         1    0.000    0.000    5.626    5.626 <string>:1(<module>)
         1    5.313    5.313    5.625    5.625 speedtest1.py:3(<module>)
         1    0.000    0.000    5.626    5.626 {execfile}
         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
         1    0.313    0.313    0.313    0.313 {range}
My technique:
    5 function calls in 4.619 CPU seconds

    Ordered by: standard name

    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
         1    0.000    0.000    4.619    4.619 <string>:1(<module>)
         1    4.323    4.323    4.618    4.618 speedtest2.py:3(<module>)
         1    0.000    0.000    4.619    4.619 {execfile}
         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
         1    0.296    0.296    0.296    0.296 {range}
As you can see from the profile reports, there is just over a whole second of difference in CPU time between the two scripts on my system (5.626 seconds versus 4.619 seconds), with the “optimised” one actually being the slower one.
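If you want to repeat the comparison on your own machine, something like the sketch below might be a starting point. It is not the setup I used above: it assumes Python 3, uses the standard `timeit` module instead of cProfile, and wraps each variant in a function (the names `assign_then_if`, `if_else` and `best_time` are made up for this example). Moving the loops into functions turns the globals into locals, so the absolute numbers will not be comparable to the profiles above, and the gap between the two may well come out differently.

```python
import timeit


def assign_then_if(iterations):
    """Variant from the tip: assign the default first, override inside the if."""
    variable = 0
    i = 0
    for _ in range(iterations):
        i = 0
        if variable:
            i = 1
        variable ^= 1  # flip between 0 and 1
    return i


def if_else(iterations):
    """Variant I would normally write: a plain if/else."""
    variable = 0
    i = 0
    for _ in range(iterations):
        if variable:
            i = 1
        else:
            i = 0
        variable ^= 1  # flip between 0 and 1
    return i


def best_time(func, iterations=100000, repeats=5):
    """Return the fastest of several timed runs, in seconds."""
    timings = timeit.repeat(lambda: func(iterations), number=1, repeat=repeats)
    return min(timings)


if __name__ == "__main__":
    for func in (assign_then_if, if_else):
        print("%s: %.4f seconds" % (func.__name__, best_time(func)))
```

Taking the minimum of several runs is a common way to reduce noise from whatever else the machine is doing, but it is still only a measurement of one system.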
While it might seem like I am out to attack these optimisation posts, I actually think they are a good idea. It is interesting to see other people’s coding techniques, but one must remember that every machine and interpreter setup is a little different, so something that runs faster for one person might not run faster for everyone. I suggest that people who post these techniques on their blogs should provide sample code as well, so the technique can be evaluated on a case-by-case basis.
Note: I didn’t mention the poster of the article because it has little to do with the problem at hand. I’m sure Google will help you find it if you want it.
Assembly for this example would be interesting.
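CPython does not turn these scripts into machine assembly, but it does compile them to bytecode, which is probably the closest thing to look at. Here is a rough sketch of how that could be inspected with the standard `dis` module, assuming Python 3.4 or later. The two functions are stripped-down stand-ins for the loop bodies above (the names are invented for this example), and the exact instructions printed will vary between Python versions. One thing visible even at the source level: when `variable` is true, the first form assigns `i` twice, while the if/else form assigns it once.

```python
import dis


def assign_then_if(variable):
    """Stand-in for the loop body in speedtest1.py."""
    i = 0
    if variable:
        i = 1
    return i


def if_else(variable):
    """Stand-in for the loop body in speedtest2.py."""
    if variable:
        i = 1
    else:
        i = 0
    return i


def opnames(func):
    """List the bytecode instruction names for a function."""
    return [instruction.opname for instruction in dis.get_instructions(func)]


if __name__ == "__main__":
    for func in (assign_then_if, if_else):
        print(func.__name__)
        dis.dis(func)  # print a human-readable disassembly
        print()
```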