Summing up - take a break.

Recently I was reading about approximating images with collections of overlapping translucent polygons (evolution of the Mona Lisa) or triangles (Mostly maths approximates images) and I thought it would be interesting to write a combination shotgun hill-climber and genetic algorithm. Nothing complex, nothing too intricate, but just an experiment to see how closely I could approximate an image with a very small number of triangles. Results have been interesting and I'll write about them later. But on the way I had a bizarre experience. I was in the middle of writing the code when I realised that I was on an elderly machine (one I could simply leave running for days) and the version of Python I was using didn't have the built-in "sum". So I quickly wrote one to add to my utilities collection:
    def sum(vector):
        if 0 == len(vector):
            return 0
        return vector[0] + sum(vector[1:])

This is the obvious, classic, recursive functional solution, and it tripped off my fingers without thinking. At the time I was mostly programming machines with 256 processors or more, so parallel algorithms were always the first thing I looked at. I should perhaps have used an accumulation parameter:
    def sum(vector, acc):
        if 0 == len(vector):
            return acc
        return sum(vector[1:], acc + vector[0])

Proper tail-call optimisation (TCO) would turn that into an iterative version and it would run well. But of course, my version of Python doesn't have TCO, so the first time I called it in anger it blew the stack. Bother. Easy to fix, though, using the classic "divide and conquer" technique:
    def sum(vector):
        size = len(vector)
        if 0 == size:
            return 0
        if 1 == size:
            return vector[0]
        half = size / 2    # integer division in this elderly Python; // on modern Python
        return sum(vector[:half]) + sum(vector[half:])
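One pleasant side-effect (my observation, not spelled out above): the recursion depth of this version is only about log2 of the vector length, so even a very long vector stays well inside Python's default recursion limit of around a thousand frames, whereas the element-at-a-time version needs one frame per element. For example:

    vector = range(100000)
    total  = sum(vector)    # 4999950000; recursion depth only ~17, so no blown stack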
Moving on ...

But then I had a thought ... rather than recursing by cutting in half, summing each half, and adding the results, why not pair off the elements, add the pairs to get a list half the length, and then sum that shorter vector? (For example, [1, 2, 3, 4, 5] pairs the first half against the second half to give [1+3, 2+4] = [4, 6], with the leftover 5 dealt with separately.) Like this ...
    def sum(vector):
        size = len(vector)
        if size > 1:
            m  = size / 2
            m2 = m + m
            z  = zip(vector[:m], vector[m:m2])
            h  = [ a+b for (a, b) in z ]
            return sum(h) + sum(vector[m2:])
        if size == 1:
            return vector[0]
        return 0
Or, written a little more compactly:

    def sum(l):
        size = len(l)
        if 1 < size:
            m = size / 2
            return sum( [ a+b for (a, b) in zip(l[:m], l[m:m+m]) ] ) + sum(l[m+m:])
        if 1 == size:
            return l[0]
        return 0

It was better, but not by a huge amount. I won't bother exploring that in detail.
Iteration instead of recursion

Even without TCO, we can still convert the recursive version into an iterative version:
    def sum(vector):
        r = 0
        size = len(vector)
        while size > 0:
            m  = size / 2
            m2 = m + m
            if m2 < size:
                r += vector[-1]    # odd length: fold in the unpaired last element
            h = [ a+b for (a, b) in zip(vector[:m], vector[m:m2]) ]
            vector, size = h, m
        return r

I have to say I quite liked this - it turned out rather better than I had expected. Sadly, it runs at almost exactly the same speed as the recursive version. Interesting.
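To see how the odd elements get mopped up by the "r += vector[-1]" line, here is a quick trace on a small vector (my worked example, not from the original post):

    # [1, 2, 3, 4, 5]  size=5: m=2, m2=4 < 5, so r += 5  (r=5);  pairs -> [4, 6]
    # [4, 6]           size=2: m=1, m2=2, not < 2;               pairs -> [10]
    # [10]             size=1: m=0, m2=0 < 1, so r += 10 (r=15); pairs -> []
    # []               size=0: loop ends
    print sum([1, 2, 3, 4, 5])    # 15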
Moving into built-ins ...

Of course, so far this is all running in Python, but there are built-ins that are written in C and run rather faster. The obvious one to use is the one from functional programming, the routine "reduce":
    def sum(l):
        # reduce applies a+b left-to-right across l
        # (note: this raises an error on an empty list)
        return reduce(lambda a, b: a+b, l)
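Not from the original post, but one small variation worth noting: the lambda is itself a Python-level function call for every element, so the C-level "operator.add" shaves that off, and supplying an initial value of 0 restores the empty-vector behaviour of the earlier versions:

    from operator import add

    def sum(l):
        # reduce with a C-level addition function; the explicit 0 means
        # sum([]) returns 0 instead of raising an error.
        return reduce(add, l, 0)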
After the break.

Some of you will have seen this coming.
Post-script

The timings were taken using wall-clock time, averaged over between 4 and 10 calls. I eye-balled the values and decided that there weren't enough anomalies to bother about. Yes, occasionally there was a larger value, but broadly the figures given are all reasonable. The vector sizes were chosen to increase by a factor of about 3 each time, so there's a good mix of odd and even sizes. After all, most of the routines test whether the size is odd or even, and I wanted a mix. It seemed a good compromise without much work. This isn't a deep theoretical paper, and the routines aren't intended to be lovingly hand-optimised. This was just an observation that I thought I'd make. There are possibly hundreds of things you could try, small variations that may have surprisingly large effects. I'd like to see them, so if you think these tests are fatally flawed, please, run your own and tell us about them.
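For reference, a minimal harness along the lines described above might look like this; the candidate function, repeat count and vector sizes are illustrative, not the ones actually used:

    import time

    def time_sum(sum_fn, size, repeats=5):
        # Wall-clock timing, averaged over a handful of calls.
        vector = range(size)
        start = time.time()
        for _ in range(repeats):
            sum_fn(vector)
        return (time.time() - start) / repeats

    for size in [1000, 3000, 9000, 27000]:    # sizes roughly a factor of 3 apart
        print size, time_sum(sum, size)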