These fields are all optional and need only
be supplied if you would like a direct reply.
Subject
Your email address
Your real name
You must answer this!
If you don't, my spam filtering will
ensure that I never see your email.
What's 8 plus five (in digits only)?
Please make your changes here and then
Editing tips and layout rules.
File: WhenOptimisingCodeMeasure ''' <link rel="alternate" type="application/rss+xml" ''' href="/rss.xml" title="RSS Feed"> ********> width="25%" |>> ''' <a title="Subscribe to my feed" ''' rel="alternate" ''' href="https://www.solipsys.co.uk/rss.xml"> ''' <img style="border-width: 0px;" ''' src="https://www.feedburner.com/fb/images/pub/feed-icon32x32.png" ''' align="middle" ''' alt="" />Subscribe!</a> _ ''' <a href="https://twitter.com/ColinTheMathmo"> ''' <img src="https://www.solipsys.co.uk/new/images/TwitterButton.png" ''' title="By: TwitterButtons.net" ''' width="212" height="69" ''' alt="@ColinTheMathmo" ''' /></a> <<| ---- My latest posts can be found here: * ColinsBlog ---- Previous blog posts: * ADogCalledMixture * AnotherPayPalScam * WhyTopPostingHasWon * UnexpectedInteractionOfFeatures * ArchimedesHatBoxTheorem * ConsideringASphere * ToLinkOrNotToLink * GenericAdviceForWritingAThesis * JustTeachMyChildTheMaths * NotASpectatorSport * LeftTruncatablePrime * TheDoctorAndTheLawyer * FourPointsTwoDistancesProof * MeetingRonGraham * NapkinRingVersusSphericalCap * TheFourPointsPuzzle * RadiusOfTheEarthPartTwo * GrepTimingAnomaly * TheRadiusOfTheEarth * ThisWorksToCureMyHiccoughs * PerhapsWeSavedOne * ThinkingAboutMastodon * DisappearingTrainsOnVirgin * TheIndependenceGame * OneOfMyFavouritePuzzles * ThinkingAboutRecursion * MemorisingTheTube * SpikeySpheres * SurprisinglyQuick * AnUnexpectedFraction * YouHaveToAdmireTheirOptimism * RepresentativesMatter * PythagorasByIncircle * APuzzleAboutPuzzles * HowNotToDoTwitter * Calculating52FactorialByHand * SmallThingsMightNotBeSoSmall * NotIfYouHurry * FactoringViaGraphThreeColouring * AnotherProofOfTheDoodleTheorem * WhenObviousIsNotObvious * GraphThreeColouring * TheDoodleTheorem * BeCarefulWhatYouSay * TheMutilatedChessboardRevisited * AMirrorCopied * TheOtherOtherRopeAroundTheEarth * PhotocopyAMirror * ThePointOfTheBanachTarskiTheorem * SieveOfEratosthenesInPython * FastPerrinTest * RussianPeasantMultiplication * FindingPerrinPseudoPrimes_Part2 * FindingPerrinPseudoPrimes_Part1 * TheUnwiseUpdate * MilesPerGallon * TrackingAnItemOnHackerNews * HackerNewsUserAges * PokingTheDustyCorners * ThereIsNoTimeForThis * PublicallySharingLinks * LearningTimesTables * GracefulDegradation * DiagrammingMathsTopics * OnTheRack * SquareRootByLongDivision * BeyondTheBoundary * FillInTheGaps * SoftwareChecklist * NASASpaceCrews * TheBirthdayParadox * TheTrapeziumConundrum * RevisitingTheAnt * TheAntAndTheRubberBand * IrrationalsExist * MultipleChoiceProbabilityPuzzle * RandomEratosthenes * WrappingUpSquareDissection * DissectingASquarePart2 * DissectingACircle * DissectingASquare * AnOddityInTennis * DecisionTreeForTennis * DecisionTreesInGames * AMatterOfConvention * DoYouNourishOrTarnish * BinarySearchReconsidered * TwoEqualsFour * TheLostPropertyOffice * TheForgivingUserInterface * SettingUpRSS * WithdrawingFromHackerNews ---- Additionally, some earlier writings: * RandomWritings. * ColinsBlog2010 * ColinsBlog2009 * ColinsBlog2008 * ColinsBlog2007 * ColinsBlogBefore2007 ******** !! When optimising code, never guess, always measure. [[[> This page has been _ TaggedAsSoftware ]]] This is a truism that lots of people quote, but it can be hard to remember, especially in the heat of battle (as it were). Rather fortunately it came to mind just when needed, as I found something completely unexpected. I was writing a simple implementation of the Fermat difference of squares method of factoring. This involves writing the number to be factored as - you guessed it - the difference of two squares. If $n=a^2-b^2$ then $n=(a-b)(a+b)$ and we have a factorisation (provided we don't have $a-b=1$). ---- ! Version 0 ********> So the obvious thing to do is to take the square root of $n$, round it up to get a candidate for $a$, then look at $a^2-n$. If that's a square then we're done. ******** width="50%" {{{ def factor_fermat0(n): a = square_root_ceil(n) while True: d = a*a-n b = square_root_ceil(d) if b**2 == d: return a-b, a+b a += 1 }}} ********< ---- ! Version 1 ********> We can improve on this. We can increase $a$ until we have $a^2-b^2>n$, and then increase $b$ until $a^2-b^2<n$. If $n$ is odd then we'll get equality at some point, so we just continue until that happens, and we're done. ******** width="50%" {{{ def factor_fermat1( n ): a = square_root_ceil(n) b = square_root_ceil(a**2-n) while a**2-b**2 != n: while a**2-b**2 < n: a += 1 while a**2-b**2 > n: b += 1 return a-b, a+b }}} ********< ---- ! Version 2 ********> Now we notice that we are repeatedly computing the squares of $a$ and $b$, so we create variables to hold the squares and update them by remembering that $(x+1)^2=x^2+2x+1$. ******** width="50%" {{{ def factor_fermat2( n ): a = square_root_ceil(n) a2 = a**2 b = square_root_ceil(a**2-n) b2 = b**2 while a2-b2 != n: while a2-b2 < n: a2, a = a2+a+a+1, a+1 while a2-b2 > n: b2, b = b2+b+b+1, b+1 return a-b, a+b }}} ********< ---- ! Version 3 ********> We can improve that slightly further by having $c=2a+1$ and $d=2b+1$, and the code now becomes: ******** width="50%" {{{ def factor_fermat3( n ): a = square_root_ceil(n) a2 = a**2 c = 2*a+1 b = square_root_ceil(a**2-n) b2 = b**2 d = 2*b+1 while a2-b2 != n: while a2-b2 < n: a2, c = a2+c, c+2 while a2-b2 > n: b2, d = b2+d, d+2 return (c-d)/2, (c+d)/2-1 }}} ********< ---- !! Let's see how we're doing These improvements all seem obvious and straight-forward, and we can check by doing some timing tests. The times, and indeed, the /ratios/ of times, will vary according to the number being factored, so we can do two different numbers and see what the trend is. If you choose to analyse the algorithm then you'll see what's going on. However, we factored two different numbers: |>> ********> 4736906887651 = _ 1211141 _ x _ 3911111: ******** | Routine | |>> 0 <<| | |>> 1 <<| | |>> 2 <<| | |>> 3 <<| | | Run 0 | 9.223 | 0.409 | 0.258 | 0.188 | | Run 1 | 9.138 | 0.408 | 0.258 | 0.188 | | Run 2 | 9.146 | 0.407 | 0.262 | 0.194 | | Run 3 | 9.149 | 0.416 | 0.258 | 0.188 | | Run 4 | 9.228 | 0.408 | 0.272 | 0.189 | | Run 5 | 9.187 | 0.407 | 0.258 | 0.188 | | Run 6 | 9.154 | 0.421 | 0.259 | 0.188 | | Run 7 | 9.157 | 0.407 | 0.259 | 0.189 | ********< <<| |>> ********> 47368008965429 = _ 3911111 _ x _ 12111139: ******** | Routine | |>> 0 <<| | |>> 1 <<| | |>> 2 <<| | |>> 3 <<| | | Run 0 | 34.922 | 1.230 | 0.774 | 0.564 | | Run 1 | 34.276 | 1.214 | 0.775 | 0.562 | | Run 2 | 34.191 | 1.215 | 0.777 | 0.564 | | Run 3 | 34.198 | 1.214 | 0.776 | 0.564 | | Run 4 | 34.182 | 1.230 | 0.776 | 0.563 | | Run 5 | 34.226 | 1.244 | 0.775 | 0.563 | | Run 6 | 34.144 | 1.235 | 0.773 | 0.564 | | Run 7 | 34.104 | 1.233 | 0.775 | 0.565 | ********< <<| We can see that the first routine is rubbish, but that's no surprise, as it's computing a square root every time through the loop. For the others we can see a clear improvement as we make those optimisations. It did surprise me just how big the improvement was from #2 to #3, but it was nice to see. ---- ! Version 4 ********> Then I made a small change to the code layout ready for another shot at an optimisation. Pretty uncontroversial - I ran the timing tests again to make sure there was no difference, using a larger number, and not bothering to use the reference "version 0" since that was so slow. ******** width="50%" [[[ {{{ def factor_fermat4( n ): a = square_root_ceil(n) a2 = a**2 c = 2*a+1 b = square_root_ceil(a**2-n) b2 = b**2 d = 2*b+1 while a2-b2 != n: while a2-b2 < n: a2 += c c += 2 while a2-b2 > n: b2 += d d += 2 return (c-d)/2, (c+d)/2-1 }}} ]]] ********< |>> ********> 47368008965429 = _ 3911111 _ x _ 12111139: ******** | Routine | |>> 1 <<| | |>> 2 <<| | |>> 3 <<| | |>> 4 <<| | | Run 0 | 1.225 | 0.774 | 0.562 | 0.622 | | Run 1 | 1.223 | 0.772 | 0.562 | 0.624 | | Run 2 | 1.238 | 0.776 | 0.567 | 0.625 | | Run 3 | 1.233 | 0.777 | 0.566 | 0.626 | | Run 4 | 1.229 | 0.769 | 0.563 | 0.624 | | Run 5 | 1.231 | 0.773 | 0.562 | 0.624 | | Run 6 | 1.225 | 0.771 | 0.563 | 0.625 | | Run 7 | 1.219 | 0.771 | 0.563 | 0.622 | ********< <<| Wait - what? It's slower? How can that be? I'd've thought it would compile to exactly the same bytecode, and so execute in pretty much the same time. Just to be sure I ran it on a bigger number. |>> ********> 4736791755876611 = _ 39111113 _ x _ 121111147: ******** | Routine | |>> 1 <<| | |>> 2 <<| | |>> 3 <<| | |>> 4 <<| | | Run 0 | 12.388 | 7.775 | 5.672 | 6.245 | | Run 1 | 12.215 | 7.767 | 5.660 | 6.256 | | Run 2 | 12.295 | 7.910 | 5.674 | 6.301 | | Run 3 | 12.196 | 7.749 | 5.645 | 6.247 | | Run 4 | 12.221 | 7.731 | 5.646 | 6.294 | | Run 5 | 12.227 | 7.750 | 5.638 | 6.232 | | Run 6 | 12.226 | 7.798 | 5.691 | 6.250 | | Run 7 | 12.164 | 7.753 | 5.646 | 6.264 | ********< <<| There is no doubt: |>> [[[ |>> {{{ a2, c = a2+c, c+2 }}} _ is faster than _ {{{ a2 += c c += 2 }}} <<| ]]] <<| That caught me by surprise, but it really does go to show: |>> [[[ |>> When optimising, _ measure, measure, _ measure. <<| ]]] <<| My guess is that in the first case the two evaluations and assignments can happen in parallel, and so may happen on different cores, whereas in the second case they must happen in serial, but that's just a guess. Maybe someone reading this knows for sure. Let me know. I've got more changes to make, but one thing is for sure - I'll be measuring the improvements as I go to make sure they really do make things better. ---- !! References: * https://en.wikipedia.org/wiki/Fermat%27s_factorization_method ---- !! Comments The following comment was sent, but I don't know if the person is happy to be mentioned by name - if it's you, please let me know. However, they said this (paraphrased): [[[ Using the standard python interpreter the GIL would prevent {{{ a2, c = a2+c, c+2 }}} from executing in parallel. Another big optimization would be to replace a**2 with a*a. Multiplication is much faster than power functions < 4. It is possible the interpreter does this automatically, but I don't think so. ]]] And to whomever it was - thank you for the response. ---- |>> | |>> <<<< Prev <<<< ---- ADogCalledMixture <<| | : | |>> >>>> Next >>>> ---- IdleThoughtsAboutPollardRho ... <<| | ---- ********> ''' <a href="https://mathstodon.xyz/@ColinTheMathmo"> ''' <img src="https://www.solipsys.co.uk/images/Mastodon_Mascot.png" ''' width="256" height="280" ''' alt="https://mathstodon.xyz/@ColinTheMathmo" ''' /></a> ******** ''' <a href="https://mathstodon.xyz/@ColinTheMathmo/">You can follow me on Mathstodon.</a> _ _ _ _ [[[> ''' <a href="https://twitter.com/ColinTheMathmo">Of course, you can also<br>follow me on twitter:</a> ''' <a href="https://twitter.com/ColinTheMathmo"> ''' <img src="https://www.solipsys.co.uk/new/images/TwitterButton.png" ''' title="By: TwitterButtons.net" ''' width="212" height="69" ''' alt="@ColinTheMathmo" ''' /></a> ''' <img src="/cgi-bin/CountHits.py?WhenOptimisingCodeMeasure" alt="" /> ]]] ********< ---- !! Send us a comment ... ''' <form action="https://www.solipsys.co.uk/cgi-bin/FormMail.pl" method=post> ''' <input type=hidden name="recipient" value="colinsblogcomment@solipsys.co.uk" > ''' <input type=hidden name="subject" value="Blog comment : WhenOptimisingCodeMeasure" > ''' <input type=hidden name="redirect" value="https://www.solipsys.co.uk/new/ThankYouForYourComment.html" > ''' <input type=hidden name="missing_fields_redirect" value="https://www.solipsys.co.uk/RequestError.html"> ''' <input type=hidden name="env_report" value="REMOTE_HOST, REMOTE_ADDR, HTTP_USER_AGENT" > ''' <input type=hidden name="print_blank_fields" value="1" > ********> width="47%" You can send us a message here. It doesn't get published, it just sends us an email, and is an easy way to ask any questions, or make any comments, without having to send a separate email. So just fill in the boxes and then: ''' <centre><font size="+4"><INPUT TYPE="submit" VALUE="CLICK HERE TO SEND"></font></center> ******** width="6%" ******** width="47%" If you include an email address then I'll be able to respond, and conversely, if you don't, I won't! Likewise, if you give me explicit permission to add your comments here then I might choose to do so, and if you don't, I won't! So if you reply, please include your preferences, otherwise I'll read you email, nod sagely, and file it away. ********< ''' <table cellpadding="5"> ''' <tr> ''' <td valign="top">Your name </td> <td valign="top">:</td> ''' <td> <input type=text name="realname" size="48"> </td> ''' <tr> ''' <td valign="top">Email </td> <td valign="top">:</td> ''' <td> <input type=text name="email" size="48"> </td> ''' </tr> ''' <tr> ''' <td valign="top">Message </td> <td valign="top">:</td> ''' <td> <TEXTAREA NAME="Message" ROWS=10 COLS=64></TEXTAREA> </td> ''' </tr> ''' </table> ''' <center> ''' <font size="+4"> ''' <INPUT TYPE="submit" VALUE="CLICK HERE TO SEND"> ''' </font> ''' </center> ''' </form> ********<