

There is an algorithm for calculating the integer square root of any
positive integer using only integer operations:
def isqrt(n):
    if n < 0: raise ValueError
    if n == 0:
        return 0
    bits = n.bit_length()
    a, b = divmod(bits, 2)
    x = 2**(a+b)
    while True:
        y = (x + n//x)//2
        if y >= x:
            return x
        x = y
This returns the integer part of the square root of n, that is, the greatest
whole number less than or equal to the square root of n:
py> isqrt(35)
5
py> isqrt(36)
6
That makes it equivalent to int(math.sqrt(n)), which also happens to be
much, much faster, at least for small values of n. However, for large
values of n, using floating point intermediate calculations fails:
py> import math
py> int(math.sqrt(2**3000))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OverflowError: long int too large to convert to float
Another problem is that, above a certain size, the resolution of floats is
larger than 1, so you can't convert every int into a float without loss:
py> float(2**90-1) == 2**90-1
False
which means that using math.sqrt is not correct:
py> isqrt(2**90-1)
35184372088831
py> int(math.sqrt(2**90-1)) # Off by one.
35184372088832
So, the challenge is to identify when it is safe to optimise isqrt(n) as
int(math.sqrt(n)):
Q1: What is the largest value of M, such that
all(isqrt(n) == int(math.sqrt(n)) for n in range(M))
returns True?
I have done a brute force test, which in nine hours confirmed that M is at
least 7627926244. At that rate, I expect that a brute force test of every
int representable as a float would take *months* of processing time.
Q2: For values above M, is there a way of identifying which values of n are
okay to use the optimized version?
Q3: What is the largest value of n beyond which you can never use the float
optimization?
You can assume that Python floats are IEEE754 C doubles, and that
math.sqrt() is correctly rounded.
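For reference, the defining property above - isqrt(n)**2 <= n < (isqrt(n)+1)**2 - can be checked mechanically for any candidate implementation. A minimal sketch, reusing the algorithm as given:

```python
def isqrt(n):
    # pure-integer Newton iteration, exactly as above
    if n < 0:
        raise ValueError
    if n == 0:
        return 0
    bits = n.bit_length()
    a, b = divmod(bits, 2)
    x = 2 ** (a + b)
    while True:
        y = (x + n // x) // 2
        if y >= x:
            return x
        x = y

# the invariant holds even where floats give up entirely
for n in (35, 36, 2**90 - 1, 2**3000 + 12345):
    r = isqrt(n)
    assert r * r <= n < (r + 1) * (r + 1)
```

Nothing here touches floating point, so the check is exact at any size.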

Steven


On Sat, Nov 1, 2014 at 12:29 PM, Steven D'Aprano
<steve+comp.lang.python at pearwood.info> wrote:
> Q1: What is the largest value of M, such that
>
> all(isqrt(n) == int(math.sqrt(n)) for n in range(M))
>
> returns True?
>
> I have done a brute force test, which in nine hours confirmed that M is at
> least 7627926244. At that rate, I expect that a brute force test of every
> int representable as a float would take *months* of processing time.
Here's something that should be faster than pure brute force:

import math
for i in range(2**34): # 2**34 should take about nine hours
    sqrt = int(math.sqrt(i))
    if sqrt * sqrt > i:
        print("Too high at %d" % i)
        break
    if (sqrt+1) * (sqrt+1) <= i:
        print("Too low at %d" % i)
        break
There are two ways int(math.sqrt(i)) can be wrong: it can return a
number that's too high, or one that's too low. If it's too high,
squaring it (btw, int*int seems to be faster than int**2) will produce
a number higher than the original. If the result was too low, then the
next number up would have been within the correct bounds. There's no
need to actually calculate isqrt here.
I was able to probe up to 2**29 in seven minutes on my computer
(allocating one core, CPython 3.5 from trunk). If it's linear, going
as far as 2**34 should take no more than the nine hours you spent. I
expect time will be a bit worse than linear (working with small
numbers is marginally faster than working with big numbers), but if
it's at least reasonably close, probing as far as 2**53 would take
81555 days. So, uhh, a brute force by your method is going to take a
lot more than months. (And 2**53 isn't technically where ints stop
being representable, but I'd use that as the final cutoff, as it's
where *all* ints stop being representable. In any case, there's a
failure at 2**54-1, so there's no need to go past there.)
Hmm. The speed difference between your brute force and mine isn't all
that much. I'm estimating maybe doing at best twice as much in the
same time. So you're probably already using something similar to the
above, and this isn't adding much to the discussion. :(
ChrisA


Hi Steven,
let me start by answering from reverse:
> Q3: What is the largest value of n beyond which you can never use the
> float optimization?
>
A3: There is no such value, besides the upper limit of floats (DBL_MAX ~
10^308).
P3: If you feed a perfect square into the floating point square root
algorithm, and the significant bits of the root fit within your float's
mantissa, it will always come out perfectly. I.e., computing sqrt(25) in
FP math is no different from sqrt(25*2**200):
>>> 25*2**200
40173451106474756888549052308529065063055074844569820882534400L
>>> x=int(math.sqrt(25*2**200))
>>> x
6338253001141147007483516026880L
>>> x*x
40173451106474756888549052308529065063055074844569820882534400L
>>>
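This claim is easy to spot-check (a sketch, not a proof): as long as the root's significant bits fit in a double's 53-bit significand, scaling by an even power of two only shifts the exponent.

```python
import math

for base in (5, 25, 144, 10**7):
    for shift in (0, 100, 200):
        # a perfect square times an even power of two
        n = base * base * 2 ** (2 * shift)
        # base**2 < 2**53, so n and its square root are both exact doubles
        assert int(math.sqrt(n)) == base * 2 ** shift
```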
On 01.11.14 02:29, Steven D'Aprano wrote:
> There is an algorithm for calculating the integer square root of any
> positive integer using only integer operations:
>
> def isqrt(n):
>     if n < 0: raise ValueError
>     if n == 0:
>         return 0
>     bits = n.bit_length()
>     a, b = divmod(bits, 2)
>     x = 2**(a+b)
>     while True:
>         y = (x + n//x)//2
>         if y >= x:
>             return x
>         x = y
> Q2: For values above M, is there a way of identifying which values of n are
> okay to use the optimized version?
A2: Do it in a different way.
Your above algorithm is obviously doing Heron's or Newton-Raphson
iterations, so the same as with floating point math. The first line
before the while loop computes some approximation to sqrt(n). Instead of
doing bit shuffling, you could compute this by FP math and get closer to
the desired result, unless the integer is too large to be represented by
FP. Now, the terminating condition seems to rely on the fact that the
initial estimate x>=sqrt(n), but I don't really understand it. My guess
is that if you do x=int(sqrt(n)), then do the first iteration, then swap
x and y such that x>y, then enter the loop, you would simply start with
a better estimate in case that the significant bits can be represented
by the float.
So this is my try, but not thoroughly tested:
def isqrt(n):
    if n < 0: raise ValueError
    if n == 0:
        return 0
    bits = n.bit_length()
    # the highest exponent in 64-bit IEEE is 1023
    if n > 2**1022:
        a, b = divmod(bits, 2)
        x = 2**(a+b)
    else:
        x = int(math.sqrt(n))
        y = n//x
        if x < y:
            x, y = (y, x)
    while True:
        y = (x + n//x)//2
        if y >= x:
            return x
        x = y
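As an additional (equally un-thorough) sanity check, the FP-seeded variant can be compared against Steven's pure-integer version; the names isqrt_int and isqrt_fp_seed are mine, for the sketch only:

```python
import math

def isqrt_int(n):
    # Steven's pure-integer version, used as the reference
    if n < 0:
        raise ValueError
    if n == 0:
        return 0
    bits = n.bit_length()
    a, b = divmod(bits, 2)
    x = 2 ** (a + b)
    while True:
        y = (x + n // x) // 2
        if y >= x:
            return x
        x = y

def isqrt_fp_seed(n):
    # the variant above: seed Newton's method from the FP square root
    # whenever n is small enough to convert to a double
    if n < 0:
        raise ValueError
    if n == 0:
        return 0
    if n > 2 ** 1022:
        bits = n.bit_length()
        a, b = divmod(bits, 2)
        x = 2 ** (a + b)
    else:
        x = int(math.sqrt(n))
        y = n // x
        if x < y:
            x, y = y, x   # ensures x >= sqrt(n), so the loop terminates right
    while True:
        y = (x + n // x) // 2
        if y >= x:
            return x
        x = y

for n in list(range(1, 2000)) + [2**90 - 1, 2**901, 2**1022 + 7, 2**3001 + 5]:
    assert isqrt_fp_seed(n) == isqrt_int(n), n
```

The swap step is what makes the seed safe: max(s, n//s) is never below the integer square root, whatever rounding math.sqrt did.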
Christian


Addendum: If my method below works, you can also use it to speed up
computations for n > 2**1022, by splitting off an even power of two from
the integer and computing the FP sqrt of the remaining mantissa for the
seed, i.e. doing the FP decomposition manually.
On 01.11.14 09:02, Christian Gollwitzer wrote:
> Hi Steven,
>
> let me start by answering from reverse:
> > Q3: What is the largest value of n beyond which you can never use the
> > float optimization?
> >
>
> A3: There is no such value, besides the upper limit of floats (DBL_MAX ~
> 10^308).
>
> P3: If you feed a perfect square into the floating point square root
> algorithm, and the significant bits of the root fit within your float's
> mantissa, it will always come out perfectly. I.e., computing sqrt(25) in
> FP math is no different from sqrt(25*2**200):
>
> >>> 25*2**200
> 40173451106474756888549052308529065063055074844569820882534400L
> >>> x=int(math.sqrt(25*2**200))
> >>> x
> 6338253001141147007483516026880L
> >>> x*x
> 40173451106474756888549052308529065063055074844569820882534400L
> >>>
>
>
>
> On 01.11.14 02:29, Steven D'Aprano wrote:
>> There is an algorithm for calculating the integer square root of any
>> positive integer using only integer operations:
>>
>> def isqrt(n):
>>     if n < 0: raise ValueError
>>     if n == 0:
>>         return 0
>>     bits = n.bit_length()
>>     a, b = divmod(bits, 2)
>>     x = 2**(a+b)
>>     while True:
>>         y = (x + n//x)//2
>>         if y >= x:
>>             return x
>>         x = y
>
>> Q2: For values above M, is there a way of identifying which values of
>> n are
>> okay to use the optimized version?
>
> A2: Do it in a different way.
>
> Your above algorithm is obviously doing Heron's or Newton-Raphson
> iterations, so the same as with floating point math. The first line
> before the while loop computes some approximation to sqrt(n). Instead of
> doing bit shuffling, you could compute this by FP math and get closer to
> the desired result, unless the integer is too large to be represented by
> FP. Now, the terminating condition seems to rely on the fact that the
> initial estimate x>=sqrt(n), but I don't really understand it. My guess
> is that if you do x=int(sqrt(n)), then do the first iteration, then swap
> x and y such that x>y, then enter the loop, you would simply start with
> a better estimate in case that the significant bits can be represented
> by the float.
>
> So this is my try, but not thoroughly tested:
>
> def isqrt(n):
>     if n < 0: raise ValueError
>     if n == 0:
>         return 0
>     bits = n.bit_length()
>     # the highest exponent in 64-bit IEEE is 1023
>     if n > 2**1022:
>         a, b = divmod(bits, 2)
>         x = 2**(a+b)
>     else:
>         x = int(math.sqrt(n))
>         y = n//x
>         if x < y:
>             x, y = (y, x)
>
>     while True:
>         y = (x + n//x)//2
>         if y >= x:
>             return x
>         x = y
>
>
> Christian
>
>
>
>


On Sat, Nov 1, 2014 at 7:02 PM, Christian Gollwitzer <auriocus at gmx.de> wrote:
> Your above algorithm is obviously doing Heron's or Newton-Raphson iterations,
> so the same as with floating point math. The first line before the while
> loop computes some approximation to sqrt(n). Instead of doing bit shuffling,
> you could compute this by FP math and get closer to the desired result,
> unless the integer is too large to be represented by FP. Now, the
> terminating condition seems to rely on the fact that the initial estimate
> x>=sqrt(n), but I don't really understand it. My guess is that if you do
> x=int(sqrt(n)), then do the first iteration, then swap x and y such that
> x>y, then enter the loop, you would simply start with a better estimate in
> case that the significant bits can be represented by the float.
Part of the point of that algorithm is that it never uses FP, and is
therefore not limited by FP restrictions. As to the assumption that
x>=sqrt(n), that would be safe: if the bit length is even (that is, n is
between an odd power of 2 and an even one - for example, 2**13 < 16000 <=
2**14), the initial estimate is the exact square root of the power of two
at the top of that range (a bit length of 14 means x is 2**7, and 128 ==
sqrt(16384)); if the bit length is odd (eg 2**8 < 300 <= 2**9), the
initial estimate rounds the halving upward (a bit length of 9 means x is
2**(9//2 + 1) == 32 > sqrt(512)). So there's a guarantee that the initial
estimate is no lower than the square root of the target number.
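That guarantee is cheap to confirm exhaustively for small n (a quick sketch): since n < 2**bits and the seed is x = 2**ceil(bits/2), we always have x*x >= 2**bits > n.

```python
# the integer-only seed never undershoots the true square root
for n in range(1, 1 << 16):
    bits = n.bit_length()
    a, b = divmod(bits, 2)
    x = 2 ** (a + b)  # the same seed isqrt uses
    assert x * x >= n
```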
ChrisA


On 01.11.14 09:13, Chris Angelico wrote:
> On Sat, Nov 1, 2014 at 7:02 PM, Christian Gollwitzer <auriocus at gmx.de> wrote:
>> Your above algorithm is obviously doing Heron's or Newton-Raphson iterations,
>> so the same as with floating point math. The first line before the while
>> loop computes some approximation to sqrt(n). Instead of doing bit shuffling,
>> you could compute this by FP math and get closer to the desired result,
>> unless the integer is too large to be represented by FP. Now, the
>> terminating condition seems to rely on the fact that the initial estimate
>> x>=sqrt(n), but I don't really understand it. My guess is that if you do
>> x=int(sqrt(n)), then do the first iteration, then swap x and y such that
>> x>y, then enter the loop, you would simply start with a better estimate in
>> case that the significant bits can be represented by the float.
>
> Part of the point of that algorithm is that it never uses FP, and is
> therefore not limited by FP restrictions.
which are???


On Sat, Nov 1, 2014 at 7:25 PM, Christian Gollwitzer <auriocus at gmx.de> wrote:
>> Part of the point of that algorithm is that it never uses FP, and is
>> therefore not limited by FP restrictions.
>
>
> which are???
Most notably, the inability to represent every integer beyond 2**53,
and the inability to represent any integer beyond 2**1024. This
algorithm should work fine with any positive integer.
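Both restrictions can be demonstrated in a couple of lines (a sketch):

```python
# every integer up to 2**53 survives the round trip to float...
assert float(2**53) == 2**53
# ...but 2**53 + 1 is silently rounded to the nearest double
assert float(2**53 + 1) == float(2**53)
# and past the double exponent range, conversion fails outright
try:
    float(2**1024)
except OverflowError:
    print("2**1024 does not fit in a float")
```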
ChrisA


On 01.11.14 09:33, Chris Angelico wrote:
> On Sat, Nov 1, 2014 at 7:25 PM, Christian Gollwitzer <auriocus at gmx.de> wrote:
>>> Part of the point of that algorithm is that it never uses FP, and is
>>> therefore not limited by FP restrictions.
>>
>>
>> which are???
>
> Most notably, the inability to represent every integer beyond 2**53,
> and the inability to represent any integer beyond 2**1024. This
> algorithm should work fine with any positive integer.
>
OK, so you did not bother to look at my proposed alternative
implementation. If I understood Steven correctly, he wanted to speed up
the original isqrt algorithm by using FP when this is possible. I have
shown how to do it for n < 2**1022 (maybe 2**1024, I'm too lazy to check
it). I admit that there is some micro-optimization left, e.g. the first
division is done twice, and the comparison should be bits > 1022, not
n > 2**1022, etc.
Christian


On Sat, Nov 1, 2014 at 7:38 PM, Christian Gollwitzer <auriocus at gmx.de> wrote:
> On 01.11.14 09:33, Chris Angelico wrote:
>>
>> On Sat, Nov 1, 2014 at 7:25 PM, Christian Gollwitzer <auriocus at gmx.de>
>> wrote:
>>>>
>>>> Part of the point of that algorithm is that it never uses FP, and is
>>>> therefore not limited by FP restrictions.
>>>
>>>
>>>
>>> which are???
>>
>>
>> Most notably, the inability to represent every integer beyond 2**53,
>> and the inability to represent any integer beyond 2**1024. This
>> algorithm should work fine with any positive integer.
>>
> OK, so you did not bother to look at my proposed alternative implementation.
> If I understood Steven correctly, he wanted to speed up the original isqrt
> algorithm by using FP when this is possible. I have shown how to do it for
> n < 2**1022 (maybe 2**1024, I'm too lazy to check it). I admit that there is
> some micro-optimization left, e.g. the first division is done twice, and the
> comparison should be bits > 1022, not n > 2**1022, etc.
I did look at it. Trouble is, I don't know floating point's details
well enough to prove that there are no *other* limitations.
FWIW, I've proven the algorithm as far as 2**38. That's still a long
way short of 2**53, though, and getting as far as 2**39 would, with my
brute-force checker, require between 2.5 and 5 hours. I've made some
improvements over the original, but it's still slow.
ChrisA


Steven D'Aprano <steve+comp.lang.python at pearwood.info> writes:
> There is an algorithm for calculating the integer square root of any
> positive integer using only integer operations:
>
> def isqrt(n):
>     if n < 0: raise ValueError
>     if n == 0:
>         return 0
>     bits = n.bit_length()
>     a, b = divmod(bits, 2)
>     x = 2**(a+b)
>     while True:
>         y = (x + n//x)//2
>         if y >= x:
>             return x
>         x = y
>
> This returns the integer part of the square root of n, that is, the greatest
> whole number less than or equal to the square root of n:
>
> py> isqrt(35)
> 5
> py> isqrt(36)
> 6
>
>
> That makes it equivalent to int(math.sqrt(n)), which also happens to be
> much, much faster, at least for small values of n. However, for large
> values of n, using floating point intermediate calculations fails:
>
> py> import math
> py> int(math.sqrt(2**3000))
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> OverflowError: long int too large to convert to float
>
> Another problem is that, above a certain size, the resolution of floats is
> larger than 1, so you can't convert every int into a float without loss:
>
> py> float(2**90-1) == 2**90-1
> False
>
> which means that using math.sqrt is not correct:
>
> py> isqrt(2**90-1)
> 35184372088831
> py> int(math.sqrt(2**90-1)) # Off by one.
> 35184372088832
>
>
> So, the challenge is to identify when it is safe to optimise isqrt(n) as
> int(math.sqrt(n)):
>
> Q1: What is the largest value of M, such that
>
> all(isqrt(n) == int(math.sqrt(n)) for n in range(M))
>
> returns True?
>
> I have done a brute force test, which in nine hours confirmed that M is
> at least 7627926244. At that rate, I expect that a brute force test of
> every int representable as a float would take *months* of processing
> time.
>
> Q2: For values above M, is there a way of identifying which values of n are
> okay to use the optimized version?
>
> Q3: What is the largest value of n beyond which you can never use the float
> optimization?
>
>
> You can assume that Python floats are IEEE754 C doubles, and that
> math.sqrt() is correctly rounded.
Where do you want to use your optimized isqrt(n)?
There could be specialized algorithms that work only in narrow specific
circumstances, e.g., the inverse square root (1/sqrt(x)) implementation
from Quake III Arena that has the 0x5f3759df constant in it (only of
historical interest now).
If you want to work with very large (thousands or millions of digits)
integers, then the gmp library might be faster than the default Python
integer implementation.

Akira


On 01.11.14 03:29, Steven D'Aprano wrote:
> There is an algorithm for calculating the integer square root of any
> positive integer using only integer operations:
Here is my optimized implementation inspired by Christian's ideas.
import math, sys

C1 = sys.float_info.radix ** sys.float_info.mant_dig
C2 = int(sys.float_info.max)
C3 = C2.bit_length() - 2

def isqrt(n):
    if n < 0:
        raise ValueError
    if n == 0:
        return 0
    if n <= C1:
        return int(math.sqrt(n))
    if n <= C2:
        x = int(math.sqrt(n))
    else:
        b = (n.bit_length() - C3) // 2
        x = int(math.sqrt(n >> (2 * b))) << b
    y = (x + n // x) // 2
    if y == x:
        return x
    while True:
        x = y
        y = (x + n // x) // 2
        if y >= x:
            return x
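A quick property test of this version (a sketch; note that the n <= C1 fast path simply trusts int(math.sqrt(n)) up to 2**53, which is precisely Steven's open Q1 - the Newton correction only guards the larger ranges):

```python
import math, sys

C1 = sys.float_info.radix ** sys.float_info.mant_dig   # 2**53
C2 = int(sys.float_info.max)                           # largest finite double
C3 = C2.bit_length() - 2                               # 1022

def isqrt(n):
    # Serhiy's version, reproduced so the sketch is self-contained
    if n < 0:
        raise ValueError
    if n == 0:
        return 0
    if n <= C1:
        return int(math.sqrt(n))
    if n <= C2:
        x = int(math.sqrt(n))
    else:
        b = (n.bit_length() - C3) // 2
        x = int(math.sqrt(n >> (2 * b))) << b
    y = (x + n // x) // 2
    if y == x:
        return x
    while True:
        x = y
        y = (x + n // x) // 2
        if y >= x:
            return x

# defining property: isqrt(n)**2 <= n < (isqrt(n)+1)**2
for n in list(range(1, 10000)) + [2**53 + 1, 2**90 - 1, 2**1100 + 7, 2**3000]:
    r = isqrt(n)
    assert r * r <= n < (r + 1) * (r + 1), n
```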


Serhiy,
This looks very good indeed. As a matter of interest, is there any
particular reason you have used 2*b instead of b+b? Might b+b be faster
than b*2?
Also, in various lines, you use //2. Would >>1 be quicker? On reflection,
perhaps you have had to use //2 because >>1 cannot be used in those
situations.
Stephen Tucker.
On Thu, Nov 20, 2014 at 6:00 PM, Serhiy Storchaka <storchaka at gmail.com>
wrote:
> On 01.11.14 03:29, Steven D'Aprano wrote:
>
>> There is an algorithm for calculating the integer square root of any
>> positive integer using only integer operations:
>>
>
> Here is my optimized implementation inspired by Christian's ideas.
>
> import math, sys
>
> C1 = sys.float_info.radix ** sys.float_info.mant_dig
> C2 = int(sys.float_info.max)
> C3 = C2.bit_length() - 2
>
> def isqrt(n):
>     if n < 0:
>         raise ValueError
>     if n == 0:
>         return 0
>     if n <= C1:
>         return int(math.sqrt(n))
>     if n <= C2:
>         x = int(math.sqrt(n))
>     else:
>         b = (n.bit_length() - C3) // 2
>         x = int(math.sqrt(n >> (2 * b))) << b
>     y = (x + n // x) // 2
>     if y == x:
>         return x
>     while True:
>         x = y
>         y = (x + n // x) // 2
>         if y >= x:
>             return x
>
>
> 
> https://mail.python.org/mailman/listinfo/pythonlist>


On 11/21/2014 03:15 AM, Stephen Tucker wrote:
>
>
>
> On Thu, Nov 20, 2014 at 6:00 PM, Serhiy Storchaka <storchaka at gmail.com>
> wrote:
>
>> On 01.11.14 03:29, Steven D'Aprano wrote:
>>
>>> There is an algorithm for calculating the integer square root of any
>>> positive integer using only integer operations:
>>>
>>
>> Here is my optimized implementation inspired by Christian's ideas.
>>
>> ....
>> def isqrt(n):
>>     if n < 0:
>>         raise ValueError
>>     if n == 0:
>>         return 0
>>     if n <= C1:
>>         return int(math.sqrt(n))
>>     if n <= C2:
>>         x = int(math.sqrt(n))
>>     else:
>>         b = (n.bit_length() - C3) // 2
>>         x = int(math.sqrt(n >> (2 * b))) << b
>>     y = (x + n // x) // 2
>>     if y == x:
>>         return x
>>     while True:
>>         x = y
>>         y = (x + n // x) // 2
>>         if y >= x:
>>             return x
>>
> Serhiy,
>
> This looks very good indeed. As a matter of interest, is there any
> particular reason you have used 2*b instead of b+b? Might b+b be faster
> than b*2?
>
> Also, in various lines, you use //2. Would >>1 be quicker? On reflection,
> perhaps you have had to use //2 because >>1 cannot be used in those
> situations.
>
> Stephen Tucker.
>
(Please don't top-post here. I moved your remarks after the code you're
referencing, so that others may see them in the preferred order.)
It's been probably two decades since I've been in a programming
situation where that kind of difference mattered. In Python in
particular, the actual arithmetic or bit shifting is probably only one
part in a hundred, and in modern processors all these operations tend to
be very close in timing.
You're welcome to use the timeit module to try to measure it. But I
doubt it makes a difference of more than one part in a thousand. And in
some situations I can imagine b+b to be slower than 2*b. (For example,
if b were global)

DaveA


On Friday, 21 November 2014 at 08:15:57, you wrote:
> This looks very good indeed. As a matter of interest, is there any
> particular reason you have used 2*b instead of b+b? Might b+b be faster
> than b*2?
Yes, it is slightly faster, but the effect is indiscernible in the total
time. But there is no harm in using b+b.
> Also, in various lines, you use //2. Would >>1 be quicker? On reflection,
> perhaps you have had to use //2 because >>1 cannot be used in those
> situations.
I thought this effect would be insignificant too. But actually it is
measurable (about 10% for some input). Thanks, this optimization is
worth applying.


On 11/25/2014 02:31 PM, Serhiy Storchaka wrote:
> On Friday, 21 November 2014 at 08:15:57, you wrote:
>> This looks very good indeed. As a matter of interest, is there any
>> particular reason you have used 2*b instead of b+b? Might b+b be faster
>> than b*2?
>
> Yes, it is slightly faster, but the effect is indiscernible in the total
> time. But there is no harm in using b+b.
>
>> Also, in various lines, you use //2. Would >>1 be quicker? On reflection,
>> perhaps you have had to use //2 because >>1 cannot be used in those
>> situations.
>
> I thought this effect would be insignificant too. But actually it is
> measurable (about 10% for some input). Thanks, this optimization is
> worth applying.
>
Unfortunately, for many values, the version of the function with >>1 is
slower. It's only when the argument is bigger than 10**40 that it's as
much as 1% faster. But it's true that for really large values, it's
quicker.
A quick test shows that 3.3 is about 20% faster for both these functions
than 2.7.
My oversight earlier was assuming a native type. Once you get into
"longs" which aren't supported by the processor, the shift will likely
become much faster than divide.
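For the curious, the difference is easy to measure with the timeit module (a sketch; the absolute numbers depend entirely on the machine and interpreter version):

```python
import timeit

# compare floor-division by two against a right shift, for an int that
# fits in a machine word and for a multi-thousand-bit "long"
setup = "small = 12345; big = 2**5000 + 12345"
for label, stmt in [("small // 2", "small // 2"),
                    ("small >> 1", "small >> 1"),
                    ("big // 2",   "big // 2"),
                    ("big >> 1",   "big >> 1")]:
    t = timeit.timeit(stmt, setup=setup, number=1000000)
    print("%-10s %.3f s" % (label, t))
```

Whatever the timings, the two spellings agree in value, which is why the substitution is safe at all.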

DaveA


Another _possible_ performance improvement that is staring us in the face
is that 2*b could be replaced with b<<1. Given that b+b (an earlier
suggestion of mine) involves two table lookups for b, whereas b<<1 only
involves one, it seems that the scope here for improvement is significant.
By the way, I hope this post is not "top-posted" as my previous one was.
Apologies for that - I am new to this kind of thing.
On Wed, Nov 26, 2014 at 12:46 AM, Dave Angel <davea at davea.name> wrote:
> On 11/25/2014 02:31 PM, Serhiy Storchaka wrote:
>
>> On Friday, 21 November 2014 at 08:15:57, you wrote:
>>
>>> This looks very good indeed. As a matter of interest, is there any
>>> particular reason you have used 2*b instead of b+b? Might b+b be faster
>>> than b*2?
>>>
>>
>> Yes, it is slightly faster, but the effect is indiscernible in the total
>> time. But there is no harm in using b+b.
>>
>> Also, in various lines, you use //2. Would >>1 be quicker? On reflection,
>> perhaps you have had to use //2 because >>1 cannot be used in those
>> situations.
>>
>>
>> I thought this effect would be insignificant too. But actually it is
>> measurable (about 10% for some input). Thanks, this optimization is
>> worth applying.
>>
>>
> Unfortunately, for many values, the version of the function with >>1 is
> slower. It's only when the argument is bigger than 10**40 that it's as
> much as 1% faster. But it's true that for really large values, it's
> quicker.
>
> A quick test shows that 3.3 is about 20% faster for both these functions
> than 2.7.
>
> My oversight earlier was assuming a native type. Once you get into
> "longs" which aren't supported by the processor, the shift will likely
> become much faster than divide.
>
>
> 
> DaveA
> 


Stephen Tucker wrote:
> Another _possible_ performance improvement that is staring us in the face
> is that 2*b could be replaced with b<<1. Given that b+b (an earlier
> suggestion of mine) involves two table lookups for b, whereas b<<1 only
> involves one, it seems that the scope here for improvement is significant.
As the Original Poster of this thread, I want to thank everyone who has
contributed and assure them that despite my silence so far, I do appreciate
your responses and I'm collecting the suggestions for later.
> By the way, I hope this post is not "topposted" as my previous one was.
> Apologies for that  I am new to this kind of thing.
Alas, you have top posted. Has anyone explained what top posting is?
Reading from the top down, as is standard in English, an email conversation
will often take the following form. I'll use "=" to indicate quoted text
instead of the more common ">".
= quoted comment
= more text
response
= a question
answer
= a second question
a second answer
Note that it reads like a conversation, "you say, I respond, you say, I
respond...". This is often called "interleaved posting" or "inline
posting", because the replies are interleaved between the questions. Also,
the quoted text is usually trimmed or "snipped" to remove parts which are
no longer relevant. The idea is to leave enough quoted content to establish
context, but without removing so much as to give a misleading impression,
or leaving so much in that the reader is overwhelmed by the volume of
quoted text.
To old-school email and Usenet users, inline posting is usually considered
far superior, *especially* for long and technical discussions like we often
find here.
Top posting usually ends up looking like this:
response completely lacking any context
= quoted comment
= more text
= text which is no longer relevant
= a question
= more text which is not relevant
= and even more of it
= a second question
= and more text which is not being
= commented on, so why is it left in?
Notice that it is all too easy to fail to answer the questions when top
posting. (To be fair, that's more a product of the person responding being
lazy or careless, but top posting does encourage lazy and careless
replies.)
Bottom posting often looks like this:
= quoted comment
= more text
= text which is no longer relevant
= a question
= more text which is not relevant
= and even more of it
= a second question
= and more text which is not being
= commented on, so why is it left in?
response lacking context
Note that the reader may end up having to scroll past many paragraphs (or
even pages!) of quoted text to discover the response. Which, all too often,
is a single sentence.
Thanks for contributing to the discussion.

Steven


On 26.11.14 02:46, Dave Angel wrote:
> Unfortunately, for many values, the version of the function with >>1 is
> slower. It's only when the argument is bigger than 10**40 that it's as
> much as 1% faster. But it's true that for really large values, it's
> quicker.
Note that this path is used only for values above 2**53 (about 10**16).
For all tested values from 10**16 to 10**40, the difference between the two
versions is not larger than 1% (in one direction or the other). But the
difference between different runs of the benchmark for the same version
varies by about 1%. So the difference between versions in this range doesn't
matter.

