a more precise distance algorithm

classic Classic list List threaded Threaded
18 messages Options
Reply | Threaded
Open this post in threaded view
|

a more precise distance algorithm

ravas
I read an interesting comment:
"""
The coolest thing I've ever discovered about Pythagorean's Theorem is an alternate way to calculate it. If you write a program that uses the distance form c = sqrt(a^2 + b^2) you will suffer from the lose of half of your available precision because the square root operation is last. A more accurate calculation is c = a * sqrt(1 + b^2 / a^2). If a is less than b, you should swap them and of course handle the special case of a = 0.
"""

Is this valid? Does it apply to python?
Any other thoughts? :D

My imagining:

def distance(A, B):
    """
    A & B are objects with x and y attributes
    :return: the distance between A and B
    """
    dx = B.x - A.x
    dy = B.y - A.y
    a = min(dx, dy)
    b = max(dx, dy)
    if a == 0:
        return b
    elif b == 0:
        return a
    else:
        return a * sqrt(1 + (b / a)**2)


Reply | Threaded
Open this post in threaded view
|

a more precise distance algorithm

felix
El 25/05/15 15:21, ravas escribi?:

> I read an interesting comment:
> """
> The coolest thing I've ever discovered about Pythagorean's Theorem is an alternate way to calculate it. If you write a program that uses the distance form c = sqrt(a^2 + b^2) you will suffer from the lose of half of your available precision because the square root operation is last. A more accurate calculation is c = a * sqrt(1 + b^2 / a^2). If a is less than b, you should swap them and of course handle the special case of a = 0.
> """
>
> Is this valid? Does it apply to python?
> Any other thoughts? :D
>
> My imagining:
>
> def distance(A, B):
>      """
>      A & B are objects with x and y attributes
>      :return: the distance between A and B
>      """
>      dx = B.x - A.x
>      dy = B.y - A.y
>      a = min(dx, dy)
>      b = max(dx, dy)
>      if a == 0:
>          return b
>      elif b == 0:
>          return a
>      else:
>          return a * sqrt(1 + (b / a)**2)
I don't know if precision lose fits here but the second way you gave to
calculate c is just Math. Nothing extraordinary here.

c = a * sqrt(1 + b^2 / a^2)
c = sqrt(a^2(1 + b^2 / a^2)) applying the inverse function to introduce
a inside the square root
c = sqrt(a^2 + a^2*b^2/a^2) then just simplify
c = sqrt(a^2 + b^2)

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-list/attachments/20150525/b7c91065/attachment.html>

Reply | Threaded
Open this post in threaded view
|

a more precise distance algorithm

Gary Herron-3
In reply to this post by ravas
On 05/25/2015 12:21 PM, ravas wrote:
> I read an interesting comment:
> """
> The coolest thing I've ever discovered about Pythagorean's Theorem is an alternate way to calculate it. If you write a program that uses the distance form c = sqrt(a^2 + b^2) you will suffer from the lose of half of your available precision because the square root operation is last. A more accurate calculation is c = a * sqrt(1 + b^2 / a^2). If a is less than b, you should swap them and of course handle the special case of a = 0.
> """
>
> Is this valid?

> Does it apply to python?

This is a statement about floating point numeric calculations on a
computer,.  As such, it does apply to Python which uses the underlying
hardware for floating point calculations.

Validity is another matter.  Where did you find the quote?

Gary Herron


> Any other thoughts? :D
>
> My imagining:
>
> def distance(A, B):
>      """
>      A & B are objects with x and y attributes
>      :return: the distance between A and B
>      """
>      dx = B.x - A.x
>      dy = B.y - A.y
>      a = min(dx, dy)
>      b = max(dx, dy)
>      if a == 0:
>          return b
>      elif b == 0:
>          return a
>      else:
>          return a * sqrt(1 + (b / a)**2)



Reply | Threaded
Open this post in threaded view
|

a more precise distance algorithm

Christian Gollwitzer
In reply to this post by ravas
Am 25.05.15 um 21:21 schrieb ravas:
> I read an interesting comment:
> """
> The coolest thing I've ever discovered about Pythagorean's Theorem is an alternate way to calculate it. If you write a program that uses the distance form c = sqrt(a^2 + b^2) you will suffer from the lose of half of your available precision because the square root operation is last. A more accurate calculation is c = a * sqrt(1 + b^2 / a^2). If a is less than b, you should swap them and of course handle the special case of a = 0.
> """
>
> Is this valid?

Yes. Valid for floating point math, which can overflow and lose precision.

> Does it apply to python?

Yes. Python uses floating point math by default

> Any other thoughts? :D
>
> My imagining:
>
> def distance(A, B):

Wrong. Just use the built-in function Math.hypot() - it should handle
these cases and also overflow, infinity etc. in the best possible way.

Apfelkiste:~ chris$ python
Python 2.7.2 (default, Oct 11 2012, 20:14:37)
[GCC 4.2.1 Compatible Apple Clang 4.0 (tags/Apple/clang-418.0.60)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
 >>> import math
 >>> math.hypot(3,4)
5.0

        Christian


Reply | Threaded
Open this post in threaded view
|

a more precise distance algorithm

ravas
On Monday, May 25, 2015 at 1:27:24 PM UTC-7, Christian Gollwitzer wrote:

> Wrong. Just use the built-in function Math.hypot() - it should handle
> these cases and also overflow, infinity etc. in the best possible way.
>
> Apfelkiste:~ chris$ python
> Python 2.7.2 (default, Oct 11 2012, 20:14:37)
> [GCC 4.2.1 Compatible Apple Clang 4.0 (tags/Apple/clang-418.0.60)] on darwin
> Type "help", "copyright", "credits" or "license" for more information.
>  >>> import math
>  >>> math.hypot(3,4)
> 5.0
>
> Christian

Thank you! :D I forgot about that one.
I wonder how the sympy Point.distance method compares...


Reply | Threaded
Open this post in threaded view
|

a more precise distance algorithm

ravas
In reply to this post by ravas
On Monday, May 25, 2015 at 1:27:43 PM UTC-7, Gary Herron wrote:
> This is a statement about floating point numeric calculations on a
> computer,.  As such, it does apply to Python which uses the underlying
> hardware for floating point calculations.
>
> Validity is another matter.  Where did you find the quote?

Thank you.

You can find the quote in the 4th comment at the bottom of:
http://betterexplained.com/articles/surprising-uses-of-the-pythagorean-theorem/


Reply | Threaded
Open this post in threaded view
|

a more precise distance algorithm

Steven D'Aprano-8
In reply to this post by ravas
On Tue, 26 May 2015 05:21 am, ravas wrote:

> I read an interesting comment:
> """
> The coolest thing I've ever discovered about Pythagorean's Theorem is an
> alternate way to calculate it. If you write a program that uses the
> distance form c = sqrt(a^2 + b^2) you will suffer from the lose of half of
> your available precision because the square root operation is last. A more
> accurate calculation is c = a * sqrt(1 + b^2 / a^2). If a is less than b,
> you should swap them and of course handle the special case of a = 0. """


Let's compare three methods.


import math
import random

def naive(a, b):
    return math.sqrt(a**2 + b**2)

def alternate(a, b):
    a, b = min(a, b), max(a, b)
    if a == 0:  return b
    if b == 0:  return a
    return a * math.sqrt(1 + b**2 / a**2)

counter = 0
print("Type Ctrl-C to exit")
while True:
    counter += 1
    a = random.uniform(0, 1000)
    b = random.uniform(0, 1000)
    d1 = naive(a, b)
    d2 = alternate(a, b)
    d3 = math.hypot(a, b)
    if not (d1 == d2 == d3):
        print("mismatch after %d trials" % counter)
        print("naive:", d1)
        print("alternate:", d2)
        print("hypot:", d3)
        break



When I run that, I get:

mismatch after 3 trials
naive: 1075.6464259886257
alternate: 1075.6464259886257
hypot: 1075.6464259886254

A second run gives:

mismatch after 3 trials
naive: 767.3916150255787
alternate: 767.3916150255789
hypot: 767.3916150255787


which shows that:

(1) It's not hard to find mismatches;
(2) It's not obvious which of the three methods is more accurate.



--
Steven



Reply | Threaded
Open this post in threaded view
|

a more precise distance algorithm

ravas
On Monday, May 25, 2015 at 8:11:25 PM UTC-7, Steven D'Aprano wrote:
> Let's compare three methods.
> ...
> which shows that:
>
> (1) It's not hard to find mismatches;
> (2) It's not obvious which of the three methods is more accurate.

Thank you; that is very helpful!

I'm curious: what about the sqrt() function being last is detrimental?
>From a point of ignorance it seems like we are just producing errors sooner,
and then multiplying them, with this alternative method.



Reply | Threaded
Open this post in threaded view
|

a more precise distance algorithm

Ian Kelly-2
In reply to this post by ravas
On Mon, May 25, 2015 at 1:21 PM, ravas <ravas at outlook.com> wrote:

> I read an interesting comment:
> """
> The coolest thing I've ever discovered about Pythagorean's Theorem is an alternate way to calculate it. If you write a program that uses the distance form c = sqrt(a^2 + b^2) you will suffer from the lose of half of your available precision because the square root operation is last. A more accurate calculation is c = a * sqrt(1 + b^2 / a^2). If a is less than b, you should swap them and of course handle the special case of a = 0.
> """
>
> Is this valid? Does it apply to python?
> Any other thoughts? :D
>
> My imagining:
>
> def distance(A, B):
>     """
>     A & B are objects with x and y attributes
>     :return: the distance between A and B
>     """
>     dx = B.x - A.x
>     dy = B.y - A.y
>     a = min(dx, dy)
>     b = max(dx, dy)
>     if a == 0:
>         return b
>     elif b == 0:
>         return a

This branch is incorrect because a could be negative.

You don't need this anyway; the a == 0 branch is only there because of
the division by a in the else branch.

>     else:
>         return a * sqrt(1 + (b / a)**2)

Same issue; if a is negative then the result will have the wrong sign.


Reply | Threaded
Open this post in threaded view
|

a more precise distance algorithm

ravas
In reply to this post by ravas
Oh ya... true >_<

Thanks :D

On Monday, May 25, 2015 at 9:43:47 PM UTC-7, Ian wrote:

> > def distance(A, B):
> >     """
> >     A & B are objects with x and y attributes
> >     :return: the distance between A and B
> >     """
> >     dx = B.x - A.x
> >     dy = B.y - A.y
> >     a = min(dx, dy)
> >     b = max(dx, dy)
> >     if a == 0:
> >         return b
> >     elif b == 0:
> >         return a
>
> This branch is incorrect because a could be negative.
>
> You don't need this anyway; the a == 0 branch is only there because of
> the division by a in the else branch.
>
> >     else:
> >         return a * sqrt(1 + (b / a)**2)
>
> Same issue; if a is negative then the result will have the wrong sign.



Reply | Threaded
Open this post in threaded view
|

a more precise distance algorithm

Gary Herron-2
In reply to this post by ravas
On 05/25/2015 09:13 PM, ravas wrote:

> On Monday, May 25, 2015 at 8:11:25 PM UTC-7, Steven D'Aprano wrote:
>> Let's compare three methods.
>> ...
>> which shows that:
>>
>> (1) It's not hard to find mismatches;
>> (2) It's not obvious which of the three methods is more accurate.
> Thank you; that is very helpful!
>
> I'm curious: what about the sqrt() function being last is detrimental?
>  From a point of ignorance it seems like we are just producing errors sooner,
> and then multiplying them, with this alternative method.

It's probably not the square root that's causing the inaccuracies. In
many other cases, and probably here also, it's the summing of two
numbers that have vastly different values that loses precision.  A
demonstration:

 >>> big = 100000000.0
 >>> small = 0.000000001
 >>> (big+small)-big # Should produce a value =small, but gives an exact
zero instead.
0.0

The squaring of the two values in x*x+y*y just makes the addition even
more error prone since the squares make large values even larger and
small values even smaller.


Gary Herron.

--
Dr. Gary Herron
Department of Computer Science
DigiPen Institute of Technology
(425) 895-4418



Reply | Threaded
Open this post in threaded view
|

a more precise distance algorithm

Christian Gollwitzer
In reply to this post by Steven D'Aprano-8
Am 26.05.15 um 05:11 schrieb Steven D'Aprano:

> mismatch after 3 trials
> naive: 767.3916150255787
> alternate: 767.3916150255789
> hypot: 767.3916150255787
>
>
> which shows that:
>
> (1) It's not hard to find mismatches;
> (2) It's not obvious which of the three methods is more accurate.


The main problem is not necessarily precision. A square root is a very
precise operation in floating point math, the relative precision
*increases* by sqrt. The big problem is overflow. Take e.g. a=3*10^160,
b=4*10^160, then the exact result is c=5*10^160. But:

 >>> a=3e160
 >>> b=4e160
 >>> math.sqrt(a**2+b**2)
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
OverflowError: (34, 'Result too large')
 >>> math.hypot(a,b)
5e+160


        Christian




Reply | Threaded
Open this post in threaded view
|

a more precise distance algorithm

ravas
In reply to this post by ravas
On Monday, May 25, 2015 at 10:16:02 PM UTC-7, Gary Herron wrote:

> It's probably not the square root that's causing the inaccuracies. In
> many other cases, and probably here also, it's the summing of two
> numbers that have vastly different values that loses precision.  A
> demonstration:
>
>  >>> big = 100000000.0
>  >>> small = 0.000000001
>  >>> (big+small)-big # Should produce a value =small, but gives an exact
> zero instead.
> 0.0
>
> The squaring of the two values in x*x+y*y just makes the addition even
> more error prone since the squares make large values even larger and
> small values even smaller.

I appreciate your help.


Reply | Threaded
Open this post in threaded view
|

a more precise distance algorithm

random832@fastmail.us
In reply to this post by ravas
On Mon, May 25, 2015, at 15:21, ravas wrote:
> Is this valid? Does it apply to python?
> Any other thoughts? :D

The math.hypot function uses the C library's function which should deal
with such concerns internally. There is a fallback version in case the C
library does not have this function, in Python/pymath.c - which,
incidentally, does use your algorithm.


Reply | Threaded
Open this post in threaded view
|

a more precise distance algorithm

random832@fastmail.us
On Tue, May 26, 2015, at 09:40, random832 at fastmail.us wrote:
> On Mon, May 25, 2015, at 15:21, ravas wrote:
> > Is this valid? Does it apply to python?
> > Any other thoughts? :D
>
> The math.hypot function uses the C library's function which should deal
> with such concerns internally. There is a fallback version in case the C
> library does not have this function, in Python/pymath.c - which,
> incidentally, does use your algorithm.

Well, I should say, not _precisely_ your algorithm.

The "0 special case" mentioned in the text you read was for both values
being zero, not just one. The biggest flaw in your function, though, was
the failure to take the absolute values of the differences. This defeats
the point of swapping them (which I assume is to get the magnitudes in
the order needed for best precision), and makes it possible for your
function to return a negative value when the other is zero.

Here's the equivalent python code for the hypot function in pymath.c,
and for your distance function.

from math import sqrt

def hypot(x, y):
    x = abs(x)
    y = abs(y)
    if x < y:
        x, y = y, x
    if x == 0:  # both are 0 due to the swap
        return 0.0
    else:
        return x*sqrt(1.0 + (y/x)**2)

def distance(A, B):
    return hypot(A.x-B.x, A.y-B.y)

What I wonder is if there's a best way to do it for three dimensions. I
mean, you could simply do hypot(hypot(dx, dy), dz), but should you
choose the largest, smallest, or middle value to be the odd one out?


Reply | Threaded
Open this post in threaded view
|

a more precise distance algorithm

Robin Becker
A minor point is that if you just need to compare distances you don't need to
compute the hypotenuse, its square will do so no subtractions etc etc.
--
Robin Becker



Reply | Threaded
Open this post in threaded view
|

a more precise distance algorithm

Brian Blais-2
In reply to this post by Steven D'Aprano-8
On Mon, May 25, 2015 at 11:11 PM, Steven D'Aprano <steve at pearwood.info> wrote:

>
> Let's compare three methods.
>
> def naive(a, b):
>     return math.sqrt(a**2 + b**2)
>
> def alternate(a, b):
>     a, b = min(a, b), max(a, b)
>     if a == 0:  return b
>     if b == 0:  return a
>     return a * math.sqrt(1 + b**2 / a**2)


>     d1 = naive(a, b)
>     d2 = alternate(a, b)
>     d3 = math.hypot(a, b)
>

> which shows that:
>
> (1) It's not hard to find mismatches;
> (2) It's not obvious which of the three methods is more accurate.
>

Bottom line: they all suck.  :)

I ran the program you posted, and, like you, got the following two examples:

for fun in [naive, alternate, math.hypot]:
    print '%.20f' % fun(222.44802484683657,680.255801504161)

715.70320611153294976248
715.70320611153283607564
715.70320611153283607564

and

for fun in [naive, alternate, math.hypot]:
    print '%.20f' % fun(376.47153302262484,943.1877995550265)

1015.54617837194291496417
1015.54617837194280127733
1015.54617837194291496417

but when comparing to Wolfram Alpha, which calculates these out many
more decimal places, we have for the two cases:

715.7032061115328768204988784125331443593766145937358347357252...
715.70320611153294976248
715.70320611153283607564
715.70320611153283607564

1015.546178371942943007625196455666280385821355370154991424749...
1015.54617837194291496417
1015.54617837194280127733
1015.54617837194291496417

where all of the methods deviate at the 13/14 decimal place.



bb


--
-----------------

             bblais at gmail.com
             http://web.bryant.edu/~bblais


Reply | Threaded
Open this post in threaded view
|

a more precise distance algorithm

Oscar Benjamin-2
On 27 May 2015 at 19:00, Brian Blais <bblais at gmail.com> wrote:

> On Mon, May 25, 2015 at 11:11 PM, Steven D'Aprano <steve at pearwood.info> wrote:
>>
>> Let's compare three methods.
>>
>> def naive(a, b):
>>     return math.sqrt(a**2 + b**2)
>>
>> def alternate(a, b):
>>     a, b = min(a, b), max(a, b)
>>     if a == 0:  return b
>>     if b == 0:  return a
>>     return a * math.sqrt(1 + b**2 / a**2)
>
>
>>     d1 = naive(a, b)
>>     d2 = alternate(a, b)
>>     d3 = math.hypot(a, b)
>>
>
>> which shows that:
>>
>> (1) It's not hard to find mismatches;
>> (2) It's not obvious which of the three methods is more accurate.
>>
>
> Bottom line: they all suck.  :)
>
> I ran the program you posted, and, like you, got the following two examples:
>
> for fun in [naive, alternate, math.hypot]:
>     print '%.20f' % fun(222.44802484683657,680.255801504161)
>
> 715.70320611153294976248
> 715.70320611153283607564
> 715.70320611153283607564
>
> and
>
> for fun in [naive, alternate, math.hypot]:
>     print '%.20f' % fun(376.47153302262484,943.1877995550265)
>
> 1015.54617837194291496417
> 1015.54617837194280127733
> 1015.54617837194291496417
>
> but when comparing to Wolfram Alpha, which calculates these out many
> more decimal places, we have for the two cases:
>
> 715.7032061115328768204988784125331443593766145937358347357252...
> 715.70320611153294976248
> 715.70320611153283607564
> 715.70320611153283607564
>
> 1015.546178371942943007625196455666280385821355370154991424749...
> 1015.54617837194291496417
> 1015.54617837194280127733
> 1015.54617837194291496417
>
> where all of the methods deviate at the 13/14 decimal place.

So they have 12/13 correct digits after the decimal point. Including
the digits before the decimal point they all have 15/16 correct
decimal digits. This is exactly what you should expect when using
double precision floating point. Feel free to print as many digits as
you like but the extra ones won't add any more accuracy.

The difference between the answers you showed are just 1 ULP:
>>> a = 715.70320611153283607564
>>> b = 715.70320611153294976248
>>> a
715.7032061115328
>>> a + 5e-14
715.7032061115328
>>> a + 6e-14
715.703206111533
>>> a + 6e-14 == b
True

Since you don't show the source numbers we can't know whether the true
result lies between these two or not.


Oscar