Discussion:
Transcendental floating point functions are now unfixably broken on Intel processors
Yousuf Khan
2014-10-10 10:58:43 UTC
Permalink
" This error has tragically become un-fixable because of the
compatibility requirements from one generation to the next. The fix for
this problem was figured out quite a long time ago. In the excellent
paper The K5 transcendental functions by T. Lynch, A. Ahmed, M. Schulte,
T. Callaway, and R. Tisdale a technique is described for doing argument
reduction as if you had an infinitely precise value for pi. As far as I
know, the K5 is the only x86 family CPU that did sin/cos accurately. AMD
went back to being bit-for-bit compatible with the old x87 behavior,
presumably because too many applications broke. Oddly enough, this is
fixed in Itanium.

What we do in the JVM on x86 is moderately obvious: we range check the
argument, and if it's outside the range [-pi/4, pi/4] we do the precise
range reduction by hand, and then call fsin.

So Java is accurate, but slower. I've never been a fan of "fast, but
wrong" when "wrong" is roughly random(). Benchmarks rarely test
accuracy. "double sin(double theta) { return 0; }" would be a great
benchmark-compatible implementation of sin(). For large values of theta,
0 would be arguably more accurate since the absolute error is never
greater than 1. fsin/fcos can have absolute errors as large as 2
(correct answer=1; returned result=-1). "

https://blogs.oracle.com/jag/entry/transcendental_meditation
Mark F
2014-10-10 13:52:05 UTC
Permalink
On Fri, 10 Oct 2014 06:58:43 -0400, Yousuf Khan
Post by Yousuf Khan
" This error has tragically become un-fixable because of the
compatibility requirements from one generation to the next. The fix for
this problem was figured out quite a long time ago. In the excellent
paper The K5 transcendental functions by T. Lynch, A. Ahmed, M. Schulte,
T. Callaway, and R. Tisdale a technique is described for doing argument
reduction as if you had an infinitely precise value for pi. As far as I
know, the K5 is the only x86 family CPU that did sin/cos accurately. AMD
went back to being bit-for-bit compatibile with the old x87 behavior,
assumably because too many applications broke. Oddly enough, this is
fixed in Itanium.
What we do in the JVM on x86 is moderately obvious: we range check the
argument, and if it's outside the range [-pi/4, pi/4]we do the precise
range reduction by hand, and then call fsin.
So Java is accurate, but slower. I've never been a fan of "fast, but
wrong" when "wrong" is roughly random(). Benchmarks rarely test
accuracy. "double sin(double theta) { return 0; }" would be a great
benchmark-compatible implementation of sin(). For large values of theta,
0 would be arguably more accurate since the absolute error is never
greater than 1. fsin/fcos can have absolute errors as large as 2
(correct answer=1; returned result=-1). "
https://blogs.oracle.com/jag/entry/transcendental_meditation
I wanted to see what the algorithm was, so I found the
paper:
"The K5 Transcendental Functions"
Tom Lynch, Ashraf Ahmed, Mike Schulte,
Tom Callaway, and Robert Tisdale.
"ARITH '95 Proceedings of the 12th
Symposium on Computer Arithmetic", pages 163-167, 1995
ISBN:0-8186-7089-4

https://www.researchgate.net/publication/3612479_The_K5_transcendental_functions

The paper describes an elegant algorithm for argument reduction.

However, if I am reading things correctly,
"2.1 Multiprecision Arithmetic" (page 164)
says the arguments have at most 88 bits of precision.

The range reduction is done, per "2 Algorithms" (page 164),
from [-2^63, 2^63] to [-pi/4, pi/4].

Because of the allowable range of the pre-reduction
arguments, only about 21 bits (=88-63) of precision remain.

In particular, at the extremes of the pre-reduction
argument range, while some function values can retain slightly more
than 21 bits of precision, others retain much less. For example,
tan(x) near pi/4 has fewer than 21 bits of precision.

Another example of bad use of argument reduction was in the
VAX VMS math library circa 1978.

I couldn't find the earlier pre-1980 paper that
describes the argument reduction algorithm used in detail,
but I did find a reference in the ACM Digital Library (dl.acm.org):
"Radian reduction for trigonometric functions"
Authors: Mary H. Payne, Robert N. Hanek
"ACM SIGNUM Newsletter"
Volume 18 Issue 1, January 1983, Pages 19 - 24

The math library described in the paper uses the constant
pi/4 stored to about 32768 bits to do an elegant
argument reduction for arguments up to more than 2^16000 radians.

Once again, the problem is that the arguments are not
exact, but rather truncated numbers. The VAX-11 H floating point
format has about 113 bits of precision, so almost any argument
that is the result of a computation carries only 113 significant bits.
Any pre-reduction argument larger than about 2^113 could therefore
represent essentially any angle, and raising
a loss-of-significance error is the best action that
the library could have taken.
d***@gmail.com
2015-01-22 00:27:58 UTC
Permalink
Post by Yousuf Khan
" This error has tragically become un-fixable because of the
compatibility requirements from one generation to the next. The fix for
this problem was figured out quite a long time ago. In the excellent
paper The K5 transcendental functions by T. Lynch, A. Ahmed, M. Schulte,
T. Callaway, and R. Tisdale a technique is described for doing argument
reduction as if you had an infinitely precise value for pi. As far as I
know, the K5 is the only x86 family CPU that did sin/cos accurately. AMD
went back to being bit-for-bit compatibile with the old x87 behavior,
assumably because too many applications broke. Oddly enough, this is
fixed in Itanium.
What we do in the JVM on x86 is moderately obvious: we range check the
argument, and if it's outside the range [-pi/4, pi/4]we do the precise
range reduction by hand, and then call fsin.
So Java is accurate, but slower. I've never been a fan of "fast, but
wrong" when "wrong" is roughly random(). Benchmarks rarely test
accuracy. "double sin(double theta) { return 0; }" would be a great
benchmark-compatible implementation of sin(). For large values of theta,
0 would be arguably more accurate since the absolute error is never
greater than 1. fsin/fcos can have absolute errors as large as 2
(correct answer=1; returned result=-1). "
https://blogs.oracle.com/jag/entry/transcendental_meditation
Wow, you're still here. I haven't peeked at comp.chips in years, maybe a decade. Is Keith / KRW still around? I haven't seen or heard from him since he retired. I see John Corse is still around, same-old-same-old.

To be on-topic, it's interesting to see the transcendentals broken on Intel. I'm looking into AMD's HSA, and though the math can be double-precision, I'd heard that transcendentals were fudged single-precision. I'd thought of Intel as the gold standard on this, at least after the integer bruising was fixed.

Oops.
Yousuf Khan
2015-01-22 04:25:35 UTC
Permalink
Post by d***@gmail.com
Wow, you're still here. I haven't peeked at comp.chips in years,
maybe a decade. Is Keith / KRW still around? I haven't seen or
heard from him since he retired. I see John Corse is still around,
same-old-same-old.
Yeah, I check into it from time to time. At least it's still on my
newsgroups list. After it's done filtering out all of the spam, I might
see one posting in 3 months here on average.
Post by d***@gmail.com
To be on-topic, it's interesting to see the transcendentals broken
on Intel. I'm looking into AMD's HSA, and though the math can be
double-precision, I'd heard that transcendentals were fudged
single-precision. I'd thought of Intel as the gold standard on
this, at least after the integer bruising was fixed.
I think these days the transcendentals are all emulated in software
anyhow, so the precision now depends on how bug-free the floating point
libraries are, not how bug-free the hardware microcode is. AMD64 has
effectively retired the x87 floating point unit; it's been replaced by
the SSE2 and higher system, so using the hardware transcendentals is
not an option, since transcendentals are not part of the SSE specs. All
higher-level floating point functions are now carried out by software.
So in a sense, the RISC idea of keeping complex functions to a minimum
has won out, at least on the floating point side of x86.

Yousuf Khan
k***@attt.bizz
2015-01-22 23:57:06 UTC
Permalink
Post by d***@gmail.com
Post by Yousuf Khan
" This error has tragically become un-fixable because of the
compatibility requirements from one generation to the next. The fix for
this problem was figured out quite a long time ago. In the excellent
paper The K5 transcendental functions by T. Lynch, A. Ahmed, M. Schulte,
T. Callaway, and R. Tisdale a technique is described for doing argument
reduction as if you had an infinitely precise value for pi. As far as I
know, the K5 is the only x86 family CPU that did sin/cos accurately. AMD
went back to being bit-for-bit compatibile with the old x87 behavior,
assumably because too many applications broke. Oddly enough, this is
fixed in Itanium.
What we do in the JVM on x86 is moderately obvious: we range check the
argument, and if it's outside the range [-pi/4, pi/4]we do the precise
range reduction by hand, and then call fsin.
So Java is accurate, but slower. I've never been a fan of "fast, but
wrong" when "wrong" is roughly random(). Benchmarks rarely test
accuracy. "double sin(double theta) { return 0; }" would be a great
benchmark-compatible implementation of sin(). For large values of theta,
0 would be arguably more accurate since the absolute error is never
greater than 1. fsin/fcos can have absolute errors as large as 2
(correct answer=1; returned result=-1). "
https://blogs.oracle.com/jag/entry/transcendental_meditation
Wow, you're still here. I haven't peeked at comp.chips in years, maybe a decade. Is Keith / KRW still around? I haven't seen or heard from him since he retired. I see John Corse is still around, same-old-same-old.
Hi Dale,

I'm still "around" but there hasn't been much activity here for a
decade or so. Oh, and I'm un-retired. Completely different industry,
though. ...doing mostly analog design. ;-)
Post by d***@gmail.com
To be on-topic, it's interesting to see the transcendentals broken on Intel. I'm looking into AMD's HSA, and though the math can be double-precision, I'd heard that transcendentals were fudged single-precision. I'd thought of Intel as the gold standard on this, at least after the integer bruising was fixed.
Oops.
Yousuf Khan
2015-01-24 18:44:23 UTC
Permalink
Post by k***@attt.bizz
Hi Dale,
I'm still "around" but there hasn't been much activity here for a
decade or so. Oh, and I'm un-retired. Completely different industry,
though. ...doing mostly analog design. ;-)
These days there are other newsgroups that are more active on the PC
front, such as:

alt.comp.hardware.pc-homebuilt

You guys should check it out, add it to your newsgroup list.

Yousuf Khan
k***@attt.bizz
2015-01-24 22:43:20 UTC
Permalink
On Sat, 24 Jan 2015 13:44:23 -0500, Yousuf Khan
Post by Yousuf Khan
Post by k***@attt.bizz
Hi Dale,
I'm still "around" but there hasn't been much activity here for a
decade or so. Oh, and I'm un-retired. Completely different industry,
though. ...doing mostly analog design. ;-)
These days there are other newsgroups that are more active on the PC
alt.comp.hardware.pc-homebuilt
You guys should check it out, add it to your newsgroup list.
The last PC I built was an original Opteron, if that tells you
anything. ;-)
Yousuf Khan
2015-01-27 19:11:51 UTC
Permalink
Post by k***@attt.bizz
On Sat, 24 Jan 2015 13:44:23 -0500, Yousuf Khan
Post by Yousuf Khan
These days there are other newsgroups that are more active on the PC
alt.comp.hardware.pc-homebuilt
You guys should check it out, add it to your newsgroup list.
The last PC I built was an original Opteron, if that tells you
anything. ;-)
Well, I can trace my current desktop all the way back to my
first ever 8088 PC-XT clone. It's been upgraded
continuously ever since, component by component.

Yousuf Khan
k***@attt.bizz
2015-01-27 23:42:58 UTC
Permalink
On Tue, 27 Jan 2015 14:11:51 -0500, Yousuf Khan
Post by Yousuf Khan
Post by k***@attt.bizz
On Sat, 24 Jan 2015 13:44:23 -0500, Yousuf Khan
Post by Yousuf Khan
These days there are other newsgroups that are more active on the PC
alt.comp.hardware.pc-homebuilt
You guys should check it out, add it to your newsgroup list.
The last PC I built was an original Opteron, if that tells you
anything. ;-)
Well, I can trace my current desktop all the way back to my
first ever 8088 PC-XT clone. It's been upgraded
continuously ever since, component by component.
The difference is that my Opteron system has never been upgraded. It
holds down the floor just as well as it did 12 years ago. It uses a
lot less power, these days, though.
Yousuf Khan
2015-01-28 02:23:06 UTC
Permalink
Post by k***@attt.bizz
On Tue, 27 Jan 2015 14:11:51 -0500, Yousuf Khan
Post by Yousuf Khan
Well, I can trace my current desktop all the way back to my
first ever 8088 PC-XT clone. It's been upgraded
continuously ever since, component by component.
The difference is that my Opteron system has never been upgraded. It
holds down the floor just as well as it did 12 years ago. It uses a
lot less power, these days, though.
Well, I still use my desktop daily, it's my most used computer. It's
also scheduled for its next mini-upgrade in a few days or weeks. I'm
going to be installing a water cooler to it, and then I'm going to be
overclocking it. The current system is using a Phenom II X6 1100T, which
is overclocking-ready. I did overclock it slightly back when I first got
it, using its stock cooler. I did not really need the extra speed and
decided to keep it at stock speed. I'm expecting that if I overclock it
with water, I should be good with this current processor for another 2
years or so.

Yousuf Khan
k***@attt.bizz
2015-01-29 01:13:24 UTC
Permalink
On Tue, 27 Jan 2015 21:23:06 -0500, Yousuf Khan
Post by Yousuf Khan
Post by k***@attt.bizz
On Tue, 27 Jan 2015 14:11:51 -0500, Yousuf Khan
Post by Yousuf Khan
Well, I can trace my current desktop all the way back to my
first ever 8088 PC-XT clone. It's been upgraded
continuously ever since, component by component.
The difference is that my Opteron system has never been upgraded. It
holds down the floor just as well as it did 12 years ago. It uses a
lot less power, these days, though.
Well, I still use my desktop daily, it's my most used computer. It's
also scheduled for its next mini-upgrade in a few days or weeks. I'm
going to be installing a water cooler to it, and then I'm going to be
overclocking it. The current system is using a Phenom II X6 1100T, which
is overclocking-ready. I did overclock it slightly back when I first got
it, using its stock cooler. I did not really need the extra speed and
decided to keep it at stock speed. I'm expecting that if I overclock it
with water, I should be good with this current processor for another 2
years or so.
I have *many* things that take up my time other than upgrading
computers. I got out of that completely when I bought my first
laptop. Computers have gotten boring. They're just another tool.
Yousuf Khan
2015-01-29 13:20:21 UTC
Permalink
Post by k***@attt.bizz
I have *many* things that take up my time other than upgrading
computers. I got out of that completely when I bought my first
laptop. Computers have gotten boring. They're just another tool.
I make a semi-major upgrade once every other year on my systems. I have
a laptop too, they're definitely boring, although I've upgraded that one
too. But on laptops you're limited to upgrading only the RAM and hard disk.
On the laptop, I've upgraded both RAM and storage, from 4GB to 8GB, and
from an HDD to an SSD. The upgrade to a water cooler on my desktop will
allow me to avoid upgrading the processor for several more years.
Doesn't really occupy too much more of my time beyond that.

Yousuf Khan
k***@attt.bizz
2015-01-30 01:47:28 UTC
Permalink
On Thu, 29 Jan 2015 08:20:21 -0500, Yousuf Khan
Post by Yousuf Khan
Post by k***@attt.bizz
I have *many* things that take up my time other than upgrading
computers. I got out of that completely when I bought my first
laptop. Computers have gotten boring. They're just another tool.
I make a semi-major upgrade once every other year on my systems. I have
a laptop too, they're definitely boring, although I've upgraded that one
too. But on laptops you're limited to upgrading only the RAM and hard disk.
On the laptop, I've upgraded both RAM and storage, from 4GB to 8GB, and
from an HDD to an SSD. The upgrade to a water cooler on my desktop will
allow me to avoid upgrading the processor for several more years.
Doesn't really occupy too much more of my time beyond that.
My point is "why". Computers are boring, not just laptops.
Yousuf Khan
2015-01-31 07:06:24 UTC
Permalink
Post by k***@attt.bizz
My point is "why". Computers are boring, not just laptops.
I'm not upgrading it to make it exciting, I'm upgrading it to keep it
useful.

Yousuf Khan
k***@attt.bizz
2015-02-01 00:22:06 UTC
Permalink
On Sat, 31 Jan 2015 02:06:24 -0500, Yousuf Khan
Post by Yousuf Khan
Post by k***@attt.bizz
My point is "why". Computers are boring, not just laptops.
I'm not upgrading it to make it exciting, I'm upgrading it to keep it
useful.
There isn't enough interesting *to* upgrade anymore. Times have
changed.
