From Fabrice Bellard, with minor name change (umulh
):
// return the high 32 bit part of the 64 bit addition of (hi0, lo0) and (hi1, lo1)
Math.iaddh(lo0, hi0, lo1, hi1)
// return the high 32 bit part of the 64 bit subtraction of (hi0, lo0) and (hi1, lo1)
Math.isubh(lo0, hi0, lo1, hi1)
// return the high 32 bit part of the signed 64 bit product of the 32 bit numbers a and b
Math.imulh(a, b)
// return the high 32 bit part of the unsigned 64 bit product of the 32 bit numbers a and b
Math.umulh(a, b)
All these functions convert their argument to 32 bit integers. They return a signed 32 bit integer.
With these functions, the 64 bit operations are easy to implement :
// (hi_res, lo_res) = (hi0, lo0) + (hi1, lo1) (64 bit addition):
lo_res = (lo0 + lo1) | 0;
hi_res = Math.iaddh(lo0, hi0, lo1, hi1);
// (hi_res, lo_res) = (hi0, lo0) - (hi1, lo1) (64 bit subtraction):
lo_res = (lo0 - lo1) | 0;
hi_res = Math.isubh(lo0, hi0, lo1, hi1);
// (hi_res, lo_res) = a * b (signed 64 bit product of 32 bit integers):
lo_res = Math.imul(a, b);
hi_res = Math.imulh(a, b);
// (hi_res, lo_res) = a * b (unsigned 64 bit product of 32 bit integers):
lo_res = Math.imul(a, b);
hi_res = Math.umulh(a, b);
// (hi_res, lo_res) = (hi0, lo0) * (hi1, lo1) (signed 64 bit product of 64 bit integers):
lo_res = Math.imul(lo0, lo1);
hi_res = (Math.imulh(lo0, lo1) + Math.imul(lo0, hi1)) | 0;
hi_res = (hi_res + Math.imul(lo1, hi0)) | 0;
// (hi_res, lo_res) = (hi0, lo0) * (hi1, lo1) (unsigned 64 bit product of 64 bit integers):
lo_res = Math.imul(lo0, lo1);
hi_res = (Math.umulh(lo0, lo1) + Math.imul(lo0, hi1)) | 0;
hi_res = (hi_res + Math.imul(lo1, hi0)) | 0;
Fabrice wrote:
"It is easy for the compiler to optimize the code because only 32 bit integers are used and because the functions have no side effect. Even if the compiler does not remove the duplicate operation for the low 32 bit parts, the overhead is very small on a 32 bit CPU (1 more instruction than the optimal code)."
UPDATE to provide 64x64=64 functions:
// (hi_res, lo_res) = (hi0, lo0) * (hi1, lo1) (signed 64 bit product of 64 bit integers):
lo_res = Math.imul(lo0, lo1);
hi_res = Math.imulx(lo0, hi0, lo1, hi1);
// (hi_res, lo_res) = (hi0, lo0) * (hi1, lo1) (unsigned 64 bit product of 64 bit integers):
lo_res = Math.imul(lo0, lo1);
hi_res = Math.umulx(lo0, hi0, lo1, hi1);
The x
suffix is straw, it could be hx
to connote hi_res
is the result, or something longer.
/be
While these imulh and umulh are pretty ordinary, these iaddh and isubh were a little surprising to me. I expected Math.iaddh and Math.isubh to have two arguments and return 0 or 1, holding the carry value, similar to how imulh and umulh work.
In this way, 64-bit add would be:
lo_res = (lo0 + lo1) | 0;
hi_res = hi0 + hi1 + Math.iaddh(lo0, lo1) | 0;
However, if you're thinking that JITs are going to want to pattern-match the whole thing to actual 64-bit arithmetic, this would make more work for them. But in that case, perhaps we should instead go the other way and make Math.imulh and Math.umulh take 4 inputs too. This would make the 32x32=64 case less nice, but it'd make the 64x64=64 case use less code, and that's the more common case for e.g. Emscripten.
(I'm leaving aside the concern of using carry flags to implement big-integer math here :-)).