libmpdecimal


libmpdecimal

John Ralls-2
The libmpdecimal branch now passes make check all the way through. I've force-pushed a rebase on the latest master to https://github.com/jralls/gnucash.git for anyone who'd like to play with it.

Next step is to do some tuning to see how much I can shrink it while getting full 64-bit coefficients (current tests limit it to 44 bits), then profiling to see if there are any performance differences relative to master.

Regards,
John Ralls



Re: libmpdecimal

Geert Janssens
On Saturday 19 July 2014 11:53:37 John Ralls wrote:

> [...]

Nice! Keep up the good work.

I'm curious to hear about the performance difference. Hopefully the performance will be better :)

Geert

Re: libmpdecimal

John Ralls-2

On Jul 31, 2014, at 7:13 AM, Geert Janssens <[hidden email]> wrote:

> [...]
>
> Nice! Keep up the good work.
>  
> I'm curious to hear about the performance difference. Hopefully the performance will be better :)

Libmpdecimal is about 25% slower as it is now, but I see some good optimization opportunities from the profile. I'm also looking at the Intel and GCC/ICU versions to see if they might prove faster, since mpdecimal is written specifically for Python and so has some extra overhead that doesn't seem to be present in the others.

The Intel version is particularly interesting because it uses a different encoding scheme and dispenses with contexts, both of which they claim afford much faster execution. Unfortunately their code is rather impenetrable and the documentation is sparse and difficult to understand. It also doesn't appear to expose an interface that can be used to extract a rational expression of the number, so it would require more code changes in the rest of GnuCash to be usable.

Removing the 44-bit clamp and increasing the range of the denominators from 10^6 to 10^9 passes all tests except test-lots, which fails because it's unable to balance the lots in complex cases. I'm still debugging that.

Regards,
John Ralls


Re: libmpdecimal

John Ralls-2

On Jul 31, 2014, at 9:38 AM, John Ralls <[hidden email]> wrote:

> [...]

So, having gotten test-lots and all of the other tests working* with libmpdecimal, I studied the Intel library for several days and couldn't figure out how to make it work, so I decided to try the GCC implementation, which offers a fixed-size 128-bit IEEE 754 decimal format. Since it doesn't ever call malloc, I thought it might prove faster, and indeed it is. I haven't finished integrating it -- the library doesn't provide formatted printing -- but it's far enough along that it passes all of the engine and backend tests. Some results:

test-numeric, with NREPS increased to 20000 to get a reasonable execution time for profiling:
    master     9645ms
    mpDecimal 21410ms
    decNumber 12985ms

test-lots:
    master      16300ms
    mpDecimal   20203ms
    decNumber   19044ms

The first shows the relative speed in more or less pure computation, the latter shows the overall impact on one of the longer-running tests that does a lot of other stuff.
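
For the curious, here's a minimal sketch of what 34-digit decimal arithmetic looks like through the decNumber API that the GCC/ICU implementations build on. It's illustrative only: it assumes the decNumber headers and library are installed, and the identifiers follow decNumber's public C API.

    // Sketch: decimal128-precision arithmetic with an explicit decContext.
    // DECNUMDIGITS must be defined before the include; it defaults to 1.
    #define DECNUMDIGITS 34
    extern "C" {
    #include "decNumber.h"
    }
    #include <cstdio>

    int main()
    {
        decContext ctx;
        decContextDefault(&ctx, DEC_INIT_DECIMAL128); // 34 digits, round-half-even

        decNumber price, shares, value;
        decNumberFromString(&price, "25.0537", &ctx);
        decNumberFromString(&shares, "1234.567", &ctx);
        decNumberMultiply(&value, &price, &shares, &ctx);

        char buf[DECNUMDIGITS + 14]; // sized per the decNumber documentation
        decNumberToString(&value, buf);
        std::printf("value = %s\n", buf);
        return 0;
    }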

I haven't investigated Christian's other suggestion of aggressive rounding to eliminate the overflow issue to make room for larger denominators, nor my original idea of replacing gnc_numeric with boost::rational atop a multi-precision class (either boost::mp or gmp). I have noticed that we're doing some dumb things with Scheme, like using double as an intermediate when converting from Scheme numbers to gnc_numeric (Scheme numbers are also rational, so the conversion should be direct) and representing gnc_numerics as a tuple (num, denom) instead of just using Scheme rationals. Neither will work for decimal floats, of course; the whole class will have to be wrapped so that computation takes place in C++. Storage in SQL is also an issue, as is maintaining backward file compatibility.
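
To illustrate the direct conversion, a sketch against Guile's C API (scm_numerator, scm_denominator, and scm_to_int64 are real Guile calls; the struct is a simplified stand-in for gnc_numeric, and bignum overflow handling is elided):

    #include <libguile.h>
    #include <cstdint>

    struct numeric_sketch { std::int64_t num; std::int64_t denom; };

    // Convert a Scheme number to a rational directly -- no double intermediate.
    static numeric_sketch scm_to_numeric_direct(SCM x)
    {
        // Floats become exact rationals; exact rationals pass through unchanged.
        SCM exact = scm_inexact_to_exact(x);
        numeric_sketch n;
        n.num   = scm_to_int64(scm_numerator(exact));   // a bignum would error here
        n.denom = scm_to_int64(scm_denominator(exact));
        return n;
    }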

Another issue is equality: In order to get tests to pass I've had to implement a fuzzy comparison where both numbers are first rounded to the smaller number of decimal places -- 2 fewer if there are 12 or more -- and compared with two roundings, first truncation and second "bankers", and declared unequal only if they're unequal in both. I hate this, but it seems to be necessary to obtain equality when dealing with large divisors (as when computing prices or interest rates). I suspect that we'd have to do something similar if we pursue aggressive rounding to avoid overflows, but the only way to know for certain is to try.
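
In outline the comparison looks something like the sketch below, with double standing in for the decimal type and the helper names invented for illustration:

    #include <algorithm>
    #include <cmath>

    enum class Round { Truncate, Bankers };

    // Round x to 'places' decimal places; positive values assumed for brevity.
    static double quantize(double x, int places, Round how)
    {
        const double scale = std::pow(10.0, places);
        const double scaled = x * scale;
        if (how == Round::Truncate)
            return std::trunc(scaled) / scale;
        // Banker's rounding: halves go to the nearest even integer.
        double ipart = std::floor(scaled);
        const double frac = scaled - ipart;
        if (frac > 0.5 || (frac == 0.5 && std::fmod(ipart, 2.0) != 0.0))
            ipart += 1.0;
        return ipart / scale;
    }

    static bool fuzzy_equal(double a, int places_a, double b, int places_b)
    {
        int places = std::min(places_a, places_b);
        if (places >= 12)
            places -= 2;  // two fewer when there are 12 or more
        const bool eq_trunc = quantize(a, places, Round::Truncate)
                           == quantize(b, places, Round::Truncate);
        const bool eq_bank  = quantize(a, places, Round::Bankers)
                           == quantize(b, places, Round::Bankers);
        return eq_trunc || eq_bank;  // unequal only if unequal under both roundings
    }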

I've force-pushed both branches to my github repo for the curious; beware that ATM neither passes "make check".

Regards,
John Ralls

* That didn't last long, though: The latest rebase onto master broke some of the tests. I haven't fixed them yet because I wanted to get this profiling data done.




Re: libmpdecimal

Geert Janssens
On Saturday 23 August 2014 18:01:15 John Ralls wrote:

> [...]
John,

Thanks for implementing this and running the tests. The topic was last touched before my holidays, so it took me a while to refresh my memory...

decNumber clearly performs better, although both implementations lag behind our current gnc_numeric performance.

>
> I haven't investigated Christian's other suggestion of aggressive
> rounding to eliminate the overflow issue to make room for larger
> denominators, nor my original idea of replacing gnc_numeric with
> boost::rational atop a multi-precision class (either boost::mp or
> gmp).
Do you still have plans for either?

I suppose aggressive rounding is orthogonal to the choice of data type. Christian's argument that we should round as is expected in the financial world makes sense to me, but that argument does not imply any particular underlying data type.

How about the boost::rational option?

> I have noticed that we're doing some dumb things with Scheme,
> like using double as an intermediate when converting from Scheme
> numbers to gnc_numeric (Scheme numbers are also rational, so the
> conversion should be direct) and representing gnc_numerics as a tuple
> (num, denom) instead of just using Scheme rationals.
Does this mean you see potential performance gains here if we clean up the C<->Scheme number conversions?

> Neither will
> work for decimal floats, of course; the whole class will have to be
> wrapped so that computation takes place in C++.
Which means some performance drop again...

> Storage in SQL is
> also an issue,
From the previous conversation I recall that sqlite doesn't have a decimal type, so we can't run calculating queries on it directly.

But how about the other two, mysql and postgresql? Is the decimal type you're using in your tests directly compatible with the decimal data types in mysql and postgresql, or compatible enough to convert automatically between them?

> as is maintaining backward file compatibility.
>
> Another issue is equality: In order to get tests to pass I've had to
> implement a fuzzy comparison where both numbers are first rounded to
> the smaller number of decimal places -- 2 fewer if there are 12 or
> more -- and compared with two roundings, first truncation and second
> "bankers", and declared unequal only if they're unequal in both. I
> hate this, but it seems to be necessary to obtain equality when
> dealing with large divisors (as when computing prices or interest
> rates). I suspect that we'd have to do something similar if we pursue
> aggressive rounding to avoid overflows, but the only way to know for
> certain is to try.
Ugh. :(

So what's the current balance?

I see the following pros and cons from your tests so far:

Pro:
- using a decimal type gives us more precision

Con:
- sqlite doesn't have a decimal data type, so as it currently stands we can't run calculations in queries in that database type
- we lose backward/forward compatibility with earlier versions of GnuCash
- decNumber or mpDecimal are new dependencies
- their performance is currently less than the original gnc_numeric
- guile doesn't know of a decimal data type so we may need some conversion glue
- equality is fuzzy

Please add if I forgot arguments on either side.

Arguably many of the con arguments can be solved. That will take effort, however. And I consider the first two more important than the others.

So do you think the benefits (I assume there will be more than the one I mentioned) will outweigh the drawbacks? Does the work that will go into it bring GnuCash enough value to continue on this track?

It's probably too early to tell for sure, but I wanted to get your ideas based on what we have so far.

Geert

Re: libmpdecimal

John Ralls-2

On Aug 27, 2014, at 8:32 AM, Geert Janssens <[hidden email]> wrote:

> [...]

Testing boost::rational is next on the agenda. My original idea was to use it with boost::multiprecision or gmp, but I'd prefer something that doesn't depend on heap allocations because it's so much slower than stack allocation and must be passed by pointer, which is a major change in the API -- meaning a ton of cleanup work up front. I think I'll do a straight substitution of the existing math128 with boost::rational<int64_t> just to see what happens.
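
The substitution I have in mind looks roughly like this -- a sketch only; note that boost::rational keeps values reduced to lowest terms and provides no overflow detection of its own:

    #include <boost/rational.hpp>
    #include <cstdint>
    #include <iostream>

    int main()
    {
        using Rat = boost::rational<std::int64_t>;
        Rat price(105, 100);        // stored reduced, as 21/20
        Rat shares(1234567, 1000);
        Rat value = price * shares; // num/denom can silently overflow int64_t
        std::cout << value.numerator() << "/" << value.denominator() << "\n";
        return 0;
    }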

I think that part of implementing immediate rounding must include constraining denominators to powers of ten. The main reason is that it makes my head hurt when I try to think about how to do rounding with arbitrary denominators. If you consider that a big chunk of the overflow problems arise from denominators and divisors that are large primes, it becomes quickly apparent that avoiding large prime denominators might well resolve much of the problem. It's also true that real-world numbers, as opposed to the randomly generated numbers in tests, all have powers-of-ten denominators. We'd still have many-digit-prime divisors to deal with, but constraining denominators gives us something to round to. Does that make sense, or does it seem the rambling of a lunatic? This really does make my head hurt.
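
To make that concrete, rounding an arbitrary rational to a power-of-ten denominator could look like this sketch (plain int64 arithmetic, positive values assumed, overflow checks elided):

    #include <cstdint>

    // Round num/den to the nearest q/pow10_denom, half to even.
    static std::int64_t round_to_pow10(std::int64_t num, std::int64_t den,
                                       std::int64_t pow10_denom)
    {
        const std::int64_t scaled = num * pow10_denom; // may overflow; see above
        std::int64_t q = scaled / den;
        const std::int64_t r = scaled % den;
        if (2 * r > den || (2 * r == den && q % 2 != 0))
            ++q;                                       // round half to even
        return q;                                      // result is q / pow10_denom
    }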

I'd modify your "pro" summary to "using a decimal type gives us more significant digits without overflow while maintaining a 128-bit stack-allocatable object size". There are a lot of ways to get more precision, and the necessity of using fuzzy equality rather suggests that we're not really getting more precision with decimal floating point, we're just getting more significant digits. That sounds a bit weird, so I'll offer an example: With a rational number, one can represent 1/3 exactly, but if one constrains the denominator to powers of ten, either as a computational rule with rationals or by using a decimal representation, then one can only approximate it with a numerator of as many 3s as will fit in the numerator's type and the equivalent power of ten in the denominator. Decimal floats are a bit more efficient because instead of representing the denominator directly as an int they can represent it as an exponent; that allows more bits to be assigned to the numerator, increasing precision *for any particular object size*. But it doesn't bother me much to have to use 256 bits instead of 128 to store enough bits to deal with Bitcoin and mutual funds vainly trying to get a more accurate representation of 1/3, as long as I don't have to write and maintain the math256.cpp required to pull that off.

Equality is going to be fuzzy in any rounding environment. To some extent we're fooling ourselves now: Our tests are carefully written to avoid rounding or to check for predictable rounding in controlled circumstances. I'm not at all convinced that that reflects real life. That said, our current rational-number system gives us a great deal more control over rounding than the decimal-float libraries do.

The SQL representation is another problem. None of our supported DBs support decimal floats or rationals, but rationals can be worked around. There's at least one more round of experiments, maybe two, before it's time to address storage.

That's kind of a rambling answer, and I doubt that it really adds much besides more stuff to think about. I think that getting the numeric representation right is really important, so I'm willing to keep at it for a bit longer, and I don't yet know enough to say which direction is best, or maybe least-bad. I'll start on the boost::rational implementation next week and see where that goes.

Regards,
John Ralls


Re: libmpdecimal

John Ralls-2

On Aug 27, 2014, at 10:31 PM, John Ralls <[hidden email]> wrote:

> [...]

Boost::Rational is a serious disappointment. boost::rational<int64_t> didn't allow a significant increase in precision and is further hampered by not providing any overflow detection. Benchmarks of test-numeric with NREPS set to 20000 (the numbers are a bit different from before because I'm using my Mac Pro instead of my MacBook Air, and because these are debug builds):

Branch                   Tests      Time
master:                  1187558     5346ms
libmpdecimal:            1180076     8718ms
boost-rational, cppint:  1187558    20903ms
boost-rational, gmp:     1187558    34232ms

cppint means boost::multiprecision::checked_int128_t, a 16-byte, stack-allocated multi-precision integer; "checked" means that it throws std::overflow_error instead of wrapping. GMP is the GNU Multiple Precision arithmetic library. It's supposed to be faster than cppint, but its performance is killed by having to malloc everything. The fact that our own C code is substantially faster than any library I've tried is a tribute to Linas.
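
A quick sketch of what "checked" buys, assuming Boost.Multiprecision is available:

    #include <boost/multiprecision/cpp_int.hpp>
    #include <iostream>
    #include <stdexcept>

    int main()
    {
        using boost::multiprecision::checked_int128_t;
        checked_int128_t big = 1;
        try {
            for (int i = 0; i < 200; ++i)
                big *= 10;        // overflows long before 10^200
        } catch (const std::overflow_error& e) {
            std::cout << "overflow detected: " << e.what() << "\n";
        }
        return 0;
    }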

There's another wrinkle: Boost::Rational immediately reduces all numbers to what we called in my grade school "simplest form", meaning no common factors between the numerator and denominator. This actually helps prevent overflows, but it means that we have to be very careful to supply the SCU as the rounding denominator or we'll get unexpected rounding results. Boost::Rational provides no rounding function of its own, so I rewrote gnc_numeric_convert in C++ using the overloaded operators from boost::multiprecision. That at least taught me about rounding arbitrary denominators, so my head doesn't explode any more.

The good news is that using 128-bit numbers for all internal representations, along with aggressive reduction, careful attention to rounding, and a tweak to get_random_gnc_numeric() so that the actual number doesn't exceed 1E13/1, prevents overflow errors during testing, at least up through test-lots.

Looking a bit more at rounding: with only 14 out of 151 gnc_numeric operations in the code base using GNC_HOW_RND_NEVER, it doesn't appear to me that we're over-using it, and I'm not convinced that it would help much to eliminate those cases.

It looks like the best solution is to rework our existing gnc_numeric/math128 implementation so that the internals are always 128-bit and we don't declare overflows prematurely.

But first it’s time to squash some bugs before next week’s release.

Regards,
John Ralls



Re: libmpdecimal

Geert Janssens-4
On Saturday 20 September 2014 18:21:44 John Ralls wrote:
> [...]
Thanks for the update and the elaborate testing.

So... math128 is what we use now, using the rational representation of numbers, do I get that right? And the best option is to stick with it and improve on it? Would you still transform it into C++ so it becomes an object with properties and members?

Geert


Re: libmpdecimal

John Ralls-2

On Sep 24, 2014, at 2:10 AM, Geert Janssens <[hidden email]> wrote:

> [...]
>
> So... math128 is what we use now, using the rational representation of numbers, do I get that right? And the best option is to stick with it and improve on it? Would you still transform it into C++ so it becomes an object with properties and members?

Yes to all.

Regards,
John Ralls



Re: libmpdecimal

Aaron Laws
Dear Mr. Ralls,

Excellent work! I'm happy to hear the results, although, like you, I'm disappointed that boost::rational didn't bring something valuable to the table. I look forward to getting to know that code some day... on an as-needed basis!


In Christ,
Aaron Laws
