[GNC-dev] Convert Imap to Flat

classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|

[GNC-dev] Convert Imap to Flat

Robert Fewell-2
Hi,
I was poking around with the CSV importer and I noticed the following, this
may also be an issue with other importers on first use after creating a new
file...
With a new empty xml file, I used a one line transaction csv file with
appropriate settings and the 'Account' set to 'Assets:Current
Assets:Checking Account' and observed the following when I got to the match
page...

With the 'Checking Account' register open it would partly show the imported
transactions, (this can be fixed by suspending GUI changes which I was
going to propose in a PR) and the register would reload seven times.
This reloading is caused by the triggering of the
'imap_convert_bayes_to_flat' function and as it is a new file you would not
expect it to do any thing but it does. Adding a few print statements I get
the following...

matchmap_find_destination
imap_convert_bayes_to_flat
 convert_imap_account_bayes_to_flat 'Assets'
  gnc_split_register_load called for account 'Assets:Current
Assets:Checking Account' with list of 1
 convert_imap_account_bayes_to_flat 'Current Assets'
  gnc_split_register_load called for account 'Assets:Current
Assets:Checking Account' with list of 1
 convert_imap_account_bayes_to_flat 'Checking Account'
  gnc_split_register_load called for account 'Assets:Current
Assets:Checking Account' with list of 1
 convert_imap_account_bayes_to_flat 'Liabilities'
  gnc_split_register_load called for account 'Assets:Current
Assets:Checking Account' with list of 1
 convert_imap_account_bayes_to_flat 'Income'
  gnc_split_register_load called for account 'Assets:Current
Assets:Checking Account' with list of 1
 convert_imap_account_bayes_to_flat 'Expenses'
  gnc_split_register_load called for account 'Assets:Current
Assets:Checking Account' with list of 1
 convert_imap_account_bayes_to_flat 'Equity'
  gnc_split_register_load called for account 'Assets:Current
Assets:Checking Account' with list of 1

As you can see, seven accounts get updated forcing the register reload
seven times, (not sure why those other accounts force a reload either), and
this gets even worse if this first import is 100 transactions which would
equate to 700 reloads. I have not worked out why all these accounts are
updated or why after the first pass the converted flag is not set/noticed
there by eliminating the convert for the rest of the transactions, it only
seems to be noticed on subsequent imports.

You also get this behaviour if you start the 'Import Map Dialogue' which
may be the source of a report about that dialogue freezing but that needs
more investigating.

Any idea why these accounts are updated and why it runs on every import
transaction row ?

Regards,
   Bob
_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Reply | Threaded
Open this post in threaded view
|

Re: [GNC-dev] Convert Imap to Flat

Geert Janssens-4
Op maandag 5 november 2018 17:07:25 CET schreef Robert Fewell:

> Hi,
> I was poking around with the CSV importer and I noticed the following, this
> may also be an issue with other importers on first use after creating a new
> file...
> With a new empty xml file, I used a one line transaction csv file with
> appropriate settings and the 'Account' set to 'Assets:Current
> Assets:Checking Account' and observed the following when I got to the match
> page...
>
> With the 'Checking Account' register open it would partly show the imported
> transactions, (this can be fixed by suspending GUI changes which I was
> going to propose in a PR) and the register would reload seven times.
> This reloading is caused by the triggering of the
> 'imap_convert_bayes_to_flat' function and as it is a new file you would not
> expect it to do any thing but it does. Adding a few print statements I get
> the following...
>
> matchmap_find_destination
> imap_convert_bayes_to_flat
>  convert_imap_account_bayes_to_flat 'Assets'
>   gnc_split_register_load called for account 'Assets:Current
> Assets:Checking Account' with list of 1
>  convert_imap_account_bayes_to_flat 'Current Assets'
>   gnc_split_register_load called for account 'Assets:Current
> Assets:Checking Account' with list of 1
>  convert_imap_account_bayes_to_flat 'Checking Account'
>   gnc_split_register_load called for account 'Assets:Current
> Assets:Checking Account' with list of 1
>  convert_imap_account_bayes_to_flat 'Liabilities'
>   gnc_split_register_load called for account 'Assets:Current
> Assets:Checking Account' with list of 1
>  convert_imap_account_bayes_to_flat 'Income'
>   gnc_split_register_load called for account 'Assets:Current
> Assets:Checking Account' with list of 1
>  convert_imap_account_bayes_to_flat 'Expenses'
>   gnc_split_register_load called for account 'Assets:Current
> Assets:Checking Account' with list of 1
>  convert_imap_account_bayes_to_flat 'Equity'
>   gnc_split_register_load called for account 'Assets:Current
> Assets:Checking Account' with list of 1
>
> As you can see, seven accounts get updated forcing the register reload
> seven times, (not sure why those other accounts force a reload either), and
> this gets even worse if this first import is 100 transactions which would
> equate to 700 reloads. I have not worked out why all these accounts are
> updated or why after the first pass the converted flag is not set/noticed
> there by eliminating the convert for the rest of the transactions, it only
> seems to be noticed on subsequent imports.
>
> You also get this behaviour if you start the 'Import Map Dialogue' which
> may be the source of a report about that dialogue freezing but that needs
> more investigating.
>
This calls gnc_account_imap_get_info_bayes, which also calls
imap_convert_bayes_to_flat so the it will trigger the same account refreshes.

> Any idea why these accounts are updated and why it runs on every import
> transaction row ?

Why the accounts are updated: while only a run in the debugger will verify it,
this is what I have gathered from reading the code:

imap_convert_bayes_to_flat's sub functions will call xaccAccountBeginEdit and
xaccAccountCommitEdit at some point. This happens because it changes the
account's kvp frames that store the import maps.

On the other side, the register code has set a watch on the register's
account(s) via the component manager. So each time the account signals a
change (or more precisely a successful run of xaccAccountCommitEdit) the
component manager will tell the register to refresh itself.

As you suggest you can probably disable this by a call to
gnc_suspend_gui_refresh.

Why it runs on every import transaction row ? I suspect this is because there
are no imap records stored yet and hence the feature flag that blocks the
conversion is not set yet. So for each transaction it will try to do the
conversion, find there's no converted imap record to store and skip setting
the feature flag. This will probably continue forever if the user doesn't use
bayesian matching at all.
This is a difficult issue to solve. We don't want to set the flat_bayes
conversion flag if there are no bayes maps because that would needlessly break
backwards compatibility. We could make the conversion code more careful and
have it only commit to accounts if there really are changes to commit. And add
a run time flag that signals the conversion has run already once. With that
conversion should only start if this flag is false AND the feature flag is
false.
However it may all be unnecessary and perhaps just blocking register updates
while the transaction matching (or the whole import code) is running may
eliminate the performance bottleneck already.
So I would start with that: block gui updates.
Then secondly, make the imap code more careful not to do unnecessary account
updates and lastly consider a run time flag.

Like you though I suspect this may be at least one cause of the import issues
we see. Thanks for poking at it!

Regards,

Geert


_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Reply | Threaded
Open this post in threaded view
|

Re: [GNC-dev] Convert Imap to Flat

GnuCash - Dev mailing list
In reply to this post by Robert Fewell-2
On 05/11/2018 16:07, Robert Fewell wrote:

> Hi,
> I was poking around with the CSV importer and I noticed the following, this
> may also be an issue with other importers on first use after creating a new
> file...
> With a new empty xml file, I used a one line transaction csv file with
> appropriate settings and the 'Account' set to 'Assets:Current
> Assets:Checking Account' and observed the following when I got to the match
> page...
>
> With the 'Checking Account' register open it would partly show the imported
> transactions, (this can be fixed by suspending GUI changes which I was
> going to propose in a PR) and the register would reload seven times.
> This reloading is caused by the triggering of the
> 'imap_convert_bayes_to_flat' function and as it is a new file you would not
> expect it to do any thing but it does. Adding a few print statements I get
> the following...
>
> matchmap_find_destination
> imap_convert_bayes_to_flat
>   convert_imap_account_bayes_to_flat 'Assets'
>    gnc_split_register_load called for account 'Assets:Current
> Assets:Checking Account' with list of 1
>   convert_imap_account_bayes_to_flat 'Current Assets'
>    gnc_split_register_load called for account 'Assets:Current
> Assets:Checking Account' with list of 1
>   convert_imap_account_bayes_to_flat 'Checking Account'
>    gnc_split_register_load called for account 'Assets:Current
> Assets:Checking Account' with list of 1
>   convert_imap_account_bayes_to_flat 'Liabilities'
>    gnc_split_register_load called for account 'Assets:Current
> Assets:Checking Account' with list of 1
>   convert_imap_account_bayes_to_flat 'Income'
>    gnc_split_register_load called for account 'Assets:Current
> Assets:Checking Account' with list of 1
>   convert_imap_account_bayes_to_flat 'Expenses'
>    gnc_split_register_load called for account 'Assets:Current
> Assets:Checking Account' with list of 1
>   convert_imap_account_bayes_to_flat 'Equity'
>    gnc_split_register_load called for account 'Assets:Current
> Assets:Checking Account' with list of 1
>
> As you can see, seven accounts get updated forcing the register reload
> seven times, (not sure why those other accounts force a reload either), and
> this gets even worse if this first import is 100 transactions which would
> equate to 700 reloads. I have not worked out why all these accounts are
> updated or why after the first pass the converted flag is not set/noticed
> there by eliminating the convert for the rest of the transactions, it only
> seems to be noticed on subsequent imports.
>
> You also get this behaviour if you start the 'Import Map Dialogue' which
> may be the source of a report about that dialogue freezing but that needs
> more investigating.
>
> Any idea why these accounts are updated and why it runs on every import
> transaction row ?

It is apparently a "good thing" because it prevents stupid people from
doing multiple imports from their bank accounts and actually believing
they have more or less money than they actually have.

That was the explanation given to me, grrr
--
Wm


_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Reply | Threaded
Open this post in threaded view
|

Re: [GNC-dev] Convert Imap to Flat

GnuCash - Dev mailing list
In reply to this post by Geert Janssens-4
On 05/11/2018 17:50, Geert Janssens wrote:

> Like you though I suspect this may be at least one cause of the import issues
> we see. Thanks for poking at it!

this is a basic issue, people aren't allowed to decide if their tx are
new or not
--
Wm



_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Reply | Threaded
Open this post in threaded view
|

Re: [GNC-dev] Convert Imap to Flat

Geert Janssens-4
Op maandag 5 november 2018 21:03:15 CET schreef Wm via gnucash-devel:
> On 05/11/2018 17:50, Geert Janssens wrote:
> > Like you though I suspect this may be at least one cause of the import
> > issues we see. Thanks for poking at it!
>
> this is a basic issue, people aren't allowed to decide if their tx are
> new or not

No, the bayesian matcher is only meant to suggest possible matches. The user
is definitely allowed to correct the suggested results.

The issue Bob brings up happens earlier in the process and is a bug.

Regards,

Geert


_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Reply | Threaded
Open this post in threaded view
|

Re: [GNC-dev] Convert Imap to Flat

Geert Janssens-4
In reply to this post by GnuCash - Dev mailing list
Op maandag 5 november 2018 20:23:40 CET schreef Wm via gnucash-devel:

> On 05/11/2018 16:07, Robert Fewell wrote:
> > Hi,
> > I was poking around with the CSV importer and I noticed the following,
> > this
> > may also be an issue with other importers on first use after creating a
> > new
> > file...
> > With a new empty xml file, I used a one line transaction csv file with
> > appropriate settings and the 'Account' set to 'Assets:Current
> > Assets:Checking Account' and observed the following when I got to the
> > match
> > page...
> >
> > With the 'Checking Account' register open it would partly show the
> > imported
> > transactions, (this can be fixed by suspending GUI changes which I was
> > going to propose in a PR) and the register would reload seven times.
> > This reloading is caused by the triggering of the
> > 'imap_convert_bayes_to_flat' function and as it is a new file you would
> > not
> > expect it to do any thing but it does. Adding a few print statements I get
> > the following...
> >
> > matchmap_find_destination
> > imap_convert_bayes_to_flat
> >
> >   convert_imap_account_bayes_to_flat 'Assets'
> >  
> >    gnc_split_register_load called for account 'Assets:Current
> >
> > Assets:Checking Account' with list of 1
> >
> >   convert_imap_account_bayes_to_flat 'Current Assets'
> >  
> >    gnc_split_register_load called for account 'Assets:Current
> >
> > Assets:Checking Account' with list of 1
> >
> >   convert_imap_account_bayes_to_flat 'Checking Account'
> >  
> >    gnc_split_register_load called for account 'Assets:Current
> >
> > Assets:Checking Account' with list of 1
> >
> >   convert_imap_account_bayes_to_flat 'Liabilities'
> >  
> >    gnc_split_register_load called for account 'Assets:Current
> >
> > Assets:Checking Account' with list of 1
> >
> >   convert_imap_account_bayes_to_flat 'Income'
> >  
> >    gnc_split_register_load called for account 'Assets:Current
> >
> > Assets:Checking Account' with list of 1
> >
> >   convert_imap_account_bayes_to_flat 'Expenses'
> >  
> >    gnc_split_register_load called for account 'Assets:Current
> >
> > Assets:Checking Account' with list of 1
> >
> >   convert_imap_account_bayes_to_flat 'Equity'
> >  
> >    gnc_split_register_load called for account 'Assets:Current
> >
> > Assets:Checking Account' with list of 1
> >
> > As you can see, seven accounts get updated forcing the register reload
> > seven times, (not sure why those other accounts force a reload either),
> > and
> > this gets even worse if this first import is 100 transactions which would
> > equate to 700 reloads. I have not worked out why all these accounts are
> > updated or why after the first pass the converted flag is not set/noticed
> > there by eliminating the convert for the rest of the transactions, it only
> > seems to be noticed on subsequent imports.
> >
> > You also get this behaviour if you start the 'Import Map Dialogue' which
> > may be the source of a report about that dialogue freezing but that needs
> > more investigating.
> >
> > Any idea why these accounts are updated and why it runs on every import
> > transaction row ?
>
> It is apparently a "good thing" because it prevents stupid people from
> doing multiple imports from their bank accounts and actually believing
> they have more or less money than they actually have.
>
> That was the explanation given to me, grrr

We seem to be talking at different levels here. Bob question refers to a
screen refresh that happens even when accounts don't change.

You seem to be referring to the part of the importer that tries to match
import data against existing transactions. In some circumstances gnucash
freezes or crashes before this completes and the user is asked to review what
is found. Obviously this should not happen.

Geert


_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Reply | Threaded
Open this post in threaded view
|

Re: [GNC-dev] Convert Imap to Flat

John Ralls
In reply to this post by Geert Janssens-4


> On Nov 6, 2018, at 2:50 AM, Geert Janssens <[hidden email]> wrote:
>
> Op maandag 5 november 2018 17:07:25 CET schreef Robert Fewell:
>> Hi,
>> I was poking around with the CSV importer and I noticed the following, this
>> may also be an issue with other importers on first use after creating a new
>> file...
>> With a new empty xml file, I used a one line transaction csv file with
>> appropriate settings and the 'Account' set to 'Assets:Current
>> Assets:Checking Account' and observed the following when I got to the match
>> page...
>>
>> With the 'Checking Account' register open it would partly show the imported
>> transactions, (this can be fixed by suspending GUI changes which I was
>> going to propose in a PR) and the register would reload seven times.
>> This reloading is caused by the triggering of the
>> 'imap_convert_bayes_to_flat' function and as it is a new file you would not
>> expect it to do any thing but it does. Adding a few print statements I get
>> the following...
>>
>> matchmap_find_destination
>> imap_convert_bayes_to_flat
>> convert_imap_account_bayes_to_flat 'Assets'
>>  gnc_split_register_load called for account 'Assets:Current
>> Assets:Checking Account' with list of 1
>> convert_imap_account_bayes_to_flat 'Current Assets'
>>  gnc_split_register_load called for account 'Assets:Current
>> Assets:Checking Account' with list of 1
>> convert_imap_account_bayes_to_flat 'Checking Account'
>>  gnc_split_register_load called for account 'Assets:Current
>> Assets:Checking Account' with list of 1
>> convert_imap_account_bayes_to_flat 'Liabilities'
>>  gnc_split_register_load called for account 'Assets:Current
>> Assets:Checking Account' with list of 1
>> convert_imap_account_bayes_to_flat 'Income'
>>  gnc_split_register_load called for account 'Assets:Current
>> Assets:Checking Account' with list of 1
>> convert_imap_account_bayes_to_flat 'Expenses'
>>  gnc_split_register_load called for account 'Assets:Current
>> Assets:Checking Account' with list of 1
>> convert_imap_account_bayes_to_flat 'Equity'
>>  gnc_split_register_load called for account 'Assets:Current
>> Assets:Checking Account' with list of 1
>>
>> As you can see, seven accounts get updated forcing the register reload
>> seven times, (not sure why those other accounts force a reload either), and
>> this gets even worse if this first import is 100 transactions which would
>> equate to 700 reloads. I have not worked out why all these accounts are
>> updated or why after the first pass the converted flag is not set/noticed
>> there by eliminating the convert for the rest of the transactions, it only
>> seems to be noticed on subsequent imports.
>>
>> You also get this behaviour if you start the 'Import Map Dialogue' which
>> may be the source of a report about that dialogue freezing but that needs
>> more investigating.
>>
> This calls gnc_account_imap_get_info_bayes, which also calls
> imap_convert_bayes_to_flat so the it will trigger the same account refreshes.
>
>> Any idea why these accounts are updated and why it runs on every import
>> transaction row ?
>
> Why the accounts are updated: while only a run in the debugger will verify it,
> this is what I have gathered from reading the code:
>
> imap_convert_bayes_to_flat's sub functions will call xaccAccountBeginEdit and
> xaccAccountCommitEdit at some point. This happens because it changes the
> account's kvp frames that store the import maps.
>
> On the other side, the register code has set a watch on the register's
> account(s) via the component manager. So each time the account signals a
> change (or more precisely a successful run of xaccAccountCommitEdit) the
> component manager will tell the register to refresh itself.
>
> As you suggest you can probably disable this by a call to
> gnc_suspend_gui_refresh.
>
> Why it runs on every import transaction row ? I suspect this is because there
> are no imap records stored yet and hence the feature flag that blocks the
> conversion is not set yet. So for each transaction it will try to do the
> conversion, find there's no converted imap record to store and skip setting
> the feature flag. This will probably continue forever if the user doesn't use
> bayesian matching at all.
> This is a difficult issue to solve. We don't want to set the flat_bayes
> conversion flag if there are no bayes maps because that would needlessly break
> backwards compatibility. We could make the conversion code more careful and
> have it only commit to accounts if there really are changes to commit. And add
> a run time flag that signals the conversion has run already once. With that
> conversion should only start if this flag is false AND the feature flag is
> false.
> However it may all be unnecessary and perhaps just blocking register updates
> while the transaction matching (or the whole import code) is running may
> eliminate the performance bottleneck already.
> So I would start with that: block gui updates.
> Then secondly, make the imap code more careful not to do unnecessary account
> updates and lastly consider a run time flag.
>
> Like you though I suspect this may be at least one cause of the import issues
> we see. Thanks for poking at it!

I think that we do want to set the feature flag in this case: The import map uses the flat layout and a pre-flat-bayes GnuCash won’t be able to read it, causing user frustration.

That’s not to say that we shouldn’t also do everything we can to speed up the code--it’s really slow--and certainly suspending UI events while computing the import matches is a good idea.

There’s something else going wrong, though: convert_imap_bayes_to_flat calls xaccAccountBeginEdit at the start and xaccAccountCommitEdit at the end. That should raise the edit level and prevent any interior calls to xaccAccountCommitEdit from doing anything.

Regards,
John Ralls

_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Reply | Threaded
Open this post in threaded view
|

Re: [GNC-dev] Convert Imap to Flat

Geert Janssens-4
Op maandag 5 november 2018 23:27:31 CET schreef John Ralls:

> > On Nov 6, 2018, at 2:50 AM, Geert Janssens <[hidden email]>
> > wrote:>
> > Op maandag 5 november 2018 17:07:25 CET schreef Robert Fewell:
> >> Hi,
> >> I was poking around with the CSV importer and I noticed the following,
> >> this
> >> may also be an issue with other importers on first use after creating a
> >> new
> >> file...
> >> With a new empty xml file, I used a one line transaction csv file with
> >> appropriate settings and the 'Account' set to 'Assets:Current
> >> Assets:Checking Account' and observed the following when I got to the
> >> match
> >> page...
> >>
> >> With the 'Checking Account' register open it would partly show the
> >> imported
> >> transactions, (this can be fixed by suspending GUI changes which I was
> >> going to propose in a PR) and the register would reload seven times.
> >> This reloading is caused by the triggering of the
> >> 'imap_convert_bayes_to_flat' function and as it is a new file you would
> >> not
> >> expect it to do any thing but it does. Adding a few print statements I
> >> get
> >> the following...
> >>
> >> matchmap_find_destination
> >> imap_convert_bayes_to_flat
> >> convert_imap_account_bayes_to_flat 'Assets'
> >>
> >>  gnc_split_register_load called for account 'Assets:Current
> >>
> >> Assets:Checking Account' with list of 1
> >> convert_imap_account_bayes_to_flat 'Current Assets'
> >>
> >>  gnc_split_register_load called for account 'Assets:Current
> >>
> >> Assets:Checking Account' with list of 1
> >> convert_imap_account_bayes_to_flat 'Checking Account'
> >>
> >>  gnc_split_register_load called for account 'Assets:Current
> >>
> >> Assets:Checking Account' with list of 1
> >> convert_imap_account_bayes_to_flat 'Liabilities'
> >>
> >>  gnc_split_register_load called for account 'Assets:Current
> >>
> >> Assets:Checking Account' with list of 1
> >> convert_imap_account_bayes_to_flat 'Income'
> >>
> >>  gnc_split_register_load called for account 'Assets:Current
> >>
> >> Assets:Checking Account' with list of 1
> >> convert_imap_account_bayes_to_flat 'Expenses'
> >>
> >>  gnc_split_register_load called for account 'Assets:Current
> >>
> >> Assets:Checking Account' with list of 1
> >> convert_imap_account_bayes_to_flat 'Equity'
> >>
> >>  gnc_split_register_load called for account 'Assets:Current
> >>
> >> Assets:Checking Account' with list of 1
> >>
> >> As you can see, seven accounts get updated forcing the register reload
> >> seven times, (not sure why those other accounts force a reload either),
> >> and
> >> this gets even worse if this first import is 100 transactions which would
> >> equate to 700 reloads. I have not worked out why all these accounts are
> >> updated or why after the first pass the converted flag is not set/noticed
> >> there by eliminating the convert for the rest of the transactions, it
> >> only
> >> seems to be noticed on subsequent imports.
> >>
> >> You also get this behaviour if you start the 'Import Map Dialogue' which
> >> may be the source of a report about that dialogue freezing but that needs
> >> more investigating.
> >
> > This calls gnc_account_imap_get_info_bayes, which also calls
> > imap_convert_bayes_to_flat so the it will trigger the same account
> > refreshes.>
> >> Any idea why these accounts are updated and why it runs on every import
> >> transaction row ?
> >
> > Why the accounts are updated: while only a run in the debugger will verify
> > it, this is what I have gathered from reading the code:
> >
> > imap_convert_bayes_to_flat's sub functions will call xaccAccountBeginEdit
> > and xaccAccountCommitEdit at some point. This happens because it changes
> > the account's kvp frames that store the import maps.
> >
> > On the other side, the register code has set a watch on the register's
> > account(s) via the component manager. So each time the account signals a
> > change (or more precisely a successful run of xaccAccountCommitEdit) the
> > component manager will tell the register to refresh itself.
> >
> > As you suggest you can probably disable this by a call to
> > gnc_suspend_gui_refresh.
> >
> > Why it runs on every import transaction row ? I suspect this is because
> > there are no imap records stored yet and hence the feature flag that
> > blocks the conversion is not set yet. So for each transaction it will try
> > to do the conversion, find there's no converted imap record to store and
> > skip setting the feature flag. This will probably continue forever if the
> > user doesn't use bayesian matching at all.
> > This is a difficult issue to solve. We don't want to set the flat_bayes
> > conversion flag if there are no bayes maps because that would needlessly
> > break backwards compatibility. We could make the conversion code more
> > careful and have it only commit to accounts if there really are changes
> > to commit. And add a run time flag that signals the conversion has run
> > already once. With that conversion should only start if this flag is
> > false AND the feature flag is false.
> > However it may all be unnecessary and perhaps just blocking register
> > updates while the transaction matching (or the whole import code) is
> > running may eliminate the performance bottleneck already.
> > So I would start with that: block gui updates.
> > Then secondly, make the imap code more careful not to do unnecessary
> > account updates and lastly consider a run time flag.
> >
> > Like you though I suspect this may be at least one cause of the import
> > issues we see. Thanks for poking at it!
>
> I think that we do want to set the feature flag in this case: The import map
> uses the flat layout and a pre-flat-bayes GnuCash won’t be able to read it,
> causing user frustration.

The timing matters. We want to set the flag from the first flat_bayes_imap
entry that's actually written to file, not at the start of a conversion run.
That's how the code is constructed now.

I don't want the flag to be set if at the end of a conversion attempt there
still aren't any flat_bayes_imap entries. That would needlessly break
backwards compatibility.
>
> That’s not to say that we shouldn’t also do everything we can to speed up
> the code--it’s really slow--and certainly suspending UI events while
> computing the import matches is a good idea.
>
> There’s something else going wrong, though: convert_imap_bayes_to_flat calls
> xaccAccountBeginEdit at the start and xaccAccountCommitEdit at the end.
> That should raise the edit level and prevent any interior calls to
> xaccAccountCommitEdit from doing anything.

This doesn't look wrong per se to me.
convert_imap_account_bayes_to_flat is run once for each account. So there will
still be the same number of gui refreshes as there are accounts. I don't think
we can reduce this further on the account begin/commit level. A complete gui
refresh suspend may do better.

Regards,

Geert


_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Reply | Threaded
Open this post in threaded view
|

Re: [GNC-dev] Convert Imap to Flat

John Ralls


> On Nov 7, 2018, at 5:52 AM, Geert Janssens <[hidden email]> wrote:
>
> Op maandag 5 november 2018 23:27:31 CET schreef John Ralls:
>>> On Nov 6, 2018, at 2:50 AM, Geert Janssens <[hidden email]>
>>> wrote:>
>>> Op maandag 5 november 2018 17:07:25 CET schreef Robert Fewell:
>>>> Hi,
>>>> I was poking around with the CSV importer and I noticed the following,
>>>> this
>>>> may also be an issue with other importers on first use after creating a
>>>> new
>>>> file...
>>>> With a new empty xml file, I used a one line transaction csv file with
>>>> appropriate settings and the 'Account' set to 'Assets:Current
>>>> Assets:Checking Account' and observed the following when I got to the
>>>> match
>>>> page...
>>>>
>>>> With the 'Checking Account' register open it would partly show the
>>>> imported
>>>> transactions, (this can be fixed by suspending GUI changes which I was
>>>> going to propose in a PR) and the register would reload seven times.
>>>> This reloading is caused by the triggering of the
>>>> 'imap_convert_bayes_to_flat' function and as it is a new file you would
>>>> not
>>>> expect it to do any thing but it does. Adding a few print statements I
>>>> get
>>>> the following...
>>>>
>>>> matchmap_find_destination
>>>> imap_convert_bayes_to_flat
>>>> convert_imap_account_bayes_to_flat 'Assets'
>>>>
>>>> gnc_split_register_load called for account 'Assets:Current
>>>>
>>>> Assets:Checking Account' with list of 1
>>>> convert_imap_account_bayes_to_flat 'Current Assets'
>>>>
>>>> gnc_split_register_load called for account 'Assets:Current
>>>>
>>>> Assets:Checking Account' with list of 1
>>>> convert_imap_account_bayes_to_flat 'Checking Account'
>>>>
>>>> gnc_split_register_load called for account 'Assets:Current
>>>>
>>>> Assets:Checking Account' with list of 1
>>>> convert_imap_account_bayes_to_flat 'Liabilities'
>>>>
>>>> gnc_split_register_load called for account 'Assets:Current
>>>>
>>>> Assets:Checking Account' with list of 1
>>>> convert_imap_account_bayes_to_flat 'Income'
>>>>
>>>> gnc_split_register_load called for account 'Assets:Current
>>>>
>>>> Assets:Checking Account' with list of 1
>>>> convert_imap_account_bayes_to_flat 'Expenses'
>>>>
>>>> gnc_split_register_load called for account 'Assets:Current
>>>>
>>>> Assets:Checking Account' with list of 1
>>>> convert_imap_account_bayes_to_flat 'Equity'
>>>>
>>>> gnc_split_register_load called for account 'Assets:Current
>>>>
>>>> Assets:Checking Account' with list of 1
>>>>
>>>> As you can see, seven accounts get updated forcing the register reload
>>>> seven times, (not sure why those other accounts force a reload either),
>>>> and
>>>> this gets even worse if this first import is 100 transactions which would
>>>> equate to 700 reloads. I have not worked out why all these accounts are
>>>> updated or why after the first pass the converted flag is not set/noticed
>>>> there by eliminating the convert for the rest of the transactions, it
>>>> only
>>>> seems to be noticed on subsequent imports.
>>>>
>>>> You also get this behaviour if you start the 'Import Map Dialogue' which
>>>> may be the source of a report about that dialogue freezing but that needs
>>>> more investigating.
>>>
>>> This calls gnc_account_imap_get_info_bayes, which also calls
>>> imap_convert_bayes_to_flat so the it will trigger the same account
>>> refreshes.>
>>>> Any idea why these accounts are updated and why it runs on every import
>>>> transaction row ?
>>>
>>> Why the accounts are updated: while only a run in the debugger will verify
>>> it, this is what I have gathered from reading the code:
>>>
>>> imap_convert_bayes_to_flat's sub functions will call xaccAccountBeginEdit
>>> and xaccAccountCommitEdit at some point. This happens because it changes
>>> the account's kvp frames that store the import maps.
>>>
>>> On the other side, the register code has set a watch on the register's
>>> account(s) via the component manager. So each time the account signals a
>>> change (or more precisely a successful run of xaccAccountCommitEdit) the
>>> component manager will tell the register to refresh itself.
>>>
>>> As you suggest you can probably disable this by a call to
>>> gnc_suspend_gui_refresh.
>>>
>>> Why it runs on every import transaction row ? I suspect this is because
>>> there are no imap records stored yet and hence the feature flag that
>>> blocks the conversion is not set yet. So for each transaction it will try
>>> to do the conversion, find there's no converted imap record to store and
>>> skip setting the feature flag. This will probably continue forever if the
>>> user doesn't use bayesian matching at all.
>>> This is a difficult issue to solve. We don't want to set the flat_bayes
>>> conversion flag if there are no bayes maps because that would needlessly
>>> break backwards compatibility. We could make the conversion code more
>>> careful and have it only commit to accounts if there really are changes
>>> to commit. And add a run time flag that signals the conversion has run
>>> already once. With that conversion should only start if this flag is
>>> false AND the feature flag is false.
>>> However it may all be unnecessary and perhaps just blocking register
>>> updates while the transaction matching (or the whole import code) is
>>> running may eliminate the performance bottleneck already.
>>> So I would start with that: block gui updates.
>>> Then secondly, make the imap code more careful not to do unnecessary
>>> account updates and lastly consider a run time flag.
>>>
>>> Like you though I suspect this may be at least one cause of the import
>>> issues we see. Thanks for poking at it!
>>
>> I think that we do want to set the feature flag in this case: The import map
>> uses the flat layout and a pre-flat-bayes GnuCash won’t be able to read it,
>> causing user frustration.
>
> The timing matters. We want to set the flag from the first flat_bayes_imap
> entry that's actually written to file, not at the start of a conversion run.
> That's how the code is constructed now.
>
> I don't want the flag to be set if at the end of a conversion attempt there
> still aren't any flat_bayes_imap entries. That would needlessly break
> backwards compatibility.

OK, that’s reasonable.

>>
>> That’s not to say that we shouldn’t also do everything we can to speed up
>> the code--it’s really slow--and certainly suspending UI events while
>> computing the import matches is a good idea.
>>
>> There’s something else going wrong, though: convert_imap_bayes_to_flat calls
>> xaccAccountBeginEdit at the start and xaccAccountCommitEdit at the end.
>> That should raise the edit level and prevent any interior calls to
>> xaccAccountCommitEdit from doing anything.
>
> This doesn't look wrong per se to me.
> convert_imap_account_bayes_to_flat is run once for each account. So there will
> still be the same number of gui refreshes as there are accounts. I don't think
> we can reduce this further on the account begin/commit level. A complete gui
> refresh suspend may do better.

Each account has its own import map and any particular import affects only one account’s map, so there should be exactly one effective, i.e. with edit_level == 1, commit and exactly one UI refresh. If there’s more than one of either inside a call to convert_imap_account_bayes() then something’s broken at the QofInstance level. If we’re calling convert_imap_account_bayes() on a particular account more than once in a session then there’s something wrong with the decision logic that calls it. Bob’s printf-profile suggests at least the latter and  your "imap_convert_bayes_to_flat's sub functions will call xaccAccountBeginEdit and xaccAccountCommitEdit at some point” suggests the former.

I suppose Aaron thought that running it on an empty or non-existent map would take negligible time; if that’s not the case then we can simply check for that and quit, but it should be checked in the profiler before we add any code.

Regards,
John Ralls


_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Reply | Threaded
Open this post in threaded view
|

Re: [GNC-dev] Convert Imap to Flat

Geert Janssens-4
Op woensdag 7 november 2018 01:10:03 CET schreef John Ralls:

> > On Nov 7, 2018, at 5:52 AM, Geert Janssens <[hidden email]>
> > wrote:>
> > Op maandag 5 november 2018 23:27:31 CET schreef John Ralls:
> >> That’s not to say that we shouldn’t also do everything we can to speed up
> >> the code--it’s really slow--and certainly suspending UI events while
> >> computing the import matches is a good idea.
> >>
> >> There’s something else going wrong, though: convert_imap_bayes_to_flat
> >> calls xaccAccountBeginEdit at the start and xaccAccountCommitEdit at the
> >> end. That should raise the edit level and prevent any interior calls to
> >> xaccAccountCommitEdit from doing anything.
> >
> > This doesn't look wrong per se to me.
> > convert_imap_account_bayes_to_flat is run once for each account. So there
> > will still be the same number of gui refreshes as there are accounts. I
> > don't think we can reduce this further on the account begin/commit level.
> > A complete gui refresh suspend may do better.
>
> Each account has its own import map and any particular import affects only
> one account’s map, so there should be exactly one effective, i.e. with
> edit_level == 1, commit and exactly one UI refresh.

The new CSV importer can handle imports into multiple accounts (on both
sides), but that's tangential to this issue.

The conversion as you describe it would be a lazy conversion: only convert
maps on an as needed basis. An interesting idea which hadn't occurred to me
and also different from how it's implemented.
Right now the conversion of *all* import maps of all accounts is initiated the
first time any import map (bayes of course) is needed. So even if the import
requires only one account's maps, all maps will be converted in one go. Hence
the iteration over all accounts.

I don't know how 2.6.21 would handle partly converted bayes import maps. I do
know as it is now the conversion is designed to be run as long as the feature
flag is not set. That would also have to change if we would like to go for the
lazy conversion.

The advantage of such lazy conversion is the time required to convert is
spread over several import runs, so each conversion is likely to take less
time. The disadvantage is we risk ending up with books that carry hierarchical
bayes data for an eternity and hence gnucash has to keep code around to handle
with it. In a full conversion in one go scenario we can at some point 2 major
releases from now declare this unsupported as we only guarantee backwards
compatibility for 1 major release series.

What *is* a problem with the one-shot-convert-all scenario is that it takes
noticable time (in some cases even horribly long) and we don't inform the user
of what's happening. The conversion should really have been initiated from the
gui with a progress bar showing something is going on and indicating how far
the process has run.

> If there’s more than
> one of either inside a call to convert_imap_account_bayes() then
> something’s broken at the QofInstance level. If we’re calling
> convert_imap_account_bayes() on a particular account more than once in a
> session then there’s something wrong with the decision logic that calls it.
> Bob’s printf-profile suggests at least the latter and  your
> "imap_convert_bayes_to_flat's sub functions will call xaccAccountBeginEdit
> and xaccAccountCommitEdit at some point” suggests the former.
>
> I suppose Aaron thought that running it on an empty or non-existent map
> would take negligible time; if that’s not the case then we can simply check
> for that and quit, but it should be checked in the profiler before we add
> any code.

Agreed

Geert


_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Reply | Threaded
Open this post in threaded view
|

Re: [GNC-dev] Convert Imap to Flat

John Ralls


> On Nov 7, 2018, at 5:14 PM, Geert Janssens <[hidden email]> wrote:
>
> Op woensdag 7 november 2018 01:10:03 CET schreef John Ralls:
>>> On Nov 7, 2018, at 5:52 AM, Geert Janssens <[hidden email]>
>>> wrote:>
>>> Op maandag 5 november 2018 23:27:31 CET schreef John Ralls:
>>>> That’s not to say that we shouldn’t also do everything we can to speed up
>>>> the code--it’s really slow--and certainly suspending UI events while
>>>> computing the import matches is a good idea.
>>>>
>>>> There’s something else going wrong, though: convert_imap_bayes_to_flat
>>>> calls xaccAccountBeginEdit at the start and xaccAccountCommitEdit at the
>>>> end. That should raise the edit level and prevent any interior calls to
>>>> xaccAccountCommitEdit from doing anything.
>>>
>>> This doesn't look wrong per se to me.
>>> convert_imap_account_bayes_to_flat is run once for each account. So there
>>> will still be the same number of gui refreshes as there are accounts. I
>>> don't think we can reduce this further on the account begin/commit level.
>>> A complete gui refresh suspend may do better.
>>
>> Each account has its own import map and any particular import affects only
>> one account’s map, so there should be exactly one effective, i.e. with
>> edit_level == 1, commit and exactly one UI refresh.
>
> The new CSV importer can handle imports into multiple accounts (on both
> sides), but that's tangential to this issue.
>
> The conversion as you describe it would be a lazy conversion: only convert
> maps on an as needed basis. An interesting idea which hadn't occurred to me
> and also different from how it's implemented.
> Right now the conversion of *all* import maps of all accounts is initiated the
> first time any import map (bayes of course) is needed. So even if the import
> requires only one account's maps, all maps will be converted in one go. Hence
> the iteration over all accounts.
>
> I don't know how 2.6.21 would handle partly converted bayes import maps. I do
> know as it is now the conversion is designed to be run as long as the feature
> flag is not set. That would also have to change if we would like to go for the
> lazy conversion.
>
> The advantage of such lazy conversion is the time required to convert is
> spread over several import runs, so each conversion is likely to take less
> time. The disadvantage is we risk ending up with books that carry hierarchical
> bayes data for an eternity and hence gnucash has to keep code around to handle
> with it. In a full conversion in one go scenario we can at some point 2 major
> releases from now declare this unsupported as we only guarantee backwards
> compatibility for 1 major release series.
>
> What *is* a problem with the one-shot-convert-all scenario is that it takes
> noticable time (in some cases even horribly long) and we don't inform the user
> of what's happening. The conversion should really have been initiated from the
> gui with a progress bar showing something is going on and indicating how far
> the process has run.

Sorry, I’d forgotten that it’s an all-at-once. Given that it’s controlled by a feature-flag that’s a reasonable design decision. It points to another possible slow-down: Even if convert_imap_account_bayes() takes negligible time on an empty map, walking a large account tree looking for maps won’t. We can fix that and speed up the eventual conversion a bit by constructing a list or vector of accounts while loading the book: The SQL backends can use a single query, SELECT guid FROM accounts WHERE guid = (SELECT DISTINCT a.guid FROM accounts = a, slots = k WHERE a.guid = s.obj_guid AND s.name = 'import-map-bayes’);[not tested, might need a tweak or two]. The XML backend would just add the account guid to the list.

While working with imaps we should see if we can combine the flatting and the transfer account name->guid conversion so that they need only one pass through the map to accomplish both in cases where both are needed.

I agree that a dialog box informing the user of what’s going on would be good. Given the performance problems with progress bars on Windows with HiDPI displays I’m not so sure about that part, especially since generating a useful progress measure is a problem for most of our progress bars.

Regards,
John Ralls
_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel