python GnuCash interface to SQL backend

classic Classic list List threaded Threaded
24 messages Options
12
Reply | Threaded
Open this post in threaded view
|

python GnuCash interface to SQL backend

Sébastien de Menten
Hello,

After trying multiple times to work with GnuCash from python (via xml, via
the python bindings, via sql), I finally had a try to use SQLAlchemy to
handle the GnuCash Books saved through the SQL backend (sqlite3 and
postgres).

I have a release on PyPI the package "pyscash" installable through 'pip
install pyscash' (see some raw documentation on
https://github.com/sdementen/pyscash). It is 'alpha' quality.

While it opens by default the Book in read only mode to be able to do
reporting or extract data from a GnuCash Book, I also succeeded in doing
more elaborate scripts that change a Book : creating new
accounts/sub-accounts, creating new transactions, uploading quotes for
stocks, etc.

I read that the SQL backend is just a backend to save the data and that
GnuCash is not a DB application and that the preferred way to program
GnuCash in python is through the python bindings. However, I found it much
easier to work with this pyscash package, at least as long as it is done
"offline" (i.e. not modifying a Book that is at the same time opened by
GnuCash).

I would be genuinely interested to have more specific documentation on the
risks of going the SQL way.

Sebastien
_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Reply | Threaded
Open this post in threaded view
|

Re: python GnuCash interface to SQL backend

John Ralls-2

> On Nov 11, 2014, at 1:10 PM, Sébastien de Menten <[hidden email]> wrote:
>
> Hello,
>
> After trying multiple times to work with GnuCash from python (via xml, via
> the python bindings, via sql), I finally had a try to use SQLAlchemy to
> handle the GnuCash Books saved through the SQL backend (sqlite3 and
> postgres).
>
> I have a release on PyPI the package "pyscash" installable through 'pip
> install pyscash' (see some raw documentation on
> https://github.com/sdementen/pyscash). It is 'alpha' quality.
>
> While it opens by default the Book in read only mode to be able to do
> reporting or extract data from a GnuCash Book, I also succeeded in doing
> more elaborate scripts that change a Book : creating new
> accounts/sub-accounts, creating new transactions, uploading quotes for
> stocks, etc.
>
> I read that the SQL backend is just a backend to save the data and that
> GnuCash is not a DB application and that the preferred way to program
> GnuCash in python is through the python bindings. However, I found it much
> easier to work with this pyscash package, at least as long as it is done
> "offline" (i.e. not modifying a Book that is at the same time opened by
> GnuCash).
>
> I would be genuinely interested to have more specific documentation on the
> risks of going the SQL way.

There's nothing wrong with reading the database to generate reports. That is indeed easier for many people via SQL query than writing custom report plugins in Scheme.

The risk of writing to the database outside of GnuCash, whether in SQL or XML, is that unless you are very careful and have a deep understanding of how GnuCash works that you will irretrievably corrupt your accounting data. There is no business logic encoded in the SQL database, so your code must replicate the GnuCsah engine code to ensure that all required fields are computed and stored correctly. Much of GnuCash is neither straightforward nor obvious and some critical data are stored outside of the primary tables, usually to preserve backward compatibility with previous versions.

Regards,
John Ralls




_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Reply | Threaded
Open this post in threaded view
|

Re: python GnuCash interface to SQL backend

Sébastien de Menten
On Wednesday, November 12, 2014, John Ralls <[hidden email]> wrote:

>
> > On Nov 11, 2014, at 1:10 PM, Sébastien de Menten <[hidden email]
> <javascript:;>> wrote:
> > ....
> >
> > I would be genuinely interested to have more specific documentation on
> the
> > risks of going the SQL way.
>
> There's nothing wrong with reading the database to generate reports. That
> is indeed easier for many people via SQL query than writing custom report
> plugins in Scheme.
>
> It may also be easier to go with the SQL than with the std python binding
for reporting but also to change the GnuCash book.


> The risk of writing to the database outside of GnuCash, whether in SQL or
> XML, is that unless you are very careful and have a deep understanding of
> how GnuCash works that you will irretrievably corrupt your accounting data.
> There is no business logic encoded in the SQL database, so your code must
> replicate the GnuCsah engine code to ensure that all required fields are
> computed and stored correctly. Much of GnuCash is neither straightforward
> nor obvious and some critical data are stored outside of the primary
> tables, usually to preserve backward compatibility with previous versions.
>
> Regards,
> John Ralls
>
>
>
I have mainly used the basic objects from GnuCash required for
basic personal finance (so no invoice, no budget, ...) and did not found
any issues while handling lot of accounts/transactions/splits and stock
prices. I had the impression that GnuCash does indeed calculations when the
book is opened but that it does not save them in the SQL backend. Hence, if
we access the book when it's not opened by GnuCash at the same time, risks
are quite reduced, would this be a "wrong" impression ?

Where could I find detailed documentation on the GnuCash engine (and the
constrains/invariants GnuCash enforces) ? Or would there be some
code/program to check a GnuCash file is "sane/consistent" ?

Regards

Sebastien
_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Reply | Threaded
Open this post in threaded view
|

Re: python GnuCash interface to SQL backend

John Ralls-2

> On Nov 12, 2014, at 12:08 PM, Sébastien de Menten <[hidden email]> wrote:
>
> On Wednesday, November 12, 2014, John Ralls <[hidden email] <mailto:[hidden email]>> wrote:
>
> > On Nov 11, 2014, at 1:10 PM, Sébastien de Menten <[hidden email] <javascript:;>> wrote:
> > ....
> >
> > I would be genuinely interested to have more specific documentation on the
> > risks of going the SQL way.
>
> There's nothing wrong with reading the database to generate reports. That is indeed easier for many people via SQL query than writing custom report plugins in Scheme.
>
> It may also be easier to go with the SQL than with the std python binding for reporting but also to change the GnuCash book.

It might be, but I doubt it. You won’t be able to implement the business logic in SQL alone.

>  
> The risk of writing to the database outside of GnuCash, whether in SQL or XML, is that unless you are very careful and have a deep understanding of how GnuCash works that you will irretrievably corrupt your accounting data. There is no business logic encoded in the SQL database, so your code must replicate the GnuCsah engine code to ensure that all required fields are computed and stored correctly. Much of GnuCash is neither straightforward nor obvious and some critical data are stored outside of the primary tables, usually to preserve backward compatibility with previous versions.
>
> Regards,
> John Ralls
>
>
>
> I have mainly used the basic objects from GnuCash required for basic personal finance (so no invoice, no budget, ...) and did not found any issues while handling lot of accounts/transactions/splits and stock prices. I had the impression that GnuCash does indeed calculations when the book is opened but that it does not save them in the SQL backend. Hence, if we access the book when it's not opened by GnuCash at the same time, risks are quite reduced, would this be a "wrong" impression ?
>
> Where could I find detailed documentation on the GnuCash engine (and the constrains/invariants GnuCash enforces) ? Or would there be some code/program to check a GnuCash file is "sane/consistent” ?

The developer documentation is in the sources in doxygen format; the current master documentation is compiled nightly and served at http://code.gnucash.org/docs/HEAD/ <http://code.gnucash.org/docs/HEAD/>.

GnuCash has a check and repair facility built in, most of which is run at file load. It does some sanity checking but won’t necessarily correct every possible error that an external program could make.

I’m not sure I follow you about calculations when the book is opened that aren’t saved. With the SQL backend, everything is written back to the database when a change is committed, so the *results* of the calculations are immediately saved. What GnuCash doesn’t do is *read* the database after the initial load, nor does it use any database concurrency control, so there are two potential ways to screw things up with two programs (even two instances of GnuCash) using the same database: A change made by one instance could be overwritten by a change made in the other or, much worse, the two instances could try writing the same records at the same time, corrupting those records.

Regards,
John Ralls

_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Reply | Threaded
Open this post in threaded view
|

Re: python GnuCash interface to SQL backend

Sébastien de Menten
On Thursday, November 13, 2014, John Ralls <[hidden email]> wrote:

>
> On Nov 12, 2014, at 12:08 PM, Sébastien de Menten <[hidden email]
> <javascript:_e(%7B%7D,'cvml','[hidden email]');>> wrote:
>
> On Wednesday, November 12, 2014, John Ralls <[hidden email]
> <javascript:_e(%7B%7D,'cvml','[hidden email]');>> wrote:
>
>>
>> > On Nov 11, 2014, at 1:10 PM, Sébastien de Menten <[hidden email]>
>> wrote:
>> > ....
>> >
>> > I would be genuinely interested to have more specific documentation on
>> the
>> > risks of going the SQL way.
>>
>> There's nothing wrong with reading the database to generate reports. That
>> is indeed easier for many people via SQL query than writing custom report
>> plugins in Scheme.
>>
>> It may also be easier to go with the SQL than with the std python binding
> for reporting but also to change the GnuCash book.
>
>
> It might be, but I doubt it. You won’t be able to implement the business
> logic in SQL alone.
>
>
The python interface uses SQLAlchemy (an ORM) only to handle the backend
(retrieve and save objects), all business logic is in the python code. For
instance, when creating a transaction and the related splits, it is the
python code that ensures the business logic (for instance that the sum of
the splits = 0). This is close to what GnuCash does.
Moreover, there are some basic SQL integrity constrains (we cannot remove a
split without removing the related transaction) that are added in the ORM
layer as they do not exist in the SQL backend.

>
>
>>
>> I have mainly used the basic objects from GnuCash required for
> basic personal finance (so no invoice, no budget, ...) and did not found
> any issues while handling lot of accounts/transactions/splits and stock
> prices. I had the impression that GnuCash does indeed calculations when the
> book is opened but that it does not save them in the SQL backend. Hence, if
> we access the book when it's not opened by GnuCash at the same time, risks
> are quite reduced, would this be a "wrong" impression ?
>
> I’m not sure I follow you about calculations when the book is opened that
> aren’t saved.
>
I was thinking about temporary results/cached calculations/etc that are not
saved to the back ends (if there are any).


> With the SQL backend, everything is written back to the database when a
> change is committed, so the *results* of the calculations are immediately
> saved. What GnuCash doesn’t do is *read* the database after the initial
> load, nor does it use any database concurrency control, so there are two
> potential ways to screw things up with two programs (even two instances of
> GnuCash) using the same database: A change made by one instance could be
> overwritten by a change made in the other or, much worse, the two instances
> could try writing the same records at the same time, corrupting those
> records.
>
> Indeed, if GnuCash has opened the file and is using it (ie there is a lock
in the table gnc_lock), we are almost 100% sure to have the issues you
mentions if we change the file in parallel through SQL.  There is a check
in pyscash that raises an exception in this case (it can be overruled but
at user's own risk.
_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Reply | Threaded
Open this post in threaded view
|

Re: python GnuCash interface to SQL backend

John Ralls-2

> On Nov 12, 2014, at 10:12 PM, Sébastien de Menten <[hidden email]> wrote:
>
> On Thursday, November 13, 2014, John Ralls <[hidden email] <mailto:[hidden email]>> wrote:
>
>> On Nov 12, 2014, at 12:08 PM, Sébastien de Menten <[hidden email] <>> wrote:
>>
>> On Wednesday, November 12, 2014, John Ralls <[hidden email] <>> wrote:
>>
>> > On Nov 11, 2014, at 1:10 PM, Sébastien de Menten <[hidden email] <>> wrote:
>> > ....
>> >
>> > I would be genuinely interested to have more specific documentation on the
>> > risks of going the SQL way.
>>
>> There's nothing wrong with reading the database to generate reports. That is indeed easier for many people via SQL query than writing custom report plugins in Scheme.
>>
>> It may also be easier to go with the SQL than with the std python binding for reporting but also to change the GnuCash book.
>
> It might be, but I doubt it. You won’t be able to implement the business logic in SQL alone.
>
>
> The python interface uses SQLAlchemy (an ORM) only to handle the backend (retrieve and save objects), all business logic is in the python code. For instance, when creating a transaction and the related splits, it is the python code that ensures the business logic (for instance that the sum of the splits = 0). This is close to what GnuCash does.
> Moreover, there are some basic SQL integrity constrains (we cannot remove a split without removing the related transaction) that are added in the ORM layer as they do not exist in the SQL backend.
>>  
>>
>> I have mainly used the basic objects from GnuCash required for basic personal finance (so no invoice, no budget, ...) and did not found any issues while handling lot of accounts/transactions/splits and stock prices. I had the impression that GnuCash does indeed calculations when the book is opened but that it does not save them in the SQL backend. Hence, if we access the book when it's not opened by GnuCash at the same time, risks are quite reduced, would this be a "wrong" impression ?
>>
>
> I’m not sure I follow you about calculations when the book is opened that aren’t saved.
> I was thinking about temporary results/cached calculations/etc that are not saved to the back ends (if there are any).
>  
> With the SQL backend, everything is written back to the database when a change is committed, so the *results* of the calculations are immediately saved. What GnuCash doesn’t do is *read* the database after the initial load, nor does it use any database concurrency control, so there are two potential ways to screw things up with two programs (even two instances of GnuCash) using the same database: A change made by one instance could be overwritten by a change made in the other or, much worse, the two instances could try writing the same records at the same time, corrupting those records.
>
> Indeed, if GnuCash has opened the file and is using it (ie there is a lock in the table gnc_lock), we are almost 100% sure to have the issues you mentions if we change the file in parallel through SQL.  There is a check in pyscash that raises an exception in this case (it can be overruled but at user's own risk.  


What’s your goal here? I don’t think that reimplementing GnuCash in Python with GnuCash’s SQL schema is a particularly good approach: It’s not exactly the most efficient design. Rather, it’s designed to mirror the XML schema. You’ll have a better design if you relegate GnuCash SQL to import/export.

An aside about the name: Pyscash is likely to be pronounced by English speakers with a short “i” sound where the y goes.  That conveys a rather unfortunate meaning.

Regards,
John Ralls

_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Reply | Threaded
Open this post in threaded view
|

Re: python GnuCash interface to SQL backend

John Ralls-2

On Nov 13, 2014, at 9:31 AM, Sébastien de Menten <[hidden email]> wrote:

> Indeed, it may be worth to explain what are the goals (and the limits).
>
> I have tried to use the official python bindings and had the following issues:
> - need swig + compilations to make them work => pyscash is pure python and has only sqlalchemy as main dependency (which is rather supported/standard package)
> - python binding is a mapping if C API with a thin layer python friendly layer => I do not find the resulting python scripts very pythonic
>
> Hence, instead of writing (part of the example to create accounts):
>
> s = Session("bookname", is_new=False)
>
> book = s.book
> root = book.get_root_account()
> commod_table = book.get_table()
> CAD = commod_table.lookup('CURRENCY', 'CAD')
>
> a = Account(book)
> root.append_child(a)
> a.SetName('Assets')
> a.SetType(ACCT_TYPE_ASSET)
> a.SetCommodity(CAD)
>
> a2 = Account(book)
> a.append_child(a2)
> a2.SetName('Receivables')
> a2.SetType(ACCT_TYPE_RECEIVABLE)
> a2.SetCommodity(CAD)
>
> a3 = Account(book)
> root.append_child(a3)
> a3.SetName('Income')
> a3.SetType(ACCT_TYPE_INCOME)
> a3.SetCommodity(CAD)
>
> i can now write (reusing standard SQL Alchemy patterns):
>
> s = Session("bookname")           # pyscash cannot create a Book from scratch. It relies on gnucash to do this
>
> # query from the session the Book and the CAD currency
> book = s.query(Book).one()        
> CAD = s.query(Commodity).filter_by(name="CAD").one()
>
> root = book.root_account
>
> # possibility to create an account in a pythonic way
> acc = Account(name="Assets", type=ACCT_TYPE_ASSET, commodity=CAD, parent=root_account)
>
> # possibility to create accounts directly with subaccounts
> acc_inc = Account(name="Income", type=ACCT_TYPE_ASSET, commodity=CAD,
>                   parent=root,
>                   children=[
>                       Account(name="Income subacc 1", type=ACCT_TYPE_INCOME, commodity=CAD),
>                       Account(name="Income subacc 2", type=ACCT_TYPE_INCOME, commodity=CAD),
>                       Account(name="Income subacc 3", type=ACCT_TYPE_INCOME, commodity=CAD),
> )
>
> # and then either one of the command
> s.rollback()   # to undo changes
> s.commit()    # to save/commit changes
>
> So, to sum up, goals of pyscash :
> - easy installation as pure python module (no compilation, no swig, ...)
> - easy to contribute as pure python and based on de facto standard ORM for SQL in python (sql alchemy)
> - pythonic implementation by leveraging SQL Alchemy features (transparent handling of guid, free commit/rollback/transaction support, automatic parent-children relation handling, etc)
> - pythonic interface for CRUD operations on a GnuCash Book
>
> The last point is important as the goal is not at all to reimplement the whole GnuCash engine but only to be able to manipulate (CRUD operations) the different GnuCash entities (book, account, currency, etc) in an easy pythonic way.
> For instance, I do not plan to reimplement functions like GetBalance, GetReconciledBalance, GetPresentBalanceInCurrency, GetAccountBalance, etc
>
> Is the goal of the project clearer ?
> Do you see complexities in reimplementing the pure CRUD operations (except the basic complexities of leaving a GnuCash book in a consistent status) ?
>
> On the name issue, you are definitely right ! I had in mind a pie-ess-cash pronounciation (the S for SQL) as pygnucash was already taken. Any suggestion ? pysacash (python sqlalchemy gnucash interface) ? or is it even worse :-) ?
>
> kind regards,
>
> Sebastien
>
> ps: and thank you for clarifying the announce on the user mailing list ! I should have thought about that myself to avoid confusion…

Sébastien,

Please remember to copy the list on all replies.

The goal of the project is clearer but is in my mind severely misguided. Object persistence is always a side effect of a non-trivial program, and GnuCash is a non-trivial program.

If your complaint about the python bindings is that they’re not pythonic I think that your efforts would be better spent improving that. It should become easier as we progress through the C++ rewrite which will produce a much more object-oriented architecture where that’s appropriate. That said, it’s also worth noting that C++ is a far more versatile language than either C or Python in that it supports generic and functional programming as well as the structured and OO paradigms that Python supports. In moving away from GObject we’ll also be enforcing type safety much more rigorously. Since that’s a notorious weakness (though casual hackers think it’s a feature) of Python, having thin Python wrappers around C++ objects will provide for much safer Python add ons than does the current code base.

It’s also very dangerous to attempt to select what parts of GnuCash’s internal functions to reimplement. To take your example of balances, it’s an integral part of GnuCash’s internal checks to compute and compare balances. Leaving that out greatly increases the risk of introducing corrupt data.

As for the name, how about PieCash? It doesn’t get the SQL element in there, but it avoids mispronunciation.

Regards,
John Ralls


>
>
> On Thu, Nov 13, 2014 at 4:28 PM, John Ralls <[hidden email]> wrote:
>
>> On Nov 12, 2014, at 10:12 PM, Sébastien de Menten <[hidden email]> wrote:
>>
>> On Thursday, November 13, 2014, John Ralls <[hidden email]> wrote:
>>
>>> On Nov 12, 2014, at 12:08 PM, Sébastien de Menten <[hidden email]> wrote:
>>>
>>> On Wednesday, November 12, 2014, John Ralls <[hidden email]> wrote:
>>>
>>> > On Nov 11, 2014, at 1:10 PM, Sébastien de Menten <[hidden email]> wrote:
>>> > ....
>>> >
>>> > I would be genuinely interested to have more specific documentation on the
>>> > risks of going the SQL way.
>>>
>>> There's nothing wrong with reading the database to generate reports. That is indeed easier for many people via SQL query than writing custom report plugins in Scheme.
>>>
>>> It may also be easier to go with the SQL than with the std python binding for reporting but also to change the GnuCash book.
>>
>> It might be, but I doubt it. You won’t be able to implement the business logic in SQL alone.
>>
>>
>> The python interface uses SQLAlchemy (an ORM) only to handle the backend (retrieve and save objects), all business logic is in the python code. For instance, when creating a transaction and the related splits, it is the python code that ensures the business logic (for instance that the sum of the splits = 0). This is close to what GnuCash does.
>> Moreover, there are some basic SQL integrity constrains (we cannot remove a split without removing the related transaction) that are added in the ORM layer as they do not exist in the SQL backend.
>>>  
>>>
>>> I have mainly used the basic objects from GnuCash required for basic personal finance (so no invoice, no budget, ...) and did not found any issues while handling lot of accounts/transactions/splits and stock prices. I had the impression that GnuCash does indeed calculations when the book is opened but that it does not save them in the SQL backend. Hence, if we access the book when it's not opened by GnuCash at the same time, risks are quite reduced, would this be a "wrong" impression ?
>>>
>> I’m not sure I follow you about calculations when the book is opened that aren’t saved.
>> I was thinking about temporary results/cached calculations/etc that are not saved to the back ends (if there are any).
>>  
>> With the SQL backend, everything is written back to the database when a change is committed, so the *results* of the calculations are immediately saved. What GnuCash doesn’t do is *read* the database after the initial load, nor does it use any database concurrency control, so there are two potential ways to screw things up with two programs (even two instances of GnuCash) using the same database: A change made by one instance could be overwritten by a change made in the other or, much worse, the two instances could try writing the same records at the same time, corrupting those records.
>>
>> Indeed, if GnuCash has opened the file and is using it (ie there is a lock in the table gnc_lock), we are almost 100% sure to have the issues you mentions if we change the file in parallel through SQL.  There is a check in pyscash that raises an exception in this case (it can be overruled but at user's own risk.  
>
>
> What’s your goal here? I don’t think that reimplementing GnuCash in Python with GnuCash’s SQL schema is a particularly good approach: It’s not exactly the most efficient design. Rather, it’s designed to mirror the XML schema. You’ll have a better design if you relegate GnuCash SQL to import/export.
>
> An aside about the name: Pyscash is likely to be pronounced by English speakers with a short “i” sound where the y goes.  That conveys a rather unfortunate meaning.
>
> Regards,
> John Ralls
>
>


_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Reply | Threaded
Open this post in threaded view
|

Re: python GnuCash interface to SQL backend

Sébastien de Menten

On 2014-11-13 19:25, John Ralls wrote:

> On Nov 13, 2014, at 9:31 AM, Sébastien de Menten <[hidden email]> wrote:
>
>> Indeed, it may be worth to explain what are the goals (and the limits).
>>
>> I have tried to use the official python bindings and had the following issues:
>> - need swig + compilations to make them work => pyscash is pure python and has only sqlalchemy as main dependency (which is rather supported/standard package)
>> - python binding is a mapping if C API with a thin layer python friendly layer => I do not find the resulting python scripts very pythonic
>>
>> [...]
>> So, to sum up, goals of pyscash :
>> - easy installation as pure python module (no compilation, no swig, ...)
>> - easy to contribute as pure python and based on de facto standard ORM for SQL in python (sql alchemy)
>> - pythonic implementation by leveraging SQL Alchemy features (transparent handling of guid, free commit/rollback/transaction support, automatic parent-children relation handling, etc)
>> - pythonic interface for CRUD operations on a GnuCash Book
>>
>> The last point is important as the goal is not at all to reimplement the whole GnuCash engine but only to be able to manipulate (CRUD operations) the different GnuCash entities (book, account, currency, etc) in an easy pythonic way.
>> For instance, I do not plan to reimplement functions like GetBalance, GetReconciledBalance, GetPresentBalanceInCurrency, GetAccountBalance, etc
>>
>> Is the goal of the project clearer ?
>> Do you see complexities in reimplementing the pure CRUD operations (except the basic complexities of leaving a GnuCash book in a consistent status) ?
>>
>> On the name issue, you are definitely right ! I had in mind a pie-ess-cash pronounciation (the S for SQL) as pygnucash was already taken. Any suggestion ? pysacash (python sqlalchemy gnucash interface) ? or is it even worse :-) ?
>>
>> kind regards,
>>
>> Sebastien
>>
>> ps: and thank you for clarifying the announce on the user mailing list ! I should have thought about that myself to avoid confusion…
> Sébastien,
>
> Please remember to copy the list on all replies.
>
> The goal of the project is clearer but is in my mind severely misguided. Object persistence is always a side effect of a non-trivial program, and GnuCash is a non-trivial program.
If you see GnuCash from the (limited) perspective of an editor for a
Book document (as LibreOffice Writer is an editor for a ODT document),
object persistence is central. And luckily we have both a very clean and
well designed object model in GnuCash as well as standard persistence
formats (xml, SQL  and not an obscure binary format).

The GnuCash documentation states that the XML document can be used to
get the GnuCash data in whatever other format (see "Converting XML
GnuCash File"
http://www.gnucash.org/docs/v2.6/C/gnucash-guide/appendixa_xmlconvert1.html)
through XLST transformation and vice-versa (so generating a
new/transformed gnucash XML file from a LibreOffice spreadsheet).

PieCash (I like your name !) aims to fulfill exactly this purpose, no
more, no less. As with the XLST transforms, it allows to do CRUD
operations on the objects within a GnuCash Book.

Has this stance on the manipulation outside GnuCash of a GnuCash
document evolved since it was written ? Would this still be supported
after the C++ rewrite ?
>   your complaint about the python bindings is that they’re not pythonic I think that your efforts would be better spent improving that. It should become easier as we progress through the C++ rewrite which will produce a much more object-oriented architecture where that’s appropriate. That said, it’s also worth noting that C++ is a far more versatile language than either C or Python in that it supports generic and functional programming as well as the structured and OO paradigms that Python supports. In moving away from GObject we’ll also be enforcing type safety much more rigorously. Since that’s a notorious weakness (though casual hackers think it’s a feature) of Python, having thin Python wrappers around C++ objects will provide for much safer Python add ons than does the current code base.
The python binding are not pure python bindings (and will probably never
be as they should interface C or C++ code). This makes them not easily
accessible on Windows platforms (complexity in compilation) and more
complex to hack (for non C/C++ programmers). However, they offer the
ability to call any function of the engine and to be independent of the
backend (XML, SQL,...) as long as it is interfaced so may be the only
solution in some cases.

On the other side, PieCash is pure python. The code required to
interface an entity is almost trivial. For instance, the full interface
to the Commodity entity is

     class Commodity(DeclarativeBaseGuid):
         __tablename__ = 'commodities'
         __table_args__ = {}

         # column definitions
         cusip = Column('cusip', TEXT(length=2048))
         fraction = Column('fraction', INTEGER(), nullable=False)
         fullname = Column('fullname', TEXT(length=2048))
         mnemonic = Column('mnemonic', TEXT(length=2048), nullable=False)
         namespace = Column('namespace', TEXT(length=2048), nullable=False)
         quote_flag = Column('quote_flag', INTEGER(), nullable=False)
         quote_source = Column('quote_source', TEXT(length=2048))
         quote_tz = Column('quote_tz', TEXT(length=2048))

All the rest is handled automatically thanks to the SQLAlchemy layer
(link between objects, cascade delete, locking and transactions,
generation of GUID key, ...).

So, I admit, the main effort in this project consisted in writing for
each entity (book, account, etc) the equivalent here above code as well
as the relationships between entities. The later is done with a syntax like

     class Split(DeclarativeBaseGuid):
         [...]
         account = relation('Account', backref=backref('splits',
cascade='all, delete-orphan'))


> It’s also very dangerous to attempt to select what parts of GnuCash’s internal functions to reimplement. To take your example of balances, it’s an integral part of GnuCash’s internal checks to compute and compare balances. Leaving that out greatly increases the risk of introducing corrupt data.
I may have not be totally clear in my previous email but I do not indeed
see any common sense in implementing the Balance calculations in
PieCash. These results are part of the GnuCash application logic and are
not saved in the GnuCash XML/SQL files. They are neither part of the
object model itself but transient calculations.
> As for the name, how about PieCash? It doesn’t get the SQL element in there, but it avoids mispronunciation.
As already written here above, I like it !
> Regards,
> John Ralls
>

_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Reply | Threaded
Open this post in threaded view
|

Re: python GnuCash interface to SQL backend

John Ralls-2

On Nov 13, 2014, at 12:44 PM, Sébastien de Menten <[hidden email]> wrote:

>
> On 2014-11-13 19:25, John Ralls wrote:
>> On Nov 13, 2014, at 9:31 AM, Sébastien de Menten <[hidden email]> wrote:
>>
>>> Indeed, it may be worth to explain what are the goals (and the limits).
>>>
>>> I have tried to use the official python bindings and had the following issues:
>>> - need swig + compilations to make them work => pyscash is pure python and has only sqlalchemy as main dependency (which is rather supported/standard package)
>>> - python binding is a mapping if C API with a thin layer python friendly layer => I do not find the resulting python scripts very pythonic
>>>
>>> [...]
>>> So, to sum up, goals of pyscash :
>>> - easy installation as pure python module (no compilation, no swig, ...)
>>> - easy to contribute as pure python and based on de facto standard ORM for SQL in python (sql alchemy)
>>> - pythonic implementation by leveraging SQL Alchemy features (transparent handling of guid, free commit/rollback/transaction support, automatic parent-children relation handling, etc)
>>> - pythonic interface for CRUD operations on a GnuCash Book
>>>
>>> The last point is important as the goal is not at all to reimplement the whole GnuCash engine but only to be able to manipulate (CRUD operations) the different GnuCash entities (book, account, currency, etc) in an easy pythonic way.
>>> For instance, I do not plan to reimplement functions like GetBalance, GetReconciledBalance, GetPresentBalanceInCurrency, GetAccountBalance, etc
>>>
>>> Is the goal of the project clearer ?
>>> Do you see complexities in reimplementing the pure CRUD operations (except the basic complexities of leaving a GnuCash book in a consistent status) ?
>>>
>>> On the name issue, you are definitely right ! I had in mind a pie-ess-cash pronounciation (the S for SQL) as pygnucash was already taken. Any suggestion ? pysacash (python sqlalchemy gnucash interface) ? or is it even worse :-) ?
>>>
>>> kind regards,
>>>
>>> Sebastien
>>>
>>> ps: and thank you for clarifying the announce on the user mailing list ! I should have thought about that myself to avoid confusion…
>> Sébastien,
>>
>> Please remember to copy the list on all replies.
>>
>> The goal of the project is clearer but is in my mind severely misguided. Object persistence is always a side effect of a non-trivial program, and GnuCash is a non-trivial program.
> If you see GnuCash from the (limited) perspective of an editor for a Book document (as LibreOffice Writer is an editor for a ODT document), object persistence is central. And luckily we have both a very clean and well designed object model in GnuCash as well as standard persistence formats (xml, SQL  and not an obscure binary format).

I don’t share either opinion, especially about the object model in GnuCash.

>
> The GnuCash documentation states that the XML document can be used to get the GnuCash data in whatever other format (see "Converting XML GnuCash File" http://www.gnucash.org/docs/v2.6/C/gnucash-guide/appendixa_xmlconvert1.html) through XLST transformation and vice-versa (so generating a new/transformed gnucash XML file from a LibreOffice spreadsheet).

No matter what the GnuCash documentation might say, only someone with a deep understanding of GnuCash could successfully create a correct GnuCash data file from a spreadsheet document using XSLT. I’m not familiar with LibreOffice’s ODS format, but I expect that it would take deep knowledge of that format to successfully extract useful data from it with XSLT as well.

>
> PieCash (I like your name !) aims to fulfill exactly this purpose, no more, no less. As with the XLST transforms, it allows to do CRUD operations on the objects within a GnuCash Book.

CRUD operations on a GnuCash database will corrupt it. The GnuCash schema is not normalized, and not all of the necessary data is stored in the table associated with the objects.

>
> Has this stance on the manipulation outside GnuCash of a GnuCash document evolved since it was written ? Would this still be supported after the C++ rewrite ?

It isn’t supported now. It never has, and it is unlikely that it ever will be, even if we are able, after several development cycles, to actually migrate to a 3N database schema. There is too much logic that is encoded in the program and which cannot be portably encoded in a SQL database, to make that feasible.

>>  your complaint about the python bindings is that they’re not pythonic I think that your efforts would be better spent improving that. It should become easier as we progress through the C++ rewrite which will produce a much more object-oriented architecture where that’s appropriate. That said, it’s also worth noting that C++ is a far more versatile language than either C or Python in that it supports generic and functional programming as well as the structured and OO paradigms that Python supports. In moving away from GObject we’ll also be enforcing type safety much more rigorously. Since that’s a notorious weakness (though casual hackers think it’s a feature) of Python, having thin Python wrappers around C++ objects will provide for much safer Python add ons than does the current code base.
> The python binding are not pure python bindings (and will probably never be as they should interface C or C++ code). This makes them not easily accessible on Windows platforms (complexity in compilation) and more complex to hack (for non C/C++ programmers). However, they offer the ability to call any function of the engine and to be independent of the backend (XML, SQL,...) as long as it is interfaced so may be the only solution in some cases.
>
> On the other side, PieCash is pure python. The code required to interface an entity is almost trivial. For instance, the full interface to the Commodity entity is
>
>    class Commodity(DeclarativeBaseGuid):
>        __tablename__ = 'commodities'
>        __table_args__ = {}
>
>        # column definitions
>        cusip = Column('cusip', TEXT(length=2048))
>        fraction = Column('fraction', INTEGER(), nullable=False)
>        fullname = Column('fullname', TEXT(length=2048))
>        mnemonic = Column('mnemonic', TEXT(length=2048), nullable=False)
>        namespace = Column('namespace', TEXT(length=2048), nullable=False)
>        quote_flag = Column('quote_flag', INTEGER(), nullable=False)
>        quote_source = Column('quote_source', TEXT(length=2048))
>        quote_tz = Column('quote_tz', TEXT(length=2048))
>
> All the rest is handled automatically thanks to the SQLAlchemy layer (link between objects, cascade delete, locking and transactions, generation of GUID key, …).

Won’t work. SQLAlchemy can’t automatically generate a correct class from the schema nor can it derive a correct table description from the C headers.

You’ll get closer working with the XML schema. There’s a reasonably up-to-date version in src/doc/xml/gnucash-v2.rnc.

>
> So, I admit, the main effort in this project consisted in writing for each entity (book, account, etc) the equivalent here above code as well as the relationships between entities. The later is done with a syntax like
>
>    class Split(DeclarativeBaseGuid):
>        [...]
>        account = relation('Account', backref=backref('splits', cascade='all, delete-orphan'))
>
>
>> It’s also very dangerous to attempt to select what parts of GnuCash’s internal functions to reimplement. To take your example of balances, it’s an integral part of GnuCash’s internal checks to compute and compare balances. Leaving that out greatly increases the risk of introducing corrupt data.
> I may have not be totally clear in my previous email but I do not indeed see any common sense in implementing the Balance calculations in PieCash. These results are part of the GnuCash application logic and are not saved in the GnuCash XML/SQL files. They are neither part of the object model itself but transient calculations.

You’re confusing the object model with the persistence model. They’re rather different.

Regards,
John Ralls



_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Reply | Threaded
Open this post in threaded view
|

Re: python GnuCash interface to SQL backend

Derek Atkins
In reply to this post by Sébastien de Menten
Sébastien de Menten <[hidden email]> writes:

> Where could I find detailed documentation on the GnuCash engine (and the
> constrains/invariants GnuCash enforces) ? Or would there be some
> code/program to check a GnuCash file is "sane/consistent" ?

Only in the implementation.  There is no documentation per se on this,
because we do not support modification of the database from outside the
GnuCash APIs.  This also allows us to change the underlying storage
mechanisms without breaking things, because it's all abstracted.  By
re-implementing it you're basically binding yourself to a particular
version of the database schema, which can (and will) change over time,
requiring you to duplicate the effort already happening in the gnucash
code.

Moreover, you're also tied to a particular backend, which isn't very
nice.

If you want to modify the gnucash database, you really should use the
exported GnuCash APIs.

If the current python bindings aren't pythonic enough for you, then I
urge you to spend the time to fix that instead of reimplementing
something that will absolutely break some time in the future.  When it
will break I cannot tell you, but I can assure you it WILL break at some
point.

> Regards
>
> Sebastien

-derek

--
       Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
       Member, MIT Student Information Processing Board  (SIPB)
       URL: http://web.mit.edu/warlord/    PP-ASEL-IA     N1NWH
       [hidden email]                        PGP key available

_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Reply | Threaded
Open this post in threaded view
|

Re: python GnuCash interface to SQL backend

Derek Atkins
In reply to this post by John Ralls-2
John Ralls <[hidden email]> writes:

> What’s your goal here? I don’t think that reimplementing GnuCash in
> Python with GnuCash’s SQL schema is a particularly good approach: It’s
> not exactly the most efficient design. Rather, it’s designed to mirror
> the XML schema. You’ll have a better design if you relegate GnuCash
> SQL to import/export.

I agree wholeheartedly.  Don't do it this way.  Use the GnuCash Python
bindings.  If you don't like the current structure of them, then fix
that.  This way your apps will always work because the bindings will
stay in lockstep with any changes that get made.

> An aside about the name: Pyscash is likely to be pronounced by English
> speakers with a short “i” sound where the y goes.  That conveys a
> rather unfortunate meaning.

Indeed, I would pronounce is "piss cash".

> Regards,
> John Ralls

-derek
--
       Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
       Member, MIT Student Information Processing Board  (SIPB)
       URL: http://web.mit.edu/warlord/    PP-ASEL-IA     N1NWH
       [hidden email]                        PGP key available

_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Reply | Threaded
Open this post in threaded view
|

Re: python GnuCash interface to SQL backend

Sébastien de Menten
In reply to this post by John Ralls-2
First of all, thank you John for taking the time to answer to this thread !


> > If you see GnuCash from the (limited) perspective of an editor for a
> Book document (as LibreOffice Writer is an editor for a ODT document),
> object persistence is central. And luckily we have both a very clean and
> well designed object model in GnuCash as well as standard persistence
> formats (xml, SQL  and not an obscure binary format).
>
> I don’t share either opinion, especially about the object model in GnuCash.
>
> When talking about the object model, I am referring to the entities
(Account, Book, Split, ...), their relation and attributes. And for most of
the "core" objects (excluding company, invoice, ...) it is rather clean and
efficient. Personnally, I think it could be a great basis for an open
document format for this domain of application (better than QIF & co).

In terms of the implementation itself of the object model, the main things
I see not that clean are:
- KVP vs field representation
- XXX_denom / XXX_num split (instead of using a Decimal type)
But these two points should be hidden for the user/developper in python
bindings (official or piecash).

>
> > The GnuCash documentation states that the XML document can be used to
> get the GnuCash data in whatever other format (see "Converting XML GnuCash
> File"
> http://www.gnucash.org/docs/v2.6/C/gnucash-guide/appendixa_xmlconvert1.html)
> through XLST transformation and vice-versa (so generating a new/transformed
> gnucash XML file from a LibreOffice spreadsheet).
>
> No matter what the GnuCash documentation might say, only someone with a
> deep understanding of GnuCash could successfully create a correct GnuCash
> data file from a spreadsheet document using XSLT. I’m not familiar with
> LibreOffice’s ODS format, but I expect that it would take deep knowledge of
> that format to successfully extract useful data from it with XSLT as well.
>

I agree that creating GnuCash file through XSLT is probably clause to be a
nightmare ... that is why I quickly dropped the XML route (on ODS, I just
took what was written in the GnuCash doc, and indeed interacting with such
document require some library).


> >
> > PieCash (I like your name !) aims to fulfill exactly this purpose, no
> more, no less. As with the XLST transforms, it allows to do CRUD operations
> on the objects within a GnuCash Book.
>
> CRUD operations on a GnuCash database will corrupt it. The GnuCash schema
> is not normalized, and not all of the necessary data is stored in the table
> associated with the objects.
>
>
"CRUD operations" is not to be taken as "pure CRUD operations", there
should always be in piecash a check that the set of objects is
self-consistent (but this is really simple, obvious, natural constrains if
one know a bit the GnuCash object model)

Could you point me to 2 or 3 real case example we would nevertheless end up
corrupting the database ? It would really help me to materialise this
"risk".
With my practical experience with piecash (creating a full account tree
structure and importing thousands of transactions from csv file), I haven't
found any case of corruption (except when developing the API and having
"obvious" bugs).


> >
> > Has this stance on the manipulation outside GnuCash of a GnuCash
> document evolved since it was written ? Would this still be supported after
> the C++ rewrite ?
>
> It isn’t supported now. It never has, and it is unlikely that it ever will
> be, even if we are able, after several development cycles, to actually
> migrate to a 3N database schema. There is too much logic that is encoded in
> the program and which cannot be portably encoded in a SQL database, to make
> that feasible.
>
>
For piecash, there is no need for a 3N db schema and there is no need to
write any logic in SQL !
It is through SQLAlchemy that we can handle/encapsulate the extra bit of
logic (which is, as far as I am in the development of piecash and given
piecash scope, rather limited). Nothing is done in SQL itself.

 >

> > All the rest is handled automatically thanks to the SQLAlchemy layer
> (link between objects, cascade delete, locking and transactions, generation
> of GUID key, …).
>
> Won’t work. SQLAlchemy can’t automatically generate a correct class from
> the schema nor can it derive a correct table description from the C headers.
>
>
The table schema is exactly the only piece of code that needs to be written
(piecash is in fact just that, table schemas with some metadata).
I do not hope/expect/think to generate this code automatically from C
headers (as SWIG does).


> You’ll get closer working with the XML schema. There’s a reasonably
> up-to-date version in src/doc/xml/gnucash-v2.rnc.
>
>
As written here above, XML is not easy to work with (at least for me) and
does not save me of anything you said before (corruption, etc).


> >
> > So, I admit, the main effort in this project consisted in writing for
> each entity (book, account, etc) the equivalent here above code as well as
> the relationships between entities. The later is done with a syntax like
> >
> >    class Split(DeclarativeBaseGuid):
> >        [...]
> >        account = relation('Account', backref=backref('splits',
> cascade='all, delete-orphan'))
> >
> >
> >> It’s also very dangerous to attempt to select what parts of GnuCash’s
> internal functions to reimplement. To take your example of balances, it’s
> an integral part of GnuCash’s internal checks to compute and compare
> balances. Leaving that out greatly increases the risk of introducing
> corrupt data.
> > I may have not be totally clear in my previous email but I do not indeed
> see any common sense in implementing the Balance calculations in PieCash.
> These results are part of the GnuCash application logic and are not saved
> in the GnuCash XML/SQL files. They are neither part of the object model
> itself but transient calculations.
>
> You’re confusing the object model with the persistence model. They’re
> rather different.
>
>
I understand "object model" as the entities+relations+fields model and its
concrete implementation, the "persistence model" as the table schema. Is it
also what you are referring to ? If so, could you elaborate shortly on your
comment ? I can't succeed in making the link between my comment and yours.

and thank you again for your time !
_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Reply | Threaded
Open this post in threaded view
|

Re: python GnuCash interface to SQL backend

Sébastien de Menten
In reply to this post by Derek Atkins
On Fri, Nov 14, 2014 at 3:33 AM, Derek Atkins <[hidden email]> wrote:

> John Ralls <[hidden email]> writes:
>
> > What’s your goal here? I don’t think that reimplementing GnuCash in
> > Python with GnuCash’s SQL schema is a particularly good approach: It’s
> > not exactly the most efficient design. Rather, it’s designed to mirror
> > the XML schema. You’ll have a better design if you relegate GnuCash
> > SQL to import/export.
>
> I agree wholeheartedly.  Don't do it this way.  Use the GnuCash Python
> bindings.  If you don't like the current structure of them, then fix
> that.  This way your apps will always work because the bindings will
> stay in lockstep with any changes that get made.
>
> Hello Derek,

The GnuCash python bindings are C/SWIG based. This causes some issues on
windows, and requires deep knowledge of C, SWIG and the GnuCash C api to
contribute to.

The piecash python bindings are a pure python package ("pip install
piecash" and you're up and running) and works on the SQL tables through the
SQL Alchemy library. It is only 500 SLOC today (and may grow in the future
but not by an order of magnitude). As it is short and in python, it is
rather easy to contribute/hack/extend.
_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Reply | Threaded
Open this post in threaded view
|

Re: python GnuCash interface to SQL backend

Sébastien de Menten
In reply to this post by Derek Atkins
On Fri, Nov 14, 2014 at 3:31 AM, Derek Atkins <[hidden email]> wrote:

> Sébastien de Menten <[hidden email]> writes:
>
> > Where could I find detailed documentation on the GnuCash engine (and the
> > constrains/invariants GnuCash enforces) ? Or would there be some
> > code/program to check a GnuCash file is "sane/consistent" ?
>
> Only in the implementation.  There is no documentation per se on this,
> because we do not support modification of the database from outside the
> GnuCash APIs.  This also allows us to change the underlying storage
> mechanisms without breaking things, because it's all abstracted.  By
> re-implementing it you're basically binding yourself to a particular
> version of the database schema, which can (and will) change over time,
> requiring you to duplicate the effort already happening in the gnucash
> code.
>
> Moreover, you're also tied to a particular backend, which isn't very
> nice.
>
>
Indeed, piecash is only applicable to the SQL backend and depends on the
version of the "SQL schema" used by GnuCash. But as GnuCash is rather good
at keeping backward compatibility, my fears are not so high in this respect
(we can a minor changes but that is ok). I can add a check on the different
rows of the VERSION table to ensure explicitly that the schema has not
changed.

A major rewrite of GnuCash and/or the SQL backend would require a major
rewrite of piecash, but that's OK, we are talking about 500 SLOC.

If you want to modify the gnucash database, you really should use the
> exported GnuCash APIs.
>
> If the current python bindings aren't pythonic enough for you, then I
> urge you to spend the time to fix that instead of reimplementing
> something that will absolutely break some time in the future.  When it
> will break I cannot tell you, but I can assure you it WILL break at some
> point.
>
>
I agree, it requires indeed to follow the evolution of GnuCash SQL schema.
But these changes are well documented in the changelog.
The official python bindings are C/SWIG based and are more complex to
understand than the 500 SLOC of piecash. I have a preference to work with
piecash even if I understand that it fragments the python bindings
situation. But ok, I can't forbid myself to scratch this itch :-)

kind regards,

Sebastien
_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Reply | Threaded
Open this post in threaded view
|

Re: python GnuCash interface to SQL backend

John Ralls-2
In reply to this post by Sébastien de Menten

> On Nov 14, 2014, at 4:28 AM, Sébastien de Menten <[hidden email]> wrote:
>
> First of all, thank you John for taking the time to answer to this thread !
>
>
> > If you see GnuCash from the (limited) perspective of an editor for a Book document (as LibreOffice Writer is an editor for a ODT document), object persistence is central. And luckily we have both a very clean and well designed object model in GnuCash as well as standard persistence formats (xml, SQL  and not an obscure binary format).
>
> I don’t share either opinion, especially about the object model in GnuCash.
>
> When talking about the object model, I am referring to the entities (Account, Book, Split, ...), their relation and attributes. And for most of the "core" objects (excluding company, invoice, ...) it is rather clean and efficient. Personnally, I think it could be a great basis for an open document format for this domain of application (better than QIF & co).
>
> In terms of the implementation itself of the object model, the main things I see not that clean are:
> - KVP vs field representation
> - XXX_denom / XXX_num split (instead of using a Decimal type)
> But these two points should be hidden for the user/developper in python bindings (official or piecash).

I experimented with the available decimal libraries over the summer. They’re mostly too slow and they don’t afford enough control over rounding to be sufficiently accurate for accounting use.

The object model as it stands now has too much interdependence between classes, especially the transaction, split, account, and commodity classes. The implementation has a lot of

>
> >
> > The GnuCash documentation states that the XML document can be used to get the GnuCash data in whatever other format (see "Converting XML GnuCash File" http://www.gnucash.org/docs/v2.6/C/gnucash-guide/appendixa_xmlconvert1.html <http://www.gnucash.org/docs/v2.6/C/gnucash-guide/appendixa_xmlconvert1.html>) through XLST transformation and vice-versa (so generating a new/transformed gnucash XML file from a LibreOffice spreadsheet).
>
> No matter what the GnuCash documentation might say, only someone with a deep understanding of GnuCash could successfully create a correct GnuCash data file from a spreadsheet document using XSLT. I’m not familiar with LibreOffice’s ODS format, but I expect that it would take deep knowledge of that format to successfully extract useful data from it with XSLT as well.
>
> I agree that creating GnuCash file through XSLT is probably clause to be a nightmare ... that is why I quickly dropped the XML route (on ODS, I just took what was written in the GnuCash doc, and indeed interacting with such document require some library).
>
>
> >
> > PieCash (I like your name !) aims to fulfill exactly this purpose, no more, no less. As with the XLST transforms, it allows to do CRUD operations on the objects within a GnuCash Book.
>
> CRUD operations on a GnuCash database will corrupt it. The GnuCash schema is not normalized, and not all of the necessary data is stored in the table associated with the objects.
>
>
> "CRUD operations" is not to be taken as "pure CRUD operations", there should always be in piecash a check that the set of objects is self-consistent (but this is really simple, obvious, natural constrains if one know a bit the GnuCash object model)
>
> Could you point me to 2 or 3 real case example we would nevertheless end up corrupting the database ? It would really help me to materialise this "risk".
> With my practical experience with piecash (creating a full account tree structure and importing thousands of transactions from csv file), I haven't found any case of corruption (except when developing the API and having "obvious" bugs).

I can’t provide concrete examples without doing an extensive code review of piecash, for which I have neither the time nor the inclination. Some obvious trouble spots include cross-commodity transactions, especially involving lots or trading accounts.

Have you tested with bad data to see if piecash rejects it? Did you thoroughly analyze the ways that bad data could be created and ensure that you have test cases proving that piecash rejects all of them?

> >
> > Has this stance on the manipulation outside GnuCash of a GnuCash document evolved since it was written ? Would this still be supported after the C++ rewrite ?
>
> It isn’t supported now. It never has, and it is unlikely that it ever will be, even if we are able, after several development cycles, to actually migrate to a 3N database schema. There is too much logic that is encoded in the program and which cannot be portably encoded in a SQL database, to make that feasible.
>
>
> For piecash, there is no need for a 3N db schema and there is no need to write any logic in SQL !
> It is through SQLAlchemy that we can handle/encapsulate the extra bit of logic (which is, as far as I am in the development of piecash and given piecash scope, rather limited). Nothing is done in SQL itself.

I have trouble believing that an ORM will generate a correct implementation with a non-normal schema.

>
>  >
> > All the rest is handled automatically thanks to the SQLAlchemy layer (link between objects, cascade delete, locking and transactions, generation of GUID key, …).
>
> Won’t work. SQLAlchemy can’t automatically generate a correct class from the schema nor can it derive a correct table description from the C headers.
>
>
> The table schema is exactly the only piece of code that needs to be written (piecash is in fact just that, table schemas with some metadata).
> I do not hope/expect/think to generate this code automatically from C headers (as SWIG does).

You’re not being consistent. Are you using SQLAlchemy as an ORM or simply as a SQL abstraction layer? You said above that nothing is done in SQL, and earlier that you are doing consistency checks, but here you’re saying that all you’ve written is a table-schema, which would imply that you’re relying on the database to do all of the work.

>  
> You’ll get closer working with the XML schema. There’s a reasonably up-to-date version in src/doc/xml/gnucash-v2.rnc.
>
>
> As written here above, XML is not easy to work with (at least for me) and does not save me of anything you said before (corruption, etc).

Sounds like a learning opportunity.

>  
> >
> > So, I admit, the main effort in this project consisted in writing for each entity (book, account, etc) the equivalent here above code as well as the relationships between entities. The later is done with a syntax like
> >
> >    class Split(DeclarativeBaseGuid):
> >        [...]
> >        account = relation('Account', backref=backref('splits', cascade='all, delete-orphan'))
> >
> >
> >> It’s also very dangerous to attempt to select what parts of GnuCash’s internal functions to reimplement. To take your example of balances, it’s an integral part of GnuCash’s internal checks to compute and compare balances. Leaving that out greatly increases the risk of introducing corrupt data.
> > I may have not be totally clear in my previous email but I do not indeed see any common sense in implementing the Balance calculations in PieCash. These results are part of the GnuCash application logic and are not saved in the GnuCash XML/SQL files. They are neither part of the object model itself but transient calculations.
>
> You’re confusing the object model with the persistence model. They’re rather different.
>
>
> I understand "object model" as the entities+relations+fields model and its concrete implementation, the "persistence model" as the table schema. Is it also what you are referring to ? If so, could you elaborate shortly on your comment ? I can't succeed in making the link between my comment and yours.
>
> and thank you again for your time !
>

The object model is the class hierarchy, with its member variables and functions, expressed in C and Scheme. In a proper database-based application there would be a close mapping between the two, but GnuCash isn’t yet one of those. In a proper OO design, everything that directly affects an object’s state would be encapsulated in the cmember functions of the object’s class. GnuCash isn’t one of those either: State-changing code is spread throughout the program.

_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Reply | Threaded
Open this post in threaded view
|

Re: python GnuCash interface to SQL backend

Sébastien de Menten
On Friday, November 14, 2014, John Ralls <[hidden email]> wrote:

>
> On Nov 14, 2014, at 4:28 AM, Sébastien de Menten <[hidden email]
> <javascript:_e(%7B%7D,'cvml','[hidden email]');>> wrote:
>
>
> In terms of the implementation itself of the object model, the main things
> I see not that clean are:
> - KVP vs field representation
> - XXX_denom / XXX_num split (instead of using a Decimal type)
> But these two points should be hidden for the user/developper in python
> bindings (official or piecash).
>
>
> I experimented with the available decimal libraries over the summer.
> They’re mostly too slow and they don’t afford enough control over rounding
> to be sufficiently accurate for accounting use.
>
> The object model as it stands now has too much interdependence between
> classes, especially the transaction, split, account, and commodity classes.
> The implementation has a lot of
>
>
When looking at the SQL/XML document, what would
be the unneeded interdependencies ? The links between
account->split->transaction looks meaningful to me. The links
transaction->currency and account->commodity also.
Is it the issues with the scu/denom/num that is cumbersome to handle (if
there is a change in the scu of a currency, all splits related to it should
be updated) ?



> Could you point me to 2 or 3 real case example we would nevertheless end
> up corrupting the database ? It would really help me to materialise this
> "risk".
> With my practical experience with piecash (creating a full account tree
> structure and importing thousands of transactions from csv file), I haven't
> found any case of corruption (except when developing the API and having
> "obvious" bugs).
>
>
> I can’t provide concrete examples without doing an extensive code review
> of piecash, for which I have neither the time nor the inclination. Some
> obvious trouble spots include cross-commodity transactions, especially
> involving lots or trading accounts.
>

> Have you tested with bad data to see if piecash rejects it? Did you
> thoroughly analyze the ways that bad data could be created and ensure that
> you have test cases proving that piecash rejects all of them?
>
> Well, at first, I wanted to have a simple way to extract data out of
GnuCash (a read-only mode). This was easy to do with SQL alchemy and is
safe.
Afterwards, I wanted to be able to modify a GnuCash book knowing what I was
doing (ie being cautious to not create inconsistent objects. This was also
ok (as said, I have used piecash to create automatically a complex account
structure from an excel file and to import thousands of records without
errors).
Next is the ability to ensure consistency checks/error detection when the
user does changes. This is the less mature part and I hear well your
warnings about the complexity of this part.


> For piecash, there is no need for a 3N db schema and there is no need to
> write any logic in SQL !
> It is through SQLAlchemy that we can handle/encapsulate the extra bit of
> logic (which is, as far as I am in the development of piecash and given
> piecash scope, rather limited). Nothing is done in SQL itself.
>
>
> I have trouble believing that an ORM will generate a correct
> implementation with a non-normal schema.
>
> In piecash, the ORM does not generate the SQL schema as it already exists
(it is the one defined by GnuCash). It just maps the existing schema (that
needs to be redescribed in python) to python objects transparently handling
type conversion, transactions, guids, relationships, etc

>
> The table schema is exactly the only piece of code that needs to be
> written (piecash is in fact just that, table schemas with some metadata).
> I do not hope/expect/think to generate this code automatically from C
> headers (as SWIG does).
>
>
> You’re not being consistent. Are you using SQLAlchemy as an ORM or simply
> as a SQL abstraction layer? You said above that nothing is done in SQL, and
> earlier that you are doing consistency checks, but here you’re saying that
> all you’ve written is a table-schema, which would imply that you’re relying
> on the database to do all of the work.
>

>
> I am using sqlalchemy as a mapper and as a session manager (unit of work
pattern). Nothing is done in SQL (meaning I do not write SQL queries nor
SQL stored procedures).
A cascade delete constrain (if I remove a transaction, all related splits
are automatically deleted) is also managed by SA.
So I am "essentially" writing a mapping and then adding some methods to
either create objects in a consistent way (like having a function
"create_transaction(from_acc, to_acc, amount, date, description)" that
create always correct objects) or to check objects are consistent before
commit (this is not yet done)


>
> You’ll get closer working with the XML schema. There’s a reasonably
>> up-to-date version in src/doc/xml/gnucash-v2.rnc.
>>
>>
> As written here above, XML is not easy to work with (at least for me) and
> does not save me of anything you said before (corruption, etc).
>
>
> Sounds like a learning opportunity.
>

But will this save me from all the data corruption issues you mention ? If
not, I prefer working with python objects than XML. If so, how is this done
?

The object model is the class hierarchy, with its member variables and
> functions, expressed in C and Scheme. In a proper database-based
> application there would be a close mapping between the two, but GnuCash
> isn’t yet one of those. In a proper OO design, everything that directly
> affects an object’s state would be encapsulated in the cmember functions of
> the object’s class. GnuCash isn’t one of those either: State-changing code
> is spread throughout the program.
>
> If we take a document centric view instead of an application centric view,
the only important element is the data (and not the implementation of the
engine/class hierarchy/...).
As long as another program changes the data while keeping the
GnuCash schema and related invariants, we do not really care about the
program being OO or not, with encapsulation or not,... In fact, we do not
need the full power of the real GnuCash engine to manipulate (for simple
cases) the data.

I understand from your comment that the status of the GnuCash code is not
optimal and you're working on a clean rewrite in C++. This can be an extra
argument to try to have python bindings not linked to the GnuCash C code
but only relying on the GnuCash data.

Kind regards
Sebastien
_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Reply | Threaded
Open this post in threaded view
|

Re: python GnuCash interface to SQL backend

Sébastien de Menten
In reply to this post by John Ralls-2
Hello John,

I have put at this address
https://github.com/sdementen/piecash/blob/master/docs/source/object_model.rst
what I understood from the object model of GnuCash
(schema/fields/invariants).
I have also added some questions regarding the objects for which you may
have the answer... (or these may be somewhere in GnuCash docs I could not
find)

I would be keen to get your opinion on this document.

kind regards,

Sebastien
_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Reply | Threaded
Open this post in threaded view
|

Re: python GnuCash interface to SQL backend

John Ralls-2
In reply to this post by Sébastien de Menten

> On Nov 14, 2014, at 3:18 PM, Sébastien de Menten <[hidden email]> wrote:
>
>
>
> On Friday, November 14, 2014, John Ralls <[hidden email] <mailto:[hidden email]>> wrote:
>
>> On Nov 14, 2014, at 4:28 AM, Sébastien de Menten <[hidden email] <javascript:_e(%7B%7D,'cvml','[hidden email]');>> wrote:
>>
>>
>> In terms of the implementation itself of the object model, the main things I see not that clean are:
>> - KVP vs field representation
>> - XXX_denom / XXX_num split (instead of using a Decimal type)
>> But these two points should be hidden for the user/developper in python bindings (official or piecash).
>
> I experimented with the available decimal libraries over the summer. They’re mostly too slow and they don’t afford enough control over rounding to be sufficiently accurate for accounting use.
>
> The object model as it stands now has too much interdependence between classes, especially the transaction, split, account, and commodity classes. The implementation has a lot of
>
>
> When looking at the SQL/XML document, what would be the unneeded interdependencies ? The links between account->split->transaction looks meaningful to me. The links transaction->currency and account->commodity also.
> Is it the issues with the scu/denom/num that is cumbersome to handle (if there is a change in the scu of a currency, all splits related to it should be updated) ?
>
>  
>> Could you point me to 2 or 3 real case example we would nevertheless end up corrupting the database ? It would really help me to materialise this "risk".
>> With my practical experience with piecash (creating a full account tree structure and importing thousands of transactions from csv file), I haven't found any case of corruption (except when developing the API and having "obvious" bugs).
>
> I can’t provide concrete examples without doing an extensive code review of piecash, for which I have neither the time nor the inclination. Some obvious trouble spots include cross-commodity transactions, especially involving lots or trading accounts.
>
> Have you tested with bad data to see if piecash rejects it? Did you thoroughly analyze the ways that bad data could be created and ensure that you have test cases proving that piecash rejects all of them?
>
> Well, at first, I wanted to have a simple way to extract data out of GnuCash (a read-only mode). This was easy to do with SQL alchemy and is safe.
> Afterwards, I wanted to be able to modify a GnuCash book knowing what I was doing (ie being cautious to not create inconsistent objects. This was also ok (as said, I have used piecash to create automatically a complex account structure from an excel file and to import thousands of records without errors).
> Next is the ability to ensure consistency checks/error detection when the user does changes. This is the less mature part and I hear well your warnings about the complexity of this part.
>  
>> For piecash, there is no need for a 3N db schema and there is no need to write any logic in SQL !
>> It is through SQLAlchemy that we can handle/encapsulate the extra bit of logic (which is, as far as I am in the development of piecash and given piecash scope, rather limited). Nothing is done in SQL itself.
>
> I have trouble believing that an ORM will generate a correct implementation with a non-normal schema.
>
> In piecash, the ORM does not generate the SQL schema as it already exists (it is the one defined by GnuCash). It just maps the existing schema (that needs to be redescribed in python) to python objects transparently handling type conversion, transactions, guids, relationships, etc
>>
>> The table schema is exactly the only piece of code that needs to be written (piecash is in fact just that, table schemas with some metadata).
>> I do not hope/expect/think to generate this code automatically from C headers (as SWIG does).
>
> You’re not being consistent. Are you using SQLAlchemy as an ORM or simply as a SQL abstraction layer? You said above that nothing is done in SQL, and earlier that you are doing consistency checks, but here you’re saying that all you’ve written is a table-schema, which would imply that you’re relying on the database to do all of the work.
>
>
> I am using sqlalchemy as a mapper and as a session manager (unit of work pattern). Nothing is done in SQL (meaning I do not write SQL queries nor SQL stored procedures).
> A cascade delete constrain (if I remove a transaction, all related splits are automatically deleted) is also managed by SA.
> So I am "essentially" writing a mapping and then adding some methods to either create objects in a consistent way (like having a function "create_transaction(from_acc, to_acc, amount, date, description)" that create always correct objects) or to check objects are consistent before commit (this is not yet done)
>
>  
>> You’ll get closer working with the XML schema. There’s a reasonably up-to-date version in src/doc/xml/gnucash-v2.rnc.
>>
>>
>> As written here above, XML is not easy to work with (at least for me) and does not save me of anything you said before (corruption, etc).
>
> Sounds like a learning opportunity.  
>
> But will this save me from all the data corruption issues you mention ? If not, I prefer working with python objects than XML. If so, how is this done ?
>
> The object model is the class hierarchy, with its member variables and functions, expressed in C and Scheme. In a proper database-based application there would be a close mapping between the two, but GnuCash isn’t yet one of those. In a proper OO design, everything that directly affects an object’s state would be encapsulated in the cmember functions of the object’s class. GnuCash isn’t one of those either: State-changing code is spread throughout the program.
>
> If we take a document centric view instead of an application centric view, the only important element is the data (and not the implementation of the engine/class hierarchy/...).
> As long as another program changes the data while keeping the GnuCash schema and related invariants, we do not really care about the program being OO or not, with encapsulation or not,... In fact, we do not need the full power of the real GnuCash engine to manipulate (for simple cases) the data.
>
> I understand from your comment that the status of the GnuCash code is not optimal and you're working on a clean rewrite in C++. This can be an extra argument to try to have python bindings not linked to the GnuCash C code but only relying on the GnuCash data.

Sébastien,

The close coupling between classes is a OO design issue, not a data issue.

It’s certainly true that there’s no requirement for an accounting program to be OO. GnuCash is, mostly, and many of its design flaws are OO-related. Since your language of choice is Python, which heavily favors OO abstraction mechanisms, it’s surprising that you would want to ignore those flaws.

Yes, if another program changing the data keeps the schema *and invariants*, it will be compatible with GnuCash. The invariants are the catch: In order to maintain them, you have two choices: To use the GnuCash API or to faithfully duplicate GnuCash’s internals. You absolutely do need the full power of GnuCash to manipulate the data, because the full power of GnuCash is required to correctly interpret the stored state.

Regards,
John Ralls

_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Reply | Threaded
Open this post in threaded view
|

Re: python GnuCash interface to SQL backend

John Ralls-2
In reply to this post by Sébastien de Menten

> On Nov 15, 2014, at 7:08 AM, Sébastien de Menten <[hidden email]> wrote:
>
> Hello John,
>
> I have put at this address https://github.com/sdementen/piecash/blob/master/docs/source/object_model.rst <https://github.com/sdementen/piecash/blob/master/docs/source/object_model.rst> what I understood from the object model of GnuCash (schema/fields/invariants).
> I have also added some questions regarding the objects for which you may have the answer... (or these may be somewhere in GnuCash docs I could not find)
>
> I would be keen to get your opinion on this document.

Sébastien,

It’s a gross simplification which is wrong in several important respects. The most significant error is the Price table, which is not a central data object, it’s just a record of prices/exchange rates at particular times. Before GnuCash 2.6 it didn’t even contain actual prices booked from transactions. Prices from the price table are used to calculate the default currency value in the Accounts page and on the summary bar, and for some reports. Prices in the register are calculated by the register code at the time of display, and editing the price in the Transfer Dialog changes the value. Another: Transactions *must* balance on value; if a submitted transaction doesn’t balance GnuCash will create a balancing split debited to Imbalance-XXX where XXX is the appropriate currency code for the side of the transaction that’s out of balance.

While it’s true that the SQL schema for Account doesn’t include a direct book reference, it does have a parent-GUID field which eventually ties it to the root account which, as you noted, is referenced by the book. While the stored state allows only one book per file or database, the importers work by creating a second book in memory, writing the imported records to that book, then merging the second book into the primary one. That would fail if the accounts couldn’t be traced back to the right book.

There’s more, but I’m no more inclined to do a detailed review of and response to your document than I am to review your code.

Regards,
John Ralls


_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Reply | Threaded
Open this post in threaded view
|

Re: python GnuCash interface to SQL backend

Christian Stimming-4
In reply to this post by Sébastien de Menten
Dear Sébastien,

I really try not to be rude, but a little bit it seems to me as if you don't
accept "no" as an answer here. You asked whether the gnucash developers
support an alternative SQL access layer written in python from scratch, and
John's and other answers clearly said "no". What else are you looking for?

John as already outlined many important aspects about our object model. In
case you haven't see this so far, some current documentation is also here
http://wiki.gnucash.org/wiki/C_API and the linked "Entity-Relationship
Diagram" there, http://wiki.gnucash.org/wiki/images/8/86/Gnucash_erd.png .

But let's just make this clear: You asked whether your idea would be endorsed
and supported by us, and the answer was clearly a "no". If you like to
continue your idea, feel free to do so. But just don't repeatedly discuss here
whether we want to change our answer (at this point in time). Thanks!

Regards,

Christian

_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
12