[GNC-dev] Problems with python 3 and the gnucash python bindings.

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

[GNC-dev] Problems with python 3 and the gnucash python bindings.

David Osguthorpe
Hi All,


In upgrading to gnucash 3.2 from 2.6.18 and updating my python scripts I
have found an issue with the gnucash bindings and python 3.


I saw this with query runs that failed to produce any results when they
should have, and used to under 2.6.18.


So far it appears the issue is with the wrapping of the
qof_query_search_for call.

By inspection of the generated c code it appears by default swig wraps
python 3 unicode strings by converting to a new python 3 byte string object
(SWIG_AsCharPtrAndSize) and passes its char buffer pointer as the char *
pointer to the wrapped function call.

This object and pointer is freed at the end of the wrapper function.

Unfortunately it seems qof_query_search_for just stores the char * pointer
for the query which means by the time the query is run the pointer is
invalid (and indeed junk was printed using gnucash debug logging as the
Query Object Type: which is what lead me to this problem).


This is going to be a general problem for any gnucash function that simply
stores a char * pointer rather than copying the string.


According to the Python 3 documentation the PyUnicode_AsUTF8 function
creates a char * buffer which is cached in the unicode object so exists for
the lifetime of the unicode object.

I dont understand why this isnt the default SWIG wrapping for char pointers
but it does seem to solve the problem here.

(I havent been able to find any real discussion of this via Google. The
only hint Ive seen is that its a memory issue as PyUnicode_AsUTF8 needs to
store 2 versions of the string.)


This requires setting up swig typemaps.

So far I have used a very specific typemap for QofIdType and QofIdTypeConst
which solves the problem and python query runs produce results (and my
scripts work as under 2.6.18).


The question is where and how to implement this.

Currently Ive put the typemaps in gnucash_core.i - but they probably
should be in base-typemaps.i.

Then the question is how general to make this - it could be a typemap for
all char * pointers otherwise each explicit character type (eg gchar *)
would need its own typemap.

I think its clear that any const char * should map to PyUnicode_AsUTF8.


Also note that the GSList typemap in base-typemaps.i needs to be updated to
use unicode strings for python 3 - otherwise it requires python 3 byte
strings.


David


PS. Another solution would be to force byte string only arguments for
python 3 using a SWIG define SWIG_PYTHON_STRICT_BYTE_CHAR.

This would require a major re-write of the gnucash_core.py to perform the
unicode<->byte transformations.
_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Reply | Threaded
Open this post in threaded view
|

Re: [GNC-dev] Problems with python 3 and the gnucash python bindings.

John Ralls-2


> On Jul 10, 2018, at 11:58 AM, David Osguthorpe <[hidden email]> wrote:
>
> Hi All,
>
>
> In upgrading to gnucash 3.2 from 2.6.18 and updating my python scripts I
> have found an issue with the gnucash bindings and python 3.
>
>
> I saw this with query runs that failed to produce any results when they
> should have, and used to under 2.6.18.
>
>
> So far it appears the issue is with the wrapping of the
> qof_query_search_for call.
>
> By inspection of the generated c code it appears by default swig wraps
> python 3 unicode strings by converting to a new python 3 byte string object
> (SWIG_AsCharPtrAndSize) and passes its char buffer pointer as the char *
> pointer to the wrapped function call.
>
> This object and pointer is freed at the end of the wrapper function.
>
> Unfortunately it seems qof_query_search_for just stores the char * pointer
> for the query which means by the time the query is run the pointer is
> invalid (and indeed junk was printed using gnucash debug logging as the
> Query Object Type: which is what lead me to this problem).
>
>
> This is going to be a general problem for any gnucash function that simply
> stores a char * pointer rather than copying the string.
>
>
> According to the Python 3 documentation the PyUnicode_AsUTF8 function
> creates a char * buffer which is cached in the unicode object so exists for
> the lifetime of the unicode object.
>
> I dont understand why this isnt the default SWIG wrapping for char pointers
> but it does seem to solve the problem here.
>
> (I havent been able to find any real discussion of this via Google. The
> only hint Ive seen is that its a memory issue as PyUnicode_AsUTF8 needs to
> store 2 versions of the string.)
>
>
> This requires setting up swig typemaps.
>
> So far I have used a very specific typemap for QofIdType and QofIdTypeConst
> which solves the problem and python query runs produce results (and my
> scripts work as under 2.6.18).
>
>
> The question is where and how to implement this.
>
> Currently Ive put the typemaps in gnucash_core.i - but they probably
> should be in base-typemaps.i.
>
> Then the question is how general to make this - it could be a typemap for
> all char * pointers otherwise each explicit character type (eg gchar *)
> would need its own typemap.
>
> I think its clear that any const char * should map to PyUnicode_AsUTF8.
>
>
> Also note that the GSList typemap in base-typemaps.i needs to be updated to
> use unicode strings for python 3 - otherwise it requires python 3 byte
> strings.
>
>
> David
>
>
> PS. Another solution would be to force byte string only arguments for
> python 3 using a SWIG define SWIG_PYTHON_STRICT_BYTE_CHAR.
>
> This would require a major re-write of the gnucash_core.py to perform the
> unicode<->byte transformations.

ISTM it would be better to fix qof_query. It shouldn't be assuming that just because it has typedeffed const char* to "QofIdTypeConst" that it will necessarily get a statically allocated char* that is safe to keep. There's absolutely nothing preventing it from being a stack or heap object (stack is more likely in C, as in
  QofIdTypeConst type = "book";
  qof_query_search_for(query, type);
) that will cause the same crash.

There are probably other cases where someone has made the same bad assumption and stored an unduplicated char*, though I hope none quite so egregious.

Please file a bug: https://wiki.gnucash.org/wiki/Bugzilla

Regards,
John Ralls

_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Reply | Threaded
Open this post in threaded view
|

Re: [GNC-dev] Problems with python 3 and the gnucash python bindings.

David Osguthorpe
OK Will file a bug  - plus file a fix for GSList which is still needed.

David

On Tue, Jul 10, 2018 at 10:28 PM, John Ralls <[hidden email]> wrote:

>
>
> > On Jul 10, 2018, at 11:58 AM, David Osguthorpe <
> [hidden email]> wrote:
> >
> > Hi All,
> >
> >
> > In upgrading to gnucash 3.2 from 2.6.18 and updating my python scripts I
> > have found an issue with the gnucash bindings and python 3.
> >
> >
> > I saw this with query runs that failed to produce any results when they
> > should have, and used to under 2.6.18.
> >
> >
> > So far it appears the issue is with the wrapping of the
> > qof_query_search_for call.
> >
> > By inspection of the generated c code it appears by default swig wraps
> > python 3 unicode strings by converting to a new python 3 byte string
> object
> > (SWIG_AsCharPtrAndSize) and passes its char buffer pointer as the char *
> > pointer to the wrapped function call.
> >
> > This object and pointer is freed at the end of the wrapper function.
> >
> > Unfortunately it seems qof_query_search_for just stores the char *
> pointer
> > for the query which means by the time the query is run the pointer is
> > invalid (and indeed junk was printed using gnucash debug logging as the
> > Query Object Type: which is what lead me to this problem).
> >
> >
> > This is going to be a general problem for any gnucash function that
> simply
> > stores a char * pointer rather than copying the string.
> >
> >
> > According to the Python 3 documentation the PyUnicode_AsUTF8 function
> > creates a char * buffer which is cached in the unicode object so exists
> for
> > the lifetime of the unicode object.
> >
> > I dont understand why this isnt the default SWIG wrapping for char
> pointers
> > but it does seem to solve the problem here.
> >
> > (I havent been able to find any real discussion of this via Google. The
> > only hint Ive seen is that its a memory issue as PyUnicode_AsUTF8 needs
> to
> > store 2 versions of the string.)
> >
> >
> > This requires setting up swig typemaps.
> >
> > So far I have used a very specific typemap for QofIdType and
> QofIdTypeConst
> > which solves the problem and python query runs produce results (and my
> > scripts work as under 2.6.18).
> >
> >
> > The question is where and how to implement this.
> >
> > Currently Ive put the typemaps in gnucash_core.i - but they probably
> > should be in base-typemaps.i.
> >
> > Then the question is how general to make this - it could be a typemap for
> > all char * pointers otherwise each explicit character type (eg gchar *)
> > would need its own typemap.
> >
> > I think its clear that any const char * should map to PyUnicode_AsUTF8.
> >
> >
> > Also note that the GSList typemap in base-typemaps.i needs to be updated
> to
> > use unicode strings for python 3 - otherwise it requires python 3 byte
> > strings.
> >
> >
> > David
> >
> >
> > PS. Another solution would be to force byte string only arguments for
> > python 3 using a SWIG define SWIG_PYTHON_STRICT_BYTE_CHAR.
> >
> > This would require a major re-write of the gnucash_core.py to perform the
> > unicode<->byte transformations.
>
> ISTM it would be better to fix qof_query. It shouldn't be assuming that
> just because it has typedeffed const char* to "QofIdTypeConst" that it will
> necessarily get a statically allocated char* that is safe to keep. There's
> absolutely nothing preventing it from being a stack or heap object (stack
> is more likely in C, as in
>   QofIdTypeConst type = "book";
>   qof_query_search_for(query, type);
> ) that will cause the same crash.
>
> There are probably other cases where someone has made the same bad
> assumption and stored an unduplicated char*, though I hope none quite so
> egregious.
>
> Please file a bug: https://wiki.gnucash.org/wiki/Bugzilla
>
> Regards,
> John Ralls
>
>
_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Reply | Threaded
Open this post in threaded view
|

Re: [GNC-dev] Problems with python 3 and the gnucash python bindings.

Derek Atkins
In reply to this post by John Ralls-2
John Ralls <[hidden email]> writes:

>> PS. Another solution would be to force byte string only arguments for
>> python 3 using a SWIG define SWIG_PYTHON_STRICT_BYTE_CHAR.
>>
>> This would require a major re-write of the gnucash_core.py to perform the
>> unicode<->byte transformations.
>
> ISTM it would be better to fix qof_query. It shouldn't be assuming
> that just because it has typedeffed const char* to "QofIdTypeConst"
> that it will necessarily get a statically allocated char* that is safe
> to keep. There's absolutely nothing preventing it from being a stack
> or heap object (stack is more likely in C, as in
>   QofIdTypeConst type = "book";
>   qof_query_search_for(query, type);
> ) that will cause the same crash.

I disagree -- the API specification is clear that it will store the
pointer, so callers need to know this.

When it moves to C++ you can just use std::string.  :)

> There are probably other cases where someone has made the same bad
> assumption and stored an unduplicated char*, though I hope none quite
> so egregious.
>
> Please file a bug: https://wiki.gnucash.org/wiki/Bugzilla
>
> Regards,
> John Ralls

-derek
--
       Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
       Member, MIT Student Information Processing Board  (SIPB)
       URL: http://web.mit.edu/warlord/    PP-ASEL-IA     N1NWH
       [hidden email]                        PGP key available
_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Reply | Threaded
Open this post in threaded view
|

Re: [GNC-dev] Problems with python 3 and the gnucash python bindings.

John Ralls-2


> On Jul 12, 2018, at 7:23 AM, Derek Atkins <[hidden email]> wrote:
>
> John Ralls <[hidden email]> writes:
>
>>> PS. Another solution would be to force byte string only arguments for
>>> python 3 using a SWIG define SWIG_PYTHON_STRICT_BYTE_CHAR.
>>>
>>> This would require a major re-write of the gnucash_core.py to perform the
>>> unicode<->byte transformations.
>>
>> ISTM it would be better to fix qof_query. It shouldn't be assuming
>> that just because it has typedeffed const char* to "QofIdTypeConst"
>> that it will necessarily get a statically allocated char* that is safe
>> to keep. There's absolutely nothing preventing it from being a stack
>> or heap object (stack is more likely in C, as in
>>  QofIdTypeConst type = "book";
>>  qof_query_search_for(query, type);
>> ) that will cause the same crash.
>
> I disagree -- the API specification is clear that it will store the
> pointer, so callers need to know this.
>

Where? The Doxygen comment is:

/** Set the object type to be searched for.  The results of
 *  performing the query will be a list of this obj_type.
 */
void qof_query_search_for (QofQuery *query, QofIdTypeConst obj_type);

Not only silent about storing a pointer, it even hides the fact that it’s a pointer at all. One has to go look at the typedef for QofIdTypeConst in qofid.h to see that it’s a const char *.

> When it moves to C++ you can just use std::string.  :)

An enum would be safer, faster, smaller, more intuitive, and wouldn’t break swig.

Regards,
John Ralls

>
>> There are probably other cases where someone has made the same bad
>> assumption and stored an unduplicated char*, though I hope none quite
>> so egregious.
>>
>> Please file a bug: https://wiki.gnucash.org/wiki/Bugzilla
>>
>> Regards,
>> John Ralls
>
> -derek
> --
>       Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
>       Member, MIT Student Information Processing Board  (SIPB)
>       URL: http://web.mit.edu/warlord/    PP-ASEL-IA     N1NWH
>       [hidden email]                        PGP key available

_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Reply | Threaded
Open this post in threaded view
|

Re: [GNC-dev] Problems with python 3 and the gnucash python bindings.

Derek Atkins
John Ralls <[hidden email]> writes:

>> When it moves to C++ you can just use std::string.  :)
>
> An enum would be safer, faster, smaller, more intuitive, and wouldn’t
> break swig.

Funny...  15+ years ago it WAS an Enum, but changed to a string because
we wanted to enable searching on plug-in modules.

> Regards,
> John Ralls

-derek
--
       Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
       Member, MIT Student Information Processing Board  (SIPB)
       URL: http://web.mit.edu/warlord/    PP-ASEL-IA     N1NWH
       [hidden email]                        PGP key available
_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Reply | Threaded
Open this post in threaded view
|

Re: [GNC-dev] Problems with python 3 and the gnucash python bindings.

John Ralls-2


> On Jul 16, 2018, at 10:09 AM, Derek Atkins <[hidden email]> wrote:
>
> John Ralls <[hidden email]> writes:
>
>>> When it moves to C++ you can just use std::string.  :)
>>
>> An enum would be safer, faster, smaller, more intuitive, and wouldn’t
>> break swig.
>
> Funny...  15+ years ago it WAS an Enum, but changed to a string because
> we wanted to enable searching on plug-in modules.

OK, and enums are immune to run-time changes implied by a plugin module, but there are better ways than strings to tag types. Glib provides Gquarks for just this purpose and even provides a function to get one on a GObject's type name:
g_type_qname(G_OBJECT_TYPE()).

C++ obviously has a different type model and we'll need to rework/replace QofID, QofContainer, and QofQuery to match it so changing them now to use quarks instead of strings would be wasted effort.

Regards,
John Ralls


_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel
Reply | Threaded
Open this post in threaded view
|

Re: [GNC-dev] Problems with python 3 and the gnucash python bindings.

Derek Atkins
John Ralls <[hidden email]> writes:

>> Funny...  15+ years ago it WAS an Enum, but changed to a string because
>> we wanted to enable searching on plug-in modules.
>
> OK, and enums are immune to run-time changes implied by a plugin
> module, but there are better ways than strings to tag types. Glib
> provides Gquarks for just this purpose and even provides a function to
> get one on a GObject's type name:
> g_type_qname(G_OBJECT_TYPE()).

These APIs did not exist in 2002 when this code was written.

> C++ obviously has a different type model and we'll need to
> rework/replace QofID, QofContainer, and QofQuery to match it so
> changing them now to use quarks instead of strings would be wasted
> effort.

*nods*

> Regards,
> John Ralls

-derek
--
       Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
       Member, MIT Student Information Processing Board  (SIPB)
       URL: http://web.mit.edu/warlord/    PP-ASEL-IA     N1NWH
       [hidden email]                        PGP key available
_______________________________________________
gnucash-devel mailing list
[hidden email]
https://lists.gnucash.org/mailman/listinfo/gnucash-devel