Identifier API

classic Classic list List threaded Threaded
28 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Identifier API

"Martin v. Löwis"
In benchmarking PEP 393, I noticed that many UTF-8 decode
calls originate from C code with static strings, in particular
PyObject_CallMethod. Many of such calls already have been optimized
to cache a string object, however, PyObject_CallMethod remains
unoptimized since it requires a char*.

I find the ad-hoc approach of declaring and initializing variables
inadequate, in particular since it is difficult to clean up all
those string objects at interpreter shutdown.

I propose to add an explicit API to deal with such identifiers.
With this API,

     tmp = PyObject_CallMethod(result, "update", "O", other);

would be replaced with

      PyObject *tmp;
      Py_identifier(update);
      ...
      tmp = PyObject_CallMethodId(result, &PyId_update, "O", other);

Py_identifier expands to a struct

typedef struct Py_Identifier {
     struct Py_Identifier *next;
     const char* string;
     PyObject *object;
} Py_Identifier;

string will be initialized by the compiler, next and object on
first use. The new API for that will be

PyObject* PyUnicode_FromId(Py_Identifier*);
PyObject* PyObject_CallMethodId(PyObject*, Py_Identifier*, char*, ...);
PyObject* PyObject_GetAttrId(PyObject*, Py_Identifier*);
int PyObject_SetAttrId(PyObject*, Py_Identifier*, PyObject*);
int PyObject_HasAttrId(PyObject*, Py_Identifier*);

I have micro-benchmarked this; for

import time
d={}
i=d.items()
t=time.time()
for _ in range(10**6):
     i | d
print(time.time()-t)

I get a speed-up of 30% (notice that "i | d" invokes
the above PyObject_CallMethod call).

Regards,
Martin
_______________________________________________
Python-Dev mailing list
[hidden email]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/lists%40nabble.com
Reply | Threaded
Open this post in threaded view
|

Re: Identifier API

Antoine Pitrou
On Sat, 08 Oct 2011 16:54:06 +0200
"Martin v. Löwis" <[hidden email]> wrote:
>
> I find the ad-hoc approach of declaring and initializing variables
> inadequate, in particular since it is difficult to clean up all
> those string objects at interpreter shutdown.
>
> I propose to add an explicit API to deal with such identifiers.

That sounds like a good idea.

> With this API,
>
>      tmp = PyObject_CallMethod(result, "update", "O", other);
>
> would be replaced with
>
>       PyObject *tmp;
>       Py_identifier(update);
>       ...
>       tmp = PyObject_CallMethodId(result, &PyId_update, "O", other);

Surely there is something missing to initialize the "const char *" in
the structure? Or is "Py_identifier()" actually a macro?

> string will be initialized by the compiler, next and object on
> first use. The new API for that will be
>
> PyObject* PyUnicode_FromId(Py_Identifier*);
> PyObject* PyObject_CallMethodId(PyObject*, Py_Identifier*, char*, ...);
> PyObject* PyObject_GetAttrId(PyObject*, Py_Identifier*);
> int PyObject_SetAttrId(PyObject*, Py_Identifier*, PyObject*);
> int PyObject_HasAttrId(PyObject*, Py_Identifier*);

Do we want to make these APIs public?

Regards

Antoine.


_______________________________________________
Python-Dev mailing list
[hidden email]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/lists%40nabble.com
Reply | Threaded
Open this post in threaded view
|

Re: Identifier API

Georg Brandl-2
Am 08.10.2011 17:43, schrieb Antoine Pitrou:

> On Sat, 08 Oct 2011 16:54:06 +0200
> "Martin v. Löwis" <[hidden email]> wrote:
>>
>> I find the ad-hoc approach of declaring and initializing variables
>> inadequate, in particular since it is difficult to clean up all
>> those string objects at interpreter shutdown.
>>
>> I propose to add an explicit API to deal with such identifiers.
>
> That sounds like a good idea.
>
>> With this API,
>>
>>      tmp = PyObject_CallMethod(result, "update", "O", other);
>>
>> would be replaced with
>>
>>       PyObject *tmp;
>>       Py_identifier(update);
>>       ...
>>       tmp = PyObject_CallMethodId(result, &PyId_update, "O", other);
>
> Surely there is something missing to initialize the "const char *" in
> the structure? Or is "Py_identifier()" actually a macro?

Yes (note the parenthesized usage).

>> string will be initialized by the compiler, next and object on
>> first use. The new API for that will be
>>
>> PyObject* PyUnicode_FromId(Py_Identifier*);
>> PyObject* PyObject_CallMethodId(PyObject*, Py_Identifier*, char*, ...);
>> PyObject* PyObject_GetAttrId(PyObject*, Py_Identifier*);
>> int PyObject_SetAttrId(PyObject*, Py_Identifier*, PyObject*);
>> int PyObject_HasAttrId(PyObject*, Py_Identifier*);
>
> Do we want to make these APIs public?

Probably not at first.  I'd suggest making them private for Python 3.3, and if
the approach proves satisfying, we can think about adding them to the public
API in Python 3.4.

Georg

_______________________________________________
Python-Dev mailing list
[hidden email]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/lists%40nabble.com
Reply | Threaded
Open this post in threaded view
|

Re: Identifier API

Hrvoje Niksic-2
In reply to this post by "Martin v. Löwis"
On 10/08/2011 04:54 PM, "Martin v. Löwis" wrote:
>       tmp = PyObject_CallMethod(result, "update", "O", other);
>
> would be replaced with
>
>        PyObject *tmp;
>        Py_identifier(update);
>        ...
>        tmp = PyObject_CallMethodId(result,&PyId_update, "O", other);

An alternative I am fond of is to to avoid introducing a new type, and
simply initialize a PyObject * and register its address.  For example:

   PyObject *tmp;
   static PyObject *s_update;    // pick a naming convention

   PY_IDENTIFIER_INIT(update);
   tmp = PyObject_CallMethodObj(result, s_update, "O", other);

   (but also PyObject_GetAttr(o, s_update), etc.)

PY_IDENTIFIER_INIT(update) might expand to something like:

   if (!s_update) {
     s_update = PyUnicode_InternFromString("update");
     _Py_IdentifierAdd(&s_update);
   }

_PyIdentifierAdd adds the address of the variable to a global set of C
variables that need to be decreffed and zeroed-out at interpreted shutdown.

The benefits of this approach is:
   * you don't need special "identifier" versions of functions such as
     PyObject_CallMethod. In my example I invented a
     PyObject_CallMethodObj, but adding that might be useful anyway.
   * a lot of Python/C code implements similar caching, often
     leaking strings.

Hrvoje
_______________________________________________
Python-Dev mailing list
[hidden email]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/lists%40nabble.com
Reply | Threaded
Open this post in threaded view
|

Re: Identifier API

Amaury Forgeot d'Arc
2011/10/11 Hrvoje Niksic <[hidden email]>
An alternative I am fond of is to to avoid introducing a new type, and simply initialize a PyObject * and register its address.  For example:

 PyObject *tmp;
 static PyObject *s_update;    // pick a naming convention

 PY_IDENTIFIER_INIT(update);
 tmp = PyObject_CallMethodObj(result, s_update, "O", other);

 (but also PyObject_GetAttr(o, s_update), etc.)

PY_IDENTIFIER_INIT(update) might expand to something like:

 if (!s_update) {
   s_update = PyUnicode_InternFromString("update");
   _Py_IdentifierAdd(&s_update);
 }

It should also check for errors; in this case the initialization is a bit more verbose:
if (PY_IDENTIFIER_INIT(update) < 0)
   <return NULL or return -1 or goto error>;

--
Amaury Forgeot d'Arc


_______________________________________________
Python-Dev mailing list
[hidden email]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/lists%40nabble.com
Reply | Threaded
Open this post in threaded view
|

Re: Identifier API

Barry Warsaw
In reply to this post by Hrvoje Niksic-2
On Oct 11, 2011, at 02:36 PM, Hrvoje Niksic wrote:

>On 10/08/2011 04:54 PM, "Martin v. Löwis" wrote:
>>       tmp = PyObject_CallMethod(result, "update", "O", other);
>>
>> would be replaced with
>>
>>        PyObject *tmp;
>>        Py_identifier(update);
>>        ...
>>        tmp = PyObject_CallMethodId(result,&PyId_update, "O", other);
>
>An alternative I am fond of is to to avoid introducing a new type, and simply
>initialize a PyObject * and register its address.  For example:
>
>   PyObject *tmp;
>   static PyObject *s_update;    // pick a naming convention
>
>   PY_IDENTIFIER_INIT(update);
>   tmp = PyObject_CallMethodObj(result, s_update, "O", other);
>
>   (but also PyObject_GetAttr(o, s_update), etc.)

I like this better too because of the all-caps macro name.  Something about
seeing "Py_identifier" look like a function call and having it add the magical
PyId_update local bugs me.  It just looks wrong, whereas the all-caps is more
of a cultural clue that something else is going on.

-Barry
_______________________________________________
Python-Dev mailing list
[hidden email]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/lists%40nabble.com
Reply | Threaded
Open this post in threaded view
|

Re: Identifier API

Hrvoje Niksic-2
In reply to this post by Amaury Forgeot d'Arc
On 10/11/2011 02:45 PM, Amaury Forgeot d'Arc wrote:
> It should also check for errors; in this case the initialization is a
> bit more verbose:
> if (PY_IDENTIFIER_INIT(update) < 0)
> <return NULL or return -1 or goto error>;

Error checking is somewhat more controversial because behavior in case
of error differs between situations and coding patterns.  I think it
should be up to the calling code to check for s_update remaining NULL.
In my example, I would expect PyObject_CallMethodObj and similar to
raise InternalError when passed a NULL pointer.  Since their return
values are already checked, this should be enought to cover the unlikely
case of identifier creation failing.

Hrvoje
_______________________________________________
Python-Dev mailing list
[hidden email]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/lists%40nabble.com
Reply | Threaded
Open this post in threaded view
|

Re: Identifier API

Antoine Pitrou
In reply to this post by Barry Warsaw
On Tue, 11 Oct 2011 09:19:43 -0400
Barry Warsaw <[hidden email]> wrote:

> On Oct 11, 2011, at 02:36 PM, Hrvoje Niksic wrote:
>
> >On 10/08/2011 04:54 PM, "Martin v. Löwis" wrote:
> >>       tmp = PyObject_CallMethod(result, "update", "O", other);
> >>
> >> would be replaced with
> >>
> >>        PyObject *tmp;
> >>        Py_identifier(update);
> >>        ...
> >>        tmp = PyObject_CallMethodId(result,&PyId_update, "O", other);
> >
> >An alternative I am fond of is to to avoid introducing a new type, and simply
> >initialize a PyObject * and register its address.  For example:
> >
> >   PyObject *tmp;
> >   static PyObject *s_update;    // pick a naming convention
> >
> >   PY_IDENTIFIER_INIT(update);
> >   tmp = PyObject_CallMethodObj(result, s_update, "O", other);
> >
> >   (but also PyObject_GetAttr(o, s_update), etc.)
>
> I like this better too because of the all-caps macro name.  Something about
> seeing "Py_identifier" look like a function call and having it add the magical
> PyId_update local bugs me.  It just looks wrong, whereas the all-caps is more
> of a cultural clue that something else is going on.

+1 for something more recognizable.
I think "const string" is more accurate than "identifier" as well.

Regards

Antoine.


_______________________________________________
Python-Dev mailing list
[hidden email]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/lists%40nabble.com
Reply | Threaded
Open this post in threaded view
|

Re: Identifier API

Benjamin Peterson-3
2011/10/11 Antoine Pitrou <[hidden email]>:
> +1 for something more recognizable.
> I think "const string" is more accurate than "identifier" as well.

It should only really be used for identifiers, though, because the
result is interned.


--
Regards,
Benjamin
_______________________________________________
Python-Dev mailing list
[hidden email]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/lists%40nabble.com
Reply | Threaded
Open this post in threaded view
|

Re: Identifier API

Stefan Behnel-3
In reply to this post by "Martin v. Löwis"
"Martin v. Löwis", 08.10.2011 16:54:
> In benchmarking PEP 393, I noticed that many UTF-8 decode
> calls originate from C code with static strings, in particular
> PyObject_CallMethod. Many of such calls already have been optimized
> to cache a string object, however, PyObject_CallMethod remains
> unoptimized since it requires a char*.

Yes, I noticed that in Cython, too. We often use PyObject_CallMethod() as a
fallback for optimistically optimised method calls when the expected fast
path does not hit, and it always bugged me that this needs to generate a
Python string on each call in order to look up the method.


> I propose to add an explicit API to deal with such identifiers.
> With this API,
>
> tmp = PyObject_CallMethod(result, "update", "O", other);
>
> would be replaced with
>
> PyObject *tmp;
> Py_identifier(update);
> ...
> tmp = PyObject_CallMethodId(result, &PyId_update, "O", other);
>
> Py_identifier expands to a struct
>
> typedef struct Py_Identifier {
> struct Py_Identifier *next;
> const char* string;
> PyObject *object;
> } Py_Identifier;
>
> string will be initialized by the compiler, next and object on
> first use.

As I understand it, the macro expands to both the ID variable declaration
and the init-at-first-call code, right? This is problematic when more than
one identifier is used, as some C compilers strictly require declarations
to be written *before* any other code. I'm not sure how often users will
need more than one identifier in a function, but if it's not too hard to
come up with a way that avoids this problem all together, it would be
better to do so right from the start.

Also note that existing code needs to be changed in order to take advantage
of this. It might be possible to optimise PyObject_CallMethod() internally
by making the lookup either reuse a number of cached Python strings, or by
supporting a lookup of char* values in a dict *somehow*. However, this
appears to be substantially more involved than just moving a smaller burden
on the users.

Stefan

_______________________________________________
Python-Dev mailing list
[hidden email]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/lists%40nabble.com
Reply | Threaded
Open this post in threaded view
|

Re: Identifier API

Victor STINNER
In reply to this post by "Martin v. Löwis"
Le samedi 8 octobre 2011 16:54:06, Martin v. Löwis a écrit :
> In benchmarking PEP 393, I noticed that many UTF-8 decode
> calls originate from C code with static strings, in particular
> PyObject_CallMethod. Many of such calls already have been optimized
> to cache a string object, however, PyObject_CallMethod remains
> unoptimized since it requires a char*.

Because all identifiers are ASCII (in the C code base), another idea is to use
a structure similar to PyASCIIObject but with an additional pointer to the
constant char* string:

typedef struct {
  PyASCIIObject _base;
  const char *ascii;
} PyConstASCIIObject;

Characters don't have to be copied, just the pointer, but you still have to
allocate a structure. Because the size of the structure is also constant, we
can have an efficient free list. Pseudo-code to create such object:

PyObject*
create_const_ascii(const char *str)
{
  PyConstASCIIObject *obj;
  /* ensure maybe that str is ASCII only? */
  obj = get_from_freelist(); # reset the object (e.g. hash)
  if (!obj) {
     obj = allocate_new_const_ascii();
     if (!obj) return NULL;
  }
  obj->ascii = str;
  return obj;
}

Except PyUnicode_DATA, such structure should be fully compatible with other
PEP 383 structures.

We would need a new format for Py_BuildValue, e.g. 'a' for ASCII string. Later
we can add new functions like _PyDict_GetASCII().

Victor
_______________________________________________
Python-Dev mailing list
[hidden email]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/lists%40nabble.com
Reply | Threaded
Open this post in threaded view
|

Re: Identifier API

Victor STINNER
Le jeudi 13 octobre 2011 00:44:33, Victor Stinner a écrit :

> Le samedi 8 octobre 2011 16:54:06, Martin v. Löwis a écrit :
> > In benchmarking PEP 393, I noticed that many UTF-8 decode
> > calls originate from C code with static strings, in particular
> > PyObject_CallMethod. Many of such calls already have been optimized
> > to cache a string object, however, PyObject_CallMethod remains
> > unoptimized since it requires a char*.
>
> Because all identifiers are ASCII (in the C code base), another idea is to
> use a structure similar to PyASCIIObject but with an additional pointer to
> the constant char* string:

Oh, I realized that Martin did already commit its PyIdentifier API, it's maybe
too late :-)

> We would need a new format for Py_BuildValue, e.g. 'a' for ASCII string.
> Later we can add new functions like _PyDict_GetASCII().

The main difference between my new "const ASCII" string idea and PyIdentifier is
the lifetime of objects. Using "const ASCII" strings, the strings are
destroyed quickly (to not waste memory), whereas PyIdentifiers are intern
strings and so they are only destroyed at Python exit.

I don't know if "const ASCII" strings solve a real issue. I implemented my
idea. I will do some benchmarks to check if it's useful or not :-)

Victor
_______________________________________________
Python-Dev mailing list
[hidden email]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/lists%40nabble.com
Reply | Threaded
Open this post in threaded view
|

Re: Identifier API

"Martin v. Löwis"
In reply to this post by Hrvoje Niksic-2
> An alternative I am fond of is to to avoid introducing a new type, and
> simply initialize a PyObject * and register its address.

-1 on that, because of the lack of error checking.

Regards,
Martin
_______________________________________________
Python-Dev mailing list
[hidden email]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/lists%40nabble.com
Reply | Threaded
Open this post in threaded view
|

Re: Identifier API

"Martin v. Löwis"
In reply to this post by Barry Warsaw
> I like this better too because of the all-caps macro name.  Something about
> seeing "Py_identifier" look like a function call and having it add the magical
> PyId_update local bugs me.  It just looks wrong, whereas the all-caps is more
> of a cultural clue that something else is going on.

If people think the macro should be all upper-case, I can go through and
replace them (but only once). Let me know what the exact spelling
should be.

Originally, I meant to make the variable name equal the string (e.g.
then having a variable named __init__ point to the "__init__" string).
However, I quickly gave up on that idea, since the strings conflict
too often with other identifiers in C. In particular, you couldn't
use that approach for calling the "fileno", "read" or "write" methods.

So I think it needs a prefix. If you don't like PyId_, let me know
what the prefix should be instead.

If there is no fixed prefix (i.e. if you have to specify variable
name and string value separately), and if there is no error checking,
there is IMO too little gain to make this usable. I'm particularly
worried about the error checking: the tricky part in C is to keep
track of all the code paths. This API hides this by putting the
initialization into the callee (PyObject_GetAttrId and friends),
hence the (unlikely) failure to initialize the string is reported in
the same code path as the (very plausible) error that the actual
attribute access failed.

Regards,
Martin
_______________________________________________
Python-Dev mailing list
[hidden email]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/lists%40nabble.com
Reply | Threaded
Open this post in threaded view
|

Re: Identifier API

Antoine Pitrou
On Thu, 13 Oct 2011 14:00:38 +0200
"Martin v. Löwis" <[hidden email]> wrote:
> > I like this better too because of the all-caps macro name.  Something about
> > seeing "Py_identifier" look like a function call and having it add the magical
> > PyId_update local bugs me.  It just looks wrong, whereas the all-caps is more
> > of a cultural clue that something else is going on.
>
> If people think the macro should be all upper-case, I can go through and
> replace them (but only once). Let me know what the exact spelling
> should be.

Py_CONST_STRING or Py_IDENTIFIER would be fine with me.
Given that everything else uses "Id" in their name, Py_IDENTIFIER is
probably better?

> Originally, I meant to make the variable name equal the string (e.g.
> then having a variable named __init__ point to the "__init__" string).
> However, I quickly gave up on that idea, since the strings conflict
> too often with other identifiers in C. In particular, you couldn't
> use that approach for calling the "fileno", "read" or "write" methods.
>
> So I think it needs a prefix. If you don't like PyId_, let me know
> what the prefix should be instead.

I agree with that.

Regards

Antoine.


_______________________________________________
Python-Dev mailing list
[hidden email]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/lists%40nabble.com
Reply | Threaded
Open this post in threaded view
|

Re: Identifier API

Barry Warsaw
On Oct 13, 2011, at 03:23 PM, Antoine Pitrou wrote:

>Py_CONST_STRING or Py_IDENTIFIER would be fine with me.  Given that
>everything else uses "Id" in their name, Py_IDENTIFIER is probably better?

I agree that either is fine, with a slight preference for Py_IDENTIFIER for
the same reasons.

>> Originally, I meant to make the variable name equal the string (e.g.
>> then having a variable named __init__ point to the "__init__" string).
>> However, I quickly gave up on that idea, since the strings conflict
>> too often with other identifiers in C. In particular, you couldn't
>> use that approach for calling the "fileno", "read" or "write" methods.
>>
>> So I think it needs a prefix. If you don't like PyId_, let me know
>> what the prefix should be instead.
>
>I agree with that.

I'm fine with that too, as long as it's all well-documented in the C API
guide.

-Barry
_______________________________________________
Python-Dev mailing list
[hidden email]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/lists%40nabble.com
Reply | Threaded
Open this post in threaded view
|

Re: Identifier API

"Martin v. Löwis"
>> Py_CONST_STRING or Py_IDENTIFIER would be fine with me.  Given that
>> everything else uses "Id" in their name, Py_IDENTIFIER is probably better?
>
> I agree that either is fine, with a slight preference for Py_IDENTIFIER for
> the same reasons.

Ok, so it's Py_IDENTIFIER.

>>> So I think it needs a prefix. If you don't like PyId_, let me know
>>> what the prefix should be instead.
>>
>> I agree with that.
>
> I'm fine with that too, as long as it's all well-documented in the C API
> guide.

Hmm. People voted that this should be an internal API, so I'm not sure
it should be documented at all outside of the header file, or if, in
what document.

Currently, this very point is documented in the header file.

Regards,
Martin
_______________________________________________
Python-Dev mailing list
[hidden email]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/lists%40nabble.com
Reply | Threaded
Open this post in threaded view
|

Re: Identifier API

Barry Warsaw
On Oct 13, 2011, at 08:08 PM, Martin v. Löwis wrote:

>>> Py_CONST_STRING or Py_IDENTIFIER would be fine with me.  Given that
>>> everything else uses "Id" in their name, Py_IDENTIFIER is probably better?
>>
>> I agree that either is fine, with a slight preference for Py_IDENTIFIER for
>> the same reasons.
>
>Ok, so it's Py_IDENTIFIER.

Given below, shouldn't that be _Py_IDENTIFIER?

>>>> So I think it needs a prefix. If you don't like PyId_, let me know
>>>> what the prefix should be instead.
>>>
>>> I agree with that.
>>
>> I'm fine with that too, as long as it's all well-documented in the C API
>> guide.
>
>Hmm. People voted that this should be an internal API, so I'm not sure it should be documented at all outside of the header file, or if, in
>what document.
>
>Currently, this very point is documented in the header file.
That's fine, if the macro is prefixed with an underscore.

-Barry

_______________________________________________
Python-Dev mailing list
[hidden email]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/lists%40nabble.com

signature.asc (853 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Identifier API

Gregory Ewing
In reply to this post by "Martin v. Löwis"
Martin v. Löwis wrote:
> So I think it needs a prefix. If you don't like PyId_, let me know
> what the prefix should be instead.

Instead of an explicit prefix, how about a macro, such as
Py_ID(__string__)?

--
Greg
_______________________________________________
Python-Dev mailing list
[hidden email]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/lists%40nabble.com
Reply | Threaded
Open this post in threaded view
|

Re: Identifier API

Victor STINNER
In reply to this post by Victor STINNER
Le jeudi 13 octobre 2011 03:34:00, Victor Stinner a écrit :

> > We would need a new format for Py_BuildValue, e.g. 'a' for ASCII string.
> > Later we can add new functions like _PyDict_GetASCII().
>
> The main difference between my new "const ASCII" string idea and
> PyIdentifier is the lifetime of objects. Using "const ASCII" strings, the
> strings are destroyed quickly (to not waste memory), whereas PyIdentifiers
> are intern strings and so they are only destroyed at Python exit.
>
> I don't know if "const ASCII" strings solve a real issue. I implemented my
> idea. I will do some benchmarks to check if it's useful or not :-)

Ok, I did some tests: it is slower with my PyConstASCIIObject. I don't
understand why, but it means that the idea is not interesting because the code
is not faster. It is also difficult to ensure that the string is "constant"
(test the scope of the string).

At least, I found a nice GCC function: __builtin_constant_p(str) can be used
to ensure that the string is constant (e.g. "abc" vs char*).

Victor
_______________________________________________
Python-Dev mailing list
[hidden email]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/lists%40nabble.com
12