DBAPI two phase commit implementation in psycopg2

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

DBAPI two phase commit implementation in psycopg2

Daniele Varrazzo-2
Hello,

I've recently joined the db-sig ML, and I've read the threads about
the two phase commit interface design of Jan 2008.

I'd like to implement the DBAPI TPC extension in psycopg2: I'm
considering the best way to overcome the slight model difference
between the XA-inspired DBAPI and the PostgreSQL commands.

The DBAPI xid structure has members (format_id, gtrid, bqual). In
postgresql PREPARE TRANSACTION only takes a string "tid". So it will
be the driver's responsibility to map between the xid triple and the
tid string. I may come out with a separator (e.g. "|") and concatenate
the three parts into a tid, escaping the separator. On the other way
round, it wouldn't be a problem to perform the inverse split, but I
should take in consideration transactions whose tid doesn't follow the
pattern (e.g. created by a non-XA-oriented application) and is just
composed by a string. tpc_recover would then create xid with format_id
= 0 and bqual = "", that seem reasonable default values reading the XA
specs.

Is postgres the only database with this xid mapping issue? If other
dbs have similar issues, how is the xid - string mapping usually
performed?

Any other suggestion about the matter would be appreciated. Thank you.

-- Daniele
_______________________________________________
DB-SIG maillist  -  [hidden email]
http://mail.python.org/mailman/listinfo/db-sig
Reply | Threaded
Open this post in threaded view
|

Re: DBAPI two phase commit implementation in psycopg2

M.-A. Lemburg
Daniele Varrazzo wrote:

> Hello,
>
> I've recently joined the db-sig ML, and I've read the threads about
> the two phase commit interface design of Jan 2008.
>
> I'd like to implement the DBAPI TPC extension in psycopg2: I'm
> considering the best way to overcome the slight model difference
> between the XA-inspired DBAPI and the PostgreSQL commands.
>
> The DBAPI xid structure has members (format_id, gtrid, bqual). In
> postgresql PREPARE TRANSACTION only takes a string "tid". So it will
> be the driver's responsibility to map between the xid triple and the
> tid string. I may come out with a separator (e.g. "|") and concatenate
> the three parts into a tid, escaping the separator. On the other way
> round, it wouldn't be a problem to perform the inverse split, but I
> should take in consideration transactions whose tid doesn't follow the
> pattern (e.g. created by a non-XA-oriented application) and is just
> composed by a string. tpc_recover would then create xid with format_id
> = 0 and bqual = "", that seem reasonable default values reading the XA
> specs.

Both sound like reasonable ways to map the xid requirements onto
PGs single string approach. I'd set the branch qualifier to something
like 'pgsql' or the database name, since each resource in a global
transaction should have its own branch qualifier.

I'd also look around to check how other tools that interoperate with
PG in two-phase commits handle this. XA is a widely used standard
in the industry, so I assume the problem must have popped up
elsewhere as well.

Note that the TM will usually create the xid and only the TM has
to be able to recognize its own xids for the purpose of managing
different transactions.

> Is postgres the only database with this xid mapping issue? If other
> dbs have similar issues, how is the xid - string mapping usually
> performed?

All the big ones (Oracle, DB2, Sybase, etc.) use XA for this.

> Any other suggestion about the matter would be appreciated. Thank you.

Some references:
http://download.oracle.com/javase/1.5.0/docs/api/javax/transaction/xa/Xid.html
http://dev.mysql.com/doc/refman/5.0/en/xa-statements.html

The XA spec:
http://www.opengroup.org/bookstore/catalog/c193.htm

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Sep 24 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/
_______________________________________________
DB-SIG maillist  -  [hidden email]
http://mail.python.org/mailman/listinfo/db-sig
Reply | Threaded
Open this post in threaded view
|

Re: DBAPI two phase commit implementation in psycopg2

Daniele Varrazzo-2
On Fri, Sep 24, 2010 at 9:48 AM, M.-A. Lemburg <[hidden email]> wrote:

> I'd also look around to check how other tools that interoperate with
> PG in two-phase commits handle this. XA is a widely used standard
> in the industry, so I assume the problem must have popped up
> elsewhere as well.

MySQL uses a XA model, so the mapping is direct. Other high profile
databases hide the commands they use under tons of APIs and I've not
been able to find implementation details.

I've found instead the JDBC implementation of the XA - PG mapper [1]:
it is probably a good idea to use the same format to allow some form
of interoperation between tools. They use str(format_id) + '_' +
Base64(gtrid) + '_' + Base64(bqual) and on recover they refuse to work
on anything that doesn't follow this model. I don't agree on the
latter point because I think the driver should allow the user to
leverage everything the database permits, so I will probably find a
way to parse back a generic string into a XID. But I guess this is an
implementation detail better discussed on the psycopg mailing lists.

Thank you very much. Have a nice weekend.

-- Daniele

[1]: http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/jdbc/pgjdbc/org/postgresql/xa/RecoveredXid.java?rev=1.3&content-type=text/x-cvsweb-markup
_______________________________________________
DB-SIG maillist  -  [hidden email]
http://mail.python.org/mailman/listinfo/db-sig
Reply | Threaded
Open this post in threaded view
|

Re: DBAPI two phase commit implementation in psycopg2

James Henstridge
On Sat, Sep 25, 2010 at 12:04 AM, Daniele Varrazzo
<[hidden email]> wrote:

> On Fri, Sep 24, 2010 at 9:48 AM, M.-A. Lemburg <[hidden email]> wrote:
>
>> I'd also look around to check how other tools that interoperate with
>> PG in two-phase commits handle this. XA is a widely used standard
>> in the industry, so I assume the problem must have popped up
>> elsewhere as well.
>
> MySQL uses a XA model, so the mapping is direct. Other high profile
> databases hide the commands they use under tons of APIs and I've not
> been able to find implementation details.
>
> I've found instead the JDBC implementation of the XA - PG mapper [1]:
> it is probably a good idea to use the same format to allow some form
> of interoperation between tools. They use str(format_id) + '_' +
> Base64(gtrid) + '_' + Base64(bqual) and on recover they refuse to work
> on anything that doesn't follow this model. I don't agree on the
> latter point because I think the driver should allow the user to
> leverage everything the database permits, so I will probably find a
> way to parse back a generic string into a XID. But I guess this is an
> implementation detail better discussed on the psycopg mailing lists.
>
> Thank you very much. Have a nice weekend.

When writing the TPC additions for the spec, I did take PostgreSQL
into account (I thought I'd have time to write the psycopg2
implementation back then).  The reasoning for using the three part
identifiers was that the XA-style identifiers were quite common and it
was easier to losslessly encode the three part identifiers as a string
than vice versa.

The spec should allow you to manage identifiers that don't match your
mapping though.  The user can only get references to transaction ID
objects via method calls on the connection.  So while transaction IDs
are required to provide tuple like behaviour, the adapter doesn't have
to use actual tuple objects.  A custom object type could easily be
used here to round trip the foreign IDs between tpc_recover() and
tpc_commit()/tpc_abort().

James.
_______________________________________________
DB-SIG maillist  -  [hidden email]
http://mail.python.org/mailman/listinfo/db-sig
Reply | Threaded
Open this post in threaded view
|

Re: DBAPI two phase commit implementation in psycopg2

Daniele Varrazzo-2
On Tue, Sep 28, 2010 at 8:09 AM, James Henstridge <[hidden email]> wrote:

> When writing the TPC additions for the spec, I did take PostgreSQL
> into account (I thought I'd have time to write the psycopg2
> implementation back then).  The reasoning for using the three part
> identifiers was that the XA-style identifiers were quite common and it
> was easier to losslessly encode the three part identifiers as a string
> than vice versa.
>
> The spec should allow you to manage identifiers that don't match your
> mapping though.  The user can only get references to transaction ID
> objects via method calls on the connection.  So while transaction IDs
> are required to provide tuple like behaviour, the adapter doesn't have
> to use actual tuple objects.  A custom object type could easily be
> used here to round trip the foreign IDs between tpc_recover() and
> tpc_commit()/tpc_abort().

Thank you James. I've found your code on launchpad with the Xid
implementation and it seems a great starting point. I know well enough
how the state in psycopg connections is handled so there should be no
problem to complete the implementation.

Regards,

-- Daniele
_______________________________________________
DB-SIG maillist  -  [hidden email]
http://mail.python.org/mailman/listinfo/db-sig