Switch to database-level autocommit

classic Classic list List threaded Threaded
60 messages Options
123
Reply | Threaded
Open this post in threaded view
|

Switch to database-level autocommit

Aymeric Augustin
Hello,

I'd like to improve transactions handling in Django. The first step is the current emulation of autocommit with database-level autocommit.

** Rationale **

PEP249 says that any SQL query opens a transaction and that it's the application's job to commit (or rollback) changes.  This model is also required by the SQL standard. But it isn't very developer-friendly.

To alleviate the pain, Django commits after each ORM write.  Unfortunately, this means that each read opens a transaction, that eventually gets committed after the next write, or rolled back at the end of the query. Such transactions are useless and don't come for free. Relying on them to enforce integrity is extremely fragile — what if an external library starts writing to a log table in the middle of one of these implicit transactions? The term "footgun" comes to mind.

Database authors have reached the same conclusion, and most databases supported by Django use autocommit by default, ignoring the SQL standard. On PostgreSQL and SQLite, this is the only mode available.

As a consequence, to implement the behavior mandated by PEP 249, the Python libraries (psycopg2, sqlite3, etc.) automatically start transactions. And then Django automatically commits them. This is not only wasteful, but also buggy. It's the root cause of "idle in transaction" connections on PostgreSQL. It's also sometimes poorly implemented: for instance, executing "SAVEPOINT …" on SQLite commits implicitly. (It's arguably a bug in the design of the sqlite3 module. The Python bug tracker suggests it's going to be documented.)

Basically, Django intends to provide autocommit by default. Rather than fight the database adapter that itselfs fights the database, I propose to simply turn autocommit on, and stop implicitly starting and committing transactions. Explicit is better than implicit.

** Implementation **

All databases supported by Django provide an API to turn autocommit on:


This obviously has far-reaching consequences on transaction handling in Django, but the current APIs should still work. (Fixing them is part 2 of the plan.) The general idea is that Django will explicitly start a transaction when entering transaction management.

This will obviously impact maintainers of backend for other databases, but if it works on Oracle (which doesn't have autocommit — it's emulated in OCI) and on PostgreSQL (which enforces autocommit), I hope it can work anywhere.

** Backwards-compatibility **

Roughly, I'd classify Django users in four groups:
1 - "Transactions, how do they work?"
2 - "Django autocommits, doesn't it?"
3 - "I'm using TransactionMiddleware!"
4 - "I'm managing my transactions."

Groups 1 and 2 won't see the difference. There won't be any difference for group 3. Group 4 may be impacted by the change, but I believe most people in this category have autocommit turned on already — or would like to, if they're on MySQL — and will understand the change.

I don't see much point in providing an option to turn autocommit off, because starting a transaction is a much more explicit way to achieve the same effect. We usually don't provide "backwards compatibility with bugs".

Yay or nay?

-- 
Aymeric.



--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.
 
 
Reply | Threaded
Open this post in threaded view
|

Re: Switch to database-level autocommit

Donald Stufft
On Friday, March 1, 2013 at 7:48 AM, Aymeric Augustin wrote:
Hello,

I'd like to improve transactions handling in Django. The first step is the current emulation of autocommit with database-level autocommit.

** Rationale **

PEP249 says that any SQL query opens a transaction and that it's the application's job to commit (or rollback) changes.  This model is also required by the SQL standard. But it isn't very developer-friendly.

To alleviate the pain, Django commits after each ORM write.  Unfortunately, this means that each read opens a transaction, that eventually gets committed after the next write, or rolled back at the end of the query. Such transactions are useless and don't come for free. Relying on them to enforce integrity is extremely fragile — what if an external library starts writing to a log table in the middle of one of these implicit transactions? The term "footgun" comes to mind.

Database authors have reached the same conclusion, and most databases supported by Django use autocommit by default, ignoring the SQL standard. On PostgreSQL and SQLite, this is the only mode available.

As a consequence, to implement the behavior mandated by PEP 249, the Python libraries (psycopg2, sqlite3, etc.) automatically start transactions. And then Django automatically commits them. This is not only wasteful, but also buggy. It's the root cause of "idle in transaction" connections on PostgreSQL. It's also sometimes poorly implemented: for instance, executing "SAVEPOINT …" on SQLite commits implicitly. (It's arguably a bug in the design of the sqlite3 module. The Python bug tracker suggests it's going to be documented.)

Basically, Django intends to provide autocommit by default. Rather than fight the database adapter that itselfs fights the database, I propose to simply turn autocommit on, and stop implicitly starting and committing transactions. Explicit is better than implicit.

** Implementation **

All databases supported by Django provide an API to turn autocommit on:


This obviously has far-reaching consequences on transaction handling in Django, but the current APIs should still work. (Fixing them is part 2 of the plan.) The general idea is that Django will explicitly start a transaction when entering transaction management.

This will obviously impact maintainers of backend for other databases, but if it works on Oracle (which doesn't have autocommit — it's emulated in OCI) and on PostgreSQL (which enforces autocommit), I hope it can work anywhere.

** Backwards-compatibility **

Roughly, I'd classify Django users in four groups:
1 - "Transactions, how do they work?"
2 - "Django autocommits, doesn't it?"
3 - "I'm using TransactionMiddleware!"
4 - "I'm managing my transactions."

Groups 1 and 2 won't see the difference. There won't be any difference for group 3. Group 4 may be impacted by the change, but I believe most people in this category have autocommit turned on already — or would like to, if they're on MySQL — and will understand the change.

I don't see much point in providing an option to turn autocommit off, because starting a transaction is a much more explicit way to achieve the same effect. We usually don't provide "backwards compatibility with bugs".

Yay or nay?
+1 

-- 
Aymeric.



--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.
 
 
Reply | Threaded
Open this post in threaded view
|

Re: Switch to database-level autocommit

Florian Apolloner
In reply to this post by Aymeric Augustin
Yay (+1) and preferably ASAP to get it tested by users right now.

--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.
 
 
Reply | Threaded
Open this post in threaded view
|

Re: Switch to database-level autocommit

Stephan Jaensch-2
In reply to this post by Aymeric Augustin
 
I'd like to improve transactions handling in Django. The first step is the current emulation of autocommit with database-level autocommit.
[...] 
I don't see much point in providing an option to turn autocommit off, because starting a transaction is a much more explicit way to achieve the same effect. We usually don't provide "backwards compatibility with bugs".

Yay or nay?

A definite yay from me!

Cheers,
Stephan 

--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.
 
 
Reply | Threaded
Open this post in threaded view
|

Re: Switch to database-level autocommit

Javier Guerra Giraldez
In reply to this post by Aymeric Augustin
On Fri, Mar 1, 2013 at 7:48 AM, Aymeric Augustin
<[hidden email]> wrote:
> To alleviate the pain, Django commits after each ORM write.

just curious:  why is it so?   i can't think of any reason not to tie
transaction to the request/response cycle by default. (ie what
TransactionMiddleware does).

this is the default on a few other frameworks i use, and has already
saved my behinds more than once.


--
Javier

--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.


Reply | Threaded
Open this post in threaded view
|

Re: Switch to database-level autocommit

Aymeric Augustin
On 1 mars 2013, at 14:38, Javier Guerra Giraldez <[hidden email]> wrote:

> On Fri, Mar 1, 2013 at 7:48 AM, Aymeric Augustin
> <[hidden email]> wrote:
>> To alleviate the pain, Django commits after each ORM write.
>
> just curious:  why is it so?   i can't think of any reason not to tie
> transaction to the request/response cycle by default. (ie what
> TransactionMiddleware does).

TransactionMiddleware makes efficient connection pooling nearly
impossible.

> this is the default on a few other frameworks i use, and has already
> saved my behinds more than once.


I agree that it's a good default for small-to-medium sites.

This is somewhere in a later part of my plan :)

--
Aymeric.



--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.


Reply | Threaded
Open this post in threaded view
|

Re: Switch to database-level autocommit

Carl Meyer-4
In reply to this post by Aymeric Augustin
On 03/01/2013 05:48 AM, Aymeric Augustin wrote:
> Basically, Django intends to provide autocommit by default. Rather than
> fight the database adapter that itselfs fights the database, I propose
> to simply turn autocommit on, and stop implicitly starting and
> committing transactions. Explicit is better than implicit.

+1.

Carl

--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.


Reply | Threaded
Open this post in threaded view
|

Re: Switch to database-level autocommit

Alex Gaynor
+1 from me. Here's a patch to add autocommit to MySQL: https://github.com/django/django/pull/857

FWIW: any sort of scheme where a transaction is kept open all request (including a transaction that only ever reads), will cause you serious pain if you're trying to do migrations on MySQL with traffic.

Alex


On Fri, Mar 1, 2013 at 7:07 AM, Carl Meyer <[hidden email]> wrote:
On 03/01/2013 05:48 AM, Aymeric Augustin wrote:
> Basically, Django intends to provide autocommit by default. Rather than
> fight the database adapter that itselfs fights the database, I propose
> to simply turn autocommit on, and stop implicitly starting and
> committing transactions. Explicit is better than implicit.

+1.

Carl

--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.





--
"I disapprove of what you say, but I will defend to the death your right to say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
"The people's good is the highest law." -- Cicero

--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.
 
 
Reply | Threaded
Open this post in threaded view
|

Re: Switch to database-level autocommit

Christophe Pettus-2
In reply to this post by Aymeric Augustin

On Mar 1, 2013, at 4:48 AM, Aymeric Augustin wrote:
> Yay or nay?

+1.
--
-- Christophe Pettus
   [hidden email]

--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.


Reply | Threaded
Open this post in threaded view
|

Re: Switch to database-level autocommit

Jacob Kaplan-Moss-2
In reply to this post by Aymeric Augustin
Hey Aymeric -

Yes, I think this is the right thing to do: +1.

Jacob

On Fri, Mar 1, 2013 at 4:48 AM, Aymeric Augustin
<[hidden email]> wrote:

> Hello,
>
> I'd like to improve transactions handling in Django. The first step is the
> current emulation of autocommit with database-level autocommit.
>
> ** Rationale **
>
> PEP249 says that any SQL query opens a transaction and that it's the
> application's job to commit (or rollback) changes.  This model is also
> required by the SQL standard. But it isn't very developer-friendly.
>
> To alleviate the pain, Django commits after each ORM write.  Unfortunately,
> this means that each read opens a transaction, that eventually gets
> committed after the next write, or rolled back at the end of the query. Such
> transactions are useless and don't come for free. Relying on them to enforce
> integrity is extremely fragile — what if an external library starts writing
> to a log table in the middle of one of these implicit transactions? The term
> "footgun" comes to mind.
>
> Database authors have reached the same conclusion, and most databases
> supported by Django use autocommit by default, ignoring the SQL standard. On
> PostgreSQL and SQLite, this is the only mode available.
>
> As a consequence, to implement the behavior mandated by PEP 249, the Python
> libraries (psycopg2, sqlite3, etc.) automatically start transactions. And
> then Django automatically commits them. This is not only wasteful, but also
> buggy. It's the root cause of "idle in transaction" connections on
> PostgreSQL. It's also sometimes poorly implemented: for instance, executing
> "SAVEPOINT …" on SQLite commits implicitly. (It's arguably a bug in the
> design of the sqlite3 module. The Python bug tracker suggests it's going to
> be documented.)
>
> Basically, Django intends to provide autocommit by default. Rather than
> fight the database adapter that itselfs fights the database, I propose to
> simply turn autocommit on, and stop implicitly starting and committing
> transactions. Explicit is better than implicit.
>
> ** Implementation **
>
> All databases supported by Django provide an API to turn autocommit on:
>
> - http://initd.org/psycopg/docs/connection.html#connection.autocommit
> -
> http://docs.python.org/2/library/sqlite3#sqlite3.Connection.isolation_level
> - http://mysql-python.sourceforge.net/MySQLdb.html => conn.autocommit()
> -
> http://cx-oracle.sourceforge.net/html/connection.html#Connection.autocommit
>
> This obviously has far-reaching consequences on transaction handling in
> Django, but the current APIs should still work. (Fixing them is part 2 of
> the plan.) The general idea is that Django will explicitly start a
> transaction when entering transaction management.
>
> This will obviously impact maintainers of backend for other databases, but
> if it works on Oracle (which doesn't have autocommit — it's emulated in OCI)
> and on PostgreSQL (which enforces autocommit), I hope it can work anywhere.
>
> ** Backwards-compatibility **
>
> Roughly, I'd classify Django users in four groups:
> 1 - "Transactions, how do they work?"
> 2 - "Django autocommits, doesn't it?"
> 3 - "I'm using TransactionMiddleware!"
> 4 - "I'm managing my transactions."
>
> Groups 1 and 2 won't see the difference. There won't be any difference for
> group 3. Group 4 may be impacted by the change, but I believe most people in
> this category have autocommit turned on already — or would like to, if
> they're on MySQL — and will understand the change.
>
> I don't see much point in providing an option to turn autocommit off,
> because starting a transaction is a much more explicit way to achieve the
> same effect. We usually don't provide "backwards compatibility with bugs".
>
> Yay or nay?
>
> --
> Aymeric.
>
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "Django developers" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [hidden email].
> To post to this group, send email to [hidden email].
> Visit this group at http://groups.google.com/group/django-developers?hl=en.
> For more options, visit https://groups.google.com/groups/opt_out.
>
>

--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.


Reply | Threaded
Open this post in threaded view
|

Re: Switch to database-level autocommit

Anssi Kääriäinen
In reply to this post by Aymeric Augustin
On 1 maalis, 14:48, Aymeric Augustin
<[hidden email]> wrote:

> Yay or nay?

+1.

I discussed this on IRC with Aymeric. I tried to find cases where this
breaks currently working code, and I was mostly unsuccessful. It seems
there could be such cases (doing raw SQL writes, commit at end, and no
ORM writes, all this outside transaction management), but it seems
such cases are rare, and worked more by luck than knowledge of how
Django's tx management works. Explicit transactions for such cases
would be good anyways.

This will be a backwards incompatible change. The current behaviour is
explicitly documented, and the new behaviour is clearly different in
some cases. However, finding cases where one actually wants the
current behaviour (ORM writes commit, raw SQL writes do not, and any
query will start a transaction) is hard. Doing a full deprecation path
for this will be ugly, so just changing this without deprecation
period seems good to me.

All in all, this might cause some pain to very small portion of users
when upgrading to 1.6. Those users will almost certainly be happy
after they have upgraded their code. Otherwise this will be a big win
for everybody.

 - Anssi

--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.


Reply | Threaded
Open this post in threaded view
|

Re: Switch to database-level autocommit

Shai Berger
In reply to this post by Aymeric Augustin
On Friday 01 March 2013, Aymeric Augustin wrote:
> Hello,
>
> I'd like to improve transactions handling in Django. The first step is [to
> replace -- assumed, SB] the current emulation of autocommit with
> database-level autocommit.
>

There is an issue you seem to be ignoring: An "ORM Write" is, in many cases,
more than a single query against the backend. The most obvious example is
models with inheritance -- trying to do these with database-level autocommit
means that the code writes the two parts in separate transactions. I'm very -1
on this.

There are two alternatives I see -- one is to make sure that an ORM Write is
still a transaction (not database-level autocommit); the other is to
explicitly commit-unless-managed (or something equivalent) after every read
and raw-sql, as well as write.

BTW, how do you treat select-for-update, which currently makes sense in "faked
autocommit" mode, but will stop making sense with the suggested change?

Shai.

--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.


Reply | Threaded
Open this post in threaded view
|

Re: Switch to database-level autocommit

Shai Berger
In reply to this post by Aymeric Augustin
Thinking again, a few more points come to mind:

On Friday 01 March 2013, Aymeric Augustin wrote:
>
> Such transactions are useless and don't come for free. Relying on them to
> enforce integrity is extremely fragile — what if an external library
> starts writing to a log table in the middle of one of these implicit
> transactions?

Then it's the external library that is at fault; it should be using a separate
alias, with its own transactions.

> The term "footgun" comes to mind.
>
True; but has a wider implication than you seem to give it credit for. In
particular,

>
> ** Backwards-compatibility **
>
> Roughly, I'd classify Django users in four groups:
> 1 - "Transactions, how do they work?"
> 2 - "Django autocommits, doesn't it?"
> 3 - "I'm using TransactionMiddleware!"
> 4 - "I'm managing my transactions."
>
> Groups 1 and 2 won't see the difference.

This will turn existing projects from correct (though perhaps fragile) to
buggy, exactly in groups 1&2, where the users are less likely to understand
the issues. The apps are correct today -- they rely on the documented
transaction behavior, even if the authors are not fully aware of it. And they
will turn buggy in a way that is not likely to be detected by tests, because
it will have effects mostly under concurrent access or system crashes --
circumstances you don't usually have in tests.

Thus,
>
> I don't see much point in providing an option to turn autocommit off,
> because starting a transaction is a much more explicit way to achieve the
> same effect. We usually don't provide "backwards compatibility with bugs".
>
-1. Make it easier (and cross-backend) to set db-level-autocommit on. Put the
setting for it in the default template for new projects. Don't change existing
code from "fragile" to "subtly broken".

Shai.

--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.


Reply | Threaded
Open this post in threaded view
|

Re: Switch to database-level autocommit

Aymeric Augustin
In reply to this post by Shai Berger
On 2 mars 2013, at 15:50, Shai Berger <[hidden email]> wrote:

> There is an issue you seem to be ignoring: An "ORM Write" is, in many cases,
> more than a single query against the backend. The most obvious example is
> models with inheritance -- trying to do these with database-level autocommit
> means that the code writes the two parts in separate transactions. I'm very -1
> on this.

Yes, that's very important. Anssi raised this problem on IRC too.

(I also heard that Django is currently broken in this regard; I didn't check.)

> There are two alternatives I see -- one is to make sure that an ORM Write is
> still a transaction (not database-level autocommit); the other is to
> explicitly commit-unless-managed (or something equivalent) after every read
> and raw-sql, as well as write.

The plan is to ensure that ORM operations that may result in multiple queries
are executed within a transaction.

It's possible to start with a dumb implementation, ie. wrap all ORM writes in a
transaction, and eventually get smarter, ie. determine if more than one query
will be emitted.

Even the dumb implementation will have better transactional properties than
the status quo.

In practice, that means adding support for savepoints to all backends, ripping
off https://github.com/xof/xact — with permission from Christophe, which I
haven't asked for yet — and wrapping ORM writes in this decorator. However,
autocommit is a pre-requisite for adding savepoints to SQLite, because of the
stdlib's braindead implementation of PEP 249.

Hence, autocommit is part 1 of the plan, and improved transaction
management is part 2.

> BTW, how do you treat select-for-update, which currently makes sense in "faked
> autocommit" mode, but will stop making sense with the suggested change?


I'm leaning towards not doing any magic and documenting this as a backwards
incompatibility. Relying on Django's current pseudo-auto-commit is very fragile:

qs = MyModel.objects.filter(…)
qs.select_for_update(…)
# what if someone introduces an unrelated ORM write here?
qs.update(…)

I hope developers already wrap their select_for_update calls and associated
writes in transaction.commit_on_success, and it's a good thing to make it
mandatory.

--
Aymeric.



--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.


Reply | Threaded
Open this post in threaded view
|

Re: Switch to database-level autocommit

Aymeric Augustin
In reply to this post by Shai Berger
On 2 mars 2013, at 16:18, Shai Berger <[hidden email]> wrote:

> -1. Make it easier (and cross-backend) to set db-level-autocommit on. Put the
> setting for it in the default template for new projects. Don't change existing
> code from "fragile" to "subtly broken".

This isn't simply about changing some default. It's about re-implementing all the
transaction machinery.

With the scope I described, I estimate that I can complete the project in 120 to 200
hours. If I have to keep bug-parity with the current behavior, I estimate that I'm not
capable of delivering it.

In other words, you're requesting that I fork Django rather than contribute this to the
main tree.

--
Aymeric.



--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.


Reply | Threaded
Open this post in threaded view
|

Re: Switch to database-level autocommit

Shai Berger
On Saturday 02 March 2013, Aymeric Augustin wrote:

> On 2 mars 2013, at 16:18, Shai Berger <[hidden email]> wrote:
> > -1. Make it easier (and cross-backend) to set db-level-autocommit on. Put
> > the setting for it in the default template for new projects. Don't
> > change existing code from "fragile" to "subtly broken".
>
> This isn't simply about changing some default. It's about re-implementing
> all the transaction machinery.
>
> With the scope I described, I estimate that I can complete the project in
> 120 to 200 hours. If I have to keep bug-parity with the current behavior,
> I estimate that I'm not capable of delivering it.
>
> In other words, you're requesting that I fork Django rather than contribute
> this to the main tree.

I may have phrased my objection too strongly -- all in all, I laud your
continuing effort to improve Django and its database layer, including this
change. However, I must stand my ground.

The Django documentation on transactions, at the moment says this on Django's
default behavior[0]:

> Django’s default behavior is to run with an open transaction which it
> commits automatically when any built-in, data-altering model function is
> called. For example, if you call model.save() or model.delete(), the
> change will be committed immediately.

Now, you want to call this behavior (the part where there's one open
transaction, rather than one-per-db-operation) a bug. I find that a little odd,
but I can live with it.

What I have a harder time living with is my other point, which has not been
refuted: The proposed change turns code that is currently correct -- liable to
break if changed, but still correct as it stands -- into code with "phantom
bugs" that show up only in production, under specific timings of concurrent
requests (because reads and writes which used to be in the same transaction
are now on separate transactions). It is my opinion that such changes may only
be made when changing major versions (i.e. Django 2.0), if at all.

If anyone can offer a way to detect the situation and warn users of the change,
that would make things perfectly fine. Changing the default transaction
behavior along the lines you suggested would be a great improvement. I have a
strong suspicion, though, that if the new behavior is implemented as default
(and with no recourse to the old behavior, to boot), then that simply cannot
be done.

I think that as suggested, the change would break the project's committments
to its users in a way that is much more important than the performance gains,
avoided idle-in-transaction errors, and effort considerations, all put
together. That is my objection.

Thanks for your attention,

Shai.

[0] https://docs.djangoproject.com/en/dev/topics/db/transactions/

--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.


Reply | Threaded
Open this post in threaded view
|

Re: Switch to database-level autocommit

Aymeric Augustin
On 2 mars 2013, at 21:46, Shai Berger <[hidden email]> wrote:

> The Django documentation on transactions, at the moment says this on Django's
> default behavior[0]:
>
>> Django’s default behavior is to run with an open transaction which it
>> commits automatically when any built-in, data-altering model function is
>> called. For example, if you call model.save() or model.delete(), the
>> change will be committed immediately.

Yes, my proposal is a backwards-incompatible change with regard to this
sentence.

However, I believe that only a small fraction of the users of Django fully
grok the implications of this sentence, and a fraction of this fraction relies
on it to ensure some level of integrity.

> What I have a harder time living with is my other point, which has not been
> refuted: The proposed change turns code that is currently correct -- liable to
> break if changed, but still correct as it stands -- into code with "phantom
> bugs" that show up only in production, under specific timings of concurrent
> requests (because reads and writes which used to be in the same transaction
> are now on separate transactions).

I'm aware of this scenario and I'm willing to document it in the release
notes.

> It is my opinion that such changes may only
> be made when changing major versions (i.e. Django 2.0), if at all.

The current consensus — at least within the core team — is that there will never
be a "break everything" Django 2.0.

I believe that the level of backwards-incompatibility described above is within
acceptable bounds for Django 1.6.

I also believe that it beats the alternative — namely, live with the current
behavior forever.

> If anyone can offer a way to detect the situation and warn users of the change,
> that would make things perfectly fine. Changing the default transaction
> behavior along the lines you suggested would be a great improvement. I have a
> strong suspicion, though, that if the new behavior is implemented as default
> (and with no recourse to the old behavior, to boot), then that simply cannot
> be done.
>
> I think that as suggested, the change would break the project's committments
> to its users in a way that is much more important than the performance gains,
> avoided idle-in-transaction errors, and effort considerations, all put
> together. That is my objection.


Yes, I agree with this way to frame the discussion.

We reach different conclusions because I don't consider backwards
compatibility an absolute that trumps every other consideration. It's a
continuum. Backwards compatibility must find a balance with progress.  The
same goes for security: while extremely important, it still has to find a
balance with usability — otherwise you'd never dare open port 80.

--
Aymeric.



--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.


Reply | Threaded
Open this post in threaded view
|

Re: Switch to database-level autocommit

Shai Berger

Hi again,

 

On Sunday 03 March 2013, Aymeric Augustin wrote:

> On 2 mars 2013, at 21:46, Shai Berger <[hidden email]> wrote:

> > The Django documentation on transactions, at the moment says this on

> > Django's

> >

> > default behavior[0]:

> >> Django’s default behavior is to run with an open transaction which it

> >> commits automatically when any built-in, data-altering model function is

> >> called. For example, if you call model.save() or model.delete(), the

> >> change will be committed immediately.

>

> Yes, my proposal is a backwards-incompatible change with regard to this

> sentence.

>

> However, I believe that only a small fraction of the users of Django fully

> grok the implications of this sentence, and a fraction of this fraction

> relies on it to ensure some level of integrity.

>

 

I agree with this belief about users *deliberately* relying on the implications for integrity. I have no idea how many people simply wrote code that works, and will now break. I also have no idea how many people maintain a system started long ago by a more knowledgeable developer.

 

> > What I have a harder time living with is my other point, which has not

> > been refuted: The proposed change turns code that is currently correct

> > -- liable to break if changed, but still correct as it stands -- into

> > code with "phantom bugs" that show up only in production, under specific

> > timings of concurrent requests (because reads and writes which used to

> > be in the same transaction are now on separate transactions).

> [...]

>

> I believe that the level of backwards-incompatibility described above is

> within acceptable bounds for Django 1.6.

>

 

I believe this is the core of our disagreement here.

 

> I also believe that it beats the alternative — namely, live with the

> current behavior forever.

>

 

I sincerely hope that is not the only alternative; that there's a way to implement the new behavior side-by-side with the old one. There usually is.

 

> > If anyone can offer a way to detect the situation and warn users of the

> > change, that would make things perfectly fine. Changing the default

> > transaction behavior along the lines you suggested would be a great

> > improvement. I have a strong suspicion, though, that if the new behavior

> > is implemented as default (and with no recourse to the old behavior, to

> > boot), then that simply cannot be done.

> >

> > I think that as suggested, the change would break the project's

> > committments to its users in a way that is much more important than the

> > performance gains, avoided idle-in-transaction errors, and effort

> > considerations, all put together. That is my objection.

>

> Yes, I agree with this way to frame the discussion.

>

> We reach different conclusions because I don't consider backwards

> compatibility an absolute that trumps every other consideration. It's a

> continuum. Backwards compatibility must find a balance with progress.

 

I don't consider backwards-compatibility an absolute, and the first of my two paras which you quoted above shows just that. I do think there are proper ways to introduce backwards-incompatible changes, and what you suggested isn't one of them.

 

In particular, per https://docs.djangoproject.com/en/dev/misc/api-stability/, we should have a deprecation process with warnings, unless fixing a security bug. Even with security issues, it was deemed preferrable to take an expedited deprecation process when possible (e.g. with the markdown module) rather than an immediate breaking change. Throwing this process to the wind over an improvement -- as nice as it is -- seems unwise to me.

 

An immeiately-breaking change, where the breakage is hard to detect and debug, and with high potential for data loss... I don't think any warning in the release notes can be enough. All I can say is, -1.

 

Shai.

--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.
 
 
Reply | Threaded
Open this post in threaded view
|

Re: Switch to database-level autocommit

Jacob Kaplan-Moss-2
On Sat, Mar 2, 2013 at 3:27 PM, Shai Berger <[hidden email]> wrote:
>> I believe that the level of backwards-incompatibility described above is
>> within acceptable bounds for Django 1.6.
>
> I believe this is the core of our disagreement here.

I'm with Aymeric: the current behavior is bad enough, and this is a
big enough improvement, and the backwards-incompatibility is minor
enough. I think the data-loss possibilities Shai suggests are pretty
overblown. Shai: do you have any real-world code that'd have data loss
with this bug? Looking through the tons of code I've got available
can't reveal a single place where this chance would have any negative
effect. OTOH, there're lots of places where it'd have a positive
effect.

Jacob

--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.


Reply | Threaded
Open this post in threaded view
|

Re: Switch to database-level autocommit

Florian Apolloner
In reply to this post by Shai Berger
Hi Shai,

On Sunday, March 3, 2013 12:27:47 AM UTC+1, Shai Berger wrote:
> I also believe that it beats the alternative — namely, live with the

> current behavior forever.

 

I sincerely hope that is not the only alternative; that there's a way to implement the new behavior side-by-side with the old one. There usually is.


Can you describe how an alternative in this case would look like? I personally can't think of any sensible one.
 
I don't consider backwards-compatibility an absolute, and the first of my two paras which you quoted above shows just that. I do think there are proper ways to introduce backwards-incompatible changes, and what you suggested isn't one of them.

If you think there are proper ways to introduce this specific change, please show us. Aside from that, all of our bugfixes and new features have a possibility to break code. Bugfixes that break old code usually occur as a side effect of not seeing the issue the fix causes or being fully aware of the issue and considering it minor enough (that is no data-loss and only happening to a small fraction of users). New Features shouldn't break code, but they usually also touch old code, so they can break stuff in the same way as bugfixes.

In the case of this change I think that users with low traffic sites won't even notice the changes (no concurrency) and users with high traffic sites shouldn't see it cause they should be using autocommit and the relevant decorators anyways…


An immeiately-breaking change, where the breakage is hard to detect and debug, and with high potential for data loss... I don't think any warning in the release notes can be enough. All I can say is, -1.


I am still +1.

Cheers,
Florian

--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at http://groups.google.com/group/django-developers?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.
 
 
123