Quantcast

[Django] #16649: Models.save() refactoring: check updated rows to determine action

classic Classic list List threaded Threaded
19 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[Django] #16649: Models.save() refactoring: check updated rows to determine action

Django
#16649: Models.save() refactoring: check updated rows to determine action
----------------------------+----------------------------------------------
 Reporter:  akaariai        |          Owner:  nobody
     Type:                  |         Status:  new
  Cleanup/optimization      |      Component:  Database layer (models, ORM)
Milestone:                  |       Severity:  Normal
  Version:  1.3             |   Triage Stage:  Unreviewed
 Keywords:                  |  Easy pickings:  0
Has patch:  1               |
    UI/UX:  0               |
----------------------------+----------------------------------------------
 The attached patch implements a new logic for models.save(). Instead of
 the old
 {{{
 SELECT
 if found:
     UPDATE
 else:
     INSERT

 The patch uses
 UPDATE
 if rows modified:
     DONE
 else:
     INSERT
 }}}

 The logic is naturally more complex when tacking in account force_update,
 force_insert and all the corner cases.

 As a teaser: load 1000 objects from db, update field, save. Almost 50%
 faster now. (Remember to run in one transaction).

 The gain is that when the save() results in an update, there is no need
 for an extra select. In my somewhat limited tests, with PostgreSQL and
 10000 saves of a simple model in single transaction the result was almost
 50% speed up. The speedup is similar when using SQLite3.

 For the insert case, when there is no PK value for the model, there is no
 effect. The result is an insert as always.

 For the insert case where there is a PK value defined, and there is no
 matching pk value in the DB, the idea results in a speed loss. The reason
 is that when doing the UPDATE which doesn't update anything, the data must
 be transferred to the db. The speed loss is very small for models which
 have not too many / wide fields, and according to my quick testing, it is
 somewhere from 10 to 20% with very wide (100000 chars) models. This is
 localhost setup, so if there is a slow (as in throughput) network
 involved, I would imagine this would result in pretty much exactly 50%
 speed loss. There is patch attached to this ticket mitigating this ticket.
 In the end of this post there is an explanation how to mitigate this
 problem.

 There is one big catch: It is currently very well documented that Django
 does a select and then based on that an update or an insert. I can't see
 changing that could break applications (famous last words) so this is just
 a documentation issue. The other downside is that this makes save_base
 more complicated. Especially the patch no2, we are sometimes selecting and
 sometimes updating as first action, and it is not easy to guess the path
 taken.

 The speedtests I did are basically different ways of saving models with pk
 set and not set. I will post some benchmarks if this is not thrown out on
 backwards compatibility / complexity claims. I don't have any benchmarks
 worth posting at the moment.

 All tests passed using SQLite3, running PostgreSQL tests currently. No new
 tests added. Needs documentation, but I might not be the best person to do
 that. MultiDB testing would be welcome. I hope to get some feedback.
 Negative or positive...


 ----
 To work around the unnecessary data transfer problem, I used the
 model._state.adding variable. The idea is simple. If we are adding (the
 model did not come from the db) and there is a pk value, we don't expect
 there to be any data matching for our primary key. So we use the old
 select check in this case to avoid the potential network traffic. Based on
 the select, set force_insert or force_update to get the desired action.
 This mimics exactly what was done before, expect for error messages in
 obscure cases. Those can be fixed, too.

 When the _state.adding is False (the model did come from the db) we expect
 the save() to result in an update. So we use the above described method.

 There is still one situation where that guess can fail. Multidb. Save the
 model first in master db, then in slave dbs. The pk is set and
 adding=False, but we are still adding in reality. For this there is a
 check for self._state.db <> using. I would except that in these situations
 people would use force_insert=True, but I am no multidb person so not sure
 about that.

 I did a third small optimization. The idea is again simple. When saving a
 inheriting model, the parents are saved first (there is no change here).
 If the parent was inserted, we know this model must also be inserted. It
 is not possible for a model to exist without it's parent. This way we
 don't need to do any checking, we can force_insert. For update the same
 doesn't work: it is possible that the parent exists already, and we are
 "extending" it at the moment, saving a new model. This optimization can
 result also in almost 50% speedup in the best case, although the cases are
 more rare. I don't know any downsides to this optimization.

 There are three patches. The first is the base idea, the second is the
 _state.adding trick and the third is the parent inheritance thingy. These
 must be applied in order.

 Sorry for the long post. I was going to do just the update and check rows
 returned idea. But why not do just one small thing more... :)

--
Ticket URL: <https://code.djangoproject.com/ticket/16649>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

--
You received this message because you are subscribed to the Google Groups "Django updates" group.
To post to this group, send email to [hidden email].
To unsubscribe from this group, send email to [hidden email].
For more options, visit this group at http://groups.google.com/group/django-updates?hl=en.

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [Django] #16649: Models.save() refactoring: check updated rows to determine action

Django
#16649: Models.save() refactoring: check updated rows to determine action
-------------------------------------+-------------------------------------
               Reporter:  akaariai   |          Owner:  nobody
                   Type:             |         Status:  new
  Cleanup/optimization               |      Component:  Database layer
              Milestone:             |  (models, ORM)
                Version:  1.3        |       Severity:  Normal
             Resolution:             |       Keywords:
           Triage Stage:  Design     |      Has patch:  1
  decision needed                    |    Needs tests:  0
    Needs documentation:  0          |  Easy pickings:  0
Patch needs improvement:  0          |
                  UI/UX:  0          |
-------------------------------------+-------------------------------------
Changes (by aaugustin):

 * needs_better_patch:   => 0
 * needs_docs:   => 0
 * needs_tests:   => 0
 * stage:  Unreviewed => Design decision needed


Comment:

 The idea makes sense and I think it's worth studying.

 However, Trac isn't the best place to reach an audience for comments, I
 suggest writing to django-developers if you haven't done so yet.

--
Ticket URL: <https://code.djangoproject.com/ticket/16649#comment:1>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

--
You received this message because you are subscribed to the Google Groups "Django updates" group.
To post to this group, send email to [hidden email].
To unsubscribe from this group, send email to [hidden email].
For more options, visit this group at http://groups.google.com/group/django-updates?hl=en.

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [Django] #16649: Models.save() refactoring: check updated rows to determine action

Django
In reply to this post by Django
#16649: Models.save() refactoring: check updated rows to determine action
-------------------------------------+-------------------------------------
               Reporter:  akaariai   |          Owner:  nobody
                   Type:             |         Status:  new
  Cleanup/optimization               |      Component:  Database layer
              Milestone:             |  (models, ORM)
                Version:  1.3        |       Severity:  Normal
             Resolution:             |       Keywords:
           Triage Stage:  Design     |      Has patch:  1
  decision needed                    |    Needs tests:  0
    Needs documentation:  0          |  Easy pickings:  0
Patch needs improvement:  0          |
                  UI/UX:  0          |
-------------------------------------+-------------------------------------

Comment (by jacob):

 akaariai, can you back up a few steps and explain *why* you think this is
 a good idea? As far as I can tell, the speed benefits are just a trade-
 off: you trade create performance for update performance. Either call we
 make is going to adversely affect some users, so that's exactly why the
 `force_update`/`_insert` arguments exist. As it stands, I'm fairly well
 against the idea: I can't see the benefit in making a fundamental change
 like this, and I *can* see a lot of downside to changing the behavior of
 `save()`. More information might change my mind, though.

 Thanks!

--
Ticket URL: <https://code.djangoproject.com/ticket/16649#comment:2>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

--
You received this message because you are subscribed to the Google Groups "Django updates" group.
To post to this group, send email to [hidden email].
To unsubscribe from this group, send email to [hidden email].
For more options, visit this group at http://groups.google.com/group/django-updates?hl=en.

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [Django] #16649: Models.save() refactoring: check updated rows to determine action

Django
In reply to this post by Django
#16649: Models.save() refactoring: check updated rows to determine action
-------------------------------------+-------------------------------------
               Reporter:  akaariai   |          Owner:  nobody
                   Type:             |         Status:  new
  Cleanup/optimization               |      Component:  Database layer
              Milestone:             |  (models, ORM)
                Version:  1.3        |       Severity:  Normal
             Resolution:             |       Keywords:
           Triage Stage:  Design     |      Has patch:  1
  decision needed                    |    Needs tests:  0
    Needs documentation:  0          |  Easy pickings:  0
Patch needs improvement:  0          |
                  UI/UX:  0          |
-------------------------------------+-------------------------------------

Comment (by akaariai):

 In summary, update case now requires just one query while previously it
 required 2. Insert case still requires 2 queries. Unfortunately the update
 -> insert way can be a little bit more expensive than doing select ->
 insert. This is due to the need to transfer the data to the DB two times.
 There are still reasons why I think this is an optimization:

  - I think we will be gaining more from the update case speedup than we
 are going to lose in the insert case. Needs benchmarks.
  - In the common case of an autofield pk we are even more likely to gain.
 Autofield gets its value on first save and should not change after that.
 If the field is set to a value it is very likely we are going to do an
 update. If it is not set, we are going to do an insert directly.
  - When loading fixtures, or when saving the same model to another DB we
 are likely going to do an insert. In this case we can use a heuristic: if
 we are adding the object to the DB (instance._state.adding=True) or if we
 are saving to a different DB (instance._state.db <> using) we use the old
 method of SELECT -> insert / update.

 All three above combined, I don't think there are many cases where losses
 are expected. The only case I can think of is reusing an object in a loop
 while changing its primary key, something like "obj = Obj(); for i in
 range(0, 100): obj.pk=i; obj.save()" - but I don't think that is a case we
 should optimize for. In fact, I think that should either result in error
 or update of the PK value, but that is another ticket's problem.

 Of course, the best optimization would be to use database specific methods
 (MERGE or INSERT ON DUPLICATE KEY UPDATE). This would allow for single
 query save in every case. Unfortunately there is no such method for
 PostgreSQL, at least not any method I know of. It might make sense to
 investigate this route first.

 Sorry for making the original post so long, there are a lot of ideas mixed
 in one ticket. I will try to be a better trac-citizen in the future :)

--
Ticket URL: <https://code.djangoproject.com/ticket/16649#comment:3>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

--
You received this message because you are subscribed to the Google Groups "Django updates" group.
To post to this group, send email to [hidden email].
To unsubscribe from this group, send email to [hidden email].
For more options, visit this group at http://groups.google.com/group/django-updates?hl=en.

Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [Django] #16649: Models.save() refactoring: check updated rows to determine action

Django
In reply to this post by Django
#16649: Models.save() refactoring: check updated rows to determine action
-------------------------------------+-------------------------------------
     Reporter:  akaariai             |                    Owner:  nobody
         Type:                       |                   Status:  new
  Cleanup/optimization               |                  Version:  1.3
    Component:  Database layer       |               Resolution:
  (models, ORM)                      |             Triage Stage:  Design
     Severity:  Normal               |  decision needed
     Keywords:                       |      Needs documentation:  0
    Has patch:  1                    |  Patch needs improvement:  0
  Needs tests:  0                    |                    UI/UX:  0
Easy pickings:  0                    |
-------------------------------------+-------------------------------------

Comment (by akaariai):

 A bit more information about what the approach suggested is about. Assume
 you load a model from DB, change some fields and save it. We currently do:
 {{{
 SELECT - load from DB
 [change fields]; save()
 In save:
     SELECT - assert that the PK exists in the DB
     if yes:
         UPDATE
     else:
         INSERT
 }}}

 The second select above is usually redundant. If we know the object is
 loaded from DB (self._state.adding == False), and we are saving it to the
 same DB it was loaded from (self._state.db == the db we are saving to),
 then it is very likely the object exists in the DB. So, instead we should
 do:
 {{{
 In save:
     if same db, not adding:
         exists = UPDATE
     else:
         exists = SELECT
         if exists:
             UPDATE
     if not exists:
         INSERT
 }}}
 So, in the case the model is already in the DB, we do just a single update
 instead of select + update. If it so happens it doesn't exist (which
 should be very unlikely), we do update with zero rows modified and still
 know to do an insert.

 This does lead to behaviour change on MySQL when there is a concurrent
 insert to the same table. Currently the SELECT sees nothing, and the
 insert will lead to integrity error. After this, the update will block
 until the inserting transaction is committed, and then update the matching
 row. However, on PostgreSQL the update sees nothing, so there is no
 behaviour change there. For MySQL users this will be better than what we
 have now. For PostgreSQL users there is no change from behaviour
 standpoint.

--
Ticket URL: <https://code.djangoproject.com/ticket/16649#comment:4>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

--
You received this message because you are subscribed to the Google Groups "Django updates" group.
To post to this group, send email to [hidden email].
To unsubscribe from this group, send email to [hidden email].
For more options, visit https://groups.google.com/groups/opt_out.


Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [Django] #16649: Models.save() refactoring: check updated rows to determine action

Django
In reply to this post by Django
#16649: Models.save() refactoring: check updated rows to determine action
-------------------------------------+-------------------------------------
     Reporter:  akaariai             |                    Owner:  nobody
         Type:                       |                   Status:  new
  Cleanup/optimization               |                  Version:  1.3
    Component:  Database layer       |               Resolution:
  (models, ORM)                      |             Triage Stage:  Design
     Severity:  Normal               |  decision needed
     Keywords:                       |      Needs documentation:  0
    Has patch:  1                    |  Patch needs improvement:  0
  Needs tests:  0                    |                    UI/UX:  0
Easy pickings:  0                    |
-------------------------------------+-------------------------------------

Comment (by akaariai):

 Sorry, the explanations above are a little confusing. Still another try:
 {{{
 s = SomeModel.objects.get(pk=someval)
 s.somecol = someval
 s.save()
 }}}
 Here save is implemented as SELECT - if not found INSERT - else UPDATE.
 The SELECT here is redundant, we have the information that SELECT was just
 done in model._state. So, we can do directly an UPDATE. If it happens so
 that nothing is updated (likely because of concurrent delete) then we will
 still do the INSERT and things work as expected.

 I have done a
 [https://github.com/akaariai/django/compare/model_save_refactor full
 refactor] of model.save_base(), this includes 4 parts:
   - What this ticket deals with - trying directly to UPDATE if the model
 was fetched from the same DB we are saving to. Also assorted cleanup to
 make this possible.
   - Cleanup to proxy parents handling (the logic is somewhat ugly
 currently)
   - A bug fix for #17341 (do not commit transaction after every parent
 model save)
   - Splitting save_base into parts so that the saving logic is easier to
 follow

 Above, the bug fix is of course needed, and the proxy parents handling
 cleanup is IMO also needed.

 I can't see any downsides to trying UPDATE directly as done in the patch.
 This should result in clear performance improvement in most cases. There
 is a problem that we have documented very explicitly the sequence of SQL
 commands .save() does but I don't think that documentation should be
 considered part of the public API. So, docs update is needed.

 The refactoring of .save_base() into parts is a stylistic question. I
 surprisingly prefer the refactored way. Some examples of things that are
 hard to see from the current writing of the code:
   - Which signals are sent for which models
   - The actual hard work is done in the if not meta.proxy: branch.
 However, it is impossible that meta.proxy is True here. And it isn't
 exactly easy to see this.
   - The origin parameter is a bit weird - it is None except for the
 outermost table save.

 Still, I'd like to get a confirmation that others prefer the new writing,
 too.

--
Ticket URL: <https://code.djangoproject.com/ticket/16649#comment:5>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

--
You received this message because you are subscribed to the Google Groups "Django updates" group.
To post to this group, send email to [hidden email].
To unsubscribe from this group, send email to [hidden email].
For more options, visit https://groups.google.com/groups/opt_out.


Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [Django] #16649: Models.save() refactoring: check updated rows to determine action

Django
In reply to this post by Django
#16649: Models.save() refactoring: check updated rows to determine action
-------------------------------------+-------------------------------------
     Reporter:  akaariai             |                    Owner:  nobody
         Type:                       |                   Status:  new
  Cleanup/optimization               |                  Version:  1.3
    Component:  Database layer       |               Resolution:
  (models, ORM)                      |             Triage Stage:  Ready for
     Severity:  Normal               |  checkin
     Keywords:                       |      Needs documentation:  0
    Has patch:  1                    |  Patch needs improvement:  0
  Needs tests:  0                    |                    UI/UX:  0
Easy pickings:  0                    |
-------------------------------------+-------------------------------------
Changes (by akaariai):

 * stage:  Design decision needed => Ready for checkin


Comment:

 Squash-rebase of the above branch available from:
 https://github.com/akaariai/django/compare/django:master...model_save_refactor_squash

 I am moving this out of DDN - I have written to django-developers and I
 haven't seen any -1 (or +1 for that matter) when I suggested this change.
 I take this as nobody opposing the change. I think the only reason to
 oppose this feature is that something might break. As far as I know
 nothing should break, but the only way to find out for sure is moving this
 in and getting real world usage of the new save() way.

 A final review (if only looking for typos & stale comments) is welcome.
 Another thing I'd still like to get reviewed is the structuring of the new
 code - do the method names make sense, are the splitpoints sensible?

--
Ticket URL: <https://code.djangoproject.com/ticket/16649#comment:6>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

--
You received this message because you are subscribed to the Google Groups "Django updates" group.
To post to this group, send email to [hidden email].
To unsubscribe from this group, send email to [hidden email].
For more options, visit https://groups.google.com/groups/opt_out.


Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [Django] #16649: Models.save() refactoring: check updated rows to determine action

Django
In reply to this post by Django
#16649: Models.save() refactoring: check updated rows to determine action
-------------------------------------+-------------------------------------
     Reporter:  akaariai             |                    Owner:  nobody
         Type:                       |                   Status:  new
  Cleanup/optimization               |                  Version:  1.3
    Component:  Database layer       |               Resolution:
  (models, ORM)                      |             Triage Stage:  Ready for
     Severity:  Normal               |  checkin
     Keywords:                       |      Needs documentation:  0
    Has patch:  1                    |  Patch needs improvement:  0
  Needs tests:  0                    |                    UI/UX:  0
Easy pickings:  0                    |
-------------------------------------+-------------------------------------

Comment (by akaariai):

 I updated the branch with some last minute polish. Will commit soon.

--
Ticket URL: <https://code.djangoproject.com/ticket/16649#comment:7>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

--
You received this message because you are subscribed to the Google Groups "Django updates" group.
To post to this group, send email to [hidden email].
To unsubscribe from this group, send email to [hidden email].
For more options, visit https://groups.google.com/groups/opt_out.


Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [Django] #16649: Models.save() refactoring: check updated rows to determine action

Django
In reply to this post by Django
#16649: Models.save() refactoring: check updated rows to determine action
-------------------------------------+-------------------------------------
     Reporter:  akaariai             |                    Owner:  nobody
         Type:                       |                   Status:  new
  Cleanup/optimization               |                  Version:  1.3
    Component:  Database layer       |               Resolution:
  (models, ORM)                      |             Triage Stage:  Ready for
     Severity:  Normal               |  checkin
     Keywords:                       |      Needs documentation:  0
    Has patch:  1                    |  Patch needs improvement:  0
  Needs tests:  0                    |                    UI/UX:  0
Easy pickings:  0                    |
-------------------------------------+-------------------------------------

Comment (by timo):

 I hope you won't find my grammar nitpicks too exacting. :-)

--
Ticket URL: <https://code.djangoproject.com/ticket/16649#comment:8>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

--
You received this message because you are subscribed to the Google Groups "Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
For more options, visit https://groups.google.com/groups/opt_out.


Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [Django] #16649: Models.save() refactoring: check updated rows to determine action

Django
In reply to this post by Django
#16649: Models.save() refactoring: check updated rows to determine action
-------------------------------------+-------------------------------------
     Reporter:  akaariai             |                    Owner:  nobody
         Type:                       |                   Status:  new
  Cleanup/optimization               |                  Version:  1.3
    Component:  Database layer       |               Resolution:
  (models, ORM)                      |             Triage Stage:  Accepted
     Severity:  Normal               |      Needs documentation:  0
     Keywords:                       |  Patch needs improvement:  1
    Has patch:  1                    |                    UI/UX:  0
  Needs tests:  0                    |
Easy pickings:  0                    |
-------------------------------------+-------------------------------------
Changes (by akaariai):

 * needs_better_patch:  0 => 1
 * stage:  Ready for checkin => Accepted


Comment:

 Thanks for the comments! The Finnish dialect of English has special
 grammar...

--
Ticket URL: <https://code.djangoproject.com/ticket/16649#comment:9>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

--
You received this message because you are subscribed to the Google Groups "Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
For more options, visit https://groups.google.com/groups/opt_out.


Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [Django] #16649: Models.save() refactoring: check updated rows to determine action

Django
In reply to this post by Django
#16649: Models.save() refactoring: check updated rows to determine action
-------------------------------------+-------------------------------------
     Reporter:  akaariai             |                    Owner:  nobody
         Type:                       |                   Status:  new
  Cleanup/optimization               |                  Version:  1.3
    Component:  Database layer       |               Resolution:
  (models, ORM)                      |             Triage Stage:  Ready for
     Severity:  Normal               |  checkin
     Keywords:                       |      Needs documentation:  0
    Has patch:  1                    |  Patch needs improvement:  0
  Needs tests:  0                    |                    UI/UX:  0
Easy pickings:  0                    |
-------------------------------------+-------------------------------------
Changes (by akaariai):

 * needs_better_patch:  1 => 0
 * stage:  Accepted => Ready for checkin


Comment:

 Patch updated,
 https://github.com/akaariai/django/compare/django:master...model_save_refactor_squash.
 I removed the "if from same DB" check. Now the patch just always tries
 UPDATE first if the model is from the same DB. It will be easy enough to
 change the behaviour later on if it seems that is needed.

--
Ticket URL: <https://code.djangoproject.com/ticket/16649#comment:10>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

--
You received this message because you are subscribed to the Google Groups "Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
For more options, visit https://groups.google.com/groups/opt_out.


Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [Django] #16649: Models.save() refactoring: check updated rows to determine action

Django
In reply to this post by Django
#16649: Models.save() refactoring: check updated rows to determine action
-------------------------------------+-------------------------------------
     Reporter:  akaariai             |                    Owner:  nobody
         Type:                       |                   Status:  closed
  Cleanup/optimization               |                  Version:  1.3
    Component:  Database layer       |               Resolution:  fixed
  (models, ORM)                      |             Triage Stage:  Ready for
     Severity:  Normal               |  checkin
     Keywords:                       |      Needs documentation:  0
    Has patch:  1                    |  Patch needs improvement:  0
  Needs tests:  0                    |                    UI/UX:  0
Easy pickings:  0                    |
-------------------------------------+-------------------------------------
Changes (by Anssi Kääriäinen <akaariai@…>):

 * status:  new => closed
 * resolution:   => fixed


Comment:

 In [changeset:"6b4834952dcce0db5cbc1534635c00ff8573a6d8"]:
 {{{
 #!CommitTicketReference repository=""
 revision="6b4834952dcce0db5cbc1534635c00ff8573a6d8"
 Fixed #16649 -- Refactored save_base logic

 Model.save() will use UPDATE - if not updated - INSERT instead of
 SELECT - if found UPDATE else INSERT. This should save a query when
 updating, but will cost a little when inserting model with PK set.

 Also fixed #17341 -- made sure .save() commits transactions only after
 the whole model has been saved. This wasn't the case in model
 inheritance situations.

 The save_base implementation was refactored into multiple methods.
 A typical chain for inherited save is:
 save_base()
     _save_parents(self)
         for each parent:
             _save_parents(parent)
             _save_table(parent)
     _save_table(self)
 }}}

--
Ticket URL: <https://code.djangoproject.com/ticket/16649#comment:11>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

--
You received this message because you are subscribed to the Google Groups "Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
For more options, visit https://groups.google.com/groups/opt_out.


Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [Django] #16649: Models.save() refactoring: check updated rows to determine action

Django
In reply to this post by Django
#16649: Models.save() refactoring: check updated rows to determine action
-------------------------------------+-------------------------------------
     Reporter:  akaariai             |                    Owner:  nobody
         Type:                       |                   Status:  closed
  Cleanup/optimization               |                  Version:  1.3
    Component:  Database layer       |               Resolution:  fixed
  (models, ORM)                      |             Triage Stage:  Ready for
     Severity:  Normal               |  checkin
     Keywords:                       |      Needs documentation:  0
    Has patch:  1                    |  Patch needs improvement:  0
  Needs tests:  0                    |                    UI/UX:  0
Easy pickings:  0                    |
-------------------------------------+-------------------------------------

Comment (by akaariai):

 While using update_fields on one of my projects I noticed a possible
 reason to revert the way update count is used in this ticket's committed
 patch. On PostgreSQL, if there is a trigger for the updated table, and
 this trigger returns NULL in some conditions, then the UPDATE statement is
 seen as affecting zero rows. This will result in Django seeing zero rows
 updated, and this again will result in error when using update_fields as
 the update didn't seem to affect any rows.

 For plain .save() in 1.6 the result will be IntegrityError - the update
 doesn't seem to affect any rows, but there is a row matching the update
 condition in the DB, so Django will try to UPDATE, see no rows matched,
 and then try to INSERT -> boom.

 The end result is that code that worked in 1.5 will not work in 1.6 if
 there are ON UPDATE triggers which return NULL. I am not sure about how
 common this condition will be. It will be easy to change the detection in
 save_base() from UPDATE - if not updated - INSERT to SELECT - if found
 UPDATE - else INSERT. But I would like to wait and see if this problem is
 common enough to worry about...

--
Ticket URL: <https://code.djangoproject.com/ticket/16649#comment:12>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

--
You received this message because you are subscribed to the Google Groups "Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
For more options, visit https://groups.google.com/groups/opt_out.


Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [Django] #16649: Models.save() refactoring: check updated rows to determine action

Django
In reply to this post by Django
#16649: Models.save() refactoring: check updated rows to determine action
-------------------------------------+-------------------------------------
     Reporter:  akaariai             |                    Owner:  nobody
         Type:                       |                   Status:  closed
  Cleanup/optimization               |                  Version:  1.3
    Component:  Database layer       |               Resolution:  fixed
  (models, ORM)                      |             Triage Stage:  Ready for
     Severity:  Normal               |  checkin
     Keywords:                       |      Needs documentation:  0
    Has patch:  1                    |  Patch needs improvement:  0
  Needs tests:  0                    |                    UI/UX:  0
Easy pickings:  0                    |
-------------------------------------+-------------------------------------
Changes (by intgr):

 * cc: marti@… (added)


Comment:

 We hit the problem mentioned in [comment:12 comment 12] while trying out
 Django 1.6 beta. We are using the PostgreSQL built-in
 [http://www.postgresql.org/docs/current/static/functions-trigger.html
 suppress_redundant_updates_trigger] trigger on some tables to save I/O
 when saving unchanged rows. With Django 1.6 this now causes
 IntegrityErrors.

 No big deal for us, we can drop the trigger when we migrate to 1.6, but I
 can imagine there are situations where trigger logic is a necessary part
 of the application. Maybe there should be a workaround for those poor
 users, like a per-model option to fall back to the old save behavior?

--
Ticket URL: <https://code.djangoproject.com/ticket/16649#comment:13>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

--
You received this message because you are subscribed to the Google Groups "Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/django-updates/066.1dc813e58b15601be292dc567f289f10%40djangoproject.com.
For more options, visit https://groups.google.com/groups/opt_out.
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [Django] #16649: Models.save() refactoring: check updated rows to determine action

Django
In reply to this post by Django
#16649: Models.save() refactoring: check updated rows to determine action
-------------------------------------+-------------------------------------
     Reporter:  akaariai             |                    Owner:  nobody
         Type:                       |                   Status:  closed
  Cleanup/optimization               |                  Version:  1.3
    Component:  Database layer       |               Resolution:  fixed
  (models, ORM)                      |             Triage Stage:  Ready for
     Severity:  Normal               |  checkin
     Keywords:                       |      Needs documentation:  0
    Has patch:  1                    |  Patch needs improvement:  0
  Needs tests:  0                    |                    UI/UX:  0
Easy pickings:  0                    |
-------------------------------------+-------------------------------------

Comment (by akaariai):

 Maybe something like select_on_save or somesuch _meta option. I'd like to
 keep the default False as trying direct update is the correct thing to do
 most of the time. I will check how hard this is to do.

--
Ticket URL: <https://code.djangoproject.com/ticket/16649#comment:14>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

--
You received this message because you are subscribed to the Google Groups "Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/django-updates/066.344d7e334e3d000a70a13a61d9bb03bc%40djangoproject.com.
For more options, visit https://groups.google.com/groups/opt_out.
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [Django] #16649: Models.save() refactoring: check updated rows to determine action

Django
In reply to this post by Django
#16649: Models.save() refactoring: check updated rows to determine action
-------------------------------------+-------------------------------------
     Reporter:  akaariai             |                    Owner:  nobody
         Type:                       |                   Status:  closed
  Cleanup/optimization               |                  Version:  1.3
    Component:  Database layer       |               Resolution:  fixed
  (models, ORM)                      |             Triage Stage:  Ready for
     Severity:  Normal               |  checkin
     Keywords:                       |      Needs documentation:  0
    Has patch:  1                    |  Patch needs improvement:  0
  Needs tests:  0                    |                    UI/UX:  0
Easy pickings:  0                    |
-------------------------------------+-------------------------------------

Comment (by akaariai):

 I've added a crude patch for "select_on_save" option. Docs missing, maybe
 more tests needed, too. See
 https://github.com/akaariai/django/commit/1ca398b0352b11868ec236f92cf17b6ce82ff88c

--
Ticket URL: <https://code.djangoproject.com/ticket/16649#comment:15>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

--
You received this message because you are subscribed to the Google Groups "Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/django-updates/066.3d4713648d5c19c4c3213a1592ea2a26%40djangoproject.com.
For more options, visit https://groups.google.com/groups/opt_out.
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [Django] #16649: Models.save() refactoring: check updated rows to determine action

Django
In reply to this post by Django
#16649: Models.save() refactoring: check updated rows to determine action
-------------------------------------+-------------------------------------
     Reporter:  akaariai             |                    Owner:  nobody
         Type:                       |                   Status:  closed
  Cleanup/optimization               |                  Version:  1.3
    Component:  Database layer       |               Resolution:  fixed
  (models, ORM)                      |             Triage Stage:  Ready for
     Severity:  Normal               |  checkin
     Keywords:                       |      Needs documentation:  0
    Has patch:  1                    |  Patch needs improvement:  0
  Needs tests:  0                    |                    UI/UX:  0
Easy pickings:  0                    |
-------------------------------------+-------------------------------------

Comment (by akaariai):

 I added a new ticket for the incompatibility issue, see #20988.

--
Ticket URL: <https://code.djangoproject.com/ticket/16649#comment:16>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

--
You received this message because you are subscribed to the Google Groups "Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/django-updates/066.18141157bd3d29828ebdaa747ed8b22b%40djangoproject.com.
For more options, visit https://groups.google.com/groups/opt_out.
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [Django] #16649: Models.save() refactoring: check updated rows to determine action

Django
In reply to this post by Django
#16649: Models.save() refactoring: check updated rows to determine action
-------------------------------------+-------------------------------------
     Reporter:  akaariai             |                    Owner:  nobody
         Type:                       |                   Status:  closed
  Cleanup/optimization               |                  Version:  1.3
    Component:  Database layer       |               Resolution:  fixed
  (models, ORM)                      |             Triage Stage:  Ready for
     Severity:  Normal               |  checkin
     Keywords:                       |      Needs documentation:  0
    Has patch:  1                    |  Patch needs improvement:  0
  Needs tests:  0                    |                    UI/UX:  0
Easy pickings:  0                    |
-------------------------------------+-------------------------------------

Comment (by Anssi Kääriäinen <akaariai@…>):

 In [changeset:"e973ee6a9879969b8ae05bb7ff681172cc5386a5"]:
 {{{
 #!CommitTicketReference repository=""
 revision="e973ee6a9879969b8ae05bb7ff681172cc5386a5"
 Fixed #20988 -- Added model meta option select_on_save

 The option can be used to force pre 1.6 style SELECT on save behaviour.
 This is needed in case the database returns zero updated rows even if
 there is a matching row in the DB. One such case is PostgreSQL update
 trigger that returns NULL.

 Reviewed by Tim Graham.

 Refs #16649
 }}}

--
Ticket URL: <https://code.djangoproject.com/ticket/16649#comment:17>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

--
You received this message because you are subscribed to the Google Groups "Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/django-updates/066.18a02b3676693e63c0ac5797981e805f%40djangoproject.com.
For more options, visit https://groups.google.com/groups/opt_out.
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [Django] #16649: Models.save() refactoring: check updated rows to determine action

Django
In reply to this post by Django
#16649: Models.save() refactoring: check updated rows to determine action
-------------------------------------+-------------------------------------
     Reporter:  akaariai             |                    Owner:  nobody
         Type:                       |                   Status:  closed
  Cleanup/optimization               |                  Version:  1.3
    Component:  Database layer       |               Resolution:  fixed
  (models, ORM)                      |             Triage Stage:  Ready for
     Severity:  Normal               |  checkin
     Keywords:                       |      Needs documentation:  0
    Has patch:  1                    |  Patch needs improvement:  0
  Needs tests:  0                    |                    UI/UX:  0
Easy pickings:  0                    |
-------------------------------------+-------------------------------------

Comment (by Anssi Kääriäinen <akaariai@…>):

 In [changeset:"76e38a21777243fec58c640d617bb71a251c5ba1"]:
 {{{
 #!CommitTicketReference repository=""
 revision="76e38a21777243fec58c640d617bb71a251c5ba1"
 [1.6.x] Fixed #20988 -- Added model meta option select_on_save

 The option can be used to force pre 1.6 style SELECT on save behaviour.
 This is needed in case the database returns zero updated rows even if
 there is a matching row in the DB. One such case is PostgreSQL update
 trigger that returns NULL.

 Reviewed by Tim Graham.

 Refs #16649

 Backport of e973ee6a9879969b8ae05bb7ff681172cc5386a5 from master

 Conflicts:
         django/db/models/options.py
         tests/basic/tests.py
 }}}

--
Ticket URL: <https://code.djangoproject.com/ticket/16649#comment:18>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

--
You received this message because you are subscribed to the Google Groups "Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/django-updates/066.8f5d9c6366c5ede92034258e7f88a1de%40djangoproject.com.
For more options, visit https://groups.google.com/groups/opt_out.
Loading...