[Django] #30108: Include FK constraints when adding columns in PostgreSQL

classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|

[Django] #30108: Include FK constraints when adding columns in PostgreSQL

Django
#30108: Include FK constraints when adding columns in PostgreSQL
------------------------------------------------+------------------------
               Reporter:  Dan Tao               |          Owner:  nobody
                   Type:  Cleanup/optimization  |         Status:  new
              Component:  Migrations            |        Version:  master
               Severity:  Normal                |       Keywords:
           Triage Stage:  Unreviewed            |      Has patch:  0
    Needs documentation:  0                     |    Needs tests:  0
Patch needs improvement:  0                     |  Easy pickings:  0
                  UI/UX:  0                     |
------------------------------------------------+------------------------
 On the Bitbucket team, we have been stung my Django's default behavior,
 when adding a nullable `ForeignKey` field to a large, high-traffic table,
 of doing so in two separate statements:

 {{{
 ALTER TABLE [...] ADD COLUMN [...]
 ALTER TABLE [...] ADD CONSTRAINT [...] FOREIGN KEY [...] REFERENCES [...]
 }}}

 The reason this is problematic is that, again for a very large table with
 many millions of rows and significant contention, the statement to add the
 constraint will acquire a lock and perform a full table scan to validate
 that there are no invalid foreign keys in the table.

 In PostgreSQL, if you execute ADD COLUMN with a REFERENCES clause, it can
 bypass the table scan since there cannot possibly be any invalid foreign
 keys in the table.

 We have written a custom operation to do just that whenever we encounter
 this scenario. Without it, we simply would not be able to add columns with
 foreign key constraints to some of our larger tables, other than by
 executing the SQL manually and then faking the migration. It would be nice
 if Django could take the more optimal approach by default, at least for
 the PostgreSQL database backend. (I'm not familiar enough with the other
 database backends to know whether this is also a problem for them.)

--
Ticket URL: <https://code.djangoproject.com/ticket/30108>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

--
You received this message because you are subscribed to the Google Groups "Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/django-updates/047.248b80673ff203564874bf73587e4ad8%40djangoproject.com.
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: [Django] #30108: Include FK constraints when adding columns in PostgreSQL

Django
#30108: Include FK constraints when adding columns in PostgreSQL
-------------------------------------+-------------------------------------
     Reporter:  Dan Tao              |                    Owner:  nobody
         Type:                       |                   Status:  new
  Cleanup/optimization               |
    Component:  Migrations           |                  Version:  master
     Severity:  Normal               |               Resolution:
     Keywords:                       |             Triage Stage:
                                     |  Unreviewed
    Has patch:  0                    |      Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |                    UI/UX:  0
-------------------------------------+-------------------------------------

Comment (by Dan Tao):

 I've taken a look at the code already, and I believe it's ''almost''
 there.

 The `BaseDatabaseSchemaEditor` class includes the attribute
 `sql_create_inline_fk`, `None` by default, which is appended to a column
 definition when creating a table. It is not included when adding a field,
 but as far as I can tell that is simply an oversight.

 Somewhat surprisingly, this attribute is ''only'' defined on the sqlite3
 backend. (Why there was any need to optimize this particular backend, I
 couldn't say.)

 I attempted to make the following changes:

 1. Update the logic in `BaseDatabaseSchemaEditor.add_field` to append
 `sql_create_inline_fk` (if defined) to the column definition.
 2. Add the `sql_create_inline_fk` attribute to the postgresql backend
 class.

 Unfortunately this ended up causing test failures because there are
 [https://github.com/django/django/blob/master/tests/mutually_referential/models.py
 test models] defined with a cyclic dependency. It seems that SQLite
 doesn't mind if you create a table with a foreign key to another table
 that doesn't exist yet, but PostgreSQL does.

 {{{
 psycopg2.ProgrammingError: relation "mutually_referential_child" does not
 exist
 }}}

 I'm not sure what the solution is. Perhaps the code could be refactored so
 that the postgresql backend ''only'' applies the `sql_create_inline_fk`
 optimization in `add_field` (which is when this scenario can occur) and
 not in `create_model` (when it shouldn't matter, because a table scan on
 an empty table is not problematic).

--
Ticket URL: <https://code.djangoproject.com/ticket/30108#comment:1>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

--
You received this message because you are subscribed to the Google Groups "Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/django-updates/062.48c5884b1b0c1bd152f921947f20eaab%40djangoproject.com.
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: [Django] #30108: Include FK constraints when adding columns in PostgreSQL

Django
In reply to this post by Django
#30108: Include FK constraints when adding columns in PostgreSQL
-------------------------------------+-------------------------------------
     Reporter:  Dan Tao              |                    Owner:  nobody
         Type:                       |                   Status:  new
  Cleanup/optimization               |
    Component:  Migrations           |                  Version:  master
     Severity:  Normal               |               Resolution:
     Keywords:                       |             Triage Stage:
                                     |  Unreviewed
    Has patch:  1                    |      Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |                    UI/UX:  0
-------------------------------------+-------------------------------------
Changes (by Dan Tao):

 * cc: Dan Tao (added)
 * has_patch:  0 => 1


Old description:

> On the Bitbucket team, we have been stung my Django's default behavior,
> when adding a nullable `ForeignKey` field to a large, high-traffic table,
> of doing so in two separate statements:
>
> {{{
> ALTER TABLE [...] ADD COLUMN [...]
> ALTER TABLE [...] ADD CONSTRAINT [...] FOREIGN KEY [...] REFERENCES [...]
> }}}
>
> The reason this is problematic is that, again for a very large table with
> many millions of rows and significant contention, the statement to add
> the constraint will acquire a lock and perform a full table scan to
> validate that there are no invalid foreign keys in the table.
>
> In PostgreSQL, if you execute ADD COLUMN with a REFERENCES clause, it can
> bypass the table scan since there cannot possibly be any invalid foreign
> keys in the table.
>
> We have written a custom operation to do just that whenever we encounter
> this scenario. Without it, we simply would not be able to add columns
> with foreign key constraints to some of our larger tables, other than by
> executing the SQL manually and then faking the migration. It would be
> nice if Django could take the more optimal approach by default, at least
> for the PostgreSQL database backend. (I'm not familiar enough with the
> other database backends to know whether this is also a problem for them.)
New description:

 On the Bitbucket team, we have been stung by Django's default behavior,
 when adding a nullable `ForeignKey` field to a large, high-traffic table,
 of doing so in two separate statements:

 {{{
 ALTER TABLE [...] ADD COLUMN [...]
 ALTER TABLE [...] ADD CONSTRAINT [...] FOREIGN KEY [...] REFERENCES [...]
 }}}

 The reason this is problematic is that, again for a very large table with
 many millions of rows and significant contention, the statement to add the
 constraint will acquire a lock and perform a full table scan to validate
 that there are no invalid foreign keys in the table.

 In PostgreSQL, if you execute ADD COLUMN with a REFERENCES clause, it can
 bypass the table scan since there cannot possibly be any invalid foreign
 keys in the table.

 We have written a custom operation to do just that whenever we encounter
 this scenario. Without it, we simply would not be able to add columns with
 foreign key constraints to some of our larger tables, other than by
 executing the SQL manually and then faking the migration. It would be nice
 if Django could take the more optimal approach by default, at least for
 the PostgreSQL database backend. (I'm not familiar enough with the other
 database backends to know whether this is also a problem for them.)

--

--
Ticket URL: <https://code.djangoproject.com/ticket/30108#comment:2>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

--
You received this message because you are subscribed to the Google Groups "Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/django-updates/062.2a7c9caeb12fa1541896c951270628ec%40djangoproject.com.
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: [Django] #30108: Include FK constraints when adding columns in PostgreSQL

Django
In reply to this post by Django
#30108: Include FK constraints when adding columns in PostgreSQL
--------------------------------------+------------------------------------
     Reporter:  Dan Tao               |                    Owner:  nobody
         Type:  Cleanup/optimization  |                   Status:  new
    Component:  Migrations            |                  Version:  master
     Severity:  Normal                |               Resolution:
     Keywords:                        |             Triage Stage:  Accepted
    Has patch:  1                     |      Needs documentation:  0
  Needs tests:  0                     |  Patch needs improvement:  1
Easy pickings:  0                     |                    UI/UX:  0
--------------------------------------+------------------------------------
Changes (by Simon Charette):

 * needs_better_patch:  0 => 1
 * stage:  Unreviewed => Accepted


Comment:

 First, thanks for the very detailed report and the patch.

 For posterity here are similar comments to the ones I left on your Github
 PR.

 > Somewhat surprisingly, this attribute is only defined on the sqlite3
 backend. (Why there was any need to optimize this particular backend, I
 couldn't say.)

 That's because SQLite doesn't support `ALTER TABLE ADD CONSTRAINT`; the
 table has to be rebuilt on column addition so the only way to define
 foreign constraints is by inlining them.

 > I'm not sure what the solution is. Perhaps the code could be refactored
 so that the postgresql backend only applies the sql_create_inline_fk
 optimization in add_field (which is when this scenario can occur) and not
 in create_model (when it shouldn't matter, because a table scan on an
 empty table is not problematic).

 I think it should be done for all backends that support it on field
 addition. As you've said it's not ''that'' useful in table creation cases
 anyway because the table will be empty so the operation will be really
 fast. That would also be really hard to support for circular `CreateModel`
 across apps. To summarize I think it's a great addition for `AddField`
 (and probably `AlterField`?) but it's likely not worth the complexity it
 would incur for `CreateModel`.

 I left a few comments on your PR for improvements, mainly around adding
 support for MySQL and Oracle as well. Please uncheck ''patch needs
 improvement'' once you've addressed them so the patch appears in the
 review queue.

--
Ticket URL: <https://code.djangoproject.com/ticket/30108#comment:3>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

--
You received this message because you are subscribed to the Google Groups "Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/django-updates/062.95c9e7ebe0e1b960f6d19b6f689e46b9%40djangoproject.com.
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: [Django] #30108: Include FK constraints when adding columns in PostgreSQL

Django
In reply to this post by Django
#30108: Include FK constraints when adding columns in PostgreSQL
--------------------------------------+------------------------------------
     Reporter:  Dan Tao               |                    Owner:  nobody
         Type:  Cleanup/optimization  |                   Status:  new
    Component:  Migrations            |                  Version:  master
     Severity:  Normal                |               Resolution:
     Keywords:                        |             Triage Stage:  Accepted
    Has patch:  1                     |      Needs documentation:  0
  Needs tests:  0                     |  Patch needs improvement:  1
Easy pickings:  0                     |                    UI/UX:  0
--------------------------------------+------------------------------------

Comment (by Dan Tao):

 To summarize I think it's a great addition for AddField (and probably
 AlterField?) but it's likely not worth the complexity it would incur for
 CreateModel.

 I think we're agreed on this. The benefit for CreateModel is negligible,
 especially compared to the complexity of the refactor that would be
 necessary to support it.

     mainly around adding support for MySQL and Oracle

 I am happy to do this. My only concern is that I am nowhere near an expert
 in either of these databases (not that I am an expert in PostgreSQL,
 either, but everything's relative!). From a cursory investigation, it
 seems that [https://stackoverflow.com/a/1545264/105570 the MySQL syntax
 for doing this] is a bit different and involves basically combining two
 statements (an `ADD COLUMN` and an `ADD CONSTRAINT`) with a comma. The
 [https://stackoverflow.com/a/45168000/105570 Oracle syntax] seems quite
 similar to the PostgreSQL syntax.

 Both the MySQL and the Oracle versions require that you specify a name for
 the constraint, whereas it appears PostgreSQL handles that automatically.
 Passing this along in the base schema editor will require a bit more
 refactoring, though it looks very doable.

 I'll do my best, and then we'll see how you feel about how things turn
 out. I'm wondering, in the event we run into further complications with
 MySQL and/or Oracle (for example, if you are not happy with the way I
 refactor the code to support them), if you'd be opposed to deferring those
 backends for now so that someone with more expertise in them can
 contribute the necessary changes? Perhaps this will be moot—I'm just
 preparing for the worst :)

 And one more thing: would you prefer that we strive to keep the majority
 of the discussion here on the ticket, or in the pull request? I'm assuming
 we should keep it mostly here aside from comments that are specific to
 details of the code; but if I've got that backwards I'm happy to continue
 the discussion on the PR.

--
Ticket URL: <https://code.djangoproject.com/ticket/30108#comment:4>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

--
You received this message because you are subscribed to the Google Groups "Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/django-updates/062.941089ff4768bf5083aaa5ae09c4d0c0%40djangoproject.com.
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: [Django] #30108: Include FK constraints when adding columns in PostgreSQL

Django
In reply to this post by Django
#30108: Include FK constraints when adding columns in PostgreSQL
--------------------------------------+------------------------------------
     Reporter:  Dan Tao               |                    Owner:  nobody
         Type:  Cleanup/optimization  |                   Status:  new
    Component:  Migrations            |                  Version:  master
     Severity:  Normal                |               Resolution:
     Keywords:                        |             Triage Stage:  Accepted
    Has patch:  1                     |      Needs documentation:  0
  Needs tests:  0                     |  Patch needs improvement:  1
Easy pickings:  0                     |                    UI/UX:  0
--------------------------------------+------------------------------------
Changes (by Simon Charette):

 * cc: Simon Charette (added)


Comment:

 > I'll do my best, and then we'll see how you feel about how things turn
 out. I'm wondering, in the event we run into further complications with
 MySQL and/or Oracle (for example, if you are not happy with the way I
 refactor the code to support them), if you'd be opposed to deferring those
 backends for now so that someone with more expertise in them can
 contribute the necessary changes? Perhaps this will be moot—I'm just
 preparing for the worst :)

 I'm not opposed to that at all. Let's see how complex that is and if it
 doesn't work out it can be revisited later.

 > And one more thing: would you prefer that we strive to keep the majority
 of the discussion here on the ticket, or in the pull request? I'm assuming
 we should keep it mostly here aside from comments that are specific to
 details of the code; but if I've got that backwards I'm happy to continue
 the discussion on the PR.

 I think keeping all of the non-review related discussion here is more
 valuable given the ticket already has a lot of context.

--
Ticket URL: <https://code.djangoproject.com/ticket/30108#comment:5>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

--
You received this message because you are subscribed to the Google Groups "Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/django-updates/062.cf5bddf0eb174d855a5b8ed952b06a27%40djangoproject.com.
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: [Django] #30108: Include FK constraints when adding columns in PostgreSQL

Django
In reply to this post by Django
#30108: Include FK constraints when adding columns in PostgreSQL
--------------------------------------+------------------------------------
     Reporter:  Dan Tao               |                    Owner:  nobody
         Type:  Cleanup/optimization  |                   Status:  new
    Component:  Migrations            |                  Version:  master
     Severity:  Normal                |               Resolution:
     Keywords:                        |             Triage Stage:  Accepted
    Has patch:  1                     |      Needs documentation:  0
  Needs tests:  0                     |  Patch needs improvement:  0
Easy pickings:  0                     |                    UI/UX:  0
--------------------------------------+------------------------------------
Changes (by Dan Tao):

 * needs_better_patch:  1 => 0


--
Ticket URL: <https://code.djangoproject.com/ticket/30108#comment:6>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

--
You received this message because you are subscribed to the Google Groups "Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/django-updates/062.1a8048ffd782de3133be4ea86c30edae%40djangoproject.com.
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: [Django] #30108: Include FK constraints when adding columns in PostgreSQL

Django
In reply to this post by Django
#30108: Include FK constraints when adding columns in PostgreSQL
--------------------------------------+------------------------------------
     Reporter:  Dan Tao               |                    Owner:  nobody
         Type:  Cleanup/optimization  |                   Status:  new
    Component:  Migrations            |                  Version:  master
     Severity:  Normal                |               Resolution:
     Keywords:                        |             Triage Stage:  Accepted
    Has patch:  1                     |      Needs documentation:  0
  Needs tests:  0                     |  Patch needs improvement:  1
Easy pickings:  0                     |                    UI/UX:  0
--------------------------------------+------------------------------------
Changes (by Simon Charette):

 * needs_better_patch:  0 => 1


Comment:

 Left a few comments for improvement but should be RFC once they are
 addressed.

--
Ticket URL: <https://code.djangoproject.com/ticket/30108#comment:7>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

--
You received this message because you are subscribed to the Google Groups "Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/django-updates/062.57997a64af143487b98e202febe50647%40djangoproject.com.
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: [Django] #30108: Include FK constraints when adding columns in PostgreSQL

Django
In reply to this post by Django
#30108: Include FK constraints when adding columns in PostgreSQL
--------------------------------------+------------------------------------
     Reporter:  Dan Tao               |                    Owner:  nobody
         Type:  Cleanup/optimization  |                   Status:  new
    Component:  Migrations            |                  Version:  master
     Severity:  Normal                |               Resolution:
     Keywords:                        |             Triage Stage:  Accepted
    Has patch:  1                     |      Needs documentation:  0
  Needs tests:  0                     |  Patch needs improvement:  0
Easy pickings:  0                     |                    UI/UX:  0
--------------------------------------+------------------------------------
Changes (by Dan Tao):

 * needs_better_patch:  1 => 0


--
Ticket URL: <https://code.djangoproject.com/ticket/30108#comment:8>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

--
You received this message because you are subscribed to the Google Groups "Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/django-updates/062.aa85273585433d6ea2278630bb241f52%40djangoproject.com.
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: [Django] #30108: Include FK constraints when adding columns in PostgreSQL

Django
In reply to this post by Django
#30108: Include FK constraints when adding columns in PostgreSQL
-------------------------------------+-------------------------------------
     Reporter:  Dan Tao              |                    Owner:  nobody
         Type:                       |                   Status:  new
  Cleanup/optimization               |
    Component:  Migrations           |                  Version:  master
     Severity:  Normal               |               Resolution:
     Keywords:                       |             Triage Stage:  Ready for
                                     |  checkin
    Has patch:  1                    |      Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |                    UI/UX:  0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

 * stage:  Accepted => Ready for checkin


Comment:

 RFC except for some cosmetic changes (quote types) that can be handled by
 the committer.

--
Ticket URL: <https://code.djangoproject.com/ticket/30108#comment:9>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

--
You received this message because you are subscribed to the Google Groups "Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/django-updates/062.73d9054d56bfe36d5f79889b8a5d4bf2%40djangoproject.com.
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: [Django] #30108: Include FK constraints when adding columns in PostgreSQL

Django
In reply to this post by Django
#30108: Include FK constraints when adding columns in PostgreSQL
-------------------------------------+-------------------------------------
     Reporter:  Dan Tao              |                    Owner:  nobody
         Type:                       |                   Status:  closed
  Cleanup/optimization               |
    Component:  Migrations           |                  Version:  master
     Severity:  Normal               |               Resolution:  fixed
     Keywords:                       |             Triage Stage:  Ready for
                                     |  checkin
    Has patch:  1                    |      Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |                    UI/UX:  0
-------------------------------------+-------------------------------------
Changes (by Tim Graham <timograham@…>):

 * status:  new => closed
 * resolution:   => fixed


Comment:

 In [changeset:"738faf9da2a5cd03148a36375db80746c99c9623" 738faf9]:
 {{{
 #!CommitTicketReference repository=""
 revision="738faf9da2a5cd03148a36375db80746c99c9623"
 Fixed #30108 -- Allowed adding foreign key constraints in the same
 statement that adds a field.
 }}}

--
Ticket URL: <https://code.djangoproject.com/ticket/30108#comment:10>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

--
You received this message because you are subscribed to the Google Groups "Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
To view this discussion on the web visit https://groups.google.com/d/msgid/django-updates/062.ef8dc845bf26f9ed96ed5fead43ad695%40djangoproject.com.
For more options, visit https://groups.google.com/d/optout.