Adding a bulk_save method to models

Tom Forbes

Hello all,

I’d love some feedback on an idea I’ve been mulling over lately, namely adding a bulk_save method to Django.

A somewhat common pattern in some applications is to loop over a list of model instances, set an attribute on each and call save() on it. Unfortunately this issues at least one database query per instance, which can be a significant slowdown. You can work around this by using ‘.update()’ in some cases, but not all.
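
To make the cost concrete, here is a minimal sketch of the loop-and-save pattern. It uses Python's built-in sqlite3 module rather than the Django ORM, and the table name, column name and helper function are made up for illustration. The point is only that the number of statements grows with the number of rows.

```python
import sqlite3


def update_one_by_one(conn, new_values):
    """Issue one UPDATE per row, mimicking save() called in a loop.

    new_values maps primary key -> new value (hypothetical data).
    Returns the number of UPDATE statements that were sent.
    """
    statements = 0
    for pk, value in new_values.items():
        conn.execute(
            "UPDATE some_model SET some_field = ? WHERE id = ?",
            (value, pk),
        )
        statements += 1
    return statements


# In-memory database with five rows to update.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE some_model (id INTEGER PRIMARY KEY, some_field TEXT)"
)
conn.executemany(
    "INSERT INTO some_model (id, some_field) VALUES (?, ?)",
    [(i, "old") for i in range(1, 6)],
)

new_values = {i: f"value {i}" for i in range(1, 6)}
sent = update_one_by_one(conn, new_values)  # one statement per row
```

With a real database server each of those statements is also a network round trip, which is presumably where most of the time goes.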

It seems it would be possible to use a CASE statement in SQL to handle bulk-updating many rows with differing values. For example:

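from django.db.models import Case, Value, When

# One UPDATE statement that sets a different value per primary key.
SomeModel.objects.filter(id__in=[1, 2]).update(
    some_field=Case(
        When(id=1, then=Value('Field value for ID=1')),
        When(id=2, then=Value('Field value for ID=2')),
    )
)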
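
For readers who want to see the shape of the SQL involved, the following is a rough sketch of a single UPDATE with a CASE expression, again using the standard sqlite3 module. The helper name build_case_update and the table layout are assumptions for illustration; the exact SQL Django generates for the expression above may well differ.

```python
import sqlite3


def build_case_update(table, field, new_values):
    """Build one parameterised UPDATE that sets `field` per primary key.

    new_values maps primary key -> new value. The table and field names
    are interpolated directly, so they must come from trusted code.
    """
    if not new_values:
        raise ValueError("new_values must not be empty")
    whens = " ".join("WHEN ? THEN ?" for _ in new_values)
    placeholders = ", ".join("?" for _ in new_values)
    sql = (
        f"UPDATE {table} SET {field} = CASE id {whens} END "
        f"WHERE id IN ({placeholders})"
    )
    params = []
    for pk, value in new_values.items():
        params.extend([pk, value])      # parameters for the WHEN/THEN pairs
    params.extend(new_values.keys())    # parameters for the IN (...) list
    return sql, params


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE some_model (id INTEGER PRIMARY KEY, some_field TEXT)"
)
conn.executemany(
    "INSERT INTO some_model (id, some_field) VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "c")],
)

sql, params = build_case_update(
    "some_model",
    "some_field",
    {1: "Field value for ID=1", 2: "Field value for ID=2"},
)
cursor = conn.execute(sql, params)  # a single statement for both rows
rows = conn.execute(
    "SELECT id, some_field FROM some_model ORDER BY id"
).fetchall()
```

The WHERE clause matters: without it, rows not listed in the CASE expression would have the field set to NULL.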

I’ve made a ticket for this here: https://code.djangoproject.com/ticket/29037

I managed to get a 70x performance increase using this technique on a fairly large table, and it seems it could be applicable to many projects just like bulk_create.

The downside is that it can produce very large SQL statements when updating many rows (I had MySQL complain about a 10MB statement once), but this can be overcome with batching and other optimisations (e.g. rows that share the same value can use WHEN id IN (x, y, z) rather than three individual WHEN clauses).
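
The two mitigations mentioned above can be sketched in plain Python. This is only an illustration under assumed names (group_ids_by_value, batched and build_grouped_case are hypothetical helpers, not part of Django), and it assumes the new values are hashable.

```python
from collections import defaultdict
from itertools import islice


def group_ids_by_value(new_values):
    """Invert {pk: value} into {value: [pk, ...]} so equal values share a WHEN."""
    grouped = defaultdict(list)
    for pk, value in new_values.items():
        grouped[value].append(pk)
    return dict(grouped)


def batched(items, batch_size):
    """Yield successive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def build_grouped_case(new_values):
    """Return (sql_fragment, params) using WHEN id IN (...) per distinct value."""
    clauses, params = [], []
    for value, pks in group_ids_by_value(new_values).items():
        placeholders = ", ".join("?" for _ in pks)
        clauses.append(f"WHEN id IN ({placeholders}) THEN ?")
        params.extend(pks)
        params.append(value)
    return "CASE " + " ".join(clauses) + " END", params


new_values = {1: "a", 2: "b", 3: "a", 4: "a", 5: "b"}
fragment, params = build_grouped_case(new_values)  # 2 WHEN clauses, not 5
batches = list(batched(sorted(new_values), 2))     # primary keys in chunks of 2
```

Grouping shrinks each statement when many rows share a value, while batching caps the statement size regardless of the data, so the two are complementary.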

I’m imagining an API very similar to bulk_create, but before I spend any time on a patch I thought I would ask if anyone has any feedback on this suggestion. Would this be a good addition to Django?



--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/CAFNZOJOmmRDzZv9jMDxnp3-Wp%3Dg5F1dR_Gga3d51kARKGbrrzQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Re: Adding a bulk_save method to models

neal
Hi Tom,

A built-in bulk save that's more flexible than update would certainly be nice. Just in case you haven't come across it though, there is a package called django-bulk-update:

https://github.com/aykut/django-bulk-update

I've found it very useful on a number of occasions where update isn't quite enough but the loop-edit-save pattern is too slow to be convenient.

Probably some useful things in there when considering the API and approach.

Cheers, Neal 

On Friday, January 19, 2018 at 5:49:48 PM UTC, Tom Forbes wrote:

Hello all,

I’d love some feedback on an idea I’ve been mulling over lately, namely adding a bulk_save method to Django.

A somewhat common pattern in some applications is to loop over a list of model instances, set an attribute on each and call save() on it. Unfortunately this issues at least one database query per instance, which can be a significant slowdown. You can work around this by using ‘.update()’ in some cases, but not all.

It seems it would be possible to use a CASE statement in SQL to handle bulk-updating many rows with differing values. For example:

SomeModel.objects.filter(id__in=[1,2]).update(
    some_field=Case(
        When(id=1, then=Value('Field value for ID=1')),
        When(id=2, then=Value('Field value for ID=2'))
    )
)

I’ve made a ticket for this here: https://code.djangoproject.com/ticket/29037

I managed to get a 70x performance increase using this technique on a fairly large table, and it seems it could be applicable to many projects just like bulk_create.

The downside is that it can produce very large SQL statements when updating many rows (I had MySQL complain about a 10MB statement once), but this can be overcome with batching and other optimisations (e.g. rows that share the same value can use WHEN id IN (x, y, z) rather than three individual WHEN clauses).

I’m imagining an API very similar to bulk_create, but before I spend any time on a patch I thought I would ask if anyone has any feedback on this suggestion. Would this be a good addition to Django?




Re: Adding a bulk_save method to models

Tom Forbes

Hey Neal,

Thank you very much for pointing that out. I actually found out about this package while researching the ticket; I wish I had known about it a couple of years ago, as it would have saved me a fair bit of CPU and brain time!

I think that module is a good starting point and proves that it’s possible; however, I think the implementation can be improved upon if we bring it inside core. I worked on a small PR to add this <https://github.com/django/django/pull/9606/files#diff-5b0dda5eb9a242c15879dc9cd2121379R473> and the implementation was refreshingly simple. It still needs docs, a couple more tests and a fix for a strange error with sqlite on Windows, but overall it seems like a lot of gain for a small amount of code.

Tom



On 22 January 2018 at 15:10:53, Neal Todd ([hidden email]) wrote:

Hi Tom,

A built-in bulk save that's more flexible than update would certainly be nice. Just in case you haven't come across it though, there is a package called django-bulk-update:

https://github.com/aykut/django-bulk-update

I've found it very useful on a number of occasions where update isn't quite enough but the loop-edit-save pattern is too slow to be convenient.

Probably some useful things in there when considering the API and approach.

Cheers, Neal 


Re: Adding a bulk_save method to models

neal
Hi Tom,

That's great, should be a helpful addition to core. Will follow the ticket and PR.

Neal

(Apologies - I hadn't spotted that you'd already referenced django-bulk-update in your ticket when I left my drive-by comment!)

On Monday, January 22, 2018 at 7:41:11 PM UTC, Tom Forbes wrote:

Hey Neal,

Thank you very much for pointing that out. I actually found out about this package while researching the ticket; I wish I had known about it a couple of years ago, as it would have saved me a fair bit of CPU and brain time!

I think that module is a good starting point and proves that it’s possible; however, I think the implementation can be improved upon if we bring it inside core. I worked on a small PR to add this <https://github.com/django/django/pull/9606/files#diff-5b0dda5eb9a242c15879dc9cd2121379R473> and the implementation was refreshingly simple. It still needs docs, a couple more tests and a fix for a strange error with sqlite on Windows, but overall it seems like a lot of gain for a small amount of code.

Tom




Re: Adding a bulk_save method to models

Tim Graham

I wanted to ask about naming of the new method. Currently the proposed name is "QuerySet.bulk_save()" but I think it's a bit confusing since it uses QuerySet.update(), not Model.save(). It works similarly to QuerySet.bulk_update() from https://github.com/aykut/django-bulk-update but the arguments are a bit different.


Josh's comment on the PR: "Since this only works for instances with a pk, do you think that bulk_update would be a better name? The regular save() method can either create or update depending on pk status which may confuse users here."

And Tom's reply: "I considered this, but queryset.update() is the best 'bulk update' method. I didn't want to confuse the two, this is more about saving multiple model fields with multiple differing values, hence bulk_save. Open to changing it though."


On Tuesday, January 23, 2018 at 7:38:18 AM UTC-5, Neal Todd wrote:
Hi Tom,

That's great, should be a helpful addition to core. Will follow the ticket and PR.

Neal

(Apologies - I hadn't spotted that you'd already referenced django-bulk-update in your ticket when I left my drive-by comment!)


Re: Adding a bulk_save method to models

Raphael Michel
Hi,

I'd be very careful about calling it bulk_save(), since a name
containing save() very strongly suggests that it sends pre_save
and post_save signals.

Best
Raphael


On Fri, 14 Sep 2018 07:56:38 -0700 (PDT), Tim Graham <[hidden email]> wrote:

>  
>
> I wanted to ask about naming of the new method. Currently the
> proposed name is "QuerySet.bulk_save()" but I think it's a bit
> confusing since it uses QuerySet.update(), not Model.save(). It works
> similarly to QuerySet.bulk_update() from
> https://github.com/aykut/django-bulk-update but the arguments are a
> bit different.
>
>
> Josh's comment on the PR: "Since this only works for instances with
> a pk, do you think that bulk_update would be a better name? The
> regular save() method can either create or update depending on pk
> status which may confuse users here."
>
> And Tom's reply: "I considered this, but queryset.update() is the
> best 'bulk update' method. I didn't want to confuse the two, this is
> more about saving multiple model fields with multiple differing
> values, hence bulk_save. Open to changing it though."

Re: Adding a bulk_save method to models

Tom Forbes

My original reasoning was that QuerySet.update() already updates rows in bulk, so the bulk prefix seems a bit redundant there (how do you bulk something that already does something in bulk?). .save(), however, operates on a single object, so the bulk prefix seems more appropriate and easier to understand.

I agree bulk_save() maybe is not the best name as people might expect signals to be sent, but are there any suggestions other than bulk_update()? Maybe something more accurate, like bulk_update_fields()? Or bulk_save_fields()?
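
To make the distinction concrete, here is a minimal sketch at the SQL level, using Python's standard sqlite3 module rather than Django. The table name some_model and its columns are made up for illustration and are not taken from the PR. The first statement behaves like update(): one query, the same value for every matched row. The second is the CASE approach from the original proposal: still one query, but each row gets its own value.

```python
import sqlite3

# Throwaway in-memory table; "some_model" and its columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_model (id INTEGER PRIMARY KEY, some_field TEXT)")
conn.executemany(
    "INSERT INTO some_model (id, some_field) VALUES (?, ?)",
    [(1, "a"), (2, "b"), (3, "c")],
)

# update()-style: one statement, the same value for every matched row.
conn.execute("UPDATE some_model SET some_field = ? WHERE id IN (1, 2)", ("same",))
uniform = conn.execute("SELECT id, some_field FROM some_model ORDER BY id").fetchall()

# Per-row values, still in a single statement, using CASE.
new_values = {1: "Field value for ID=1", 2: "Field value for ID=2"}
whens = " ".join("WHEN ? THEN ?" for _ in new_values)
placeholders = ", ".join("?" for _ in new_values)
sql = (
    f"UPDATE some_model SET some_field = CASE id {whens} END "
    f"WHERE id IN ({placeholders})"
)
# Parameters follow placeholder order: (id, value) pairs first, then the ids.
params = [p for pk, value in new_values.items() for p in (pk, value)]
params += list(new_values)
conn.execute(sql, params)
per_row = conn.execute("SELECT id, some_field FROM some_model ORDER BY id").fetchall()
```

Row 3 is untouched in both cases because the WHERE clause restricts the update to the listed ids. Without that restriction the CASE expression would yield NULL for unmatched rows.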




On 14 September 2018 at 16:32:05, Raphael Michel ([hidden email]) wrote:

Hi,

I'd be very careful about calling it bulk_save(), since calling
it something with save() very strongly suggests that it calls pre_save
or post_save signals.

Best
Raphael



Re: Adding a bulk_save method to models

Adam Johnson-2
Bikeshed time.

I'm also against bulk_save, for the same reason: it implies save().

bulk_update sounds okay to me. update() is indeed already a 'bulk' operation, but it could be claimed this is doing a 'bulk' amount of update operations.

bulk_update_fields also sounds good; the longer method name is probably balanced by the lower frequency of use.


--
Adam


Re: Adding a bulk_save method to models

charettes
I also dislike bulk_save() for the same reasons.

I feel like bulk_update makes the most sense, given it has a signature similar to bulk_create, where an iterable of model instances must be passed, and we're really just performing an update.

To me, bulk_update is to update what bulk_create is to create; bulk_update_fields feels too verbose and breaks the symmetry of bulk_create/create for update.
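
As a rough illustration of the "iterable of instances" signature, here is a sketch in plain Python with sqlite3. It is not the code from the PR: the function name bulk_update_sketch, the Player class, and the batch_size parameter are all assumptions made for the example. It takes objects that already have a primary key, writes the named fields back with one CASE-based UPDATE per batch, and returns how many statements it issued.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Player:
    # Hypothetical stand-in for a model instance that already has a pk.
    id: int
    name: str
    score: int


def bulk_update_sketch(conn, table, objs, fields, batch_size=None):
    """Write `fields` of each object back using one UPDATE per batch.

    `table` and `fields` are interpolated into the SQL, so they must be
    trusted identifiers; only the values are passed as parameters.
    """
    objs = list(objs)
    if not objs:
        return 0
    size = batch_size or len(objs)
    statements = 0
    for start in range(0, len(objs), size):
        batch = objs[start:start + size]
        set_clauses, params = [], []
        for field in fields:
            whens = " ".join("WHEN ? THEN ?" for _ in batch)
            set_clauses.append(f"{field} = CASE id {whens} END")
            for obj in batch:
                params.extend([obj.id, getattr(obj, field)])
        marks = ", ".join("?" for _ in batch)
        params.extend(obj.id for obj in batch)
        conn.execute(
            f"UPDATE {table} SET {', '.join(set_clauses)} WHERE id IN ({marks})",
            params,
        )
        statements += 1
    return statements


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE player (id INTEGER PRIMARY KEY, name TEXT, score INTEGER)")
players = [Player(1, "ann", 0), Player(2, "bob", 0), Player(3, "cy", 0)]
conn.executemany(
    "INSERT INTO player (id, name, score) VALUES (?, ?, ?)",
    [(p.id, p.name, p.score) for p in players],
)

# Change the objects in memory, then write them back in batches of two.
for p in players:
    p.score = p.id * 10
statements = bulk_update_sketch(conn, "player", players, ["score"], batch_size=2)
rows = conn.execute("SELECT id, name, score FROM player ORDER BY id").fetchall()
```

Batching keeps each statement small, which addresses the statement-size concern raised in the original proposal. Nothing here calls a per-object save hook, which is the behaviour the naming debate is about.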

Cheers,
Simon


Re: Adding a bulk_save method to models

Tobias Kunze-2
In reply to this post by Adam Johnson-2
On 18-09-15 23:15:10, Adam Johnson wrote:
>> I agree bulk_save() maybe is not the best name as people might expect
>> signals to be sent, but are there any suggestions other than bulk_update()?
>> Maybe something more accurate, like bulk_update_fields()? Or
>> bulk_save_fields()?
>
>bulk_update_fields also sounds good, the longer method name is probably
>balanced by the lower frequency of use.

bulk_update_fields() sounds fine to me, as it makes clearer what
happens. With bulk_update() alone, I'd expect it to be the exact
analogue of update(), since we're already used to that pattern
from create() vs bulk_create().

update_fields() alone may also work. Upside: it's shorter. Downside:
it's not immediately clear that it takes an iterable and not an
instance. I'd be happy with either option.

Best regards,
Tobias


Re: Adding a bulk_save method to models

Tom Forbes

Thank you all for the feedback. I’ve renamed the method to bulk_update(), as this seems to be the most popular option. Naming things is hard, and while bulk_update() isn’t perfect, I think it’s a bit better than bulk_update_fields() or just update_fields().
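
For anyone skimming the naming debate, it may help to see roughly what such a method does differently from update(): it writes a different value to each row in one statement, using the CASE approach from the start of this thread. The following is only a minimal sketch using the standard-library sqlite3 module, not the patch itself. The helper name bulk_update_rows, the table some_model and the column some_field are all hypothetical.

```python
import sqlite3


def bulk_update_rows(conn, table, field, values_by_id):
    """Set `field` to a different value per row with one UPDATE ... CASE statement.

    Hypothetical helper for illustration only. `table` and `field` are
    interpolated into the SQL, so they must be trusted identifiers; the ids
    and values are passed as bound parameters.
    """
    if not values_by_id:
        return 0
    ids = list(values_by_id)
    # One "WHEN <id> THEN <value>" arm per row being updated.
    whens = " ".join("WHEN ? THEN ?" for _ in ids)
    placeholders = ", ".join("?" for _ in ids)
    sql = (
        f"UPDATE {table} SET {field} = CASE id {whens} END "
        f"WHERE id IN ({placeholders})"
    )
    params = []
    for pk in ids:
        params.extend([pk, values_by_id[pk]])
    # The WHERE clause restricts the update to the listed ids, so rows
    # without a matching WHEN arm are left untouched.
    params.extend(ids)
    cursor = conn.execute(sql, params)
    return cursor.rowcount


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_model (id INTEGER PRIMARY KEY, some_field TEXT)")
conn.executemany(
    "INSERT INTO some_model (id, some_field) VALUES (?, ?)",
    [(1, "old"), (2, "old"), (3, "old")],
)

updated = bulk_update_rows(
    conn, "some_model", "some_field", {1: "value for 1", 2: "value for 2"}
)
rows = conn.execute("SELECT id, some_field FROM some_model ORDER BY id").fetchall()
```

Unlike update(), which applies one expression to every matched row, this takes per-row values, which is the distinction the longer names were trying to signal. A real implementation would also need batching to keep statement sizes reasonable.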




