Automatic slug generation in ModelAdmin via prepopulated_fields uses a urlify.js file which, among other behaviors, removes certain stop words from the slug. For example, a string like "To be or not to be, that is the question" will generate a slug "be-or-not-be-question", not "to-be-or-not-to-be-that-is-the-question" as one might expect. I’d like to solicit feedback on the idea of removing this logic so that slugs can contain these words.
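The effect described above can be reproduced with a small sketch. This is a simplified stand-in and not Django's actual urlify.js: the helper name `simpleSlug` and its `removeStopWords` flag are hypothetical, and the word list is the one quoted in this post.

```javascript
// Illustrative only: a simplified stand-in, NOT Django's actual urlify.js.
const STOP_WORDS = [
  "a", "an", "as", "at", "before", "but", "by", "for", "from", "is", "in",
  "into", "like", "of", "off", "on", "onto", "per", "since", "than", "the",
  "this", "that", "to", "up", "via", "with",
];

// Hypothetical helper: builds a lowercase, hyphen-separated ASCII slug,
// optionally dropping the stop words first.
function simpleSlug(text, removeStopWords) {
  let s = text;
  if (removeStopWords) {
    const stopWordPattern = new RegExp(
      "\\b(" + STOP_WORDS.join("|") + ")\\b",
      "gi"
    );
    s = s.replace(stopWordPattern, "");
  }
  return s
    .toLowerCase()
    .replace(/[^a-z0-9\s-]/g, "") // drop punctuation
    .trim()
    .replace(/[\s-]+/g, "-"); // collapse whitespace and hyphens
}

const title = "To be or not to be, that is the question";
console.log(simpleSlug(title, true)); // be-or-not-be-question
console.log(simpleSlug(title, false)); // to-be-or-not-to-be-that-is-the-question
```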
For reference, the current list is: a, an, as, at, before, but, by, for, from, is, in, into, like, of, off, on, onto, per, since, than, the, this, that, to, up, via, with.

Django ticket #30538 mentions this behavior as part of a more general comparison between urlify.js and Python slugify. It was closed as wontfix due to reasons of backwards compatibility. Per the triaging guidelines, I’m making this post to solicit feedback on the more specific question of addressing stopword removal in the JS code only -- not to try to address any other differences in behavior between these two methods.

There’s been quite a bit of discussion on generating slugs for non-English languages (for example #2282), and this post is not intended to reopen that discussion.

The current list of stopwords being removed seems to have been the same since at least 2005 (the earliest code I can find including this logic). Some of these words feel a little unexpected, for example “before” and “since”. After 15 years it seems reasonable to revisit the list and consider whether it still makes sense.

Was removal of these words introduced for SEO reasons? If so, is this still a recommended default behavior? In 2020, search engines like Google seem smart enough to interpret them properly. Here's an arbitrary page that discusses this and includes a much longer list of what might be considered stopwords. As another datapoint, the popular WordPress Yoast SEO plugin used to remove stopwords, but stopped doing so a few years back.

Potentially outdated SEO concerns aside, does this behavior still align well with the needs and desires of Django users? Is this something this community would be open to revisiting?

Thanks for your consideration.

(One minor point on language support: allowing these words would help to resolve at least some of the unequal treatment given to English over other languages, for example #12905. See also wagtail#4899, from which much of this post has been copied, for an example of how this logic impacts a Django-based CMS.)

You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group. To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email]. To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/fb6c9596-951d-4102-91b5-b5fd9c8c6340%40googlegroups.com.
I for one am quite surprised to learn the admin has this behaviour. I'm extra surprised it assumes it's in English if only ASCII letters are used. This is quite a naïve assumption 😂 (See what I did in that sentence?)
Seems likely. Personally, for the reasons you've presented I think it would make sense to remove this behaviour. We can probably document how to wrap window.URLify to preserve the old behaviour. On Wed, 8 Apr 2020 at 20:38, Andy Chosak <[hidden email]> wrote:
-- Adam
Thanks, Adam, for your reply. I've opened a ticket at https://code.djangoproject.com/ticket/31511, which includes a link to a PR that makes this change. Any advice on documenting how to wrap window.URLify? Thanks, Andy On Thursday, April 9, 2020 at 1:41:30 PM UTC-4, Adam Johnson wrote:
I agree with Mariusz on the ticket/PR that my answer alone isn't enough impetus to make this change. Hopefully someone more involved in i18n can weigh in.

Although it changes the order of operations, I think this still works to achieve the same behaviour. This snippet can be run at the end of a page to wrap window.URLify.

    (function () {
      const originalURLify = window.URLify;

      function URLify(s, num_chars, allowUnicode) {
        let result = originalURLify(s, num_chars, allowUnicode);
        const hasUnicodeChars = /[^\u0000-\u007f]/.test(s);
        // Remove English words only if the string contains only ASCII
        // (English) characters.
        if (!hasUnicodeChars) {
          const removeList = [
            "a", "an", "as", "at", "before", "but", "by", "for", "from",
            "is", "in", "into", "like", "of", "off", "on", "onto", "per",
            "since", "than", "the", "this", "that", "to", "up", "via", "with"
          ];
          const r = new RegExp('\\b(' + removeList.join('|') + ')\\b', 'gi');
          result = result.replace(r, '');
          // The slug is already hyphenated at this point, so tidy up the
          // doubled, leading and trailing hyphens the removed words leave.
          result = result.replace(/-{2,}/g, '-').replace(/^-+|-+$/g, '');
        }
        return result;
      }

      window.URLify = URLify;
    })();

On Thu, 23 Apr 2020 at 21:21, Andy Chosak <[hidden email]> wrote:
-- Adam
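The snippet above strips the words after slugifying. A variant that keeps the original order of operations (remove the words first, then slugify) might look like the sketch below. The names `withStopWordRemoval` and `fakeURLify` are hypothetical; `fakeURLify` is only a stand-in so the wrapper can be exercised outside a browser and does not reproduce the admin's real URLify.

```javascript
// Sketch only: `withStopWordRemoval` and `fakeURLify` are hypothetical names.

// Wraps any URLify-like function so that the given words are removed from
// ASCII-only input before it is slugified.
function withStopWordRemoval(urlify, words) {
  const pattern = new RegExp("\\b(" + words.join("|") + ")\\b", "gi");
  return function (s, numChars, allowUnicode) {
    const hasNonAscii = /[^\u0000-\u007f]/.test(s);
    const input = hasNonAscii ? s : s.replace(pattern, "");
    return urlify(input, numChars, allowUnicode);
  };
}

// Stand-in slugifier used only for this demonstration.
function fakeURLify(s, numChars) {
  return s
    .toLowerCase()
    .replace(/[^a-z0-9\s-]/g, "") // drop anything that is not slug-safe
    .trim()
    .replace(/[\s-]+/g, "-") // collapse whitespace and hyphens
    .substring(0, numChars);
}

// A shortened word list keeps the example compact.
const wrapped = withStopWordRemoval(fakeURLify, ["to", "that", "is", "the"]);
console.log(wrapped("To be or not to be, that is the question", 50));
// be-or-not-be-question
```

In a page, the same idea would presumably be applied as `window.URLify = withStopWordRemoval(window.URLify, removeList);` with the full word list, run after the admin's scripts have loaded.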
Does anyone else have an opinion on whether or not we should still be removing these stopwords?
I'm not sure if there are any i18n concerns here. In fact, ceasing this practice removes the impetus for the recurring issues being raised about how this practice negatively affects the experience for users of other languages, or doesn't remove words in their language, etc.

Thanks for the suggested code, Adam.

On the topic of deprecation, in general: Andy and I weren't really sure how to approach that for a JavaScript-only change. We can't throw deprecation warnings in the Django console like we could if we were talking about Python code, can we? I could see adding some more aggressive messaging, maybe even in the Admin?
I'm in favor of the change. It seems to me that most slugs I see these days have stop words in them and they read better because of that. I don't think JavaScript warnings would be helpful. A release note is sufficient. Admin users still get a preview of the slug and can edit it if needed.
I very much prefer the slug "to-be-or-not-to-be-that-is-the-question" to "be-or-not-be-question" (which doesn't make sense). אורי On Wed, Apr 8, 2020 at 11:35 PM Andy Chosak <[hidden email]> wrote:
There's a bit more support now, and there have been no opinions against it. Because of this I've reopened the older closed ticket #11157: https://code.djangoproject.com/ticket/11157 . Andy/Scott, I hope you can retarget your PR as per my comment there. Thanks!
Agree, no need for deprecation warnings. This behaviour is in front of users with an easy override. On Sat, 16 May 2020 at 03:04, אורי <[hidden email]> wrote:
-- Adam
Thanks for the additional feedback, folks!
We have opened a fresh PR, rebased on the latest master and referencing #11157, at https://github.com/django/django/pull/12945

Best,
Scott

On Saturday, May 16, 2020 at 5:25:29 AM UTC-4, Adam Johnson wrote:
The PR was merged! Thanks everyone for your input and assistance.
-- On Wednesday, May 20, 2020 at 12:51:56 PM UTC-4, Scott Cranfill wrote: