Review of DEP 201 - simplified routing syntax

Aymeric Augustin
Hello,

I reviewed the current version of DEP 201 as well as related discussions. I took notes and wrote down arguments along the way. I'm sharing them below. It may be useful to add some of these arguments to the DEP.

Sjoerd, Tom, I didn't want to edit your DEP directly, but if you agree with the items marked [Update DEP] below I can prepare a PR. I will now take a look at the pull requests implementing this DEP.


Should it live as a third-party package first?

The original motivation for this DEP was to make Django easier to use by people who aren't familiar with regexes.

While regexes are a powerful tool, notably for shell scripting, I find it counter-productive to make them a prerequisite for building a Django website. You can build a very nice and useful website with models, forms, templates, and the admin, without ever needing regexes — except, until now, for the URLconf!

Since we aren't going to say in the official tutorial "hey install this third-party package to manage your URLs", that goal can only be met by building the new system into Django.

Besides, I suspect many professional Django developers copy-paste regexes without a deep understanding of how they work. For example, I'd be surprised if everyone knew why it's wrong to match a numerical id in a URL with \d+ (answer at the bottom of this email).

Not only is the new system easier for beginners, but I think it'll also be adopted by experienced developers to reduce the risk of mistakes in URLpatterns, which are an inevitable downside of their syntax. Django can solve problems like avoiding \d+ for everyone.

Anecdote: I keep making hard-to-detect errors in URL regexes. The only URL regexes I wrote that can't be replicated with the new system are very dubious and could easily be replaced with a more explicit `if some_condition(request.path): raise Http404` in the corresponding view. I will be happy to convert my projects to the new system.

No progress was made in this area since 2008 because URL regexes are a minor annoyance. After you write them, you never see them again. I think that explains why no popular alternative emerged until now.

Since there's a significant amount of prior art in other projects, a strong consensus on DEP 201 being a good approach, and a fairly narrow scope, it seems reasonable to design the new system directly into Django.


What alternatives would be possible?

I have some sympathy with the arguments for a pluggable URL resolver system, similar to what I did for templates in Django 1.8. However I don't see this happening any time soon because there's too little motivation to justify the effort. As I explained above, developers tend to live with whatever the framework provides.

Of course, if someone wants to write a fully pluggable URL resolver, that's great! But there's no momentum besides saying that "it should be done that way". Furthermore, adding the new system shouldn't make it more difficult to move to a fully pluggable system. If anything, it will clean up the internals and prepare further work in the area. Some changes of this kind were already committed.

DEP 201 is mostly independent from the problem of allowing multiple views to match the same URL — that is, to resume resolving URL patterns if a view applies some logic and decides it can't handle a URL. This is perhaps the biggest complaint about the current design of the URL resolver. Solutions include hacking around the current design or changing it fundamentally.

This proposal doesn't affect the possibility of autogenerating URL structures, like DRF's routers do.

I'm aware of one realistic proposal for a more elaborate and perhaps cleaner system: Marten Kenbeek's dispatcher API refactor at https://github.com/knbk/django/tree/dispatcher_api. I can't say if the implementation of DEP 201 will break that effort. Anyway, we can't wait forever on a branch that has no schedule for completion.

To sum up, I don't see any bigger improvement that is likely to be implemented in the short run and would justify delaying DEP 201.


Which types should be supported?

Integers, slugs, UUIDs and arbitrary strings will cover the vast majority of cases. I haven't seen any other examples in all discussions.

I don't think it's reasonable to include floats in URLs. Reversing a URL could lead to atrocities such as /foo/0.300000000000000004/bar/. This doesn't look good.

I don't think the use case is sufficiently common to add support for Decimal, which wouldn't suffer from that issue.
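As a small standalone illustration of the float concern (a sketch, not part of the DEP): a float produced by ordinary arithmetic does not print as the short decimal value one would expect in a URL, whereas Decimal keeps the digits it was built from. The `/foo/.../bar/` layout is taken from the example above; the variable names are only for illustration.

```python
from decimal import Decimal

# A float computed by ordinary arithmetic rarely prints as the "nice" value.
float_segment = str(0.1 + 0.2)  # not "0.3"

# Decimal keeps exactly the digits it was constructed from.
decimal_segment = str(Decimal("0.1") + Decimal("0.2"))  # "0.3"

# What reversing a URL would produce in each case.
print("/foo/%s/bar/" % float_segment)
print("/foo/%s/bar/" % decimal_segment)
```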

So I would remove float and add slug. [Update DEP]

I'm proposing the following regexes:

path = .+
string = [^/]+
int = [0-9]+  # negative integers are uncommon, I think they should be left to a custom converter
slug = [-a-zA-Z0-9_]+  # see django.core.validators.slug_re; we must stick to the current definition.
uuid = [0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}  # hyphens are mandatory to avoid multiple URLs resolving to the same view
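The proposed regexes can be tried out with nothing but Python's `re` module. This is a minimal sketch: the dictionary and the `matches` helper are hypothetical names, and the slug regex is written with a trailing `+` on the assumption that it should accept one or more characters, as `django.core.validators.slug_re` does.

```python
import re

# Converter name -> regex, following the list above.
# Assumption: slug matches one or more characters (trailing "+").
CONVERTER_REGEXES = {
    "path": r".+",
    "string": r"[^/]+",
    "int": r"[0-9]+",
    "slug": r"[-a-zA-Z0-9_]+",
    "uuid": r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
}


def matches(kind, segment):
    """Return True if the whole segment is accepted by the named regex."""
    return re.fullmatch(CONVERTER_REGEXES[kind], segment) is not None


print(matches("int", "42"))       # accepted
print(matches("int", "-42"))      # rejected: negative integers need a custom converter
print(matches("string", "a/b"))   # rejected: a string stops at the first slash
print(matches("path", "a/b"))     # accepted: a path may contain slashes
```

With these definitions a UUID without hyphens is rejected, so only one spelling of a given UUID resolves.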


How to handle the transition to the new API?

The plan to move imports to django.urls to facilitate the transition is a nice hack. Who doesn't like a nice hack? ;-)

It would be nice to specify a target for starting the deprecation of django.conf.urls.url. Deprecate in 3.1, remove in 4.1? [Update DEP]

Bikeshedding: I'm somewhat disturbed by the choice of path_regex instead of regex_path, because the thing is primarily a path, which happens to be implemented with a regex. I would like to suggest re_path, which reuses the name of the re module in Python to make a shorter name. [Update DEP]


How to manage leading and trailing slashes?

The second discussion in the mailing list determined that keeping the current system was fine: all URLs are relative (not starting with a slash) and the root URLconf is implicitly relative to /.

This is inconsistent with Flask, where URLs start with a leading slash, but Flask doesn't have relative URLs or includes.


Other bugs in the DEP

The DEP currently says: "Failure to perform a type conversion against a captured string should result in an Http404 exception being raised."

I believe this should say "result in the pattern not matching", so that later patterns still get a chance to match. [Update DEP]
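The difference between the two behaviours can be sketched with a toy resolver. Everything here is hypothetical (the `PATTERNS` list, the converter functions, the view names); it only shows that treating a failed conversion as "no match" lets a later pattern handle the URL, whereas raising a 404 straight away would not.

```python
import re


def to_int(text):
    return int(text)


def to_small_int(text):
    # Hypothetical converter: rejects values above 999 by raising ValueError.
    number = int(text)
    if number > 999:
        raise ValueError("too large")
    return number


# (regex, converter, view name), tried in order; the first success wins.
PATTERNS = [
    (re.compile(r"items/(?P<value>[0-9]+)/"), to_small_int, "small_item"),
    (re.compile(r"items/(?P<value>[0-9]+)/"), to_int, "any_item"),
]


def resolve(path):
    for regex, converter, view_name in PATTERNS:
        match = regex.fullmatch(path)
        if match is None:
            continue
        try:
            value = converter(match.group("value"))
        except ValueError:
            # Conversion failed: treat it as "no match" and keep looking.
            continue
        return view_name, value
    return None  # the caller would turn this into a 404


print(resolve("items/7/"))
print(resolve("items/5000/"))
```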

I'm not a fan of the CUSTOM_URL_CONVERTERS setting. I think it would be sufficient to document a `register_converter` function that users can call in their URLconf module — so the converter is registered before the URLconf is loaded. [Update DEP]
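A registration function of this kind could look roughly like the sketch below. The signature of `register_converter`, the `_CONVERTERS` registry, the `compile_route` helper and the `<type:name>` route syntax as parsed here are all assumptions made for illustration, not an existing Django API. The custom `signed_int` converter picks up the earlier remark that negative integers should be left to a custom converter.

```python
import re

# Hypothetical registry: converter name -> (regex, conversion function).
_CONVERTERS = {}


def register_converter(name, regex, to_python):
    _CONVERTERS[name] = (regex, to_python)


# A built-in style converter.
register_converter("int", r"[0-9]+", int)
# A custom converter a project could register at the top of its URLconf.
register_converter("signed_int", r"-?[0-9]+", int)


def compile_route(route):
    """Turn 'temp/<signed_int:degrees>/' into a compiled regex plus converters."""
    parts, converters, position = [], {}, 0
    for match in re.finditer(r"<([a-z_]+):([a-z_]+)>", route):
        kind, name = match.groups()
        regex, to_python = _CONVERTERS[kind]  # KeyError if not registered yet
        converters[name] = to_python
        parts.append(re.escape(route[position:match.start()]))
        parts.append("(?P<%s>%s)" % (name, regex))
        position = match.end()
    parts.append(re.escape(route[position:]))
    return re.compile("".join(parts)), converters


regex, converters = compile_route("temp/<signed_int:degrees>/")
match = regex.fullmatch("temp/-12/")
degrees = converters["degrees"](match.group("degrees"))
print(degrees)
```

Because the lookup in `_CONVERTERS` happens when the route is compiled, the converter has to be registered before the patterns using it are defined, which is what calling the function at the top of the URLconf module achieves.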

I think the section about "Preventing unintended errors" should be removed. If it turns out to be a real problem, then we can reconsider and add a check. [Update DEP]


Best regards,

-- 
Aymeric.


Answer to "why it's wrong to match an id in a URL with \d+"

Django's URL resolver matches URLs as str (Unicode strings), not bytes, after percent and UTF-8 decoding. As a consequence \d matches any character in Unicode character category Nd, for example, ١ which is 1 in Arabic (unless you also specified the ASCII flag on the regex).

Interestingly Python's int() function will still do the right thing: int('١') == 1. But that means you have multiple URLs resolving to the same page, which can be bad for SEO. In general it isn't a good practice to have unexpected URLs resolving by accident to an unintended view.

For this reason, I'm always using the more verbose [0-9] instead of \d in my URLconfs.

The Django admin suffers from this bug :-) See it for yourself at /admin/sites/site/١/.
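The behaviour described in this answer can be checked with the standard library alone. A short sketch, using the Arabic-Indic digit one (U+0661) mentioned above; the variable names are illustrative:

```python
import re

arabic_one = "\u0661"  # ARABIC-INDIC DIGIT ONE

# On str patterns, \d matches any character in Unicode category Nd.
unicode_match = re.fullmatch(r"\d+", arabic_one) is not None

# With the ASCII flag, \d is restricted to 0-9.
ascii_flag_match = re.fullmatch(r"\d+", arabic_one, re.ASCII) is not None

# An explicit character class is restricted to 0-9 regardless of flags.
explicit_match = re.fullmatch(r"[0-9]+", arabic_one) is not None

# int() accepts the Unicode digit, so the view would receive the same id.
converted = int(arabic_one)

print(unicode_match, ascii_flag_match, explicit_match, converted)
```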


--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/ABE47429-FE9E-42FA-BA0E-0BE5B8597CD0%40polytechnique.org.
For more options, visit https://groups.google.com/d/optout.

Re: Review of DEP 201 - simplified routing syntax

Aymeric Augustin
After getting approval from Tom on IRC, I updated the DEP according to my previous email: https://github.com/django/deps/pull/41

The next steps are:

- account for any remaining feedback (I feel I made only minor changes compared to the last version, hopefully we can wrap this up quickly now)
- get approval from the technical board
- complete the implementation!

-- 
Aymeric.




Re: Review of DEP 201 - simplified routing syntax

Florian Apolloner
In reply to this post by Aymeric Augustin


On Friday, May 12, 2017 at 12:33:28 PM UTC+2, Aymeric Augustin wrote:
Django's URL resolver matches URLs as str (Unicode strings), not bytes, after percent and UTF-8 decoding. As a consequence \d matches any character in Unicode character category Nd (http://www.fileformat.info/info/unicode/category/Nd/list.htm), for example, ١ which is 1 in Arabic (unless you also specified the ASCII flag on the regex)

Ha, I was thinking that you might get somewhere along the lines of this. That only became an issue with Python 3, right? Before that, regexes defaulted to re.ASCII.
 


Re: Review of DEP 201 - simplified routing syntax

knbk
On Friday, May 12, 2017 at 6:43:34 PM UTC+2, Florian Apolloner wrote:


On Friday, May 12, 2017 at 12:33:28 PM UTC+2, Aymeric Augustin wrote:
Django's URL resolver matches URLs as str (Unicode strings), not bytes, after percent and UTF-8 decoding. As a consequence \d matches any character in Unicode character category Nd (http://www.fileformat.info/info/unicode/category/Nd/list.htm), for example, ١ which is 1 in Arabic (unless you also specified the ASCII flag on the regex)

Ha, I was thinking that you might get somewhere along the lines of this. That only became an issue with Python 3, right? Before that, regexes defaulted to re.ASCII.
 

That's not quite right. Django has actually been using the `re.UNICODE` flag since at least 1.0, so you'd have the same problem on Python 2. 


Re: Review of DEP 201 - simplified routing syntax

Aymeric Augustin
On 12 May 2017, at 19:05, Marten Kenbeek <[hidden email]> wrote:

That's not quite right. Django has actually been using the `re.UNICODE` flag since at least 1.0, so you'd have the same problem on Python 2. 


Since it was hardly noticed in a decade, it can't be too much of a problem in practice...

-- 
Aymeric.


Re: Review of DEP 201 - simplified routing syntax

knbk
In reply to this post by Aymeric Augustin
The regex in `url()` can be translated using `ugettext_lazy()`, in which case the lazy translation happens when `resolve()` or `reverse()` is called. It seems the current `path()` implementation would force evaluation when the URLs are first loaded, so `resolve()` and `reverse()` would use a fixed language (whichever was active when they were loaded) rather than allowing lazy translations.

I'm not sure how often this feature is used, but it seems like something `path()` should support out of the box. 
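The difference between eager and lazy evaluation can be sketched without Django. All names below (`TRANSLATIONS`, `active_language`, `LazyRoute`, `EagerPattern`, `LazyPattern`) are hypothetical stand-ins for the translation machinery and for the two ways a pattern could treat a lazily translated route.

```python
import re

# Hypothetical stand-ins for a translation catalog and the active language.
TRANSLATIONS = {"en": "about/", "nl": "over/"}
active_language = "en"


class LazyRoute:
    """Looks up the translated route only when converted to str."""

    def __str__(self):
        return TRANSLATIONS[active_language]


class EagerPattern:
    def __init__(self, route):
        # Evaluated once, when the URLconf is loaded.
        self.regex = re.compile(str(route))

    def matches(self, path):
        return self.regex.fullmatch(path) is not None


class LazyPattern:
    def __init__(self, route):
        self.route = route

    def matches(self, path):
        # Evaluated on each call, so the active language is respected.
        return re.fullmatch(str(self.route), path) is not None


eager = EagerPattern(LazyRoute())  # loaded while "en" is active
lazy = LazyPattern(LazyRoute())
active_language = "nl"             # a later request uses another language

print(eager.matches("over/"), lazy.matches("over/"))
```

The eager pattern stays tied to the language that was active at load time, which is the concern raised above. A real implementation would presumably cache the compiled regex per language rather than recompiling on every call.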

On Friday, May 12, 2017 at 12:33:28 PM UTC+2, Aymeric Augustin wrote:
Hello,

I reviewed the current version of <a href="https://github.com/django/deps/blob/4ab472dd4aab102beac667c9b65aed10bb7d7ed3/draft/0201-simplified-routing-syntax.rst" target="_blank" rel="nofollow" onmousedown="this.href=&#39;https://www.google.com/url?q\x3dhttps%3A%2F%2Fgithub.com%2Fdjango%2Fdeps%2Fblob%2F4ab472dd4aab102beac667c9b65aed10bb7d7ed3%2Fdraft%2F0201-simplified-routing-syntax.rst\x26sa\x3dD\x26sntz\x3d1\x26usg\x3dAFQjCNG_K5D1vQ0CqRJ2HMoiV1L0aEyuEw&#39;;return true;" onclick="this.href=&#39;https://www.google.com/url?q\x3dhttps%3A%2F%2Fgithub.com%2Fdjango%2Fdeps%2Fblob%2F4ab472dd4aab102beac667c9b65aed10bb7d7ed3%2Fdraft%2F0201-simplified-routing-syntax.rst\x26sa\x3dD\x26sntz\x3d1\x26usg\x3dAFQjCNG_K5D1vQ0CqRJ2HMoiV1L0aEyuEw&#39;;return true;">DEP 201 as well as <a href="https://groups.google.com/forum/#!topic/django-developers/u6sQax3sjO4" target="_blank" rel="nofollow" onmousedown="this.href=&#39;https://groups.google.com/forum/#!topic/django-developers/u6sQax3sjO4&#39;;return true;" onclick="this.href=&#39;https://groups.google.com/forum/#!topic/django-developers/u6sQax3sjO4&#39;;return true;">related <a href="https://groups.google.com/forum/#!msg/django-developers/nq_8Hi5x_RI/jg4NZm80BwAJ" target="_blank" rel="nofollow" onmousedown="this.href=&#39;https://groups.google.com/forum/#!msg/django-developers/nq_8Hi5x_RI/jg4NZm80BwAJ&#39;;return true;" onclick="this.href=&#39;https://groups.google.com/forum/#!msg/django-developers/nq_8Hi5x_RI/jg4NZm80BwAJ&#39;;return true;">discussions. I took notes and wrote down arguments along the way. I'm sharing them below. It may be useful to add some of these arguments to the DEP.

Sjoerd, Tom, I didn't want to edit your DEP directly, but if you agree with the items marked [Update DEP] below I can prepare a PR. I will now take a look at the pull requests implementing this DEP.


Should it live as a third-party package first?

The original motivation for this DEP was to make Django easier to use by people who aren't familiar with regexes.

While regexes are a powerful tool, notably for shell scripting, I find it counter-productive to make them a prerequisite for building a Django website. You can build a very nice and useful website with models, forms, templates, and the admin, without ever needing regexes — except, until now, for the URLconf!

Since we aren't going to say in the official tutorial "hey install this third-party package to manage your URLs", that goal can only be met by building the new system into Django.

Besides, I suspect many professional Django developers copy-paste regexes without a deep understanding of how they work. For example, I'd be surprised if everyone knew why it's wrong to match a numerical id in a URL with \d+ (answer at the bottom of this email).

Not only is the new system easier for beginners, but I think it'll also be adopted by experienced developers to reduce the risk of mistakes in URLpatterns, which are an inevitable downside of their syntax. Django can solve problems like avoiding \d+ for everyone.

Anecdote: I keep making hard-to-detect errors in URL regexes. The only URL regexes I wrote that can't be replicated with the new system are very dubious and could easily be replaced with a more explicit `if some_condition(request.path): raise Http404` in the corresponding view. I will be happy to convert my projects to the new system.

No progress was made in this area since 2008 because URL regexes are a minor annoyance. After you write them, you never see them again. I think that explains why no popular alternative emerged until now.

Since there's a significant amount of prior art in other projects, a strong consensus on DEP 201 being a good approach, and a fairly narrow scope, it seems reasonable to design the new system directly into Django.


What alternatives would be possible?

I have some sympathy with the arguments for a pluggable URL resolver system, similar to what I did for templates in Django 1.8. However I don't see this happening any time soon because there's too little motivation to justify the effort. As I explained above, developers tend to live with whatever the framework provides.

Of course, if someone wants to write a fully pluggable URL resolver, that's great! But there's no momentum besides saying that "it should be done that way". Furthermore, adding the new system shouldn't make it more difficult to move to a fully pluggable system. If anything, it will clean up the internals and prepare further work in the area. Some changes of this kind were already committed.

DEP 201 is mostly independent from the problem of <a href="https://pypi.python.org/pypi/django-multiurl/1.1.0" target="_blank" rel="nofollow" onmousedown="this.href=&#39;https://www.google.com/url?q\x3dhttps%3A%2F%2Fpypi.python.org%2Fpypi%2Fdjango-multiurl%2F1.1.0\x26sa\x3dD\x26sntz\x3d1\x26usg\x3dAFQjCNFnapGqpLGUrn8VVWIXThE93S8nng&#39;;return true;" onclick="this.href=&#39;https://www.google.com/url?q\x3dhttps%3A%2F%2Fpypi.python.org%2Fpypi%2Fdjango-multiurl%2F1.1.0\x26sa\x3dD\x26sntz\x3d1\x26usg\x3dAFQjCNFnapGqpLGUrn8VVWIXThE93S8nng&#39;;return true;">allowing multiple views to match the same URL — that is, to resume resolving URL patterns if a view applies some logic and decides it can't handle a URL. This is perhaps the biggest complaint about the current design of the URL resolver. Solutions include hacking around the current design or changing it fundamentally.

This proposal doesn't change anything to the possibility of autogenerating URL structures, like DRF's routers do.

I'm aware of one realistic proposals for a more elaborate and perhaps cleaner system: Marten Kenbeek's dispatcher API refactor at: <a href="https://github.com/knbk/django/tree/dispatcher_api" target="_blank" rel="nofollow" onmousedown="this.href=&#39;https://www.google.com/url?q\x3dhttps%3A%2F%2Fgithub.com%2Fknbk%2Fdjango%2Ftree%2Fdispatcher_api\x26sa\x3dD\x26sntz\x3d1\x26usg\x3dAFQjCNGhYh2u95G1iPU0H-hHMk2xhuiI7Q&#39;;return true;" onclick="this.href=&#39;https://www.google.com/url?q\x3dhttps%3A%2F%2Fgithub.com%2Fknbk%2Fdjango%2Ftree%2Fdispatcher_api\x26sa\x3dD\x26sntz\x3d1\x26usg\x3dAFQjCNGhYh2u95G1iPU0H-hHMk2xhuiI7Q&#39;;return true;">https://github.com/knbk/django/tree/dispatcher_api. I can't say if the implementation of DEP 201 will break that effort. Anyway we can't wait forever on a branch that has no schedule for completion.

To sum up, I don't see any bigger improvement that is likely to be implemented in the short run and would justify delaying DEP 201.


Which types should be supported?

Integers, slugs, UUIDs and arbitrary strings will cover the vast majority of cases. I haven't seen any other examples in all discussions.

I don't think it's reasonable to include floats in URLs. Reversing a URL could lead to atrocities such as /foo/0.300000000000000004/bar/. This doesn't look good.

I don't think the use case is sufficiently common to add supports for Decimal, which wouldn't suffer from that issue.

So I would remove float and add slug. [Update DEP]

I'm proposing the following regexes:

path = .+
string = [^/]+
int = [0-9]+  # negative integers are uncommon, I think they should be left to a custom converter
slug = [-a-zA-Z0-9_]  # see django.core.validators.slug_re — we must stick to the current definition.
uuid = [0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}  # hyphens are mandatory to avoid multiple URLs resolving to the same view


How to handle the transition new API?

The plan to move imports to django.urls to facilitate the transition is a nice hack. Who doesn't like a nice hack? ;-)

It would be nice to specify a target for starting the deprecation of django.conf.urls.url. Deprecate in 3.1, remove in 4.1? [Update DEP]

Bikeshedding: I'm somewhat disturbed by the choice of path_regex instead of regex_path, because the thing is primarily a path, which happens to be implemented with a regex. I would like to suggest re_path, which reuses the name of the re module in Python to make a shorter name. [Update DEP]


How to manage leading and trailing slashes?

The second discussion in the mailing list determined that keeping the current system was fine: all URLs are relative (not starting with a slash) and the root URLconf is implicitly relative to /.

This is inconsistent with Flask, where URLs start with a leading slash, but Flask doesn't have relative URLs or includes.


Other bugs in the DEP

Failure to perform a type conversion against a captured string should result in an Http404 exception being raised.

I believe this should say "result in the pattern not matching". [Update DEP]

I'm not a fan of the CUSTOM_URL_CONVERTERS setting. I think it would be sufficient to document a `register_converter` function that users can call in their URLconf module — so the converter is registered before the URLconf is loaded. [Update DEP]

I think the section about "Preventing unintended errors" should be removed. If it turns out to be a real problem, then we can reconsider and add a check. [Update DEP]


Best regards,

-- 
Aymeric.


Answer to "why it's wrong to match an id in a URL with \d+"

Django's URL resolver matches URLs as str (Unicode strings), not bytes, after percent and UTF-8 decoding. As a consequence \d matches <a href="http://www.fileformat.info/info/unicode/category/Nd/list.htm" target="_blank" rel="nofollow" onmousedown="this.href=&#39;http://www.google.com/url?q\x3dhttp%3A%2F%2Fwww.fileformat.info%2Finfo%2Funicode%2Fcategory%2FNd%2Flist.htm\x26sa\x3dD\x26sntz\x3d1\x26usg\x3dAFQjCNHT024oC6WtaxPUNAgs2FXuDU2J8A&#39;;return true;" onclick="this.href=&#39;http://www.google.com/url?q\x3dhttp%3A%2F%2Fwww.fileformat.info%2Finfo%2Funicode%2Fcategory%2FNd%2Flist.htm\x26sa\x3dD\x26sntz\x3d1\x26usg\x3dAFQjCNHT024oC6WtaxPUNAgs2FXuDU2J8A&#39;;return true;">any character in Unicode character category Nd, for example, ١ which is 1 in Arabic (unless you also specified the ASCII flag on the regex).

Interestingly Python's int() function will still do the right thing: int('١') == 1. But that means you have multiple URLs resolving to the same page, which can be bad for SEO. In general it isn't a good practice to have unexpected URLs resolving by accident to an unintended view.

For this reason, I'm always using the more verbose [0-9] instead of \d in my URLconfs.

The Django admin suffers from this bug :-) See it for yourself at /admin/sites/site/١/.
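
The behaviour described in this answer can be checked with Python's standard `re` module alone. The variable names below are chosen for this illustration.

```python
import re

arabic_one = "\u0661"  # ARABIC-INDIC DIGIT ONE, Unicode category Nd

unicode_digits = re.compile(r"\d+")          # default str pattern: Unicode-aware
ascii_class = re.compile(r"[0-9]+")          # explicit ASCII character class
ascii_flag = re.compile(r"\d+", re.ASCII)    # \d restricted to [0-9]

matches_unicode = unicode_digits.fullmatch(arabic_one) is not None
matches_ascii_class = ascii_class.fullmatch(arabic_one) is not None
matches_ascii_flag = ascii_flag.fullmatch(arabic_one) is not None
converted = int(arabic_one)

# prints: True False False 1
print(matches_unicode, matches_ascii_class, matches_ascii_flag, converted)
```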


--
You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
To post to this group, send email to [hidden email].
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/d5069b76-c67b-415c-9ad4-03a18389160b%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Re: Review of DEP 201 - simplified routing syntax

Sjoerd Job Postmus


On Saturday, May 13, 2017 at 1:35:37 PM UTC+2, Marten Kenbeek wrote:
The regex in `url()` can be translated using `ugettext_lazy()` (https://docs.djangoproject.com/en/1.11/topics/i18n/translation/#translating-urlpatterns), in which case the lazy translation happens when `resolve()` or `reverse()` is called. It seems the current `path()` implementation would force evaluation when the URLs are first loaded, so `resolve()` and `reverse()` would use a fixed language (whichever was active when they were loaded) rather than allowing lazy translations.

I'm not sure how often this feature is used, but it seems like something `path()` should support out of the box. 

I agree. It's an oversight which I think is solvable. The `LocaleRegexProvider` class already handles the caching.

I will check later, but I think all that's needed is to decorate the `path` function with `keep_lazy_str`. Does that seem correct to you?
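
The difference between forcing evaluation at load time and translating lazily can be sketched without Django. This is not Django's implementation: `LazyString`, `translations`, `active_language` and `translate` are hypothetical stand-ins for a lazily translated route and the active language.

```python
class LazyString:
    """Hypothetical stand-in for a lazily translated string."""

    def __init__(self, func):
        self._func = func

    def __str__(self):
        # The translation is looked up every time the string is needed.
        return self._func()


translations = {"en": "about/", "nl": "over/"}
active_language = "en"


def translate():
    return translations[active_language]


route = LazyString(translate)

# Forcing evaluation while the URLconf is loaded freezes the language.
eager_route = str(route)

# Later, a request arrives with a different active language.
active_language = "nl"
lazy_route = str(route)

print(eager_route, lazy_route)
```

Here `eager_route` keeps the value for the language that was active at load time, while `str(route)` follows the language active when it is evaluated, which is the behaviour the message above asks `path()` to preserve.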


Re: Review of DEP 201 - simplified routing syntax

Sjoerd Job Postmus
Thinking about it some more: that is not the solution.

Also, there are probably a couple of corner cases to consider (still: the same as for normal regular expressions):

* What if the "keys" (named parameters) of a translated path differ from those of the original path?
* What if the converters differ?

For now I think the best way forward is to assume the above corner cases do not happen, or otherwise fall out-of-scope.



Re: Review of DEP 201 - simplified routing syntax

Aymeric Augustin
In reply to this post by Aymeric Augustin
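Hello,

The technical board accepted DEP 201: https://github.com/django/deps/blob/master/accepted/0201-simplified-routing-syntax.rst

Sjoerd has taken the lead on the implementation, please get in touch if you'd like to help!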

Thanks,

-- 
Aymeric.



On 12 May 2017, at 14:19, Aymeric Augustin <[hidden email]> wrote:

After getting approval from Tom on IRC, I updated the DEP according to my email below: https://github.com/django/deps/pull/41

The next steps are:

- account for any remaining feedback (I feel I made only minor changes compared to the last version, hopefully we can wrap this up quickly now)
- get approval from the technical board
- complete the implementation!

-- 
Aymeric.



On 12 May 2017, at 12:32, Aymeric Augustin <[hidden email]> wrote:

Hello,

I reviewed the current version of DEP 201 as well as related discussions. I took notes and wrote down arguments along the way. I'm sharing them below. It may be useful to add some of these arguments to the DEP.

Sjoerd, Tom, I didn't want to edit your DEP directly, but if you agree with the items marked [Update DEP] below I can prepare a PR. I will now take a look at the pull requests implementing this DEP.


Should it live as a third-party package first?

The original motivation for this DEP was to make Django easier to use by people who aren't familiar with regexes.

While regexes are a powerful tool, notably for shell scripting, I find it counter-productive to make them a prerequisite for building a Django website. You can build a very nice and useful website with models, forms, templates, and the admin, without ever needing regexes — except, until now, for the URLconf!

Since we aren't going to say in the official tutorial "hey install this third-party package to manage your URLs", that goal can only be met by building the new system into Django.

Besides, I suspect many professional Django developers copy-paste regexes without a deep understanding of how they work. For example, I'd be surprised if everyone knew why it's wrong to match a numerical id in a URL with \d+ (answer at the bottom of this email).

Not only is the new system easier for beginners, but I think it'll also be adopted by experienced developers to reduce the risk of mistakes in URLpatterns, which are an inevitable downside of their syntax. Django can solve problems like avoiding \d+ for everyone.

Anecdote: I keep making hard-to-detect errors in URL regexes. The only URL regexes I wrote that can't be replicated with the new system are very dubious and could easily be replaced with a more explicit `if some_condition(request.path): raise Http404` in the corresponding view. I will be happy to convert my projects to the new system.

No progress was made in this area since 2008 because URL regexes are a minor annoyance. After you write them, you never see them again. I think that explains why no popular alternative emerged until now.

Since there's a significant amount of prior art in other projects, a strong consensus on DEP 201 being a good approach, and a fairly narrow scope, it seems reasonable to design the new system directly into Django.


What alternatives would be possible?

I have some sympathy with the arguments for a pluggable URL resolver system, similar to what I did for templates in Django 1.8. However I don't see this happening any time soon because there's too little motivation to justify the effort. As I explained above, developers tend to live with whatever the framework provides.

Of course, if someone wants to write a fully pluggable URL resolver, that's great! But there's no momentum besides saying that "it should be done that way". Furthermore, adding the new system shouldn't make it more difficult to move to a fully pluggable system. If anything, it will clean up the internals and prepare further work in the area. Some changes of this kind were already committed.

DEP 201 is mostly independent from the problem of allowing multiple views to match the same URL — that is, to resume resolving URL patterns if a view applies some logic and decides it can't handle a URL. This is perhaps the biggest complaint about the current design of the URL resolver. Solutions include hacking around the current design or changing it fundamentally.

This proposal doesn't affect the possibility of autogenerating URL structures, like DRF's routers do.

I'm aware of one realistic proposal for a more elaborate and perhaps cleaner system: Marten Kenbeek's dispatcher API refactor at https://github.com/knbk/django/tree/dispatcher_api. I can't say if the implementation of DEP 201 will break that effort. Anyway we can't wait forever on a branch that has no schedule for completion.

To sum up, I don't see any bigger improvement that is likely to be implemented in the short run and would justify delaying DEP 201.


Which types should be supported?

Integers, slugs, UUIDs and arbitrary strings will cover the vast majority of cases. I haven't seen any other examples in all discussions.

I don't think it's reasonable to include floats in URLs. Reversing a URL could lead to atrocities such as /foo/0.30000000000000004/bar/. This doesn't look good.

I don't think the use case is sufficiently common to add support for Decimal, which wouldn't suffer from that issue.

So I would remove float and add slug. [Update DEP]
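
The float problem, and why Decimal avoids it, can be reproduced with the standard library. The `/foo/.../bar/` URL shape is taken from the example above; the variable names are illustrative.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 or 0.2 exactly.
float_value = 0.1 + 0.2
# Decimal built from strings keeps the exact decimal digits.
decimal_value = Decimal("0.1") + Decimal("0.2")

float_url = "/foo/{}/bar/".format(float_value)
decimal_url = "/foo/{}/bar/".format(decimal_value)

print(float_url)    # /foo/0.30000000000000004/bar/
print(decimal_url)  # /foo/0.3/bar/
```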

I'm proposing the following regexes:

path = .+
string = [^/]+
int = [0-9]+  # negative integers are uncommon, I think they should be left to a custom converter
slug = [-a-zA-Z0-9_]+  # see django.core.validators.slug_re; we must stick to the current definition.
uuid = [0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}  # hyphens are mandatory to avoid multiple URLs resolving to the same view
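
A quick way to try the proposed regexes against sample values, using only the standard `re` module. The `PROPOSED` dictionary and the `accepts` helper are hypothetical names for this sketch. The slug entry is written with a trailing `+`, on the assumption that one or more characters are intended, as in django.core.validators.slug_re.

```python
import re

# The proposed regexes, keyed by type name.
PROPOSED = {
    "path": r".+",
    "string": r"[^/]+",
    "int": r"[0-9]+",
    "slug": r"[-a-zA-Z0-9_]+",
    "uuid": r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
}


def accepts(type_name, value):
    """Return True if the whole value matches the regex for type_name."""
    return re.fullmatch(PROPOSED[type_name], value) is not None


samples = [
    ("path", "a/b/c"),
    ("string", "a/b"),
    ("int", "42"),
    ("int", "-1"),
    ("slug", "my-first_post"),
    ("uuid", "075194d3-6885-417e-a8a8-6c931e272f00"),
    ("uuid", "075194d36885417ea8a86c931e272f00"),
]

for type_name, value in samples:
    print(type_name, value, accepts(type_name, value))
```

Only `path` accepts a value containing a slash, `int` rejects the negative number, and the UUID without hyphens is rejected, so each UUID has a single URL form.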


How to handle the transition to the new API?

The plan to move imports to django.urls to facilitate the transition is a nice hack. Who doesn't like a nice hack? ;-)

It would be nice to specify a target for starting the deprecation of django.conf.urls.url. Deprecate in 3.1, remove in 4.1? [Update DEP]

Bikeshedding: I'm somewhat disturbed by the choice of path_regex instead of regex_path, because the thing is primarily a path, which happens to be implemented with a regex. I would like to suggest re_path, which reuses the name of the re module in Python to make a shorter name. [Update DEP]


How to manage leading and trailing slashes?

The second discussion in the mailing list determined that keeping the current system was fine: all URLs are relative (not starting with a slash) and the root URLconf is implicitly relative to /.

This is inconsistent with Flask, where URLs start with a leading slash, but Flask doesn't have relative URLs or includes.


Other bugs in the DEP

Failure to perform a type conversion against a captured string should result in an Http404 exception being raised.

I believe this should say "result in the pattern not matching". [Update DEP]

I'm not a fan of the CUSTOM_URL_CONVERTERS setting. I think it would be sufficient to document a `register_converter` function that users can call in their URLconf module — so the converter is registered before the URLconf is loaded. [Update DEP]

I think the section about "Preventing unintended errors" should be removed. If it turns out to be a real problem, then we can reconsider and add a check. [Update DEP]


Best regards,

-- 
Aymeric.


Answer to "why it's wrong to match an id in a URL with \d+"

Django's URL resolver matches URLs as str (Unicode strings), not bytes, after percent and UTF-8 decoding. As a consequence \d matches any character in Unicode character category Nd, for example, ١ which is 1 in Arabic (unless you also specified the ASCII flag on the regex).

Interestingly Python's int() function will still do the right thing: int('١') == 1. But that means you have multiple URLs resolving to the same page, which can be bad for SEO. In general it isn't a good practice to have unexpected URLs resolving by accident to an unintended view.

For this reason, I'm always using the more verbose [0-9] instead of \d in my URLconfs.

The Django admin suffers from this bug :-) See it for yourself at /admin/sites/site/١/.




Re: Review of DEP 201 - simplified routing syntax

Chris Foresman
Huzzah! Looking forward to the new syntax landing.


On Sunday, May 21, 2017 at 2:56:13 AM UTC-5, Aymeric Augustin wrote:
Hello,

The technical board accepted DEP 201: https://github.com/django/deps/blob/master/accepted/0201-simplified-routing-syntax.rst

Sjoerd has taken the lead on the implementation, please get in touch if you'd like to help!

Thanks,

-- 
Aymeric.


