"Safe" Project Names and underscores in Project Names issue

"Safe" Project Names and underscores in Project Names issue

Michael-627
 Forgive me if this is a stupid question, but what characters are
 allowed in a project name on PyPI and by Distribute/Setuptools?
 I ask specifically because of what I think is a problem with how
 Distribute/Setuptools handles the underscore character ("_").

 It seems PyPI allows the underscore as a valid character in a project
 name, so there are projects such as pyramid_beaker and
 django_email_auth. Distribute's behaviour is to replace all underscores
 with dashes using the safe_name function
 (https://bitbucket.org/tarek/distribute/src/611910892a04/pkg_resources.py#cl-1135).
 The project name is recorded with dashes in the egg (e.g.
 pyramid-beaker), and the original name is not stored anywhere.
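
 The replacement described above can be sketched as follows. This is a
 minimal illustration, not the library source: the name
 safe_name_sketch is hypothetical, and the regular expression is an
 assumption about what the linked safe_name function does (collapse
 each run of characters other than letters, digits and "." into one
 dash).

```python
import re


def safe_name_sketch(name):
    """Hypothetical stand-in for safe_name (assumed behaviour).

    Each run of characters other than ASCII letters, digits and '.'
    is replaced by a single '-'.
    """
    return re.sub(r'[^A-Za-z0-9.]+', '-', name)


for original in ('pyramid_beaker', 'django_email_auth',
                 'sphinxcontrib-programoutput'):
    # Underscored names change; names that already use dashes do not.
    print('{0} -> {1}'.format(original, safe_name_sketch(original)))
```

 Under that assumption the mapping is many-to-one, which is why the
 original spelling cannot be recovered from the stored name alone.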

 It seems that pip has followed that lead and adjusted its code
 accordingly
 (https://github.com/pypa/pip/blob/ea60bb79b5f837f0aa36d2354415dbcca5368afe/pip/index.py#L317).
 As a result, modifying the safe_name function to accept underscores
 breaks pip.

 Running "pip install pyramid_beaker" installs the package successfully,
 but packaging tools (e.g. pip) then report the project as
 "pyramid-beaker", which is unusable as a name when PyPI is queried via
 its RPC interfaces. Turning all the dashes back into underscores
 breaks packages that really do have dashes in their names (e.g.
 sphinxcontrib-programoutput). And if a package name used both, it
 would be impossible to determine the original without trying all the
 permutations.
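
 To make the cost of "trying all the permutations" concrete, here is a
 small sketch that enumerates every spelling a reported name could have
 had. The function name candidate_names is hypothetical, and the sketch
 assumes that only "-" and "_" need to be considered as separators,
 which is a simplification.

```python
import itertools
import re


def candidate_names(reported_name):
    """Return every spelling obtained by choosing '-' or '_' at each
    separator position of reported_name (hypothetical helper)."""
    parts = re.split(r'[-_]', reported_name)
    candidates = []
    # One choice of separator per gap between adjacent parts.
    for seps in itertools.product('-_', repeat=len(parts) - 1):
        name = parts[0]
        for sep, part in zip(seps, parts[1:]):
            name += sep + part
        candidates.append(name)
    return candidates


print(candidate_names('pyramid-beaker'))
# ['pyramid-beaker', 'pyramid_beaker']
print(len(candidate_names('a-b-c-d')))
# 8
```

 A name with n separators yields 2**n candidates, each of which would
 need its own lookup.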

 I would appreciate it if anyone could advise me on this matter and let
 me know whether this underscore treatment is a bug in
 Distribute/Setuptools, pip or PyPI.


 Here is the test code to reproduce the problem (in a virtualenv):

 1. Set up a test virtualenv with "--distribute --no-site-packages" and
 activate it

 2. Install the following packages:

 pip install --ignore-installed --no-deps pyramid_beaker
 pip install --ignore-installed --no-deps sphinxcontrib-programoutput

 3. Save and run the following script that uses pip and XMLRPC to check
 for new versions:
 --------------------------------
 # Python 2 script: compare each installed distribution against PyPI.
 import xmlrpclib
 import pip

 pypi = xmlrpclib.ServerProxy('http://pypi.python.org/pypi')
 for dist in pip.get_installed_distributions():
     # dist.project_name is the "safe" name, e.g. pyramid-beaker
     available = pypi.package_releases(dist.project_name)
     if not available:
         # Try to capitalize pkg name
         available = pypi.package_releases(
             dist.project_name.capitalize())

     if not available:
         msg = 'no releases at pypi'
     elif available[0] != dist.version:
         msg = '{} available'.format(available[0])
     else:
         msg = 'up to date'
     pkg_info = '{dist.project_name} {dist.version}'.format(dist=dist)
     print '{pkg_info:40} {msg}'.format(pkg_info=pkg_info, msg=msg)

 ---------------------------------

 Output (specific to installed packages):

 sphinxcontrib-programoutput 0.4.1        up to date
 pyramid-beaker 0.5                       no releases at pypi

 (pyramid-beaker does not exist on PyPI, since the project's name there
 is pyramid_beaker)


 Expected output (mind the underscores)

 sphinxcontrib-programoutput 0.4.1        up to date
 pyramid_beaker 0.5                       up to date

 ---- Additional info ---
 pyramid_beaker-0.5-py2.7.egg-info/PKG-INFO shows that the project name
 is:

 Name: pyramid-beaker


_______________________________________________
Distutils-SIG maillist  -  [hidden email]
http://mail.python.org/mailman/listinfo/distutils-sig

Re: "Safe" Project Names and underscores in Project Names issue

PJ Eby
At 08:02 PM 8/4/2011 -0700, Michael wrote:
>Forgive me if I ask a stupid question but what characters are
>allowed in a project name on PyPI and by Distribute/Setuptools?

Setuptools allows names that have been processed through safe_name.


>As such, modifying the safe_name method to recognize underscore breaks pip.

Why are you modifying the safe_name function?  If you change it,
it'll no longer be safe.  ;-)



>Running "pip install pyramid_beaker" installs the package
>successfully, but packaging tools (e.g. pip) then report the project
>as "pyramid-beaker", which is unusable as a name when PyPI is queried
>via its RPC interfaces. Turning all the dashes back into underscores
>breaks packages that really do have dashes in their names (e.g.
>sphinxcontrib-programoutput). And if a package name used both, it
>would be impossible to determine the original without trying all the
>permutations.
>
>I would appreciate it if anyone could advise me on this matter and
>let me know whether this underscore treatment is a bug in
>Distribute/Setuptools, pip or PyPI.

It's a PyPI incompatibility - neither the distutils nor PyPI was
originally designed for a world involving automatically-resolved
dependencies, where names needed to be unambiguous.  That is, PyPI
and distutils had an implicit assumption that people would 1) choose
reasonable names and 2) be able to handle it when other people didn't.

Setuptools, on the other hand, needs unambiguous naming, and
therefore has a canonicalization algorithm that's designed to work
reasonably well with distutils' file naming conventions.  Distutils
normally uses '-' to separate name parts in a filename, and
(sometimes) escapes '-' using '_'.  So setuptools canonicalizes all
punctuation to '-', and escapes it as '_'.
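
As an illustration of the escaping step only, the following sketch
builds and parses a metadata directory name of the kind quoted earlier
in the thread (pyramid_beaker-0.5-py2.7.egg-info).  The function names
are hypothetical and the field layout is an assumption taken from that
one example; the version is assumed to contain no '-'.

```python
def escape_for_filename(canonical_name):
    """Write '-' as '_' so that '-' stays free as the field separator
    (hypothetical helper; assumed behaviour)."""
    return canonical_name.replace('-', '_')


def egg_info_dirname(canonical_name, version, py_version):
    """Build '<name>-<version>-py<X.Y>.egg-info' (assumed layout)."""
    return '{0}-{1}-py{2}.egg-info'.format(
        escape_for_filename(canonical_name), version, py_version)


def split_egg_info_dirname(dirname):
    """Recover (canonical name, version, python tag) from a directory
    name produced by egg_info_dirname."""
    stem = dirname[:-len('.egg-info')]
    # Exactly three '-'-separated fields, because the name was escaped.
    name, version, py_tag = stem.split('-')
    return name.replace('_', '-'), version, py_tag


dirname = egg_info_dirname('pyramid-beaker', '0.5', '2.7')
print(dirname)
# pyramid_beaker-0.5-py2.7.egg-info
print(split_egg_info_dirname(dirname))
# ('pyramid-beaker', '0.5', 'py2.7')
```

Because every name is canonicalized before it is escaped, the '_' in
the filename always maps back to '-', never to a literal underscore.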

Some of PyPI's code has been changed to work with this approach, and
some has not.


>(pyramid-beaker does not exist on PyPI, since the project's name there
>is pyramid_beaker)

From setuptools' POV, the package name is pyramid-beaker, because '_'
is reserved for escaping punctuation in filenames.

If you want to look up a setuptools package name using PyPI XML-RPC,
you can't always use the name directly; you may have to perform a
PyPI search operation to obtain the package's PyPI name first.
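
One way to read this advice is sketched below, without any network
access.  It assumes the caller has already obtained a list of
candidate names from some PyPI search (search_results here is a
hypothetical, hand-written list) and picks the candidate whose
canonical form matches the installed name.  The canonicalization
regex and the case-insensitive comparison are assumptions, not a
documented PyPI or setuptools rule.

```python
import re


def canonicalize(name):
    """Assumed canonical form: runs of punctuation become '-', and
    case is ignored for comparison purposes."""
    return re.sub(r'[^A-Za-z0-9.]+', '-', name).lower()


def find_pypi_name(installed_name, search_results):
    """Return the first search result whose canonical form equals that
    of installed_name, or None if nothing matches (hypothetical helper)."""
    wanted = canonicalize(installed_name)
    for candidate in search_results:
        if canonicalize(candidate) == wanted:
            return candidate
    return None


# Hypothetical search results for the query "pyramid beaker".
search_results = ['pyramid', 'Beaker', 'pyramid_beaker']
print(find_pypi_name('pyramid-beaker', search_results))
# pyramid_beaker
```

The name returned this way is the one to pass to calls such as
package_releases, rather than the name reported by the installed
distribution.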
