Andreas Jung recently wrote “zopyx.trashfinder,” a quick-and-dirty script for assessing whether folks releasing software via the Python community’s “PyPI” repository are providing minimal metadata and actually making releases. (Andreas has been on the warpath against sloppy release management, it seems, perhaps having been recently frustrated.)
Andreas has assessed the zope.*, plone.*, and collective.* namespaces so far. He found 2 of 167 zope.* products with metadata problems, 2 of 117 plone.* products with problems, and 15 of 325 collective.* products with problems.
I decided to run an assessment against the Products.* namespace, which also contains lots of Plone add-on products. The news is pretty good… out of 250 products checked, only 4 had problems.
CRAP: Products.Clouseau==0.8.4dev - description < 40 chars
CRAP: Products.Clouseau==0.8.4dev - summary < 10 chars
CRAP: Products.Clouseau==0.8.4dev - no author and no maintainer email given
CRAP: Products.Clouseau==0.8.4dev - no author and no maintainer name given
CRAP: Products.listen==0.7.1 - description < 40 chars
CRAP: Products.listen==0.7.1 - no author and no maintainer email given
CRAP: Products.listen==0.7.1 - no author and no maintainer name given
CRAP: Products.LoginLockout==0.2 - no release files, no valid download_url
CRAP: Products.PloneInvite==1.1-alpha - no release files, no valid download_url
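For anyone curious what checks like these might look like, here is a minimal sketch. It is not the actual zopyx.trashfinder code: the function name `find_problems` is made up, the thresholds are simply read off the messages in the listing above, and the test for a "valid" download_url (it starts with http:// or https://) is my own assumption. The field names are the usual Python package metadata fields.

```python
def find_problems(meta, release_files):
    """Return problem messages for one release.

    meta is a dict of package metadata; release_files is a list of the
    files uploaded for the release. Both are hypothetical inputs here.
    """
    problems = []

    # Thresholds inferred from the messages in the listing above.
    if len((meta.get("description") or "").strip()) < 40:
        problems.append("description < 40 chars")
    if len((meta.get("summary") or "").strip()) < 10:
        problems.append("summary < 10 chars")

    # Either the author or the maintainer field is enough.
    if not (meta.get("author_email") or meta.get("maintainer_email")):
        problems.append("no author and no maintainer email given")
    if not (meta.get("author") or meta.get("maintainer")):
        problems.append("no author and no maintainer name given")

    # Assumption: a download_url is "valid" if it looks like an HTTP(S) URL.
    download_url = (meta.get("download_url") or "").strip()
    has_url = download_url.startswith(("http://", "https://"))
    if not release_files and not has_url:
        problems.append("no release files, no valid download_url")

    return problems


if __name__ == "__main__":
    # Invented sample data, not a real package.
    sample = {"name": "Products.Example", "version": "0.1", "summary": "Demo"}
    for problem in find_problems(sample, release_files=[]):
        print("CRAP: %s==%s - %s" % (sample["name"], sample["version"], problem))
```

Fetching the metadata and the list of release files from PyPI is left out on purpose; the sketch only covers the checks themselves.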
While I think Andreas is being a bit (characteristically) tough in his conclusion that “PyPI is the public data toilet of the Python community,” given the low rate of problems, I really love the idea of regular, automated sanity checking like this, or as Alex Limi and I have called it, “automated shame.” I hope Andreas adds the product maintainer or author’s name to the listing in a future release.
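To put a number on that "low rate," here is a quick calculation using only the counts quoted in this post. The helper name `problem_rate` is just for illustration.

```python
# (products with problems, products checked), as quoted in this post.
results = {
    "zope.*": (2, 167),
    "plone.*": (2, 117),
    "collective.*": (15, 325),
    "Products.*": (4, 250),
}


def problem_rate(bad, total):
    """Percentage of checked products that had at least one problem."""
    return 100.0 * bad / total


total_bad = sum(bad for bad, _ in results.values())
total_checked = sum(total for _, total in results.values())

if __name__ == "__main__":
    for namespace, (bad, total) in results.items():
        rate = problem_rate(bad, total)
        print("%-13s %2d of %3d (%.1f%%)" % (namespace, bad, total, rate))
    print("%-13s %2d of %3d (%.1f%%)" % (
        "all four", total_bad, total_checked,
        problem_rate(total_bad, total_checked)))
```

Every namespace comes out under 5 percent, with collective.* the highest of the four.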
I’d love to see someone regularly interrogate all of the packages in Plone.org’s PyPI server with this query, and regularly publish the results to the Plone product-developers list.
Update: David Glick also referred me to Matthew Wilkes’ “mr.parker,” which checks that PyPI packages have more than one authorized admin, in order to avoid the “hit-by-a-bus” factor. Good stuff.
Frustration started out yesterday while looking at three new but different collective.* packages dealing with slideshows, galleries, etc. (by the same package author). Two packages had no metadata at all, no release files, and only a link to the collective SVN. The other package had one release file but a horrible long description including broken image links to external sites. I know how to deal with SVN checkouts; average users don’t. That’s why the “data toilet” came back to my mind. Certainly this affects only a minority of packages and maintainers, but it is the minority that makes life kind of hard sometimes.
Interesting, it seems that your automated test didn’t even pick up any of the collective.* slideshow products. Hmm, maybe some further refinement and regular “metadata quality” test runs?