python unicode abi

One of the reasons we suggested not using Python for apps meant to be distributed directly to end users is that libpython has an ABI that varies between distributions (most visibly in whether Python was built with UCS-2 or UCS-4 unicode), causing failures for apps that include any custom C code or bindings.
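
A minimal sketch of the problem (assuming a Linux CPython 2.x from before the unicode unification, with ctypes available; the symbol probing is illustrative, not part of autopackage):

    import ctypes
    import sys

    # Pre-unification CPython is compiled as either a UCS-2 or a UCS-4 build;
    # sys.maxunicode reveals which one this interpreter is.
    if sys.maxunicode == 65535:
        print "interpreter unicode build: ucs2"
    else:
        print "interpreter unicode build: ucs4"

    # The C API renames the PyUnicode_* entry points per build, so an extension
    # compiled against the other build fails to import with an undefined-symbol
    # error.  Probe which symbol family this interpreter actually exports.
    libpython = ctypes.PyDLL(None)
    for symbol in ("PyUnicodeUCS2_FromUnicode", "PyUnicodeUCS4_FromUnicode"):
        if hasattr(libpython, symbol):
            print symbol, "is exported"
        else:
            print symbol, "is absent"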

I just posted [an outline of a possible solution](http://article.gmane.org/gmane.comp.autopackage.devel/5212) to the mailing list, though I don’t have plans to implement it anytime soon.


8 Responses to “python unicode abi”

  1. Serge Says:

    Isn’t it simpler to compile the extension twice and install only the version that is compatible with the installed Python? Note that from time to time Python changes its API besides the UCS-2/UCS-4 problem. Compile four times? Sounds ridiculous, but that’s the price of convenience. By the way, do you compress with the LZMA algorithm? It should decently compress similar files.

  2. Mike Says:

    The double compiling thing really isn’t a scalable solution at all – consider a Python extension implemented partly in C++. We’d need 4 binaries. The logic to select the right one alone would be terrifying.

    I’m aware the Python ABI changes without regard to binary compatibility, and since most Python apps use libpython at some point, it is for this reason that I’ll continue to suggest NOT implementing software you intend to be installed by end users in it. I really wish we didn’t have to do that, but seriously, many projects have already realised that backwards compatibility is beneficial for end users (like gnome, kde, gtk, libc, even the kernel) and hopefully at some point Python will join that crowd ….

    Yes LZMA is used in the 1.2 series (not released yet).

  3. Serge Says:

    4 binaries is not that terrible, 8 is :) I don’t think the logic to select the right one will be awful, just put different extension binaries into different directories:
    ucs2-abi1/modulefoo.so
    ucs2-abi2/modulefoo.so
    ucs4-abi1/modulefoo.so
    ucs4-abi2/modulefoo.so
    and set the Python module search path to the right one before you start the real program. Here is how to find out Python’s internal unicode representation:
    —————–
    import sys
    if sys.maxunicode == 65535:
        print "ucs2"
    else:
        print "ucs4"
    —————–
    combine the output of this program with the output of the program that prints the C++ ABI, put the result into PYTHONPATH, and you’re done.
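
    Something along these lines (the directory names follow the layout above; the C++ ABI value is just a placeholder for whatever the ABI-detection program prints):

        import os
        import sys

        # Hypothetical launcher: pick the extension directory matching this
        # interpreter's unicode build and the detected C++ ABI, then expose it
        # to the real program via PYTHONPATH.
        if sys.maxunicode == 65535:
            ucs = "ucs2"
        else:
            ucs = "ucs4"
        cxx_abi = "abi1"  # placeholder: output of the C++ ABI detection program

        moddir = os.path.join(os.path.dirname(os.path.abspath(__file__)),
                              "%s-%s" % (ucs, cxx_abi))
        os.environ["PYTHONPATH"] = moddir
        print "would load modulefoo.so from", moddir
        # the real program started after this point simply imports modulefoo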

    AFAIK libc is just frozen and tiny compared to Python; that is how it gets binary compatibility. You can get the same result if you stick to one Python version. Perhaps I’m missing something, but why can’t you recommend using one fixed version of Python? Can you show an example of how such an application can fail?

  4. Guido Schimmels Says:

    What I did with my PyGTK-based “Contacts” vCard manager is to “freeze” it with cx_freeze, which increased the download size by less than a megabyte, including the Python interpreter, pygtk and gnomecanvas, with libgnomecanvas statically compiled into the Python extension module. That’s approximately the same overhead as, e.g., a semistatic gtkmm app.
    In the case of Python you don’t even pay with increased memory usage, as Python doesn’t do VM sharing anyway.
    Therefore I don’t see what the big deal is.
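
    For reference, a minimal cx_freeze setup script looks roughly like the sketch below (names and metadata are illustrative, based on cx_Freeze’s documented setup()/Executable interface, not the exact script used for Contacts):

        # setup.py -- illustrative cx_Freeze sketch, not the actual Contacts build
        from cx_Freeze import setup, Executable

        setup(
            name="contacts",
            version="0.1",
            description="PyGTK vCard manager (example)",
            executables=[Executable("contacts.py")],
        )

    Running "python setup.py build" should then produce a self-contained build directory with the frozen program and the interpreter bundled together.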

  5. Mike Says:

    Hmm, only a meg? That’s not bad at all. Maybe we should look into this cx_freeze program….

    Serge – libc is massive (look at the RPM file sizes some time) and has a lot of infrastructure for backwards compatibility purposes. Python is also large, yes, but the compatibility problems _could_ be avoided, if they wanted to.

  6. Mike Says:

    Oh, we can’t just recommend a fixed version because the apps are supposed to run against the version of Python shipped with the distro …..

  7. Serge Says:

    “supposed to run against the version of Python shipped with the distro”? Supposed by whom? If the program wasn’t tested on a newer version, you shouldn’t silently upgrade any library under the program. Unfortunately, in the Linux world shipping products that were not tested thoroughly is the norm :( Even a bugfix can potentially make some programs fail. I remember I’ve seen a pure Python program that was running on 2.3 but failed on 2.4.0 because of the improved regular expression module. It was hitting a bug in a new optimization. Of course there were no changes in the Python API.

    Libc is tiny compared to Python; Python has decimal arithmetic, email parsing, XML parsing, an HTTP server and a kitchen sink. Try to write a program using only libc ;)

    Sure, somebody could support Python backward compatibility, but AFAIK nobody from the Linux world has even complained about that to the Python developers. I only remember Windows users complaining about backward incompatibility, but that’s a different story because the Microsoft compiler breaks compatibility as well.

    By the way, according to modsupport.h http://tinyurl.com/fgg6y the last API change occurred 3.5 years after the previous one. It didn’t change with every minor version as you said.
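
    You can check the C API version an interpreter was built with from Python itself, e.g. (the note about the mismatch warning is my understanding of the check in modsupport.h, not something tested here):

        import sys

        # The value of PYTHON_API_VERSION this interpreter was compiled with,
        # the same constant that modsupport.h compares against when an extension
        # module registers itself; a mismatch is reported with a warning.
        print sys.api_version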

  8. Mike Says:

    Well, you certainly have good points. Versioning is inherently a stability/efficiency/flexibility tradeoff – the best stability would come from having every piece of software statically linked into every program, such that it came as one self-contained “guaranteed to work” bundle.

    But that would be very inefficient and inflexible – it would mean a security update or bugfix in a widely used library would require new versions of every app that used it, along with all the bandwidth and admin overhead that implies. And of course the lack of page-level VM sharing.

    The question is where to draw that line?

    For Python, because it is so very large, it makes sense to use the distro’s version. Shipping Python with every app that uses it doesn’t really make sense unless things like cx_freeze can produce massive efficiency improvements.

    Likewise, it doesn’t make much sense to ship a canonical autopackage version that all autopackages are supposed to rely on. Developers choose dependencies independently of us – some kind of platform is badly needed, but that’s a separate project.

    On Python’s backwards compatibility: thank you, I did not know that. I happily admit I’m repeating hearsay – I’ve seen discussions about Python ABI instability on distro development lists but never investigated it directly myself (apart from the unicode thing). If libpython is a lot more stable than we thought then great! That’s wonderful news.

    Finally I’d say libc is really not that tiny. Consider – it includes:

    * Dynamic linking support
    * File handling
    * Network handling/DNS
    * Regular expression support
    * Unicode conversion
    * Interprocess communication support
    * Threading
    * Database handling
    * More that I forget

    Go browse the libc tree some time, it’s remarkably large. I quite agree that Python’s stdlib is larger, but don’t underestimate the size of libc ;)
