Archive for August, 2006

elf and program loading

August 25, 2006

Time to post another technical article from the queue, I think. I’ll be away on a family holiday for a week. Enjoy!

Via Slashdot we find this excellent article on how Mach-O (the MacOS X equivalent to ELF) works. Well, mostly excellent anyway, it doesn’t cover much beyond getting the program into memory, and it’s after you reach this point that the problems start.

Anyway, ELF is a big deal for autopackage and Linux developers in general, but it’s largely mysterious. So in a similar vein to the Mach-O article, here is a quick tutorial on it. Actually Mach-O and ELF are very similar, but some of the names and commands are different.

First steps

The life of a process starts in the kernel. The kernel loader code maps the executable file into memory, then reads the ELF headers to locate the PT_INTERP header. Mapping it in is a very similar to Mach-O so I’ll defer to the linked article on that. PT_INTERP tells the kernel the path to the “ELF Interpreter” – better known as the dynamic linker:

mike@linux:/> readelf -aW /usr/bin/readelf|grep interpreter
      [Requesting program interpreter: /lib/ld-linux.so.2]

On Linux this is always the same value, but it can be modified if there’s a reason to do so. The dynamic linker therefore runs entirely in userspace, and is bootstrapped into life by the kernel. That’s a common pattern that you can see on Windows and OS X too (and of course every ELF platform). The ELF file format and ABI is a UNIX-wide standard. Each UNIXs version is slightly different, naturally, but still the specification allows for a lot of code re-use.

Start of dynamic linking

The dynamic linker is some scary code. When it first gets control from the kernel, it can’t use global variables, nor can it do regular function calls. Instead it first dynamically links itself, and once its environment is sane, it proceeds to link the rest of the program together. This is a recursive process, which involves looking at the DT_NEEDED commands in the ELF file, and loading each shared library it needs into memory, then looking at the DT_NEEDED commands in that new shared library and so on.

You can see what an ELF file needs with the following command:

objdump -p myprogram | grep NEEDED

This is NOT the same as the ldd program, which dumps a list of every single thing loaded into a process by that ELF file, including dependencies of dependencies. Usually, if built correctly (with –as-needed) this list of needed files will be much smaller than the output from ldd.

Shared libraries and dependencies

The DT_NEEDED headers don’t specify file names directly. Instead they specify sonames. This is an additional layer of indirection designed to allow for simplistic versioning of files. It’s not enough, as we’ll see in a minute. The dynamic linker treats a soname as a filename however, and will scan directories in the search path looking for a filename equal to the given soname. Usually this will be a symbolic link to the real library.

The symlinks are created by the dynamic linker cache regeneration program, ldconfig. This program creates an mmappable cache file for fast lookup, and makes sure a symlink for each soname exists. If two libraries in a directory share the same soname, then ldconfig will try and figure out which one is newest by doing a numeric comparison of the filenames, and picking that one. The algorithm is simplistic and is implemented in _dl_cache_libcmp on Linux, so it’s worth being careful that your libraries will always compare correctly with this algorithm. Chances are good it will be fine if you use sensible naming.

This isn’t the same as allowing multiple versions to be in use at once, all soname versioning does it allow two to exist on disk simultaneously – the newer version will always override the older version however.

Lazy linking

The dynamic linkers job is to “link” programs together by rewriting the jump tables that sit at the top of the file in memory. Essentially, when one ELF file calls a function that is supposed to exist in another, the code generated first fetches its address from the global offset table (GOT), an array of addresses, and then jumps to it. Filling out this table is the dynamic linkers job, and is called symbol relocation. By default relocation is lazy …. that is, it’s delayed until the last possible moment. That’s a useful optimisation to make because often a program won’t call every function it’s linked to, it will depend on how the program is used. ELF is very slow at looking up symbols, so this also speeds up startup time.

To implement lazy linking, when the GOT is first initialized each entry points to an offset in the procedure linkage table (PLT). This is an array of code blocks. When executed the code block will jump into the dynamic linker and find the real address of the function, and then rewrite the GOT so next time the function is called the PLT is not involved. The PLT is generated ahead of time by the the “ld” compile time linker, and you can see it by running objdump again:


mike@linux:/> objdump -d -j .plt /usr/bin/objdump|head -n 20
/usr/bin/objdump:     file format elf32-i386
Disassembly of section .plt:

08049da8 <xmalloc @plt-0x10>:
 8049da8:       ff 35 54 b9 08 08       pushl  0x808b954
 8049dae:       ff 25 58 b9 08 08       jmp    *0x808b958
 8049db4:       00 00                   add    %al,(%eax)
08049db8 <xmalloc @plt>:
 8049db8:       ff 25 5c b9 08 08       jmp    *0x808b95c
 8049dbe:       68 00 00 00 00          push   $0x0
 8049dc3:       e9 e0 ff ff ff          jmp    8049da8 <_ init +0x18>
08049dc8 <disassembler @plt>:
 8049dc8:       ff 25 60 b9 08 08       jmp    *0x808b960
 8049dce:       68 08 00 00 00          push   $0x8
 8049dd3:       e9 d0 ff ff ff          jmp    8049da8 <_ init +0x18>

The first number on the left is the absolute address in memory where the plt is supposed to be loaded. The series of two-digit hex numbers are the opcodes being fed to the CPU, and the instructions on the right are the disassembly of these code blocks. The first PLT entry is special and modified to point to the dynamic linker itself.

Symbol scoping

This particular design quirk of ELF has caused me and many other developers bad headaches. In ELF, you might expect that if your library has a DT_NEEDED entry for libfoo.so.1, and libfoo exports a function foo_function then calling foo_function from your code would result in control being transferred to libfoo.so.1 – but you’d be wrong.

In ELF every symbol is loaded into a global scope. That is, whenever a shared library is loaded, the symbols it exports are added to the end of a big global list, and every time the linker wishes to resolve a symbol it scans that list from head to tail looking for a match. This means that in fact the call to that function could end up anywhere, even deep inside the place you were being called from!

This problem tends to hit in the following ways:

  • Somebody copies and pastes an internal function from one library to another, but changes the way it works or its prototype as the code evolves. The second library is one day used in the same program as the first, purely by chance, and the two different versions of the function interfere with each other.

  • A library breaks backwards compatibility and uses a new soname. However, the program is still being wrongly linked to the old version of the library because it’s in use somewhere else by the program. For instance, consider the case of a C++ program like Inkscape which links against version 6 of the C++ standard library, and also GTKSpell, which in turn loads Enchant, which in turn loads GNU ASpell, which is also implemented in C++ and is linked against version 5 of the standard library. This doesn’t work because they conflict.

Symbol versioning and visibility

Symbol versioning is an attempt to fix this problem. It allows each symbol to be tagged with an extra piece of text that isn’t a part of the library API (that is, the way you write the code) but is part of the library ABI (that is, what the operating system sees). A symbol with a version tag is printed like this: foo_function@LIBFOO_1.2, and won’t match in a symbol scan against foo_function@LIBFOO_1.3.

Symbol visibility can help control the problem with people copying and pasting code then changing. Libraries and programs usually export their internals to the world, and these symbols are linked like any other. By setting the visibility of a symbol to “hidden” it’s taken out of this scheme, which quite apart from improving robustness also improves performance.

There are two ways to set symbol visibility – either at the same time as applying version tags, or independently of symbol versioning. To do it independently the -fvisibility=hidden switch can be given to GCC. To do it as part of the version tagging process, just use the * wildcard in the local: section of the version script. The GCC switch is more convenient for C++ users. Version scripts? Yes, that’s a file you feed to the “ld” compile time linker. A version script says what symbols should receive what version tag. You can set it from within the source code as well if you like, however, this is rather non-portable.

One symbol can have multiple versions. In other words, a single API can have multiple ABIs. This is a very exotic and GNU-specific feature, as a result, nearly nothing except glibc uses it. The ABI to use is decided at compile time, whichever is newest wins the fight and is “burned” into the binary. If that symbol version isn’t available at runtime the linker will refuse to start the program. This is not a very good versioning scheme in my opinion, because if a program is recompiled (as happens all the time in a world where compile == install) then it risks being linked against a symbol version that it’s not compatible with. It’s possible to select which symbol version you will be linked to in the source code ahead of time, however, the glibc headers don’t do that, so we have to force it using apbuild (this was the first of of our build-on-new, run-on-old tricks).

on a levels

August 22, 2006

Another year, another round of bullshittery from adults to teenagers over what they’re spending their time on.

A-Levels can be one of the most stressful things you’ve ever done at the time you take them (it was total hell for me), and the last thing the candidates need is for all the usual public wanking over whether the exams are getting easier or not. The problem is simple enough – students are told A-Levels measure ability, and are told “if you can do ABC you will get XYZ marks”. You can see the mark scheme in some courses. Yet when the results actually come out, a tidal wave of cynicism and outrage swamps the nation over the fact that more people got an A grade than last year. Universities bitch and moan about how they might actually have to work in order to identify the candidates that fit their preferred profile, old-timers whinge about how it was much tougher back-in-’t-day and meanwhile the stressed out and exhausted students get dumped on.

To make matters worse, exam boards actively moderate results to try and make them fit into the bell curve – it’s not uncommon to hear about students getting 100% on exams where they didn’t even answer all the questions. So the qualification is stuck in some no mans land where it’s not clearly a measure of what you know, and it’s not clearly a measure of what fraction of the population you fall into either. In fact it doesn’t measure anything at all, except perhaps how good you are at exams.

I got BBBC for my A Levels and fluked into Durham …. so according to some BBC News “Have Your Say” posters, I basically failed. On average I now earn more than most of the people posting such comments, which if you insist on picking an arbitrary metric to measure success seems a lot more useful than exam grades.

The lesson of all this? Standard school qualifications basically say sweet fuck all about what you can do, and politicians are under pressure to remove the “basically” from that sentence. it seems clearer every day that the smart ones are those who leave school at 16 to get apprenticeships and work their way up. I could have done that and still ended up at Google – they had no idea how degrees are graded in the UK and couldn’t really have cared less when I explained. No qualifications are necessary to do SRE work, though if you have a degree it seems they’ll forgive you for lacking a few years experience (good luck passing the interview process based only on what is taught in a degree though).

Photo by dcJohn

google

August 15, 2006

Hi ho, it’s off to work I go.

Mid-September time I’ll be moving to Switzerland to work for Google in their Zurich office, in the SRE (site reliability engineering) division. That means I’ll be a part of the team that ensures the Google services are always available and fast. It’s sort of sysadmin work on steroids – there is a fair bit of programming involved to develop and maintain the administration infrastructure, doing performance tests, gathering statistics etc.

I’m actually not going to Zurich straight away. There are 3 months of training in Mountain View first, and so I’ll be moving there in the new year. I don’t know how much time I’ll have in that 3 months, but after that it should not be too hectic and I’ll be able to spend some of my 20% time on autopackage and other projects.

One thing that has kind of annoyed me is that there is very little information out there on what the Zurich facility is like. Once I settle down there I hope to post some trivia so people who are thinking of working there and are trying to find out about it don’t have to make do with [the Heidi Song](http://douweosinga.com/blog/0601/2006Jan13_1). If I ever get fired, no prizes for guessing why ;)

summary of vinod khosla on ethanol

August 9, 2006

[Vinod Khosla explains his vision for the future of the automobile](http://www.theoildrum.com/story/2006/8/8/2049/64576)

This is a very thorough analysis of the future of the automobile, and ethanol specifically, from a venture capitalist who has invested a lot of money into different energy related companies. Because it’s quite long I’ll summarise it here. Any mistakes are my own. This is a radically stripped down version, if you’re interested or feel there is an inconsistency please do read the whole thing and comment at TOD. In fact, if you’re even slightly interested in the future of our transport infrastructure, you should read it anyway. Thanks to Vinod for writing it.

BASIC POINTS

  • Vinod Khosla has investments in many future energy related areas: several ethanol companies (both corn and cellulosic), 3 non-ethanol future fuel ventures, solar, high-efficiency lighting, low impact low cost housing and battery tech.
  • He believes in a pragmatic approach based on incremental steps that provide a future people can believe in. This rules out the hydrogen economy as there’s no way to convert the existing infrastructure incrementally. He has no time for doomers.
  • He has no particular beef with alternatives like public transport investment, he just believes the car isn’t going to disappear and ethanol is where he’ll get the best results for his time (and money). Focussing only on gasoline replacement for now.
  • Nobody sane believes corn/sugarcane based ethanol can be scaled up to be a complete replacement, and he is not interested in anti-ethanol arguments that are based on that. He sees it as a stepping stone to cellulosic ethanol which reduces the risk of investment in it. He wouldn’t be investing in cellulosic if the groundwork wasn’t being laid by corn based ethanol. (Mike says: cellulosic ethanol currently is not manufactured currently at commercial scales, however trial plants exist showing the basic process is workable)
  • Ethanol is competitive NOW, even without subsidies.

    • He claims oil has a lot of hidden subsidies, consider the cost of defending the canals tankers use
    • Oil companies have manipulated legislation to get a variety of esoteric subsidies
  • Sees basic progress as _ethanol-gasoline hybrid_ through _highly efficient internal combustion engines_ heading towards _plug in hybrids_ and perhaps eventually _all electric cars_, but that isn’t for a long time in the future.
  • Miscanthus is a better cellulosic feedstock crop than switchgrass (the one mentioned by Bush), but cellulosic can run off pretty much anything biological – including wood waste.
  • No realistic alternative

LAND USE

  • Newly discovered/tested crops such as Miscanthus do not require intensive fertiliser based agriculture as they don’t suck nutrients out of the soil in the same way. Avoiding monoculture still important.
  • Miscanthus yields 16.5 dry tons/acre/year whereas switchgrass averaged 4.6 dry tons/acre/year
  • The DOE estimates that this scenario of collection of existing biomass with only small change in agricultural practices could generate 1.3 billion tons of biomass in the US
  • Drops in meat consumption would free land that could be used to grow biofuel crops. Miscanthus/switchgrass can grow in places that were traditionally not considered arable, further increasing potential yields.

ENERGY EFFICIENCY

  • Energy Return on Energy Invested (EROEI) is the wrong question to ask. Fossil energy used per mile driven is the right metric.
  • Most fossil fuel currently used in (corn) ethanol production is natural gas, not oil.
  • Engine efficiency improvements have a big part to play.
  • Claims even corn ethanol from a typical natural gas powered corn ethanol plant today has almost twice the energy balance of petroleum.

OTHER FUELS

  • He likes bio-butanol and similar fuelds. They are superior to ethanol in many ways but will use the same or similar feedstocks and run interchangeably with gasoline or ethanol in today’s engines. Some will require minor engine modifications and others will not. But if we tried to legislate butanol today it probably would not happen.
  • Why not biodiesel? Even though biodiesel has a better energy balance than corn ethanol, he doesn’t believe it will scale beyond hundreds of gallons per acre of land. Ethanol, butanol and any liquid fuel that has a shot at replacing our gasoline needs has to scale up to 2,000-3,000 gallons per acre.

FINAL THOUGHTS

Failure to look at these possibilities is a failure of imagination. We need to thread the needle between the “what is… and let’s extrapolate today’s corn ethanol production” tunnel vision crowd (generally done with an agenda – The American Petroleum Institute is concerned enough about world food supply and our welfare to have issued press releases on it!) and the “pie-in-the-sky academics who will dream up “what can be” without consideration for the path dependence of the choices we make and the pragmatics of how business works and what is needed to make a revolutionary future happen through a series of evolutionary steps. We should take our lesson from nature – from single celled amoeba to complex humans was an evolutionary process. It is the only way we can get billions of dollars deployed to make this alternative future happen.

on the performance of garbage collection

August 8, 2006

[Apple added garbage collection to Objective-C 2](http://it.slashdot.org/article.pl?sid=06/08/07/2126253)

Well, this was foreseen a long time ago. They’ve been talking about this for a long time now, and with it Apple finally haul their aging NeXT development platform into the 21st century. With GC and the new X-Codes Objective-C is starting to look less like a fossil and more like a modern, useful tool.

In the referenced Slashdot discussion there was a lot of talk about the performance of GC and comparing it to Java, which I now think is a mistake. I used to do this too because Java is really what introduced GC to mainstream C/C++ style software development, so many people now associate GC with Java style memory bloat. If you use IntelliJ IDEA then you have experienced the pain of waiting 5 seconds for a GC to take place. Other people experienced similar pain – problems related to memory usage – and now associate manual memory management with “fast but hard” and GC with “easy but slow”. This sort of correlation makes intuitive sense ….. no pain no gain, right …. but is wrong.

Why is desktop Java slow?

Because it uses lots of resources. In server side scenarios where you are in control of the RAM budget this isn’t as big a deal, but for desktops you have no idea in advance how much memory you have to play with. It will differ between machines. It will probably change during runtime as the user starts and quits other apps.

Java programs use lots of resources because the design of the Java language encourages profligate use of expensive constructs like objects and arrays that know their own length. An object is a heavy thing, especially as Java requires every object to have a vtable and lock. Yet even the most basic, small thing in Java *must* be an object.

The net result of this policy is that a typical Java heap contains bazillions of very small objects, many of which are dead. This observation led to the development of generational garbage collectors when the real solution is to simply use fewer objects!

The Java coding style is one of its better achievements – all Java code tends to look and feel pretty much the same, unlike in C++ or Python. Unfortunately most universities and courses teach students a very particular style of coding which is inefficient. The most obvious problem is the idea that public fields are bad, and that a data structure must use accessor methods. I frequently lost marks in my university course because I didn’t “encapsulate” my classes enough, by which they meant there weren’t enough get/set methods. Every get/set method increases code size and increases pressure on the JVMs inliner (assuming it inlines the methods at all, which it might not).

There are other problems like no ability to stack allocate non-primitives, leading to even more dead/small objects on the heap, and an over-reliance on threading (threads are expensive). But the one that gave GC a bad name is not really the fault of GC at all – it’s the fault of a language and coding style that encourage the massive over-use of small heap allocated temporary objects.

Why is desktop C++ fast?

Well, it’s not because it uses refcounting, which can lead to ‘death by a thousand cuts’ due to constant memory references.

Mostly it’s because C++ gives you a lot of ways to help the compiler produce efficient code, and the C++ coding style encourages very tight programming. Objects that do nothing except store a few variables are replaced with structs, saving the vtable. Only methods that need to be marked virtual are. Short-lived objects are put on the stack (which is hot in the cache). Very small objects are shunned entirely. A method that returns a complex result might do so by copying a region of the stack upwards, rather than allocating the result on the heap and passing back a pointer, which is used once then discarded.

Because C++ gives you these options and encourages their use, well written code can handily beat Java equivalents. How many popular instant messengers are written in Java vs C++?

GCd C++, a happy medium

You can have garbage collected C++, and even convert a codebase that uses manual memory management over to GC as the Inkscape guys are doing (though it can be painful and amplify existing leaks). A well known GCd codebase is Unreal Engine 3. In garbage collected C++, the main heap is managed by the GC and objects are deleted when they are no longer reachable just like in Java. The difference is that because there aren’t that many objects actually live on the heap, the performance impact is small. It can even be faster than the original.

people ready e-business?

August 2, 2006

I’m not normally one to take the piss out of Microsoft, but their “People Ready” website is such a great example of [web economy bullshit](http://dack.com/web/bullshit.html) that you have to laugh … found [here](http://www.microsoft.com/business/peopleready/overview/greatestasset.mspx):

View the story of a top executive who focuses on processes and efficiency but forgets his people. Key employees quit. Customers look elsewhere. The business suffers. But the executive learns how well-chosen new technology can help motivate people and accelerate the business by connecting people to information on customers, inventory, shipping, and to each other. The lesson: Empower your people with the right tools and there’s no limit to what they can achieve.

Just jaw-dropping, really. The first few sentences could have been describing Microsoft. Unfortunately the “happy ending” of this story is apparently not that the CEO remembers what his key assets are, but instead buys a new version of Office which motivates those who are left!