Win32 Is The Only Stable ABI on Linux

2022-08-15, by ivyl

TL;DR

DT_HASH and DT_GNU_HASH

In the ELF format there are two ways of providing a hash table of symbols: DT_HASH and DT_GNU_HASH. The first one is part of the SYSV generic ABI, is well documented, and is marked as mandatory. DT_GNU_HASH is a newer, smaller, and faster replacement that is not documented; it is implemented in a few places (Glibc, GNU ld, Musl, LLVM, mold, etc.) and is the subject of a couple of blog posts that I’ve found. It turns out it also doesn’t provide the same functionality as DT_HASH: to get the number of symbols you have to parse the whole table.
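To make the difference concrete, here is a minimal Python sketch of how a consumer could recover the symbol count from each table. It assumes little-endian data and the commonly described DT_GNU_HASH layout (a four-word header, the bloom filter, the buckets, then the chain); the function names and the bloom_word_size parameter are made up for this illustration and are not taken from any of the affected projects.

```python
import struct


def sysv_symbol_count(table: bytes) -> int:
    """DT_HASH: the header is (nbucket, nchain) and nchain is the symbol count."""
    _nbucket, nchain = struct.unpack_from("<II", table, 0)
    return nchain


def gnu_symbol_count(table: bytes, bloom_word_size: int = 8) -> int:
    """DT_GNU_HASH: no count is stored, so walk to the end of the last chain.

    bloom_word_size is 8 for ELF64 and 4 for ELF32 (assumed layout).
    """
    nbuckets, symoffset, bloom_size, _bloom_shift = struct.unpack_from("<IIII", table, 0)
    buckets_off = 16 + bloom_size * bloom_word_size
    buckets = struct.unpack_from(f"<{nbuckets}I", table, buckets_off)
    chain_off = buckets_off + 4 * nbuckets

    # The highest bucket value is the first symbol of the last chain.
    last = max(buckets, default=0)
    if last == 0 or last < symoffset:
        # Every bucket is empty: only the unhashed symbols exist.
        return symoffset

    # Chain entries correspond to symbols starting at symoffset; the lowest
    # bit of an entry marks the end of that chain.
    while True:
        (entry,) = struct.unpack_from("<I", table, chain_off + 4 * (last - symoffset))
        if entry & 1:
            return last + 1
        last += 1
```

With DT_HASH the answer is a single read from the header. With DT_GNU_HASH the reader has to skip the bloom filter, scan every bucket, and then follow the last chain to its end marker.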

It’s worth noting that unless you are building ELFOSABI_NONE binaries you are not forced to follow the generic ABI. Glibc uses ELFOSABI_GNU so technically not including DT_HASH is fine.

For a very long time Glibc forced its build to include both sections, claiming to maintain compatibility. That was recently dropped, and the build now relies on “the defaults”.
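If you want to check which hash styles a particular shared object ships, the dynamic section tells you. The following is a rough Python sketch, limited to little-endian ELF64 files; hash_styles is a hypothetical helper written for this post, and it only looks at the DT_HASH and DT_GNU_HASH tags in the PT_DYNAMIC segment.

```python
import struct

DT_NULL, DT_HASH, DT_GNU_HASH = 0, 4, 0x6FFFFEF5
PT_DYNAMIC = 2


def hash_styles(elf: bytes) -> set:
    """Return a subset of {"sysv", "gnu"} based on the dynamic section tags."""
    if elf[:4] != b"\x7fELF" or elf[4] != 2 or elf[5] != 1:
        raise ValueError("this sketch only handles little-endian ELF64")

    # ELF64 header fields: e_phoff at 0x20, e_phentsize and e_phnum at 0x36.
    (e_phoff,) = struct.unpack_from("<Q", elf, 0x20)
    e_phentsize, e_phnum = struct.unpack_from("<HH", elf, 0x36)

    found = set()
    for i in range(e_phnum):
        off = e_phoff + i * e_phentsize
        p_type, _flags, p_offset, _vaddr, _paddr, p_filesz = struct.unpack_from(
            "<IIQQQQ", elf, off
        )
        if p_type != PT_DYNAMIC:
            continue
        # Each dynamic entry is a (d_tag, d_val) pair of 8 bytes each.
        for pos in range(p_offset, p_offset + p_filesz, 16):
            d_tag, _d_val = struct.unpack_from("<qQ", elf, pos)
            if d_tag == DT_NULL:
                break
            if d_tag == DT_HASH:
                found.add("sysv")
            elif d_tag == DT_GNU_HASH:
                found.add("gnu")
    return found
```

Calling it on the bytes of your distribution's libc.so.6 (the path varies between distributions) shows whether that build still carries DT_HASH. Running readelf -d on the same file and looking for the HASH entries should give the same answer.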

The Regression(s)

This problem was first noticed by users of rolling-release distros when games using EAC EOS stopped working. There’s also an open source project that has regressed: a frame rate limiter called libstrangle. The non-EAC game Shovel Knight is broken as well.

Those are only a few of the breakages that users of the more bleeding-edge distros found shortly after the Glibc 2.36 release. I suspect there will be more broken games (and other software) once 2.36 hits the mainstream.

To ABI Or Not To ABI

I suggest you read through all the linked discussions and develop your own opinion on who is responsible for what exactly.

Personally I share Linus’ opinion that changing ABI is okay as long as no one has noticed, but once it gets noticed - then it’s a regression. Once it’s a thing people depend on, it becomes a feature.

Sadly, not everything can easily be recompiled or modified to accommodate the change.

Shifting the problem downstream onto the distributions is also problematic and adds to fragmentation.

I know that DT_GNU_HASH has been around for 16 years already and that most distributions have switched to using it, and only it, by default. For those 16 years, it was Glibc that provided the compatibility and overrode the defaults for everyone, and there were never any easy-to-spot deprecation warnings. It’s also unrealistic to expect every ELF consumer to keep up with undocumented developments of the format.

Why Is --hash-style=gnu The Default?

Kinda by accident? GCC’s default behavior is to allow linkers to do whatever they want. The GNU Linker defaults to “both”

ld --help | grep hash-style
  --hash-style=STYLE          Set hash style to sysv/gnu/both.  Default: both

while mold defaults to “sysv”. So why are most things built with --hash-style=gnu? On a lot of distributions gcc -v claims that

Configured with: /build/gcc/src/gcc/configure ... --with-linker-hash-style=gnu ...

which seems to be a leftover from the days long gone when DT_GNU_HASH was so young that ld’s default was still sysv. Distributions wanted the promised speedup, so they simply forced this and it stuck.

This seems to spread further: clang does distro detection and seems to mimic whatever the distro is doing with its default gcc.

Final Thoughts

I think this whole situation shows why creating native games for Linux is challenging. It’s hard to blame developers for targeting Windows and relying on Wine + friends. It’s just much more stable and much less likely to break and stay broken.