Golang is Trash

Datetime: 2016-08-22 22:52:25  Topic: Golang, Assembler

In the process of getting cgo to work on illumos, I learned that golang is trash.  I should mention that I have no real opinion on the language itself; I’m not a languages person (C will do, thanks), and that’s not the part of the system I have experience with anyway.  Rather, the runtime and toolchain are the parts on which I wish to comment.

Fundamentally, it’s apparent that gccgo was always the right answer.  The amount of duplication of effort in the toolchain is staggering.  Instead of creating (or borrowing from Plan9) an “assembly language” with its own assembler, “C” compiler (but it’s not really C), and an entire “linker” (that’s not really a linker nor a link-editor but does a bunch of other stuff), it would have been much better to simply reuse what already exists.  While that would have been true anyway, a look at the quality of the code involved makes it even clearer.  For example, the “linker” is extremely crude and is incapable of handling many common link-editing tasks such as mapfile processing, .dynamic manipulation, and even in some cases simply linking archive libraries containing objects with undefined external references.  There’s no great shame in that if it’s 1980 and we don’t already have full-featured, fairly well debugged link-editors, but we do.  Use them.

But I think the bit that really captures the essence of golang, as well as the pseudointellectual arrogance of Rob Pike and everything he stands for, is this little gem:

Instructions, registers, and assembler directives are always in UPPER CASE to remind you that assembly programming is a fraught endeavor.

Wait, what?  Are you being paternalistic or are you just an amateur?  Writing in normal (that is, adult) assembly language is not fraught at all.  While Mr. Pike was busying himself with Plan9, the rest of us were establishing ABIs, writing thorough processor manuals, and creating good tools that make writing and debugging assembly no more difficult (if still somewhat slower) than C.  That said, however, writing in the Fisher-Price “assembly language” that golang uses may very well be a fraught endeavor.  For starters, there’s this little problem:

The most important thing to know about Go’s assembler is that it is not a direct representation of the underlying machine.

Um, ok.  So what you’re telling me is that this is actually not assembly language at all but some intermediate compiler representation you’ve invented.  That’s perfectly acceptable, but there’s a good reason that libgcc isn’t written in RTL.  It gets better, though: if you’re going to have an intermediate representation, you’d think you’d want it to be both convenient for the tools to consume and sufficiently distinct from anything else that no human could possibly confuse it with any other representation, right?  Not if you’re working on Plan9!  Without the benefit of decades of lessons learned running Unix in production (because Unix is terrible and why would anyone want that instead of Plan9?), apparently such obvious thoughts never occurred to them, because the intermediate representation is almost a dead ringer for amd64 assembly!  For example:

TEXT runtime·munmap(SB),NOSPLIT,$0
	MOVQ addr+0(FP), DI // arg 1 addr
	MOVQ n+8(FP), SI // arg 2 len
	MOVL $73, AX
	SYSCALL
	JCC 2(PC)
	MOVL $0xf1, 0xf1 // crash
	RET

This is a classic product of technical hubris: it borrows enough from adult x86 assembly to seem familiar to someone knowledgeable in the field, but has enough pointless differences to be confusing.  Clearly these are well-known instructions, with the familiar addressing modes and general syntax.  But wait: DI is a register?  How is that distinct from the memory location referred to by the symbol DI?  Presumably these registers simply become reserved words and all actual variables must be lower-case.  That would be fine if not for the fact that the author of a module does not own the namespace of symbols in other objects with which he may need to link his.  What then?  Oh, of course; I need to use the fake SB register for that, just like I don’t in any real assembly language.  But it gets worse: what’s FP?  Your natural assumption, knowing the ABI as one should, is that it’s a genericised reference to %rbp, the conventional frame pointer.  WRONG!  The documentation instead refers to it as a “virtual frame pointer”; in fact, rbp is usually used by the compiler as a general register, just as if you had foolishly built your code with gcc using -fomit-frame-pointer.  Thanks, guys: confusing and undebuggable!  We could go on, detailing the pointless divergence from actual x86 assembly and the failure to genuinely abstract registers and instructions in a way that would allow this “intermediate representation” to be generic across ISAs, but I think by this point it’s plain enough that this entire chunk of the toolchain is simply rubbish.  The real toolchains everyone else uses were not invented at Lucent nor Google, so obviously they needed their own, written in seclusion with all the benefits of a 1980 worldview.
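To make the “virtual frame pointer” offsets in the munmap example a little more concrete, here is a minimal sketch in Go.  It assumes the Go-side declaration looks something like munmap(addr unsafe.Pointer, n uintptr), and it assumes the stack-based calling convention that hand-written assembly sees lays arguments out the way a struct with the same fields would be laid out.  The munmapArgs type, the wordSize constant, and the argOffsets helper are hypothetical names invented for this illustration; none of them come from the Go runtime.

```go
package main

import (
	"fmt"
	"unsafe"
)

// wordSize is the size of a pointer-sized integer on the build target
// (8 on amd64).
const wordSize = unsafe.Sizeof(uintptr(0))

// munmapArgs is a hypothetical struct standing in for the argument frame of
// an assumed declaration munmap(addr unsafe.Pointer, n uintptr). The
// assumption is that arguments appear relative to the pseudo-register FP at
// the same offsets these fields have within the struct.
type munmapArgs struct {
	addr unsafe.Pointer
	n    uintptr
}

// argOffsets reports the byte offset of each argument within the frame.
func argOffsets() (addrOff, nOff uintptr) {
	var a munmapArgs
	return unsafe.Offsetof(a.addr), unsafe.Offsetof(a.n)
}

func main() {
	addrOff, nOff := argOffsets()
	// Print the operands in the same name+offset(FP) form the assembler uses.
	fmt.Printf("addr+%d(FP)\n", addrOff)
	fmt.Printf("n+%d(FP)\n", nOff)
}
```

Under those assumptions the second argument sits one machine word past the first, which is where the addr+0(FP) and n+8(FP) operands in the amd64 listing come from.  Note that nothing here involves %rbp; the offsets are relative to the caller-supplied argument area, not to the hardware frame pointer.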

The last fun bit I wish to discuss is that funny little character between “runtime” and “munmap” in our previous example.  You see, despite having written their own entire toolchain (including a compiler identifying itself as accepting C that does no such thing), the authors decided that the normal “.” character was simply too special to be repurposed in source code.  Instead, it would retain its existing meaning as the customary dot operator.  But this means some other character will be needed to identify symbols that should have a dot in their names.  So obviously the natural choice here is some high Unicode character.  Obviously.  And equally obviously, when such code is compiled, the character is replaced in symbol names with an ordinary dot.  Of course!
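For the curious, the substitution described above can be sketched in a few lines of Go.  This is only an illustration of the renaming, not the toolchain’s actual code: the objectSymbolName helper and the symbolReplacer variable are hypothetical names.  The middle dot is U+00B7; as far as I know the Go assembler documentation also describes the division slash (U+2215) being rewritten to an ordinary slash in the same way, so the sketch handles both.

```go
package main

import (
	"fmt"
	"strings"
)

// symbolReplacer maps the two special characters allowed in assembly
// identifiers to the plain ASCII characters that end up in symbol names:
// middle dot (U+00B7) becomes '.', division slash (U+2215) becomes '/'.
var symbolReplacer = strings.NewReplacer("\u00b7", ".", "\u2215", "/")

// objectSymbolName is a hypothetical helper that returns the name a symbol
// written in assembly source would carry after the rewrite.
func objectSymbolName(src string) string {
	return symbolReplacer.Replace(src)
}

func main() {
	src := "runtime\u00b7munmap" // as written in the assembly source
	fmt.Println(src, "->", objectSymbolName(src))

	// Show the code point of the character standing in for the dot.
	for _, r := range "\u00b7" {
		fmt.Printf("U+%04X\n", r)
	}
}
```

So the name typed in the source file and the name that shows up in a symbol table differ by one character that most keyboards cannot produce directly.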

It’s no surprise that the golang people want to replace the “C” parts of the runtime.  I would, too; the “C” language accepted by the Plan9 compiler is not really C.  The compiler has no concept of a function pointer being equivalent to the function itself, or for that matter such similarly obscure aspects of the C standard as the constant identifier NULL (instead of NULL, one must write “nil”, quite possibly the most obnoxiously spurious product of NIH thinking I have ever seen in my life).  But the problems with the golang toolchain and runtime go far beyond an idiosyncratic C dialect; the same thinking behind that oddity permeates the entire work.  Everything about the implementation of the language environment feels amateurish.  The best thing they could do at this point is start working on golang 2.0, with the intent to completely discard the entire toolchain and much of the runtime.  Rewriting more of the runtime in Go is fine, too, but it’s critical that the language and compilers be mature enough to enable bootstrapping in some sensible way (perhaps the way that gcc builds libgcc with the new compiler, not relying in any way on installed tooling to build the finished artifact).  There’s no obvious reason to turf the entire language, but reimplementing it sanely would be a huge benefit to their ecosystem.  Every system already has good tools for assembling and linking code, and those tools support ABIs that enable easy reuse of external software.  The Plan9 crowd needs to spend time appreciating why this is so instead of arrogantly ignoring it.  A sane implementation that leverages those tools would make the Go language far more attractive.




