Languages that target the JVM
Way back, Java took a leap by specifying a virtual machine and a portable byte code format for developing portable code. While it can be endlessly debated whether its motto “write once, run anywhere” actually came true or ended up being “write once, debug everywhere”, the underlying VM and byte code format have become the compilation target of some modern functional programming languages such as Clojure and Scala. While these languages leverage the existing Java ecosystem, they’ve nevertheless had to fight the impedance mismatch between what their ideal runtime environment would require and what they get with the JVM and Java byte code. To some extent the JVM/byte code developers have responded by adding features such as dynamic invocation (the invokedynamic instruction) to better support late binding languages, which Java itself isn’t.
Thus, the JVM and Java byte code provide a common, robust platform and ecosystem on which to build services, though they have gained far less traction on the client side, despite experiments such as Java Web Start.
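To make the “dynamic invocation” support mentioned above a little more concrete, here is a minimal sketch using the java.lang.invoke API that arrived on the JVM alongside invokedynamic. It does not emit the invokedynamic instruction itself. It only shows the library-level idea of resolving a method by name at run time, which is what late binding languages need. The class name LateBindingSketch and the helper callByName are made up for this illustration.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class LateBindingSketch {
    // Hypothetical helper: resolves a public no-argument method by name
    // at run time and invokes it on the given receiver.
    public static Object callByName(Object receiver, String methodName, Class<?> returnType) {
        try {
            MethodHandles.Lookup lookup = MethodHandles.publicLookup();
            // Describes a method taking no arguments and returning returnType.
            MethodType type = MethodType.methodType(returnType);
            MethodHandle handle = lookup.findVirtual(receiver.getClass(), methodName, type);
            // invoke() adapts argument and return types (including boxing) as needed.
            return handle.invoke(receiver);
        } catch (Throwable t) {
            throw new RuntimeException("could not call " + methodName, t);
        }
    }

    public static void main(String[] args) {
        System.out.println(callByName("hello", "length", int.class));         // prints 5
        System.out.println(callByName("hello", "toUpperCase", String.class)); // prints HELLO
    }
}
```

A dynamic language runtime would typically cache such handles per call site rather than look them up on every call, which is roughly the job invokedynamic was designed to make cheap.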
For a couple of years now, Google has had an initiative to provide a secure sandbox within its Chrome browser for running native x86 machine code, targeted at client side applications that carry high computational loads, such as audio, video and image processing. This is NaCl - short for “Native Client”. While NaCl being an x86 code based environment might suggest that any language that compiles to x86 machine code can be used within NaCl, its tool chain currently limits it to C/C++, though language support is growing.
LLVM and what it’s got to do with all this
LLVM - originally short for “Low Level Virtual Machine” - is billed as a compiler infrastructure project and provides a toolkit for working with a generic low level assembly language and bitcode format with pluggable optimization passes. Apple funds as well as contributes to this open source project. LLVM saw interesting applications in Mac OS X such as optimizing the performance of OpenGL driver calls on the fly.
The idea behind LLVM bitcode and assembly language has its parallels with Java byte code and the Java language, though its specification deals with computation at a much lower level and does not, for example, mandate a specific model of “objects” like Java byte code does.
Since the initial work on the compiler toolkit and optimization modules, LLVM has gained front ends for many languages including C, C++ and Objective-C, and is pretty much the default choice for new experimental languages such as Mozilla’s Rust and Apple’s Swift. Chris Lattner, who started and heads the LLVM project, is the one behind Apple’s Swift. Haskell, a language that’s much older than the LLVM project, has also grown an LLVM backend for its flagship GHC compiler.
Google’s NaCl x86 based sandbox now has a portable sibling called PNaCl - or “Portable NaCl”. PNaCl makes it possible for a browser application developer to ship architecture independent code to the browser that can run at near native speeds.
How does PNaCl achieve its portability and performance? The “architecture independent bitcode” is LLVM bitcode, and the PNaCl system in Chrome compiles this bitcode package to high performance native code ahead of execution time. This way, you can distribute the same code to x86 as well as ARM systems. While you can ship code to the x86 NaCl only via the Chrome Web Store, PNaCl is available today, enabled on the latest desktop Chrome browsers on all platforms, and you can target it without going via the Chrome Web Store. This strategy is similar to Java’s byte code system, except that the compilation to native code is done ahead of time instead of just in time.
About time for some unification on the client side
We’ve seen how LLVM is creeping under the hood of many client-side systems today. LLVM bitcode isn’t perfect for all languages - for example, general tail call elimination isn’t something LLVM IR guarantees for arbitrary calls, which disadvantages Scheme implementations a bit, though tail recursion to loop conversion can be done. Despite that, LLVM has become the backend of choice for high performance code across a wide variety of systems and is now shipping in Safari, Chrome and (indirectly) Firefox.
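The “tail recursion to loop conversion” mentioned above can be sketched by hand. The example below is written in Java purely for illustration and is not LLVM output. The class and method names are made up, and Java compilers typically do not perform this rewrite themselves. It shows a self-recursive function whose recursive call is its last action, next to the loop an optimizer could turn it into.

```java
public class TailCallSketch {
    // Tail-recursive form: the recursive call is the very last action,
    // so nothing from the current frame is needed after it returns.
    public static long sumTo(long n, long acc) {
        if (n == 0) {
            return acc;
        }
        return sumTo(n - 1, acc + n);
    }

    // Loop form: the same computation with the self-call replaced by
    // updating the parameters and jumping back to the top.
    public static long sumToLoop(long n, long acc) {
        while (n != 0) {
            acc = acc + n;
            n = n - 1;
        }
        return acc;
    }

    public static void main(String[] args) {
        System.out.println(sumTo(1000, 0));          // prints 500500
        System.out.println(sumToLoop(1_000_000, 0)); // prints 500000500000
    }
}
```

The loop version uses constant stack space regardless of n. This rewrite only covers a function calling itself, though. A Scheme program that chains tail calls between different functions needs general tail call elimination, which is the harder case referred to above.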
Conclusion - LLVM and the JVM
All this brings us to the concluding point - that today we have high performance client-side and server-side code being backed by two systems - LLVM and the JVM - with corresponding architecture independent code formats - LLVM bitcode and Java byte code. With recent systems programming languages such as Rust and Go capable of serving the construction of reliable high performance services, I anticipate that LLVM-based services will grow to gain a share comparable to what Java has achieved in the services world.
This bodes very well for accelerating the creation of tools optimized more for programming productivity, safety, reliability, concurrency and distribution in the near (and far) future since such systems need only target LLVM to gain ubiquity. Furthermore, it may also become possible for these systems to share libraries in a way that has not been possible before, except perhaps on the JVM platform.
Overall, being a polyglot who is always on the lookout for better tools to think with, I’m pretty excited for the diverse programming ecosystem that’s emerging today.