Rust: Importing C constants: Proof of Concept


A proof-of-concept rustc plugin that imports C macro constants from C include files at compile time.

Motivational example usage


#![feature(plugin, custom_attribute)]

#[c_import(h = "limits.h", INT_MAX)]
mod limits {
    // This module is empty now, but the plugin will import
    // requested constants in here at compile time.
}

fn main() {
    println!("Yay, got INT_MAX: {}", limits::INT_MAX);
}

Intro: AST won’t cut it

This is probably fairly obvious to most people, but I feel I should mention it anyway: constants defined as macros in C aren’t easy to import into Rust because of everything they can contain. Just off the top of my head, these are some of the annoyances a C macro constant might involve:

  • Other macros that need to be looked up,
  • other macros with arguments,
  • arbitrarily complex arithmetic (bit shifts are common, etc),
  • enum variants, and
  • string concatenations and ‘stringifications’ of macro arguments.

The same is true of enum variants, not to mention that any code can be enabled or disabled by preprocessor conditionals.

It’s also worth noting that C employs automatic integer promotion that makes evaluating arithmetic expressions composed of literals quite difficult.
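
To make this concrete, here is a small Rust sketch contrasting a naive evaluation of the C expression `-1 < 1u` with what C’s implicit conversions produce. It assumes a 32-bit `int`, and both function names are made up for illustration.

```rust
/// Naive evaluation: treat both literals as wide signed integers.
fn naive_less_than(a: i64, b: i64) -> bool {
    a < b
}

/// Roughly what C does for `int < unsigned int` (assuming a 32-bit int):
/// the signed operand is converted to unsigned before the comparison.
fn c_less_than_int_uint(a: i32, b: u32) -> bool {
    (a as u32) < b
}

fn main() {
    println!("naive:   -1 < 1u is {}", naive_less_than(-1, 1));
    println!("C rules: -1 < 1u is {}", c_less_than_int_uint(-1, 1));
}
```

An evaluator that ignores these rules disagrees with the compiler, which is one more reason to let a real compiler do the work.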

These are the reasons why obtaining an AST of a header file isn’t really that helpful when trying to import macro constants: a lot of the work that the compiler normally does is still left to do. The conclusion is simple: you basically need to compile the code.

This has implications for the tools that can be used. For example, the rust-bindgen utility uses Clang’s libclang C interface, which is limited to -fsyntax-only and won’t compile anything for you. Unfortunately, the same is true of Clang’s LibTooling API. Therefore I decided to use the underlying native Clang/LLVM API (I don’t know if there’s a specific name for it) even though it’s not stable. (Honestly, I’m not sure exactly how unstable the API is; I hope it’s stable at least across patch versions.)

Update: It turns out the LibTooling API can be coerced into not passing the -fsyntax-only option. This is what clang-check does, for example. However, the way the LibTooling API is designed, it’s impossible to obtain the resulting LLVM IR without a global variable, so I’m still not going to use it.

The plugin

The idea is pretty simple: The plugin is passed a header file name and a list of constants that the user wishes to import. It then composes a simple C++11 snippet that includes the header file and for each macro constant it defines a variable initialized by that macro.

Why C++11? Because C++11 has type inference via auto. Without that the type would either have to be inferred somehow else (how?) or specified by the user, none of which seemed like a good idea to me.

The snippet is then compiled by Clang/LLVM into an LLVM IR module, which can be enumerated fairly easily. The values as well as the inferred types are extracted from the module, and a Rust AST is generated based on that information.
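
As a rough illustration of that last step, the sketch below turns an extracted name, integer type and value into a Rust constant. It renders source text rather than an AST to keep things readable. The `ImportedConst` record, its fields and the function names are all hypothetical, not the plugin’s actual code.

```rust
/// Hypothetical record describing one constant recovered from the compiled module.
struct ImportedConst {
    name: String,
    bits: u32,
    signed: bool,
    value: i128,
}

/// Map an integer width and signedness to a Rust primitive type name.
fn rust_type(bits: u32, signed: bool) -> Option<&'static str> {
    match (bits, signed) {
        (8, true) => Some("i8"),
        (16, true) => Some("i16"),
        (32, true) => Some("i32"),
        (64, true) => Some("i64"),
        (8, false) => Some("u8"),
        (16, false) => Some("u16"),
        (32, false) => Some("u32"),
        (64, false) => Some("u64"),
        _ => None,
    }
}

/// Render the constant as a Rust item; returns None for unsupported widths.
fn render_const(c: &ImportedConst) -> Option<String> {
    let ty = rust_type(c.bits, c.signed)?;
    Some(format!("pub const {}: {} = {};", c.name, ty, c.value))
}

fn main() {
    // Example input, assuming a platform where int is 32 bits wide.
    let c = ImportedConst {
        name: "INT_MAX".to_string(),
        bits: 32,
        signed: true,
        value: 2147483647,
    };
    if let Some(item) = render_const(&c) {
        println!("{}", item);
    }
}
```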

For example, this is the snippet the plugin generates for the above example:

extern "C" {
#include "limits.h"
}

auto rust_import_INT_MAX = INT_MAX;
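
Composing a snippet of this shape could look roughly like the following. This is a minimal sketch: the `rust_import_` prefix is taken from the example above, while the `compose_snippet` function itself is an assumption for illustration.

```rust
/// Build a C++11 snippet that includes `header` and defines one
/// `auto` variable per requested macro constant.
fn compose_snippet(header: &str, names: &[&str]) -> String {
    let mut src = String::new();
    src.push_str("extern \"C\" {\n");
    src.push_str(&format!("#include \"{}\"\n", header));
    src.push_str("}\n\n");
    for name in names {
        // `auto` lets the C++ compiler infer the type of the macro's value.
        src.push_str(&format!("auto rust_import_{} = {};\n", name, name));
    }
    src
}

fn main() {
    print!("{}", compose_snippet("limits.h", &["INT_MAX"]));
}
```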

Annoyances along the way

Compiler APIs aren’t exactly the easiest to get the hang of, and Clang is definitely no exception. A notorious problem when working with the Clang API is Clang’s builtin headers. Normally, the compiler locates these by first finding its own location (the binary file) and then following a hardcoded relative path from there. That makes sense for the compiler, but it makes life difficult for API users. I use a build script that invokes clang to find the builtins path as well as the other system include paths.
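
One way such a build script could recover the include paths is to run something like `clang -E -x c++ - -v` with empty input and parse the search list that clang prints in its verbose output. The sketch below shows only the parsing part. It assumes that approach (the actual build script may work differently), and the sample paths in `main` are made up.

```rust
/// Extract the `#include <...>` search directories from clang's verbose output.
fn parse_include_paths(verbose_output: &str) -> Vec<String> {
    let mut paths = Vec::new();
    let mut in_list = false;
    for line in verbose_output.lines() {
        if line.starts_with("#include <...> search starts here:") {
            in_list = true;
        } else if line.starts_with("End of search list.") {
            in_list = false;
        } else if in_list {
            // Entries are indented by a space; skip empty lines.
            let path = line.trim();
            if !path.is_empty() {
                paths.push(path.to_string());
            }
        }
    }
    paths
}

fn main() {
    // Made-up sample of what the verbose output might contain.
    let sample = [
        "#include \"...\" search starts here:",
        "#include <...> search starts here:",
        " /usr/lib/clang/3.6.0/include",
        " /usr/include",
        "End of search list.",
    ]
    .join("\n");
    for path in parse_include_paths(&sample) {
        println!("{}", path);
    }
}
```

On some platforms the entries carry extra annotations (for example framework directories on macOS), which a real script would need to strip.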

The second problem is rustc, specifically the fact that it has parts of LLVM 3.3 statically linked into it. This makes it impossible for plugins to link to LLVM, even when using dlopen(), because LLVM apparently performs static initialization of some static data (I bumped into this problem here).

For that reason I decided to split off the Clang/LLVM part into a separate backend binary that the plugin invokes as a process and communicates with. This makes the linking problems go away, and the backend binary (called clang-rust-import) can also be easily tested separately from rustc.

The plugin talks to the backend process in JSON.
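
A minimal sketch of such an exchange, using only the standard library, might look like this. The JSON field names (`header`, `constants`) and all function names are invented for illustration, since the real message format isn’t described here.

```rust
use std::io::{self, Write};
use std::process::{Command, Stdio};

/// Quote a string as a JSON string literal (handles quotes, backslashes, control chars).
fn json_string(s: &str) -> String {
    let mut out = String::from("\"");
    for ch in s.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Build a request in a hypothetical schema: {"header": ..., "constants": [...]}.
fn build_request(header: &str, names: &[&str]) -> String {
    let list: Vec<String> = names.iter().map(|n| json_string(n)).collect();
    format!(
        "{{\"header\":{},\"constants\":[{}]}}",
        json_string(header),
        list.join(",")
    )
}

/// Send the request to a backend process on stdin and return what it prints on stdout.
#[allow(dead_code)]
fn query_backend(program: &str, request: &str) -> io::Result<String> {
    let mut child = Command::new(program)
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .spawn()?;
    // The stdin handle is dropped at the end of this statement,
    // so the child sees end of input.
    child
        .stdin
        .take()
        .expect("stdin was piped")
        .write_all(request.as_bytes())?;
    let output = child.wait_with_output()?;
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}

fn main() {
    // Only prints the request; calling query_backend needs the backend installed.
    println!("{}", build_request("limits.h", &["INT_MAX"]));
}
```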

Source code

The code is short and very rough around the edges. I wrote it in a couple of afternoons, most of which I spent trying to decipher the enigma that the Clang API is (and, to a lesser extent, the Rust compiler API).

So far the plugin only imports macro constants, and only those of integral type. No other types and no other kinds of C declarations are supported. Not that it’s impossible or particularly difficult, I just couldn’t be bothered yet.

Anyway, the repo is here:

To compile and use it, the following tools and libraries are needed:

  • cmake
  • clang 3.6
  • jsoncpp

The backend binary is compiled separately; it’s located in ./backend/. Compile it using cmake and symlink the binary somewhere in $PATH so that the plugin can find it.

The plugin can be compiled using cargo from the top-level directory, but it’s best to use it as a dependency in another project, such as the one in the ./test directory.

NOTE: The plugin might not work as expected, might not even compile, or might compile but do something preposterous. Approach with caution.

Enjoy and please feel free to criticise & provide feedback.