
Prevoty Labs: Rust and Java

Josh Chase on Nov 15, 2016

TL;DR: Call Rust from Java using this crate.

Lately, there's been a lot of buzz around Mozilla's relatively new language, Rust. For the unfamiliar, it's a systems programming language that fills the same niche as C or C++ while adding strong memory safety guarantees, a powerful hygienic macro system, first-class function closures, and a type system directly influenced by Haskell - all without requiring a garbage collector. These properties make it an excellent language choice for large projects where security and performance are main concerns.

All of this sounds great for new project development, but what about programs that we’ve already written? Traditionally, if one needed to get more performance out of a high-level language, program bottlenecks could be replaced with bindings to a feature-equivalent C library. Fortunately for us, Rust supports compiling to a shared object and presenting a C-compatible interface with no runtime overhead. Since many of the world’s programs run on Java, it's a prime candidate for a low barrier-to-entry FFI helper library.

The JNI

To create a Rust library that can be called from Java, an understanding of the Java Native Interface (JNI) is required. The JNI is a standard developed with the goal of unifying the mechanism by which all Java implementations can call native code. For a developer, this means that functions exported by a native library need to conform to the standardized JVM signatures.

Let’s take the following class definition:

public class HelloWorld {
    private static native void hello(String name);
    static {
        System.loadLibrary("hello");
    }

    public static void main(String[] args) {
        hello("Josh");
    }
}

The native qualifier on the hello method declares that the method is implemented in a native library (written in C, C++, Rust, or any other language that can export C-compatible symbols) rather than in Java. The loadLibrary call in the static initializer is responsible for discovering and loading the shared library that provides the hello function.

How does a developer create a native library, or even figure out the function signature so that the JVM can locate it? Fortunately, the Java Development Kit (JDK) ships with the javah utility, which generates a C header with the necessary function declaration (on JDK 10 and later, where javah was removed, javac -h does the same job):

/*
 * Class:     HelloWorld
 * Method:    hello
 * Signature: (Ljava/lang/String;)V
 */
JNIEXPORT void JNICALL Java_HelloWorld_hello
  (JNIEnv *, jclass, jstring);

The first argument in this declaration is perhaps the most important: it's the interface to the Java Virtual Machine (JVM). Object construction, method calls, and so on all have to go through the functions that this structure points to. The remaining two arguments are opaque pointers to the class that owns the static method and to the name argument, respectively.
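To make the naming scheme behind that header less mysterious, here is a minimal sketch of how the exported symbol name is derived from the class and method names. The function names jni_symbol_name and mangle_into are made up for this illustration; the escaping rules follow the JNI specification's description of native method name resolution, and overloaded methods (which also encode the argument signature) are left out.

```rust
/// Builds the symbol name the JVM looks up for a non-overloaded native
/// method. `class` is the fully qualified class name, e.g. "HelloWorld"
/// or "com.example.HelloWorld". (Hypothetical helper for illustration.)
pub fn jni_symbol_name(class: &str, method: &str) -> String {
    let mut symbol = String::from("Java_");
    mangle_into(&mut symbol, class);
    symbol.push('_');
    mangle_into(&mut symbol, method);
    symbol
}

/// Appends `name` to `out`, applying the JNI escape sequences.
fn mangle_into(out: &mut String, name: &str) {
    for c in name.chars() {
        match c {
            // ASCII letters and digits are copied through unchanged.
            'a'..='z' | 'A'..='Z' | '0'..='9' => out.push(c),
            // Package separators become plain underscores.
            '.' | '/' => out.push('_'),
            // A literal underscore is escaped so it can't be confused
            // with a separator.
            '_' => out.push_str("_1"),
            ';' => out.push_str("_2"),
            '[' => out.push_str("_3"),
            // Everything else is written as _0xxxx per UTF-16 code unit.
            other => {
                let mut units = [0u16; 2];
                for unit in other.encode_utf16(&mut units).iter() {
                    out.push_str(&format!("_0{:04x}", unit));
                }
            }
        }
    }
}
```

For the class above, jni_symbol_name("HelloWorld", "hello") yields Java_HelloWorld_hello, which matches the name in the generated header.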

While we could write this library in C or C++, both have their share of pitfalls. Since Java is a garbage-collected language, there's no guarantee that the pointers passed to the native function will be valid after the function returns or during subsequent calls. Consequently, these pointers cannot be stored without the very real possibility of a "use after free" error. This can be avoided by calling the NewGlobalRef JNI method, but that comes with the possibility of a memory leak if one forgets to call DeleteGlobalRef later. Strings have a similar issue if GetStringUTFChars is called without a corresponding ReleaseStringUTFChars later. JNI-specific issues aside, a C or C++ library has the potential to take down the entire JVM with a segfault.

JNI-rs

Enter Rust. As described in the introduction, its safety guarantees make it an ideal candidate for writing shared libraries. While there's some unsafety inherent in calling foreign functions, much of that can be abstracted away and hidden behind safe interfaces.

First Steps

The first thing needed for a C-compatible shared library is to conform to C's memory layout. To this end, Rust supports the #[repr(C)] attribute, which forces a structure to use the same memory layout as it would in C. While we could manually go through all of the structures in jni.h, it gets tedious rather quickly. Instead, there's the bindgen tool, which will do it for us!

Next, we need a way to declare functions that makes them callable from C. Unlike C, Rust uses its own calling convention and rewrites (mangles) function names so that they're unique, which allows multiple library versions to coexist. Working around this is as easy as adding a couple of attributes to the declaration: #[no_mangle] prevents the name rewriting, and declaring the function as pub extern "C" makes it use the C calling convention. From there, it's just a matter of naming the function and filling in the argument and return types to match the C header.

#[no_mangle]
pub extern "C" fn Java_HelloWorld_hello(env: *mut sys::JNIEnv,
                                        class: sys::jclass,
                                        input: sys::jstring)
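As a complete, compilable illustration of those two attributes, here is a small sketch. Everything in it is assumed for the example: the RawEnv, RawClass and RawInt aliases stand in for the bindgen-generated sys types, and the add method is a hypothetical addition to HelloWorld (declared on the Java side as private static native int add(int a, int b)).

```rust
use std::os::raw::c_void;

// Hypothetical stand-ins for the bindgen-generated types. At the ABI
// level the environment and class arguments are just pointers.
pub type RawEnv = *mut c_void;
pub type RawClass = *mut c_void;
// jint is a 32-bit signed integer.
pub type RawInt = i32;

// #[no_mangle] keeps the symbol name exactly as written so the JVM can
// find it; extern "C" selects the C calling convention.
#[no_mangle]
#[allow(non_snake_case)]
pub extern "C" fn Java_HelloWorld_add(
    _env: RawEnv,
    _class: RawClass,
    a: RawInt,
    b: RawInt,
) -> RawInt {
    // Java's int addition wraps on overflow, so mirror that here
    // instead of panicking in debug builds.
    a.wrapping_add(b)
}
```

One caveat: on 32-bit Windows, JNICALL expands to the stdcall convention, so extern "system" would be the matching choice there. On other common platforms it is the same as extern "C".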

Safety

Unfortunately, this isn't really any safer than its C equivalent. We still need mechanisms to prevent "use after free" errors, null pointer dereferences, and memory leaks. That's where the jni-rs crate comes in!

Lifetimes

Rust's concept of lifetimes is the key to preventing "use after free" errors. In Rust, all references are assigned a lifetime. Lifetimes prevent references from outliving the objects that they refer to. Since structures can contain references as members, they can also be assigned lifetimes. Armed with this, we can create wrappers for our arguments that prevent them from escaping our exported function.

#[repr(C)]
pub struct JNIEnv<'a> {
    internal: *mut sys::JNIEnv,
    lifetime: PhantomData<&'a ()>,
}

Here, the lifetime field is a zero-sized marker type that simply carries the lifetime information (the 'a lifetime parameter). Since the struct is #[repr(C)] and the marker takes up no space, it can be used in any FFI context where a *mut sys::JNIEnv would normally get passed. The wrappers for jobject, jclass, and jstring look similar:

#[repr(C)]
pub struct JObject<'a> {
    internal: jobject,
    lifetime: PhantomData<&'a ()>,
}
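The claim that the marker costs nothing can be checked directly. The sketch below uses assumed names (RawObject as a stand-in for sys::jobject, ScopedObject as a stand-in for the wrapper) and compares the size of the wrapper with the size of the raw pointer it contains.

```rust
use std::marker::PhantomData;
use std::mem::size_of;
use std::os::raw::c_void;

// Hypothetical stand-in for sys::jobject, which is a pointer type.
pub type RawObject = *mut c_void;

// Same shape as the wrappers above: one pointer plus a lifetime marker.
#[repr(C)]
pub struct ScopedObject<'a> {
    internal: RawObject,
    lifetime: PhantomData<&'a ()>,
}

impl<'a> ScopedObject<'a> {
    /// Wraps a raw pointer. The caller chooses 'a, so a real library
    /// would want to restrict who can call this.
    pub fn from_raw(raw: RawObject) -> ScopedObject<'a> {
        ScopedObject { internal: raw, lifetime: PhantomData }
    }

    /// Hands the raw pointer back for use in an FFI call.
    pub fn as_raw(&self) -> RawObject {
        self.internal
    }
}

/// True when the wrapper occupies exactly as much space as the pointer.
pub fn wrapper_is_pointer_sized() -> bool {
    size_of::<ScopedObject<'static>>() == size_of::<RawObject>()
}
```

Because PhantomData has size zero, wrapper_is_pointer_sized() returns true, and the lifetime exists only at compile time.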

Using these, our new function looks like this:

pub extern "C" fn Java_HelloWorld_hello(env: JNIEnv,
                                        class: JClass,
                                        input: JString)

But where did the lifetimes go? Lifetimes add a lot of noise to function definitions and variable declarations. To make things easier on the developer, Rust infers proper lifetimes for variables when it can (and it's pretty good at it!). With explicit lifetimes, it would look like this:

pub extern "C" fn Java_HelloWorld_hello<'a>(env: JNIEnv<'a>,
                                            class: JClass<'a>,
                                            input: JString<'a>)

This declares that the function is generic over a lifetime 'a that all of its arguments share. That lifetime ends when the function returns, so the arguments can't be used afterwards. This is enforced at compile time, so even if one were to attempt to place the JString into thread-local storage, the code would fail to compile.

Drop

Lifetimes are nice, but what if we need to store a pointer to a Java object so that it can be used later? The NewGlobalRef function prevents an object from being garbage-collected, but it leaks memory if DeleteGlobalRef is never called. Luckily for us, Rust has the special Drop trait, which lets a struct define behavior that runs when a value goes out of scope, similar to a destructor in C++. Every value in Rust is dropped when it goes out of scope; for types that don't implement Drop, that just means dropping each field in turn.

So how can we use Drop to enforce a call to DeleteGlobalRef? First, we need a struct on which to implement our Drop trait:

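pub struct GlobalRef {
    // The global reference returned by NewGlobalRef.
    obj: jobject,
    // Raw environment pointer, kept so that drop can call DeleteGlobalRef.
    env: *mut sys::JNIEnv,
}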

Notice anything? No lifetimes! That means we are free to hang onto an object of this type for as long as we need to, for example by putting it in thread-local storage. This is the type returned by the wrapper around the NewGlobalRef call. It holds the object and a pointer to the JNIEnv for when Drop::drop gets called. Here's the Drop implementation:

impl Drop for GlobalRef {
    fn drop(&mut self) {
        let res = self.drop_ref();
        match res {
            Ok(()) => {}
            Err(e) => debug!("error dropping global ref: {:#?}", e),
        }
    }
}

The actual details of drop_ref are not important. All one needs to know is that it calls DeleteGlobalRef and returns a Result, whose error case should never occur in practice. That's it! The compiler does the rest of the work in deciding when drop needs to be called, so as long as the GlobalRef itself is eventually dropped, the reference can't leak.
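To see the mechanism in isolation, here is a sketch that swaps the JVM for a simple counter. FakeJvm and FakeGlobalRef are invented for this example and are not part of JNI-rs; creating a FakeGlobalRef plays the role of NewGlobalRef, and its drop plays the role of DeleteGlobalRef.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Hypothetical stand-in for the JVM's table of global references.
/// It only counts how many references are currently alive.
#[derive(Default)]
pub struct FakeJvm {
    live_refs: Cell<usize>,
}

impl FakeJvm {
    pub fn live_refs(&self) -> usize {
        self.live_refs.get()
    }
}

/// Hypothetical counterpart of GlobalRef: no lifetime parameter, so it
/// can be stored for as long as needed.
pub struct FakeGlobalRef {
    jvm: Rc<FakeJvm>,
}

impl FakeGlobalRef {
    /// Stands in for the NewGlobalRef wrapper: registers one reference.
    pub fn new(jvm: &Rc<FakeJvm>) -> FakeGlobalRef {
        jvm.live_refs.set(jvm.live_refs.get() + 1);
        FakeGlobalRef { jvm: Rc::clone(jvm) }
    }
}

impl Drop for FakeGlobalRef {
    /// Stands in for DeleteGlobalRef: runs automatically when the value
    /// goes out of scope, with no explicit call at the use site.
    fn drop(&mut self) {
        self.jvm.live_refs.set(self.jvm.live_refs.get() - 1);
    }
}
```

Holding a FakeGlobalRef inside a block raises live_refs() to 1, and leaving the block brings it back to 0 without any cleanup code being written by the caller.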

Pointer Safety

In safe Rust, indirections are mostly limited to references. Rust references carry lifetime information so that they can be proved safe to dereference. But when we cross the FFI boundary, we start to deal with pointers that aren't managed by Rust and aren't guaranteed to be valid or to not be null. We can't do much about the validity of pointers - that comes down to trusting that Rust and the foreign code both hold up their ends of the contract, but we can prevent segfaults by enforcing null checks.

In Rust, dereferencing a raw pointer requires an unsafe block:

let a: bool = true;
let b: *const bool = &a as *const bool;
let c: bool = unsafe { *b };

If we want to make this safer, a null check is in order:

let c: bool = if b.is_null() { return Err("oops!") } else { unsafe { *b } };

Yuck! We no longer have to worry about the possibility of b being null, but our assignment just got a lot noisier. If we couple this with the multiple levels of dereferencing that have to happen on every JNI call, we’d have a mess on our hands.

Macros to the rescue! Rust's hygienic macro system is perfectly suited to these sorts of tasks. Similar to the standard try! macro, we need something that will check for null, dereference it if it's safe, and return an error if it's not:

macro_rules! deref {
    ( $obj:expr ) => {
        if $obj.is_null() {
            return Err("Null pointer deref!".into());
        } else {
            unsafe { *$obj }
        }
    };
}

And the call site:

let c: bool = deref!(b);

Much cleaner! The actual implementation in JNI-rs is a bit more complicated than this since it also includes extracting functions from the environment, checking for exceptions, and checking the returns for null pointers. Further reading here.
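For a version that compiles on its own, the simplified macro can be dropped into a function that returns a Result. The read_flag function and its String error type are assumptions made for this sketch; they are not taken from JNI-rs.

```rust
// The simplified macro from above: null check first, dereference second.
macro_rules! deref {
    ( $obj:expr ) => {
        if $obj.is_null() {
            return Err("Null pointer deref!".into());
        } else {
            unsafe { *$obj }
        }
    };
}

/// Hypothetical helper: reads a bool through a raw pointer, turning a
/// null pointer into an error instead of a segfault.
pub fn read_flag(ptr: *const bool) -> Result<bool, String> {
    // On null, the macro returns early from read_flag with Err(..).
    let value: bool = deref!(ptr);
    Ok(value)
}
```

Note that the early return happens in the function that invokes the macro, so deref! can only be used inside functions whose error type can be built from a &str. The null check says nothing about dangling pointers; those still rely on both sides honoring the contract.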

While this will prevent Rust from taking down the process via a segfault, there’s still nothing stopping library authors from writing code that causes a panic in Rust. This could be caused by an index out of bounds, a lazy call to unwrap rather than handling an error case, or explicit panic!, unreachable! or unimplemented! calls. To be extra sure that a panic won’t escape from Rust, std::panic::catch_unwind can be used in the extern "C" functions to stop it.
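A minimal sketch of that pattern follows, using an out-of-bounds index as the panic source. The table_lookup function, the TABLE constant, and the choice of -1 as an error value are all assumptions for this example; a real JNI function would more likely raise a Java exception through the environment.

```rust
use std::panic;

// Hypothetical data the native function reads from.
const TABLE: [i32; 3] = [10, 20, 30];

/// Hypothetical exported function: returns the table entry at `index`,
/// or -1 if the lookup panicked.
pub extern "C" fn table_lookup(index: usize) -> i32 {
    // Indexing past the end of TABLE panics. catch_unwind converts
    // that panic into an Err value instead of letting it unwind
    // across the FFI boundary.
    let result = panic::catch_unwind(move || TABLE[index]);
    match result {
        Ok(value) => value,
        Err(_) => -1,
    }
}
```

The default panic hook still prints its message to stderr before the error value is returned. Also, catch_unwind only helps when the library is built with unwinding panics; with panic = "abort" the process terminates regardless.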

Conclusion

JNI-rs is still very much a work in progress, and there's too much to cover in one blog post. In the coming months, our engineering team will share more about how we’re using JNI-rs and Rust to help us build Prevoty’s products. Full source and commit history can be found on GitHub.

We welcome bug reports and pull requests! If you are interested in working on Rust or just want to say hello, drop us a note at careers@prevoty.com.







Josh Chase

Software Engineer at Prevoty

Topics: Prevoty Technology, Prevoty Labs, rust, java, JNI, Programming Languages