Binary JSON with bson4jackson

Recently, JSON has become an excellent alternative to XML, but most JSON parsers written in Java are still rather slow. In my search for faster libraries I found two things: BSON and Jackson.

BSON is binary-encoded JSON. The format was designed with fast machine readability in mind, and it has gained prominence as the main data exchange format of the document-oriented database management system MongoDB.

According to the JVM serializers benchmark, Jackson is one of the fastest JSON processors available. Jackson also allows custom extensions to be written, a feature that can be used to add further data exchange formats.

bson4jackson

This is where bson4jackson steps in. The library adds the capability to read and write BSON documents to Jackson. Since bson4jackson is fully integrated, you can use Jackson's API to serialize simple POJOs. Consider the following class:

public class Person {
  private String _name;

  public void setName(String name) {
    _name = name;
  }

  public String getName() {
    return _name;
  }
}

You may use the ObjectMapper to quickly serialize objects:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import com.fasterxml.jackson.databind.ObjectMapper;
import de.undercouch.bson4jackson.BsonFactory;

public class ObjectMapperSample {
  public static void main(String[] args) throws Exception {
    //create dummy POJO
    Person bob = new Person();
    bob.setName("Bob");

    //serialize data
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    ObjectMapper mapper = new ObjectMapper(new BsonFactory());
    mapper.writeValue(baos, bob);

    //deserialize data
    ByteArrayInputStream bais = new ByteArrayInputStream(
      baos.toByteArray());
    Person clone_of_bob = mapper.readValue(bais, Person.class);

    assert bob.getName().equals(clone_of_bob.getName());
  }
}

Or you may use Jackson’s streaming API and serialize the object manually:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import com.fasterxml.jackson.core.JsonGenerator;
import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;
import de.undercouch.bson4jackson.BsonFactory;

public class ManualSample {
  public static void main(String[] args) throws Exception {
    //create dummy POJO
    Person bob = new Person();
    bob.setName("Bob");

    //create factory
    BsonFactory factory = new BsonFactory();

    //serialize data
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    JsonGenerator gen = factory.createJsonGenerator(baos);
    gen.writeStartObject();
    gen.writeFieldName("name");
    gen.writeString(bob.getName());
    //end the object started above
    gen.writeEndObject();
    gen.close();

    //deserialize data
    ByteArrayInputStream bais = new ByteArrayInputStream(
      baos.toByteArray());
    JsonParser parser = factory.createJsonParser(bais);
    Person clone_of_bob = new Person();
    parser.nextToken();
    while (parser.nextToken() != JsonToken.END_OBJECT) {
      String fieldname = parser.getCurrentName();
      parser.nextToken();
      if ("name".equals(fieldname)) {
        clone_of_bob.setName(parser.getText());
      }
    }

    assert bob.getName().equals(clone_of_bob.getName());
  }
}

Optimized streaming

One disadvantage of BSON is that each document begins with a number denoting the document’s length. This length has to be known before the document is written, so bson4jackson is forced to buffer the whole document before it can write it to the OutputStream. However, bson4jackson’s parser ignores the length field, so you may leave it empty. To do so, create the BsonFactory as follows:

BsonFactory fac = new BsonFactory();
fac.enable(BsonGenerator.Feature.ENABLE_STREAMING);

This trick can increase serialization performance for large documents and considerably reduce the memory footprint. The official MongoDB Java driver also ignores the length field, so you may use this optimization even if the documents created by bson4jackson are to be read by the MongoDB driver.
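To make the role of the length field more concrete, here is a minimal sketch that hand-encodes a BSON document with a single string field using only the Java standard library. It does not use bson4jackson and is not how the library is implemented; the class BsonLengthSketch and its methods are hypothetical names chosen for illustration. It only shows why the body has to be buffered before the leading length can be written.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of bson4jackson: encodes a BSON document
// with one string field to show where the length prefix goes.
final class BsonLengthSketch {
  // BSON stores 32-bit integers in little-endian byte order
  static byte[] int32(int value) {
    return ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN)
        .putInt(value).array();
  }

  static byte[] encode(String field, String value) {
    byte[] name = field.getBytes(StandardCharsets.UTF_8);
    byte[] text = value.getBytes(StandardCharsets.UTF_8);

    // buffer the body first: its size is unknown until
    // every element has been written
    ByteArrayOutputStream body = new ByteArrayOutputStream();
    body.write(0x02);                          // element type: UTF-8 string
    body.write(name, 0, name.length);          // field name ...
    body.write(0x00);                          // ... as null-terminated cstring
    body.write(int32(text.length + 1), 0, 4);  // string length incl. terminator
    body.write(text, 0, text.length);
    body.write(0x00);                          // string terminator
    body.write(0x00);                          // end of document

    // only now can the leading length be written;
    // it also counts its own 4 bytes
    ByteArrayOutputStream doc = new ByteArrayOutputStream();
    doc.write(int32(body.size() + 4), 0, 4);
    doc.write(body.toByteArray(), 0, body.size());
    return doc.toByteArray();
  }
}
```

Calling BsonLengthSketch.encode("name", "Bob") yields a 19-byte document whose first four bytes are 13 00 00 00. A generator that skips the buffering step has to emit those four bytes before it knows the final size.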

Performance

Version 1.1.0 of bson4jackson introduced support for Jackson 1.7 as well as a lot of performance improvements. As of January 2011, bson4jackson is much faster than the official MongoDB driver for Java. For serialization, this is only true when using the streaming API, since Jackson’s ObjectMapper adds a little overhead (the MongoDB driver also uses a kind of streaming API). Deserialization is always faster. The latest benchmark results can be reviewed on the following website:

https://github.com/eishay/jvm-serializers/wiki

Compatibility with MongoDB

In version 1.2.0, bson4jackson’s compatibility with MongoDB has been improved a lot. Thanks to a contribution by James Roper, the BsonParser class now supports the new HONOR_DOCUMENT_LENGTH feature, which makes the parser honor the first 4 bytes of a document, which usually contain the document’s size. Of course, this only works if BsonGenerator.Feature.ENABLE_STREAMING was not enabled during document generation.

This feature can be useful for reading consecutive documents from an input stream produced by MongoDB. You can enable it as follows:

BsonFactory fac = new BsonFactory();
fac.enable(BsonParser.Feature.HONOR_DOCUMENT_LENGTH);
BsonParser parser = (BsonParser)fac.createJsonParser(...);
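As a rough illustration of why an intact length field helps with consecutive documents, the following sketch splits an input stream into individual BSON documents by reading each 4-byte little-endian length prefix. It uses only the Java standard library and does not reflect how BsonParser works internally; the class BsonSplitSketch and the method split are hypothetical names.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of bson4jackson: cuts a stream of
// consecutive BSON documents into one byte array per document.
final class BsonSplitSketch {
  static List<byte[]> split(InputStream in) throws IOException {
    DataInputStream data = new DataInputStream(in);
    List<byte[]> docs = new ArrayList<>();
    byte[] header = new byte[4];
    while (true) {
      int first = data.read();
      if (first < 0) {
        break; // clean end of stream between two documents
      }
      header[0] = (byte)first;
      data.readFully(header, 1, 3);

      // the length is a little-endian int32 and includes its own 4 bytes
      int length = ByteBuffer.wrap(header)
          .order(ByteOrder.LITTLE_ENDIAN).getInt();
      if (length < 5) {
        // the smallest document is the length plus the terminating zero
        throw new IOException("Invalid BSON document length: " + length);
      }

      // copy the header and read exactly the rest of this document
      byte[] doc = new byte[length];
      System.arraycopy(header, 0, doc, 0, 4);
      data.readFully(doc, 4, length - 4);
      docs.add(doc);
    }
    return docs;
  }
}
```

If the length fields were left empty, as with ENABLE_STREAMING, a reader like this could not tell where one document ends and the next begins without parsing every element.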

Compatibility with Jackson

bson4jackson 2.x is compatible with Jackson 2.x and higher. Due to some compatibility issues, the major version numbers of both libraries have to match, and bson4jackson’s minor version must be at least as high as Jackson’s. That means you have to use at least bson4jackson 2.1 if you use Jackson 2.1, at least bson4jackson 2.2 if you use Jackson 2.2, etc. I will try to keep bson4jackson up to date. If there is a compatibility issue I will update bson4jackson, usually within a couple of days after the new Jackson version has been released.

Here’s the compatibility matrix for the current library versions:

                    Jackson 2.7.x   Jackson 2.6.x   Jackson 2.5.x
bson4jackson 2.7.x  Yes             Yes             Yes
bson4jackson 2.6.x  No              Yes             Yes
bson4jackson 2.5.x  No              No              Yes
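The matrix follows a simple rule, which can be restated as a small check. This is only a sketch of the table above, not an API offered by either library; the class CompatSketch and the method isCompatible are hypothetical, and the check deliberately refuses versions the matrix does not list.

```java
// Hypothetical helper restating the compatibility matrix for the 2.x line.
// Both arguments are minor version numbers, e.g. 7 for version 2.7.x.
final class CompatSketch {
  static boolean isCompatible(int bson4jacksonMinor, int jacksonMinor) {
    // the matrix only covers minor versions 5 to 7
    if (bson4jacksonMinor < 5 || bson4jacksonMinor > 7
        || jacksonMinor < 5 || jacksonMinor > 7) {
      throw new IllegalArgumentException("Version not covered by the matrix");
    }
    // a bson4jackson release works with its own Jackson minor version
    // and with the older ones listed in the matrix
    return bson4jacksonMinor >= jacksonMinor;
  }
}
```

For example, isCompatible(6, 7) is false because bson4jackson 2.6.x does not support Jackson 2.7.x, while isCompatible(7, 5) is true.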

If you’re looking for a version compatible with Jackson 1.x, please use bson4jackson 1.3. It’s the last version of the 1.x branch. bson4jackson 1.3 is compatible with Jackson 1.7 up to 1.9.

Download

Pre-compiled binaries

Pre-compiled binary files of bson4jackson can be downloaded from Maven Central. Additionally, you will need a copy of Jackson to start right away.

Maven/Gradle/buildr/sbt

Alternatively, you may use Maven to download bson4jackson:

<dependencies>
  <dependency>
    <groupId>de.undercouch</groupId>
    <artifactId>bson4jackson</artifactId>
    <version>2.7.0</version>
  </dependency>
</dependencies>

For Gradle you may use the following snippet:

compile 'de.undercouch:bson4jackson:2.7.0'

For buildr use the following snippet:

compile.with 'de.undercouch:bson4jackson:jar:2.7.0'

If you’re using sbt, you may add the following line to your project:

val bson4jackson = "de.undercouch" % "bson4jackson" % "2.7.0"

License

bson4jackson is licensed under the Apache License, Version 2.0.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an “AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.