Evolution of a file format - using XStream

Part 1: The Model...

...can be found here...

Part 2: Serializing using XStream

In a typical project, those objects have to be serialized one way or another. And of course it is a good idea to choose a simple way.

Therefore many people start by serializing their objects with XStream (http://xstream.codehaus.org/):

XStream xStream = new XStream();
//We define some aliases to get nicer XML output without fully qualified class names
xStream.alias( "car", Car.class );
xStream.alias( "extra", Extra.class );
xStream.alias( "money", Money.class );

Car sampleCar = ...
String xml = xStream.toXML( sampleCar );

Resulting XML

The resulting XML looks quite ok:

<car>
  <model>
    <name>Ford</name>
  </model>
  <basePrice>
    <amount>19000.0</amount>
  </basePrice>
  <extras>
    <extra>
      <description>Whoo effect</description>
      <price>
        <amount>99.98</amount>
      </price>
    </extra>
    <extra>
      <description>Better Whoo effect</description>
      <price>
        <amount>199.0</amount>
      </price>
    </extra>
  </extras>
</car>

The only (minor?) issue is the missing namespace. It is always a good idea to have a namespace...

Everything ok?

Now many people cheer: "Wow XStream is such a great tool! It just took me three lines to get a great XML!"

And they will start to use XStream everywhere possible...

And yes - everything works great.

Performance?

The performance isn't great compared to hand-written code. But it isn't too bad either. So the small performance penalty is an acceptable tradeoff in most cases.

Part 3: Shipping the application

Now it is time to ship the application... Time to be happy. XStream saved us a lot of time. So maybe we have been able to release a little bit earlier than expected...

Part 4: Oops. Made a small mistake?

After a while the users start to complain about some random(?) rounding errors within our application.

The problem

The reason is simple: The Money object stores the amount as a double, and a double cannot represent most decimal fractions exactly. So calculations result in a loss of precision:

public class Money {
  private final double amount;

  public Money( double amount ) {
    this.amount = amount;
  }

  public double getAmount() {
    return amount;
  }
}
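To see the effect in isolation, here is a minimal sketch. The class and method names are made up for this illustration and are not part of the Car model. It adds a price of 0.10 ten times, once as double and once as whole cents:

```java
final class DoubleMoneyDemo {
  // Adds the same double amount several times, like summing up identical extras
  static double sumDoubles(double amount, int times) {
    double sum = 0.0;
    for (int i = 0; i < times; i++) {
      sum += amount;
    }
    return sum;
  }

  // The same calculation with whole cents stays exact
  static long sumCents(long cents, int times) {
    long sum = 0;
    for (int i = 0; i < times; i++) {
      sum += cents;
    }
    return sum;
  }

  public static void main(String[] args) {
    System.out.println(sumDoubles(0.1, 10)); // prints 0.9999999999999999
    System.out.println(sumCents(10, 10));    // prints 100
  }
}
```

The double sum is off by a tiny amount, and exactly this kind of error shows up as "random" rounding differences once amounts are compared or formatted.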

The fix

Fixing this issue is simple. We just have to store the amount as an integer number of cents:

public class Money {
  private final long cents;

  public Money( long cents ) {
    this.cents = cents;
  }

  @Deprecated
  public Money( double amount ) {
    this( convertValueToCents( amount ) );
  }

  @Deprecated
  public double getAmount() {
    return cents / 100.0;
  }

  public long getCents() {
    return cents;
  }

  public static long convertValueToCents( double amount ) {
    return Math.round( amount * 100 );
  }
}

To stay compatible we keep the old constructor and getter. Marking them as @Deprecated is enough.
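Why does convertValueToCents use Math.round and not a simple cast? A small sketch with hypothetical helper names (only the Math.round variant corresponds to the Money class above) shows the difference:

```java
final class CentsConversionDemo {
  // Naive variant: the cast cuts off everything after the decimal point
  static long truncateToCents(double amount) {
    return (long) (amount * 100);
  }

  // Variant used by Money.convertValueToCents: round to the nearest cent
  static long roundToCents(double amount) {
    return Math.round(amount * 100);
  }

  public static void main(String[] args) {
    // 0.29 * 100 is 28.999999999999996 in double arithmetic
    System.out.println(truncateToCents(0.29)); // prints 28
    System.out.println(roundToCents(0.29));    // prints 29
  }
}
```

Rounding repairs amounts that were meant to have two decimal places. It cannot recover precision that was already lost in earlier double calculations.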

Part 5: Serialization?

What about the serialization using XStream? If we only wrote roundtrip tests and never checked the generated XML, everything still seems to be fine. But hopefully we created some good tests and noticed the differences in the XML output.

The new XML looks like this:
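A rough sketch of why a roundtrip test alone cannot catch this. The tiny write/read helpers are hypothetical stand-ins for the XStream calls, and the stored string stands for a sample file written by the shipped version and kept next to the tests:

```java
final class RoundTripVsStoredSample {
  // Hypothetical stand-in for the serializer after the Money fix
  static String write(long cents) {
    return "<cents>" + cents + "</cents>";
  }

  // Hypothetical stand-in for the matching deserializer
  static long read(String xml) {
    return Long.parseLong(xml.replace("<cents>", "").replace("</cents>", ""));
  }

  // Roundtrip: write, read back, compare. Passes before and after the change.
  static boolean roundTripPasses(long cents) {
    return read(write(cents)) == cents;
  }

  // Comparison against XML produced by the version that is already deployed
  static boolean matchesStoredSample(long cents, String storedXml) {
    return write(cents).equals(storedXml);
  }

  public static void main(String[] args) {
    String storedXml = "<amount>19000.0</amount>";
    System.out.println(roundTripPasses(1900000L));                // prints true
    System.out.println(matchesStoredSample(1900000L, storedXml)); // prints false
  }
}
```

Writer and reader change together, so the roundtrip stays green. Only the comparison with a stored sample reveals that the format itself has moved.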

<car>
  <model>
    <name>Ford</name>
  </model>
  <basePrice>
    <cents>1900000</cents>
  </basePrice>
  <extras>
    <extra>
      <description>Whoo effect</description>
      <price>
        <cents>9998</cents>
      </price>
    </extra>
    <extra>
      <description>Better Whoo effect</description>
      <price>
        <cents>19900</cents>
      </price>
    </extra>
  </extras>
</car>

The difference!

It is just a small difference - but it is an important difference. Wherever a Money object is serialized, the XML has changed from something like

<basePrice>
  <amount>19000.0</amount>
</basePrice>

to

<basePrice>
  <cents>1900000</cents>
</basePrice>

The consequence!

We are no longer able to read XML files created with the old version of our software. And of course the deployed installations of the old software are not able to read the changed XML formats.

Even worse:

Deserialization fails with a com.thoughtworks.xstream.converters.ConversionException.

ConversionException extends RuntimeException, so we probably forgot to catch it and provide nice feedback to the user. But even if we did catch it, all we know is that the XML format is "invalid". We can't give the user a reason for the failure.

So there are two common scenarios: Either our application just fails with a strange stack trace. Or (if we are really foresighted developers) we have to tell our user that the file is somehow "corrupted".

Probably this will result in angry customers and a lot of work for our support team...

What to do? How can we solve that problem?

XStream has one powerful fallback for all kinds of trouble that can't be solved automatically. The solution is to implement custom converters for the problematic classes.

So we implement a MoneyConverter:

Marshalling

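Marshalling the XML is done in the marshal method. The API is simple (very similar to StAX):

import com.thoughtworks.xstream.converters.Converter;
import com.thoughtworks.xstream.converters.MarshallingContext;
import com.thoughtworks.xstream.converters.UnmarshallingContext;
import com.thoughtworks.xstream.io.HierarchicalStreamReader;
import com.thoughtworks.xstream.io.HierarchicalStreamWriter;

public class MoneyConverter implements Converter {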
  @Override
  public void marshal( Object source, HierarchicalStreamWriter writer, MarshallingContext context ) {
    writer.startNode( "cents" );
    writer.setValue( String.valueOf( ( ( Money ) source ).getCents() ) );
    writer.endNode();
  }

We chose to write the new format. (In this particular case we might succeed in writing the old format, but in many cases this is not possible. So please overlook this shortcoming of the sample.)

Since we can't change our old application (that has been deployed several times) we have to live with missing/inadequate error messages there...

Unmarshalling

Unmarshalling is a little bit more complicated. Since we still want to be able to read the old format, we have to guess which kind of XML we have.

Luckily, we just have to compare the node names to decide which version we have:

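  @Override
  public Object unmarshal( HierarchicalStreamReader reader, UnmarshallingContext context ) {
    reader.moveDown();
    long cents;

    //We have to guess which kind of XML we have
    //This might become very difficult and complicated for complex scenarios
    if ( reader.getNodeName().equals( "amount" ) ) {
      //Legacy format: a decimal amount that has to be converted to cents
      cents = Money.convertValueToCents( Double.parseDouble( reader.getValue() ) );
    } else {
      //New format: the cents are stored directly
      cents = Long.parseLong( reader.getValue() );
    }
    reader.moveUp();

    return new Money( cents );
  }

  @Override
  public boolean canConvert( Class type ) {
    //Required by the Converter interface: this converter is only responsible for Money
    return Money.class.equals( type );
  }
}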
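Since the converter API resembles StAX, the same node name check can be tried out with the StAX reader that ships with the JDK, without XStream on the classpath. This is only a sketch: MoneyStaxSketch and readCents are names invented here, and unlike the converter it rejects unknown element names instead of treating them as cents.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class MoneyStaxSketch {
  // Reads one money element (e.g. <basePrice>) and returns the value in cents
  static long readCents(String xml) {
    try {
      XMLStreamReader reader =
          XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
      try {
        reader.nextTag(); // the surrounding element, e.g. <basePrice> or <price>
        reader.nextTag(); // its only child: <amount> (old) or <cents> (new)
        String name = reader.getLocalName();
        String text = reader.getElementText().trim();

        if ("amount".equals(name)) {
          // Legacy format: decimal amount, converted the same way as in Money
          return Math.round(Double.parseDouble(text) * 100);
        }
        if ("cents".equals(name)) {
          return Long.parseLong(text);
        }
        throw new IllegalArgumentException("Unexpected element <" + name + ">");
      } finally {
        reader.close();
      }
    } catch (XMLStreamException e) {
      throw new IllegalArgumentException("Could not parse money XML", e);
    }
  }

  public static void main(String[] args) {
    System.out.println(readCents("<basePrice><amount>19000.0</amount></basePrice>")); // prints 1900000
    System.out.println(readCents("<price>\n  <cents>9998</cents>\n</price>"));        // prints 9998
  }
}
```

Both formats end up as the same number of cents, but the decision still rests on nothing more than an element name.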

"Wow! I like XStream: It worked like a charm"

Yes. Nice work done. We solved our problem. And with custom converters we should be able to solve nearly all kinds of problems.

By the way: Did you notice that you wrote the XML writing/parsing code yourself? So as soon as you have to tweak the XML, you are forced to write your own converters...

Why did we choose XStream in the first place? Yes - we wanted to avoid writing the (un)marshalling code ourselves...

Part 6: Intermediate Result

Old application

The old application can't read the new format (of course!). No framework could help with that...

Error messages

Unfortunately we fail late. We have to parse the XML file and wait until an error occurs. There is no way to recognize the file version beforehand.

And we can't differentiate between an invalid XML (maybe the result of a nasty bug when creating the XML?) and new(er) file formats. So in the best case we can tell our users that parsing the XML has failed due to some reason we don't know anything about...

In the worst case our application doesn't handle the Exception properly and the user gets some unspecific message ("Internal Application Error") or some stack traces on the console...

New application

The new application is able to parse the new and the old XML files. Unfortunately we had to write a lot of the (un)marshalling code ourselves... But hey: It works.

The converter

It took a little work to write the converter. So in the end XStream couldn't save us from writing that code. But well, at least it isn't worse than writing that code in the beginning, is it?

Okay it is a little bit disturbing that we have to guess about the version of the XML using some weird heuristics (node names etc). But it works. Isn't that the most important thing?

The file format

The file format looks nice. While it isn't perfect in every place (maybe some fine tuning could help), it is definitely good enough. And if I *really* want to improve some aspects, I could write a converter...

Okay, maybe we should add some version information or namespace declaration... But at the moment this isn't absolutely necessary.
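For comparison, this is roughly what such a check could look like if the root element carried a namespace. Everything here is an assumption made for illustration: the namespace URI, the class name and the method name do not exist in the application, and the current files contain no namespace at all.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class FormatVersionSketch {
  // Hypothetical namespace that identifies one version of the file format
  static final String NS_V1 = "http://example.com/car/1.0";

  // Returns the namespace URI of the root element, or "" if there is none
  static String detectNamespace(String xml) {
    try {
      XMLStreamReader reader =
          XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
      try {
        reader.nextTag(); // move to the root element, nothing else is parsed
        String namespace = reader.getNamespaceURI();
        return namespace == null ? "" : namespace;
      } finally {
        reader.close();
      }
    } catch (XMLStreamException e) {
      // Broken XML is reported separately from an unknown version
      throw new IllegalArgumentException("Not well-formed XML", e);
    }
  }

  public static void main(String[] args) {
    String xml = "<car xmlns=\"" + NS_V1 + "\"><model/></car>";
    System.out.println(NS_V1.equals(detectNamespace(xml))); // prints true
  }
}
```

With such a marker a reader could fail early and tell the user "this file was written by a newer version" instead of "this file is corrupted".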