How to build a holodeck, week 33

This week I’ve been figuring out how to send large chunks of stuff over the network in a maintainable fashion, which has boiled down to fighting with serialization and structure/schema definitions. Not a fun week, all told.

TL;DR

Week 33 - network pointcloud feed

Networking and serialization

For starters, having a Z in there is annoying me, but I guess I’ll just have to train myself to not use an S. But, I digress.

Last week, I got to the point with my prototype where I had two cameras, manually aligned in world space, feeding depth feeds (point cloud feeds) over the network such that one machine can aggregate multiple feeds. The video link above shows this “working”.

The restrictions on how much data I can pass over the network (using Photon) mean the network feed is incredibly slow - I need to be sending a lot more data for this to work at a reasonable rate. Ideally, I want to add some persistence / fading into the dataset so I can deal with moving objects in the scene gracefully (i.e. people walking around) rather than just static data (like the floor, walls and ceiling). The data feed rate needs to be probably 10-100x faster than the current feed rate for this to happen, though (and that’s not including the amount of time it takes to actually build the TSDF dataset and then mesh it).

So, that brings up a whole new set of technical challenges. How do I send that much data around the local network (because I don’t need to be doing this over t’internets), and how do I describe that data for transmission?

For my previous network tests, I just chucked the (much smaller) data into a PhotonStream object, essentially by manually serialising the information I wanted to send. So if the object was, for example, a Kinect Skeleton frame (made of around 40 position Vector3s and rotation Quaternions) I had a chunk of code that said “add a Vector3 to the stream, then a Quaterion, then another V3, then another Quat, then …”, building up the representation of the data bit by bit. Photon then took that stream data and sent it out to the other clients.

I’m not able to use PhotonStreams with my new network code, which means I need to figure out a new way to serialise the data.

This brings up a load of options, and all of them are pretty sucky.

What do I want?

Ideally, I want the following things from my serialisation solution.

  1. I don’t want to have to manually mark up my classes.
  2. I want it to be fast.
  3. I want it to be human-readable.
  4. I want it to take very little memory.
  5. I want it to be very small.

I’m not certain yet on the ones I’m willing to compromise on, but I’d imagine it’ll be the human elements - how much work I have to do and how easy it is to parse the data.

Data format representations and descriptions

So first of all, I need to decide on how I’m going to describe the data. Do I send binary data? Do I send some form of human readable data (for example XML or JSON)? Do I need to compress it?

The best way to answer these questions is to try out a few options, so that’s what I’ve been doing most of this week.

Unity’s JSONUtility serialisation

I’m a big fan of human-readable data, so my first attempt at this was to use Unity’s built-in JSON serialisation/deserialisation utilities. In Theory, you can mark up a class with the System.Serializable attribute, then chuck it into ToJSON to generate the JSON string representing the data. This works, but has a slew of caveats - firstly, only public fields on the relevant class are serialised. Secondly, the construction/serialisation order of the object means if you have intermediate data that’s built from the initial data (for example, you pass in a set of values to a camera extrinsic and calculate a load of things like FOV from the focal lengths and clip planes), there doesn’t seem to be an easy way to persuade the serialisation to generate that data - I had hoped to do it in a bare constructor, but that doesn’t want to work for me. (I’m probably missing the obvious here).

The way I tend to write my C# classes relies quite heavily on properties (which I normally make public get, private set, so that I can be confident that data can be read but not modified). Unfortunately, that data doesn’t get serialised - which means I need to go back and rewrite a big chunk of the classes that I want to be throwing around over the network.

The amount of time it takes to generate the JSON is also pretty high - and it uses a load of memory to perform the generation. The end result data is also enormous (which is expected, as it’s human readable). Compressing this using GZip or Deflate gets the data down to a reasonable size, but that’s another chunk of time required to get something I can throw over the wire.

Obviously, every step I take during serialisation and compression needs to be reversed at the other end, which means the host is chewing time and memory extracting the result at the other end.

Because I don’t really want to have to manually mark up all my code with special serialisation code, I really wanted this to work - but if I’m going to have to re-write all the code anyway to expose the relevant fields, and having to figure out how to construct my intermediate data, and given the resulting size and memory hit - JSON probably isn’t going to cut it.

BinaryFormatter

.NET has some built-in serialisation options, and the next one on the list to try was BinaryFormatter. This has similar properties to the JSON flow - marking up a class to tell the system what to serialise should be as simple as adding the Serializable attribute. A big plus of this route is that properties (both public and private) are serialised correctly, so my intermediate data was already part of the serialisation dataset without having to be recalculated.

The downside that I wasn’t expecting is that Unity’s core types (Vector2, Vector3, Quaternion, etc)

  • none of these work out of the box as they’re not marked as Serializable. It’s possible to work around this by creating Surrogate serialisers for each of the types, so I’ve done this for the basic types that I need right now - it’s not a massive amount of work, but it is a bit annoying.

The size of the resulting data is a lot larger than I was hoping for, too - this is (again) expected, to some degree, as general purpose serialisation requires a load of type info to be embedded into the resulting binary dataset, and handing references and cycles in the data also needs some additional data. The bloat isn’t ridiculous, but for small types it’s a significant overhead, and even for larger types (e.g. a fully populated 4K point cloud) it pushes the raw data size up by > 50% in my tests.

I added compression on to the result of a 4K point cloud passed through BinaryFormatter, and got the packet down to about 10K (not ridiculous at all) but the time cost was around 100ms to serialise and compress, and about 50ms to deserialise and decompress. That’s far too expensive.

Other options

Is there anything out there that allows me to be lazy marking up my class, but generates a small amount of data? I’m really not sure, but the small amount of digging I’ve done so far would suggest not. The built-in XmlFormatter will generate larger results than the JSON, and will likely cost the same or more than BinaryFormatter does (and I’m expecting all the same caveats).

Unity has serialisation built in - this is how the internal data for scenes is represented. I took a quick look at whether I could leverage that directly, but the info I could find would suggest that I’m not able to trivially use it (and it brings problems of it’s own, which I won’t go into here).

Given that I’m probably going to have to re-write the classes holding my data anyway, this means being lazy about describing the data is likely off the table.

.NET also has a DataContractSerializer class, which I took a quick look at. This means marking up every field or property that I want to serialize, but has some negatives where abstract classes need to inform the type system of how the concrete implementations handle themselves. I didn’t dig too much into the details, but it sounds a bit messy.

Let’s clarify my remaining constraints (assuming I spend time re-writing my classes to mark them up) - I want the data to be small and I want the cost (time and memory) of serialisation/deserialisation to be low. Ideally, I don’t want to have to spend ages marking up my classes.

This leaves two likely candidates for some more prototyping work - Marc Gravell’s ProtoBuf .NET implementation and Google’s FlatBuffers. I’ve done some work with ProtoBuf previously, but I’ve not tried FlatBuffers before, and a few folks have said they have successfully used it - so that’s the next route I’m going down.

Is this really worth it?

Possibly not right now - in fact, for this specific prototype, probably not. I’m likely going to burn another week or so trying to get a nice format for my network data, at the expense of actually fixing the issues with the current prototype (alignment and performance). However, I think this is a valuable chunk of tech to understand and have a good foundation for future prototyping work - A lot of the upcoming prototypes will require high bandwidth feeds for depth and texture, and the more effort I put into manually serialising / transporting the data, the more I’m going to curse myself later when I have to re-do it all.

So, I’m going to burn a week, take it on the chin, and see what I get out the other end.

Then, back to solving the real problems.

Next week …

Catching up with the output from Oculus Connect, and more seriali(s z)ation.
Written on October 7, 2016