Serialization using ByteBuffer and recycled objects
Overview
I have always been of the view that using recycled objects is faster for serialization. So I wrote a test based on Thrift Protobuf Compare to see where it does well or poorly.
The benchmark suggests that serialization/deserialization is fast with ByteBuffer and recycled objects, but creating new objects is relatively expensive.
This suggests that provided you have a simple strategy for reusing objects, it may be worth using this approach. However, if you can't recycle objects I suspect it won't be worth the extra effort.
Total Serialization time
[Bar chart: total serialization time per serializer, with series from binaryxml/FI through ByteBuffer-specific]
All results
Note: the creation time for a recyclable object is much higher, but the serialization/deserialization times are much better.

Serializer, Object create, Serialize, /w Same Object, Deserialize, and Check Media, and Check All, Total Time, Serialized Size
ByteBuffer-specific, 154.98000, 242.50000, 303.00000, 132.00000, 132.00000, 132.00000, 374.50000, 213
avro-generic, 791.64500, 1675.00000, 1024.00000, 890.00000, 890.00000, 890.00000, 2565.00000, 211
avro-specific, 460.89500, 1089.00000, 756.50000, 1055.00000, 1055.00000, 1055.00000, 2144.00000, 211
activemq protobuf, 49.61000, 1203.50000, 186.00000, 7.00000, 612.50000, 934.00000, 2137.50000, 231
protobuf, 61.55500, 1130.50000, 762.50000, 579.00000, 797.00000, 839.50000, 1970.00000, 231
thrift, 62.97000, 1401.00000, 1502.00000, 1530.00000, 1530.00000, 1530.00000, 2931.00000, 353
hessian, 34.69000, 3208.00000, 3186.00000, 5597.00000, 5597.00000, 5597.00000, 8805.00000, 526
kryo, 35.33000, 988.50000, 1132.50000, 957.00000, 957.00000, 957.00000, 1945.50000, 226
kryo-optimized, 36.00000, 887.50000, 1051.00000, 915.00000, 915.00000, 915.00000, 1802.50000, 207
MessagePack (buggy), 34.93000, 873.50000, 960.00000, 674.00000, 674.00000, 674.00000, 1547.50000, 216
java, 38.50000, 5012.50000, 4705.50000, 23918.50000, 23918.50000, 23918.50000, 28931.00000, 919
java (externalizable), 38.99000, 707.00000, 839.50000, 610.00000, 610.00000, 610.00000, 1317.00000, 264
scala, 33.98000, 11184.50000, 10630.50000, 63917.00000, 63917.00000, 63917.00000, 75101.50000, 2024
json (jackson), 37.69000, 1934.50000, 2196.00000, 1738.00000, 1738.00000, 1738.00000, 3672.50000, 378
json/jackson-databind, 38.55500, 8888.00000, 9277.00000, 9537.50000, 9537.50000, 9537.50000, 18425.50000, 1815
JsonMarshaller, 40.04500, 4856.00000, 5127.00000, 8456.00000, 8456.00000, 8456.00000, 13312.00000, 370
protostuff-json, 67.91000, 2153.00000, 2361.00000, 2139.00000, 2139.00000, 2139.00000, 4292.00000, 448
protostuff-numeric-json, 65.95000, 2171.00000, 2220.50000, 2188.50000, 2188.50000, 2188.50000, 4359.50000, 359
json/google-gson, 40.13000, 102987.00000, 103714.00000, 123158.00000, 123158.00000, 123158.00000, 226145.00000, 470
stax/woodstox, 39.84500, 2357.50000, 2651.00000, 3576.50000, 3576.50000, 3576.50000, 5934.00000, 475
stax/aalto, 37.01500, 2008.50000, 2220.00000, 3104.50000, 3104.50000, 3104.50000, 5113.00000, 475
binaryxml/FI, 40.43000, 5322.00000, 5639.50000, 5092.50000, 5092.50000, 5092.50000, 10414.50000, 300
xstream (stax with conv), 41.80500, 4512.50000, 4276.50000, 8374.50000, 8374.50000, 8374.50000, 12887.00000, 399
javolution xmlformat, 36.89000, 3197.00000, 3309.00000, 3769.00000, 3769.00000, 3769.00000, 6966.00000, 419
sbinary, 33.50000, 1510.00000, 1690.00000, 972.00000, 972.00000, 972.00000, 2482.00000, 264

The main advantage over Java Externalizable is the option to re-use objects rather than create them each time. There is also a small advantage in being able to read/write a long as a single native read/write instead of 8 bytes with bit shift operations. (Similarly for int.)
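As a rough illustration of both points, here is a minimal sketch of a recyclable object that reads and writes itself through a ByteBuffer. The Trade class, its fields, and the RecycleDemo helper are hypothetical names made up for this example, not code from the benchmark. The putLongByShifts method shows the byte-at-a-time alternative that a single putLong call avoids.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical message type, used only for illustration.
final class Trade {
    long id;
    int quantity;
    double price;

    // Write the fields straight into the buffer; putLong handles all 8 bytes in one call.
    void writeTo(ByteBuffer out) {
        out.putLong(id);
        out.putInt(quantity);
        out.putDouble(price);
    }

    // Overwrite this instance's fields instead of allocating a new object.
    void readFrom(ByteBuffer in) {
        id = in.getLong();
        quantity = in.getInt();
        price = in.getDouble();
    }
}

final class RecycleDemo {
    // The alternative for a long: eight single-byte writes with shifts (big-endian order).
    static void putLongByShifts(ByteBuffer out, long v) {
        for (int shift = 56; shift >= 0; shift -= 8) {
            out.put((byte) (v >>> shift));
        }
    }

    // Serializes and deserializes `messages` trades using one buffer and two reused objects.
    static long roundTripSum(int messages) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(4 * 1024).order(ByteOrder.nativeOrder());
        Trade outgoing = new Trade(); // reused for every write
        Trade incoming = new Trade(); // reused for every read
        long sum = 0;
        for (int i = 0; i < messages; i++) {
            outgoing.id = i;
            outgoing.quantity = i * 10;
            outgoing.price = 1.5 * i;
            buffer.clear();            // reset the buffer rather than allocating a new one
            outgoing.writeTo(buffer);
            buffer.flip();             // switch from writing to reading
            incoming.readFrom(buffer);
            sum += incoming.id;
        }
        return sum;
    }
}
```

Nothing is allocated inside the loop, which is where the approach is expected to pay off. The cost is that incoming must not be handed to code that keeps a reference to it after the next read.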
The Code
Note: You have to write more code yourself if you are going to use ByteBuffers, but not much more than for a custom Java Externalizable. The main shift in thinking required is creating recyclable structures. In a multi-threaded context this is more difficult; however, if you use them with care, or model your application as a series of independent single-threaded processes, this doesn't have to be a problem.
ByteBuffers are built into Java so no additional library is required to use them; however, a small number of helper methods are useful.
Using direct ByteBuffers for serialization can also improve IO performance, as the data is already in native ("C") memory. An intermediate byte[] is not required.
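A minimal sketch of that idea, assuming a file as the IO target: the same direct buffer is handed to a FileChannel for both writing and reading, so no byte[] appears in the code. The class name DirectBufferIo and the two-field record layout are assumptions made for this example.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class DirectBufferIo {
    // Writes a long and an int to the file, reads them back with the same
    // direct buffer, and returns their sum as a simple check.
    static long writeThenRead(Path file, long id, int quantity) throws IOException {
        ByteBuffer buffer = ByteBuffer.allocateDirect(4 * 1024);
        buffer.putLong(id).putInt(quantity);
        buffer.flip();
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            while (buffer.hasRemaining()) {
                ch.write(buffer); // the channel reads directly from the buffer
            }
        }
        buffer.clear(); // recycle the buffer for the read
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // Keep reading until the 12 bytes of the record have arrived or EOF is hit.
            while (buffer.position() < Long.BYTES + Integer.BYTES && ch.read(buffer) >= 0) {
                // nothing else to do per iteration
            }
        }
        buffer.flip();
        return buffer.getLong() + buffer.getInt();
    }
}
```

The same pattern applies to a SocketChannel, which accepts a ByteBuffer in its read and write methods in the same way.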
I've read your ByteBuffer serialization example code and see that there is a limit on the serialized data of 4*1024 B, or am I missing something?
Second thing: you use ByteOrder.nativeOrder(). Don't you think that can cause problems when transmitting data between two different computer systems?
I picked 4 KB as this is a page of memory. It's actually much larger than needed for this example. It's a custom serializer and can be made any size really.
The ByteOrder.nativeOrder() doesn't change performance as much as you might think, so it can be dropped. Conversely, many microprocessors are little-endian, including x86/x64/Android ARM, so it may not be as much of a problem as you might imagine, e.g. if you are reading/writing on the same box or the same OS, or even Windows to Linux to Android.
http://stackoverflow.com/questions/6212951/endianness-of-android-ndk
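To make the byte-order point concrete, here is a small sketch in which the writer and reader agree on an explicit order instead of relying on nativeOrder(). The class and method names are illustrative only. Note that a newly created ByteBuffer defaults to big-endian regardless of the hardware.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class ByteOrderDemo {
    // Encode a long using an explicitly chosen byte order.
    static byte[] encode(long value, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.allocate(Long.BYTES).order(order);
        buf.putLong(value);
        return buf.array();
    }

    // Decode a long; the reader must use the same order as the writer.
    static long decode(byte[] bytes, ByteOrder order) {
        return ByteBuffer.wrap(bytes).order(order).getLong();
    }
}
```

If both sides fix the order (say ByteOrder.LITTLE_ENDIAN), the data is portable between machines. Reading with the wrong order yields the byte-reversed value rather than an error, so a mismatch can go unnoticed.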
Thanks for the response. I understood the part about nativeOrder(), but 4 KB is a problem when serializing e.g. an MP3 file, which can be 3 MB or 100 MB (for 60-minute songs). The size of an MP3 file can be approximated, but for other data it's hard to decide on a size, and for every serialized object we need to create another ByteBuffer with a different size, which takes time (for a "direct" ByteBuffer it takes even more time). Maybe I'm wrong, but this is the place where we pay for using "the fastest" solution. Do you know how to get around this?
If you are reusing your direct ByteBuffers you don't need to worry about how big they are, as they use virtual memory rather than main memory (until it is used).
If you have a 64-bit JVM you can make all the ByteBuffers 1 GB in size and only the pages you access are allocated. (If you use a 32-bit JVM you can run out of virtual memory as well.)
What I do is serialize to/from memory-mapped files so I don't need to worry about copying the serialized data (or even a system call to read/write each object).
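A minimal sketch of the memory-mapped approach using the standard FileChannel.map call. The class name and the 4 KB mapping size are assumptions chosen to keep the example small; this is not the code used in the benchmark.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class MappedFileDemo {
    // Maps the first 4 KB of the file, writes a long into the mapping, and reads it back.
    static long writeThenRead(Path file, long value) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // The mapping is a ByteBuffer backed by the file's pages.
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
            map.putLong(0, value); // a plain memory write, no explicit write() call
            map.force();           // optional: ask the OS to flush the change to disk
            return map.getLong(0);
        }
    }
}
```

The mapping stays valid after the channel is closed, and a larger size can be requested in the same call. Each individual mapping is limited to Integer.MAX_VALUE bytes.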
This data is great. I've been trying to zero in on a serialization algorithm that is not only fast to serialize/deserialize but can also do targeted deserialization. For example, I have a Java object containing a few String objects and, after serialization, I need to read (deserialize) the value of one specific String. I've searched the web for conclusive answers and started on my own serialization format, but thought of checking with you one last time.
Java Chronicle (a library of mine) supports this. You can select a portion of the data to deserialize.
Note: It doesn't support plain Java Serialization, however.
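For readers who want the general idea without a library, here is a minimal sketch of targeted deserialization with a plain ByteBuffer. This is not Java Chronicle's API. The length-prefixed layout and the FieldSkipDemo name are assumptions made for this example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class FieldSkipDemo {
    // Each string is stored as a 4-byte length followed by its UTF-8 bytes.
    static void writeString(ByteBuffer out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        out.putInt(utf8.length).put(utf8);
    }

    // Decodes only the string at `index`; earlier strings are skipped, not decoded.
    static String readStringAt(ByteBuffer in, int index) {
        // Work on a duplicate so the caller's position is left untouched.
        ByteBuffer view = in.duplicate().order(in.order());
        for (int i = 0; i < index; i++) {
            int length = view.getInt();
            view.position(view.position() + length); // jump over the payload
        }
        byte[] utf8 = new byte[view.getInt()];
        view.get(utf8);
        return new String(utf8, StandardCharsets.UTF_8);
    }
}
```

Skipping still walks the length prefixes of the earlier fields. A fixed-offset layout or an index of field positions would avoid even that, at the cost of a more rigid format.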
You probably missed that both fst and kryo are significantly faster when using Externalizable compared to JDK Externalizable (ObjectInputStream has a high init cost). Also, it really depends on the data serialized. What would the test look like if serializing a cyclic structure like a HashMap or a graph of interlinked (potentially cyclic) POJOs?
In addition, both fst and kryo offer interfaces to directly read/write byte buffers (less allocation, no byte[] required then), which again makes a significant difference.
Is there a GitHub repo for the test?