13 July 2014

Data Serialization

Serialization converts objects or data states into a storage or transmission format from which they can later be reconstructed for further processing. Serializing an object is also called marshalling; the reverse process, reconstructing the object from a sequence of bytes, is called deserialization or unmarshalling. Serialization provides a method for persisting objects to storage, for making remote procedure calls, for distributing objects, and for detecting changes in data. Many languages support object serialization. However, serialization formats differ in performance and in how flexibly they fit a given domain. Big data workloads often rely on efficient serialization formats that are not only compact but also provide native support for partitioning and schema evolution. In other cases it may be more appropriate to rely on text formats such as XML and JSON, which support more sophisticated data structures with composite fields as well as hierarchical data.
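As a rough illustration of marshalling, unmarshalling, and change detection, here is a minimal Python sketch using only the standard library. The record contents and the helper names (serialize, deserialize, fingerprint) are made up for this example, and JSON is used only because it needs no extra dependencies; a binary format such as Avro or Protocol Buffers would follow the same round-trip pattern with a schema added.

```python
import hashlib
import json


def serialize(record: dict) -> bytes:
    """Marshal a record into UTF-8 JSON bytes with a stable key order."""
    # sort_keys gives a deterministic byte sequence for equal records,
    # which is what makes the fingerprint below meaningful.
    text = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return text.encode("utf-8")


def deserialize(payload: bytes) -> dict:
    """Unmarshal JSON bytes back into a Python dictionary."""
    return json.loads(payload.decode("utf-8"))


def fingerprint(record: dict) -> str:
    """Hash the serialized form so two states can be compared cheaply."""
    return hashlib.sha256(serialize(record)).hexdigest()


# Hypothetical record with a composite field and a nested (hierarchical) field.
record = {
    "id": 7,
    "name": "sensor-a",
    "tags": ["lab", "north"],
    "location": {"lat": 51.5, "lon": -0.1},
}

payload = serialize(record)      # bytes that can be stored or sent
restored = deserialize(payload)  # reconstructed object

# A changed copy produces a different fingerprint than the original.
changed = dict(record, name="sensor-b")
has_changed = fingerprint(changed) != fingerprint(record)
```

Here restored compares equal to record, and has_changed is True because the two serialized forms differ. Sorting the keys is a design choice for this sketch: without a canonical byte layout, two equal records could hash differently.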

List of Data Serialization Formats
Comparison of Thrift vs ProtoBuff vs Avro
Comparison of Data Serialization Formats
Understanding RDF Serialization Formats
RDF And Serialization Formats
Thrift
Avro
JSON
JSONLD
YAML
Protocol Buffers
MessagePack
XML
XML-RPC