[36.3] How do I decide whether to serialize to human-readable ("text") or non-human-readable ("binary") format?
There is no "right" answer to this question; it really depends on your goals.
Here are a few of the pros/cons of human-readable ("text") format
vs. non-human-readable ("binary") format:
- Text format is easier to "desk check." That means you won't have to
write extra tools to debug the input and output; you can open the serialized
output with a text editor to see if it looks right.
- Binary format typically uses fewer CPU cycles. However, that is
relevant only if your application is CPU bound and you do serialization
and/or unserialization in an inner loop or bottleneck. Remember: 90% of
the CPU time is spent in 10% of the code, so there won't be any practical
performance benefit unless your "CPU meter" is pegged at 100% and your
serialization and/or unserialization code consumes a healthy portion of
that 100%.
- Text format lets you ignore programming issues like sizeof
and little-endian vs. big-endian.
- Binary format lets you ignore separations between adjacent values,
since many values have fixed lengths.
- Text format can produce smaller results when most numbers are small
and when you need to textually encode binary results, e.g., with uuencode or
a similar scheme.
- Binary format can produce smaller results when most numbers are
large or when you don't need to textually encode binary results.
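To make the size, separator and byte-order points above concrete, here is a minimal sketch that encodes the same 32-bit value both ways. The function names (encodeText, encodeBinary, decodeBinary), the newline separator and the choice of most-significant-byte-first order are assumptions made for this illustration, not part of any particular library or file format.

```cpp
#include <cstdint>
#include <string>

// Text encoding: decimal digits followed by a '\n' separator.
// No sizeof or byte-order concerns, but adjacent values need the separator.
std::string encodeText(std::uint32_t value)
{
    return std::to_string(value) + '\n';
}

// Binary encoding: always exactly four bytes, most significant byte first.
// Shifting and masking makes the result independent of the host's endianness.
std::string encodeBinary(std::uint32_t value)
{
    std::string out(4, '\0');
    for (int i = 0; i < 4; ++i)
        out[i] = static_cast<char>((value >> (8 * (3 - i))) & 0xFF);
    return out;
}

// Inverse of encodeBinary; expects at least four bytes.
std::uint32_t decodeBinary(const std::string& bytes)
{
    std::uint32_t value = 0;
    for (int i = 0; i < 4; ++i)
        value = (value << 8) | static_cast<unsigned char>(bytes[i]);
    return value;
}
```

With these helpers, the value 7 takes two bytes as text (one digit plus the separator) but four bytes as binary, while 4000000000 takes eleven bytes as text and still only four as binary. Note that the binary version needs no separator because every value has the same fixed length.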
You might think of others to add as well... The important thing to remember
is that one size does not fit all — make a careful decision here.
One more thing: no matter which you choose, you might want to start each file
/ stream with a "magic" tag and a version number. The version number would
indicate the format rules. That way if you decide to make a radical change in
the format, you hopefully will still be able to read the output produced by
the old software.
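
As a rough sketch of the "magic" tag plus version number idea, the following writes and checks a small text header at the start of a stream. The tag "MYFM", the version value 2, and the names writeHeader and readHeader are all hypothetical, picked only for this example; a real format would choose its own.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical tag and version number, chosen only for this example.
const std::string kMagic = "MYFM";
const std::uint32_t kCurrentVersion = 2;

// Writes the header line, e.g. "MYFM 2\n", at the start of the stream.
void writeHeader(std::ostream& out)
{
    out << kMagic << ' ' << kCurrentVersion << '\n';
}

// Reads the header and returns the version found in the stream.
// Throws if the tag is wrong or the version is newer than this code knows.
std::uint32_t readHeader(std::istream& in)
{
    std::string magic;
    std::uint32_t version = 0;
    if (!(in >> magic >> version) || magic != kMagic)
        throw std::runtime_error("not a recognized stream");
    if (version > kCurrentVersion)
        throw std::runtime_error("stream was written by newer software");
    return version;
}
```

The caller can then branch on the returned version number, so data written under the old format rules can still be read after the format changes.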