Serialization and Deserialization: The Bridge Between Runtime Objects and Bytes

Every time you save a file, make an API call, or store data in a database, you’re using serialization. Yet many developers use these mechanisms daily without understanding the fundamental transformation happening under the hood.

Let’s demystify serialization and deserialization by understanding what they really are: conversions between runtime objects and bytes.


The Core Concept

SERIALIZATION:   Runtime Objects → Bytes
DESERIALIZATION: Bytes → Runtime Objects

That’s it. Everything else is implementation details.


What Are Runtime Objects?

Runtime objects are data structures that exist in your program’s memory while it’s running. They’re language-specific constructs with:

  • Type information
  • Memory addresses
  • Language-specific structure (prototypes, vtables, reference counting)
  • Behavior (methods, functions)

Examples across languages:

Go:

1
2
3
4
5
6
7
// Runtime object: struct instance in memory
profile := Profile{
    Name:     "my-project",
    IsActive: true,
    Created:  time.Now(),
}
// Exists as a memory structure with fields at specific offsets

JavaScript:

1
2
3
4
5
6
7
// Runtime object: JS object in V8 heap
const profile = {
    name: "my-project",
    isActive: true,
    created: new Date()
};
// Exists with prototype chain, hidden classes, etc.

Python:

1
2
3
4
5
6
7
# Runtime object: dict in Python heap
profile = {
    "name": "my-project",
    "is_active": True,
    "created": datetime.now()
}
# Exists as PyObject with reference counting

The key insight: These objects only exist while your program is running. They live in RAM. When your program exits, they vanish.

Key Concept: Runtime objects are ephemeral. They exist only in memory while your program runs. Once the program exits or the object goes out of scope, it’s gone forever. This is why we need serialization to preserve data.

Why Bytes?

Bytes are universal:

  • Files on disk store bytes
  • Network packets contain bytes
  • Database records are bytes
  • HTTP bodies are bytes
  • Everything that persists or travels is bytes

Runtime objects are ephemeral and language-specific:

  • They only exist in RAM while your program runs
  • A Go struct can’t be directly stored on disk
  • A JavaScript object can’t be sent over a network socket
  • A Python dict can’t be read by a Java program

Bytes bridge this gap. They’re the universal intermediate format that enables:

  • Persistence - Survive program restarts
  • Communication - Cross machine boundaries
  • Interoperability - Cross language boundaries
  • Storage - Save to disk, databases, caches

The Transformation

flowchart TB subgraph memory["Runtime Memory (Ephemeral)"] obj["Go Struct
───────────
type Profile struct {
Name string
IsActive bool
}

profile := Profile{
Name: 'my-project',
IsActive: true
}"] end subgraph bytes["Bytes (Persistent)"] data["Byte Sequence
───────────
[123, 34, 110, 97, 109, 101, 34, 58, 34, ...]

OR as text:
{'name':'my-project','is_active':true}"] end subgraph storage["Storage / Transmission"] disk["Disk File"] net["Network Packet"] db["Database Record"] cache["Cache Entry"] end obj -->|"Serialize
(Marshal/Encode)"| data data -->|"Deserialize
(Unmarshal/Decode)"| obj data --> disk data --> net data --> db data --> cache style memory fill:#1e3a5f,stroke:#4a9eff,color:#e2e8f0 style bytes fill:#2c5282,stroke:#4299e1,color:#e2e8f0 style storage fill:#22543d,stroke:#2f855a,color:#e2e8f0 style obj fill:#1a365d,stroke:#2c5282,color:#e2e8f0 style data fill:#2c5282,stroke:#63b3ed,color:#e2e8f0

Serialization: Objects → Bytes

Serialization converts runtime objects into a byte sequence.

Go example:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
// 1. Runtime object (in memory)
profile := Profile{
    Name:     "my-project",
    IsActive: true,
}

// 2. Serialize to bytes
jsonBytes, _ := json.Marshal(profile)
// jsonBytes = [123, 34, 110, 97, 109, 101, 34, 58, ...] (raw bytes)
// As string: {"name":"my-project","is_active":true}

// 3. Now you can persist/transmit the bytes
os.WriteFile("profile.json", jsonBytes, 0644)  // Write to disk
conn.Write(jsonBytes)                           // Send over network
redis.Set("profile", jsonBytes)                 // Store in cache

What happens during serialization:

  1. Traverse object structure - Walk through all fields/properties
  2. Convert to format - Apply encoding rules (JSON, protobuf, etc.)
  3. Generate bytes - Produce sequential byte stream
  4. Discard metadata - Type info, methods, pointers are lost

The bytes have no structure - they’re just a sequence of numbers. No type information. No methods. Just data.


Deserialization: Bytes → Objects

Deserialization reconstructs runtime objects from bytes.

Go example:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
// 1. Read bytes (from disk, network, database)
jsonBytes, _ := os.ReadFile("profile.json")
// jsonBytes = [123, 34, 110, 97, 109, 101, 34, ...]

// 2. Deserialize to runtime object
var profile Profile
json.Unmarshal(jsonBytes, &profile)

// 3. Use the reconstructed object
fmt.Println(profile.Name)       // Access fields
if profile.IsActive {           // Use in logic
    activate(profile)           // Pass to functions
}

What happens during deserialization:

  1. Parse bytes - Interpret according to format rules
  2. Allocate memory - Create new object/struct/dict
  3. Populate fields - Assign values from parsed data
  4. Type checking - Validate against schema (if statically typed)

The Lifecycle

flowchart LR subgraph prog1["Program 1 (Go)"] create["Create Object
profile := Profile{...}"] use1["Use Object
fmt.Println(profile.Name)"] end subgraph serial["Serialization"] marshal["json.Marshal()
Object → Bytes"] end subgraph persist["Persistence"] file["file.json
bytes on disk"] end subgraph deserial["Deserialization"] unmarshal["json.Unmarshal()
Bytes → Object"] end subgraph prog2["Program 2 (JavaScript)"] parse["Parse JSON
JSON.parse(bytes)"] use2["Use Object
console.log(obj.name)"] end create --> use1 use1 --> marshal marshal --> file file --> unmarshal unmarshal --> use1 file -.->|"Different language!"| parse parse --> use2 style prog1 fill:#1e3a5f,stroke:#4a9eff,color:#e2e8f0 style serial fill:#742a2a,stroke:#c53030,color:#e2e8f0 style persist fill:#2c5282,stroke:#4299e1,color:#e2e8f0 style deserial fill:#22543d,stroke:#2f855a,color:#e2e8f0 style prog2 fill:#1e3a5f,stroke:#4a9eff,color:#e2e8f0

Notice: The bytes don’t “know” they came from Go. JavaScript can read the same bytes and build a JavaScript object. This is the power of serialization.


Serialization Formats

Different formats offer different tradeoffs:

JSON (JavaScript Object Notation)

Characteristics:

  • Human-readable text
  • Language-agnostic
  • UTF-8 encoded
  • Self-describing (field names included)

Tradeoffs:

  • + Easy to debug
  • + Universal support
  • + Works with any language
  • - Verbose (large size)
  • - Slow to parse
  • - No schema enforcement

Use cases: Config files, REST APIs, human-readable data

Example:

1
2
3
4
5
{
  "name": "my-project",
  "is_active": true,
  "tags": ["work", "golang"]
}

Protocol Buffers (protobuf)

Characteristics:

  • Binary format
  • Schema-required (.proto files)
  • Strongly typed
  • Very compact

Tradeoffs:

  • + Extremely fast
  • + Very small size
  • + Strong typing
  • + Forward/backward compatibility
  • - Not human-readable
  • - Requires schema
  • - Requires code generation

Use cases: gRPC, high-performance APIs, microservices

Schema (.proto):

1
2
3
4
5
message Profile {
  string name = 1;
  bool is_active = 2;
  repeated string tags = 3;
}

Bytes (hex):

0a 0a 6d 79 2d 70 72 6f 6a 65 63 74 10 01 1a 04 77 6f 72 6b

MessagePack

Characteristics:

  • Binary format
  • Like “binary JSON”
  • No schema required
  • More compact than JSON

Tradeoffs:

  • + Smaller than JSON
  • + Faster than JSON
  • + No schema needed
  • + Multiple language support
  • - Not human-readable
  • - Less universal than JSON

Use cases: Redis caching, log shipping, binary APIs


XML (eXtensible Markup Language)

Characteristics:

  • Human-readable text
  • Tag-based structure
  • Schema optional (XSD)
  • Verbose

Tradeoffs:

  • + Self-describing
  • + Schema validation available
  • + Mature tooling
  • - Very verbose
  • - Slow to parse
  • - Falling out of favor

Use cases: Legacy systems, SOAP APIs, enterprise integration

Example:

1
2
3
4
5
6
7
8
<profile>
  <name>my-project</name>
  <is_active>true</is_active>
  <tags>
    <tag>work</tag>
    <tag>golang</tag>
  </tags>
</profile>

YAML (YAML Ain’t Markup Language)

Characteristics:

  • Human-readable text
  • Indentation-based
  • Superset of JSON
  • Comments supported

Tradeoffs:

  • + Very readable
  • + Supports comments
  • + Less verbose than JSON
  • - Indentation-sensitive
  • - Ambiguous syntax
  • - Slower to parse

Use cases: Config files, CI/CD (GitHub Actions, Kubernetes), Ansible

Example:

1
2
3
4
5
name: my-project
is_active: true
tags:
  - work
  - golang

TOML (Tom’s Obvious Minimal Language)

Characteristics:

  • Human-readable text
  • INI-file inspired
  • Explicit and unambiguous
  • Table-based structure

Tradeoffs:

  • + Very readable
  • + Unambiguous syntax
  • + Good for config
  • - Limited adoption
  • - Verbose for nested data

Use cases: Config files (Cargo.toml, pyproject.toml)

Example:

1
2
3
name = "my-project"
is_active = true
tags = ["work", "golang"]

Format Comparison

flowchart TB question{"What's your priority?"} human["Human readability?"] perf["Performance?"] compat["Maximum compatibility?"] config["Configuration files?"] json["JSON
• Universal
• REST APIs
• Debugging"] yaml["YAML
• Config files
• Comments
• Readable"] toml["TOML
• Simple config
• Unambiguous
• Rust/Python"] protobuf["Protobuf
• gRPC
• High perf
• Typed"] msgpack["MessagePack
• Binary JSON
• Fast
• Compact"] question --> human question --> perf question --> compat question --> config human --> yaml human --> toml perf --> protobuf perf --> msgpack compat --> json config --> yaml config --> toml style question fill:#2d3748,stroke:#4a5568,color:#e2e8f0 style human fill:#742a2a,stroke:#c53030,color:#e2e8f0 style perf fill:#742a2a,stroke:#c53030,color:#e2e8f0 style compat fill:#742a2a,stroke:#c53030,color:#e2e8f0 style config fill:#742a2a,stroke:#c53030,color:#e2e8f0 style json fill:#2c5282,stroke:#4299e1,color:#e2e8f0 style yaml fill:#2c5282,stroke:#4299e1,color:#e2e8f0 style toml fill:#2c5282,stroke:#4299e1,color:#e2e8f0 style protobuf fill:#22543d,stroke:#2f855a,color:#e2e8f0 style msgpack fill:#22543d,stroke:#2f855a,color:#e2e8f0

Real-World Examples

Example 1: Saving Config

Go - dotclaude saving active profile:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
// 1. Runtime object
config := Config{
    ActiveProfile: "my-project",
    BackupCount:   5,
    LastSync:      time.Now(),
}

// 2. Serialize to bytes
data, _ := json.MarshalIndent(config, "", "  ")

// 3. Write bytes to disk
os.WriteFile("~/.dotclaude/config.json", data, 0644)

// Program exits. Object gone. Only bytes remain on disk.

Later, loading config:

1
2
3
4
5
6
7
8
9
// 1. Read bytes from disk
data, _ := os.ReadFile("~/.dotclaude/config.json")

// 2. Deserialize to runtime object
var config Config
json.Unmarshal(data, &config)

// 3. Use the object
fmt.Println("Active:", config.ActiveProfile)

Example 2: REST API

Client (JavaScript) sending data:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
// 1. Runtime object
const user = {
    name: "Alice",
    email: "alice@example.com",
    role: "admin"
};

// 2. Serialize to JSON bytes
const json = JSON.stringify(user);
// json = '{"name":"Alice","email":"alice@example.com","role":"admin"}'

// 3. Send bytes over network
fetch('/api/users', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: json  // Bytes go over HTTP
});

Server (Go) receiving data:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
// 1. Read bytes from HTTP request
body, _ := io.ReadAll(r.Body)

// 2. Deserialize to runtime object
var user User
json.Unmarshal(body, &user)

// 3. Use the object
if user.Role == "admin" {
    grantAdminAccess(user)
}

Different languages, same data!


Example 3: Database Storage

Saving to database:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
// 1. Runtime object
session := Session{
    UserID:    123,
    Token:     "abc123",
    ExpiresAt: time.Now().Add(24 * time.Hour),
}

// 2. Serialize to bytes (JSON or protobuf)
bytes, _ := json.Marshal(session)

// 3. Store bytes in database
db.Exec("INSERT INTO sessions (data) VALUES (?)", bytes)

Loading from database:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
// 1. Query bytes from database
var bytes []byte
db.QueryRow("SELECT data FROM sessions WHERE id = ?", id).Scan(&bytes)

// 2. Deserialize to runtime object
var session Session
json.Unmarshal(bytes, &session)

// 3. Use the object
if time.Now().After(session.ExpiresAt) {
    return errors.New("session expired")
}

Common Pitfalls

1. Assuming Serialization Preserves Everything

Not preserved:

  • Methods/functions
  • Private fields (depends on language/serializer)
  • Pointer relationships
  • Type information (in some formats)
  • Circular references
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
type User struct {
    Name string
    // Methods are NOT serialized
}

func (u *User) Greet() string {
    return "Hello, " + u.Name
}

// After serialization → deserialization:
// You get the Name field, but NOT the Greet() method

2. Version Compatibility

Schema changes break deserialization:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
// Version 1
type Config struct {
    Name string
}

// Serialize with V1, store in database

// Version 2 (field added)
type Config struct {
    Name   string
    Email  string  // NEW FIELD
}

// Deserialize old bytes with V2 struct
// Email will be zero value (empty string)

Solution: Use versioning strategies:

  • Optional fields with defaults
  • Schema evolution (protobuf)
  • Version numbers in serialized data

3. Performance Assumptions

JSON is slow for large datasets:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
// BAD: Repeatedly serialize in hot path
for i := 0; i < 1000000; i++ {
    json.Marshal(largeObject)  // Expensive!
}

// GOOD: Serialize once, reuse bytes
bytes, _ := json.Marshal(largeObject)
for i := 0; i < 1000000; i++ {
    sendBytes(bytes)
}

4. Security Vulnerabilities

Security Warning: Deserializing untrusted data is dangerous and can lead to remote code execution vulnerabilities. Never deserialize data from untrusted sources without proper validation.

Example of dangerous code:

1
2
3
# NEVER do this with untrusted input
import pickle
data = pickle.loads(untrusted_bytes)  # Can execute arbitrary code!

Safe approach:

  • Use safe formats (JSON, not pickle)
  • Validate after deserialization
  • Use schemas (JSON Schema, protobuf)
  • Set size limits

Best Practices

1. Choose the Right Format

ScenarioFormat
Config filesYAML or TOML
REST APIsJSON
High-performance RPCProtobuf
Logs/metricsMessagePack or JSON
Legacy systemsXML
Binary cachingMessagePack

2. Handle Errors

1
2
3
4
5
6
7
8
9
// BAD: Ignoring errors
var config Config
json.Unmarshal(bytes, &config)

// GOOD: Check errors
var config Config
if err := json.Unmarshal(bytes, &config); err != nil {
    return fmt.Errorf("failed to deserialize config: %w", err)
}

3. Use Schemas When Possible

Protobuf schema:

1
2
3
4
message User {
  string name = 1 [(validate.rules).string.min_len = 1];
  string email = 2 [(validate.rules).string.email = true];
}

Benefits:

  • Type safety
  • Validation
  • Documentation
  • Code generation
  • Version compatibility

4. Consider Size and Speed

Performance comparison (serializing/deserializing 1000 user records):

FormatSizeSpeedBest For
Protobuf61 KB12msInternal services, gRPC
MessagePack89 KB35msCaching, binary APIs
JSON245 KB100msREST APIs, config files
XML412 KB187msLegacy systems (avoid for new projects)

Key takeaway: Binary formats (protobuf, MessagePack) are 2-4x smaller and 3-8x faster than text formats (JSON, XML).

Rule of thumb:

  • JSON - REST APIs, config files, anything human-readable
  • Protobuf - High-performance internal services, gRPC
  • MessagePack - Fast caching, log shipping
  • XML - Only for legacy integration

Conclusion

Serialization and deserialization are fundamental transformations that enable:

  1. Persistence - Objects survive program restarts
  2. Communication - Objects travel across networks
  3. Interoperability - Objects cross language boundaries
  4. Storage - Objects live in databases and caches

The key insight: Runtime objects are ephemeral and language-specific. Bytes are persistent and universal. Serialization is the bridge.

Every time you save a file, call an API, or query a database, you’re converting between these two worlds. Understanding this transformation helps you choose the right format, debug issues, and build robust systems.


Further Reading