The Packet Parser Crate
By Cyprien Avico
www.linkedin.com/in/cyprien-avico
The repo of the crate: GitHub - Packet Parser
The crate link: coming soon...
This is not a Rust doc. This is documentation on how I personally parse packets.
Feel free to criticize, do a PR, or send me a message on LinkedIn.
Introduction
Packet Parser is a Rust library designed for parsing network frames.
This book explains how I developed it and how its internal architecture works, so you can contribute.
Key Features
- Multi-layer support: Supports parsing of data link, network, transport, and application layers.
- Data validation: Built-in mechanisms to ensure packet integrity.
- Precise error management: Each layer has its own dedicated error types for better debugging.
- Optimized performance: Integrated benchmarking using Criterion.
- Extensibility: Modular architecture that allows easy addition of new protocols.
Purpose of this crate
The goal of this crate is to provide a function that transforms a packet (more precisely, a list of bytes) into any type of packet structure, or returns an error if you are being fooled and receive incoherent bytes.
- It is not restricted to a specific layer: You can pass a TCP payload, and it will return an HTTP, TLS, NTP, or other applicable protocol structure.
- You can provide a full network packet, and it will return a structured representation containing data link, network, transport, and application layers.
To explain how I made this crate, let's dive into packet parsing, my passion.
First, we have to define what I call a packet, because that is what we start from.
Then we'll see how I parse this packet:
Parsing procedure
Now that we know how to parse a packet, let's see how we retrieve the network layer structure as an example, so you can understand the data validation procedure I use for every struct in this crate: TryFrom.
Packet Structure and Parsing Approach
What is a Packet?
A network packet is a sequence of bytes transmitted over a network: essentially a list of bytes representing network data.
Here's an example of a raw packet in hexadecimal format:
let packet: &[u8] = &[0x00, 0x11, 0x22, 0x33, 0x44, 0x55, /* other bytes */];
It is preferable to reference the packet (&[u8]) rather than copying it, to avoid unnecessary memory usage and improve performance.
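The borrowing idea can be sketched as follows. This is only an illustration: `PacketView` and `split_packet` are hypothetical names, not types or functions from the crate. The point is that the returned structure holds slices into the original buffer, so no byte is copied.

```rust
/// Hypothetical view over a packet: it borrows the bytes instead of owning a copy.
#[derive(Debug, PartialEq)]
struct PacketView<'a> {
    header: &'a [u8],
    payload: &'a [u8],
}

/// Splits a packet into a header view and a payload view without copying.
/// Returns `None` when the packet is shorter than `header_len`.
fn split_packet<'a>(packet: &'a [u8], header_len: usize) -> Option<PacketView<'a>> {
    if packet.len() < header_len {
        return None;
    }
    // `split_at` only produces two sub-slices pointing into `packet`.
    let (header, payload) = packet.split_at(header_len);
    Some(PacketView { header, payload })
}
```

Because of the lifetime `'a`, the compiler guarantees that a `PacketView` cannot outlive the buffer it points into.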
Identifying Protocols in the Packet
Each protocol occupies a specific part of the packet. By analyzing the bytes, we can identify different layers.
Protocols are Nested (Like Russian Dolls)
A network packet is structured as a series of encapsulated layers: each layer contains a protocol that encapsulates the next.
ParsedPacket
Layered Structure
Once parsed, a packet is structured into four layers, named after their OSI model counterparts (data link, network, transport, and application):
The Data Link Layer is always present, while the others depend on the packet type.
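One way to express "data link always present, the rest optional" is with `Option` fields. The sketch below is an assumption about the shape, not the crate's actual definition: the four layer structs are hypothetical placeholders with a single illustrative field each.

```rust
// Hypothetical placeholder layers; the real structures carry many more fields.
#[derive(Debug, PartialEq)]
struct DataLink { ethertype: u16 }
#[derive(Debug, PartialEq)]
struct Network { protocol: &'static str }
#[derive(Debug, PartialEq)]
struct Transport { source_port: u16, destination_port: u16 }
#[derive(Debug, PartialEq)]
struct Application { protocol: &'static str }

/// Sketch of a parsed packet: only the data link layer is mandatory.
#[derive(Debug, PartialEq)]
struct ParsedPacket {
    data_link: DataLink,
    network: Option<Network>,
    transport: Option<Transport>,
    application: Option<Application>,
}

impl ParsedPacket {
    /// Number of layers that were decoded (always at least 1).
    fn layer_count(&self) -> usize {
        1 + self.network.is_some() as usize
            + self.transport.is_some() as usize
            + self.application.is_some() as usize
    }
}
```

With this shape, an ARP frame would fill `data_link` and `network` and leave `transport` and `application` as `None`.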
How Layers Interact with Addresses and Entry/Exit Points
Each layer contains specific information to identify source and destination addresses.
Detailed Breakdown of Parsed Structures
Each protocol has its own structure with unique fields.
Parsing Strategy Based on Payloads
Parsing is determined by the payloads extracted at each stage.
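As a rough sketch of payload-driven parsing, each step below takes a slice, checks only its own bytes, and hands back the remaining payload for the next step. The function names are hypothetical; the sketch assumes an untagged Ethernet II frame (14-byte header) and uses the IPv4 IHL field (low nibble of the first byte, counted in 32-bit words) to find where the IPv4 header ends.

```rust
/// Returns the bytes after the 14-byte Ethernet II header, if the frame is long enough.
fn ethernet_payload(frame: &[u8]) -> Option<&[u8]> {
    frame.get(14..)
}

/// Returns the bytes after the IPv4 header, or `None` if the bytes
/// do not look like a valid IPv4 header.
fn ipv4_payload(packet: &[u8]) -> Option<&[u8]> {
    let first = *packet.first()?;
    // High nibble: IP version. It must be 4 for IPv4.
    if first >> 4 != 4 {
        return None;
    }
    // Low nibble: header length in 32-bit words (minimum 5, i.e. 20 bytes).
    let header_len = usize::from(first & 0x0F) * 4;
    if header_len < 20 {
        return None;
    }
    packet.get(header_len..)
}
```

Chaining the two calls peels one layer at a time: the output of `ethernet_payload` is the input of `ipv4_payload`, and so on up the stack.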
Independent Layer Parsing
Each layer must be parsed independently from the others.
We do not use information from one layer to infer details about another.
Why?
- Security: Attackers can manipulate packet fields (e.g., changing port numbers).
- Flexibility: Some protocols do not strictly follow conventional port assignments.
- Reliability: Parsing should be based on raw data, not assumptions.
For example, we do not parse an application-layer protocol based on the transport-layer port number.
Just because a packet has port 80 does not mean it contains HTTP; it could be anything.
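To illustrate content-based detection, here is a deliberately small sketch that recognizes an HTTP/1.x request from the first bytes of the payload and never looks at a port number. `ApplicationGuess`, `looks_like_http_request`, and `guess_application` are hypothetical names, and the method list is intentionally incomplete.

```rust
#[derive(Debug, PartialEq)]
enum ApplicationGuess {
    Http,
    Unknown,
}

/// Returns true when the payload starts like an HTTP/1.x request line.
/// Only a few methods are checked; this is an illustration, not a full parser.
fn looks_like_http_request(payload: &[u8]) -> bool {
    const METHODS: [&[u8]; 5] = [b"GET ", b"POST ", b"PUT ", b"HEAD ", b"DELETE "];
    METHODS.iter().any(|method| payload.starts_with(method))
}

/// Classifies a transport payload from its bytes alone; no port is involved.
fn guess_application(payload: &[u8]) -> ApplicationGuess {
    if looks_like_http_request(payload) {
        ApplicationGuess::Http
    } else {
        ApplicationGuess::Unknown
    }
}
```

A payload sent to port 80 that begins with `0x16 0x03 0x01` (the start of a TLS record) is classified as `Unknown` here rather than being assumed to be HTTP.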
Data validation procedure
When we receive a packet, we use TryFrom to apply several validation steps to it. If the validations succeed, the function returns a structured representation of the packet or a part of it. If the validations fail, it returns a custom error, implemented using the thiserror crate.
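The pattern can be sketched on a UDP header, chosen here only because it is short. `UdpHeader` and `UdpError` are hypothetical names, not the crate's types. To keep the sketch dependency-free, the error implements `Display` and `Error` by hand; with thiserror the same enum would use its derive macro instead.

```rust
use std::convert::TryFrom;
use std::fmt;

/// Hypothetical error type with one variant per failed validation.
#[derive(Debug, PartialEq)]
enum UdpError {
    TooShort { actual: usize },
    InvalidLength { declared: u16, available: usize },
}

impl fmt::Display for UdpError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match *self {
            UdpError::TooShort { actual } => {
                write!(f, "UDP header needs 8 bytes, got {}", actual)
            }
            UdpError::InvalidLength { declared, available } => {
                write!(f, "UDP length field is {} but {} bytes are available", declared, available)
            }
        }
    }
}

impl std::error::Error for UdpError {}

#[derive(Debug, PartialEq)]
struct UdpHeader<'a> {
    source_port: u16,
    destination_port: u16,
    length: u16,
    checksum: u16,
    payload: &'a [u8],
}

impl<'a> TryFrom<&'a [u8]> for UdpHeader<'a> {
    type Error = UdpError;

    fn try_from(bytes: &'a [u8]) -> Result<Self, Self::Error> {
        // Validation 1: the fixed 8-byte header must be present.
        if bytes.len() < 8 {
            return Err(UdpError::TooShort { actual: bytes.len() });
        }
        // Validation 2: the length field covers header + payload and must fit the buffer.
        let length = u16::from_be_bytes([bytes[4], bytes[5]]);
        if length < 8 || usize::from(length) > bytes.len() {
            return Err(UdpError::InvalidLength { declared: length, available: bytes.len() });
        }
        Ok(UdpHeader {
            source_port: u16::from_be_bytes([bytes[0], bytes[1]]),
            destination_port: u16::from_be_bytes([bytes[2], bytes[3]]),
            length,
            checksum: u16::from_be_bytes([bytes[6], bytes[7]]),
            payload: &bytes[8..usize::from(length)],
        })
    }
}
```

A caller writes `UdpHeader::try_from(bytes)` and either gets a fully validated structure or an error that names the exact check that failed.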
Chapter 1
Parsing the Data Link Layer from a Raw Packet
Understanding how to parse the Data Link Layer from raw packets is a crucial step in network packet analysis. The Data Link Layer provides essential information such as MAC addresses, Ethertype, and payload extraction. This section explains the approach I took to implement the DataLink structure parsing.
Understanding the Data Link Structure
The Data Link Layer is responsible for frame-level communication between devices on the same network segment. In an Ethernet frame, the structure is as follows:
The main components are:
- Destination MAC Address (6 bytes): The unique physical identifier of the receiving network hardware.
- Source MAC Address (6 bytes): The unique physical identifier of the sending network hardware.
- Ethertype (2 bytes): Determines the protocol encapsulated in the payload.
- Payload (variable length): Contains the encapsulated network-layer packet (IPv4, ARP, etc.).
Breaking Down the MAC Address Structure
To parse MAC addresses correctly, we need to ensure that:
- They are always 6 bytes long.
- They are formatted properly for readability.
- We extract Organizationally Unique Identifiers (OUI) to identify the manufacturer.
MAC Address Structure:
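A minimal sketch of such a structure, covering the three points above (length check, readable formatting, OUI extraction). `MacAddress` and `MacAddressError` here are hypothetical and only show the idea; the sketch accepts any slice of at least 6 bytes and keeps the first 6, and it does not look up the manufacturer name behind the OUI.

```rust
use std::convert::TryFrom;
use std::fmt;

/// Hypothetical error: the slice was too short to hold a MAC address.
#[derive(Debug, PartialEq)]
struct MacAddressError {
    actual: usize,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct MacAddress([u8; 6]);

impl MacAddress {
    /// The first three bytes form the Organizationally Unique Identifier.
    fn oui(&self) -> [u8; 3] {
        [self.0[0], self.0[1], self.0[2]]
    }
}

impl<'a> TryFrom<&'a [u8]> for MacAddress {
    type Error = MacAddressError;

    fn try_from(bytes: &'a [u8]) -> Result<Self, Self::Error> {
        if bytes.len() < 6 {
            return Err(MacAddressError { actual: bytes.len() });
        }
        let mut address = [0u8; 6];
        address.copy_from_slice(&bytes[..6]);
        Ok(MacAddress(address))
    }
}

impl fmt::Display for MacAddress {
    /// Formats the address as lowercase, colon-separated hex pairs.
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        let b = &self.0;
        write!(
            f,
            "{:02x}:{:02x}:{:02x}:{:02x}:{:02x}:{:02x}",
            b[0], b[1], b[2], b[3], b[4], b[5]
        )
    }
}
```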
Breaking Down the Ethertype Field
The Ethertype is a 2-byte field that defines the type of payload carried by the frame.
Key Considerations:
- Extract the 2-byte big-endian value.
- Map well-known Ethertypes (IPv4, IPv6, ARP, etc.).
- Allow handling of unknown protocols without failure.
Example of Well-Known Ethertypes (IEEE Standard Correspondence Table):
Ethertype (Hex) | Protocol |
---|---|
0x0800 | IPv4 |
0x86DD | IPv6 |
0x0806 | ARP |
0x8100 | VLAN Tagging |
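The table and the three considerations can be sketched as an enum with a catch-all variant, so an unrecognized value is kept instead of causing a failure. The `EtherType` enum and `ethertype_from_bytes` helper below are hypothetical names used for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EtherType {
    Ipv4,
    Ipv6,
    Arp,
    Vlan,
    /// Any value not listed above is preserved rather than rejected.
    Unknown(u16),
}

impl From<u16> for EtherType {
    fn from(value: u16) -> Self {
        match value {
            0x0800 => EtherType::Ipv4,
            0x86DD => EtherType::Ipv6,
            0x0806 => EtherType::Arp,
            0x8100 => EtherType::Vlan,
            other => EtherType::Unknown(other),
        }
    }
}

/// Reads the 2-byte field in big-endian (network) order and maps it.
fn ethertype_from_bytes(bytes: [u8; 2]) -> EtherType {
    EtherType::from(u16::from_be_bytes(bytes))
}
```

Using `From` rather than `TryFrom` here reflects the "without failure" requirement: every 16-bit value maps to some variant.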
Steps Taken to Parse the Data Link Layer
To correctly extract this information, I followed these key steps:
Validations
While parsing, I implemented validations to ensure the raw packet is coherent.
Validations Performed:
- Packet Minimum Length Check: ensure the packet is at least 14 bytes (MAC_DST + MAC_SRC + Ethertype).
- MAC Address Minimum Length Check: each MAC address is at least 6 bytes.
- Ethertype Coherence Check: if the Ethertype is IPv4 or IPv6, the payload can't be empty.
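Put together, these checks could look like the sketch below. It is a simplified illustration with hypothetical names (`DataLink`, `DataLinkError`), raw arrays for the MAC addresses, and a plain `u16` for the Ethertype. In this sketch the 14-byte check already guarantees that both 6-byte MAC addresses are present, so the MAC length check does not appear as a separate branch.

```rust
use std::convert::TryFrom;

const ETHERNET_HEADER_LEN: usize = 14; // MAC_DST (6) + MAC_SRC (6) + Ethertype (2)
const ETHERTYPE_IPV4: u16 = 0x0800;
const ETHERTYPE_IPV6: u16 = 0x86DD;

#[derive(Debug, PartialEq)]
enum DataLinkError {
    PacketTooShort { actual: usize },
    EmptyPayload { ethertype: u16 },
}

#[derive(Debug, PartialEq)]
struct DataLink<'a> {
    destination_mac: [u8; 6],
    source_mac: [u8; 6],
    ethertype: u16,
    payload: &'a [u8],
}

impl<'a> TryFrom<&'a [u8]> for DataLink<'a> {
    type Error = DataLinkError;

    fn try_from(packet: &'a [u8]) -> Result<Self, Self::Error> {
        // Packet minimum length check.
        if packet.len() < ETHERNET_HEADER_LEN {
            return Err(DataLinkError::PacketTooShort { actual: packet.len() });
        }
        let mut destination_mac = [0u8; 6];
        destination_mac.copy_from_slice(&packet[0..6]);
        let mut source_mac = [0u8; 6];
        source_mac.copy_from_slice(&packet[6..12]);
        let ethertype = u16::from_be_bytes([packet[12], packet[13]]);
        let payload = &packet[ETHERNET_HEADER_LEN..];
        // Ethertype coherence check: IPv4 and IPv6 frames must carry a payload.
        let needs_payload = ethertype == ETHERTYPE_IPV4 || ethertype == ETHERTYPE_IPV6;
        if needs_payload && payload.is_empty() {
            return Err(DataLinkError::EmptyPayload { ethertype });
        }
        Ok(DataLink { destination_mac, source_mac, ethertype, payload })
    }
}
```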
Structuring the Parsed Packet
After extracting all components, I structured the parsed frame in a clear format. This makes it easier to analyze, debug, and process packets dynamically.
Why Does Structure Matter?
- Improves readability of parsed data.
- Makes it easier to extract key information.
- Supports future protocol extensions.
Conclusion
Parsing the Data Link Layer requires careful validation and structured extraction. By following a modular approach:
- MAC addresses are extracted safely.
- Ethertype is correctly mapped.
- Payload validation prevents out-of-bounds errors.
- The structure is extensible for future protocols.
This foundational parsing is crucial for higher-layer analysis, such as decoding IP, TCP, UDP, and application-level protocols.
Next Steps: Exploring network-layer parsing (IPv4/IPv6)!