Bitfields
`bitvec`’s more technically-interesting capability is that it provides
load/store memory access behaviors that allow you to write values into, and read
them back out of, any `BitSlice` in memory rather than being constrained to
well-formed references to well-typed memory.
This is useful for the de/construction of packed memory buffers, such as transporting data through I/O protocols.
**AUTHOR’S NOTE**: If you are using `bitvec` to do **anything** related to the
underlying memory representation of a bit-buffer, you **must** read this
chapter, and **all** of the API docs of [`bitvec::field`] and its contents,
**in their entirety**.
I have written extensively, and yet still insufficiently, about the intricacies
involved in operating the `BitField` trait correctly. If you skim this
documentation, you *will* have unexpected behavior, you *will* get frustrated
with me for writing a bad library, you *will* file an issue about it, and I will
*probably* tell you that the behavior is correct and that I already addressed it
in the documentation.
It took me a long time to think about and a long time to write. It should take
you *also* a long time to read and a long time to think about.
All of this behavior is contained in the `BitField` trait. Let’s explore that:
// bitvec::field
pub trait BitField {
    fn load<M>(&self) -> M;
    fn store<M>(&mut self, value: M);
}

impl<T> BitField for BitSlice<T, Lsb0> {
    fn load<M>(&self) -> M { /* snip */ }
    fn store<M>(&mut self, value: M) { /* snip */ }
}

impl<T> BitField for BitSlice<T, Msb0> {
    fn load<M>(&self) -> M { /* snip */ }
    fn store<M>(&mut self, value: M) { /* snip */ }
}
The actual trait is more complex than this, and will be visited later. The
important part, right now, is that `BitField` allows you to load values out of
`BitSlice`s and store values into them. Furthermore, it is implemented
specifically on `BitSlice`s that use the bit orderings provided by `bitvec`,
and is not generic over all orderings.
While `bitvec` could in theory provide a default implementation for all
`<O: BitOrder>`, this would by necessity have the worst possible performance,
and because Rust lacks specialization for overlapping trait implementations, a
faster implementation for the provided orderings could then never be written.
The downside of the two specific implementations is that Rust coherence rules
forbid implementation of a `bitvec` trait, on a `bitvec` type, parameterized
with a local, but non-`bitvec`, implementor of `BitOrder`. On the off chance
that you find yourself writing a new `BitOrder` implementor, file an issue.
The `M` type parameter on the load and store methods is bounded by funty’s
`Integral` trait. The methods can transfer any unsigned or signed integer at
any partial width. This parameterization allows you to combine any integer type
for transfer with any integer type for storage, rather than being restricted to
only transferring `T` data into and out of a `BitSlice<T, _>`.
Unfortunately, adding a second integer type parameter is not the only
complication to the `BitField` memory model. There is also a second dimension of
segment ordering. `bitvec` tries to make explicitly clear that the `Lsb0` and
`Msb0` types refer only to the ordering of bits within registers, and not to
the ordering of bytes within registers. However, when the integer bit-sequence
being stored or loaded does not fit within one register of the storage
`BitSlice`, it must be split into multiple segments, and those segments must
somehow be ordered in memory.
Segment Orderings
Author’s Note: **READ THIS**. I have received *several* issues about this exact
concept. *It is not obvious*.
There are two segment orderings: little-endian and big-endian. You may select
the segment endianness you prefer by using the `_le` or `_be` suffix,
respectively, on the `.load()` and `.store()` methods. The unsuffixed method is
an alias for the endianness of your processor: `_be` on big-endian targets, and
`_le` on little-endian. This is a convenience only. If you are writing I/O
buffers, you should really use the explicitly-named methods.
Let us imagine a `BitSlice<u8, Lsb0>` used to store a `u16` that is misaligned,
and thus stored in three successive bytes. The same algorithm applies in all
circumstances where the stored region occupies more than one register of the
backing slice, but smaller examples are simpler to draw.
This diagram uses `0` to refer to the least significant bit, and `7` to refer to
the most significant bit. The first row shows bytes of memory, the second row
shows the bit indices in memory used by `.store_le()`, and the third row shows
the bit indices in memory used by `.store_be()`.
[ 7 6 5 4 3 2 1 0 ] [ 7 6 5 4 3 2 1 0 ] [ 7 6 5 4 3 2 1 0 ]
  3 2 1 0             b a 9 8 7 6 5 4             f e d c
  f e d c             b a 9 8 7 6 5 4             3 2 1 0
`.store_le()` places the least significant segment in the low address, while
`.store_be()` places the most significant segment in the low address. The
ordering of bits within a segment is always preserved, no matter which
ordering parameter is used by the `BitSlice`.
Here is the same example, but using the `Msb0` bit ordering instead. Again, the
second row uses `.store_le()`, and the third row uses `.store_be()`.
[ 7 6 5 4 3 2 1 0 ] [ 7 6 5 4 3 2 1 0 ] [ 7 6 5 4 3 2 1 0 ]
          3 2 1 0     b a 9 8 7 6 5 4     f e d c
          f e d c     b a 9 8 7 6 5 4     3 2 1 0
The only change is in how the segments are placed into memory. The ordering of bits within a segment never changes, and is always the processor’s significance order as implemented in hardware.
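The two diagrams can be reproduced with a small hand-rolled model in plain Rust. This is a sketch, not `bitvec`’s implementation: `Order`, `Segments`, and `store_model` are names invented for the illustration, and the model hard-codes the 4 + 8 + 4 split of a `u16` stored at bit indices `4 .. 20` of three bytes.

```rust
/// Bit ordering within each byte (stands in for `Lsb0` / `Msb0`).
#[derive(Clone, Copy)]
enum Order {
    Lsb0,
    Msb0,
}

/// Segment ordering across bytes (stands in for the `_le` / `_be` suffixes).
#[derive(Clone, Copy)]
enum Segments {
    Le,
    Be,
}

/// Illustrative model only: where the bits of `value` land when it is stored
/// into bit indices 4..20 of a three-byte buffer.
fn store_model(value: u16, order: Order, segments: Segments) -> [u8; 3] {
    // Split the value into a 4-bit, an 8-bit, and a 4-bit segment.
    let low = (value & 0x000f) as u8;
    let mid = ((value >> 4) & 0x00ff) as u8;
    let high = (value >> 12) as u8;

    // Segment ordering decides which partial segment gets the low address.
    let (first, last) = match segments {
        Segments::Le => (low, high),
        Segments::Be => (high, low),
    };

    // Bit ordering decides which half of the edge bytes is occupied.
    // The bits inside each segment keep their significance order.
    match order {
        // Lsb0: index 4 is bit position 4, so the first segment fills the
        // high half of byte 0 and the last fills the low half of byte 2.
        Order::Lsb0 => [first << 4, mid, last],
        // Msb0: index 4 is bit position 3, so the halves are swapped.
        Order::Msb0 => [first, mid, last << 4],
    }
}

fn main() {
    for (name, order) in [("Lsb0", Order::Lsb0), ("Msb0", Order::Msb0)] {
        let le = store_model(0x1234, order, Segments::Le);
        let be = store_model(0x1234, order, Segments::Be);
        println!("{name}: le = {le:02x?}, be = {be:02x?}");
    }
}
```

With `bitvec` itself, the corresponding call should be along the lines of `data.view_bits_mut::<Lsb0>()[4 .. 20].store_le::<u16>(0x1234)` on a `[u8; 3]`. Printing the buffer afterwards is a quick way to check the model against the real library.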
How to Use BitField
You will probably find real use of the `BitField` trait more educational than
the previous section. It has a very straightforward API that you can combine
with `println!`-debugging or your favorite means of viewing memory in order to
observe its actions.
Step one: create any `BitSlice`-capable buffer. This can be any of the
Rust-native sequence types, or any of the `bitvec` types.
use bitvec::prelude::*;
let mut data = [0u8; 4];
let bits = data.view_bits_mut::<Msb0>();
Then, narrow the `BitSlice` to be the region you want to access as storage. It
must be no wider than the integer type you are transferring: `BitSlice`s whose
length is outside the domain `1 ..= M::BITS` will panic during `.load()` or
`.store()`. The easiest way to narrow a `BitSlice` (or buffer type that
dereferences to it) is by using range indexing, `[start .. end]`.
use bitvec::prelude::*;
let bits = bits![mut u8, Msb0; 0; 32];
bits[10 ..][.. 13].store_be::<u16>(0x765);
assert_eq!(bits[10 .. 23].load_be::<u16>(), 0x765);
bits[10 .. 23].store_le::<u16>(0x432);
assert_eq!(bits[10 .. 23].load_le::<u16>(), 0x432);
That’s the entire API. `.store()` truncates the stored value to the width of the
receiving `BitSlice`, and `.load()` sign-extends the loaded value (or
zero-extends it, for unsigned types) to the width of the destination register
type.
Storing signed integers can be surprisingly fraught: `bitvec` **will not**
attempt to detect and preserve the most-significant sign bit when truncating! If
you store the number `-12i8` (`0b1111_0100`) in a 4-bit slot, it will be stored
as `4i8` and reloaded as such! Similarly, storing `12i8` (`0b0000_1100`) in a
4-bit slot will produce a load of `-4i8` (`0b1111_1100`).
Signed integers do not behave like unsigned integers. You are wholly responsible
for ensuring that you remember that allowing negative numbers halves the
magnitude: 4 bits unsigned is `0 .. 16`, but 4 bits signed is `-8 .. 8`.
`bitvec` **only stores bit patterns**. Retaining numerical intelligibility is
**your** responsibility.
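The signed-integer cases above can be checked with a short model in plain Rust. This is a sketch of the arithmetic only, not `bitvec`’s code: `truncate` and `sign_extend` are hypothetical helpers standing in for what `.store()` and a signed `.load()` do to a value that is wider than its slot.

```rust
/// Keeps only the low `width` bits of `value`, as a store into a
/// `width`-bit slot would.
fn truncate(value: i8, width: u32) -> u8 {
    assert!((1..=8).contains(&width));
    let mask = if width == 8 { 0xff } else { (1u8 << width) - 1 };
    (value as u8) & mask
}

/// Widens a `width`-bit pattern back to `i8` by copying its top bit
/// upward, as a signed load would.
fn sign_extend(pattern: u8, width: u32) -> i8 {
    assert!((1..=8).contains(&width));
    let shift = 8 - width;
    // Move the pattern's top bit into the sign position, then use an
    // arithmetic right shift to replicate it.
    ((pattern << shift) as i8) >> shift
}

fn main() {
    for value in [-12i8, 12, -4, 4] {
        let stored = truncate(value, 4);
        let loaded = sign_extend(stored, 4);
        println!("{value:>4} -> {stored:04b} -> {loaded:>4}");
    }
}
```

Only values inside `-8 .. 8` survive the round trip through a 4-bit slot. Anything else comes back as a different number with the same low four bits.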
You can see an example that uses the `BitField` trait to implement an I/O
protocol in the `examples/ipv4.rs` program in the repository. Use
`cargo run --example ipv4` to see it in action.