Architecture & Concepts
System overview
NetRL intercepts the observation path between a Gymnasium environment and an RL agent. Instead of the agent receiving the state directly, the observation is transmitted through a configurable channel model that introduces loss, delay, and (for ns-3 backends) realistic wireless contention:
┌─────────────────────────────────────────────────────┐
│  Gymnasium step loop                                │
│                                                     │
│  env.step(action)                                   │
│       │                                             │
│       ▼                                             │
│  raw_obs ──► CommChannel.transmit()                 │
│                │  (loss / delay)                    │
│                ▼                                    │
│  CommChannel.flush() ──► ObservationBuffer.add()    │
│                              │                      │
│                              ▼                      │
│  agent ◄── Dict{"observations", "recv_mask"}        │
└─────────────────────────────────────────────────────┘
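The loop in the diagram can be sketched in plain Python. This is a minimal illustration under stated assumptions, not NetRL's implementation: ToyDelayChannel, run_loop, and the list-based window are hypothetical stand-ins, and scalar observations are used for brevity. Only the method names transmit and flush and the output keys "observations" and "recv_mask" come from the description above.

```python
from collections import deque


class ToyDelayChannel:
    """Hypothetical stand-in for a CommChannel: lossless, fixed delay."""

    def __init__(self, delay_steps=1):
        self.delay_steps = delay_steps
        self._queue = deque()  # (arrival_step, obs) pairs in send order

    def transmit(self, obs, step):
        self._queue.append((step + self.delay_steps, obs))

    def flush(self, step):
        # Return every observation whose arrival step has been reached.
        arrived = []
        while self._queue and self._queue[0][0] <= step:
            arrived.append(self._queue.popleft()[1])
        return arrived


def run_loop(raw_observations, channel, maxlen=3):
    """Push each raw observation through the channel; return the agent's view per step."""
    window = deque([None] * maxlen, maxlen=maxlen)  # None marks an empty slot
    views = []
    for step, raw_obs in enumerate(raw_observations):
        channel.transmit(raw_obs, step)
        arrived = channel.flush(step)
        window.append(arrived[-1] if arrived else None)  # one slot per step
        views.append({
            "observations": [0.0 if o is None else o for o in window],
            "recv_mask": [o is not None for o in window],
        })
    return views


views = run_loop([1.0, 2.0, 3.0], ToyDelayChannel(delay_steps=1))
```

With a one-step delay, the agent's view at each step lags the environment by one observation, and the mask marks which slots hold real data.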
For MultiViewNetworkedEnv, N independent paths run in
parallel — one per observer — managed by a single CentralNode:
raw_obs ──► MultiViewModel.observe() ──► {obs_0, obs_1, ..., obs_N}
                                 │
┌──────────────────────────────┬─┴──────────────────────┐
│  Observer 0                  │ Observer N             │
│  CommChannel.transmit(obs_0) │ CommChannel.transmit() │
│         ↓                    │        ↓               │
│  CommChannel.flush()         │ CommChannel.flush()    │
│         ↓                    │        ↓               │
│  ObservationBuffer.add()     │ ObservationBuffer.add()│
└──────────────────────────────┴────────────────────────┘
                               │
                               ▼
           Dict{"obs_0": {...}, "obs_1": {...}, ...}
Class hierarchy
CommChannel (ABC)
├── GEChannel — Markov chain; C++ core (netcomm extension)
├── PerfectChannel — lossless; zero-delay
├── NS3WiFiChannelFast — 802.11a ad-hoc; pybind11 in-process (netrl_ext)
├── NS3WifiChannel — 802.11a ad-hoc; ns-3 subprocess
├── NS3MmWaveChannel — 5G mmWave EPC; ns-3 subprocess
├── NS3LenaChannel — 5G NR; ns-3 subprocess
└── NS3WifiUEChannel — per-UE proxy; shared NS3WifiMultiUEBackend
ObservationBuffer — fixed-size circular buffer + recv_mask
CentralNode — Dict[node_id → CommChannel + ObservationBuffer]
gym.Wrapper
├── NetworkedEnv — single observer; owns one CentralNode
└── MultiViewNetworkedEnv — N observers; owns one CentralNode
Timing model
Time is discretised into integer env steps. Step t occupies ns-3
simulation time [t · step_ms, (t+1) · step_ms).
transmit(obs, step=t)
    The packet carrying observation obs is scheduled to be sent at t · step_ms + ε (a tiny offset into the step window).
flush(step=t)
    The ns-3 simulator is advanced to (t+1) · step_ms. Any packets whose receive callback fired during [t · step_ms, (t+1) · step_ms) are returned as the result.
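The window arithmetic can be written out directly. The helpers below are a hypothetical sketch: the function names are invented for illustration, and the value of ε (epsilon_ms) is an assumption, since the text only says it is a tiny offset.

```python
def step_window_ms(step, step_ms):
    """Half-open ns-3 time window [start, end) covered by an env step."""
    return step * step_ms, (step + 1) * step_ms


def send_time_ms(step, step_ms, epsilon_ms=0.001):
    """Time at which transmit(obs, step) schedules its packet (epsilon is assumed)."""
    return step * step_ms + epsilon_ms


def flush_stop_time_ms(step, step_ms):
    """Simulation time the simulator is advanced to by flush(step)."""
    return (step + 1) * step_ms


def arrives_in_step(recv_time_ms, step, step_ms):
    """True if a receive callback at recv_time_ms falls inside the step's window."""
    start, end = step_window_ms(step, step_ms)
    return start <= recv_time_ms < end
```

Because the window is half-open, a packet received exactly at (t+1) · step_ms belongs to step t+1, not step t.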
For the Gilbert–Elliott backend there is no real-time simulation: transmit
rolls a Markov state transition and samples a Bernoulli loss; flush pops
all packets whose arrival_step ≤ step from an in-memory deque.
Persisted simulation state
The ns-3 subprocess backends run continuously across steps.
Simulator::Run() is called once per FLUSH with an increasing
stop-time. The pybind11 fast backend (NS3WiFiChannelFast)
operates in the same way: the same ns-3 simulator object lives inside the
Python process and is advanced in place each step. In both cases:
MAC backoff counters, retry queues, and association state persist between steps.
A RESET (triggered by env.reset()) calls Simulator::Destroy() and rebuilds the topology from scratch.
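The lifecycle can be pictured with a toy model. ToySimulator is a hypothetical stand-in, not an ns-3 binding: it only shows that the clock and internal state carry over from one flush to the next and are discarded on reset.

```python
class ToySimulator:
    """Hypothetical stand-in for a long-lived simulation (not an ns-3 API)."""

    def __init__(self):
        self.now_ms = 0.0
        self.retry_queue = []  # stands in for MAC state that survives between steps

    def run_until(self, stop_ms):
        """Advance the clock to an increasing stop-time, once per flush."""
        if stop_ms < self.now_ms:
            raise ValueError("stop-time must not move backwards")
        self.now_ms = stop_ms

    def reset(self):
        """Discard all state, as a RESET does before rebuilding the topology."""
        self.__init__()


sim = ToySimulator()
step_ms = 10.0
for t in range(3):
    sim.retry_queue.append(t)         # state produced during step t
    sim.run_until((t + 1) * step_ms)  # corresponds to flush(step=t)
state_before_reset = (sim.now_ms, list(sim.retry_queue))
sim.reset()
state_after_reset = (sim.now_ms, list(sim.retry_queue))
```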
Warm-up period
Infrastructure-mode backends (Multi-UE WiFi, mmWave, 5G-LENA) require a
warm-up phase before the first READY:
| Backend | Warm-up | |
|---|---|---|
| 802.11a (ad-hoc, single STA) | 310 ms (3 beacon intervals) | 30 s |
| 802.11a (infrastructure, N STAs) | 500 ms (association) | 60 s |
| 5G mmWave / 5G-LENA | 500 ms (UE attach + bearer) | 60 s |
Gilbert–Elliott channel model
The GE channel is a two-state hidden Markov model:
             p_bg
  ┌─────────────────────────┐
  │                         │
  ▼          p_gb           │
 GOOD ───────────────────► BAD
loss_good               loss_bad
At each transmit() call:
1. The Markov state is updated: transition with probability p_gb (Good→Bad) or p_bg (Bad→Good).
2. The packet is dropped with probability loss_good (Good state) or loss_bad (Bad state).
3. If not dropped, the packet is queued with arrival_step = step + delay_steps.
The C++ implementation uses a Mersenne Twister (std::mt19937_64) seeded
at construction.
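The three steps can be sketched in pure Python. This is an illustrative model, not the C++ netcomm core: ToyGEChannel is a hypothetical name, the initial Good state and the boolean return value of transmit are assumptions, and Python's random.Random will not reproduce the random sequence of std::mt19937_64.

```python
import random
from collections import deque


class ToyGEChannel:
    """Pure-Python sketch of a Gilbert-Elliott channel."""

    def __init__(self, p_gb, p_bg, loss_good, loss_bad, delay_steps=0, seed=0):
        self.p_gb, self.p_bg = p_gb, p_bg
        self.loss_good, self.loss_bad = loss_good, loss_bad
        self.delay_steps = delay_steps
        self.rng = random.Random(seed)  # seeded once, at construction
        self.bad = False                # assumption: start in the Good state
        self._queue = deque()           # (arrival_step, obs) in send order

    def transmit(self, obs, step):
        # 1. Update the Markov state.
        if self.bad:
            if self.rng.random() < self.p_bg:
                self.bad = False
        elif self.rng.random() < self.p_gb:
            self.bad = True
        # 2. Bernoulli loss with the current state's loss probability.
        loss = self.loss_bad if self.bad else self.loss_good
        if self.rng.random() < loss:
            return False
        # 3. Queue the surviving packet for delayed delivery.
        self._queue.append((step + self.delay_steps, obs))
        return True

    def flush(self, step):
        # Pop every packet whose arrival_step <= step.
        arrived = []
        while self._queue and self._queue[0][0] <= step:
            arrived.append(self._queue.popleft()[1])
        return arrived


channel = ToyGEChannel(p_gb=0.1, p_bg=0.3, loss_good=0.01, loss_bad=0.5,
                       delay_steps=2, seed=42)
delivered = [channel.transmit(obs=t, step=t) for t in range(100)]
```

Because the state persists between calls, losses arrive in bursts: once the chain enters the Bad state, consecutive packets are dropped at the higher rate until it transitions back.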
ns-3 subprocess protocol
All ns-3 backends use an identical line-oriented stdin/stdout protocol:
Python → subprocess
TRANSMIT <step_id> <pkt_size> # single-UE backends
TRANSMIT <ue_id> <step_id> <pkt_size> # multi-UE backend
FLUSH <step_id>
RESET
QUIT
Subprocess → Python
READY # once, at startup
OK # ACK for TRANSMIT / RESET
RECV <id1> <id2> ... # single-UE: space-separated step_ids
RECV <ue_id>:<step_id> ... # multi-UE: ue_id:step_id pairs
ERROR <message>
The Python side stores the observation in a _pending dict keyed by
step_id (or (ue_id, step_id)). On a successful FLUSH, the
received ids are looked up to retrieve the original NumPy arrays.
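A minimal sketch of the Python side of this protocol follows. The function names are hypothetical, and two details are assumptions rather than documented behaviour: ids are parsed as integers, and entries are removed from the pending dict once they have been received.

```python
def format_transmit(step_id, pkt_size, ue_id=None):
    """Build a TRANSMIT line for a single-UE or multi-UE backend."""
    if ue_id is None:
        return f"TRANSMIT {step_id} {pkt_size}"
    return f"TRANSMIT {ue_id} {step_id} {pkt_size}"


def parse_reply(line):
    """Split one subprocess reply into (kind, payload)."""
    kind, _, rest = line.strip().partition(" ")
    if kind in ("READY", "OK"):
        return kind, None
    if kind == "ERROR":
        return kind, rest
    if kind == "RECV":
        ids = []
        for token in rest.split():
            if ":" in token:  # multi-UE form: ue_id:step_id
                ue_id, step_id = token.split(":")
                ids.append((int(ue_id), int(step_id)))
            else:             # single-UE form: step_id
                ids.append(int(token))
        return kind, ids
    raise ValueError(f"unexpected reply: {line!r}")


def collect_received(pending, reply_line):
    """Map the ids in a RECV reply back to the stored observations."""
    kind, payload = parse_reply(reply_line)
    if kind == "ERROR":
        raise RuntimeError(payload)
    if kind != "RECV":
        raise ValueError(f"expected RECV, got {kind}")
    return [pending.pop(key) for key in payload if key in pending]


pending = {3: "obs_3", 4: "obs_4", 5: "obs_5"}
received = collect_received(pending, "RECV 3 5\n")
```

Only ids cross the pipe; the observation payloads never leave the Python process, which keeps the protocol independent of observation shape and dtype.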
Observation buffer semantics
Each ObservationBuffer is a fixed-size circular window.
add(obs_or_None) advances by one slot every step, whether or not a
packet arrived:
step 0: transmit → arrives at step 2  (delay_steps=2)
step 1: transmit → arrives at step 3
step 2: flush → obs from step 0 arrives → buffer[-1] = obs_0
step 3: flush → obs from step 1 arrives → buffer[-1] = obs_1
                                          buffer[-2] = obs_0
get_padded() always returns a pair (obs_array, recv_mask) with shapes
(maxlen, *obs_shape) and (maxlen,) respectively. Unwritten or lost-packet slots
contain zero arrays with recv_mask == False.
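The window behaviour can be reproduced with a short sketch. ToyObservationBuffer is a hypothetical, list-based simplification: it handles only 1-D observations and returns nested lists, whereas the buffer described above returns arrays of shape (maxlen, *obs_shape).

```python
from collections import deque


class ToyObservationBuffer:
    """List-based sketch of the fixed-size circular window."""

    def __init__(self, maxlen, obs_dim):
        self.obs_dim = obs_dim
        self._slots = deque([None] * maxlen, maxlen=maxlen)  # oldest slot first

    def add(self, obs_or_none):
        # Called once per step; appending drops the oldest slot automatically.
        self._slots.append(None if obs_or_none is None else list(obs_or_none))

    def get_padded(self):
        # Empty slots become zero vectors and are flagged False in the mask.
        zeros = [0.0] * self.obs_dim
        obs = [list(zeros) if s is None else list(s) for s in self._slots]
        recv_mask = [s is not None for s in self._slots]
        return obs, recv_mask


# Replay the four-step trace above (delay_steps=2): nothing arrives at steps 0 and 1.
obs_0, obs_1 = [1.0, 1.0], [2.0, 2.0]
buf = ToyObservationBuffer(maxlen=3, obs_dim=2)
for arrived in (None, None, obs_0, obs_1):
    buf.add(arrived)
obs, recv_mask = buf.get_padded()
```

After step 3 the newest slot holds obs_1, the slot before it holds obs_0, and the oldest slot is still a masked zero vector.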
Strategy pattern
CentralNode uses the Strategy pattern for channel
selection. The channel_factory parameter is a
Callable[[NetworkConfig], CommChannel] called once per node. To add a
new backend:
1. Subclass CommChannel and implement the four methods.
2. Create a config dataclass with a validate() method.
3. Pass channel_factory=YourChannel to CentralNode or either environment wrapper.
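The steps above might look like the following sketch. Everything here is a stand-in so the example runs on its own: the local CommChannel class only mimics the real base class, the method names transmit, flush, reset, and close are assumptions (the text does not list the four methods), and DropEveryNthConfig, DropEveryNthChannel, and build_channels are hypothetical.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


class CommChannel(ABC):
    """Stand-in for the real base class; the method names are assumptions."""

    @abstractmethod
    def transmit(self, obs, step): ...

    @abstractmethod
    def flush(self, step): ...

    @abstractmethod
    def reset(self): ...

    @abstractmethod
    def close(self): ...


@dataclass
class DropEveryNthConfig:
    """Hypothetical config: drop every n-th packet."""
    n: int = 3

    def validate(self):
        if self.n < 2:
            raise ValueError("n must be at least 2")


class DropEveryNthChannel(CommChannel):
    """Hypothetical backend: zero delay, deterministic periodic loss."""

    def __init__(self, config):
        config.validate()
        self.config = config
        self._sent = 0
        self._ready = []

    def transmit(self, obs, step):
        self._sent += 1
        if self._sent % self.config.n != 0:  # every n-th packet is dropped
            self._ready.append(obs)

    def flush(self, step):
        arrived, self._ready = self._ready, []
        return arrived

    def reset(self):
        self._sent = 0
        self._ready = []

    def close(self):
        pass


def build_channels(node_ids, config, channel_factory):
    """Call the factory once per node, mirroring how CentralNode is described."""
    return {node_id: channel_factory(config) for node_id in node_ids}


channels = build_channels(["obs_0", "obs_1"], DropEveryNthConfig(n=2),
                          DropEveryNthChannel)
```

Passing the class itself works as a factory because calling it with a config returns a channel instance; any callable with the same signature, such as a lambda or functools.partial, would serve equally well.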