Core Processors

The biologger-sim package is built around a pluggable processor architecture that allows researchers to switch between real-time (causal) and post-facto (acausal) processing logic.

Processor Types: Lab vs. Simulation

The system supports two primary processor types designed for different stages of the research lifecycle:

Processor Comparison

Feature        StreamingProcessor (Causal)     PostFactoProcessor (Acausal)
-------------  ------------------------------  -------------------------------
Primary Goal   On-tag real-time execution      Highest possible accuracy (Lab)
Processing     Single-pass, strictly causal    Multi-pass, assumes lookahead
Filter Style   lfilter (IIR/Butterworth)       filtfilt (Zero-phase)
Memory         Fixed window (O(1))             Full record set (O(N))
Hardware       Resource-constrained tags       High-performance workstations

StreamingProcessor (Causal)

The StreamingProcessor is the core of the Digital Twin mode. It simulates the constraints of a physical biologger tag where data arrives one sample at a time and future data is unknown.

Methodology: The 11-Step Causal Pipeline

The processor follows a strictly sequential, low-latency pipeline:

  1. Input Acquisition: Raw acceleration (0.1g counts), magnetometer, and pressure depth.

  2. Attachment Correction: Fixed roll/pitch rotation to align sensor axes with the animal’s body.

  3. Causal Gsep: 3-second trailing window for gravity separation (Static vs. Dynamic).

  4. Dead Reckoning Timing: Dynamic dt calculation to handle sensor jitter.

  5. R-Style Orientation: Pitch/Roll from gravity using legacy-compatible formulas.

  6. World Frame Transform: Body-to-World (NED) rotation of acceleration.

  7. High-Pass Filtering: 4th-order causal bias removal for vertical acceleration.

  8. INS Depth Estimation: 2-state Kalman Filter nowcast (fusing Baro + Accel).

  9. Multi-Scale Smoothing: Activity-weighted blending of Fast/Slow EMAs for depth.

  10. Magnetometer & Heading: Hard-iron compensated, tilt-corrected heading estimation.

  11. Dead Reckoning Integration: Position update via heading and speed (constant or ODBA-scaled).
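Step 3 can be illustrated with a minimal sketch, assuming the trailing window is a plain mean over the last filt_len samples. The class name TrailingGsep and its update method are hypothetical and are not part of the package API; the ODBA line uses the common sum-of-absolute-dynamic-axes definition.

```python
from collections import deque


class TrailingGsep:
    """Hypothetical causal gravity separation over a trailing window."""

    def __init__(self, filt_len=48):
        # A bounded buffer keeps memory constant no matter how long the
        # record is (48 samples at 16 Hz is a 3-second window).
        self.window = deque(maxlen=filt_len)

    def update(self, accel):
        """Consume one (x, y, z) sample; return (static, dynamic, odba)."""
        self.window.append(tuple(accel))
        n = len(self.window)
        # Static (gravity) estimate: mean of the samples seen so far in
        # the window. No future samples are used.
        static = tuple(sum(s[axis] for s in self.window) / n for axis in range(3))
        # Dynamic component: what is left after removing gravity.
        dynamic = tuple(a - g for a, g in zip(accel, static))
        # ODBA: sum of absolute dynamic acceleration across the axes.
        odba = sum(abs(d) for d in dynamic)
        return static, dynamic, odba


gsep = TrailingGsep(filt_len=4)
for _ in range(4):
    static, dynamic, odba = gsep.update((0.0, 0.0, 1.0))
# A sudden surge on the x axis shows up mostly in the dynamic part.
static, dynamic, odba = gsep.update((0.4, 0.0, 1.0))
```

Until the window has filled, the mean is taken over fewer samples, which is one way to see why a warmup period is needed before the static estimate settles.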

Configuration Parameters

Streaming processors are configured via the entities section in the simulation YAML:

  • filt_len: (Default: 48) Length of the causal Gsep window (in samples).

  • freq: (Default: 16) Sampling frequency for filters and integration.

  • locked_attachment_roll_deg / locked_attachment_pitch_deg: Fixed calibration angles.

  • locked_mag_offset_x/y/z / locked_mag_sphere_radius: Hard-iron calibration parameters.
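To show how the locked magnetometer parameters could be applied, here is a hedged sketch of hard-iron compensation followed by a textbook tilt-compensated heading. The function names are hypothetical, and the axis convention (x forward, y right, z down) is an assumption; the package's "R-style" formulas may use different signs or axis ordering.

```python
import math


def hard_iron_correct(raw, offset, radius):
    """Subtract the hard-iron offset and scale onto a unit sphere."""
    return tuple((r - o) / radius for r, o in zip(raw, offset))


def tilt_compensated_heading(mag, roll_deg, pitch_deg):
    """Heading in degrees in [0, 360), assuming x forward, y right, z down."""
    mx, my, mz = mag
    roll, pitch = math.radians(roll_deg), math.radians(pitch_deg)
    # De-rotate the measured field into the horizontal plane.
    xh = (mx * math.cos(pitch)
          + my * math.sin(roll) * math.sin(pitch)
          + mz * math.cos(roll) * math.sin(pitch))
    yh = my * math.cos(roll) - mz * math.sin(roll)
    return math.degrees(math.atan2(-yh, xh)) % 360.0


# Illustrative values only; real offsets come from calibration.
offset = (10.0, -5.0, 2.0)   # stands in for locked_mag_offset_x/y/z
radius = 50.0                # stands in for locked_mag_sphere_radius
raw = (10.0, -30.0, 45.3)    # level tag facing east in this convention
heading = tilt_compensated_heading(hard_iron_correct(raw, offset, radius), 0.0, 0.0)
```

Locking these values means the streaming path never has to estimate them on the fly, which fits the fixed-memory design described above.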

Advantages & Inherent Disadvantages

Advantages:

  • Real-Time Visibility: Supports “nowcasting” of depth and position as each sample arrives, without waiting for future data.

  • Portability: Code is designed to be easily transcribed to C/C++/Warp for embedded tags.

  • Scalability: Can simulate hundreds of entities in parallel due to fixed memory overhead.

Disadvantages (The Cost of Causality):

  • Filter Phase Shift: Causal filters (like Butterworth) introduce a small group delay in the signal.

  • Initialization (Warmup): Requires a short “warmup” period (e.g., 3s) for averaging windows to fill.

  • Noise Sensitivity: Lacks the benefit of centered averaging (filtfilt), making signals inherently noisier than their lab counterparts.
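The delay cost can be made concrete with a small sketch that compares a trailing mean with a centered mean on a ramp signal. This uses simple boxcar averages rather than the package's actual filters, and both function names are hypothetical.

```python
def trailing_mean(x, w):
    """Causal mean: output i uses only samples x[i-w+1 .. i]."""
    out = []
    for i in range(len(x)):
        lo = max(0, i - w + 1)
        out.append(sum(x[lo:i + 1]) / (i + 1 - lo))
    return out


def centered_mean(x, w):
    """Acausal mean (odd w): output i also uses w // 2 future samples."""
    half = w // 2
    out = []
    for i in range(len(x)):
        lo, hi = max(0, i - half), min(len(x), i + half + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out


ramp = [float(i) for i in range(100)]
w = 5
causal = trailing_mean(ramp, w)
acausal = centered_mean(ramp, w)

# On a ramp, the trailing mean lags the input by (w - 1) / 2 samples,
# while the centered mean tracks it exactly away from the edges.
lag_causal = ramp[50] - causal[50]
lag_acausal = ramp[50] - acausal[50]
```

Under the same boxcar assumption, a 48-sample trailing window at 16 Hz would lag a steadily changing signal by 23.5 samples, or roughly 1.5 seconds.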

PostFactoProcessor (Acausal)

The PostFactoProcessor is used in Lab Mode (configured via strict_r_parity: true). It is optimized for validation against established R implementations.

Methodology

Unlike the streaming version, this processor:

  1. Loads the entire dataset first.

  2. Performs batch calibration (finding the optimal attachment angles and mag offsets from the whole file).

  3. Uses zero-phase filters (filtfilt) which process the data both forward and backward to eliminate phase shift.

  4. Applies linear interpolation for depth gaps before any processing.
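The depth gap-filling step can be sketched as follows, assuming gaps are marked with None or NaN and that leading or trailing gaps are left untouched. The function name is hypothetical and the package's own handling of edge cases may differ.

```python
import math


def interpolate_depth_gaps(times, depths):
    """Linearly fill missing depths between the nearest valid neighbours."""

    def is_missing(value):
        return value is None or (isinstance(value, float) and math.isnan(value))

    filled = list(depths)
    valid = [i for i, d in enumerate(depths) if not is_missing(d)]
    # Walk consecutive pairs of valid samples and fill what lies between.
    for left, right in zip(valid, valid[1:]):
        span = times[right] - times[left]
        for i in range(left + 1, right):
            frac = (times[i] - times[left]) / span
            filled[i] = depths[left] + frac * (depths[right] - depths[left])
    return filled


times = [0.0, 1.0, 2.0, 3.0, 4.0]
depths = [10.0, None, None, 16.0, float("nan")]
filled = interpolate_depth_gaps(times, depths)
```

This only works because the whole record is available: filling a gap needs the next valid sample, which a causal processor has not seen yet.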

API Reference

Streaming Processor

class biologger_sim.processors.streaming.StreamingProcessor(filt_len: int = 48, freq: int = 16, debug_level: int = 0, locked_attachment_roll_deg: float | None = None, locked_attachment_pitch_deg: float | None = None, locked_mag_offset_x: float | None = None, locked_mag_offset_y: float | None = None, locked_mag_offset_z: float | None = None, locked_mag_sphere_radius: float | None = None, ema_fast_alpha: float = 0.2, ema_slow_alpha: float = 0.02, ema_cross_threshold: float = 0.5, zmq_publisher: ZMQPublisher | None = None, eid: int | None = None, sim_id: str = 'default', tag_id: str = 'unknown', dead_reckoning_speed_model: str = 'odba_scaled', dead_reckoning_constant_speed: float = 1.0, dead_reckoning_odba_factor: float = 2.0, highpass_cutoff: float = 0.1, **kwargs: Any)

Bases: BiologgerProcessor

Causal (real-time) streaming processor for digital twin and on-tag simulation.

calibrate_from_batch_data() → None

get_current_state() → dict[str, Any]

Get current processor state and internal parameters.

Returns:

Dict containing current state information, calibration status, buffer contents, and algorithm-specific diagnostics

get_performance_summary() → dict[str, Any]

Get comprehensive performance metrics and telemetry.

Returns:

Dict containing timing statistics, throughput metrics, algorithm performance, and diagnostic information

process(record: dict[str, Any] | Any) → dict[str, Any]

Process input data through the pipeline.

This is the main processing method that all processors must implement. The input format is flexible to support different processor types:

  • Streaming: Dict with sensor records (accel, mag, depth, etc.)

  • Adaptive: np.ndarray with accelerometer samples

  • Batch: Dict with full dataset parameters

Parameters:

record – Input data; the format depends on the processor type

Returns:

Dict containing processed results and metadata

Raises:

ProcessingError – If processing fails due to invalid input or internal error

reset() → None

Reset processor to initial state.

Clears all internal buffers, calibration state, and accumulated statistics while preserving configuration settings.

update_config(config_updates: dict[str, Any]) → None

Update processor configuration at runtime.

Parameters:

config_updates – Dictionary of configuration parameters to update

Raises:

ConfigurationError – If configuration update is invalid

Lab Processor (Post-Facto)

class biologger_sim.processors.lab.PostFactoProcessor(filt_len: int = 48, freq: int = 16, debug_level: int = 0, r_exact_mode: bool = False, compute_attachment_angles: bool = True, locked_attachment_roll_deg: float | None = None, locked_attachment_pitch_deg: float | None = None, compute_mag_offsets: bool = True, locked_mag_offset_x: float | None = None, locked_mag_offset_y: float | None = None, locked_mag_offset_z: float | None = None, locked_mag_sphere_radius: float | None = None, depth_cfg: DepthConfig | None = None, zmq_publisher: ZMQPublisher | None = None, eid: int | None = None, sim_id: str | None = None, tag_id: str = 'unknown', clock_source: ClockSource = ClockSource.FIXED_FREQ, **kwargs: Any)

Bases: BiologgerProcessor

Post-facto (non-causal) biologger processor for R-compatibility.

This processor uses R’s centered moving average filter (filter(sides=2, circular=TRUE)) to achieve exact tie-out with the gRumble R package. Unlike StreamingProcessor, which uses a causal lfilter (trailing window), this processor uses a centered window that looks both forward and backward in time, which is only possible in post-hoc analysis.
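As an illustration of what a centered, circular moving average computes, here is a minimal sketch modelled on the documented behaviour of R's stats::filter with equal weights. The function name is hypothetical and this is not the package's own implementation; for an even window length it follows R's documented rule that more of the window lies forward in time than backward.

```python
def centered_circular_mean(x, filt_len):
    """Equal-weight centered moving average with wrap-around at the ends."""
    n = len(x)
    offset = filt_len // 2
    out = []
    for i in range(n):
        total = 0.0
        for j in range(filt_len):
            # Indices run from i + offset down to i + offset - (filt_len - 1);
            # the modulo wraps past either end of the record (circular).
            total += x[(i + offset - j) % n]
        out.append(total / filt_len)
    return out


signal = [1.0, 2.0, 3.0, 4.0, 5.0]
smoothed = centered_circular_mean(signal, 3)
```

Because every output needs samples from after its own timestamp, and the first and last outputs wrap around to the opposite end of the record, this filter has no streaming equivalent.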

Memory Footprint:
  • Fixed size: O(filt_len) samples (48 samples @ 16Hz for 3s window)

  • Independent of dataset size (unlike batch/ which loads all data)

Validation Target:
  • <0.1° error vs R (pitch, roll, heading)

  • Exact ODBA/VeDBA match
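When checking an angular tolerance such as the 0.1° target, the comparison has to account for wrap-around at 0°/360°. A minimal sketch, with hypothetical function names and made-up sample values:

```python
def angle_error_deg(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def max_angle_error_deg(ours, reference):
    """Worst-case angular error across two equal-length series."""
    return max(angle_error_deg(a, b) for a, b in zip(ours, reference))


# Illustrative values only, not real validation output.
ours = [359.95, 10.00, 180.02]
reference = [0.02, 10.05, 179.98]
worst = max_angle_error_deg(ours, reference)
within_target = worst < 0.1
```

A naive subtraction would report the first pair as almost 360° apart, even though the two headings differ by only 0.07°.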

calibrate_from_batch_data() → None

Perform batch calibration from collected data.

This should be called after the first pass through the dataset in r_exact_mode. Computes attachment angles and magnetometer offsets from full dataset.

get_current_state() → dict[str, Any]

Get current processor state.

get_output_schema() → list[str]

Get list of output fields (R-compatible expanded schema).

get_performance_summary() → dict[str, Any]

Get performance metrics.

process(data: dict[str, Any] | ndarray) → dict[str, Any]

Process a single record using non-causal filtfilt.

Parameters:

data – Raw sensor record (dict or array)

Returns:

Dictionary with processed state, or minimal state if record is skipped.

reset() → None

Reset processor to initial state.

update_config(config_updates: dict[str, Any]) → None

Update processor configuration.