From b0a164795618199c7f6c4c5306e029b1ad5e7942 Mon Sep 17 00:00:00 2001
From: Daniel Rosel
Date: Fri, 23 Jan 2026 12:52:58 +0100
Subject: [PATCH] docs

---
 lab/README.md                | 74 +++++++++++++++++++++++++++
 lab/docs/index.rst           |  1 +
 lab/docs/system_overview.rst | 97 ++++++++++++++++++++++++++++++++++++
 3 files changed, 172 insertions(+)
 create mode 100644 lab/docs/system_overview.rst

diff --git a/lab/README.md b/lab/README.md
index c4db76a..b5226aa 100644
--- a/lab/README.md
+++ b/lab/README.md
@@ -1 +1,75 @@
 # MOS (Money Operating System)
+
+Research-grade quote-control simulator for studying dynamic pricing and market making policies.
+The system models pricing as a closed loop of **Quote → Arrival → Execution → Position**, enabling
+controlled experimentation with demand models, inventory constraints, and reward shaping.
+
+## Core Loop
+
+1. **Quote** – the policy posts prices (one-sided or two-sided depending on the mechanism).
+2. **Arrival** – a population model generates purchase opportunities or market orders.
+3. **Execution** – an execution model decides whether an arrival converts at the quoted price.
+4. **Position** – inventory/position limits censor fills and generate holding/shortage costs.
+5. **Observation & Reward** – censored fills and aggregate metrics are exposed to the agent, while
+   objectives turn metrics into a scalar reward.
+
+Each stage is pluggable via light-weight protocols so you can swap in alternative mechanisms,
+demand models, or objectives without rewriting the rest of the simulator.
+
+## Package Layout
+
+| Module            | Purpose |
+|-------------------|---------|
+| `lab.outlet`      | Core simulation engine, domain types, pricing mechanisms, objectives. |
+| `lab.population`  | Demand arrival models, execution probability models, competitor/market dynamics. |
+| `lab.experiments` | Rollout utilities, baseline policies, and off-policy evaluation helpers. |
+| `lab.config`      | Convenience factories for preconfigured retail and market-making environments. |
+
+## Preconfigured Scenarios
+
+### Retail Dynamic Pricing
+- Mechanism: posted prices with margin and delta constraints.
+- Arrivals: browsing sessions with contamination support (scrapers).
+- Execution: elasticity model with competitor cross-effects.
+- Position: inventory tracking with holding and shortage costs.
+- Market: reactive competitor that can trigger price wars.
+- Objective: PnL minus volatility, holding cost, and lost opportunity penalties.
+
+```python
+from lab.config import make_retail_platform
+from lab.experiments import rollout, fixed_price_policy
+
+platform = make_retail_platform()
+policy = fixed_price_policy(platform.instruments.refs)
+result = rollout(platform, policy, n_steps=100)
+print(result.total_pnl)
+```
+
+### Market Making
+- Mechanism: two-sided quoting with bid/ask spreads.
+- Arrivals: Hawkes order flow for clustered demand.
+- Execution: Avellaneda–Stoikov style intensity model.
+- Position: inventory risk limits and quadratic penalty objective.
+- Market: geometric Brownian motion mid-price process.
+- Objective: PnL plus spread capture minus inventory risk.
+
+```python
+from lab.config import make_market_making_platform
+from lab.experiments import rollout
+
+platform = make_market_making_platform()
+mm_policy = lambda obs, t: (platform.instruments.refs, 1.0)
+result = rollout(platform, mm_policy, n_steps=200, seed=42)
+print(result.total_pnl)
+```
+
+## Extending the Simulator
+
+- Implement `lab.outlet.protocols.Mechanism` or `ArrivalModel` to introduce new pricing
+domains or demand processes.
+- Compose objectives with `lab.outlet.objectives.factory.make_composite` to study alternate
+reward formulations.
+- Use `lab.experiments.compare_policies` to benchmark candidate policies across multiple
+random seeds.
+
+Comprehensive API documentation lives in `lab/docs` (build with `make html`).
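The five-stage Core Loop described in the README can be illustrated with a toy, standard-library-only sketch. Nothing here is the `lab` API: `toy_rollout`, `ToyResult`, the exponential purchase probability, and every parameter are hypothetical stand-ins chosen only to show how a quote, random arrivals, probabilistic execution, and inventory censoring combine into a per-step reward.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class ToyResult:
    total_pnl: float
    units_sold: int
    lost_sales: int
    ending_inventory: int


def toy_rollout(price, n_steps=100, seed=0, inventory=50,
                unit_cost=6.0, holding_cost=0.01, max_arrivals=5):
    """Toy Quote -> Arrival -> Execution -> Position loop (hypothetical, not the lab API)."""
    rng = random.Random(seed)  # a single seeded RNG keeps the run reproducible
    pnl, sold, lost = 0.0, 0, 0
    for _ in range(n_steps):
        # Quote: a fixed-price policy posts the same price every step.
        quote = price
        # Arrival: a small random number of shoppers shows up.
        arrivals = rng.randint(0, max_arrivals)
        # Execution: each shopper buys with a probability that decays in price
        # (an assumed demand curve, chosen only for illustration).
        p_buy = math.exp(-quote / 10.0)
        wanted = sum(rng.random() < p_buy for _ in range(arrivals))
        # Position: inventory censors fills; unmet demand counts as lost sales.
        filled = min(wanted, inventory)
        inventory -= filled
        sold += filled
        lost += wanted - filled
        # Reward: margin on fills minus a holding cost on remaining stock.
        pnl += filled * (quote - unit_cost) - holding_cost * inventory
    return ToyResult(pnl, sold, lost, inventory)


result = toy_rollout(price=8.0, seed=42)
print(result)
```

Because the agent only ever sees `filled`, never `wanted`, this sketch also shows why the README calls the observed fills censored: demand above the remaining inventory is invisible unless lost sales are tracked separately.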
diff --git a/lab/docs/index.rst b/lab/docs/index.rst
index b53fbba..bd36ecd 100644
--- a/lab/docs/index.rst
+++ b/lab/docs/index.rst
@@ -28,6 +28,7 @@ Quick Start
    :maxdepth: 2
    :caption: Contents:
 
+   system_overview
    modules/outlet
    modules/population
    modules/experiments
diff --git a/lab/docs/system_overview.rst b/lab/docs/system_overview.rst
new file mode 100644
index 0000000..3fda8ad
--- /dev/null
+++ b/lab/docs/system_overview.rst
@@ -0,0 +1,97 @@
+System Overview
+===============
+
+The simulator organises dynamic pricing and market-making experiments as a
+closed loop with the following stages:
+
+* **Quote** – a policy or agent emits a :class:`lab.outlet.types.Quote`. The
+  quote is normalised and validated by a concrete
+  :class:`lab.outlet.protocols.Mechanism` implementation
+  (posted-price, two-sided, auction).
+* **Arrival** – a :class:`lab.outlet.protocols.ArrivalModel` samples a stream of
+  :class:`lab.outlet.types.Opportunity` objects given the current time,
+  instrument catalogue, and market state.
+* **Execution** – the :class:`lab.outlet.protocols.ExecutionModel` converts an
+  opportunity into a probabilistic fill using the active quote, optional
+  competitor prices, and demand-side context.
+* **Position** – a :class:`lab.outlet.protocols.PositionModel` enforces
+  inventory or position constraints, censors oversized fills, and accrues
+  holding and shortage costs.
+* **Observation & Reward** – the
+  :class:`lab.outlet.protocols.ObservationBuilder` constructs the censored view
+  exposed to the agent, while a :class:`lab.outlet.protocols.Objective`
+  transforms :class:`lab.outlet.types.StepMetrics` into a scalar reward with an
+  optional breakdown per term.
+
+These components are orchestrated by :class:`lab.outlet.platform.Platform`,
+which manages internal hidden state, deterministic seeding, and logging.
+
+Component Matrix
+----------------
+
+=============================== ==============================================
+Layer                           Responsibilities / Examples
+=============================== ==============================================
+Mechanisms                      Quote normalisation, execution semantics
+                                (``posted_price``, ``two_sided``, ``auction``).
+Population models               Arrivals (:mod:`lab.population.arrivals`),
+                                execution probability models
+                                (:mod:`lab.population.execution`), and
+                                competitor or market dynamics
+                                (:mod:`lab.population.competitors`).
+Position management             Inventory limits, replenishment, holding and
+                                shortage costs (:mod:`lab.outlet.stock`).
+Observation & logging           Censored observations and optional event logs
+                                (:mod:`lab.outlet.observation`).
+Objectives                      Reward composition utilities
+                                (:mod:`lab.outlet.objectives`).
+Experiments                     Rollout helpers, baseline policies, off-policy
+                                evaluation (:mod:`lab.experiments.eval`).
+=============================== ==============================================
+
+Preconfigured Platforms
+-----------------------
+
+Two high-level factories in :mod:`lab.config` wire common combinations of the
+building blocks:
+
+* **Retail dynamic pricing** – posted-price mechanism, session arrivals with
+  contamination, elasticity-based executions, reactive competitor model, and a
+  composite objective that penalises volatility, holding costs, and lost
+  opportunities.
+* **Market making** – two-sided quoting, Hawkes order flow, intensity-based
+  executions, geometric Brownian motion mid-prices, and an objective combining
+  PnL, spread capture, and quadratic inventory risk.
+
+State & Reset Behaviour
+-----------------------
+
+When you call :meth:`lab.outlet.platform.Platform.reset`, the platform resets
+instrument positions, quotes, and hidden state, but component implementations
+may maintain their own internal buffers. For reproducible experiments:
+
+* Use freshly instantiated arrival/market models for each episode, or add
+  explicit ``reset`` methods if the model keeps history (for example,
+  :class:`lab.population.arrivals.HawkesArrivalModel` maintains an event
+  history, while :class:`lab.population.competitors.ReactiveCompetitorModel`
+  tracks prior competitor quotes).
+* Seed randomness through the factory configuration (``RetailConfig.seed`` or
+  ``MarketMakingConfig.seed``) or pass a seed to ``Platform.reset`` for
+  deterministic rollouts.
+
+Extending the Platform
+----------------------
+
+To support a new domain:
+
+1. Create custom Mechanism/Arrival/Execution/Market/Observation components by
+   implementing the respective protocol in :mod:`lab.outlet.protocols`.
+2. Compose a new objective with
+   :func:`lab.outlet.objectives.factory.make_composite` or write a bespoke
+   :class:`lab.outlet.objectives.base.BaseObjective`.
+3. Wire everything together via :class:`lab.outlet.platform.Platform` directly
+   or expose a helper factory in :mod:`lab.config`.
+
+Use :func:`lab.experiments.rollout` and
+:func:`lab.experiments.compare_policies` to benchmark candidate policies under
+multiple random seeds, collecting per-step logs for analysis or OPE.
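The reset guidance in the State & Reset Behaviour section can be illustrated with a small standard-library sketch. `ToySelfExcitingArrivals` and `run_episode` are hypothetical names, not the package's `HawkesArrivalModel` or rollout helpers; the sketch only assumes a component that keeps its own event history, which a platform-level reset would not clear.

```python
import math
import random


class ToySelfExcitingArrivals:
    """Hypothetical history-keeping arrival model (not lab's HawkesArrivalModel)."""

    def __init__(self, base_rate=0.2, jump=0.3, decay=0.5):
        self.base_rate, self.jump, self.decay = base_rate, jump, decay
        self.history = []  # event times: internal state outside the platform's view

    def reset(self):
        """Drop the event history so the next episode starts clean."""
        self.history.clear()

    def sample(self, t, rng):
        """Return True if an arrival occurs at integer time t."""
        # Every stored event adds to the intensity, including stale events
        # left over from an earlier episode if reset() was never called.
        excitation = sum(self.jump * math.exp(-self.decay * (t - s))
                         for s in self.history)
        hit = rng.random() < min(1.0, self.base_rate + excitation)
        if hit:
            self.history.append(t)
        return hit


def run_episode(model, seed, n_steps=50):
    rng = random.Random(seed)  # per-episode seed, one draw per step
    return [model.sample(t, rng) for t in range(n_steps)]


model = ToySelfExcitingArrivals()
first = run_episode(model, seed=7)
stale = run_episode(model, seed=7)  # same seed, but history leaks across episodes
model.reset()
clean = run_episode(model, seed=7)  # explicit reset restores the first trajectory
fresh = run_episode(ToySelfExcitingArrivals(), seed=7)  # so does a new instance
print(first == clean == fresh, first == stale)
```

Seeding alone is not enough in this sketch: `stale` consumes the same random draws as `first` but compares them against intensities inflated by the previous episode's events, so only a fresh instance or an explicit `reset` makes two same-seed episodes comparable.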