docs

2026-07-16 01:53:37 +00:00 · 2026-01-23 12:52:58 +01:00
parent 19bb4fd517
commit b0a1647956
3 changed files with 172 additions and 0 deletions
--- a/lab/README.md
+++ b/lab/README.md
@@ -1 +1,75 @@
 # MOS (Money Operating System)
 Research-grade quote-control simulator for studying dynamic pricing and market-making policies.
 The system models pricing as a closed loop of **Quote → Arrival → Execution → Position**, enabling
 controlled experimentation with demand models, inventory constraints, and reward shaping.
 ## Core Loop
 1. **Quote** – the policy posts prices (one-sided or two-sided depending on the mechanism).
 2. **Arrival** – a population model generates purchase opportunities or market orders.
 3. **Execution** – an execution model decides whether an arrival converts at the quoted price.
 4. **Position** – inventory/position limits censor fills and generate holding/shortage costs.
 5. **Observation & Reward** – censored fills and aggregate metrics are exposed to the agent, while
   objectives turn metrics into a scalar reward.
 Each stage is pluggable via lightweight protocols so you can swap in alternative mechanisms,
 demand models, or objectives without rewriting the rest of the simulator.
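The loop can be illustrated with a self-contained toy that does not use the `lab` package. Every name here (`ExecutionModel`, `LogisticExecution`, `run_episode`) is a hypothetical stand-in rather than the simulator's actual API; the sketch only shows how a pluggable execution stage and inventory censoring might fit together.

```python
import math
import random
from dataclasses import dataclass
from typing import Protocol


class ExecutionModel(Protocol):
    """Hypothetical stage interface: probability that one arrival converts."""

    def fill_probability(self, price: float) -> float: ...


@dataclass
class LogisticExecution:
    """Toy demand curve: conversion falls as the price rises above `reference`."""

    reference: float = 10.0
    sensitivity: float = 0.5

    def fill_probability(self, price: float) -> float:
        return 1.0 / (1.0 + math.exp(self.sensitivity * (price - self.reference)))


def run_episode(execution: ExecutionModel, price: float, inventory: int,
                n_steps: int, seed: int) -> tuple[float, int, int]:
    """Run Quote -> Arrival -> Execution -> Position with a fixed quote.

    Returns (revenue, remaining inventory, demand lost to stock-outs).
    """
    rng = random.Random(seed)
    revenue, lost = 0.0, 0
    for _ in range(n_steps):
        arrivals = rng.randint(0, 3)  # Arrival: 0 to 3 opportunities per step
        wanted = sum(rng.random() < execution.fill_probability(price)
                     for _ in range(arrivals))  # Execution: which arrivals convert
        filled = min(wanted, inventory)  # Position: inventory censors fills
        lost += wanted - filled
        inventory -= filled
        revenue += filled * price
    return revenue, inventory, lost


revenue, left, lost = run_episode(LogisticExecution(), price=9.0,
                                  inventory=20, n_steps=50, seed=0)
print(revenue, left, lost)
```

Because `run_episode` depends only on the `ExecutionModel` protocol, any object with a matching `fill_probability` method can replace `LogisticExecution` without touching the loop.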
 ## Package Layout
 | Module            | Purpose |
 |-------------------|---------|
 | `lab.outlet`      | Core simulation engine, domain types, pricing mechanisms, objectives. |
 | `lab.population`  | Demand arrival models, execution probability models, competitor/market dynamics. |
 | `lab.experiments` | Rollout utilities, baseline policies, and off-policy evaluation helpers. |
 | `lab.config`      | Convenience factories for preconfigured retail and market-making environments. |
 ## Preconfigured Scenarios
 ### Retail Dynamic Pricing
 - Mechanism: posted prices with margin and delta constraints.
 - Arrivals: browsing sessions with contamination support (scrapers).
 - Execution: elasticity model with competitor cross-effects.
 - Position: inventory tracking with holding and shortage costs.
 - Market: reactive competitor that can trigger price wars.
 - Objective: PnL minus volatility, holding-cost, and lost-opportunity penalties.
 ```python
 from lab.config import make_retail_platform
 from lab.experiments import rollout, fixed_price_policy
 platform = make_retail_platform()
 policy = fixed_price_policy(platform.instruments.refs)
 result = rollout(platform, policy, n_steps=100)
 print(result.total_pnl)
 ```
 ### Market Making
 - Mechanism: two-sided quoting with bid/ask spreads.
 - Arrivals: Hawkes order flow for clustered demand.
 - Execution: Avellaneda–Stoikov style intensity model.
 - Position: inventory risk limits and quadratic penalty objective.
 - Market: geometric Brownian motion mid-price process.
 - Objective: PnL plus spread capture minus inventory risk.
 ```python
 from lab.config import make_market_making_platform
 from lab.experiments import rollout
 platform = make_market_making_platform()
 def mm_policy(obs, t):
     # Constant policy: ignores the observation and the time step.
     return platform.instruments.refs, 1.0
 result = rollout(platform, mm_policy, n_steps=200, seed=42)
 print(result.total_pnl)
 ```
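As a rough illustration of the execution and objective terms listed above, the sketch below uses the exponential fill intensity from the Avellaneda–Stoikov model, λ(δ) = A·exp(−kδ), together with a quadratic inventory penalty. The function names and the parameter values (`A`, `k`, `risk_aversion`) are assumptions chosen for illustration and are not taken from `lab`.

```python
import math


def fill_intensity(delta: float, A: float = 1.5, k: float = 0.8) -> float:
    """Arrival intensity for a quote placed `delta` away from the mid-price."""
    return A * math.exp(-k * delta)


def fill_probability(delta: float, dt: float = 1.0) -> float:
    """Probability of at least one fill within `dt`, assuming Poisson arrivals."""
    return 1.0 - math.exp(-fill_intensity(delta) * dt)


def step_reward(pnl: float, spread_capture: float, inventory: float,
                risk_aversion: float = 0.01) -> float:
    """PnL plus spread capture minus a quadratic inventory penalty."""
    return pnl + spread_capture - risk_aversion * inventory ** 2


# Wider quotes fill less often; larger positions are penalised quadratically.
for delta in (0.1, 0.5, 1.0):
    print(delta, round(fill_probability(delta), 3))
print(step_reward(pnl=1.0, spread_capture=0.5, inventory=10))
```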
 ## Extending the Simulator
 - Implement `lab.outlet.protocols.Mechanism` or `ArrivalModel` to introduce new pricing
 domains or demand processes.
 - Compose objectives with `lab.outlet.objectives.factory.make_composite` to study alternate
 reward formulations.
 - Use `lab.experiments.compare_policies` to benchmark candidate policies across multiple
 random seeds.
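To make the idea of a composite objective concrete, here is a minimal sketch of a weighted sum of reward terms that also returns a per-term breakdown. `weighted_sum`, the metric keys, and the weights are all hypothetical; the real signature of `make_composite` is not shown in this README and may differ.

```python
from typing import Callable, Mapping

Metrics = Mapping[str, float]
Term = Callable[[Metrics], float]


def weighted_sum(terms: Mapping[str, tuple[float, Term]]):
    """Build an objective returning (scalar reward, per-term breakdown)."""

    def objective(metrics: Metrics) -> tuple[float, dict[str, float]]:
        breakdown = {name: weight * term(metrics)
                     for name, (weight, term) in terms.items()}
        return sum(breakdown.values()), breakdown

    return objective


# Hypothetical retail-style objective: PnL minus holding and lost-sales penalties.
objective = weighted_sum({
    "pnl": (1.0, lambda m: m["pnl"]),
    "holding": (-0.25, lambda m: m["inventory"]),
    "lost_sales": (-0.5, lambda m: m["lost"]),
})

reward, breakdown = objective({"pnl": 12.0, "inventory": 30.0, "lost": 2.0})
print(reward, breakdown)
```

Keeping the breakdown next to the scalar makes it easier to see which penalty dominates when two reward formulations are compared.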
 Comprehensive API documentation lives in `lab/docs` (build with `make html`).
--- a/lab/docs/index.rst
+++ b/lab/docs/index.rst
@@ -28,6 +28,7 @@ Quick Start
   :maxdepth: 2
   :caption: Contents:
   system_overview
   modules/outlet
   modules/population
   modules/experiments
--- a/lab/docs/system_overview.rst
+++ b/lab/docs/system_overview.rst
@@ -0,0 +1,97 @@
 System Overview
 ===============
 The simulator organises dynamic pricing and market-making experiments as a
 closed loop with the following stages:
 * **Quote** – a policy or agent emits a :class:`lab.outlet.types.Quote`. The
  quote is normalised and validated by a concrete
  :class:`lab.outlet.protocols.Mechanism` implementation
  (posted-price, two-sided, auction).
 * **Arrival** – a :class:`lab.outlet.protocols.ArrivalModel` samples a stream of
  :class:`lab.outlet.types.Opportunity` objects given the current time,
  instrument catalogue, and market state.
 * **Execution** – the :class:`lab.outlet.protocols.ExecutionModel` converts an
  opportunity into a probabilistic fill using the active quote, optional
  competitor prices, and demand-side context.
 * **Position** – a :class:`lab.outlet.protocols.PositionModel` enforces
  inventory or position constraints, censors oversized fills, and accrues
  holding and shortage costs.
 * **Observation & Reward** – the
  :class:`lab.outlet.protocols.ObservationBuilder` constructs the censored view
  exposed to the agent, while a :class:`lab.outlet.protocols.Objective`
  transforms :class:`lab.outlet.types.StepMetrics` into a scalar reward with an
  optional breakdown per term.
 These components are orchestrated by :class:`lab.outlet.platform.Platform`,
 which manages internal hidden state, deterministic seeding, and logging.
 Component Matrix
 ----------------
 ===============================  ==============================================
 Layer                            Responsibilities / Examples
 ===============================  ==============================================
 Mechanisms                       Quote normalisation, execution semantics
                                  (``posted_price``, ``two_sided``, ``auction``).
 Population models                Arrivals (:mod:`lab.population.arrivals`),
                                 execution probability models
                                 (:mod:`lab.population.execution`), and
                                 competitor or market dynamics
                                 (:mod:`lab.population.competitors`).
 Position management              Inventory limits, replenishment, holding and
                                 shortage costs (:mod:`lab.outlet.stock`).
 Observation & logging            Censored observations and optional event logs
                                 (:mod:`lab.outlet.observation`).
 Objectives                       Reward composition utilities
                                 (:mod:`lab.outlet.objectives`).
 Experiments                      Rollout helpers, baseline policies, off-policy
                                 evaluation (:mod:`lab.experiments.eval`).
 ===============================  ==============================================
 Preconfigured Platforms
 -----------------------
 Two high-level factories in :mod:`lab.config` wire common combinations of the
 building blocks:
 * **Retail dynamic pricing** – posted-price mechanism, session arrivals with
  contamination, elasticity-based executions, reactive competitor model, and a
  composite objective that penalises volatility, holding costs, and lost
  opportunities.
 * **Market making** – two-sided quoting, Hawkes order flow, intensity-based
  executions, geometric Brownian motion mid-prices, and an objective combining
  PnL, spread capture, and quadratic inventory risk.
 State & Reset Behaviour
 -----------------------
 When you call :meth:`lab.outlet.platform.Platform.reset`, the platform resets
 instrument positions, quotes, and hidden state, but component implementations
 may keep internal buffers of their own that survive it. For reproducible experiments:
 * Use freshly instantiated arrival/market models for each episode, or add explicit
  ``reset`` methods if the model keeps history (for example,
  :class:`lab.population.arrivals.HawkesArrivalModel` maintains an event
  history, while :class:`lab.population.competitors.ReactiveCompetitorModel`
  tracks prior competitor quotes).
 * Seed randomness through the factory configuration (``RetailConfig.seed`` or
  ``MarketMakingConfig.seed``) or pass a seed to ``Platform.reset`` for
  deterministic rollouts.
 Extending the Platform
 ----------------------
 To support a new domain:
 1. Create custom Mechanism/Arrival/Execution/Market/Observation components by
   implementing the respective protocol in :mod:`lab.outlet.protocols`.
 2. Compose a new objective with
   :func:`lab.outlet.objectives.factory.make_composite` or write a bespoke
   :class:`lab.outlet.objectives.base.BaseObjective`.
 3. Wire everything together via :class:`lab.outlet.platform.Platform` directly
   or expose a helper factory in :mod:`lab.config`.
 Use :func:`lab.experiments.rollout` and
 :func:`lab.experiments.compare_policies` to benchmark candidate policies under
 multiple random seeds, collecting per-step logs for analysis or off-policy
 evaluation (OPE).