From ea5e4326726abc9305978bf7fbc1975cde3c7b4b Mon Sep 17 00:00:00 2001
From: Daniel Rosel
Date: Fri, 9 Jan 2026 18:22:17 +0100
Subject: [PATCH] fix: align

---
 paper/src/chapters/03-methodology.tex | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/paper/src/chapters/03-methodology.tex b/paper/src/chapters/03-methodology.tex
index cc2eb61..c8af6ec 100644
--- a/paper/src/chapters/03-methodology.tex
+++ b/paper/src/chapters/03-methodology.tex
@@ -32,7 +32,6 @@ where $\alpha \in [0, 1]$ represents the contamination parameter (proportion of
 
 
 
-
 \subsection{Cost of Information (COI) Framework}
 The \textit{Cost of Information} (COI) represents the markup a pricing policy $\pi$ attempts to extract from the market by leveraging demand signals. We define COI as the expected premium over the minimum viable price $\underline{p}$ (or marginal cost). This also speaks to the financial urgency as a consequence of information asymmetry between the platform and the actors.
 
@@ -40,8 +39,8 @@ The \textit{Cost of Information} (COI) represents the markup a pricing policy $\
 \begin{definition}[Cost of Information]
 Let $\pi(\tau)$ be a pricing policy mapping interaction histories to prices. The COI is defined as:
 \begin{align}
-    \text{COI} &= \mathbb{E}[P] - \underline{p} \\
-    &= \int_{\underline{p}}^{\bar{p}} (1 - F_\pi(p)) \, dp
+\text{COI} &= \mathbb{E}[P] - \underline{p} \\
+           &= \int_{\underline{p}}^{\bar{p}} (1 - F_\pi(p)) \, dp
 \end{align}
 where $F_\pi(p)$ is the cumulative distribution function of prices generated by $\pi$ under standard operating conditions.
 \end{definition}
@@ -183,8 +182,10 @@ Study methodology and approach. Data acquisition strategy. Defined objectives an
 
 To develop a robust pricing agent, we require a simulation environment capable of generating realistic, contaminated interaction data. We achieve this by learning from our Phantom platform data using a two-stage approach.
 
+
+
 \subsubsection{GOFAI-Based Separability}
-We employ Good Old-Fashioned AI (GOFAI) heuristics to generate initial weak labels for separability. We define a set of rule-based predicates $\phi_j: \tau \to \{0, 1\}$ (e.g., inter-arrival time consistency, DOM-traversal linearity) to partition the dataset $\mathcal{D}$ into high-confidence sets $\mathcal{D}_H$ and $\mathcal{D}_A$.
+We employ Good Old-Fashioned AI (GOFAI) heuristics to generate initial weak labels for separability. We define a set of rule-based predicates $\phi_j: \tau \to \{0, 1\}$ to partition the dataset $\mathcal{D}$ into high-confidence sets $\mathcal{D}_H$ and $\mathcal{D}_A$. We construct distinct MDPs per each behavioral profile of humans and agents and from those we establish $D_{KL}$.
 
 \subsubsection{Transition Probability Estimation}
 For both subsets, we model the session dynamics as a Markov Decision Process (MDP) and estimate the transition kernel $\mathcal{T}$. The probability of transitioning to state $s'$ given state $s$ is estimated via maximum likelihood:
@@ -225,4 +226,4 @@ Steve Burns, superior culliculus (face heuristics) we create this sort of part o
 
 We could say that a DQN for example is the learnin subsystem and then within our reward mechanism or some other computational method we introduce a steering subsystem which acts as the proposed ``pricing heuristic'' against the given non human transaction data.
 
-\section{}
+\section{Market construction}