From ea5e4326726abc9305978bf7fbc1975cde3c7b4b Mon Sep 17 00:00:00 2001
From: Daniel Rosel <daniel@alves.world>
Date: Fri, 9 Jan 2026 18:22:17 +0100
Subject: [PATCH] fix: align

---
 paper/src/chapters/03-methodology.tex | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/paper/src/chapters/03-methodology.tex b/paper/src/chapters/03-methodology.tex
index cc2eb61..c8af6ec 100644
--- a/paper/src/chapters/03-methodology.tex
+++ b/paper/src/chapters/03-methodology.tex
@@ -32,7 +32,6 @@ where $\alpha \in [0, 1]$ represents the contamination parameter (proportion of
 
 
 
-
 \subsection{Cost of Information (COI) Framework}
 
 The \textit{Cost of Information} (COI) represents the markup a pricing policy $\pi$ attempts to extract from the market by leveraging demand signals. We define COI as the expected premium over the minimum viable price $\underline{p}$ (or marginal cost). This also speaks to the financial urgency as a consequence of information asymmetry between the platform and the actors.
@@ -40,8 +39,8 @@ The \textit{Cost of Information} (COI) represents the markup a pricing policy $\
 \begin{definition}[Cost of Information]
 Let $\pi(\tau)$ be a pricing policy mapping interaction histories to prices. The COI is defined as:
 \begin{align}
-    \text{COI} &= \mathbb{E}[P] - \underline{p} \\
-               &= \int_{\underline{p}}^{\bar{p}} (1 - F_\pi(p)) \, dp
+\text{COI} &= \mathbb{E}[P] - \underline{p} \\
+            &= \int_{\underline{p}}^{\bar{p}} (1 - F_\pi(p)) \, dp
 \end{align}
 where $F_\pi(p)$ is the cumulative distribution function of prices generated by $\pi$ under standard operating conditions.
 \end{definition}
@@ -183,8 +182,10 @@ Study methodology and approach. Data acquisition strategy. Defined objectives an
 
 To develop a robust pricing agent, we require a simulation environment capable of generating realistic, contaminated interaction data. We achieve this by learning from our Phantom platform data using a two-stage approach.
 
+
+
 \subsubsection{GOFAI-Based Separability}
-We employ Good Old-Fashioned AI (GOFAI) heuristics to generate initial weak labels for separability. We define a set of rule-based predicates $\phi_j: \tau \to \{0, 1\}$ (e.g., inter-arrival time consistency, DOM-traversal linearity) to partition the dataset $\mathcal{D}$ into high-confidence sets $\mathcal{D}_H$ and $\mathcal{D}_A$.
+We employ Good Old-Fashioned AI (GOFAI) heuristics to generate initial weak labels for separability. We define a set of rule-based predicates $\phi_j: \tau \to \{0, 1\}$ to partition the dataset $\mathcal{D}$ into high-confidence sets $\mathcal{D}_H$ and $\mathcal{D}_A$. We then construct a distinct MDP for each behavioral profile (human and agent), and from these we compute the KL divergence $D_{KL}$ between the two.
 
 \subsubsection{Transition Probability Estimation}
 For both subsets, we model the session dynamics as a Markov Decision Process (MDP) and estimate the transition kernel $\mathcal{T}$. The probability of transitioning to state $s'$ given state $s$ is estimated via maximum likelihood:
@@ -225,4 +226,4 @@ Steve Burns, superior culliculus (face heuristics) we create this sort of part o
 
 We could say that a DQN for example is the learnin subsystem and then within our reward mechanism or some other computational method we introduce a steering subsystem which acts as the proposed ``pricing heuristic'' against the given non human transaction data.
 
-\section{}
+\section{Market Construction}