agentic divergence, finally

2026-07-16 01:53:37 +00:00 · 2026-01-13 16:10:10 +01:00
parent 4b392fc0a5
commit 21349c9387
2 changed files with 10 additions and 3 deletions
--- a/paper/src/chapters/03-methodology.tex
+++ b/paper/src/chapters/03-methodology.tex
@@ -185,15 +185,22 @@ To develop a robust pricing agent, we require a simulation environment capable o
 \subsubsection{GOFAI-Based Separability}
-We employ Good Old-Fashioned AI (GOFAI) heuristics to generate initial weak labels for separability. We define a set of rule-based predicates $\phi_j: \tau \to \{0, 1\}$ to partition the dataset $\mathcal{D}$ into high-confidence sets $\mathcal{D}_H$ and $\mathcal{D}_A$. We construct distinct MDPs per each behavioral profile of humans and agents and from those we establish $D_{KL}$.
+We employ Good Old-Fashioned AI (GOFAI) heuristics to generate initial weak labels for separability. We define a set of rule-based predicates $\phi_j: \tau \to \{0, 1\}$ to partition the dataset $\mathcal{D}$ into high-confidence sets $\mathcal{D}_H$ and $\mathcal{D}_A$. We construct a separate MDP for each behavioral profile (human and agent) and compute the KL divergence $D_{KL}$ between them. In initial experiments, the KL divergence between the state transition probabilities of the two MDPs is $\approx 2.0236$; the MDPs are shown in Figures~\ref{fig:human_mdp_viz} and~\ref{fig:agent_mdp_viz}.
 \begin{figure}[ht]
    \centering
    \includegraphics[width=0.8\textwidth]{chapters/mdp_human.pdf}
-    \caption{Markov Decision Process visualization illustrating the behavioral transition dynamics for human and agent actor profiles. The state space and transition probabilities are learned from observed session trajectories to enable generative contamination.}
+    \caption{Markov Decision Process visualization illustrating the behavioral transition dynamics for human actions.}
-    \label{fig:mdp_viz}
+    \label{fig:human_mdp_viz}
 \end{figure}
 \begin{figure}[ht]
    \centering
    \includegraphics[width=0.8\textwidth]{chapters/mdp_agent.pdf}
    \caption{Markov Decision Process visualization illustrating the behavioral transition dynamics for \textbf{agent} behavior profiles. The state space and transition probabilities are learned from observed session trajectories to enable generative contamination.}
    \label{fig:agent_mdp_viz}
  \end{figure}
 \subsubsection{Transition Probability Estimation}
 For both subsets, we model the session dynamics as a Markov Decision Process (MDP) and estimate the transition kernel $\mathcal{T}$. The probability of transitioning to state $s'$ given state $s$ is estimated via maximum likelihood:
 \begin{equation}
--- a/paper/src/chapters/mdp_agent.pdf
+++ b/paper/src/chapters/mdp_agent.pdf
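
Note on the divergence computation: the hunk above reports a KL divergence across transition probabilities but does not say how the per-state values are aggregated. The following is a minimal sketch, not the paper's implementation. It assumes that sessions are lists of state labels, that transition counts get additive smoothing (`alpha`, a hypothetical parameter that keeps every probability positive so the divergence stays finite), and that the overall figure is the unweighted mean of the row-wise KL divergences. All function names are illustrative.

```python
from collections import Counter
import math


def estimate_transitions(sessions, states, alpha=1.0):
    """Estimate P(s' | s) from state sequences using smoothed transition counts.

    alpha > 0 is an assumed additive-smoothing constant; it is not taken from the paper.
    """
    counts = Counter()
    for session in sessions:
        # Count each consecutive (state, next_state) pair in the session.
        for s, s_next in zip(session, session[1:]):
            counts[(s, s_next)] += 1
    kernel = {}
    for s in states:
        total = sum(counts[(s, t)] for t in states) + alpha * len(states)
        kernel[s] = {t: (counts[(s, t)] + alpha) / total for t in states}
    return kernel


def kl_divergence(p, q):
    """KL(p || q) in nats for two distributions over the same support."""
    return sum(pi * math.log(pi / q[t]) for t, pi in p.items() if pi > 0)


def mean_row_kl(kernel_p, kernel_q):
    """Unweighted mean over states of KL(P(. | s) || Q(. | s)) (assumed aggregation)."""
    return sum(kl_divergence(kernel_p[s], kernel_q[s]) for s in kernel_p) / len(kernel_p)
```

Because KL divergence is asymmetric, `mean_row_kl(human, agent)` and `mean_row_kl(agent, human)` generally differ, so the direction used for the reported value should be stated alongside it.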