mirror of
https://github.com/velocitatem/PHANTOM.git
synced 2026-05-31 16:43:36 +00:00
agentic givergence, finally
@@ -185,15 +185,22 @@ To develop a robust pricing agent, we require a simulation environment capable o


 \subsubsection{GOFAI-Based Separability}
-We employ Good Old-Fashioned AI (GOFAI) heuristics to generate initial weak labels for separability. We define a set of rule-based predicates $\phi_j: \tau \to \{0, 1\}$ to partition the dataset $\mathcal{D}$ into high-confidence sets $\mathcal{D}_H$ and $\mathcal{D}_A$. We construct distinct MDPs per each behavioral profile of humans and agents and from those we establish $D_{KL}$.
+We employ Good Old-Fashioned AI (GOFAI) heuristics to generate initial weak labels for separability. We define a set of rule-based predicates $\phi_j: \tau \to \{0, 1\}$ to partition the dataset $\mathcal{D}$ into high-confidence sets $\mathcal{D}_H$ and $\mathcal{D}_A$. We construct distinct MDPs per each behavioral profile of humans and agents and from those we establish $D_{KL}$. From initial findings we compute a KL divergence of $\approx 2.0236$ across transition probabilities between states which can be seen in \ref{fig:human_mdp_viz} and \ref{fig:agent_mdp_viz}.

 \begin{figure}[ht]
 \centering
 \includegraphics[width=0.8\textwidth]{chapters/mdp_human.pdf}
-\caption{Markov Decision Process visualization illustrating the behavioral transition dynamics for human and agent actor profiles. The state space and transition probabilities are learned from observed session trajectories to enable generative contamination.}
+\caption{Markov Decision Process visualization illustrating the behavioral transition dynamics for human actions.}
-\label{fig:mdp_viz}
+\label{fig:human_mdp_viz}
 \end{figure}

+\begin{figure}[ht]
+\centering
+\includegraphics[width=0.8\textwidth]{chapters/mdp_agent.pdf}
+\caption{Markov Decision Process visualization illustrating the behavioral transition dynamics for \textbf{agent} behavior profiles. The state space and transition probabilities are learned from observed session trajectories to enable generative contamination.}
+\label{fig:agent_mdp_viz}
+\end{figure}
+
 \subsubsection{Transition Probability Estimation}
 For both subsets, we model the session dynamics as a Markov Decision Process (MDP) and estimate the transition kernel $\mathcal{T}$. The probability of transitioning to state $s'$ given state $s$ is estimated via maximum likelihood:
 \begin{equation}
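The hunk above describes estimating a transition kernel per profile by maximum likelihood and then comparing the human and agent kernels with a KL divergence. A minimal sketch of that idea follows. The diff does not specify the state names, the toy sessions, the additive smoothing constant `alpha`, or how per-state divergences are aggregated; this sketch assumes an unweighted mean over states. All of these are illustrative assumptions, not the paper's implementation.

```python
from collections import Counter
import math


def estimate_transitions(trajectories, states, alpha=1e-3):
    """Count-based (maximum likelihood) estimate of P(s' | s).

    alpha is a small additive smoothing term (an assumption of this sketch)
    so that unseen transitions keep non-zero probability and the KL
    divergence stays finite.
    """
    counts = Counter()
    for traj in trajectories:
        for s, s_next in zip(traj, traj[1:]):
            counts[(s, s_next)] += 1
    kernel = {}
    for s in states:
        row_total = sum(counts[(s, t)] for t in states) + alpha * len(states)
        kernel[s] = {t: (counts[(s, t)] + alpha) / row_total for t in states}
    return kernel


def mean_row_kl(p, q, states):
    """Unweighted mean over states s of KL(p(.|s) || q(.|s)), in nats."""
    total = 0.0
    for s in states:
        total += sum(p[s][t] * math.log(p[s][t] / q[s][t]) for t in states)
    return total / len(states)


# Hypothetical state space and toy sessions, for illustration only.
states = ["browse", "cart", "checkout"]
human_sessions = [
    ["browse", "browse", "cart", "checkout"],
    ["browse", "cart", "browse"],
]
agent_sessions = [
    ["browse", "cart", "checkout"],
    ["browse", "cart", "checkout"],
]

T_h = estimate_transitions(human_sessions, states)
T_a = estimate_transitions(agent_sessions, states)
print(mean_row_kl(T_h, T_a, states))
```

The toy data is not meant to reproduce the reported value of about 2.0236. KL divergence is asymmetric, so swapping the argument order generally gives a different number.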
BIN
paper/src/chapters/mdp_agent.pdf
Normal file
Binary file not shown.