rephrasing some things and updating language

2026-05-31 16:43:36 +00:00 · 2026-04-09 09:30:23 +02:00
parent 47b07daa6c
commit ace52e8e14
11 changed files with 70 additions and 67 deletions
--- a/paper/src/main.tex
+++ b/paper/src/main.tex
@@ -18,14 +18,18 @@
 \end{titlepage}

 \begin{abstract}
-With accelerated growth of Large Language Model agents in e-commerce, a novel adversarial dynamic to digital markets emerges. This paper addresses the vulnerability of dynamic pricing systems to AI intermediaries that decouple the information gather stages from the transaction execution. By conducting reconnaissance in isolated sessions, agents circumvent the ``Cost of Information'' (COI) defined as the accumulated price premium typically via demand expression estimators. We formally define this phenomenon and derive the Cost of Information Theorem, proving that as the saturation of independent, utility-maximizing agents increases, the platform's ability to sustain a COI converges to zero, rendering standard dynamic pricing mechanisms incentive-incompatible. To respond to this threat, we propose a defensive framework which integrates behavioral economics with Adversarially Distributionally Robust Optimization (DRO). We introduce a custom e-commerce research platform built on a hybrid Kappa-Lambda architecture, designed to capture and simulate high-fidelity controlled interaction trajectories. We further demonstrate through modeling that human and agent behaviors exhibit distinct transition probability kernels, enabling the construction of discriminative models based on Kullback-Leibler divergence. These behavioral signals serve as inputs for a Distributionally Robust Reinforcement Learning (DR-RL) agent. We formulate the pricing problem as a Stackelberg game where the learner optimizes against an ambiguity set of demand distributions defined by the Wasserstein distance. This approach allows the pricing policy to remain robust against non-stationary contamination without overfitting to deterministic demand curves. 
Extensive TPU-accelerated factorial training demonstrates that while agent contamination causally reduces short-term revenue, our robust mechanism successfully preserves COI margin integrity and market equilibrium, particularly under higher contamination ratios and larger catalog sizes. Additionally, we show that integrating a balanced UX penalty drastically reduces supra-competitive pricing tendencies, minimizing degradation to the legitimate human user experience. Finally, we release our custom interaction framework and dataset as public artifacts to support future research in agent-mediated traffic.
+\noindent
+Large language model (LLM) agents are spreading in e-commerce; one consequence is intermediaries that can separate information gathering from transaction execution. This thesis studies dynamic pricing when agents reconnoitre in isolated sessions and thereby weaken the \emph{Cost of Information} (COI), the premium platforms typically extract once demand signals are expressed.
+
+We formalize the phenomenon and prove a Cost of Information theorem: as independent, utility-maximizing agents saturate price queries, the platform's sustainable COI goes to zero, so ordinary dynamic pricing is incentive-incompatible in the limit.
+
+The defensive design combines behavioral signals with distributionally robust optimization (DRO). We implement a controlled storefront on a hybrid Kappa--Lambda architecture and show that human and agent sessions induce different transition kernels. Kullback--Leibler divergence to class prototypes yields session scores that feed a distributionally robust reinforcement learning (DR-RL) policy. The policy is posed as a Stackelberg game with a Wasserstein ambiguity set over demand, so the learner does not collapse to a single empirical demand curve under shifting contamination.
+
+Factorial training on TPUs shows the expected short-run revenue loss from contamination, and shows that the robust objective recovers COI and equilibrium structure in harder regimes (higher contamination, larger catalogs), while an integrated UX penalty curbs supra-competitive pricing. Code and an interaction dataset are released to support work on agent-mediated traffic.
 \end{abstract}

 \noindent\textbf{Keywords:} Dynamic Pricing, LLM Agents, Adversarial Machine Learning, E-commerce, Behavioral Detection, Reinforcement Learning

-\vspace{1em}
-\noindent\textbf{Acknowledgments:} This research was supported by the TPU Research Cloud program, which provided access to Google Cloud TPU accelerators (including TPU v4, v5e, and v6e).
-
 \vspace{0.5em}
 \noindent\textbf{Project page:} \url{https://velocitatem.github.io/PHANTOM/}

@@ -37,6 +41,8 @@ With accelerated growth of Large Language Model agents in e-commerce, a novel ad
 \input{chapters/05-discussion}
 \input{chapters/06-conclusion}

+\input{chapters/acknowledgements}
+
 \printbibliography

 \clearpage