Paper lit review (#45)

* chore: updating apa citation and fixing citation in-text and parent
* fixing in lit review
* adjusting citations and improving schema
* chore: fixed formatting and adjusting other components
* refined abstract
* one page fitting
* constrainative proposals
* fix: syntax of transition probs
* refined lit review and sources
* research Objectives
* adding logo graphics
* chore: fixing citation completeness
* updating with newly built algorithm
* lit review document setup
2026-07-16 01:53:37 +00:00 · 2026-01-26 13:04:32 +01:00
parent a9d73ccce5
commit b5f19e04b7
9 changed files with 375 additions and 77 deletions
--- a/paper/src/main.tex
+++ b/paper/src/main.tex
@@ -1,41 +1,41 @@
 % -*- TeX-master: t -*-
 \documentclass[12pt,letterpaper]{article}

-\pagestyle{plain}
-
 \input{preamble}

 \begin{document}

-\title{Adversarially Distributionally Robust Optimization and Reinforcement Learning for Informed Dynamic Pricing under Strategic Demand Contamination}
-
-\author{
-  Daniel Rösel\thanks{Primary author and student researcher. Email: daniel@alves.world} \\
-  IE University, Madrid, Spain \\[1em]
-  Alberto Martín Izquierdo\thanks{Thesis advisor. Email: amartini@faculty.ie.edu} \\
-  IE University, Madrid, Spain
-}
-
-\date{\today}
-
-\maketitle
+\begin{titlepage}
+    \centering
+    \includegraphics[width=0.3\textwidth]{graphics/SST.png}\\[1cm]
+    \LARGE\textbf{PHANTOM: Pricing Heuristics Against Non-human Transaction Orchestration Mechanisms}\\[0.5cm]
+    \Large\textbf{Daniel Rösel}\\
+    \large\textit{Bachelor of Computer Science \& Artificial Intelligence}\\[0.5cm]
+    \Large\textit{Supervised by:}\\
+    \Large\textbf{Alberto Martín Izquierdo}\\
+    \large\textit{IE University, Madrid, Spain}\\[1cm]
+    \large\today
+\end{titlepage}

 \begin{abstract}
-The primary objective of this thesis is to develop and validate pricing heuristics that protect e-commerce platforms from systematic exploitation by Large Language Model (LLM) agents within dynamic pricing environments. As AI agents increasingly mediate consumer transactions, they enable users to circumvent the Cost of Information (the price premium accumulated through demand signal expression) by conducting reconnaissance in isolated sessions before executing purchases through clean sessions at base prices. This research will make an anticipatory contribution by adapting recommendation system methodologies to distinguish between genuine human browsing behavior and agent-orchestrated information gathering, thereby enabling pricing systems to maintain margin integrity without degrading the user experience for legitimate customers or getting rid of leads generated by LLMs.
+With the accelerated growth of Large Language Model (LLM) agents in e-commerce, a novel adversarial dynamic emerges in digital markets. This paper addresses the vulnerability of dynamic pricing systems to AI intermediaries that decouple the information-gathering stage from transaction execution. By conducting reconnaissance in isolated sessions, agents circumvent the ``Cost of Information'' (COI), defined as the price premium typically accumulated through demand expression estimators.
+We formally define this phenomenon and derive the Cost of Information Theorem, proving that as the saturation of independent, utility-maximizing agents increases, the platform’s ability to sustain a COI converges to zero, rendering standard dynamic pricing mechanisms incentive-incompatible.
+In response to this threat, we propose a defensive framework that integrates behavioral economics with Adversarially Distributionally Robust Optimization (DRO). We introduce a custom e-commerce research platform built on a hybrid Kappa-Lambda architecture, designed to capture and simulate high-fidelity, controlled interaction trajectories. We further demonstrate through modeling that human and agent behaviors exhibit distinct transition probability kernels, enabling the construction of discriminative models based on Kullback-Leibler divergence.
+These behavioral signals serve as inputs for a Distributionally Robust Reinforcement Learning (DR-RL) agent. We formulate the pricing problem as a Stackelberg game where the learner optimizes against an ambiguity set of demand distributions defined by the Wasserstein distance. This approach allows the pricing policy to remain robust against non-stationary contamination without overfitting to deterministic demand curves. The research validates a mechanism for preserving margin integrity and market equilibrium in an agent-mediated economy, while minimizing degradation to the legitimate human user experience (UX).
 \end{abstract}

+\noindent\textbf{Keywords:} Dynamic Pricing, LLM Agents, Adversarial Machine Learning, E-commerce, Behavioral Detection, Reinforcement Learning

+\vspace{1em}
+\noindent\textbf{Acknowledgments:} Eugene Bykovets, PhD - ETH for helping with problem formulation. This research was supported by the TPU Research Cloud program.
+
+\clearpage
 \input{chapters/01-intro}
 \input{chapters/02-literature-review}
-\input{chapters/03-methodology}
-\input{chapters/04-results}
-\input{chapters/05-discussion}
-\input{chapters/06-conclusion}
-
-
-\section*{Acknowledgments}
-Eugene Bykovets, PhD - ETH for helping with problem formulation.
-Research supported with Cloud TPUs from Google's TPU Research Cloud (TRC).
+% \input{chapters/03-methodology}
+% \input{chapters/04-results}
+% \input{chapters/05-discussion}
+% \input{chapters/06-conclusion}

 \printbibliography

@@ -46,6 +46,6 @@ Research supported with Cloud TPUs from Google's TPU Research Cloud (TRC).
 \item[Agent $A$] An actor of non-human nature, powered by an LLM.
 \item[Human $H$] An individual human with some job to be done.
 \end{description}
-\input{../build/concatenated_code}
+% \input{../build/concatenated_code}

 \end{document}