feat: improved discussion

2026-07-15 17:43:36 +00:00 · 2026-04-09 18:19:55 +02:00
parent 0ff0c0432c
commit 895a004807
6 changed files with 48 additions and 23 deletions
--- a/paper/src/chapters/03-methodology.tex
+++ b/paper/src/chapters/03-methodology.tex
@@ -187,7 +187,17 @@ The interface is organized as a product catalog where each product belongs to a
 Since users act with motivations, we define a pool of tasks (jobs to be done) and assign tasks randomly to participants.
 We discuss limitations and choices made in this experimental design in Section~\ref{sec:limitations_risks}.
-The task pool is stored as a structured table with fields \texttt{id}, \texttt{created\_at}, \texttt{task\_name}, \texttt{task\_description}, and \texttt{task\_def\_of\_done}. We formulate the tasks as compact jobs-to-be-done rather than as rigid instructions, because the target is to elicit realistic browsing and comparison behavior which can capture nuance of different people. In hotel mode the assigned tasks include \textit{Cheapest Room}, \textit{Cheapest Room w/ View}, \textit{MultiStep Cheapest Room}, \textit{The Digital Nomad (Executive)}, and \textit{The 3-Way Tradeoff (Desk + Quiet + Flexible)}. These prompts deliberately require critical thought in search, inspection of room details, comparison of amenities or images, return visits to the listing page, and a final booking decision which create a degree of cognitive load. In airline mode we use \textit{Last-Minute One-Way Flight} or \textit{Family/Work Emergency Travel}, where the actor must urgently travel to LAX from either SEA or JFK within the next 1 to 3 days, inspect at least a small set of candidate itineraries, and then book a reasonable earliest departure.
+The task pool is stored as a structured table with fields \texttt{id}, \texttt{created\_at}, \texttt{task\_name}, \texttt{task\_description}, and \texttt{task\_def\_of\_done}. We formulate the tasks as compact jobs-to-be-done rather than as rigid instructions, because the aim is to elicit realistic browsing and comparison behavior that captures differences between individuals. In hotel mode the assigned tasks include \textit{Cheapest Room}, \textit{Cheapest Room w/ View}, \textit{MultiStep Cheapest Room}, \textit{The Digital Nomad (Executive)}, and \textit{The 3-Way Tradeoff (Desk + Quiet + Flexible)}. These prompts deliberately require critical thought in search, inspection of room details, comparison of amenities or images, return visits to the listing page, and a final booking decision, which together create a degree of cognitive load. In airline mode we use \textit{Last-Minute One-Way Flight} or \textit{Family/Work Emergency Travel}, where the actor must urgently travel to LAX from either SEA or JFK within the next 1 to 3 days, inspect at least a small set of candidate itineraries, and then book the earliest reasonable departure. Figure~\ref{fig:exp_design_tree} summarizes the assignment tree.
 \begin{figure}[ht]
  \centering
  \resizebox{0.88\columnwidth}{!}{%
    \input{chapters/figures/experiment_design_tree.tex}
  }
  \caption{Experimental design decision tree for participant assignment.}
  \label{fig:exp_design_tree}
 \end{figure}
 A representative task is to find the cheapest feasible catalog item under explicit constraints; we remove strict financial limits to avoid trivial optimization behavior. Participants are also randomly assigned to one experimental platform mode (hotel or airline). Once assigned, they enter the experiment with an actor ID. Under each experiment ID, we can observe multiple sessions across time and gather long interaction traces for the same actor. This mitigates the small number of individuals in our sample, because each one contributes broad interaction data.
 The human data collection involved 13 participants, all of whom provided explicit informed consent prior to their session. Participants had an average age of 21 years and were recruited from a university population. Alongside the 13 human sessions we ran 16 agent sessions of equivalent task scope, yielding 29 labeled trajectories in total (45\% human, 55\% agent). Each participant was assigned a single platform mode and a single task drawn from the pool, and completed the session independently without guidance on navigation or pricing strategy.
--- a/paper/src/chapters/05-discussion.tex
+++ b/paper/src/chapters/05-discussion.tex
@@ -1,31 +1,25 @@
 \section{Discussion}
 % TODO: GDPR here
 \subsection{Transition to Agentic Market Microstructure}
-Our analysis of interaction dynamics between the platform and non-human actors suggests that static posted-price models are a weak match for an economy in which software agents mediate search and purchase. If one pushes toward direct-revelation or auction-like pricing, volatility rises: prices behave more like traded claims than like sticky retail quotes, though without the fungibility of securities.
+Our analysis of the interaction dynamics between the platform and non-human actors suggests that the current static pricing models are insufficient for an agent-mediated economy. If we assume a transition toward a direct revelation mechanism, where actors must reveal their true valuation of a good through bidding dynamics, we inevitably introduce significant stochasticity into the pricing system. Unlike traditional e-commerce, where prices are relatively sticky, such a mechanism implies the high volatility characteristic of financial equity markets, although without their fungibility.
-E-commerce goods differ from financial assets in a hard way: unit economics and reservation values set a floor. The market might ``want'' an iPhone at \$1, however that is not permissible. Pricing therefore needs an anchor $P_{0}$ (cost plus target margin) around which offers may move. In that setting, large language model (LLM) agents resemble institutional liquidity providers: they quote, probe, and clear subsets of flow. As autonomy of agentic systems increases, end users may delegate browsing and checkout to assistants rather than to retailer sites directly, which shifts where demand signals originate. The scenario presumes agents eventually hold delegated payment authority; until then, our results bound a near-term reconnaissance-heavy regime.
+However, e-commerce commodities differ fundamentally from financial securities: they possess a hard floor defined by unit economics and reservation prices. The market might react enthusiastically to an iPhone priced at \$1, but such a transaction is not permissible. The platform must establish an initial valuation anchor ($P_0$), defined as the marginal cost plus a target margin, around which the market price is permitted to fluctuate.
 We propose introducing GenAI agents as institutional market makers. As the race toward greater autonomy of agentic systems intensifies, AI agents may become commercially viable enough that everyday users interact with them directly rather than with e-commerce platforms. This scenario also assumes that AI agents are eventually given transactional capabilities.
 \subsection{Risk Assessment and Limitations}
 \label{sec:limitations_risks}
-Behavior-based pricing raises predictable ethics questions when models are opaque: a behavioral profile can become a basis for price discrimination or exclusion if deployed without governance. Universal behavioral profile modeling (UBPM) in recommendation already shows how fine-grained traces enable strong personalization; the same machinery applied to prices needs guardrails.
+Behavior-based pricing raises predictable ethics questions when models are opaque: a behavioral profile can become a basis for price discrimination or exclusion if deployed without governance. Universal behavioral profile modeling (UBPM) in recommendation already shows how fine-grained traces enable strong personalization. The same machinery applied to prices needs guardrails.
 In our experiments participants are randomized to platform mode and task. Figure~\ref{fig:exp_design_tree} summarizes the assignment tree.
-\begin{figure}[ht]
+We balance human and agent sessions at close to one-to-one so that cohorts are comparable despite different population sizes. Although the number of sessions is small, the row-level dataset contains thousands of events.
  \centering
  \resizebox{0.92\columnwidth}{!}{%
    \input{chapters/figures/experiment_design_tree.tex}
  }
  \caption{Experimental design decision tree for participant assignment.}
  \label{fig:exp_design_tree}
 \end{figure}
-The human sample is small but each session is long-form; we balance human and agent sessions one-to-one so cohorts are comparable despite different population sizes. The row-level dataset still contains thousands of events.
+% Rapid change in agent capabilities and user expectations induces model drift; the UX term in reward shaping was included partly to penalize policies that sacrifice legitimate users for short-run revenue. Reinforcement learning adds its own risks---reward hacking and limited interpretability---which matter when policies touch live revenue; deployment would require monitoring and constraints beyond what we exercised here.
-
+With the rapid growth in agent capabilities as well as user expectations, a degree of model drift is expected in this setting. The persistent speed of the market necessitates the computational cost of continuous margin extraction demonstrated in our work. Reinforcement learning that sacrifices legitimate user experience for short-run revenue does not hold up in the long run. Reward hacking, to which pricing algorithms are not impervious given their limited interpretability, is a significant risk for a company when live revenue is in play. Deployment requires consistent monitoring and constraints beyond what was exercised in this work.
 Rapid change in agent capabilities and user expectations induces model drift; the UX term in reward shaping was included partly to penalize policies that sacrifice legitimate users for short-run revenue. Reinforcement learning adds its own risks---reward hacking and limited interpretability---which matter when policies touch live revenue; deployment would require monitoring and constraints beyond what we exercised here.
 % \subsection{Implications of Findings} Interpretation of results and alternative scenarios with broader market implications.
--- a/paper/src/chapters/figures/experiment_design_tree.tex
+++ b/paper/src/chapters/figures/experiment_design_tree.tex
@@ -1,9 +1,28 @@
 % Horizontal tree: level distance must exceed ~half parent + half child width or nodes overlap (resizebox does not fix that).
 \begin{tikzpicture}[
-  level distance=14mm,
+  grow=right,
-  sibling distance=36mm,
+  level distance=30mm,
-  decision/.style={rectangle, draw, rounded corners=2pt, align=center, minimum width=26mm, minimum height=8mm, font=\small},
+  sibling distance=23mm,
-  leaf/.style={rectangle, draw, align=center, minimum width=30mm, minimum height=8mm, font=\small},
+  decision/.style={
-  edge from parent/.style={draw, -{Latex[length=2mm]}}
+    rectangle,
    draw,
    rounded corners=1.5pt,
    align=center,
    inner sep=1.2pt,
    minimum width=14mm,
    minimum height=4.8mm,
    font=\scriptsize,
  },
  leaf/.style={
    rectangle,
    draw,
    align=center,
    inner sep=1.2pt,
    text width=19mm,
    minimum height=4mm,
    font=\scriptsize,
  },
  edge from parent/.style={draw, -{Latex[length=1.2mm]}},
 ]
 \node[decision] {Participant}
  child {
--- a/paper/src/chapters/mdp_agent.pdf
+++ b/paper/src/chapters/mdp_agent.pdf
--- a/paper/src/chapters/mdp_human.pdf
+++ b/paper/src/chapters/mdp_human.pdf
--- a/paper/src/mirrors/genpop/05-discussion.tex
+++ b/paper/src/mirrors/genpop/05-discussion.tex
@@ -2,9 +2,11 @@
 \subsection{Transition to Agentic Market Microstructure}
-Our analysis of the interaction dynamics between the platform and non-human actors suggests that the current static pricing models are insufficient for an agent-mediated economy. If we assume a transition toward a direct revelation mechanism, where actors must reveal their true valuation of a good through bidding dynamics, we inevitably introduce significant stochasticity into the pricing system. Unlike traditional e-commerce where prices are relatively sticky, such a mechanism implies a high volatility characteristic of financial equity markets (without the fungability however).
+Our analysis of the interaction dynamics between the platform and non-human actors suggests that the current static pricing models are insufficient for an agent-mediated economy. If we assume a transition toward a direct revelation mechanism, where actors must reveal their true valuation of a good through bidding dynamics, we inevitably introduce significant stochasticity into the pricing system. Unlike traditional e-commerce, where prices are relatively sticky, such a mechanism implies the high volatility characteristic of financial equity markets, although without their fungibility.
-However, ecommerce commodities differ fundamentally from financial securities: they possess a hard floor defined by unit economics and reservation prices. The market might react enthusiastically to an iPhone priced at \$1, such a transaction is not permissible. The platform must establish an initial valuation anchor defined by the marginal cost plus a target margin, around which the market price is permitted to fluctuate. We float the introduction of GenAI Agents as Institutional Market Makers. As the arms race for greater autonomy of agnetic systems grows, the commercial viability of AI agents has the potential to disseminate into every-day users directly interacting with them rather than e-commerce platforms. This is also under the assumption of expected transactional capabilities being given to AI Agents.
+However, e-commerce commodities differ fundamentally from financial securities: they possess a hard floor defined by unit economics and reservation prices. The market might react enthusiastically to an iPhone priced at \$1, but such a transaction is not permissible. The platform must establish an initial valuation anchor ($P_0$), defined as the marginal cost plus a target margin, around which the market price is permitted to fluctuate.
 We propose introducing GenAI agents as institutional market makers. As the race toward greater autonomy of agentic systems intensifies, AI agents may become commercially viable enough that everyday users interact with them directly rather than with e-commerce platforms. This scenario also assumes that AI agents are eventually given transactional capabilities.
 \subsection{Risk Assessment and Limitations}