fix: typos and flow

This commit is contained in:
2026-04-09 20:25:37 +02:00
parent 3eea137f49
commit a3dc5125df
2 changed files with 4 additions and 4 deletions

View File

@@ -21,6 +21,6 @@ Behavior-based pricing raises predictable ethics questions when models are opaqu
We balance human and agent sessions near one-to-one so cohorts are comparable despite different population sizes. The row-level dataset still contains thousands of events.
% Rapid change in agent capabilities and user expectations induces model drift; the UX term in reward shaping was included partly to penalize policies that sacrifice legitimate users for short-run revenue. Reinforcement learning adds its own risks---reward hacking and limited interpretability---which matter when policies touch live revenue; deployment would require monitoring and constraints beyond what we exercised here.
With the exponential growth in capability of agents aswell as user expectations, a degree of model drift is expected in this setting. The computational requirements for continuous extraction of margin as demonstrated by our work are required by the persistent speed of the market. Reinforcement learning that sacrifices legitimate user experience for short run revenue does not hold up in the long run. Reward hacking, to which pricing algorithms are not impervious due to their limited interpretability is a significant risk for a company if live revenue is in play. Deployment requires consistent monitoring and constraints beyond what was done as exercise in this work.
With the exponential growth in capability of agents aswell as user expectations, a degree of model drift is expected in this setting. The computational requirements for continuous extraction of margin as demonstrated by our work are required by the persistent speed of the market. Reinforcement learning that sacrifices legitimate user experience for short run revenue does not hold up in the long run. Reward hacking, to which pricing algorithms are not impervious due to their limited interpretability, is a significant risk for a company if live revenue is in play. Deployment requires consistent monitoring and constraints beyond what was done as an exercise in this work.
% \subsection{Implications of Findings} Interpretation of results and altenrative scenarios with broader market implications.