mirror of
https://github.com/velocitatem/PHANTOM.git
synced 2026-06-01 00:53:36 +00:00
fix: typos and flow
@@ -21,6 +21,6 @@ Behavior-based pricing raises predictable ethics questions when models are opaqu
We balance human and agent sessions near one-to-one so cohorts are comparable despite different population sizes. The row-level dataset still contains thousands of events.
% Rapid change in agent capabilities and user expectations induces model drift; the UX term in reward shaping was included partly to penalize policies that sacrifice legitimate users for short-run revenue. Reinforcement learning adds its own risks---reward hacking and limited interpretability---which matter when policies touch live revenue; deployment would require monitoring and constraints beyond what we exercised here.
-With the exponential growth in capability of agents aswell as user expectations, a degree of model drift is expected in this setting. The computational requirements for continuous extraction of margin as demonstrated by our work are required by the persistent speed of the market. Reinforcement learning that sacrifices legitimate user experience for short run revenue does not hold up in the long run. Reward hacking, to which pricing algorithms are not impervious due to their limited interpretability is a significant risk for a company if live revenue is in play. Deployment requires consistent monitoring and constraints beyond what was done as exercise in this work.
+With the exponential growth in capability of agents aswell as user expectations, a degree of model drift is expected in this setting. The computational requirements for continuous extraction of margin as demonstrated by our work are required by the persistent speed of the market. Reinforcement learning that sacrifices legitimate user experience for short run revenue does not hold up in the long run. Reward hacking, to which pricing algorithms are not impervious due to their limited interpretability, is a significant risk for a company if live revenue is in play. Deployment requires consistent monitoring and constraints beyond what was done as an exercise in this work.
% \subsection{Implications of Findings} Interpretation of results and altenrative scenarios with broader market implications.