Paper first fillout (#39)

* initial environemnt definitions

* high level defintion

* formlating the reward simply

* improved implementation

* tailored docker compose image for secondary tenaordboard

* preliminary desriptions and babble

* details on formulation and defintion of agent and its loop

* typos one

* more grammar issues

* fluidity improvements and refactors

* more decluttering and dnoising

* finalizing introduction review

* some methodology

* somehow this disappeared

* bit more of this and that

* methodology of how we do architectuer and online DP

* fix: compilation

* expanding on the taxonomy and economic references

* authoer notes

* acks + google GCP

* making space w new format nada lit review

* stronger lit review and more sources

* forgot about tables and graphs

* dedupe citations

* adding cloudflare

* fixing env vars

* updating docs with url

* upating embed

* fixing the url

* paper badge

* formaliztaion of rewards and adding definitions

* noisy formulations

* connecting some more dots here

* adding significant weight in prices

* fixing error

* fixing typos and consistency

* extra math formulations and refferenceot DRO

* fixing diagram of loops

* github mindmap

* fixing erro and thiknig about big picture

* enhancing the website

* goals methodology and gitignore

* some more references and theory links

* talking about some wtp

* feature: added wordcounter

* forcing latex builds and fixining the bib #

* refactor: update Cost of Information equations and notation for clarity

* some more math and refactors

* refactor: unify notation and improve clarity in COI equations

* refactor: generalize master function for demand estimation and pricing strategies

* we dont like math but we have to do it :(

* refactor: enhance Cost of Information framework with additional context and illustration

* refactor: enhance literature review and methodology sections with economic theory insights and system architecture details

* alining format to fit the rubric

* refactoring bibliography

* fix: align

* mdp additionally

* trying different title

* adding balance figure

* agentic givergence, finally

* fix: figure fonts adjusted to match
This commit is contained in:
Daniel Alves Rösel
2026-01-13 17:07:29 +01:00
committed by GitHub
parent 221e71a503
commit a9d73ccce5
24 changed files with 1656 additions and 107 deletions

View File

@@ -1,15 +1,44 @@
\section{Literature Review}
\subsection{Foundational Concepts}
To better understand all wedges of the work, we must start by exploring the nature of agents and agentic computer use and web automation, complementing that with economic reasoning and strategic interaction. The final surface to cover, leads us to data-driven dynamic pricing under uncertainty. The key technical risk is not ``agents buying things'' per se, but agents shaping the behavioral and demand signals that downstream pricing systems consume and depend on. The introduction of these mediating actor entities into economic systems, is further creating a threat of false-name bidding \cite{yokoo_effect_2004}, which prior research has explored in a trading context. Other research on pseudonyms in dynamic systems, demonstrate whitewashing in AI agents which can ignore defensive mechanisms by re-entry with different identities \cite{feldman_free-riding_2004}. Dynamic pricing assumes demand proxies are behaviorally meaningful, while bot detection aims at security and access control. The missing bridge is a principled framework for separating non-human reconnaissance from genuine human demand expression and integrating that separation into pricing heuristics without degrading legitimate user experience (in our research tracked by the user-experience index). This gap, is what our contribution aims to address, particularly for the aforementioned stakeholder groups.
\subsection{Agent Taxonomy and Definitions}
An agent in the context of artificial intelligence is generally defined by anything that can reason and act upon observations of its environments (collected through some sensory inputs) and carry out actions through effectors. Moreover, a rational agent is an entity that is capable of perceiving the world around them and taking actions to advance specified goals. This definition by \cite{russell_artificial_nodate} is further developed in an economic context by \cite{parkes_economic_2015}, suggesting AI research attempts to construct a synthetic \textit{homo economicus}, which may also be termed \textit{machina economicus}.
A specific class or taxon of this \textit{machina economicus}, the Large Language Model (LLM) agent, is defined as an autonomous system capable of achieving goals and adapting post-training, often without needing explicit code or fundamental model changes. \cite{xia_evaluation-driven_2025}
We must however acknowledge the current SOTA as presented by OSWORLD simulations in \cite{xie_osworld_nodate} have demonstrated that multi-modal tasks across desktop and web interaction modes, have a top-performing score of only 12.24\% success, whereas humans have a higher 72\% success rate. This weakness matters for this research because it clarifies the near-term threat model: practical exploitation does not require a fully competent ``computer assistant'', only enough automation to perform high-volume reconnaissance actions (search/filter/open product pages, probe availability/price boundaries) that can contaminate behavioral signals. With the expected growth of these capabilities, this threat only becomes more perilous to revenue management systems.
We model an agent session as producing some events with lower in-session conversion levels relative to humans, this we state in our assumption that $P(\text{purchase} \vert A) \ll P(\text{purchase} \vert H)$ but with a potentially higher volatility in $\hat{q}$, which we observe through the look-to-book metrics in our simulation.
\subsection{Economic Agents: From Homo Economicus to Machina Economicus}
Existing behavioral economic models tend to be criticized for the assumption of rational behavior, as is embodied in the term of homo economicus. The definition of a machina economicus by \cite{parkes_economic_2015} is quite appropriate for our case, particularly because these assumptions of rationality have been argued to be a very adequate reference for AI research by \cite{varian_economic_1995}. For modeling this behavior, the trajectories of these agents can be formally defined to be partially observable Markov decision processes. \cite{xie_osworld_nodate} Agents are however not to be confused with web-bots which have previously been known as automated software applications or scrapers which are set with a purpose of carrying out specific tasks on the internet, without a higher level of internal judgement. \cite{imperva_rapid_2025} In our research, we refer to this actor simply as an Agent belonging to the distribution $A$.
This economic framing also helps separate two related but distinct phenomena of agents as buyers (changing market demand composition), and agents as information gatherers (changing the observed interactions used by pricing/recommendation systems). The thesis focuses on the second, where information acquisition strategically precedes purchase execution. We do not however dismiss the proposed expectation that existing economic systems serving humans, will not be populated by AIs across multiple channels and with various possibly misaligned goals as stated by \cite{parkes_economic_2015}.
What is the taxonomy and definition of an agent and an actor in this case, a bit more about interaction models in sessions and about dynamic pricing algorithms.
\subsection{Problem Evidence and Market Impact}
Documented instances of agent-driven market disruptions - Quantitative evidence of pricing manipulation - Case studies from affected industries
\subsection{Theoretical Foundations: Economic Prallels}
The statistical issue of contamination in dynamic pricing systems that observe demand features as a means to update prices has been documented in various previous contexts. The airline industry (which has accounted for 24\% of observed disruptions) has seen malicious activity with a measureable impact on skewing key performance indicators by behavior visible in the look-to-book metrics. Excessive reconnaissance traffic inflates search volume without corresponding completed bookings, thereby skewing demand forecasts and disrupting dynamic pricing models. Demand proxies have also been observed to cause significant threat to inventory management by creating artificial scarcity that distorts the demand-supply relationships in the enterprise model. Censored demand as shown in \cite{amjad_censored_2017} can also be observed in low-bias demand under-estimation caused by a distortion effect coming from non-human traffic data. \cite{imperva_rapid_2025}
When dynamic pricing algorithms operate on highly contaminated or noisy data, the risk grows significantly in creating inaccurate price inferences. The emergent mitigation driven by un-informed reward and regret signals might lead to price suppression for sales continuity which results in harming margins and resulting in a revenue loss. System that poorly fit undesired behavior might result in price gouging, which calls for strong guardrails while preserving targeted business strategy. \cite{mullapudi_reinforcement_nodate}
%Documented instances of agent-driven market disruptions - Quantitative evidence of pricing manipulation - Case studies from affected industries
\subsection{Theoretical Foundations: Economic Parallels}
Early hints of exploration of prices in a standard English auction explored in \cite{varian_economic_1995} which hints at exploration of prices in a sequential manner, which leads to a marginally different cost to the bidder than the reservation price of the seller. This is a setting in which there is no cost incured by the buyer for their actions or exploring prices in the market. They propose that any agent responsable for the pricing of a good must be imune to dynamic strategies which might extract private information from a market. A key take-away which relates to the Vickery auction mechanism (also called a \textit{direct mechanism}) suggests that not only would defenses against such exploitation be necessary, but the construction of a mechanism in which revelation of the true willingness to pay is the dominant strategy for commerce.
Like in classical revenue-maximizing auctions \cite{roughgarden_cs364a_2013} we assume that the human actor in our system has a private valuation $v$ which we formally draw from later defined distributions. The important note here is that the agent proxy does not have a mechanism to convey this private information into the demand data which directly impacts the pricing systems.
% Economic foundations: relating the problem to options pricing theory. Cost of Information (COI) concept and its relevance
% Link Coasean Singularity and other economic market theory and highlight specific information of supra competitive pricing.
Economic foundations: relating the problem to options pricing theory. Cost of Information (COI) concept and its relevance
\subsection{Landscape of Existing Work}