mirror of
https://github.com/velocitatem/PHANTOM.git
synced 2026-05-31 08:33:36 +00:00
fixing typos and inconsistencies
@@ -83,7 +83,7 @@ In order for our research to have grounding in interactions we built a robust e-
The architecture of this platform begins with the deployed web-apps posting interaction data to our backend which processes them and stores each ingested interaction into a kafka cluster. This serves as our data reservoir tracking and associating each interaction with its session and importantly with which experiment it belongs to. Not only do we track the behavioral interactions, but our pricing provider micro-service, once called by the frontend reports the observed/queried price-product into kafka. This kafka cluster is subscribed to by our pipeline which is configured on a schedule in Airflow, with the possibility of manual trigger. The final stage of the pricing pipeline, submits computed dynamic pricing results into a redis database for quick updates which is then read by the pricing provider and displayed on the webapp. This is a very generic end-to-end mechanism which is applicable to a variety of different e-commerce tasks. We intentionally put emphasis on the development of this infrastructure to establish a reproducible framework for interaction and to minimize any noise.
-\paragraph{Public Web Artifact} We transition the Kappa like architecture of the data collection to a Lambda architecture for actual learning in a surrogate environment. This allows us to move faster on data which is provided and helps us create a feedback loop for production deployment. To support further research in this intersection of fields we release P4P \footnote{\url{https://github.com/velocitatem/p4p}} as a public repository providing the interaction layer of the PHANTOM framework. This provides a configurable storefront which can be tailored to any commercial setting with a standardized session-level event tracking. We document the API adapters or what the framework expects in terms of schemas for pricing providers and log ingestion servicse. The repository is intended for controlled experimentation and method replication rather than production commerce deployment.
+\paragraph{Public Web Artifact} We transition the Kappa-like architecture of the data collection to a Lambda architecture for actual learning in a surrogate environment. This allows us to move faster on data which is provided and helps us create a feedback loop for production deployment. To support further research in this intersection of fields we release P4P \footnote{\url{https://github.com/velocitatem/p4p}} as a public repository providing the interaction layer of the PHANTOM framework. This provides a configurable storefront which can be tailored to any commercial setting with a standardized session-level event tracking. We document the API adapters or what the framework expects in terms of schemas for pricing providers and log ingestion servicse. The repository is intended for controlled experimentation and method replication rather than production commerce deployment.
\subsubsection{DevOps Principles}
@@ -39,7 +39,7 @@ In this paper we present an exploration and defense against the presence of new
We formally define interaction data as coming from some actor which can either be an agent ($A$) or human ($H$).
Dynamic pricing algorithms rely on directly translating demand features $q$ to new price assignments $\hat{p}$ across a catalogue of products of size $N$.
This opens opportunities to design a \textit{tabula rasa} of digital market mechanisms that will shape the future of commerce in the age of artificial intelligence.
-We propose a robust optimization objective defined in our methodology, transforming the pricing problem into a form of Distributionally Robust Optimization \parencite{kuhn_distributionally_2025} where the learner must guard against adversarial contamination in observed demand distributors.
+We propose a robust optimization objective defined in our methodology, transforming the pricing problem into a form of Distributionally Robust Optimization \parencite{kuhn_distributionally_2025} where the learner must guard against adversarial contamination in observed demand distributions.
For purposes of this research, an agent is an algorithmic loop with the ability to access a web platform and perform actions such as clicks, scrolls, and input field fills.
\vspace{0.5em}
@@ -63,7 +63,7 @@ We intentionally put emphasis on the development of this infrastructure to estab
In addition to behavioral events, the platform logs price observations to a separate Kafka topic.
Each price query generates a record $(i, p, \text{sid}, \phi, t)$ associating the product, displayed price, requesting session, platform mode, and timestamp.
This dual-stream architecture enables joint analysis of price exposure and behavioral response.
-We transition the Kappa like architecture of the data collection to a Lambda architecture for actual learning in a surrogate environment.
+We transition the Kappa-like architecture of the data collection to a Lambda architecture for actual learning in a surrogate environment.
This allows us to move faster on data which is provided and helps us create a feedback loop for production deployment.
Operationally, goals and experiment runs are tracked in PostgreSQL (goal table, run table, and assignment mapping).
This data-acquisition phase is the first half of the methodology and is intentionally a disconnected component that feeds the later contributions.
@@ -83,7 +83,7 @@ We utilize the Wasserstein distance metric to define the set of plausible demand
The robust policy $\pi^*$ is obtained by solving the maximin problem $\pi^* = \arg \max_{\pi} \min_{Q \in \mathcal{U}_\epsilon} \mathbb{E}_{d \sim Q} \left[ R(p, d) - \lambda \cdot \text{COI}_{\text{leak}}(p,\tau') - \eta_{\text{ux}} \cdot \text{UX}(\tau', p) \right]$ where $R(p, d)$ is the revenue function, $\lambda$ weighs the information-leakage penalty, and $\eta_{\text{ux}}$ weighs the UX term.
In practice, we parameterize this with a session-level leakage term $\text{COI}_{\text{leak}}(p,\tau') = f(\tau')\cdot \text{InfoValue}(p,\tau')$ where $f(\tau')$ is the weak agent probability.
As part of reward engineering, we keep a UX factor ($UX\in[0,1]$) as an auxiliary evaluation axis.
-Our training budget is provisioned through TPU Research Cloud and spans 384 chips across TPU v4, v5e, and v6e generations, with a spot-heavy allocation plus an on-demand reserve.
+Our training budget is provisioned through TPU Research Cloud and spans 320 chips across TPU v4, v5e, and v6e generations, with a spot-heavy allocation plus an on-demand reserve.
At peak BF16 throughput this corresponds to approximately $160$\,PFLOPS of aggregate compute.
\vspace{0.5em}