mirror of
https://github.com/velocitatem/PHANTOM.git
synced 2026-05-31 16:43:36 +00:00
chore: refactoring, proper citation, and updates to data, refs, and appendices
@@ -46,15 +46,44 @@ These behavioral signals serve as inputs for a Distributionally Robust Reinforce
\appendix
\section{Terminology}
\begin{description}
\item[Agent $A$] An actor of non-human nature, powered by an LLM.
\item[Human $H$] An individual human with some job to be done.
\item[Actor $\theta$] Defines a type of class which is either Agent or Human and has the capability to carry out actions on a web platform.
\item[Platform] Any web-based platform which serves an interface to a collection of items that can be purchased, each at some price $p_i$.
\item[Behavioral Model] A mathematical model predicting what action comes after a series of prior actions.
\item[LLM] Large Language Model served by some provider with the abstracted capability of tool calling.
\item[TPU] Tensor Processing Unit which is a unique kind of chip architecture developed by Google.
\item[Trajectory] Defined as a series of unspecified length, collecting data on states of some object over time.
% TODO: maybe define other terms in a similarly succinct manner
\item[Agent $A$] A non-human actor, typically an LLM-driven system that executes web actions toward a goal.
\item[Human $H$] A human participant interacting with the platform to complete a task.
\item[Actor Type $\theta$] A latent class parameter describing whether a session is generated by a human or an agent profile.
\item[Platform] A web interface exposing purchasable items and their offered prices.
\item[Session $s$] A bounded interaction record tied to one actor and one session identifier.
\item[Event $e_{s,k}$] A single interaction tuple in a session, including action, item target, and timestamp.
\item[Trajectory $\tau_s$] The ordered sequence of events generated within a session.
\item[Demand Proxy $\hat{q}_{t,i}$] A weighted aggregate of observed actions used as an operational substitute for latent demand.
\item[Action Weight Function $\omega(a)$] A mapping from action type to signal strength in the demand proxy.
\item[True Demand $d(p;\theta)$] The latent purchase response as a function of price and actor type.
\item[Contamination $\alpha$] The proportion of agent-generated traffic in the session mixture.
\item[Non-stationary Noise $\epsilon_t$] Time-varying residual variation not explained by the actor mixture.
\item[Pricing Policy $\pi(\tau)$] A function mapping observed interaction history to an offered price.
\item[Cost of Information (COI)] The expected premium above the minimum viable price induced by the pricing policy.
\item[COI Leakage] A per-quote penalty term modeling information revealed to reconnaissance behavior.
\item[First-Order Statistic $p_{(1)}$] The minimum observed price among multiple independent queries.
\item[Transition Kernel $\mathcal{T}$] A Markov transition matrix over behavioral states or actions.
\item[Separability] The degree to which human and agent sessions can be distinguished from behavior alone.
\item[KL Divergence $D_{KL}$] A relative-entropy measure used to compare session transition structure against class prototypes.
\item[Divergence Scores $\Delta_H,\Delta_A$] Session-level distances to human and agent transition centroids.
\item[Weak Agent Probability $f(\tau)$] A session-level score estimating the likelihood that a trajectory is agent-generated.
\item[Contamination Generator $\mathcal{G}(\alpha)$] A simulator component that injects synthetic agent trajectories to reach a target mixture level.
\item[Stackelberg Game] A leader-follower formulation where the platform sets prices and demand responds.
\item[Ambiguity Set $\mathcal{U}_{\epsilon}$] A set of plausible demand distributions considered under distributional uncertainty.
\item[Wasserstein Ball] A distance-bounded neighborhood around an empirical distribution used in robust optimization.
\item[DR-RL] Distributionally Robust Reinforcement Learning for policies trained against worst-case distributional shifts.
\item[Nominal Contamination $\alpha_0$] The baseline contamination level around which robust candidates are evaluated.
\item[Robustness Radius $\epsilon_\alpha$] The local interval width used for inner minimization over contamination scenarios.
\item[Query-Tax Surrogate] A constant leakage proxy assigning fixed penalty to suspected reconnaissance queries.
\item[Revelation Surrogate] A leakage proxy based on $-\log\pi(p\mid\tau)$ to penalize highly informative quotes.
\item[Limbo Stack] The alternating game-history buffer that stores leader price moves and follower demand responses.
\item[UX Index] A bounded user-experience metric tracked to evaluate policy side effects on legitimate users.
\item[Look-to-Book Ratio] The ratio of search-like interactions to completed purchases, used as an operational contamination indicator.
\item[Hybrid Kappa-Lambda Architecture] A data design combining streaming ingestion with offline and batch learning loops.
\item[MDP / POMDP] Sequential decision models with full observability (MDP) or partial observability (POMDP).
\item[Behavioral Model] A model predicting what action is likely to follow from prior actions.
\item[LLM] Large Language Model served through an inference provider with tool-use capability.
\item[TPU] Tensor Processing Unit, a specialized accelerator architecture developed by Google.
\end{description}
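
The displays below are a hedged sketch of how a few of the terms above could be written out; they are not taken from the main text. The first-order statistic and the KL divergence follow their standard definitions. The additive form of the demand proxy and the linear mixture of actor-type demands are assumptions made here for illustration only. The symbols $\mathcal{E}_{t,i}$ (events targeting item $i$ in period $t$), $a(e)$ (action type of event $e$), $d_{\text{mix}}$ (mixture demand), $m$ (number of independent queries), and $P$, $Q$ (two discrete distributions) are introduced only for this sketch.
\[
\hat{q}_{t,i} = \sum_{e \in \mathcal{E}_{t,i}} \omega\bigl(a(e)\bigr), \qquad d_{\text{mix}}(p) = (1-\alpha)\,d(p;H) + \alpha\,d(p;A),
\]
\[
p_{(1)} = \min_{1 \le j \le m} p_j, \qquad D_{KL}(P\,\|\,Q) = \sum_{x} P(x)\,\log\frac{P(x)}{Q(x)}.
\]
Under the assumed mixture, $\alpha = 0$ recovers purely human demand and $\alpha = 1$ purely agent demand.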

\section{Aggregate Compute Budget Derivation}
@@ -81,109 +110,19 @@ v4 & 64 & 275 & $64 \times 275 = 17{,}600$ \\
Converting to petaFLOPS: $160{,}320\;\text{TFLOPS} = 160.32\;\text{PFLOPS} \approx 160\;\text{PFLOPS}$. This is the theoretical peak under sustained BF16 arithmetic; realized throughput depends on memory bandwidth utilization and inter-chip communication overhead, but the figure serves as a useful upper bound for provisioning decisions.
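
As a hedged illustration of the gap between peak and realized throughput, let $u \in (0,1]$ denote a utilization factor. Both the symbol $u$ and the value $u = 0.4$ are hypothetical, chosen only to show the arithmetic; they are not measurements from this project.
\[
\text{PFLOPS}_{\text{realized}} = u \times 160.32\;\text{PFLOPS}, \qquad u = 0.4 \;\Rightarrow\; 0.4 \times 160.32 = 64.128\;\text{PFLOPS}.
\]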

\section{Full Slope-Test Derivation: Revenue vs. Contamination}
\section{Slope-Test Verification: Revenue vs. Contamination}
\label{app:alpha_revenue_slope}

This appendix gives the full ordinary least squares computation for the linear effect of contamination on mean revenue. Let
This appendix provides a compact verification of the slope result reported in the main results section. Using the same run-level pairs $x_i=\texttt{study/alpha}_i$ and $y_i=\texttt{eval/revenue\_mean}_i$ ($n=95$), we re-checked the ordinary least squares slope test in Python with standard test routines (SciPy two-sided $t$ test for the slope).

\[
x_i = \texttt{study/alpha}_i, \qquad y_i = \texttt{eval/revenue\_mean}_i,
\widehat{y}=326{,}878.57-60{,}631.95\,x,
\]
and fit
\[
y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \qquad i=1,\dots,n.
\]
The slope test is
\[
H_0: \beta_1 = 0 \qquad \text{vs.} \qquad H_1: \beta_1 \neq 0.
t(93)=-8.2148,\qquad p=1.2038\times 10^{-12},\qquad R^2=0.4205,\qquad 95\%\,\text{CI}_{\beta_1}=[-75{,}288.76,\,-45{,}975.13].
\]

\subsection{Sample moments and least-squares coefficients}

From the data:
\[
n=95, \qquad \bar{x}=0.3810526316, \qquad \bar{y}=303{,}774.6096.
\]
Define
\[
S_{xx}=\sum_{i=1}^{n}(x_i-\bar{x})^2, \qquad S_{xy}=\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y}).
\]
Numerically,
\[
S_{xx}=7.0508947368, \qquad S_{xy}=-427{,}509.4691.
\]
The least-squares slope and intercept are
\[
\hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} = \frac{-427{,}509.4691}{7.0508947368} = -60{,}631.9460,
\]
\[
\hat{\beta}_0 = \bar{y} - \hat{\beta}_1\bar{x} = 303{,}774.6096 - (-60{,}631.9460)(0.3810526316) = 326{,}878.5722.
\]
So the fitted line is
\[
\hat{y} = 326{,}878.5722 - 60{,}631.9460\,x.
\]

\subsection{Residual variance and standard error of the slope}

For each observation, $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$ and $e_i = y_i - \hat{y}_i$. The residual sum of squares is
\[
\mathrm{SSE} = \sum_{i=1}^{n} e_i^2 = 35{,}721{,}896{,}352.27375.
\]
With $df=n-2=93$,
\[
\mathrm{MSE} = \frac{\mathrm{SSE}}{n-2} = \frac{35{,}721{,}896{,}352.27375}{93} = 384{,}106{,}412.3900.
\]
The slope standard error is
\[
SE(\hat{\beta}_1) = \sqrt{\frac{\mathrm{MSE}}{S_{xx}}} = \sqrt{\frac{384{,}106{,}412.3900}{7.0508947368}} = 7{,}380.8038.
\]

\subsection{t-statistic, p-value, and confidence interval}

Under $H_0: \beta_1=0$,
\[
t = \frac{\hat{\beta}_1}{SE(\hat{\beta}_1)} = \frac{-60{,}631.9460}{7{,}380.8038} = -8.2148,
\]
with $df=93$. The two-sided p-value is
\[
p = 2\,\Pr\left(T_{93} \ge |t|\right) = 1.2038\times 10^{-12}.
\]
The 95\% confidence interval is
\[
\hat{\beta}_1 \pm t_{0.975,93}\,SE(\hat{\beta}_1)
= -60{,}631.9460 \pm (1.9858)(7{,}380.8038)
= [-75{,}288.7597,\,-45{,}975.1324].
\]

\subsection{Effect size and fit statistics}

The sample correlation is $r=-0.64846$, so
\[
R^2 = r^2 = 0.4205.
\]
Hence, 42.05\% of the variation in \texttt{eval/revenue\_mean} is explained by a linear trend in \texttt{study/alpha}.

The slope interpretation is direct:
\[
\hat{\beta}_1 = -60{,}631.9460 \quad \Rightarrow \quad \Delta y \approx -6{,}063.19 \text{ for } \Delta x = +0.1.
\]
From $\alpha=0$ to $\alpha=0.8$, the fitted drop is
\[
0.8\times (-60{,}631.9460) = -48{,}505.5568,
\]
so the model predicts roughly $48{,}506$ lower revenue units on average.

\subsection{Conclusion of the slope test}

The estimated model is
\[
\hat{y}=326{,}878.57-60{,}631.95\,x,
\]
with
\[
t(93)=-8.2148, \qquad p=1.2038\times 10^{-12}, \qquad 95\%\,\text{CI}=[-75{,}288.76,\,-45{,}975.13].
\]
The slope is therefore strongly negative and statistically different from zero.
The Python verification reproduces the reported coefficients and inference values, confirming that the slope-test results are correct under standard methods.
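
As a hedged cross-check that needs only the reported summary values, recall that in simple linear regression the slope statistic satisfies $t^2 = (n-2)\,R^2/(1-R^2)$, with $t$ taking the sign of the slope. Plugging in $n=95$ and the rounded value $R^2=0.4205$ (so agreement is expected only to about three decimals):
\[
|t| = \sqrt{\frac{(n-2)\,R^2}{1-R^2}} = \sqrt{\frac{93 \times 0.4205}{0.5795}} \approx \sqrt{67.48} \approx 8.215.
\]
This agrees in magnitude with the reported $t(93)=-8.2148$; the negative sign follows from the negative slope estimate.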
% \input{../build/concatenated_code}