mirror of
https://github.com/velocitatem/PHANTOM.git
synced 2026-05-31 16:43:36 +00:00
more on revelatin
This commit is contained in:
@@ -126,15 +126,9 @@ In code we do the basic fix: add a tiny floor $\varepsilon$ to both the numerato
|
||||
\section{Why the logarithm appears in the revelation surrogate}
|
||||
\label{app:revelation_log}
|
||||
|
||||
Recall that $\text{COI}_{\text{leak}}(p,\tau') = f(\tau')\cdot\text{InfoValue}(p,\tau')$. The query-tax surrogate fixes $\text{InfoValue}$ to a positive constant: every suspected reconnaissance quote is penalized equally, which tracks the erosion theorem where independent query volume drives COI to zero. The revelation surrogate instead sets
|
||||
\begin{equation}
|
||||
\text{InfoValue}(p,\tau') = -\log \pi(p\mid\tau'),
|
||||
\end{equation}
|
||||
where $\pi(\cdot\mid\tau')$ is the pricing policy's distribution over quoted prices in context $\tau'$ (after whatever discretization or binning the engine uses).
|
||||
Leakage is $\text{COI}_{\text{leak}} = f(\tau')\cdot\text{InfoValue}$. The query-tax form fixes $\text{InfoValue}=c>0$. The revelation form sets $\text{InfoValue}(p,\tau')=-\log\pi(p\mid\tau')$, with $\pi(\cdot\mid\tau')$ the policy distribution over quoted prices in context $\tau'$ (discretized as in the engine).
|
||||
|
||||
For an outcome that occurs with probability $q$, the quantity $-\log q$ is the usual \emph{surprisal}: likely draws have small surprisal, rare draws have large surprisal. That is the same ``surprise'' people import into recommender systems when they formalize novelty as low predicted probability under a model---here the model is our own policy. The log is not decorative: it is the standard information-theoretic coding of ``how unexpected was this draw under $\pi$?'' In the reconnaissance reading, a quote from a thin slice of the policy's support is more identifying than a modal quote, because it pins down what the rule is willing to do in places where little mass sits.
|
||||
|
||||
Put together, the revelation form is \emph{contamination-weighted surprisal}: $f(\tau')$ scales how agent-like we judge the session, and $-\log\pi(p\mid\tau')$ scales how informative that realized price is relative to $\pi(\cdot\mid\tau')$. In implementation you still floor $\pi(p\mid\tau')$ away from zero so tail bins do not explode the penalty---the same honesty as Appendix~\ref{app:kl_zeros}: we use a regularized surrogate, not a literal infinite penalty.
|
||||
For an outcome with probability $q$, the quantity $-\log q$ is \emph{surprisal}. For independent events, $-\log\prod_i q_i=\sum_i(-\log q_i)$. The revelation term is surprisal under $X\sim\pi(\cdot\mid\tau')$, multiplied by $f(\tau')$. In practice we do $\max\{\pi,\varepsilon\}$ in place of $\pi$ so the log stays finite (same spirit as Appendix~\ref{app:kl_zeros}).
|
||||
|
||||
|
||||
% \input{../build/concatenated_code}
|
||||
|
||||
Reference in New Issue
Block a user