more on revelation

2026-04-09 10:29:38 +02:00
parent 02328b20f2
commit 835e10d6ef
4 changed files with 7 additions and 14 deletions


@@ -126,15 +126,9 @@ In code we do the basic fix: add a tiny floor $\varepsilon$ to both the numerato
\section{Why the logarithm appears in the revelation surrogate}
\label{app:revelation_log}
Recall that $\text{COI}_{\text{leak}}(p,\tau') = f(\tau')\cdot\text{InfoValue}(p,\tau')$. The query-tax surrogate fixes $\text{InfoValue}$ to a positive constant: every suspected reconnaissance quote is penalized equally, which tracks the erosion theorem where independent query volume drives COI to zero. The revelation surrogate instead sets
\begin{equation}
\text{InfoValue}(p,\tau') = -\log \pi(p\mid\tau'),
\end{equation}
where $\pi(\cdot\mid\tau')$ is the pricing policy's distribution over quoted prices in context $\tau'$ (after whatever discretization or binning the engine uses).
Leakage is $\text{COI}_{\text{leak}} = f(\tau')\cdot\text{InfoValue}$. The query-tax form fixes $\text{InfoValue}=c>0$. The revelation form sets $\text{InfoValue}(p,\tau')=-\log\pi(p\mid\tau')$, with $\pi(\cdot\mid\tau')$ the policy distribution over quoted prices in context $\tau'$ (discretized as in the engine).
For an outcome that occurs with probability $q$, the quantity $-\log q$ is the usual \emph{surprisal}: likely draws have small surprisal, rare draws have large surprisal. That is the same ``surprise'' people import into recommender systems when they formalize novelty as low predicted probability under a model---here the model is our own policy. The log is not decorative: it is the standard information-theoretic coding of ``how unexpected was this draw under $\pi$?'' In the reconnaissance reading, a quote from a thin slice of the policy's support is more identifying than a modal quote, because it pins down what the rule is willing to do in places where little mass sits.
Put together, the revelation form is \emph{contamination-weighted surprisal}: $f(\tau')$ scales how agent-like we judge the session, and $-\log\pi(p\mid\tau')$ scales how informative the realized price is relative to $\pi(\cdot\mid\tau')$. In the implementation we still floor $\pi(p\mid\tau')$ away from zero so that tail bins do not explode the penalty. This is the same honesty as in Appendix~\ref{app:kl_zeros}: we use a regularized surrogate, not a literal infinite penalty.
For an outcome with probability $q$, the quantity $-\log q$ is its \emph{surprisal}. For independent events, $-\log\prod_i q_i=\sum_i(-\log q_i)$, so surprisal adds across independent draws. The revelation term is the surprisal of the realized quote $p$ under $\pi(\cdot\mid\tau')$, multiplied by $f(\tau')$. In practice we use $\max\{\pi(p\mid\tau'),\varepsilon\}$ in place of $\pi(p\mid\tau')$ so the log stays finite (same spirit as Appendix~\ref{app:kl_zeros}).
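As a rough worked illustration (the numbers are hypothetical and not taken from the engine, the superscript-$\varepsilon$ notation is introduced here only for this example, and we assume natural logarithms, which the text above does not fix), write the floored penalty as
\begin{equation}
\text{COI}_{\text{leak}}^{\varepsilon}(p,\tau') = -f(\tau')\,\log\max\{\pi(p\mid\tau'),\varepsilon\}.
\end{equation}
With $f(\tau')=0.8$, a modal quote with $\pi(p\mid\tau')=0.5$ costs $0.8\times\log 2\approx 0.55$, whereas a tail quote with $\pi(p\mid\tau')=0.01$ costs $0.8\times\log 100\approx 3.68$. With $\varepsilon=10^{-6}$, any bin whose mass falls below $\varepsilon$ (including an empty bin) is capped at $0.8\times\log 10^{6}\approx 11.05$ instead of diverging.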
% \input{../build/concatenated_code}