updating computation power graph

2026-07-16 01:53:37 +00:00 · 2026-03-08 14:22:54 +01:00
parent 17c128cbc0
commit 28dbcacd95
6 changed files with 142 additions and 114 deletions
--- a/paper/src/main.tex
+++ b/paper/src/main.tex
@@ -53,6 +53,31 @@ These behavioral signals serve as inputs for a Distributionally Robust Reinforce
 \item[Trajectory] A sequence of unspecified length that records the states of an object over time.
 % TODO: maybe define other things in a similar succinct manner
 \end{description}
+
+\section{Aggregate Compute Budget Derivation}
+\label{app:compute_budget}
+
+The claimed peak throughput of approximately 160\,PFLOPS follows from multiplying the per-chip BF16 peak of each TPU generation (from official Google Cloud TPU documentation) by the number of chips allocated for that generation, then summing across generations.
+
+\begin{table}[ht]
+\centering
+\caption{Per-generation contribution to aggregate BF16 throughput.}
+\label{tab:compute_derivation}
+\begin{tabular}{@{}lrrr@{}}
+\toprule
+\textbf{TPU Gen.} & \textbf{Chips} & \textbf{Peak BF16/chip (TFLOPS)} & \textbf{Subtotal (TFLOPS)} \\
+\midrule
+v6e (Trillium) & 128 & 918 & $128 \times 918 = 117{,}504$ \\
+v5e            & 128 & 197 & $128 \times 197 = 25{,}216$  \\
+v4             &  64 & 275 & $64  \times 275 = 17{,}600$  \\
+\midrule
+\textbf{Total} & \textbf{320} & & $\mathbf{160{,}320}$ \\
+\bottomrule
+\end{tabular}
+\end{table}
+
+Converting to petaFLOPS: $160{,}320\;\text{TFLOPS} = 160.32\;\text{PFLOPS} \approx 160\;\text{PFLOPS}$. This is the theoretical peak, which assumes every chip sustains its maximum BF16 rate simultaneously. Realized throughput is lower and depends on memory bandwidth utilization and inter-chip communication overhead, so the figure should be read as an upper bound for provisioning decisions.
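+
+As a rough illustration of the gap between peak and realized throughput, the peak can be scaled by a utilization factor $\eta \in (0, 1]$. Both the symbol $\eta$ and the value $0.4$ used below are assumptions chosen purely for illustration; they are not measurements from this work.
+\[
+T_{\text{realized}} = \eta \, T_{\text{peak}}, \qquad \text{e.g.}\quad \eta = 0.4 \;\Rightarrow\; T_{\text{realized}} \approx 0.4 \times 160.32\;\text{PFLOPS} \approx 64.1\;\text{PFLOPS}.
+\]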
+
 % \input{../build/concatenated_code}

 \end{document}