mirror of
https://github.com/velocitatem/PHANTOM.git
synced 2026-05-31 08:33:36 +00:00
fix: align
@@ -32,7 +32,6 @@ where $\alpha \in [0, 1]$ represents the contamination parameter (proportion of
\subsection{Cost of Information (COI) Framework}
The \textit{Cost of Information} (COI) represents the markup a pricing policy $\pi$ attempts to extract from the market by leveraging demand signals. We define COI as the expected premium over the minimum viable price $\underline{p}$ (or marginal cost). This also speaks to the financial urgency as a consequence of information asymmetry between the platform and the actors.
@@ -183,8 +182,10 @@ Study methodology and approach. Data acquisition strategy. Defined objectives an
To develop a robust pricing agent, we require a simulation environment capable of generating realistic, contaminated interaction data. We achieve this by learning from our Phantom platform data using a two-stage approach.
\subsubsection{GOFAI-Based Separability}
We employ Good Old-Fashioned AI (GOFAI) heuristics to generate initial weak labels for separability. We define a set of rule-based predicates $\phi_j: \tau \to \{0, 1\}$ (e.g., inter-arrival time consistency, DOM-traversal linearity) to partition the dataset $\mathcal{D}$ into high-confidence sets $\mathcal{D}_H$ and $\mathcal{D}_A$.
We employ Good Old-Fashioned AI (GOFAI) heuristics to generate initial weak labels for separability. We define a set of rule-based predicates $\phi_j: \tau \to \{0, 1\}$ to partition the dataset $\mathcal{D}$ into high-confidence sets $\mathcal{D}_H$ and $\mathcal{D}_A$. We then construct a separate MDP for each behavioral profile (human and agent) and, from these, compute $D_{KL}$ between the two.
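As an illustration of the weak-labeling step, the following minimal sketch applies two rule-based predicates to each session and keeps only the sessions on which the predicates agree. The predicate definitions, the threshold \texttt{max\_std}, the session layout (a dict with keys \texttt{"t"} and \texttt{"states"}), and the unanimous-vote rule are all assumptions made for this example, not the rules used on the Phantom data.

```python
from statistics import pstdev

def regular_timing(session, max_std=0.05):
    """Hypothetical predicate: 1 if inter-arrival gaps are nearly constant."""
    t = session["t"]
    gaps = [b - a for a, b in zip(t, t[1:])]
    return int(len(gaps) >= 2 and pstdev(gaps) <= max_std)

def linear_traversal(session):
    """Hypothetical predicate: 1 if no state is ever revisited."""
    states = session["states"]
    return int(len(set(states)) == len(states))

# Assumed predicate set phi_j; each maps a session to {0, 1}.
PREDICATES = [regular_timing, linear_traversal]

def weak_label(session):
    """Return 'A' if every predicate fires, 'H' if none does, else None."""
    votes = [phi(session) for phi in PREDICATES]
    if all(votes):
        return "A"
    if not any(votes):
        return "H"
    return None  # ambiguous sessions stay unlabeled

def partition(dataset):
    """Split a dataset into high-confidence human and agent subsets."""
    d_h = [s for s in dataset if weak_label(s) == "H"]
    d_a = [s for s in dataset if weak_label(s) == "A"]
    return d_h, d_a

bot = {"t": [0, 1, 2, 3], "states": ["a", "b", "c", "d"]}
human = {"t": [0, 0.4, 3.1, 3.9], "states": ["a", "b", "a", "c"]}
mixed = {"t": [0, 1, 2], "states": ["a", "b", "a"]}
d_h, d_a = partition([bot, human, mixed])
```

Requiring unanimous agreement trades coverage for label precision: sessions such as \texttt{mixed} above fall into neither subset.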
\subsubsection{Transition Probability Estimation}
For both subsets, we model the session dynamics as a Markov Decision Process (MDP) and estimate the transition kernel $\mathcal{T}$. The probability of transitioning to state $s'$ given state $s$ is estimated via maximum likelihood:
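The estimator itself is not visible in this hunk. As a hedged sketch, the standard count-based maximum-likelihood estimate, $\hat{\mathcal{T}}(s' \mid s) = N(s, s') / \sum_{s''} N(s, s'')$, can be computed as follows. The function names, the representation of a session as a list of state labels, and the $\epsilon$ floor in the divergence are assumptions made for this example.

```python
import math
from collections import Counter, defaultdict

def estimate_transitions(sessions):
    """Count-based MLE of P(s' | s) from lists of visited states."""
    counts = defaultdict(Counter)
    for states in sessions:
        for s, s_next in zip(states, states[1:]):
            counts[s][s_next] += 1
    kernel = {}
    for s, row in counts.items():
        total = sum(row.values())
        kernel[s] = {s_next: n / total for s_next, n in row.items()}
    return kernel

def kl_divergence(p, q, eps=1e-9):
    """D_KL(p || q) for two next-state distributions given as dicts.

    Probabilities missing from q are floored at eps (an assumption)
    so that the divergence stays finite.
    """
    return sum(
        p_k * math.log(p_k / max(q.get(k, 0.0), eps))
        for k, p_k in p.items()
        if p_k > 0
    )

sessions = [["a", "b", "a", "b"], ["a", "c"]]
T = estimate_transitions(sessions)
```

Calling \texttt{estimate\_transitions} once on each of $\mathcal{D}_H$ and $\mathcal{D}_A$ yields two kernels whose rows can be compared state by state with \texttt{kl\_divergence}.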
@@ -225,4 +226,4 @@ Steve Burns, superior culliculus (face heuristics) we create this sort of part o
For example, a DQN could serve as the learning subsystem; a steering subsystem, introduced through the reward mechanism or some other computational method, would then act as the proposed ``pricing heuristic'' against the given non-human transaction data.
\section{}
\section{Market construction}