In a benchmark quadratic model, manipulation generates a wedge between measured output and true effort. We show the Stackelberg-optimal linear contract is a matrix shrinkage/rotation of the principal’s marginal values: tasks that are cheaper to manipulate receive strictly lower pay weights. We then study auditing/repeated measurement: an audit signal provides an unmanipulable (but noisy) measure of effort, and the optimal implementable policy is to pay on the manipulable metric when unaudited and on the audit metric when audited, which equivalently reduces effective manipulability by the audit rate.
Finally, we treat the task values and manipulability as unknown and show that audited repeats serve as instruments enabling consistent GMM estimation and low-regret online learning. Under agent diversity, we obtain exploration-light (greedy) learning guarantees. The results provide implementable guidance for designing incentives that remain robust to Goodharting while staying simple and auditable.
Digital labor markets and algorithmic workplaces increasingly govern behavior through a small number of observable metrics: clicks and watch time for creators, items-per-hour for fulfillment workers, average handle time for call-center agents, on-time delivery rates for drivers, and defect counts for quality inspectors. These metrics are attractive because they are scalable, comparable across workers, and easily fed into automated pay, ranking, and promotion systems. Yet the same properties that make metrics operationally convenient also make them vulnerable: when compensation and continued access to a platform depend on a measure, the measure becomes a target. In practice, agents respond not only by reallocating productive effort toward measured activities, but also by actively ``working the metric’’—altering inputs, timing, reporting, or customer interactions in ways that raise recorded performance without raising true value. This tension is often summarized by Goodhart’s Law, but from the perspective of contract design it is a concrete principal–agent problem with a distinctive feature: the principal observes a manipulable signal that the agent can inflate.
We study this phenomenon as a multitask contracting problem in which the agent has two levers. The first is true effort—the collection of task-specific actions that raise the principal’s realized value (for example, solving customer issues, carefully picking and packing, or producing high-quality content). The second is manipulation—actions that raise the observed metric without proportionate gains in value (for example, transferring calls to reduce handle time, prioritizing easy picks, keyword stuffing, engagement bait, or strategically timing work around measurement windows). Conceptually, both levers respond to incentives, but only one generates surplus. This distinction is economically central: a platform that optimizes an easily gamed metric can end up financing manipulation rather than production, even when the platform’s underlying objective is well-defined and stable.
Several motivating examples illustrate why multitask structure and manipulability must be modeled jointly. On creator platforms, a contract that pays on short-run engagement can induce creators to shift effort away from long-horizon quality and toward tactics that exploit recommendation systems or viewer psychology. The resulting content may maximize clicks while eroding retention and brand value, a gap that is difficult to diagnose in real time because the platform observes its own engagement metrics continuously but observes true viewer welfare only noisily and with delay. In fulfillment centers, incentives based on pick rate and scan counts can raise throughput while simultaneously increasing mis-picks, unsafe movement, or the avoidance of harder tasks, generating hidden costs that appear later as returns, injuries, or turnover. In call centers, pay and discipline tied to handle time predictably induce call avoidance and premature termination, sometimes raising customer churn that is observable only after the fact. In each case, the principal has access to an immediate metric that is both informative and manipulable, and may also have access to a slower or costlier signal—quality audits, repeat measurements, customer follow-ups, or manual reviews—that is harder to game.
Our aim is to illuminate a basic tradeoff that practitioners confront: powerful incentives improve measured performance, but also amplify gaming when measurement is manipulable. We therefore adopt a deliberately transparent linear-quadratic framework. The benefit of this choice is not realism for its own sake, but tractability and interpretability. The model delivers a closed-form mapping from primitives—task productivities, task-specific manipulabilities, and audit intensity—to optimal incentive weights. This mapping is useful precisely because it can be read as a design rule: it tells the platform when to downweight a metric, when to invest in auditing, and how to combine signals when multiple measurements of performance are available.
The first contribution is to formalize ``metric gaming’’ as a separable response to incentives. When the agent can both exert true effort and inflate the metric, the contract loads on each task not only according to the principal’s marginal value of effort, but also according to how cheaply that task’s metric can be inflated. In our environment, this yields a shrinkage logic: tasks whose observed metrics are highly manipulable should receive weaker pay incentives, all else equal. The economic intuition mirrors standard multitask considerations, but with a sharper interpretation. In classic multitask models, distortion arises because some tasks are unmeasured or measured with noise; here, distortion arises because measurement is endogenous. The metric becomes a choice variable, and the principal must treat the prospect of manipulation as an additional effective cost of using that metric for pay.
The second contribution is to show how audits change the incentive problem in a particularly simple way. Many platforms can occasionally obtain an alternative measurement that is less manipulable: a manual review, a spot check, an independent log, or a second sensor. Importantly, such audits need not be perfect; they need only be less affected by manipulation than the primary metric. When pay can be conditioned on whether an audit occurs—for instance, paying on the primary metric most of the time but switching to the audited measure when the spot check is triggered—the marginal return to manipulation falls in proportion to the probability that manipulation will be ignored at payment time. This yields a clean comparative static: increasing audit frequency allows the principal to safely strengthen incentives, pushing the contract closer to the benchmark that would obtain if manipulation were impossible. In practice, this captures a familiar operational lesson: auditing is not merely a compliance tool; it is a way to ``harden’’ metrics and thereby enable higher-powered incentives on dimensions the platform truly cares about.
Our third contribution concerns learning and deployment in settings where the platform does not know, ex ante, which tasks are valuable and which metrics are manipulable. This informational challenge is ubiquitous. A new platform may be uncertain about how different behaviors translate into long-run retention, and even an established platform may face shifting user preferences, changing algorithmic opportunities for gaming, or heterogeneous populations of agents. We propose an approach that uses audited repeats not only to deter manipulation, but also to identify it. When both the manipulable metric and the audit measurement are observed in the same round, their difference isolates manipulation up to noise. Because manipulation responds systematically to incentive weights, contract variation can be exploited to estimate task-specific manipulability. Once manipulation is estimated and netted out, the platform recovers a classical repeated-measurement structure that supports instrumental-variables estimation of the value of effort. This sequence—use audits to identify manipulation, then use corrected signals to learn task values—connects mechanism design to empirical identification in a way that is operationally implementable.
These results speak to a broader set of questions about metric-based governance. First, they clarify why ``fixing the metric’’ by simply adding more metrics can fail. Without understanding manipulability, expanding the dashboard may simply expand the space of gaming. Second, they suggest that the right response to gaming is often neither to abandon incentives nor to rely on ex post punishment, but to redesign the measurement and payment system jointly: shrink incentives on easily gamed dimensions, and allocate auditing capacity to dimensions where stronger incentives would otherwise be valuable. Third, they highlight a practical role for randomness. Random audits and randomized contract variation are not only statistically convenient; they are economically meaningful because they change the agent’s expected payoff from manipulation and provide the exogenous variation needed for identification.
At the same time, we are explicit about limitations. We work with linear contracts, quadratic costs, and risk neutrality, which together deliver a clean decomposition between productive effort and manipulation and allow us to express optimal weights in closed form. Real environments feature richer dynamics: agents may learn the platform’s detection rules, manipulation may have nonlinear or threshold effects, and platforms may care about distributional outcomes or constraints beyond expected surplus. Moreover, our audit signal is assumed to be less manipulable and conditionally independent, which is a useful abstraction but not always satisfied (auditors can be corrupted; audits can induce anticipatory behavior). We view these simplifications as clarifying rather than innocuous: they isolate the core economic force—endogenous measurement—and provide a baseline against which richer models can be compared.
Roadmap. In the remainder of the paper, we proceed in three steps. We first present the multitask model of effort and metric manipulation and derive equilibrium behavior under linear incentive weights, emphasizing the separation between productive and manipulative responses. We then characterize the platform’s optimal contract when it pays only on the manipulable metric, and show how optimal incentives implement a task-by-task shrinkage rule based on manipulability. Next, we introduce audits and show how audit-contingent payment policies reduce effective manipulability and strengthen optimal incentives. Finally, we turn to learning: using audited repeats to estimate manipulability and task values, and designing an epoch-based policy that leverages these estimates to achieve low regret. We conclude with implications for platform policy, including when to audit, how to prioritize metrics for incentive pay, and how to balance transparency with the need to preserve the deterrent and informational value of auditing.
Our paper sits at the intersection of multitask moral hazard, endogenous measurement, and learning under strategic responses. A unifying theme across these literatures is that the object the principal can condition on—a performance measure, a score, a proxy label—is not a passive statistic: it is shaped by the agent’s incentives. What we add to this conversation is a deliberately simple mapping from primitives (task productivities, task-specific manipulabilities, and audit intensity) to optimal linear incentives and to a practical identification strategy that uses audited repeats to separate productive effort from manipulation.
The backbone of our environment is the multitask moral hazard framework initiated by , in which a principal chooses incentive weights on observable signals and the agent allocates effort across tasks. In that tradition, distortions arise because some tasks are unmeasured, measured noisily, or measured with different precisions, yielding the canonical ``incentives vs. distortion’’ tradeoff. Our model shares the emphasis on task-by-task incentive design, but differs in the channel that generates distortion: the measured signal is manipulable, so the agent can raise the metric directly rather than only through productive effort. This feature connects to a broader set of principal–agent models in which agents choose actions that affect both output and information (e.g., influence activities, window dressing, or costly signaling), and in which the principal treats the signal as an equilibrium object rather than an exogenous statistic.
The idea that performance pay can induce socially wasteful activities has a long history in economics and organizational design. Classic discussions of incentive pay emphasize that compensation based on a single measure can encourage ``teaching to the test,’’ tunnel vision, and the reallocation of effort toward what is rewarded rather than what is valuable . Empirically and theoretically, researchers have documented a range of dysfunctional responses to metric-based contracts—from gaming in education and healthcare to manipulation in sales and finance—often framed as agents exploiting the gap between an operational metric and the principal’s objective . We formalize a particularly stark version of this gap by allowing the agent to choose both productive effort and metric inflation, with separable convex costs, so that the observed metric decomposes cleanly into value-creating and value-destroying components.
A second relevant strand studies when linear contracts are optimal or approximately optimal, and when they are robust to misspecification. Linear pay is not merely a modeling convenience: it is widely used in practice because it is transparent, scalable, and easy to implement in automated systems. From a theory perspective, linear contracts arise in classic CARA-normal or related environments and as optimal within restricted classes; they are also commonly used as benchmarks in multidimensional moral hazard because they deliver comparative statics that can be interpreted as design rules (how to tilt weights across tasks as primitives change). Our contribution is in this spirit: by keeping the contract linear and costs quadratic, we obtain closed-form ``shrinkage’’ of incentive weights as manipulability rises, and an equally transparent adjustment when audits reduce the effective return to manipulation.
A growing literature in algorithmic contracting and data-driven mechanism design asks how a principal should choose contracts when the mapping from incentives to performance is unknown and must be learned over time. This work often combines classical incentive constraints with statistical estimation, emphasizing that exploration changes behavior and that one must learn under endogenous data . Our learning component is closest to work that exploits structural restrictions to obtain low-regret learning with limited experimentation, particularly when the endogenous response can be expressed as a low-dimensional function of contract parameters. In our setting, separability and linear best responses make it possible to use contract variation as a clean source of identification for manipulability and, conditional on correction, for task values.
Outside traditional contract theory, a large literature in computer science and adjacent fields studies how agents strategically respond to scoring rules and classifiers. The strategic classification and ``performative prediction’’ literatures formalize the idea that once a predictive score is used for decision-making, agents change their features, breaking the original statistical relationship between features and outcomes . This line of work shares our focus on endogenous covariates and equilibrium effects of optimization on proxies. Our model is economically distinct in two ways. First, we separate a value-relevant latent action (true effort) from a purely manipulative action (metric inflation), which clarifies the welfare implications of adaptation: not all behavior change is gaming. Second, we embed the score (the metric) in a contracting problem where payments transfer surplus, so the principal internalizes the fiscal cost of rewarding manipulable components. In this sense, our shrinkage rule can be viewed as an economic analogue of regularization against strategic manipulation: we downweight features (tasks) that are easier to move without creating value.
More broadly, Goodhart’s Law is often invoked as a qualitative warning that targets cease to be good measures once optimized. We view that maxim as a prompt for model-based design: once manipulation is feasible, the principal should treat the metric as an equilibrium outcome and redesign incentives and measurement accordingly. Our paper contributes a simple analytic case in which the Goodhart effect has a precise form (an additive manipulation component with incentive-responsive magnitude) and therefore admits a transparent correction (shrinkage in pay weights and audit-contingent conditioning).
A fourth set of connections is to the auditing and monitoring literature. In many principal–agent models, the principal can invest in monitoring to better infer effort or to deter hidden actions, with audits arriving stochastically or through costly verification . This literature emphasizes how verification changes incentives even when it is imperfect or probabilistic: the possibility of being checked disciplines behavior by lowering the expected return to misreporting or opportunism. Our audit mechanism is deliberately minimalistic—an alternative measurement that is less manipulable and observed with some probability—but it captures a core operational feature of platform governance: spot checks, manual reviews, and secondary sensors are common precisely because continuous perfect monitoring is infeasible.
Related models study how principals optimally combine routine performance measures with occasional verification, and how audit probabilities should depend on reported performance or on the history of behavior. We do not endogenize the audit policy in full generality; instead, we take the audit probability as given (or as a policy lever summarized by a scalar) and focus on how audit-contingent pay changes the agent’s marginal incentives. The key economic force is that auditing scales down the marginal benefit of manipulation by the probability that the manipulable signal is actually used for pay. This yields a particularly sharp comparative static: more auditing permits higher-powered incentives without proportionately higher gaming. While richer audit schemes may improve on our simple policy, we view this result as a baseline that clarifies what any more elaborate mechanism must exploit.
Our learning strategy draws on the econometrics of measurement error and instrumental variables. When observed regressors are noisy or systematically distorted, naive regression of outcomes on measured performance is biased. A standard solution is to use instruments or repeated measurements: two noisy proxies for the same latent variable can identify the relationship between the latent variable and outcomes under conditional independence and relevance conditions . The audit signal in our setting provides precisely such a second measurement of effort, but only after we account for manipulation in the primary metric. The structural relation between incentives and manipulation makes this correction feasible: on audited rounds, the difference between the manipulable metric and the audit metric isolates manipulation (up to noise), and contract variation supplies the needed rank conditions.
This approach connects to work on validation samples and audit-based correction, where a subset of observations receives higher-quality measurement and can be used to de-bias estimates from the full sample. Our contribution is to integrate this idea into a strategic environment: the validation sample (audits) affects behavior, and the correction must therefore respect the equilibrium mapping from incentives to actions. In other words, the audit is both an identification device and a policy intervention, and the two roles are inseparable.
Finally, our online component relates to bandit and online learning models in which actions affect not only payoffs but also the data-generating process. In standard stochastic bandits, rewards are exogenous conditional on the chosen arm; in our setting, the principal’s contract choice changes agent behavior, which changes both the observed metrics and the outcome. This endogeneity makes naive ``learn-then-optimize’’ approaches fragile and motivates methods that combine exploration with consistent estimation under behavioral responses. Recent work studies bandits with confounding, instrumental-variable bandits, and adaptive experiments in which instruments or randomization are used to recover causal parameters despite endogenous regressors . We contribute to this line by showing that, in our linear-quadratic environment, audits deliver a repeated-measurement structure that supports a simple two-stage estimator and an epoch-based greedy policy with low regret under a diversity (eigenvalue growth) condition.
At the same time, we emphasize what our framework does capture. Many strategic classification models allow agents to choose features directly but abstract from transfers; many auditing models treat misreporting rather than endogenous metric creation; and many online learning models assume stationarity that may fail when agents adapt to detection rules or when manipulability evolves. Our aim is to provide a tractable synthesis: a model in which metric manipulation is an explicit action, auditing both deters and identifies manipulation, and learning can be carried out with interpretable moment conditions. We view this synthesis as complementary to richer frameworks, offering a baseline that is analytically transparent and operationally suggestive for platforms deciding how to weight metrics, when to audit, and how to learn what they truly value.
We study a platform that repeatedly incentivizes a stream of strategic agents to perform a collection of d tasks. The platform cares about a latent, value-relevant notion of performance (``true effort’’) but can only contract on operational metrics that are, at least in part, manipulable. Our goal in this section is to lay out a minimal environment that makes this wedge precise: incentives move both productive effort and wasteful metric inflation, and occasional auditing provides an alternative measurement that is harder to game.
Time is indexed by rounds t = 1, 2, …, T. In each
round the platform (principal) posts a linear contract, an agent
arrives, chooses actions, performance signals are realized, and the
platform updates its beliefs for future rounds. Formally, in round t:
(i) the principal chooses contract parameters; (ii) the agent chooses a
true effort vector at ∈ ℝ+d
and a manipulation vector mt ∈ ℝ+d;
(iii) a routinely observed metric xt ∈ ℝd
is generated; (iv) with some probability an audit (or secondary
measurement) takes place and an additional signal x̃t ∈ ℝd
is observed; (v) the agent is paid according to the posted contract and
realized signals; (vi) the principal observes a noisy outcome yt ∈ ℝ and
proceeds to the next round. We treat each round as involving a fresh
agent with the same primitives, so the intertemporal problem faced by
the principal is a learning-and-control problem rather than a dynamic
incentive problem with a single long-lived agent.
The operational metric xt is an
equilibrium object: it reflects both genuine productive input and the
agent’s ability to inflate the metric directly. We posit an additive
technology
xt = at + mt + εt,
where εt
is a mean-zero subgaussian noise vector that captures idiosyncratic
measurement error, environmental randomness, and other sources of
variation orthogonal to the agent’s choice. The key modeling choice here
is that manipulation enters xt additively,
separable from true effort. This captures settings in which there are
direct levers for improving measured performance without commensurate
improvements in underlying value—for instance, manipulating clicks,
polishing superficial features, or strategically timing actions to look
good under a metric.
Auditing provides a second measurement that is (by design) less
manipulable. When an audit occurs, the platform observes
x̃t = at + ε̃t,
where ε̃t
is mean-zero subgaussian noise, independent of εt conditional
on actions. We interpret x̃t as a more
reliable but more expensive or less scalable measurement: manual review,
secondary sensors, back-end logs, or verified outcomes. Importantly,
x̃t depends
on effort at but not on
manipulation mt. This is the
formal expression of what ``auditing’’ accomplishes in practice: it does
not eliminate noise, but it breaks the link between metric inflation and
the measurement used for evaluation.
Audits arrive stochastically. Let p ∈ [0, 1] denote the audit probability. With probability p the platform observes x̃t (and can condition pay on it); with probability 1 − p it observes only xt. We take p as an exogenous monitoring intensity, or as a reduced-form policy variable summarizing the platform’s investment in verification. In the online learning part of the paper, p also controls how frequently the platform obtains the ``high-quality’’ measurement needed for identification.
The principal cares about a realized outcome yt (e.g.,
downstream user value, retention, revenue, or a quality score that is
not directly contractible). We assume that expected outcome is linear in
true effort:
𝔼[yt ∣ at] = ⟨θ*, at⟩,
where θ* ∈ ℝ+d
is the vector of marginal values of effort in each task, and where the
realized outcome is yt = ⟨θ*, at⟩ + ηt
with mean-zero subgaussian noise ηt. The
nonnegativity θ* ≥ 0 encodes that each
task is (weakly) socially valuable at the margin; the central distortion
in our environment is not that tasks are harmful, but that the platform
may nonetheless want to reduce incentives on tasks whose measurement is
easy to game.
Given a payment rule wt, the
principal’s per-round utility is quasi-linear:
UP, t = yt − wt.
Thus the platform internalizes the fiscal cost of metric-based pay:
rewarding an inflated metric is costly even if it does not create real
value. This accounting is important for the design problem, and it
distinguishes our setting from purely predictive or classification
problems in which the decision maker does not literally pay for
performance.
We restrict attention to linear contracts, both for tractability and
because many platform compensation rules are effectively linear in
scores, counts, and ratings (possibly after normalization). When both
signals are available, we write the most general linear rule as
wt(xt, x̃t) = ⟨β(x), xt⟩ + ⟨β(x̃), x̃t⟩,
with nonnegative weights β(x), β(x̃) ∈ ℝ+d.
In rounds without an audit, the contract necessarily reduces to payment
on xt
alone. A particularly salient and operationally common specialization is
an audit-contingent rule: the platform pays on the routinely observed
metric when there is no audit, but switches to the audit metric when
audited. With a single weight vector β ∈ ℝ+d,
this takes the form
wt = ⟨β, xt⟩ ⋅ 1{no
audit} + ⟨β, x̃t⟩ ⋅ 1{audit}.
This ``switching’’ rule captures a simple governance logic: the platform
wants to use the cheap metric most of the time, but when it pays the
cost to verify, it wants compensation to depend on what verification
reveals.
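To fix ideas, a minimal simulation of one round under this switching rule is sketched below; the parameter values, noise scales, and action vectors are hypothetical, and numpy is the only dependency assumed.

```python
import numpy as np

rng = np.random.default_rng(0)
d, p = 3, 0.3                        # number of tasks, audit probability (hypothetical)
beta = np.array([0.4, 0.2, 0.5])     # posted linear weights (hypothetical)
a = np.array([0.8, 0.3, 0.6])        # agent's true effort, taken as given here
m = np.array([0.2, 0.5, 0.1])        # agent's metric inflation, taken as given here

# Routinely observed metric: x = a + m + eps
x = a + m + rng.normal(scale=0.1, size=d)

# With probability p an audit occurs; the audit metric excludes manipulation: x_tilde = a + eps_tilde
audited = rng.random() < p
x_tilde = a + rng.normal(scale=0.1, size=d) if audited else None

# Switching rule: pay on x when unaudited, on x_tilde when audited
pay = float(beta @ (x_tilde if audited else x))
print(audited, round(pay, 3))
```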
We allow the principal’s choice of weights to be constrained to some feasible set B (e.g., budget, regulatory, or product constraints), such as B = {β ∈ [0, 1]d}. These constraints play no conceptual role in the benchmark characterization when the unconstrained optimum lies in the interior, but they become relevant when we discuss implementability and learning with bounded experimentation.
Agents are risk neutral and choose both effort and manipulation to
maximize expected payment minus costs. Under a linear contract, only
expected pay matters, so the noise terms enter payoffs only through
their means (which are zero). We assume separable quadratic costs:
$$
c(a_t) \;=\; \tfrac12 a_t^\top K^{-1} a_t,
\qquad
g(m_t) \;=\; \tfrac12 m_t^\top M^{-1} m_t,
$$
where K = diag(κ1, …, κd) ≻ 0
and M = diag(μ1, …, μd) ≻ 0.
We interpret κi as the
responsiveness/productivity of task i: higher κi means
incentives translate into larger changes in real effort along that
dimension. Likewise, μi summarizes
manipulability: higher μi means metric
inflation is cheaper, so the same incentive weight induces more
gaming.
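Because K and M are diagonal, the costs decompose task by task; writing them out componentwise makes the roles of κi and μi explicit:
$$
c(a_t) \;=\; \sum_{i=1}^d \frac{a_{t,i}^2}{2\kappa_i},
\qquad
g(m_t) \;=\; \sum_{i=1}^d \frac{m_{t,i}^2}{2\mu_i},
$$
so a larger κi (respectively μi) lowers the marginal cost of effort (respectively inflation) on task i.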
The diagonal structure is a deliberate simplification. It isolates the core tradeoff on each task—how to pay for a measured dimension that can be moved either by real effort or by distortion—without introducing cross-task technological interactions. In many applications, substitutability and complementarity across tasks are surely present (e.g., time spent gaming one metric crowds out work on another), and one could incorporate these forces by allowing non-diagonal cost or production matrices. We view the diagonal case as a baseline that yields transparent comparative statics and a clean identification strategy; it also aligns with the practical reality that many platforms can tune incentive weights task-by-task even if the underlying technology is more entangled.
Given a posted contract, the agent’s per-round expected utility
is
UA, t = 𝔼[wt] − c(at) − g(mt),
subject to at ≥ 0 and mt ≥ 0
componentwise. The nonnegativity constraints reflect that
``effort’’ and ``inflation’’ are intensities rather than signed controls; analytically, these constraints would matter only if the principal were to set negative pay weights, which we exclude.
Within each round, the principal is the leader and the agent is the follower. A (per-round) Stackelberg equilibrium under a given contract class is defined by: (i) a principal choice of contract parameters (weights, and possibly an audit-contingent form) and (ii) an agent best response mapping from the contract to (at, mt), such that the principal’s choice maximizes expected utility anticipating the agent’s response. In the repeated environment, the equilibrium object of interest is a policy for choosing contract parameters over time based on past observations, together with the induced sequence of agent best responses. We emphasize that the principal never observes (at, mt) directly; it observes only the signals (xt, x̃t) (when available) and the outcome yt.
This distinction between within-round strategic response and across-round learning is central. The data the principal collects are endogenous to the posted contract. When parameters such as θ* and M are unknown, naive estimation that treats xt as an exogenous regressor will generally be biased, precisely because xt mixes productive effort with incentive-driven manipulation. The audit process, by producing a second measurement x̃t that is not manipulable, is what allows us to separate these components and learn consistently.
Several simplifications are worth making explicit. First, we assume manipulation does not directly affect yt; it is privately wasteful (it raises pay and potentially consumes real resources via g(mt)) but produces no social value. This assumption cleanly captures ``pure gaming.’’ In some contexts, what looks like manipulation may have ambiguous welfare effects (e.g., marketing that inflates a metric but also increases demand). Extending the model to allow mt to enter yt would blur the normative interpretation of gaming but would not change the logic that the principal must distinguish value-creating from value-destroying responses to incentives.
Second, we treat audits as exogenous and non-strategic: the audit probability p is not conditioned on the realized metric, and agents do not face additional penalties beyond the fact that audited pay depends on x̃t. This captures environments where verification is used mainly to improve measurement rather than to punish. In practice, platforms often combine verification with sanctions, targeted audits, and adaptive detection. Our reduced-form audit mechanism is intended as a baseline that isolates a single force—audits reduce the return to manipulation by weakening the link between inflated metrics and pay—while still generating empirically meaningful testable implications (namely, differences between xt and x̃t on audited rounds).
Third, linear contracts and quadratic costs are chosen to deliver a transparent mapping from primitives to behavior. The benefit is interpretability: we can tie comparative statics directly to task-specific productivity and manipulability, and we can use the linear response structure for identification and learning. The cost is that we abstract from risk aversion, non-linear bonus schemes, and richer forms of strategic interaction. We view this tradeoff as appropriate for our main use case: automated, large-scale contracting where simplicity and implementability are first-order, and where the principal’s challenge is to design and learn in the presence of systematic metric gaming.
The next section uses this structure to characterize behavior and optimal linear incentives under known primitives, and to clarify precisely how manipulability and audit intensity shape the platform’s optimal weighting of performance metrics.
We begin with a static benchmark in which the platform knows the primitives (θ*, K, M) and is restricted to contracts that pay only on the routinely observed (and manipulable) metric x. This case isolates the central wedge in our environment: because the agent can move x both by productive effort and by metric inflation, the platform optimally shrinks incentive weights on dimensions that are easy to game.
Fix a nonnegative linear contract β ∈ ℝ+d
and suppose pay depends on x
alone, w = ⟨β, x⟩. Since
noises are mean zero, the agent solves
$$
\max_{a\ge 0,\,m\ge 0}\ \mathbb{E}[w]-c(a)-g(m)
\;=\;
\max_{a\ge 0,\,m\ge 0}\ \langle \beta,a+m\rangle
-\tfrac12 a^\top K^{-1}a
-\tfrac12 m^\top M^{-1}m.
$$
Two features make this problem especially transparent. First, the
objective is additively separable in a and m, so the agent chooses productive
effort and manipulation as if they were two independent ``technologies’’
for increasing expected pay. Second, under our quadratic costs, each of
these technologies is linear in incentives at the margin.
When the nonnegativity constraints do not bind (which is the relevant
case under β ≥ 0), the
first-order conditions are
K−1a = β, M−1m = β,
yielding the unique interior best response a*(β) = Kβ, m*(β) = Mβ.
Because K and M are diagonal, this mapping is
componentwise: for each task i,
Thus, higher incentive weight on a task increases both true effort and
metric inflation on that task, with slopes governed respectively by
productivity κi and
manipulability μi.
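As a purely illustrative numerical check of this componentwise response, take hypothetical values κi = 2, μi = 1, and βi = 0.3; then
$$
a_i^*(\beta) = \kappa_i\beta_i = 0.6,
\qquad
m_i^*(\beta) = \mu_i\beta_i = 0.3,
$$
so one third of the induced increase in the expected metric, (κi + μi)βi = 0.9, is pure inflation.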
A useful sufficient statistic is the induced mean metric,
𝔼[x ∣ β] = a*(β) + m*(β) = (K + M)β,
which decomposes additively into a value-relevant component Kβ and a purely
distortionary component Mβ.
already highlights why contracting on x is not innocuous: any increase in
β mechanically increases
expected pay both through effort and through manipulation, but only the
former contributes to the platform’s objective.
Anticipating the agent’s best response, the platform chooses β to maximize expected outcome net
of expected payments:
UP(β) = 𝔼[y] − 𝔼[⟨β, x⟩] = ⟨θ*, a*(β)⟩ − ⟨β, 𝔼[x ∣ β]⟩.
Substituting the agent’s best response and the induced mean metric gives the concave quadratic program
$$
\max_{\beta\in B}\;\; U_P(\beta) \;=\; \langle \theta^*, K\beta\rangle \;-\; \beta^\top (K+M)\beta,
$$
where B is a feasible set of
contracts (e.g., box constraints), included to accommodate
implementability but not needed for the interior benchmark. The term
β⊤(K + M)β
is the expected payment bill induced by the agent’s response.
Intuitively, the platform faces an ``effective’’ marginal cost of
incentives that reflects not only how paying harder induces more effort
(via K) but also how it
induces more gaming (via M).
When B = ℝ+d
and the solution is interior, the first-order condition is
∇βUP(β) = Kθ* − 2(K + M)β = 0,
which yields the unique maximizer
$$
\beta^* \;=\; \tfrac12\,(K+M)^{-1}K\,\theta^*.
$$
Strict concavity follows from K + M ≻ 0, so β* is the unique
optimum in the unconstrained problem. With constraints, a convenient
interpretation is that the constrained optimum is the maximizer of the
same concave quadratic over B,
and in many common cases (e.g., B = [0, 1]d) it can
be viewed as a projection of the unconstrained β* onto B under an appropriate norm induced
by (K + M).
The diagonal structure delivers a particularly crisp expression.
Since (K + M)−1K
is diagonal with entries κi/(κi + μi),
we have
$$
\beta_i^* \;=\; \tfrac12\,\theta_i^*\cdot\frac{\kappa_i}{\kappa_i+\mu_i},
\qquad i=1,\dots,d.
$$
Relative to the no-manipulation benchmark $\beta_i=\tfrac12 \theta_i^*$, the platform
scales down the weight on task i by the factor κi/(κi + μi) ∈ (0, 1].
This is a precise, contract-theoretic version of the informal governance
maxim ``do not pay too much for what is easy to fake’’: when
manipulation is cheap (large μi), even modest
incentives generate substantial inflation, so the platform optimally
weakens pay-for-performance along that dimension.
This expression also clarifies what kinds of tasks remain highly incentivized. Holding value θi* fixed, a task that is very responsive to incentives in productive effort (large κi) justifies a larger weight, while a task that is highly manipulable (large μi) is downweighted. In the extremes, if μi → 0 then $\beta_i^*\to \tfrac12\theta_i^*$, whereas if μi → ∞ then βi* → 0 even when θi* > 0: the platform would rather forego incentives than purchase mostly gaming.
It is also informative to translate these weights into the induced
behaviors. Under the optimal contract, the platform elicits
$$
a_i^*
=
\kappa_i\beta_i^*
=
\tfrac12 \theta_i^*\cdot \frac{\kappa_i^2}{\kappa_i+\mu_i},
\qquad
m_i^*
=
\mu_i\beta_i^*
=
\tfrac12 \theta_i^*\cdot \frac{\kappa_i\mu_i}{\kappa_i+\mu_i}.
$$
Thus, even at the optimal contract, some manipulation is generically
present whenever μi > 0; the
platform mitigates gaming by attenuating incentives rather than
eliminating it.
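A minimal numerical sketch of the shrinkage rule and the induced behaviors (all parameter values are hypothetical; numpy assumed):

```python
import numpy as np

theta = np.array([1.0, 1.0, 0.5])   # marginal values theta*     (hypothetical)
kappa = np.array([2.0, 1.0, 1.0])   # productivities kappa_i     (hypothetical)
mu    = np.array([0.5, 4.0, 1.0])   # manipulabilities mu_i      (hypothetical)

# Optimal x-only weights: beta_i* = (1/2) * theta_i* * kappa_i / (kappa_i + mu_i)
beta_star = 0.5 * theta * kappa / (kappa + mu)

# Induced behavior at the optimum: effort and inflation both scale with beta
a_star = kappa * beta_star
m_star = mu * beta_star

print(np.round(beta_star, 3))   # the heavily manipulable task (mu = 4) is strongly shrunk
print(np.round(a_star, 3), np.round(m_star, 3))
```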
We can view the optimal contract as an ``effective-cost’’ reformulation. If the agent could only exert effort (no manipulation), the platform would choose β to balance marginal value Kθ* against a marginal payment cost proportional to K. Manipulation adds an additional payment leakage proportional to M. In this sense, M acts like an extra quadratic cost term borne by the principal, not because the principal pays for manipulation directly, but because the principal pays on a metric that manipulation inflates. The optimal rule is therefore a form of shrinkage: it scales down the first-best weight $\tfrac12\theta^*$ by a factor that depends only on the relative ease of productive versus distortive responses.
This benchmark has an immediate operational implication for platform design: when a metric is known to be manipulable, the platform should not respond by simply ``turning up’’ the weight to compensate for noise or dilution. Instead, the correct adjustment goes in the opposite direction, because the platform is purchasing a mixture of effort and inflation. Practitioners often articulate this informally as ``if you pay on clicks, you buy click fraud.’’ Our characterization makes that statement quantitative: with known (κi, μi), the exact tradeoff is pinned down by the shrinkage factor κi/(κi + μi).
Two limitations of this characterization are worth emphasizing because they motivate our subsequent focus on audits and learning. First, the characterization assumes the platform knows both task values θ* and manipulability M. In most environments, neither is directly observable: we rarely know which dimensions are easy to game or how strongly they map into latent value. Second, when the platform can condition pay on an audit signal that is less manipulable, it can do better than shrink incentives uniformly; it can instead change the agent’s marginal return to manipulation by changing which measurement determines pay. The next section formalizes this second force: audits reduce the effective manipulability of x and thereby allow the platform to increase incentive power while containing gaming.
The benchmark in the previous section makes clear why a manipulable metric forces the platform to attenuate incentives: when pay depends on x alone, the agent can increase expected compensation through either productive effort or metric inflation, and the platform must purchase a mixture of the two. We now formalize how even a simple audit technology relaxes this tradeoff. The key force is not that audits change the agent’s cost of manipulation, but that they change the agent’s marginal return to manipulation by altering which measurement determines compensation.
Suppose that in each round an audit occurs with probability p ∈ (0, 1]. When an audit occurs,
the platform observes an additional measurement x̃ = a + ε̃ that is
not affected by m. A simple and operationally natural contract is the audit-contingent switching rule
w = ⟨β, x⟩ ⋅ 1{no audit} + ⟨β, x̃⟩ ⋅ 1{audit}.
That is, the platform ``pays on what it sees routinely’’ in non-audited
rounds and switches to paying on the audit metric in audited rounds.
This rule has a stark implication for incentives. Conditional on (a, m), expected pay
is
𝔼[w ∣ a, m] = (1 − p) ⟨β, a + m⟩ + p ⟨β, a⟩,
so the agent’s marginal return to productive effort a remains β, while the marginal return to
manipulation m is scaled down
to (1 − p)β. Under
our quadratic costs, the interior first-order conditions become
K−1a = β, M−1m = (1 − p)β,
yielding the unique interior best response a*(β; p) = Kβ, m*(β; p) = (1 − p)Mβ.
Relative to the x-only benchmark, audits do not weaken effort incentives at all, but they
directly reduce manipulation incentives by the factor 1 − p. This separation is the
central advantage of audit-contingent pay: it creates a wedge between
the agent’s return to productive actions and to gaming.
More generally, what matters for behavior under any linear payment
rule are the effective coefficients on a and on m in expected compensation. To see
this, let the payment be any linear function of the observed signals in
a given round:
w = ⟨β(x), x⟩ + ⟨β(x̃), x̃⟩,
with the understanding that β(x̃) = 0 when
x̃ is missing. Since x = a + m + ε
and x̃ = a + ε̃,
expected pay conditional on (a, m) depends on (a, m) only through
𝔼[w ∣ a, m] = ⟨β(x) + β(x̃), a⟩ + ⟨β(x), m⟩.
Thus, for a risk-neutral agent with separable quadratic costs, the
entire signal structure collapses to two vectors:
γa := β(x) + β(x̃) (incentive
on true
effort), γm := β(x) (incentive
on manipulation).
The agent then chooses
a* = Kγa, m* = Mγm,
subject to nonnegativity. Under audit-contingent observation, the same
logic applies in expectation: the only difference is that the
coefficient on m is multiplied
by the probability that x is
actually payoff-relevant. The switching contract is precisely the choice γa = β
and γm = (1 − p)β.
This ``sufficient statistic’’ perspective will be useful below, because it makes clear what audits buy: they allow the platform to set γa high while holding γm low by shifting weight from x to x̃ in audited rounds.
Anticipating this best response, the platform’s expected utility under the switching contract is
UP(β; p) = ⟨θ*, a*(β; p)⟩ − 𝔼[w] = ⟨θ*, Kβ⟩ − (β⊤Kβ + (1 − p)β⊤Mβ),
where the payment term reflects that the platform pays for effort in
either state, but pays for manipulation only in non-audited states. The
objective is strictly concave in β because K + (1 − p)M ≻ 0,
giving the unique unconstrained maximizer
$$
\beta^*(p) \;=\; \tfrac12\,\bigl(K+(1-p)M\bigr)^{-1}K\,\theta^*.
$$
In the diagonal case this simplifies task-by-task to
$$
\beta_i^*(p) \;=\; \tfrac12\,\theta_i^*\cdot\frac{\kappa_i}{\kappa_i+(1-p)\mu_i}.
$$
Comparing to the x-only benchmark, audits act exactly as if they replaced μi with an effective manipulability (1 − p)μi.
The induced behaviors inherit this scaling:
ai*(p) = κiβi*(p), mi*(p) = (1 − p)μiβi*(p).
As p increases, equilibrium manipulation falls: although the platform raises β, the direct effect of the (1 − p) factor dominates, since $m_i^*(p)=\tfrac12\theta_i^*\,\kappa_i(1-p)\mu_i/(\kappa_i+(1-p)\mu_i)$ is decreasing in p. At the same time, effort increases because the platform can safely increase incentive power.
The task-by-task expression delivers clean monotonicity. For each i with θi* > 0,
βi*(p)
is increasing in p and
converges to the no-manipulation benchmark:
$$
\lim_{p\to 1}\beta_i^*(p)=\tfrac12 \theta_i^*.
$$
Conversely, as p ↓ 0 we
recover the x-only benchmark
$\beta_i^*(0)=\tfrac12\theta_i^*\cdot
\kappa_i/(\kappa_i+\mu_i)$. Operationally, increasing the audit
rate permits stronger pay-for-performance precisely on those dimensions
where gaming would otherwise force severe shrinkage. This clarifies a
common platform-design intuition: auditing is most valuable when it
targets metrics that are both high-stakes (large θi*)
and easy to fake (large μi).
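The comparative static in p is easy to verify numerically; the sketch below (single task, hypothetical primitives) shows the optimal weight rising toward the no-manipulation benchmark ½θ* while equilibrium manipulation falls as the audit rate increases.

```python
import numpy as np

theta, kappa, mu = 1.0, 1.0, 4.0      # hypothetical single-task primitives
for p in (0.0, 0.25, 0.5, 0.75, 1.0):
    beta_p = 0.5 * theta * kappa / (kappa + (1 - p) * mu)   # beta_i*(p)
    a_p = kappa * beta_p                                    # induced true effort
    m_p = (1 - p) * mu * beta_p                             # induced manipulation
    print(f"p={p:.2f}  beta*={beta_p:.3f}  a*={a_p:.3f}  m*={m_p:.3f}")
```

In this example β* climbs from 0.10 at p = 0 to 0.50 at p = 1, while equilibrium manipulation falls from 0.40 to 0.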
One might ask whether the platform should ever pay on x even when x̃ is observed. Within the class of risk-neutral linear contracts in our environment, the answer is essentially no: conditional on wanting a given incentive γa on true effort in audited rounds, placing any weight on x in those rounds increases γm one-for-one and therefore increases manipulation without increasing value. Formally, holding fixed the total weight on a in audited rounds, shifting an ε amount of weight from x to x̃ leaves a* unchanged (since γa is unchanged) but strictly decreases m* (since γm falls), improving the platform’s objective. This logic is the reason the switching rule emerges as the canonical contract: it uses x only when it is the only available measurement of effort.
The clean scaling in relies on the audit metric being immune to
manipulation. In practice, audits may be imperfect: the audited
measurement could still be partially gameable, or could incorporate some
component correlated with m
(for example, if auditors observe traces the agent can influence). A
parsimonious way to capture this is
x̃ = a + ρm + ε̃, ρ ∈ [0, 1),
where ρ indexes the degree of
residual manipulability of the audit. Under the same audit-contingent
rule , the agent’s expected marginal benefit of manipulation becomes
((1 − p) + pρ)β,
hence
a*(β; p) = Kβ, m*(β; p) = ((1 − p) + pρ)Mβ.
The platform’s optimal weight is correspondingly
$$
\beta^*(p,\rho) \;=\; \tfrac12\,\bigl(K+((1-p)+p\rho)M\bigr)^{-1}K\,\theta^*.
$$
This nests our baseline (ρ = 0) and makes transparent how
audit quality and audit frequency are substitutes: increasing p or decreasing ρ both reduce effective
manipulability. When ρ is
close to 1, an audit does little to
deter gaming, and approaches the x-only solution even if p is large.
In our baseline with risk neutrality, the variances of ε and ε̃ do not affect the one-shot optimal contract because they wash out of expected pay and expected output. Noise matters, however, for two reasons that will be central in Section 6. First, it governs statistical precision: the within-round difference x − x̃ equals m + (ε − ε̃), so noisier audits reduce the signal-to-noise ratio for learning manipulation. Second, if one imposes implementability constraints that are sensitive to payment volatility (e.g., implicit risk constraints, limited liability interacting with realized payments, or explicit caps in B motivated by predictability), then the platform may prefer to condition more heavily on the less noisy measurement when both are available.
A convenient way to represent this latter possibility is to allow, in
audited rounds, a linear combination
waudit = ⟨β(x), x⟩ + ⟨β(x̃), x̃⟩,
while in non-audited rounds wno = ⟨β(0), x⟩.
The agent’s incentives are still summarized by the effective
coefficients on a and m:
γa = (1 − p)β(0) + p(β(x) + β(x̃)), γm = (1 − p)β(0) + pβ(x) (or
(1 − p)β(0) + p(β(x) + ρβ(x̃))
under residual manipulability).
Thus, once we fix a desired γa, shifting
audited-round weight from x to
x̃ reduces γm but may (in
constrained environments) increase payment volatility if x̃ is noisier. In that sense, the
relative noise levels of x and
x̃ determine how aggressively
the platform uses the audit metric when both enter pay.
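A small helper (the function name is ours, purely illustrative) computes these effective coefficients for any linear rule, including the residual-manipulability case:

```python
import numpy as np

def effective_coefficients(beta_no_audit, beta_x_audit, beta_xt_audit, p, rho=0.0):
    """Effective incentives gamma_a (on true effort) and gamma_m (on manipulation)
    for a rule paying beta_no_audit on x when unaudited, and beta_x_audit on x plus
    beta_xt_audit on x_tilde when audited; rho is the audit's residual manipulability."""
    gamma_a = (1 - p) * beta_no_audit + p * (beta_x_audit + beta_xt_audit)
    gamma_m = (1 - p) * beta_no_audit + p * (beta_x_audit + rho * beta_xt_audit)
    return gamma_a, gamma_m

# The switching rule: weight beta on x when unaudited, beta on x_tilde when audited
beta = np.array([0.3, 0.5])
g_a, g_m = effective_coefficients(beta, 0 * beta, beta, p=0.4)
print(g_a, g_m)   # gamma_a = beta and gamma_m = (1 - p) * beta, as in the text
```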
Audits therefore play two conceptually distinct roles in our environment. Contractually, they reduce the effective manipulability that the platform must internalize, permitting stronger incentives as in . Econometrically, audited rounds generate a natural sufficient statistic for gaming, the difference x − x̃, which isolates manipulation up to mean-zero noise. In the next section we exploit this to identify and estimate the manipulability matrix M and the task values θ* from observed data, and then to design a learning policy that approaches the audit-robust benchmark.
We now turn to the econometric counterpart of the contractual logic in Section . The platform observes, in each round t, the posted contract βt and the realized signals xt (always) and x̃t (only if audited), as well as the downstream outcome yt. Our objective is to recover two structural objects that discipline optimal incentives: the manipulability matrix M (which governs how gaming responds to incentives) and the task-value vector θ* (which governs how true effort translates into value). With these in hand, the platform can compute the audit-robust benchmark contract $\beta^*(p)=\tfrac12\bigl(K+(1-p)M\bigr)^{-1}K\theta^*$.
Throughout this section we maintain the behavioral implications derived above: under the audit-contingent rule, the agent best-responds with at = Kβt and mt = (1 − p)Mβt. Thus, on any audited round,
xt − x̃t = mt + (εt − ε̃t) = (1 − p)Mβt + (εt − ε̃t),
and the outcome satisfies
yt = ⟨θ*, at⟩ + ηt = ⟨θ*, Kβt⟩ + ηt.
The key idea is that audits generate a direct ``difference signal’’ that
isolates manipulation up to mean-zero noise, and once we subtract
estimated manipulation from x,
we obtain a repeated-measurement structure (a corrected x and the audit metric x̃) that supports IV/GMM estimation
of θ*.
Let 𝒜 ⊆ {1, …, T} denote
the set of audited rounds, with nA := |𝒜|.
Define the audited difference vector
Δt := xt − x̃t, t ∈ 𝒜.
Under the audit-contingent best response we have the linear regression model
Δt = (1 − p)Mβt + ζt,
where ζt
is mean-zero subgaussian with covariance proxy inherited from (εt, ε̃t).
In the diagonal case M = diag(μ1, …, μd),
decouples by task:
Δt, i = (1 − p)μiβt, i + ζt, i, i = 1, …, d.
A convenient estimator is coordinate-wise OLS on the audited subsample,
$$
\widehat{\mu}_i
\;=\;
\frac{1}{1-p}\,
\frac{\sum_{t\in\mathcal{A}} \beta_{t,i}\,\Delta_{t,i}}{\sum_{t\in\mathcal{A}} \beta_{t,i}^2},
$$
provided ∑t ∈ 𝒜βt, i2 > 0
for each i we seek to
estimate. (When a task is never incentivized in audited rounds, its
manipulability is not identified from these data; this is an
economically natural limitation rather than a technical artifact.)
More generally, without diagonality one can estimate M via multivariate regression of
Δt on
βt,
i.e.,
$$
\widehat{M}
=
\frac{1}{1-p}
\left(\sum_{t\in\mathcal{A}} \Delta_t \beta_t^\top\right)
\left(\sum_{t\in\mathcal{A}} \beta_t\beta_t^\top\right)^{-1},
$$
whenever the audited design matrix is full rank. We focus on the
diagonal case because it matches the separable-cost foundation and
yields transparent finite-sample statements.
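A minimal sketch of these regressions on synthetic audited data (hypothetical primitives; numpy assumed) implements both the coordinate-wise and the multivariate versions:

```python
import numpy as np

rng = np.random.default_rng(1)
d, p, n_audit = 3, 0.3, 500
mu_true = np.array([0.5, 4.0, 1.0])                    # hypothetical manipulabilities

beta = rng.uniform(0.0, 1.0, size=(n_audit, d))        # varied contracts on audited rounds
delta = (1 - p) * beta * mu_true + rng.normal(scale=0.2, size=(n_audit, d))   # x - x_tilde

# Coordinate-wise OLS through the origin, rescaled by 1/(1 - p)
mu_hat = (delta * beta).sum(axis=0) / (beta ** 2).sum(axis=0) / (1 - p)
print(np.round(mu_hat, 3))

# Multivariate version for non-diagonal M: M_hat = (1/(1-p)) (sum delta beta') (sum beta beta')^{-1}
M_hat = (delta.T @ beta) @ np.linalg.inv(beta.T @ beta) / (1 - p)
print(np.round(np.diag(M_hat), 3))
```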
To make the preceding estimators operational in a learning environment, we need quantitative conditions under which M̂ concentrates around M. The only substantive requirement is that the platform induce enough variation in β on audited rounds.
There exists λ0 > 0 such that, with
high probability (or deterministically under a designed exploration
schedule),
$$
\frac{1}{n_A}\sum_{t\in\mathcal{A}} \beta_t\beta_t^\top \;\succeq\; \lambda_0 I_d.
$$
In the diagonal case this implies ∑t ∈ 𝒜βt, i2 ≥ nAλ0
for all i.
Under subgaussian noise and bounded contracts (e.g. βt ∈ [0, 1]d),
standard self-normalized concentration yields the following generic
implication of the diversity condition: there is a universal constant C > 0 such that, for any δ ∈ (0, 1), with probability at least 1 − δ,
$$
\max_{1\le i\le d}\,\bigl|\widehat{\mu}_i-\mu_i\bigr|
\;\le\;
\frac{C}{1-p}\sqrt{\frac{\log(d/\delta)}{n_A\,\lambda_0}},
$$
where the constant absorbs the subgaussian scale of ζt, i.
Two features are worth emphasizing. First, the 1/(1 − p) factor is structural: when
audits are rare, manipulation is rarely payoff-relevant and thus harder
to learn from differences. Second, the dependence on λ0 formalizes the
intuition that learning manipulability requires contract variation, not
merely large samples.
Given M̂, we form a
corrected version of the manipulable metric by subtracting the predicted
manipulation induced by the posted contract:
xtc := xt − (1 − p)M̂βt.
On audited rounds, substituting xt = at + mt + εt and mt = (1 − p)Mβt gives
xtc = at + εt + (1 − p)(M − M̂)βt.
When M̂ = M, we
recover the classical repeated-measurement structure:
xtc = at + εt, x̃t = at + ε̃t,
with conditionally independent noises. This structure is precisely what
makes audits valuable for identifying θ*: x̃t is correlated
with at
(the endogenous regressor we care about) but orthogonal to εt, so it serves
as a natural instrument for xtc
in the outcome equation.
Formally, consider the moment condition on audited rounds
𝔼[x̃t(yt − ⟨θ, xtc⟩)] = 0.
Under θ = θ* and M̂ = M, we have yt = ⟨θ*, at⟩ + ηt
and xtc = at + εt,
so
yt − ⟨θ*, xtc⟩ = ηt − ⟨θ*, εt⟩,
which is mean-zero and independent of x̃t conditional
on at.
Thus the moment condition holds at the truth. Identification requires that the corresponding Jacobian be
nonsingular, i.e. that 𝔼[x̃txtc⊤]
be invertible. Under our behavioral model,
𝔼[x̃txtc⊤] = 𝔼[atat⊤] = 𝔼[Kβtβt⊤K],
so invertibility reduces again to contract diversity.
Let X̃ ∈ ℝnA × d
be the matrix stacking x̃t⊤
over t ∈ 𝒜, let Xc ∈ ℝnA × d
stack (xtc)⊤,
and let Y ∈ ℝnA
stack yt.
The sample analog of the moment condition yields the (just-identified) IV estimator
θ̂ = (X̃⊤Xc)−1X̃⊤Y,
whenever X̃⊤Xc
is invertible. This is also the exactly identified GMM estimator with
instrument x̃ and regressor
xc.
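Continuing the synthetic sketch above, the correction and the just-identified IV step use only observed quantities (hypothetical primitives; the agent's response follows the model's best-response formulas):

```python
import numpy as np

rng = np.random.default_rng(2)
d, p, n_audit = 3, 0.3, 2000
theta_true = np.array([1.0, 1.0, 0.5])                 # hypothetical task values
kappa = np.array([2.0, 1.0, 1.0])                      # hypothetical productivities
mu = np.array([0.5, 4.0, 1.0])                         # hypothetical manipulabilities

beta = rng.uniform(0.0, 1.0, size=(n_audit, d))        # audited-round contracts
a = beta * kappa                                       # best-response effort
m = (1 - p) * beta * mu                                # best-response manipulation
x  = a + m + rng.normal(scale=0.2, size=(n_audit, d))  # manipulable metric
xt = a + rng.normal(scale=0.2, size=(n_audit, d))      # audit metric
y  = a @ theta_true + rng.normal(scale=0.2, size=n_audit)

# Step 1: estimate M from audited differences (diagonal case)
mu_hat = ((x - xt) * beta).sum(axis=0) / (beta ** 2).sum(axis=0) / (1 - p)

# Step 2: corrected metric x^c = x - (1 - p) * M_hat * beta, then IV with instrument x_tilde
xc = x - (1 - p) * beta * mu_hat
theta_hat = np.linalg.solve(xt.T @ xc, xt.T @ y)
print(np.round(theta_hat, 3))
```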
To understand how audits translate into learnability of θ*, it is useful to
decompose the estimation error into (i) sampling noise from (η, ε, ε̃) and (ii)
first-stage error from estimating M. Write the structural outcome
equation on audited rounds as
Y = Xcθ* + u, u := η − Eθ* − b,
where E stacks εt⊤
and the ``bias’’ term from imperfect correction is
bt := ⟨θ*, (1 − p)(M − M̂)βt⟩.
Then
θ̂ − θ* = (X̃⊤Xc)−1X̃⊤u.
This identity makes the two necessary ingredients transparent. First, the
matrix X̃⊤Xc
must be well-conditioned; this is the same diversity requirement that
appeared in estimating M, now
expressed in terms of realized measurements. Second, the projected noise
X̃⊤u must
be controlled; this is where subgaussianity delivers concentration and
where M̂ enters through b.
Under an event on which σmin(X̃⊤Xc)
is bounded below (e.g. by a constant multiple of nAλ0
up to noise fluctuations), a typical high-probability bound implied by the error decomposition takes the schematic form
$$
\|\widehat{\theta}-\theta^*\|_2
\;\le\;
\frac{C}{\sigma_{\min}(\widetilde{X}^\top X^c)}
\left(\sqrt{n_A\,d\,\log(d/\delta)} \;+\; n_A\,\|\widehat{M}-M\|_{\mathrm{op}}\right),
$$
where C depends on subgaussian
scales. While this bound suppresses constants, its economics are clear: learning
θ* is fast when (i)
audited samples are plentiful (nA large), (ii)
incentives vary enough to make the effective ``first stage’’ strong
(large σmin), and
(iii) manipulability is estimated precisely (small ∥M̂ − M∥). Combining with
standard matrix concentration for σmin(X̃⊤Xc)
yields the familiar $\widetilde{O}(1/\sqrt{n_A})$ rate, with an
additional term reflecting the two-stage structure.
It is worth acknowledging the limits of the approach. First, identification of M hinges on observing both x and x̃ in the same round; without audited repeats, manipulation is observationally equivalent to effort in x, and only the reduced-form object K + M is learnable from x alone. Second, identification of θ* requires that the platform generate sufficiently rich variation in incentives. In the static model this is irrelevant because θ* is primitive; in the dynamic model it is a policy choice, and it is precisely why we will impose (and later engineer) diversity conditions such as .
At the same time, the approach is operationally attractive: the difference x − x̃ is a direct measure of gaming, and the IV moment uses only observed variables and a transparent correction . In particular, the platform need not ever observe true effort a; it only needs occasional audited repeats to create a second measurement channel.
The estimators of M and θ* developed above are building blocks for an adaptive contracting policy. The remaining step is algorithmic: how to choose βt over time so that (i) pay is near-optimal given current estimates, while (ii) the induced {βt} satisfy the diversity conditions needed for consistent estimation. In the next section we show that an epoch-based greedy policy can reconcile these objectives and achieve polylogarithmic regret when audits and diversity are strong enough, and we characterize when explicit exploration becomes unavoidable.
The identification results in the previous section suggest a natural operational message: once audits provide enough repeated measurements to estimate (M, θ*), the platform can simply plug the estimates into the optimal-contract formula and behave as if the primitives were known. The subtlety, of course, is that the posted contracts are themselves the source of identification: if we are always ``too greedy’’ too early, we may fail to generate the variation required for the audited diversity condition. This section formalizes a simple approach that reconciles these two objectives.
We evaluate an adaptive policy by (expected) regret relative to the
audit-contingent optimum. Let
UP(β; p) := 𝔼[y − payment | β, p]
denote the principal’s one-period expected utility induced by posting
β under audit probability
p and the contingent payment
rule from Section . Let β*(p) be the
corresponding optimal weight when (M, θ*) are
known. Given a (possibly history-dependent) policy producing {βt}t = 1T,
we define regret
$$
\mathrm{Reg}(T)
:=
\sum_{t=1}^T \Big(U_P(\beta^*(p);p)-U_P(\beta_t;p)\Big).
$$
Because the induced principal objective is a strictly concave quadratic
in β (conditional on (M, θ*)), regret
can be related directly to how far βt lies from
β*(p), and
hence to parameter estimation error.
We consider a simple ``estimate-then-commit within an epoch’’ design. Time is partitioned into epochs $e=1,2,\dots$, with lengths $L_e$ (for concreteness one can take $L_e=2^{e-1}$, so the horizon grows geometrically). At the start of epoch $e$, the platform computes estimates $(\widehat{M}_e,\widehat{\theta}_e)$ using \emph{all audited observations collected strictly before the epoch}, via \eqref{eq:mu-hat} and \eqref{eq:theta-hat}. It then posts a constant contract throughout the epoch:
\begin{equation}\label{eq:epoch-greedy}
\beta_t \equiv \widehat{\beta}_e := \Pi_{B}\!\left(\beta^*(p;\widehat{\theta}_e,\widehat{M}_e)\right),
\qquad t \in \text{epoch }e,
\end{equation}
where $\Pi_B$ denotes Euclidean projection onto the feasible set $B$ and
\[
\beta^*(p;\theta,M) = \tfrac12\bigl(K+(1-p)M\bigr)^{-1}K\theta
\]
is the model-implied optimal weight map. Intuitively, \eqref{eq:epoch-greedy} is ``greedy’’
in that within each epoch we choose the contract that would be optimal
if current estimates were correct, but ``cautious’’ in that we only
update at epoch boundaries, so that estimation error does not feed back
too aggressively through continuously changing policies.
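A schematic implementation of the epoch design follows, with placeholder hooks (estimate_M_theta, post_contract_and_observe) standing in for the estimators \eqref{eq:mu-hat}–\eqref{eq:theta-hat} and for the platform environment; the box feasible set and the starting contract are assumptions made only for this sketch.
\begin{verbatim}
import numpy as np

def beta_star(p, theta_hat, M_hat, K):
    """Model-implied optimal weight: 0.5 * (K + (1 - p) M)^{-1} K theta."""
    return 0.5 * np.linalg.solve(K + (1 - p) * M_hat, K @ theta_hat)

def project_to_box(beta, lo=0.0, hi=1.0):
    """Euclidean projection onto an assumed box feasible set B = [lo, hi]^d."""
    return np.clip(beta, lo, hi)

def epoch_greedy(T, p, K, d, estimate_M_theta, post_contract_and_observe):
    history, t, e = [], 0, 1
    beta_e = np.full(d, 0.5)               # conservative initial contract (assumed)
    while t < T:
        L_e = 2 ** (e - 1)                 # geometrically growing epoch lengths
        if history:                        # re-estimate only at epoch boundaries,
            M_hat, theta_hat = estimate_M_theta(history, p)  # using audited data so far
            beta_e = project_to_box(beta_star(p, theta_hat, M_hat, K))
        for _ in range(min(L_e, T - t)):   # hold the contract fixed within the epoch
            history.append(post_contract_and_observe(beta_e))
            t += 1
        e += 1
    return history
\end{verbatim}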
Two frictions motivate the epoch structure. First, the IV/GMM estimator depends on the realized matrix X̃⊤Xc; updating β too frequently can make the effective first stage highly nonstationary, complicating concentration arguments for σmin(X̃⊤Xc). Second, even if one is willing to deal with time-varying designs, the correction step xc = x − (1 − p)M̂β couples behavior to estimation error. Freezing β within an epoch makes it straightforward to control this coupling and to express regret in terms of the estimation accuracy at epoch start.
A key technical input is that the mapping (θ, M) ↦ β*(p; θ, M)
is Lipschitz on bounded domains. In the diagonal case (our baseline),
this is immediate coordinate-wise:
$$
\beta_i^*(p;\theta,M)
=
\tfrac12 \theta_i \cdot \frac{\kappa_i}{\kappa_i+(1-p)\mu_i},
$$
so perturbations in θi translate
linearly into perturbations in βi, while
perturbations in μi translate
with sensitivity controlled by (κi + (1 − p)μi)−2.
Thus, whenever κi is bounded
away from zero and (μi, θi)
lie in bounded sets, we can bound
∥β̂e − β*(p)∥2 ≤ Cβ(∥θ̂e − θ*∥2+∥M̂e − M∥op),
for a constant Cβ depending on
(K, p) and the
parameter bounds. Since the principal objective is a strongly concave
quadratic in β, one also
obtains a quadratic regret bound of the form
UP(β*(p); p) − UP(β̂e; p) ≤ CU ∥β̂e − β*(p)∥22,
so controlling regret reduces to controlling the estimation errors of
M and θ* at epoch starts.
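For completeness, the quadratic utility-error bound follows in one line from the exact second-order expansion of the quadratic objective around its maximizer, assuming β*(p) is interior to B so that the first-order condition holds with equality:
$$
U_P(\beta^*(p);p)-U_P(\beta;p)
=\tfrac12\,(\beta-\beta^*(p))^\top A\,(\beta-\beta^*(p))
\;\le\;\tfrac{\lambda_{\max}(A)}{2}\,\|\beta-\beta^*(p)\|_2^2,
\qquad A:=-\nabla^2_{\beta}U_P\succ 0,
$$
so one may take CU = λmax(A)/2.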
Suppose contracts are uniformly bounded (βt ∈ B ⊆ [0, 1]d),
noises are subgaussian, and audits occur i.i.d. with probability p > 0. Assume further that the
policy induces audited diversity in the sense that for each epoch e, the audited design accumulated up
to epoch start satisfies a minimum-eigenvalue (diversity) condition with parameter λ0 > 0 (equivalently,
σmin(X̃⊤Xc)
grows linearly in the number of audited rounds). Then combining the
concentration bound for M̂ with the IV/GMM error bounds sketched above yields
$$
\|\widehat{\theta}_e-\theta^*\|_2
=
\widetilde{\mathcal{O}}\!\left(\sqrt{\frac{d}{n_{A,e}\lambda_0}}\right),
\qquad
\|\widehat{M}_e-M\|_{\mathrm{op}}
=
\widetilde{\mathcal{O}}\!\left(\frac{1}{1-p}\sqrt{\frac{d}{n_{A,e}\lambda_0}}\right),
$$
where nA, e
is the number of audited rounds before epoch e and $\widetilde{\mathcal{O}}(\cdot)$ suppresses
logarithmic factors in (d, 1/δ, T).
Plugging these into the contract-error and utility-error relationships
and summing across geometrically growing epochs yields polylogarithmic
regret in the horizon T,
with an additional dependence on p that is economically intuitive:
learning slows when audits are rare because nA, e ≈ p ⋅ (elapsed
time) and because manipulation estimation inherits the structural
1/(1 − p) amplification. The
content of this guarantee is that, under sufficient audited diversity, we do not need
heavy-handed exploration; the platform can learn ``in the background’’
while largely posting near-optimal contracts.
The preceding guarantee makes explicit what is often implicit in learning-to-contract environments: some mechanism must ensure that the posted contracts span the task space often enough to identify both manipulability and task values. There are two conceptually distinct sources of such variation.
First, diversity can come from institutional or product constraints. For example, if B restricts the platform to choose among a menu of contracts that already vary across tasks (e.g. compliance requiring weight on each dimension, or operational needs forcing rotation across objectives), then the diversity condition can hold without any deliberate randomization.
Second, diversity can arise because the optimal contract itself is interior and sensitive to the estimates: as θ̂e fluctuates around θ* early on, the greedy contract naturally varies and can excite each coordinate. In settings where θ* has broad support and K is well-conditioned, this endogenous mechanism can be sufficient.
The same logic also clarifies when a purely greedy approach can fail. If β*(p) (or its early estimates) place zero or near-zero weight on some coordinates, then those coordinates may never be incentivized in audited rounds. In the diagonal case, this means $\sum_{t\in\mathcal{A}}\beta_{t,i}^2$ can remain small, preventing identification of μi and weakening the IV matrix for θi*. Economically, the platform is then caught in a self-confirming loop: it does not pay for a task because it believes it is low-value or too gameable, and it never acquires the data that could overturn that belief.
This is not merely a technical pathology; it is the dynamic counterpart of multitask distortion. A stylized example makes the point. With d = 2, suppose early noise realizations induce θ̂1 to be high and θ̂2 to be low. The greedy optimizer sets β̂e, 2 ≈ 0, so audited differences provide essentially no information about μ2, and the IV moment provides essentially no information about θ2*. If in truth θ2* is large and μ2 is modest, the platform will persistently under-incentivize task 2, generating regret that is linear in T rather than logarithmic. In this sense, audits alone do not eliminate the need for exploration: audits create the opportunity to identify manipulation, but they do not force the contract variation that identification requires.
A robust modification is to mix the greedy contract with a small
amount of designed perturbation. One convenient implementation is: in
each epoch e, post
$$
\beta_t =
\begin{cases}
\widehat{\beta}_e, & \text{with probability } 1-\epsilon_e,\\
\beta_t^{\mathrm{exp}}, & \text{with probability } \epsilon_e,
\end{cases}
\qquad t\in \text{epoch }e,
$$
where $\beta_t^{\mathrm{exp}}$
is drawn from a distribution supported on B with $\mathbb{E}\bigl[\beta_t^{\mathrm{exp}}(\beta_t^{\mathrm{exp}})^\top\bigr] \succeq \lambda_{\mathrm{exp}} I_d$.
Choosing ϵe to decay
slowly (e.g. proportional to 1/Le) makes the
cumulative exploration cost $\sum_e \epsilon_e L_e$
logarithmic, while guaranteeing that the audited design matrix
accumulates mass in all directions and hence that the diversity condition holds with high
probability. The economic interpretation is straightforward: the
platform pays a small short-run cost to ``probe’’ tasks whose value or
manipulability is uncertain, thereby preventing lock-in to a distorted
objective.
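A minimal sketch of the mixing rule, assuming a uniform exploration draw on B = [0, 1]d (which satisfies the second-moment condition with λexp = 1/12) and ϵe proportional to 1/Le; the constants are illustrative only.
\begin{verbatim}
import numpy as np

def contract_for_round(rng, beta_hat_e, epsilon_e, d):
    """Post the greedy contract, or a designed perturbation with prob. epsilon_e."""
    if rng.random() < epsilon_e:
        return rng.uniform(0.0, 1.0, size=d)   # beta_exp: E[b b^T] >= (1/12) I_d
    return beta_hat_e

rng, d, c = np.random.default_rng(1), 3, 4.0   # c is an assumed exploration constant
design = np.zeros((d, d))
for e in range(1, 11):
    L_e = 2 ** (e - 1)
    eps_e = min(1.0, c / L_e)                  # epsilon_e ~ 1/L_e
    for _ in range(L_e):
        b = contract_for_round(rng, np.full(d, 0.4), eps_e, d)
        design += np.outer(b, b)               # accumulated design (proxy for audited design)
print(np.linalg.eigvalsh(design)[0])           # smallest eigenvalue grows with the number of epochs
\end{verbatim}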
Finally, we emphasize the boundary cases. When p = 0 (no audits), the platform never observes x̃ and cannot form the difference signal that identifies M; manipulation and effort are observationally equivalent in x, and the best one can hope to learn from (x, y) is a reduced-form mapping from incentives to outcomes. Even when p > 0 but small, the effective audited sample size nA ≈ pT can make learning slow; in practice this means that the benefits of a sophisticated plug-in policy may only materialize when audits are not merely possible but sufficiently frequent. More broadly, when the environment or the feasible contract set makes diversity hard to generate, explicit exploration is not a nuisance but a necessity: without it, the platform may rationally choose contracts that minimize gaming in the short run while permanently sacrificing the information needed to design high-powered incentives safely in the long run.
Our baseline analysis imposes two pieces of structure that are convenient but not essential: (i) manipulation enters additively and independently across tasks (diagonal M), and (ii) the agent faces exactly quadratic costs and interior choices. We now discuss several extensions that broaden the model’s empirical and institutional scope while preserving the same economic tradeoff: the platform wants to price productive effort, but any component of the metric that can be inflated without creating value acts like an additional effective cost of incentives.
In many platform settings, gaming is not task-by-task. For instance,
a seller can inflate multiple engagement metrics simultaneously by
purchasing traffic, or a driver can engage in behaviors that jointly
affect acceptance rate and cancellation rate. This motivates allowing
manipulation to be correlated across dimensions. Formally, let $g(m)=\tfrac12 m^\top M^{-1}m$ with a full
positive definite matrix M ≻ 0
(not necessarily diagonal). With linear pay on the manipulable metric in
non-audited rounds, the agent’s manipulation best response remains
linear:
m*(β; p) = (1 − p)Mβ,
and effort remains a*(β; p) = Kβ
under the same quadratic c(a). The platform’s
optimal contract therefore remains
$$
\beta^*(p)=\tfrac12\bigl(K+(1-p)M\bigr)^{-1}K\theta^*,
$$
but the comparative statics become genuinely cross-task: increasing manipulability in one
direction shrinks weights in other directions because M rotates incentives through its
off-diagonal entries. Economically, this captures the idea that paying
harder on a ``clean’’ metric can still induce gaming if gaming
technologies spill over.
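A small numerical illustration of this cross-task effect, with assumed two-task parameter values: adding an off-diagonal spillover to M lowers the optimal weight on both tasks, not just the more manipulable one.
\begin{verbatim}
import numpy as np

def beta_star(p, theta, M, K):
    return 0.5 * np.linalg.solve(K + (1 - p) * M, K @ theta)

K, theta, p = np.eye(2), np.array([1.0, 1.0]), 0.2       # illustrative values
M_indep = np.array([[0.5, 0.0], [0.0, 1.0]])
M_corr  = np.array([[0.5, 0.6], [0.6, 1.0]])             # gaming spills across tasks

print(beta_star(p, theta, M_indep, K))   # ~[0.36, 0.28]: task-by-task shrinkage
print(beta_star(p, theta, M_corr,  K))   # ~[0.29, 0.20]: both weights fall further
\end{verbatim}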
Identification also generalizes directly. In audited rounds we still
observe
𝔼[x − x̃ ∣ β] = m*(β; p) = (1 − p)Mβ,
so M is identified from a
multivariate regression of x − x̃ on β provided the audited design has
full rank, i.e. 𝔼[ββ⊤] is
nonsingular. In high dimensions, however, estimating a dense M can be data-hungry: it has d(d + 1)/2 free parameters.
A practically relevant refinement is that manipulation may be low-dimensional, even if
the metric is high-dimensional. If M is approximately low rank (or,
more generally, has rapidly decaying eigenvalues), one can replace the
plain regression with a regularized estimator (e.g. nuclear-norm
penalization) and obtain meaningful recovery with far fewer audited
samples. Operationally, this corresponds to a setting where ``there are
only a few ways to cheat,’’ and audits can quickly learn those
directions and shrink incentives accordingly.
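As a sketch of the low-rank refinement, one can soft-threshold the singular values of the ordinary least-squares estimate of (1 − p)M; this one-step proximal heuristic stands in for a full nuclear-norm-penalized regression, and the threshold tau is an assumed tuning parameter.
\begin{verbatim}
import numpy as np

def lowrank_M_hat(beta_audited, gap_audited, p, tau):
    """OLS of (x - x_tilde) on beta, followed by singular-value soft-thresholding."""
    W = np.linalg.lstsq(beta_audited, gap_audited, rcond=None)[0]  # ~ ((1-p) M)^T
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    W_shrunk = (U * np.maximum(s - tau, 0.0)) @ Vt                 # drop weak directions
    M_hat = W_shrunk.T / (1 - p)
    return 0.5 * (M_hat + M_hat.T)                                 # symmetrize (M is a cost matrix)
\end{verbatim}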
A second relaxation recognizes that agents may be able to manipulate
only some metrics, or that manipulation may affect metrics through a
known technological mapping. One convenient formulation is
x = a + Hm + ε, x̃ = a + ε̃,
where m ∈ ℝ+k
is a lower-dimensional manipulation action and H ∈ ℝd × k
maps manipulation into observed metrics. Quadratic costs $g(m)=\tfrac12 m^\top M^{-1}m$ then imply
(under interiority) the best response
m*(β; p) = (1 − p)MH⊤β, a*(β; p) = Kβ,
so the manipulable component of the metric is (1 − p)HMH⊤β.
In other words, what matters for contract design is the effective manipulability matrix HMH⊤ at
the metric level. This extension captures simple cases as special
instances: if only a subset of tasks is manipulable, take H to select those coordinates; if
some metrics are mechanically coupled (e.g. one action inflates several
reported submetrics), H
encodes that coupling.
Audits remain informative, since
x − x̃ = Hm + (ε − ε̃),
and the platform can identify HMH⊤
from contract variation even if the underlying H and M are not separately identified.
From an applied perspective this is often the relevant object: the
platform needs to know which directions are vulnerable to inflation
under incentives, not necessarily the primitive manipulation actions. In
learning terms, this suggests that even coarse audits can be powerful if
they are aligned with the principal directions of HMH⊤.
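The observational equivalence is easy to verify numerically: for any invertible R, the pair (HR, R⁻¹MR⁻⊤) generates exactly the same metric-level response as (H, M), so only HMH⊤ is pinned down by audited differences. A small check with illustrative matrices:
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(2)
d, k = 4, 2
H = rng.normal(size=(d, k))                          # manipulation-to-metric map
M = np.array([[1.0, 0.3], [0.3, 0.5]])               # k x k, positive definite
R = np.array([[2.0, 0.0], [1.0, 1.0]])               # any invertible k x k matrix

H2 = H @ R
M2 = np.linalg.solve(R, np.linalg.solve(R, M.T).T)   # R^{-1} M R^{-T}

print(np.allclose(H @ M @ H.T, H2 @ M2 @ H2.T))      # True: identical H M H^T
\end{verbatim}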
Our baseline best responses use interior first-order conditions, which implicitly assumes that optimal ai and mi are positive and that no other constraints bind. In practice, platforms frequently impose caps (e.g. bonus ceilings) or agents face technological limits (e.g. a finite amount of time, or a maximal feasible manipulation intensity). These constraints create piecewise-linear regions in the best response and can lead to bunching.
The nonnegativity constraints a, m ≥ 0 are largely
innocuous under the maintained restriction β ≥ 0: with strictly convex costs,
any coordinate with βi > 0 has
strictly positive interior optimum, while coordinates with βi = 0 optimally
choose ai = mi = 0.
Binding constraints become relevant when the feasible set for contracts
includes negative weights (penalties) or when B imposes corners that induce some
βi = 0
even though θi* > 0.
In those cases, complementary slackness yields
ai*(β) = max {0, κiβi}, mi*(β; p) = max {0, (1 − p)μiβi},
in the diagonal baseline, and analogous projections in the correlated
case. The platform objective remains concave in β, but the optimizer may lie on the
boundary of B, which is
precisely when the ``lack of diversity’’ problem discussed in Section
becomes empirically salient.
Caps can be handled similarly. If, for example, manipulation is bounded by 0 ≤ m ≤ m̄ coordinate-wise, then m*(β; p) = min {m̄, (1 − p)Mβ}, yielding a regime where sufficiently high-powered incentives no longer induce proportionally more gaming. This is economically plausible when cheating requires scarce resources (e.g. purchasing fake reviews). The main qualitative prediction survives: absent audits, the platform trades off inducing effort against inducing wasteful manipulation, but the tradeoff may become less severe once caps bind.
Quadratic costs deliver linear best responses and closed-form
contracts, but the core logic only requires that costs are smooth,
separable (or weakly coupled), and strongly convex. Suppose c and g are twice differentiable and α-strongly convex. Then the agent’s
interior best response satisfies
∇c(a*) = β, ∇g(m*) = (1 − p)β
in the baseline audit-contingent scheme. Even when these equations do
not yield linear solutions globally, they imply that the mapping from
incentives to actions is locally linear around any operating point:
a*(β + Δ) ≈ a*(β) + (∇2c(a*(β)))−1Δ,
and similarly for m*. Thus one can
interpret K and M in our model as local response matrices: they are the
inverses of local Hessians of effort and manipulation costs. This
provides a bridge to empirical implementation. Even if the true cost
functions are not quadratic, the platform can treat observed behavior
under small contract perturbations as revealing local elasticities, and
then apply the same ``shrink incentives in manipulable directions’’
principle using estimated Jacobians.
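A sketch of the corresponding measurement strategy: perturb the posted contract coordinate-wise and difference the (corrected) responses to estimate the local Jacobian. Here observe_effort is a placeholder for whatever effort proxy the platform uses, and the step size is an assumed tuning choice.
\begin{verbatim}
import numpy as np

def local_response_matrix(beta0, observe_effort, h=0.05):
    """Finite-difference Jacobian of the effort response around beta0."""
    d = len(beta0)
    J = np.zeros((d, d))
    base = observe_effort(beta0)            # placeholder: (averaged) effort proxy
    for i in range(d):
        bp = beta0.copy()
        bp[i] += h                          # small, assumed perturbation size
        J[:, i] = (observe_effort(bp) - base) / h
    return J                                # in the quadratic model this recovers K
\end{verbatim}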
Of course, the clean separation between effort and manipulation may fail if costs have complementarities (e.g. manipulation becomes cheaper when effort is low). In such cases, audits still help by breaking the observational equivalence between a and m, but optimal incentives may no longer be expressible as a simple shrinkage rule.
We have treated the audit probability p as exogenous. In many environments
it is a choice variable: audits are costly, limited, or can be targeted
to suspicious activity. Endogenizing audits clarifies a second economic
margin: the platform can substitute monitoring for incentive distortion. A simple reduced-form way to
capture this is to let the principal choose p ∈ [0, 1] at a per-round cost ϕ(p), with ϕ increasing and convex. Using the
closed-form contract β*(p), the
platform’s problem becomes
maxp ∈ [0, 1] (UP(β*(p); p) − ϕ(p)),
which yields an interior condition balancing the marginal value of a
higher-powered (less distorted) contract against the marginal audit
cost. This delivers a testable comparative static: auditing should be
most intense when manipulability is high (large M) and task value is high (large
θ*), since these
are precisely the regimes where shrinking incentives is most costly.
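Because p is one-dimensional, the reduced-form problem can be solved by direct search once the platform can evaluate UP(β*(p); p) from its plug-in model; in the sketch below, both that evaluation and the convex cost ϕ(p) = c·p² are assumed inputs rather than quantities derived here.
\begin{verbatim}
import numpy as np

def optimal_audit_rate(U_P_of_p, cost_per_audit=0.2):
    """Grid search over p for U_P(beta*(p); p) - phi(p); both inputs are assumed."""
    phi = lambda p: cost_per_audit * p ** 2          # increasing, convex audit cost
    grid = np.linspace(0.0, 1.0, 101)
    values = [U_P_of_p(p) - phi(p) for p in grid]
    return grid[int(np.argmax(values))]
\end{verbatim}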
Targeted audits introduce an additional strategic channel: if the
audit probability depends on realized x, then manipulation affects not
only payment when not audited, but also the likelihood of being audited.
For example, with p(x) increasing in
unusually high x, the expected
marginal benefit of manipulation is attenuated by an additional
``detection’’ term. In stylized form, if payment uses x only when not audited, then the
agent internalizes
$$
\frac{\partial}{\partial m}\mathbb{E}\bigl[\langle \beta,x\rangle
\mathbf{1}\{\text{no audit}\}\bigr]
\approx
\beta\cdot(1-p(\cdot)) - \langle \beta, x\rangle \cdot p'(\cdot)\cdot
\frac{\partial x}{\partial m},
$$
so sufficiently aggressive targeting can deter manipulation even at a
fixed average audit rate. The cost is that the induced behavior and the
data-generating process become more nonlinear, complicating both
identification and learning. In practice, this is a familiar tension:
aggressive anomaly-triggered audits can reduce gaming, but they also
change the population of audited events, which can bias naive estimators
of manipulation unless the targeting rule is accounted for.
Taken together, these extensions reinforce the main message. The precise algebra of K and M may change with richer technologies, constraints, and audit policies, but the underlying structure persists: incentives load on a mixture of productive and manipulable components, and audits (exogenous or targeted) reshape that mixture by reducing the returns to metric inflation and by enabling the platform to learn where the metric is vulnerable.
A virtue of the linear–quadratic structure is that it translates directly into logging requirements and testable predictions in platform telemetry. The model says that what the platform should pay for is not the raw metric x, but the component of x that is induced by productive effort. Empirically, this shifts attention from predicting outcomes y using whatever signals are available to estimating how the metric itself responds to incentives, and then separating productive from manipulable responses.
At minimum, implementing the shrinkage logic and auditing-based
identification requires that the platform record, at the unit of
contracting (e.g. user–creator, driver–week, seller–month): (i) the
posted contract parameters, i.e. the full vector of pay weights βt (and, when
relevant, βt(x), βt(x̃)),
including any caps or piecewise rules; (ii) the realized manipulable
metric xt
at the same task granularity; (iii) whether an audit occurred and the
resulting audit metric x̃t when
observed; and (iv) the platform objective yt (or a proxy,
such as downstream retention, complaint-adjusted revenue, or long-run
conversions) aligned to the same decision window. In practice, (iv) is
often delayed and noisy, but the learning problem is robust to delay
provided the platform can join outcomes back to the contract and signal
history.
Two operational details are easy to overlook. First, the contract must be logged in its true deployed form. If the platform uses a scoring model st = f(features) and pays linearly in st, then the relevant βt is the gradient of the score with respect to the underlying task components (or the linearization used in pay). Second, if audits are targeted, then one must log the audit policy (or at least the audit propensity) alongside the realized audit indicator, since selection into audit otherwise contaminates naive difference-based estimators.
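A minimal record layout implementing requirements (i)–(iv) plus the audit propensity discussed above; the field names are illustrative rather than a proposed standard.
\begin{verbatim}
from dataclasses import dataclass
from typing import Optional, Sequence

@dataclass
class ContractingRecord:
    unit_id: str                        # e.g. seller-month or driver-week
    period: str
    beta: Sequence[float]               # (i) posted pay weights, as actually deployed
    x: Sequence[float]                  # (ii) manipulable metric at task granularity
    audited: bool                       # (iii) audit indicator
    x_tilde: Optional[Sequence[float]]  # (iii) audit metric, when observed
    audit_propensity: Optional[float]   # P(audit | history), if audits are targeted
    y: Optional[float]                  # (iv) objective or proxy, possibly delayed
\end{verbatim}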
The cleanest predictions concern how optimal weights move with
manipulability and auditing intensity. In the diagonal case, the optimal
weight satisfies
$$
\beta_i^*(p)=\tfrac12\,\theta_i^*\cdot
\frac{\kappa_i}{\kappa_i+(1-p)\mu_i},
$$
so, holding fixed the platform value θi*
and productivity κi, incentives
should be weaker on metrics that are more gameable (higher μi), and stronger when
audits are more frequent (higher p). These predictions can be tested
in two complementary ways.
First, in cross-sectional analyses across tasks or product surfaces, one can compare posted pay weights to external or internal proxies for manipulability. Examples include: the prevalence of known fraud vectors (e.g. bot traffic share), the historical divergence between x and alternative measurements, the elasticity of x to low-cost interventions that plausibly reflect manipulation (e.g. traffic purchases), or the concentration of extreme values that are hard to reconcile with physical constraints. The model predicts that, conditional on estimated marginal value (a proxy for θi*) and responsiveness (a proxy for κi), higher manipulability proxies should be associated with lower incentive weights.
Second, in within-platform policy changes (natural experiments), the model predicts systematic re-weighting when monitoring changes. If the platform expands auditing (increases p) or deploys a new verification system that makes manipulation harder (effectively reduces μi), the platform should optimally increase incentive power on the affected dimensions, and the induced gap between x and x̃ should shrink. Conversely, if a measurement pipeline becomes easier to game (e.g. an API change or a new marketplace for fake engagement), one should see either lower weights on that metric or increased auditing to sustain weights.
A distinctive implication of correlated manipulation is that these shifts need not be one-for-one: improving the integrity of one metric can justify higher weights on other metrics when manipulation technologies spill over. Empirically, this suggests looking for coupled movements in β across tasks after integrity interventions, rather than treating each metric in isolation.
When audited repeats are available, the model yields a direct estimand:
in audited rounds,
xt − x̃t = mt + (εt − ε̃t), 𝔼[xt − x̃t ∣ βt] = (1 − p)Mβt
under the audit-contingent scheme. Thus, a regression of (xt − x̃t)
on βt
identifies (1 − p)M
under standard full-rank conditions. In implementation, we recommend
estimating M using only rounds
where (a) the platform knows the posted βt precisely and
(b) the audit metric is produced by an independent measurement pipeline.
Since gaming can sometimes leak into audits (e.g. if the same logs feed
both x and x̃), validating independence is not a
mere statistical nicety; it is central to the interpretation of x − x̃ as manipulation.
In high-dimensional settings, two issues arise. The first is sample size: a dense M contains O(d²) parameters, so one either needs substantial audit volume or structural restrictions (e.g. sparsity or low rank). The second is contract variation: if the platform rarely changes β, then 𝔼[ββ⊤] will be ill-conditioned and M will not be stably estimated. This is a concrete empirical manifestation of the ``diversity’’ condition: identification requires that the platform occasionally varies incentives in different directions. In practice, this can be achieved via small randomized perturbations (A/B tests on weights), staggered rollouts, or rotating emphasis across submetrics, all of which are implementable without committing to fully exploratory (and potentially harmful) contracts.
When audits are targeted rather than random, the regression must be adjusted for selection. A simple correction is inverse propensity weighting: if the platform can estimate or log πt = ℙ(audit ∣ ℋt), then one can estimate M from weighted moments that re-create the unconditional design. More robustly, one can model the audit rule explicitly and treat the audit indicator as an endogenous sampling mechanism. The practical lesson is that targeted audits are compatible with learning, but only if the platform treats the audit assignment rule as part of the data-generating process to be accounted for, not ignored.
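A sketch of the inverse-propensity-weighted version of the gap regression, assuming the audit propensities πt are logged (or estimated) and that agents respond to an average audit rate p̄; both assumptions should be validated before taking the output at face value.
\begin{verbatim}
import numpy as np

def M_hat_ipw(beta_aud, gap_aud, pi_aud, p_bar):
    """Weighted LS of (x - x_tilde) on beta; weights 1/pi_t re-create the
    unconditional design; p_bar is the audit rate agents are assumed to face."""
    w = 1.0 / np.clip(pi_aud, 1e-3, None)                    # clipped IPW weights
    Bw = beta_aud * w[:, None]
    coef = np.linalg.solve(Bw.T @ beta_aud, Bw.T @ gap_aud)  # weighted normal equations
    return coef.T / (1 - p_bar)
\end{verbatim}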
Once manipulability is estimated, the platform can construct a corrected
metric
xtc ≡ xt − (1 − p)M̂βt,
which, under the maintained structure, behaves like a noisy measurement
of effort: xtc = at + εt.
Together with x̃t = at + ε̃t
on audited rounds, this yields a repeated-measurement setting in which
x̃t serves
as an instrument for xtc
in the outcome equation yt = ⟨θ*, at⟩ + ηt.
Operationally, the platform can run an IV regression of yt on xtc
using x̃t
as instruments, or equivalently the GMM estimator based on the moment
𝔼[x̃t(yt − ⟨θ, xtc⟩)] = 0.
This two-stage procedure has an appealing engineering interpretation:
audits are used first to measure and remove metric inflation, and then to recover task values.
Two cautions matter for empirical credibility. First, y is often affected by factors beyond contemporaneous effort (seasonality, demand shocks, product changes). This calls for including fixed effects, time controls, and possibly differencing or orthogonalization of y with respect to known confounders. Second, the mapping from effort to outcomes may be nonlinear or delayed. In those cases, one can still treat θ* as a local or short-horizon marginal value and interpret the resulting contract as locally optimal; this aligns with the ``local linearity’’ perspective in the extensions.
Some platforms cannot generate x̃ at meaningful scale. Our framework
still suggests useful measurement strategies. One approach is to
construct observable indices that proxy for the cost of manipulation—for example, the share
of events verified by trusted hardware, the fraction of traffic with
strong identity, or the historical stability of the metric under known
anti-fraud interventions. These can be used to stratify tasks or cohorts
and test whether optimal weights are lower where integrity is weaker.
Another approach is to leverage quasi-experimental variation that affects manipulability but not
productivity (e.g. changes in detection, identity verification, or API
rate limits) as instruments for manipulation intensity; the predicted
response is that, when manipulation becomes harder, x should become a more reliable
predictor of y and the
platform should rationally increase incentive power on x.
Of course, proxies cannot fully substitute for audits: without some source of ground truth, separating true effort from inflation is intrinsically underidentified. We view this not as a limitation of the model but as an empirical fact about Goodhart settings: measurement integrity is itself a scarce input.
A final empirical implication is methodological: one can evaluate
whether an incentive system is ``Goodhart-proof’’ by tracking (i) the
estimated manipulability response M̂ over time, (ii) the divergence
x − x̃ in audited
samples, and (iii) the welfare-relevant outcome y. A system that chases short-run
improvements in x without
corresponding gains in y,
while also showing rising x − x̃ gaps, is precisely a
system moving into the region where our theory recommends shrinking
weights or increasing monitoring. This provides a concrete monitoring
dashboard for practice: it is not enough to ask whether a metric moved;
one must ask whether it moved for the right reasons, and whether the incentive weights are
commensurate with the metric’s integrity.
We have studied a simple but practically salient environment in which a platform rewards agents using a manipulable metric. The central lesson is that linear incentives are not inherently ``bad’’ under Goodhart pressure; rather, they must be integrity-adjusted. When agents can allocate marginal effort both to productive actions and to inflating the score, the platform should treat manipulability as an additional effective cost of incentives. In the linear–quadratic benchmark, this logic collapses into a transparent shrinkage rule: the optimal pay vector loads on high-value, high-productivity dimensions, but is attenuated on dimensions where a dollar of incentive buys mostly manipulation. Audits play a complementary role by scaling down the agent’s marginal return to manipulation, thereby allowing the platform to restore incentive power without paying for noise.
We distill this into a set of design principles that are directly actionable for metric-based contracting systems.
In Goodhart settings, the relevant object is not the predictive correlation between x and y, but the responsiveness of the metric to incentives and the fraction of that response that is productive. Our model makes this decomposition explicit: incentives induce both a and m, and only a moves the principal’s objective. The practical implication is that ``stronger incentives’’ should not be interpreted as uniformly better; the platform should strengthen incentives only on dimensions where the induced response is likely to be real. In the benchmark, the integrity adjustment takes the form of replacing the naive benchmark β ∝ θ* by an attenuated weight that accounts for manipulability (and, with audits, the effective manipulability). More generally, we should expect optimal incentive weights to be decreasing in any measure of gaming elasticity, even when the metric is highly correlated with downstream outcomes in historical (pre-incentive) data.
A recurring operational error is to treat the platform’s best measurement pipeline as the same pipeline that should be used for pay. Our framework highlights why separation can be valuable: a noisy but hard-to-game audit signal can be used to discipline a high-frequency but gameable operational metric. Even if audits are sparse, their informational content is leveraged through identification: they allow the platform to estimate how much of the operational metric is inflated under incentives, and to correct that inflation when learning task values. This suggests a systems design stance: engineer measurement so that at least one channel is difficult to manipulate (e.g. independent data sources, cryptographic attestations, human verification, or delayed-but-verified outcomes), even if it is too expensive to use at full scale.
Auditing is often framed as a compliance cost that substitutes for better incentives. The model instead points to complementarity: higher audit probability p reduces the marginal return to manipulation by a factor of (1 − p), which lets the platform safely increase β on valuable dimensions. In practice, this means one should not choose incentives and monitoring in separate organizational silos. A platform that wishes to increase incentive power—for example, to accelerate growth in a key metric—should simultaneously budget for higher verification, at least temporarily, to keep the manipulation wedge from expanding. Conversely, if monitoring capacity is cut, the theory recommends proactively shrinking weights on the most gameable components to avoid paying for increasingly unproductive responses.
Good incentive design is constrained not only by immediate welfare tradeoffs but also by what can be learned from data. In our setting, learning manipulability and task values requires variation in β that is rich enough to identify how agents respond. This is a formalization of an operational truth: if the platform never changes its scoring weights, it will be unable to distinguish genuine improvement from strategic behavior, and it will be slow to detect integrity decay. Importantly, the needed variation can be small and safe: randomized perturbations around a baseline, staggered rollouts, or rotating emphasis across dimensions can generate the eigenvalue growth needed for stable estimation while limiting user-facing volatility. We view this as a governance principle: platforms should institutionalize controlled experimentation not only for product iteration but also for incentive integrity.
A mature incentive system should track not only the headline metric x and the objective proxy y, but also diagnostics that isolate gaming. In environments with audits, the gap x − x̃ is a direct measure of inflation (up to noise), and its sensitivity to β is a measure of manipulability. Even without perfect audits, platforms can construct partial analogues: consistency checks across independent logging systems, anomaly rates, or the fraction of activity that passes verification. The point is conceptual: if the platform cannot measure how incentives load onto the metric and how the metric loads onto the objective, it is effectively flying blind. Treating the manipulation wedge as a dashboard variable creates an early-warning system for Goodhart drift.
In live systems, the platform rarely knows task values or manipulability ex ante; both are learned from noisy data and can change over time. A practical extension of the shrinkage logic is therefore to shrink more under uncertainty: when M̂ is imprecise, the platform should discount weights more, not less. Put differently, estimation error should be treated asymmetrically because over-incentivizing a manipulable metric can generate large transfers with little value and can erode trust in the system. This suggests using confidence-aware policies (e.g. conservative updates, caps, or regularization toward a low-powered contract) until audits and experimentation provide sufficient identification strength.
Platforms often audit strategically (e.g. suspicious cases). While
operationally sensible, this creates selection that can confound naive
estimators of manipulation and task value. The design principle is not
``never target audits,’’ but rather ``log and model the
targeting.’’ If the audit propensity is itself a function of history,
then any learning algorithm must incorporate that propensity to recover
unbiased moments. More broadly, audit policies become part of the
incentive environment: agents may respond to the probability of being
audited, not merely to the payment rule. Treating audit assignment as an
endogenous policy variable is therefore essential for both
identification and deterrence.
Our conclusions are intentionally drawn from a parsimonious structure, and several assumptions deserve scrutiny before applying the prescriptions mechanically. First, linear contracts and quadratic costs deliver linear best responses and clean shrinkage; real effort technologies may exhibit complementarities, thresholds, and nonconvexities. Second, we assumed a clean separation between a manipulable operational metric x and a manipulation-free audit metric x̃. In practice, audits may themselves be partially gameable, correlated with x through shared data pipelines, or subject to adversarial adaptation once their role in pay becomes salient. Third, the diagonal structure of K and M abstracts from cross-task interactions: in many settings, manipulation in one dimension spills over into others, and productive actions can jointly move multiple metrics. While the conceptual shrinkage logic survives, the optimal policy may require rotating incentives across correlated dimensions rather than shrinking each independently.
Fourth, we have emphasized incentives and learning but abstracted from institutional constraints: risk aversion, participation constraints, fairness concerns, multi-agent strategic interactions (e.g. collusion, sabotage, or congestion), and dynamic reputational effects. These forces can change both the welfare criterion and the feasible contract class. Finally, our learning discussion presumes a degree of stationarity; in real systems, both task values and manipulability can shift as attackers innovate or as the product changes, implying that continual re-estimation and change-point detection may be as important as asymptotic identification.
Several extensions appear especially fruitful. On the theory side, jointly optimizing the audit policy (including targeted audits) with the contract, under explicit audit costs, would endogenize the monitoring–incentives tradeoff that platforms face. Allowing richer manipulation technologies—for example, manipulation that distorts the audit metric with smaller probability, or manipulation that changes the distribution of noise rather than its mean—would connect our setting to adversarial measurement and data-poisoning models. Nonlinear and history-dependent contracts are also natural: while linear rules are operationally attractive, many platforms use piecewise payments, thresholds, or ranking mechanisms that can amplify gaming around discontinuities.
On the empirical side, a key agenda is to map theoretical objects such as M (gaming elasticities) and K (productivity responsiveness) into measurable quantities in telemetry and experiments, and to develop standard diagnostics for when independence assumptions for audits are credible. Finally, on the algorithmic side, incentive-aware learning under nonstationarity—where both agents and attackers adapt—calls for policies that combine exploration, auditing, and robust estimation in a unified control loop.
Stepping back, the broader message is optimistic. Goodhart’s Law is often treated as a fatal critique of metric-based management. Our analysis suggests a more constructive interpretation: Goodhart pressure is a design constraint that can be priced, monitored, and mitigated. By explicitly modeling manipulation as an endogenous response to incentives, and by treating audits and contract variation as integral parts of the system, platforms can build linear incentive schemes that are not perfectly manipulation-proof, but are systematically —and therefore substantially more resilient in practice.