Two-sided assignment platforms in 2026 are asked to do something that classical matching theory was not designed to do out of the box: provide fairness guarantees to multiple stakeholder groups under operational constraints that make deterministic, once-and-for-all assignments both unrealistic and politically brittle. Online labor platforms must balance worker welfare (e.g., schedule quality, earnings potential, commute burden) against client or job-side performance metrics (e.g., service quality, reliability, credential fit). Public-facing assignment systems—such as teacher placements, clinical rotations, and certain school choice variants—face analogous pressures, often amplified by regulation and scrutiny: the same assignment rule is evaluated both by how it treats individuals on one side and by whether it preserves performance and quality on the other. In such environments, ``fairness’’ is not a slogan but a binding feasibility constraint: a platform that systematically protects one side while degrading the other may fail retention, provoke strategic behavior, or violate contractual and legal obligations.
A practical driver of this tension is that modern platforms rarely commit to a single, deterministic matching. They operate repeatedly, with assignments rolling out daily or weekly, and with a planner (or algorithmic marketplace designer) able to randomize across different matchings. This repeated implementation creates room for fairness: an individual may not receive her best feasible job every period, but can be protected in expectation over time. At the same time, repeated operation makes stability considerations salient in a different way than in one-shot matching. If assignments are perceived as too manipulable or too far from accepted priority rules, the platform may be undermined by refusal, renegotiation, or costly appeals. This motivates our focus on lotteries whose realizations satisfy a notion of stability (we will formalize internal and ε-internal stability in the next section), while fairness is evaluated via expected utilities and expected quality.
A second driver is the prevalence of ties on one side of the market, which are no longer a corner case. In large-scale platforms, worker-side preference data are often coarse, learned, or noisy. Star ratings, salary bands, commute-time buckets, or predicted satisfaction scores are typically reported (or estimated) with limited resolution. As a consequence, many jobs are effectively tied: a worker may be genuinely indifferent among a set of assignments once we condition on observable attributes and measurement error. From the designer’s perspective, treating near-equal scores as strict preferences can be counterproductive, because it amplifies noise and induces arbitrary discontinuities in outcomes. We therefore model ties not as a failure of preference elicitation, but as an economically meaningful representation of indifference classes generated by prediction, aggregation, or privacy-preserving reporting.
This seemingly modest modeling choice has major distributional implications. With strict preferences on both sides, stable matchings form a well-structured set and extreme points (worker-optimal and job-optimal stable outcomes) provide clear benchmarks. Under one-sided ties, the stable set can expand dramatically, and small perturbations in tie-breaking can flip who benefits. In practice, this means that a platform can often make many workers ``whole’’ relative to their own stable benchmarks by randomizing across stable or near-stable outcomes, but doing so may impose persistent quality losses on some jobs. Conversely, prioritizing job-side quality via strict adherence to priorities can leave some workers consistently under-served even when, in principle, the stable set contains outcomes that treat them better. The platform thus faces an endogenous conflict between fairness (protecting the worst-off on each side) and balance (not letting one group systematically pay for the other’s protection).
Our aim is to illuminate this conflict in a way that is both
policy-relevant and technically tractable. We study a planner who can
choose a lottery over matchings, interpretably implementable as a
schedule over periods, and we evaluate outcomes through shares relative to
endogenous, market-specific benchmarks. On the worker side, we normalize
each worker’s expected utility by the best utility she can obtain in any
weakly stable matching; on the job side, we normalize each job’s
expected quality by the best quality it can obtain in any weakly stable
matching. This ``OSS-ratio’’ normalization (OSS for ``optimal
stable share’’) has two virtues. First, it is scale-free and comparable
across heterogeneous agents: a worker with inherently low attainable
utility is not mechanically labeled ``worse off’’ than one with high
attainable utility. Second, it ties fairness directly to the market’s
stability-constrained opportunity set, aligning with how platforms
justify outcomes: not by promising everyone their first choice, but by
promising each party a reasonable fraction of what is attainable under
accepted stability or priority constraints.
Within this framework, we make three contributions.
First, we provide a convex (often linear) characterization of the feasible trade-offs between the worst-off worker share and the worst-off job share. The key idea is to move from lotteries over integral matchings to fractional allocations that capture expected assignment frequencies, and to encode stability-like restrictions through linear inequalities. This yields a tractable description of a Pareto frontier in the space of share guarantees. Conceptually, this reframes a complicated randomized mechanism design problem as a transparent resource allocation problem: each unit of assignment probability simultaneously produces worker utility and job quality, and stability constraints restrict which probabilities can be assigned without creating strong incentives to deviate. From a design standpoint, this frontier view is valuable because it separates what is feasible (given the market instance and stability requirements) from what is chosen (given policy priorities).
Second, we show that one-sided ties can generate a sharp, worst-case impossibility for two-sided fairness. There exist families of instances in which any lottery—even allowing non-stable matchings, and even before we impose the full force of stability—must obey a severe trade-off: achieving a logarithmic approximation for the worst-off worker share forces the worst-off job share to collapse to an order of 1/N (and, under symmetric constructions, vice versa). Economically, the mechanism behind this bound is a congestion of ``conflict chains’’ created by indifference: to protect many workers who are tied across a menu of jobs, the planner must spread probability mass widely; but because jobs have strict priorities and quality declines steeply with rank, spreading mass necessarily assigns some jobs to low-ranked workers with substantial frequency. The result is not merely that some jobs lose in a particular realization, but that at least one job is harmed persistently in expectation whenever we push worker-side guarantees above a modest threshold. This is the precise sense in which multi-stakeholder fairness becomes binding: the bottleneck is not computational, but information-theoretic and combinatorial, driven by the geometry of feasible stable (or near-stable) outcomes under ties.
Third, we propose a canonical compromise selection based on two-sided Nash bargaining over the worst-off shares. Maximizing the product (equivalently, the sum of logs) of the worker-side and job-side guarantees yields an outcome that is Pareto-efficient on the feasible frontier and interpretable as a balanced policy choice: it is insensitive to units, it penalizes letting either side’s worst-off share collapse, and it produces a single recommendation even when the frontier is flat or multi-valued. Importantly, our approach is compatible with platform implementation: the solution is a fractional allocation that can be decomposed into a lottery over matchings, and the stability notion we target is designed to be meaningful (so that each realized matching is defensible), while allowing controlled relaxation via an ε margin to accommodate measurement noise and operational constraints.
We emphasize limitations up front. Our analysis is primarily offline: we take utilities and priorities as given and study the induced feasible set. In practice, platforms estimate utilities from behavior, face uncertainty, and may wish to be robust to misspecification. Our ε-stability lens is meant as a first step toward that robustness, but it does not replace a full learning-and-assignment model. Moreover, our worst-case bounds should be read as cautionary: they identify configurations where no mechanism can simultaneously guarantee high shares to both sides, not as predictions that typical markets always operate at the worst case. That said, worst-case insight is operationally relevant in 2026 precisely because platforms are deployed at scale and must be prepared for adversarial or pathological instances, whether arising from strategic behavior, correlated preferences, or data artifacts.
The broader message is that randomized assignment, properly constrained by stability considerations, can be a powerful tool for fairness—but it cannot repeal the underlying two-sided scarcity created by priorities and ties. By formalizing shares relative to stable benchmarks and mapping the feasible frontier, we obtain a language for making policy trade-offs explicit: one can decide, transparently, how much job-side quality protection to require in exchange for worker-side fairness, or vice versa. The next section sets up the model and stability notions that make these trade-offs precise.
We fix a one-to-one two-sided assignment environment with a set of workers W and a set of jobs A, with |W| = |A| = N. Our goal in this section is to make precise (i) the primitives that generate one-sided ties, (ii) the stability notions that will discipline which matchings are ``defensible’’ when implemented period-by-period, and (iii) the endogenous benchmarks against which we will later normalize fairness guarantees.
Each worker w ∈ W
evaluates jobs using a utility function
U(w, a) ∈ [0, 1] for
each a ∈ A,
where ties are allowed: it may be that U(w, a) = U(w, a′)
even when a ≠ a′. We
interpret U as a
platform-relevant score (predicted satisfaction, net earnings, commute
disutility converted to a score, etc.) rather than a fully articulated
strict ranking. Modeling cardinality is not cosmetic: when utilities are
estimated or discretized, it is often more credible to treat close
alternatives as substitutable than to impose an arbitrary strict order
that amplifies noise. The induced weak preference relation is the usual
one: worker w (weakly) prefers
a to b if U(w, a) ≥ U(w, b),
and strictly prefers a to
b if U(w, a) > U(w, b).
The [0, 1] normalization is without loss for our purposes: any bounded utilities can be affinely rescaled, and our later share guarantees are scale-free. What matters is that comparisons of the form U(w, a) − U(w, b) are meaningful, which will allow us to formalize robustness margins.
Each job a ∈ A has a priority order ≻a over workers. This captures settings where the job side is governed by a clear rule: credential-based eligibility, seniority, an admissions rubric, or a platform policy ranking workers by predicted performance. The strictness assumption reflects that such rules often produce a total order even when they are coarse (e.g., lexicographic tie-breaking by timestamps). We deliberately place the indifference on the worker side; this asymmetry matches many platform applications in which job-side screening is sharper than worker-side preference data.
In addition to the ordinal priority ≻a, we associate with
each job a cardinal quality function
Q(a, w) ∈ [0, 1] for
each w ∈ W,
which is assumed to be strictly decreasing in the priority rank induced by ≻a. Formally, if w≻aw′
then Q(a, w) > Q(a, w′).
We do not require Q to
coincide with any particular performance measure; it is a convenient way
to quantify job-side losses when a job is assigned a lower-priority
worker. This will be essential later when we evaluate job-side
guarantees under randomization. In practice, Q may reflect service quality,
predicted completion probability, or an SLA-relevant fit metric,
provided it is consistent with the priority rule.
A (possibly partial) matching is a function μ that assigns each worker to at most one job and each job to at most one worker. We use μ(w) ∈ A ∪ {⊥} for the job assigned to worker w (with ⊥ denoting unassigned), and μ(a) ∈ W ∪ {⊥} analogously for jobs. Feasibility requires mutual consistency: μ(w) = a if and only if μ(a) = w. In the baseline one-to-one case with |W| = |A|, it is natural to focus on perfect matchings; we nonetheless allow ⊥ in the notation because (i) many platforms operate with fluctuating participation and (ii) our stability relaxations are most transparent when outside options are explicit. When needed, we interpret U(w, ⊥) = 0 and Q(a, ⊥) = 0.
Because workers may be indifferent among many jobs, there are multiple stability notions in the literature (weak, strong, super). We adopt weak stability, which is the notion most consistent with how appeals and objections work in practice: a worker is plausibly willing to contest an assignment only if she can point to a better alternative that would also accept her under the job-side rule.
Given a matching μ, a pair
(w, a) with a ≠ μ(w) is a blocking pair (for
weak stability) if
U(w, a) > U(w, μ(w)) and w≻aμ(a),
where we interpret w≻aμ(a)
as automatically true if μ(a) = ⊥ (i.e., an empty
job would accept any worker). The first condition requires the worker to
have a strict improvement; ties on the worker side therefore do not by
themselves create blocks. The second condition is purely ordinal: job
a would replace its current
assignee with w because w has higher priority.
We say that μ is weakly stable if it
admits no blocking pair. Let S
denote the set of weakly stable matchings.
Weak stability is a disciplined compromise between incentives and realism. On one hand, it respects strict job-side priorities exactly. On the other hand, it avoids treating worker indifferences as knife-edge constraints that would force the planner to defend an outcome against objections that are essentially measurement noise. Weak stability also has an important existence property in our setting: with strict priorities on the job side, a deferred-acceptance procedure (with any tie-breaking on the worker side) produces a matching that is weakly stable with respect to the original weak preferences. We will exploit this stability baseline to define endogenous ``best attainable under stability’’ benchmarks.
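To make this existence claim concrete, the following is a minimal sketch (our illustration, not a procedure specified in the text) of worker-proposing deferred acceptance with strict job priorities, breaking worker-side ties by job index; any tie-breaking yields a weakly stable matching with respect to the original weak preferences. The array conventions (U indexed by worker then job, priority_rank by job then worker, lower rank meaning higher priority) are assumptions of the sketch.

```python
import numpy as np

def deferred_acceptance(U, priority_rank):
    """Worker-proposing deferred acceptance with strict job priorities.

    U[w, a]            : worker w's cardinal utility for job a (ties allowed).
    priority_rank[a, w]: job a's rank of worker w (lower = higher priority).
    Returns mu with mu[w] = job assigned to worker w (balanced market assumed).
    """
    N = U.shape[0]
    # Break worker-side ties arbitrarily (here: by job index); any tie-breaking
    # produces a matching that is weakly stable for the original weak preferences.
    order = [sorted(range(N), key=lambda a: (-U[w, a], a)) for w in range(N)]
    next_prop = [0] * N          # next position on each worker's proposal list
    holder = [-1] * N            # holder[a] = worker currently held by job a
    free = list(range(N))
    while free:
        w = free.pop()
        a = order[w][next_prop[w]]
        next_prop[w] += 1
        if holder[a] == -1:
            holder[a] = w                            # empty job accepts
        elif priority_rank[a, w] < priority_rank[a, holder[a]]:
            free.append(holder[a])                   # job a upgrades to w
            holder[a] = w
        else:
            free.append(w)                           # rejected; propose again later
    mu = np.empty(N, dtype=int)
    for a, w in enumerate(holder):
        mu[w] = a
    return mu
```

Randomizing the tie-breaking (for example by perturbing U infinitesimally before the call) traces out different weakly stable matchings; we use this device below when approximating the stable benchmarks.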
Our eventual object of choice is a lottery over matchings, implemented repeatedly across periods. In that context, it is often operationally acceptable that some unmatched agent might prefer to match (e.g., a job would like to fill a last-minute vacancy), as long as the realized assignment among matched agents does not generate compelling bilateral deviations that unravel the allocation. This motivates a stability notion that only polices blocking pairs among agents who are currently assigned.
Formally, a matching μ is
internally stable if there is no pair (w, a) such that
μ(w) ≠ ⊥, μ(a) ≠ ⊥, U(w, a) > U(w, μ(w)), w≻aμ(a).
Compared to weak stability, internal stability ignores blocking
opportunities that rely on an unmatched worker or an unmatched job. In
the baseline perfect-matching case this distinction is moot; in markets
with slack capacity, it captures the idea that stability concerns are
strongest when they involve already-allocated opportunities rather than
filling residual capacity.
In modern assignment platforms, utilities and quality scores are estimated and updated, and many ``improvements’’ are too small to justify costly renegotiation or an appeal. We therefore introduce a margin parameter ε ≥ 0 that encodes an instability tolerance. Intuitively, we want to rule out deviations that offer a material improvement to both sides, while allowing reallocations that would be blocked only by negligible score differences.
We say that a pair (w, a) is an ε-blocking pair for μ (under internal stability) if both
agents are matched and both obtain a gain exceeding ε in their respective cardinal
metrics:
μ(w) ≠ ⊥, μ(a) ≠ ⊥, U(w, a) ≥ U(w, μ(w)) + ε, Q(a, w) ≥ Q(a, μ(a)) + ε.
A matching is ε-internally stable if it admits no ε-blocking pair; let Iε denote this
set. When ε = 0 and Q is strictly consistent with ≻a, this definition
aligns closely with internal stability (and, in a perfect matching, with
weak stability). As ε
increases, the constraint weakens monotonically, enlarging Iε and thereby
expanding the set of matchings that are defensible ex post.
Two remarks are useful. First, our use of Q in the job-side deviation condition is not meant to replace the priority order; it is a way to translate ``how much better’’ into the same robustness language used for workers. Second, the appropriate ε is application-specific: in some regulated domains it may be close to zero (priorities must be adhered to tightly), whereas in learning-based platforms a modest ε may be the right way to avoid brittle outcomes driven by estimation error.
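To fix the definition, here is a minimal sketch (ours; array conventions match the deferred-acceptance sketch above, with -1 standing in for ⊥) that enumerates ε-blocking pairs and certifies ε-internal stability.

```python
import numpy as np

def eps_blocking_pairs(mu, U, Q, eps):
    """List all eps-blocking pairs of a (possibly partial) matching.

    mu[w] = job of worker w, or -1 if unmatched;  U[w, a], Q[a, w] in [0, 1].
    A pair (w, a) with a != mu[w] eps-blocks if both sides are currently matched,
    U[w, a] >= U[w, mu[w]] + eps, and Q[a, w] >= Q[a, current assignee of a] + eps.
    """
    N = len(mu)
    assignee = -np.ones(N, dtype=int)
    for w, a in enumerate(mu):
        if a >= 0:
            assignee[a] = w
    blocks = []
    for w in range(N):
        if mu[w] < 0:
            continue                      # internal stability ignores unmatched workers
        for a in range(N):
            if a == mu[w] or assignee[a] < 0:
                continue                  # skip w's own job and empty jobs
            if U[w, a] >= U[w, mu[w]] + eps and Q[a, w] >= Q[a, assignee[a]] + eps:
                blocks.append((w, a))
    return blocks

def is_eps_internally_stable(mu, U, Q, eps):
    return not eps_blocking_pairs(mu, U, Q, eps)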
To evaluate fairness across heterogeneous agents, we benchmark each agent against what she could obtain in some weakly stable outcome. Because one-sided ties can generate a large set S with no single ``side-optimal’’ matching that simultaneously serves as a clean reference for everyone, we define agent-by-agent best stable attainments.
For each worker w, define
her benchmark as
U*(w) = maxμ ∈ SU(w, μ(w)).
For each job a, define its
benchmark as
Q*(a) = maxμ ∈ SQ(a, μ(a)).
These are not claims about what can be achieved simultaneously. Rather,
they are individualized opportunity-set normalizations: U*(w) is the
best utility worker w can ever
receive in some weakly stable matching, and Q*(a) is the
best quality job a can ever
receive in some weakly stable matching. In the next section, we will compare
a randomized policy to these benchmarks via worst-off shares. This choice is
deliberate: it separates the question
``what does the market allow under defensible stability constraints?’’ from the question ``how
should the platform trade off protection across sides?’’
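In practice, the benchmarks must themselves be computed or approximated. The following is a minimal heuristic sketch (ours, under the same array conventions): rerun the deferred-acceptance sketch above under many random worker-side tie-breakings; each run yields a weakly stable matching, so the running maxima are valid lower bounds on U*(w) and Q*(a), though exact benchmarks would require optimizing over the full stable set.

```python
import numpy as np

def estimate_benchmarks(U, Q, priority_rank, n_samples=200, seed=0):
    """Heuristic lower bounds on the stable benchmarks U*(w) and Q*(a).

    Each call to deferred_acceptance on a tie-broken copy of U returns a weakly
    stable matching, so per-agent maxima over runs never exceed the true
    benchmarks; they may undershoot if relevant stable matchings are not sampled.
    """
    rng = np.random.default_rng(seed)
    N = U.shape[0]
    Ustar, Qstar = np.zeros(N), np.zeros(N)
    for _ in range(n_samples):
        U_tb = U + 1e-9 * rng.random((N, N))      # tiny perturbation: breaks only ties
        mu = deferred_acceptance(U_tb, priority_rank)
        for w, a in enumerate(mu):
            Ustar[w] = max(Ustar[w], U[w, a])
            Qstar[a] = max(Qstar[a], Q[a, w])
    return Ustar, Qstar
```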
Finally, we note a limitation that will recur throughout the paper. Because U* and Q* are defined via maximization over the stable set, they inherit any sensitivity of S to modeling choices (e.g., how ties are represented, or how priorities are formed). Our view is that this is a feature rather than a bug: in practice, the benchmark against which stakeholders judge fairness is itself shaped by institutional definitions of priority and by the resolution at which preferences are elicited. Our framework makes that dependence explicit, and the ε-stability margin offers a controlled way to test robustness when those primitives are noisy.
Our object of choice is a randomized assignment policy, implemented across periods, that selects an integral matching each day and thereby time-shares across matchings. Formally, let D be a probability distribution over matchings μ, and interpret the realized assignment as μ ∼ D. We restrict attention to lotteries that are ex post defensible in the sense of the previous section: every realized matching in the support of D lies in Iε, the set of ε-internally stable matchings. This requirement captures a practical discipline: even when the planner randomizes, each realized allocation should be robust to material pairwise deviations among currently matched agents.
Given a lottery D, we
evaluate each worker and job by their cardinal outcomes,
uw(D) = 𝔼μ ∼ D[U(w, μ(w))] and qa(D) = 𝔼μ ∼ D[Q(a, μ(a))].
Because U(⋅, ⋅) and Q(⋅, ⋅) are normalized to [0, 1], these expectations also lie in [0, 1], but their absolute magnitudes remain
difficult to compare across heterogeneous agents. We therefore evaluate
outcomes relative to agent-specific stable benchmarks.
Recall that U*(w) = maxμ ∈ SU(w, μ(w)) is worker w’s best utility in some weakly stable matching, and Q*(a) = maxμ ∈ SQ(a, μ(a)) is job a’s best quality in some weakly stable matching. These benchmarks define opportunity sets that are defensible under the same stability logic that anchors the market, while explicitly accommodating the multiplicity of stable outcomes under one-sided ties.
Given D, we define the
planner’s worst-off worker share and worst-off job share as
$$
s_W(D)\;=\;\min_{w\in W}\frac{u_w(D)}{U^*(w)}
\qquad\text{and}\qquad
s_A(D)\;=\;\min_{a\in A}\frac{q_a(D)}{Q^*(a)}.
$$
These are max–min style fairness metrics with three features that will
matter throughout. First, they are scale-free: if one rescales utilities or
quality scores by an affine transformation, the ratios (and hence our
guarantees) are unchanged after the same transformation is applied to
the benchmark. Second, they are individualized: each agent is compared to what she
could plausibly obtain under stability, rather than to a common cardinal
threshold that may be unrealistic for some and trivial for others.
Third, they are deliberately worst-case: by taking a minimum across agents, we
focus attention on the most exposed participant on each side.
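Concretely, given a finite-support lottery and precomputed benchmarks, the two metrics are a few lines (a sketch under our array conventions; positive benchmarks assumed):

```python
import numpy as np

def worst_off_shares(matchings, probs, U, Q, Ustar, Qstar):
    """Compute s_W(D) and s_A(D) for D = sum_r probs[r] * delta_{matchings[r]}.

    matchings : list of arrays with mu[w] = job of worker w (all agents matched).
    probs     : nonnegative weights summing to one.
    """
    N = U.shape[0]
    u = np.zeros(N)               # expected worker utilities u_w(D)
    q = np.zeros(N)               # expected job qualities q_a(D)
    for mu, p in zip(matchings, probs):
        for w, a in enumerate(mu):
            u[w] += p * U[w, a]
            q[a] += p * Q[a, w]
    return float(np.min(u / Ustar)), float(np.min(q / Qstar))
```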
If one ignored the job side and optimized only sW(D), the natural policy would be to time-share across matchings that are ``good for workers,’’ even if doing so repeatedly assigns some jobs to low-priority workers. Under one-sided ties, such policies can indeed raise the worst-off worker’s utility by smoothing over conflicts created by indifference classes: when many workers view several jobs as near-substitutes, randomization can spread access to the most desired positions.
The difficulty is that job-side losses can be severe and concentrated. Because each job has a strict priority order and a steeply decreasing quality function Q(a, ⋅) consistent with that order, reallocations that are nearly costless to workers (due to ties) may be very costly to jobs. In extreme instances, raising the worst-off worker share to order 1/log N forces some jobs to accept very low-priority workers with nontrivial frequency, driving sA(D) down to order 1/N in the worst case. Symmetrically, if the planner optimized only sA(D), it could concentrate probability mass on matchings that give each job a high-priority worker, while leaving some workers chronically assigned to low-utility jobs (or, in slack markets, unmatched). In short, one-sided objectives invite corner solutions that are hard to justify in two-sided environments.
This is not merely a normative concern; it is also operational. Platforms are commonly evaluated simultaneously on worker-facing metrics (satisfaction, earnings) and job-facing metrics (fill rate, service quality). A policy that is defensible on one dimension but predictably catastrophic on the other is rarely implementable, even if each daily assignment is stable in the narrow deviation sense.
To formalize the trade-off, let 𝒟ε denote the set of
lotteries whose support lies in Iε. We then
define the feasible guarantee region as
ℱε = {(tW, tA) ∈ [0, 1]2: ∃D ∈ 𝒟ε s.t. sW(D) ≥ tW, sA(D) ≥ tA}.
Equivalently, (tW, tA) ∈ ℱε
if and only if there exists a lottery D ∈ 𝒟ε such that
the following inequalities hold:
uw(D) ≥ tW U*(w) ∀w ∈ W, qa(D) ≥ tA Q*(a) ∀a ∈ A.
This representation makes clear that (tW, tA)
is a pair of uniform guarantees: every worker receives at least a tW fraction of
her stable benchmark in expectation, and every job receives at least a
tA
fraction of its stable benchmark in expectation.
Two monotonicity observations are immediate and useful. First, ℱε is downward closed: if (tW, tA) ∈ ℱε, then any (tW′, tA′) with 0 ≤ tW′ ≤ tW and 0 ≤ tA′ ≤ tA also lies in ℱε. Second, as ε increases, 𝒟ε expands, hence ℱε expands: allowing small-gain deviations to be ignored enlarges the design space and can strictly improve the attainable two-sided guarantees.
The Pareto frontier of ℱε is the set of points (tW, tA) ∈ ℱε such that there is no other feasible point (tW′, tA′) ∈ ℱε with tW′ ≥ tW and tA′ ≥ tA, with at least one strict inequality. Frontier points summarize the essential economic tension in the market: they are precisely the policies for which improving the worst-off workers’ relative guarantee necessarily worsens the worst-off jobs’ relative guarantee, and vice versa.
We emphasize that the frontier is defined at the level of shares, not raw utilities. This choice is deliberate. In many applications, workers and jobs vary widely in their attainable stable outcomes, and absolute comparisons can be misleading. A policy that increases average utility may nonetheless be unacceptable if it leaves the worst-off worker at a tiny fraction of her defensible opportunity set, or if it routinely assigns a small set of jobs far below their feasible quality levels.
Rather than selecting a policy by ad hoc weights on uw and qa, we adopt a
guarantee-based approach: we characterize the feasible region and then choose among uniform guarantee pairs
(tW, tA).
A standard method to trace Pareto boundaries is to fix a target on one
coordinate and optimize the other. Concretely, for any τ ∈ [0, 1], consider the
problem
max tW s.t. ∃D ∈ 𝒟ε with uw(D) ≥ tWU*(w) ∀w, qa(D) ≥ τQ*(a) ∀a.
Varying τ traces the best
worker guarantee compatible with a required job guarantee.
Symmetrically, one may fix a worker guarantee and optimize the job
guarantee. Because payoffs are linear in the lottery D and the constraints above are
uniform lower bounds, these scalarizations interact naturally with
convex relaxations: mixtures of lotteries preserve feasibility and
interpolate expected payoffs. In the next section we exploit this
structure by replacing lotteries over matchings with an equivalent
fractional representation, yielding linear (or convex) programs that
compute outer bounds on ℱε and, in important
cases, implementable solutions.
Scalarizations are useful for mapping the frontier, but a planner
often wants a single recommended operating point. We therefore also
study the Nash bargaining (or equivalently, max-product) compromise:
max log tW + log tA s.t. (tW, tA) ∈ ℱε, tW > 0, tA > 0.
This objective has two appealing interpretations. First, it selects the
point that maximizes the product tWtA,
i.e., the best multiplicative simultaneous approximation to both sides’
stable benchmarks in the worst case. Second, it is scale-free and
symmetric: doubling the units of U or Q does not change the selected
shares, and the objective treats proportional improvements for workers
and jobs on equal footing. In applications where policy priorities are
asymmetric, we can incorporate bargaining weights and maximize βlog tW + γlog tA;
doing so moves the solution continuously along the Pareto frontier,
transparently encoding the desired tilt toward one side.
The key limitation to keep in mind is implementability. While the set of lotteries is convex and the share constraints are linear in expectations, the stability discipline is ex post and combinatorial. Our next step is therefore to develop tractable relaxations that preserve the economic meaning of the frontier while making computation and implementation explicit.
A lottery D over integral
matchings induces a natural object: for each worker–job pair (w, a), let
xwa = Prμ ∼ D[μ(w) = a].
The matrix x = (xwa)
records the long-run frequencies with which worker w is assigned to job a under time-sharing. Expected
payoffs become linear functionals of x:
uw(x) = ∑a ∈ Axwa U(w, a), qa(x) = ∑w ∈ Wxwa Q(a, w).
Thus, once we can describe a convex set of feasible fractional
allocations, the scalarizations from the previous section become linear (or more
generally convex) programs. This is the main computational advantage of
working with shares: uniform lower bounds on individualized benchmarks
translate directly into linear constraints on uw(x)
and qa(x).
Any lottery over matchings must respect capacity constraints in
expectation:
∑a ∈ Axwa ≤ 1 ∀w ∈ W, ∑w ∈ Wxwa ≤ 1 ∀a ∈ A, xwa ≥ 0.
In a balanced one-to-one market it is often convenient to enforce
equalities by adding dummy jobs/workers (interpreting ⊥ as a dummy counterpart), but the weak
inequalities above already capture the feasible set when unmatched mass
is allowed. Absent stability considerations, these constraints define
the classical assignment polytope; when all row and column sums are
exactly one, it is the doubly stochastic polytope. The Birkhoff–von
Neumann theorem then implies that any feasible x can be decomposed as a convex
combination of permutation matrices, i.e., integral matchings:
$$
x \;=\; \sum_{r=1}^R p_r\,\mathbf{1}_{\mu_r},
\qquad p_r\ge 0,\ \sum_r p_r=1,
$$
with R ≤ N2 by
standard Carathéodory-type bounds. Operationally, this decomposition is
exactly a time-sharing schedule: in period r we implement matching μr with
frequency pr.
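The decomposition step is routine to implement. A minimal sketch follows (assuming numpy and scipy are available; this is our illustration of the standard peeling argument, not an algorithm specified in the text): repeatedly find a perfect matching inside the support of x and peel off the largest feasible weight.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def birkhoff_decomposition(x, tol=1e-9):
    """Decompose a (numerically) doubly stochastic matrix into permutation matrices.

    Returns a list of (weight, mu) pairs with mu[w] = job of worker w, such that
    x is (approximately) the weighted sum of the corresponding permutations.
    Each pass zeroes at least one support entry, so at most N^2 passes are needed.
    """
    x = np.array(x, dtype=float)
    N = x.shape[0]
    schedule = []
    while x.max() > tol:
        # A doubly stochastic matrix always has a perfect matching in its support
        # (Birkhoff/Hall); forbid near-zero entries with a large cost and reward
        # large entries so the optimal assignment stays inside the support.
        cost = np.where(x > tol, -x, 1e6)
        rows, cols = linear_sum_assignment(cost)
        if np.any(x[rows, cols] <= tol):
            raise ValueError("input is not (numerically) doubly stochastic")
        mu = np.empty(N, dtype=int)
        mu[rows] = cols
        p = float(x[rows, cols].min())     # largest weight that can be peeled off
        schedule.append((p, mu))
        x[rows, cols] -= p
    return schedule
```

Operationally, the returned (weight, matching) pairs are exactly the time-sharing schedule described above.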
Our problem is that we do not want arbitrary matchings in the support; we want them to be ex post ε-internally stable. This is where additional structure is required.
To obtain a tractable outer description of ex post defensibility, we
impose a set of linear inequalities that rule out (or, under ε > 0, dampen) the possibility
that a given worker–job pair can simultaneously ``have room’’ to
deviate. The canonical ordinal constraint takes the following form. For
each pair (w, a),
define the set of jobs weakly acceptable to w relative to a,
Bw(a) = {b ∈ A: U(w, b) ≥ U(w, a)},
and the set of workers that job a ranks at least as high as w,
Wa(w) = {w′ ∈ W: w′≻aw} ∪ {w}.
Then impose, for all (w, a), the relaxed stability inequality
$$
\sum_{b \in B_w(a)} x_{wb} \;+\; \sum_{w' \in W_a(w)\setminus\{w\}} x_{w'a} \;\ge\; 1.
$$
The economic content is the same as in the usual (integral) stability
definition. If worker w
receives too little probability mass on jobs she likes at least as much
as a, then job a must, with high probability mass,
be filled by workers it ranks above w; otherwise (w, a) would have scope to
block. This inequality is linear and can be combined with the matching
constraints to define a polytope 𝒫 of
allocations.
The constraint is deliberately conservative. It is designed as an outer approximation to the convex hull of stable matchings: every stable matching satisfies it (interpreting x as {0, 1} variables), and any convex combination of stable matchings satisfies it by linearity. When one-sided ties exist, however, the reverse containment generally fails: there may be fractional points satisfying the inequality that cannot be written as a lottery over (internally) stable matchings. This gap is precisely why we view 𝒫 as a relaxation unless further conditions ensure integrality.
Our ex post discipline is ε-internal stability, which ignores blocking pairs that yield gains no larger than ε and only considers deviations among currently matched agents. A fully faithful ex post requirement is combinatorial, but the relaxation can be adapted in two ways that preserve linearity and retain the right comparative statics in ε.
First, we can enlarge the worker-side dominance sets to reflect material
improvements: replace Bw(a)
by
Bwε(a) = {b ∈ A: U(w, b) ≥ U(w, a) − ε} ∪ {a}.
The resulting constraints treat jobs within ε of a as essentially non-blocking
substitutes for w, matching
the interpretation that small gains do not generate actionable
deviations.
Second, to reflect internal (rather than global) blocking, we can allow unmatched mass to ``absorb’’ potential deviations by explicitly modeling ⊥ and treating pairs involving ⊥ as irrelevant for stability checks. In practice, this amounts to adding a dummy job (or worker) and restricting stability constraints to pairs where both sides are real agents. Both modifications enlarge the feasible polytope as ε increases, aligning with the expansion of ℱε emphasized earlier.
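As an illustration, here is a minimal sketch (assuming cvxpy is available; the constraint encoding is our rendering of the Rothblum-style inequality above, with the ε-enlarged dominance sets just described) of the scalarization that fixes a job-side guarantee τ and maximizes the uniform worker-side guarantee over the relaxed polytope.

```python
import numpy as np
import cvxpy as cp

def max_worker_share(U, Q, priority_rank, Ustar, Qstar, tau, eps=0.0):
    """Maximize the uniform worker guarantee t_W subject to a job guarantee tau,
    over the relaxed stability polytope P (an outer approximation under ties)."""
    N = U.shape[0]
    x = cp.Variable((N, N), nonneg=True)
    tW = cp.Variable()
    cons = [cp.sum(x, axis=1) <= 1, cp.sum(x, axis=0) <= 1]
    # Relaxed stability: for every pair (w, a), either w has enough mass on jobs
    # within eps of a (non-blocking substitutes), or a is filled by strictly
    # higher-priority workers.
    for w in range(N):
        for a in range(N):
            B = [b for b in range(N) if U[w, b] >= U[w, a] - eps or b == a]
            higher = [wp for wp in range(N) if priority_rank[a, wp] < priority_rank[a, w]]
            cons.append(sum(x[w, b] for b in B) + sum(x[wp, a] for wp in higher) >= 1)
    u = cp.sum(cp.multiply(x, U), axis=1)      # expected worker utilities u_w(x)
    q = cp.sum(cp.multiply(x, Q.T), axis=0)    # expected job qualities q_a(x)
    cons += [u >= tW * Ustar, q >= tau * Qstar]
    prob = cp.Problem(cp.Maximize(tW), cons)
    prob.solve()                               # any LP-capable solver
    return tW.value, x.value
```

Sweeping τ over a grid traces the outer approximation to the frontier described earlier.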
The relationship between fractional constraints and implementable lotteries depends on polyhedral integrality. With strict preferences on both sides (no ties), the stable matching polytope has an elegant description: the analogues of the inequality above together with the matching constraints characterize the convex hull of stable matchings, and every extreme point is integral (see, e.g., the Rothblum formulation). In that case, solving any of our scalarized programs over 𝒫 directly returns an integral stable matching (or a convex combination of a small number of them when we additionally optimize over (tW, tA)), and implementability is immediate.
Under one-sided ties, two complications arise. First, integrality can fail: 𝒫 may contain fractional extreme points, so an LP optimum can be unattainable by any lottery over stable matchings. Second, even if we restrict attention to weak stability, the set of stable matchings can have complex structure (and related decision problems become computationally harder once ties are present). For our purposes, the key message is methodological: in the presence of ties, 𝒫 should be viewed as producing (i) a tractable outer bound on the feasible guarantee region and (ii) a fractional allocation that we attempt to implement approximately.
Suppose we solve a scalarization (or the Nash program) over a polytope 𝒫 and obtain a fractional solution x* along with shares (tW*, tA*). There are three conceptually distinct cases.
If x* is integral, then x* is itself a matching, and if 𝒫 was built to enforce (approximate) stability, we can implement x* directly each period.
If x* is fractional but lies in the convex hull of Iε, then there exists a decomposition x* = ∑rpr1μr where every μr ∈ Iε. This yields an explicit time-sharing schedule with the desired shares. In settings where the underlying polytope is integral (or admits an efficient separation oracle over stable matchings), such a decomposition can be obtained in polynomial time via standard techniques (Birkhoff–von Neumann for pure matching constraints; analogous decomposition methods for integral stable matching polytopes), with support size bounded by O(N2).
If x* lies outside the convex hull of Iε, we face the generic difficulty with one-sided ties. We can always decompose x* into integral matchings using Birkhoff–von Neumann, but those matchings may violate ε-internal stability. Enforcing ex post defensibility then requires post-processing (e.g., mapping each drawn matching to a nearby internally stable matching by local improvement dynamics, or re-running a stable procedure with randomized tie-breaking), which may distort the target shares. From a design perspective, we treat this as an implementation gap: the LP describes what is achievable in an idealized convexification, while actual deployment may require an additional approximation step whose worst-case loss can depend on the instance.
This caveat is not purely technical. It reflects a substantive economic tension: when workers are indifferent among many jobs, the planner can ``smooth’’ their expected utilities through fine-grained randomization, but insisting that each realized matching be robust to deviations imposes discrete constraints that may prevent such smoothing from being carried out exactly.
We emphasize two uses of 𝒫. First, as a computational tool, it lets us map (an outer approximation to) the Pareto frontier by solving a sequence of LPs, and it yields a canonical Nash compromise point in the same representation. Second, as an analytical tool, it provides clean upper bounds for impossibility results: if even the relaxation cannot achieve a pair (tW, tA), then no implementable ex post defensible lottery can. In the next section we exploit this logic to construct worst-case instances under one-sided ties, where the frontier itself collapses sharply, implying tight trade-offs between the best attainable worker and job shares.
The relaxation in the previous section makes it tempting to think that we can ``smooth away’’ the discreteness of matching by time-sharing: if workers have ties, then small reallocations of probability mass can equalize expected utilities almost continuously. The difficulty is that, in a two-sided market, such smoothing has a real economic bite. Because jobs have strict priorities (and because we measure job performance using a priority-consistent quality score Q), reallocating probability mass to raise the worst-off worker typically forces some jobs to accept substantially lower-ranked workers in many realized matchings. This section makes that tension sharp by exhibiting worst-case instances in which no lottery can simultaneously guarantee more than a logarithmic fraction of workers’ stable benchmarks and more than a linear-in-N fraction of jobs’ stable benchmarks.
We construct instances in which workers’ cardinal utilities have
large indifference classes, while jobs’ priorities are strict and their
quality scores drop steeply with priority rank. The backbone is a
recursive structure. Intuitively, there is a nested sequence of worker
groups,
G1 ⊂ G2 ⊂ ⋯ ⊂ GK, K = Θ(log N),
together with a nested sequence of
``scarce’’ job sets
J1 ⊂ J2 ⊂ ⋯ ⊂ JK,
such that workers in Gk obtain utility essentially only from being assigned within Jk (and are largely indifferent within Jk). The market is engineered so that for each individual worker w there exists a weakly stable matching μ ∈ S with U(w, μ(w)) = 1 (hence U*(w) = 1), yet for any single matching μ there is some level k at which many workers in Gk must be pushed outside Jk and therefore obtain utility 0. The economic interpretation is a familiar one: ties allow workers to accept many jobs, but scarcity at different ``tiers’’
forces the planner to choose which tier is favored in any given realized
matching.
On the job side, we overlay a priority-and-quality design that
amplifies the cost of these choices. For each job a, we select a ``high-priority’’
subset of workers (often just the top-ranked worker) for whom Q(a, w) is near
1, and we set Q(a, w) to be very
small—on the order of 1/N—for
workers far down the priority list. Because Q is consistent with ≻a, this is simply a
statement about curvature: quality declines fast in rank. We also ensure
that there exists a job-optimal stable matching (with respect to
priorities) μA ∈ S
in which each job attains its benchmark quality, i.e.,
Q(a, μA(a)) = Q*(a) ∀a ∈ A.
Thus, when we measure job performance relative to Q*(a), we are
asking how much quality must be sacrificed when we depart from the
job-optimal stable outcome in order to equalize workers.
The punchline is that the worker-side bottleneck forces any worker-fair lottery to spread probability mass across many qualitatively different matchings, and that spread in turn forces at least one job to be assigned to low-ranked workers so often that its expected quality becomes a 1/N fraction of its stable benchmark.
There exists a constant c > 0 and a family of instances
with |W| = |A| = N and
one-sided ties on the worker side such that, for every lottery D over matchings (even if we do
not restrict D to be supported on
stable matchings), the worst-off worker share satisfies sW(D) ≤ c/log N; moreover, any lottery whose worst-off worker share comes within a constant factor of this ceiling forces the worst-off job share down to sA(D) = O(1/N).
Equivalently, along this family, any mechanism that attains a
near-optimal worker guarantee must suffer a linear job-quality collapse:
there are constants c1, c2 > 0
such that
$$
s_W(D)\;\ge\;\frac{c_1}{\log N}
\qquad\Longrightarrow\qquad
s_A(D)\;\le\;\frac{c_2}{N}.
$$
The worker-side component follows the standard ``nested scarcity’’ logic. Because workers in Gk obtain value primarily from Jk and |Jk| is small relative to |Gk|, any integral matching can satisfy (i.e., give utility 1 to) at most |Jk| of the |Gk| workers at level k. If we denote by 1{w is satisfied} the event that a worker receives a utility-1 job, then for each level k we obtain a capacity inequality of the form
$$
\sum_{w\in G_k}\Pr_{\mu\sim D}\bigl[U(w,\mu(w))=1\bigr] \;\le\; |J_k|.
$$
Choosing the sizes so that the ratio |Jk|/|Gk| decreases roughly like 1/k across levels yields the harmonic bound: to make all these inequalities simultaneously consistent, the worst-case satisfaction probability (hence the worst-off worker’s expected utility, and hence sW(D) because U*(w) = 1) cannot exceed Θ(1/log N). This is the sense in which one-sided ties generate logarithmic fairness limits: they create many locally ``flat’’
substitutions, but global feasibility still forces the planner to
time-share across Θ(log N) competing
scarcities.
The job-side collapse uses a different counting argument that
exploits steeply rank-decreasing quality. In the hard family we
designate a particular job a† (or a small constant
number of such jobs) that acts as a buffer across the worker-side bottleneck
levels. In any matching that materially improves the utility of the
currently worst-off worker group, the buffer job must be assigned to a
worker whom a†
ranks extremely poorly (because higher-ranked workers are already
``used’’ to satisfy other constraints). We encode this in Q by setting
$$
Q(a^\dagger,w)\approx 1 \ \text{for the top-ranked stable worker(s)},
\qquad
Q(a^\dagger,w)\approx \frac{1}{N}\ \text{for workers used as buffer
assignments}.
$$
In the job-optimal stable matching μA, the buffer
job receives one of its top-ranked workers, so Q*(a†) ≈ 1.
However, any lottery D that
pushes sW(D)
up to the logarithmic ceiling must place most of its probability mass on
matchings that use the buffer assignment; otherwise the nested capacity
constraints for the worker bottleneck are violated. Formally, one can
isolate a subset of matchings ℳbuffer such that
sW(D) ≥ Θ(1/log N) ⇒ Prμ ∼ D[μ ∈ ℳbuffer] ≥ 1 − Θ(1/log N),
and in every μ ∈ ℳbuffer the job a† is filled by a worker
whose quality is at most O(1/N). Taking expectations
then gives
$$
q_{a^\dagger}(D)\;=\;\mathbb{E}_{\mu\sim
D}\big[Q(a^\dagger,\mu(a^\dagger))\big]
\;\le\; O\!\left(\frac{1}{N}\right),
$$
and since Q*(a†) = Θ(1)
we conclude sA(D) ≤ O(1/N).
Combining this with the worker-side ceiling yields the two-sided
bound stated above.
The negative result is sharp up to constants on the same family. The key observation is that the worker bottleneck is fundamentally ``multi-level’’: to equalize workers, we must alternate across levels, and doing so inevitably uses the buffer job in exactly the damaging way identified above. This suggests a direct construction.
Let K = Θ(log N) be
the number of bottleneck levels in the instance. For each level k ∈ {1, …, K} we define an
integral matching μ(k) that (i) is
ex post ε-internally stable
(indeed, in the construction we can take ε = 0) and (ii) prioritizes
satisfying workers in level k,
using the buffer job a† as needed. We then
define the lottery D that
selects μ(k)
uniformly:
$$
D \;=\; \frac{1}{K}\sum_{k=1}^K \delta_{\mu^{(k)}}.
$$
By design, each worker belongs to O(1) relevant levels and is
satisfied in Θ(1) of the
matchings μ(k),
implying
$$
u_w(D)\;=\;\Theta\!\left(\frac{1}{K}\right)\cdot U^*(w)
\qquad\Rightarrow\qquad
s_W(D)\;=\;\Theta\!\left(\frac{1}{\log N}\right).
$$
Meanwhile, the buffer job is assigned to a low-ranked worker in all but
O(1) of the μ(k), so its
expected quality is Θ(1/N) relative to Q*(a†) = Θ(1),
yielding
$$
s_A(D)\;=\;\Theta\!\left(\frac{1}{N}\right).
$$
Thus the impossibility is not merely an artifact of analysis: it
reflects the true shape of the feasible two-sided guarantee region in
the worst case.
Economically, the construction highlights a policy dilemma that appears in practice (e.g., assignment with ties induced by coarse preference reports or by equity constraints): equalizing opportunities for workers via randomization may require routinely violating the priorities that drive job-side performance. At the same time, we emphasize that these are worst-case instances; the frontier can be far more favorable under milder tie structures, flatter quality decay, or positive instability tolerance ε. Precisely because the frontier can vary so dramatically across instances, it is natural to seek a canonical compromise point. This motivates the two-sided Nash bargaining mechanism we introduce next.
The impossibility result in the preceding section leaves open a constructive question: if the feasible region of two-sided guarantees is intrinsically thin in the worst case, how should a planner select among the remaining trade-offs in a principled way? We propose a canonical selection rule based on Nash bargaining applied to the worst-off shares on each side. The mechanism is deliberately agnostic about cardinal units—it operates on the normalized shares tW and tA—and it treats workers and jobs symmetrically at the level of guarantees, even though their primitives (utilities versus priorities) are asymmetric.
Recall that for any fractional allocation x we can write expected
payoffs
uw(x) = ∑a ∈ AxwaU(w, a), qa(x) = ∑w ∈ WxwaQ(a, w),
and compare them to the stable benchmarks U*(w) and Q*(a). We
interpret
$$
t_W \le \min_{w\in W}\frac{u_w(x)}{U^*(w)},\qquad
t_A \le \min_{a\in A}\frac{q_a(x)}{Q^*(a)}
$$
as simultaneously enforceable guarantees (one for the worst-off worker, one for the
worst-off job). The feasible set of guarantee pairs,
ℱ = {(tW, tA) : ∃x ∈ 𝒫 s.t. uw(x) ≥ tWU*(w) ∀w, qa(x) ≥ tAQ*(a) ∀a},
is convex because 𝒫 is convex and the
guarantee constraints are linear in (x, tW, tA).
Hence it is natural to seek a rule that is invariant to scaling and does
not require us to hand-pick a slope parameter λ ex ante.
We define the two-sided Nash compromise as the solution to
max log tW + log tA s.t. (tW, tA) ∈ ℱ, tW > 0, tA > 0.
Equivalently, because log is strictly
increasing, this program maximizes the product of worst-off shares:
max log tW + log tA ⇔ max tWtA.
This product criterion is attractive in our context because it is
scale-free (shares are dimensionless), penalizes extreme imbalance
(driving one side close to zero collapses the product), and has a clear
bargaining interpretation: we are maximizing the Nash product between
two ``representatives,’’ namely the worst-off worker (in share terms)
and the worst-off job.
The program selects a Pareto-efficient point of ℱ and, more strongly, it is optimal for the product ordering.
First, Pareto efficiency: if (x*, tW*, tA*) solves this program, then there is no other feasible (x, tW, tA) with tW ≥ tW* and tA ≥ tA* and at least one inequality strict. Indeed, if such a point existed, then log tW + log tA would be strictly larger at the new point, contradicting optimality.
Second, product optimality: for every feasible guarantee pair (tW, tA) ∈ ℱ
we have
tW*tA* ≥ tWtA,
since tW*tA*
is the maximum product over ℱ. Thus,
even when the feasible frontier is complicated (and even if 𝒫 is an outer relaxation of implementable
lotteries), the program identifies the point that is most ``balanced’’ in the Nash
sense.
Because 𝒫 is described by linear constraints (matching constraints plus stability-style constraints) and the only nonlinearity is the concave objective log tW + log tA, the program is a tractable convex optimization problem. In practice, there are two convenient views.
First, one can solve the program directly with standard interior-point methods for convex programs: the feasible set is polyhedral in (x, tW, tA), and the objective is smooth and strictly concave over tW > 0, tA > 0. When we also include bargaining weights (see below), the objective remains concave.
Second, since ℱ is convex, we can trace its upper boundary by the LP scalarizations described earlier. In particular, fixing a target τ for the job guarantee and maximizing tW is a linear program. We can then search over τ ∈ [0, 1] to maximize the Nash product tW(τ) ⋅ τ. This approach is useful when we already have efficient LP routines for the stability-style constraints defining 𝒫, and it makes the economics transparent: τ is the ``promised’’ job share, and the LP tells us the largest worker share compatible with that promise.
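For concreteness, here is a minimal sketch of the first (direct) view (assuming cvxpy; the stability encoding repeats the relaxed constraints from the scalarization sketch above and is likewise an outer approximation):

```python
import numpy as np
import cvxpy as cp

def nash_compromise(U, Q, priority_rank, Ustar, Qstar, eps=0.0, beta=1.0, gamma=1.0):
    """Maximize beta*log t_W + gamma*log t_A over the relaxed polytope P."""
    N = U.shape[0]
    x = cp.Variable((N, N), nonneg=True)
    tW, tA = cp.Variable(pos=True), cp.Variable(pos=True)
    cons = [cp.sum(x, axis=1) <= 1, cp.sum(x, axis=0) <= 1]
    for w in range(N):                          # relaxed stability constraints
        for a in range(N):
            B = [b for b in range(N) if U[w, b] >= U[w, a] - eps or b == a]
            higher = [wp for wp in range(N) if priority_rank[a, wp] < priority_rank[a, w]]
            cons.append(sum(x[w, b] for b in B) + sum(x[wp, a] for wp in higher) >= 1)
    u = cp.sum(cp.multiply(x, U), axis=1)       # expected worker utilities
    q = cp.sum(cp.multiply(x, Q.T), axis=0)     # expected job qualities
    cons += [u >= tW * Ustar, q >= tA * Qstar]
    prob = cp.Problem(cp.Maximize(beta * cp.log(tW) + gamma * cp.log(tA)), cons)
    prob.solve()                                # needs a solver handling logs (e.g., ECOS/SCS)
    return tW.value, tA.value, x.value
```

The second view corresponds to replacing the objective with Maximize(tW), adding q >= tau * Qstar, and searching over τ to maximize tW(τ) ⋅ τ.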
A practical concern is whether the compromise is well defined. While the allocation x* need not be unique (polyhedral feasible sets typically support multiple optimal allocations), the guarantee pair (tW*, tA*) is generically unique under mild conditions. The reason is that the set ℱ of achievable guarantee pairs is convex and compact (bounded by [0, 1]2 once benchmarks are normalized), and the objective log tW + log tA is strictly concave in (tW, tA). Strict concavity implies that if ℱ contains a feasible point with tW > 0 and tA > 0, then the optimizer in guarantee space is unique. Economically, even if there are many lotteries D (or fractional allocations x) that deliver the same minimal shares, the protection granted to each side is pinned down.
Two caveats are worth making explicit. First, if some benchmarks are zero (e.g., U*(w) = 0 for a worker who cannot obtain any positive utility in any stable matching), the share normalization is ill-posed; in that case we either exclude such agents from share guarantees or replace ratios by additive guarantees. Second, if ℱ only touches the axes (e.g., the stability-style constraints make it impossible to guarantee both sides strictly positively), then the log objective forces us to impose minimal floors or to relax feasibility (for instance via a positive ε), because log 0 is undefined. In the markets we study, interior feasibility is the typical case, and ε-internal stability is precisely a tool for restoring slack when strict stability is too constraining.
A planner may wish to privilege one side, for example due to
distributional or performance mandates. The Nash objective extends
naturally to the weighted program
max β log tW + γ log tA s.t. (tW, tA) ∈ ℱ, tW > 0, tA > 0,
which is equivalent to maximizing tWβtAγ.
Increasing γ/β shifts
the chosen point along the frontier toward higher job guarantees and
lower worker guarantees, in a continuous and interpretable way.
Importantly, the weighted program remains convex and retains the same invariance properties:
what matters are shares, not utility units.
We emphasize a limitation that will matter for implementation. The program is formulated over the relaxation 𝒫, which is designed to capture feasibility and (approximate) stability constraints in expectation. Solving it yields a fractional allocation x* (and thus target shares tW*, tA*), but it does not by itself provide an explicit lottery over integral matchings whose realizations are ex-post ε-internally stable. Establishing such an implementation requires a decomposition or rounding argument, and the extent to which it is possible depends on integrality properties of the underlying polytope and on how we encoded stability. We address this gap next by showing when exact decomposition is guaranteed and what heuristics are available otherwise.
Our Nash program (or any of the frontier LP scalarizations) outputs a fractional allocation x* ∈ 𝒫 together with target guarantees (tW*, tA*). Interpreting xwa* as a long-run assignment frequency is economically natural in repeated assignment settings (e.g., day-by-day scheduling): worker w should spend an xwa* fraction of periods at job a. The remaining step is operational rather than conceptual: we must convert x* into an explicit lottery D over matchings whose realizations are ex-post ε-internally stable.
Let 1μ ∈ {0, 1}W × A
denote the incidence matrix of an integral matching μ (with 1μ, wa = 1
iff μ(w) = a). A
lottery D supported on a set
ℳ of integral matchings implements the
fractional allocation x
iff
x = 𝔼μ ∼ D[1μ] ∈ conv(ℳ).
In our context, ℳ is not the set of all
matchings, but rather the subset Iε of matchings
that are ex-post ε-internally
stable. Thus, exact implementation of x* is equivalent to a convex-hull membership
statement:
x* ∈ conv(Iε).
This is the precise sense in which the choice of feasible polytope 𝒫 matters. If 𝒫 is chosen so that 𝒫 = conv(Iε) (or
at least 𝒫 ⊆ conv(Iε)),
then every fractional optimum can be implemented exactly as a lottery
over admissible matchings. If instead 𝒫
is an outer relaxation designed for tractability, then x* may fail to be
implementable, and we must accept either approximation or additional
computation.
There are two conceptually distinct ingredients behind an exact lottery: (i) integrality (extreme points correspond to matchings), and (ii) decomposition (writing a fractional point as a convex combination of extreme points).
First, if the feasible set is simply the bipartite matching polytope
(i.e., no stability-style constraints), then integrality holds and the
Birkhoff–von Neumann theorem yields a decomposition of any doubly
stochastic matrix into permutation matrices. With ≤ matching constraints (allowing unmatched
agents), we can add dummy workers/jobs to make the matrix square and
apply the same argument. Algorithmically, one obtains a
decomposition
$$
x^* \;=\; \sum_{r=1}^{R} p_r \mathbf{1}_{\mu_r},
\qquad p_r\ge 0,\ \sum_r p_r=1,
$$
with R ≤ N2 via
standard greedy cycle-canceling procedures.
Second, with stability restrictions, exact decomposition continues to be possible in special cases where the relevant ``stable’’ polytope is integral. Under strict preferences on both sides, the (weakly) stable matching polytope admits an integral linear description (classically associated with Rothblum-type formulations), implying that any feasible fractional point can be expressed as a convex combination of stable matchings. In such settings, if we encode stability correctly, the output x* can be decomposed into ex-post stable matchings, hence also ex-post ε-internally stable matchings for any ε ≥ 0.
Third, there are economically important priority structures under which integrality is effectively ``built in.’’ For example, with a common priority order across jobs (serial dictatorship), the stable matching is essentially unique once ties are resolved, and the planner’s scope for randomization comes primarily from mixing across tie-breakings or across cohorts. In these cases the convexified feasible region is often the convex hull of a small, explicitly describable set of integral outcomes, and decomposition is straightforward.
A useful general bound is provided by Carathéodory’s theorem: if x* ∈ conv(Iε) ⊂ ℝN2, then x* can be represented using at most N2 + 1 matchings in the support (and typically many fewer in practice). This matters for implementation because a lottery with a polynomial-size support can be operationalized as a rotating schedule rather than an opaque ``random draw’’ over exponentially many outcomes.
With ties on the worker side, stable sets can be large and the corresponding polytopes can lose integrality; indeed, fractional feasible points may exist that satisfy natural linear stability-style constraints in expectation but are not convex combinations of ex-post stable (or internally stable) matchings. This is not merely a technical nuisance: it reflects a real economic tension between (i) using randomization to smooth worker welfare across indifference classes and (ii) respecting job priorities ex post. Our approach is therefore to separate the selection step (selecting a compelling point on the frontier) from the implementation step (finding a nearby lottery over admissible matchings), and to be explicit about when the second step requires approximation.
A robust implementation strategy in large systems is a two-step pipeline.
We first compute a decomposition of x* into integral matchings using a Birkhoff–von Neumann style algorithm (after adding dummies if needed). This produces matchings μr that exactly match the desired long-run frequencies but may violate ex-post ε-internal stability.
In the second step, for each realized matching μr, we apply a
local post-processing procedure that eliminates blocking pairs among
matched agents with gain exceeding ε. One natural repair rule is to
restrict attention to the submarket induced by currently matched agents
and run a priority-respecting deferred acceptance procedure (with
utilities only affecting workers’ choices) until no ε-blocking pair remains. Because
jobs have strict priorities, such a repair can be implemented
deterministically and transparently. The economic cost is that repair
perturbs the realized matching away from μr, and hence
perturbs the induced frequencies away from x*. We therefore view
this approach as producing an implemented lottery D̂ satisfying ex-post ε-internal stability by construction, while
approximating the target shares:
uw(D̂) ≈ uw(x*), qa(D̂) ≈ qa(x*).
In applications, the planner can monitor these deviations and increase
the support size (more periods/more matchings) to reduce empirical
error.
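As one concrete repair rule (a sketch of a possible implementation, not a procedure fixed by the text), rerun the deferred-acceptance sketch from earlier on the submarket of currently matched agents, with each worker's score for her current job tilted up by a tiny amount so that exact ties are resolved in favor of the status quo; the output then admits no blocking pair among matched agents whose worker-side gain exceeds the tilt, hence no ε-blocking pair for any ε above the tilt.

```python
import numpy as np

def repair_matching(mu, U, priority_rank, tilt=1e-6):
    """Post-process one realized matching so it is defensible among matched agents.

    Restricts to matched workers/jobs and reruns worker-proposing deferred
    acceptance (see the earlier sketch) with a tiny bonus on each worker's
    current job, so repairs are minimal when mu is already defensible.
    """
    mu = np.asarray(mu)
    matched_w = [w for w in range(len(mu)) if mu[w] >= 0]
    matched_a = [int(mu[w]) for w in matched_w]           # column i <-> worker i's current job
    k = len(matched_w)
    U_sub = np.array([[U[w, a] for a in matched_a] for w in matched_w], dtype=float)
    R_sub = np.array([[priority_rank[a, w] for w in matched_w] for a in matched_a])
    U_sub[np.arange(k), np.arange(k)] += tilt             # favor the status quo on exact ties
    sub_mu = deferred_acceptance(U_sub, R_sub)            # submarket indices
    repaired = mu.copy()
    for i, w in enumerate(matched_w):
        repaired[w] = matched_a[sub_mu[i]]
    return repaired
```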
When post-processing distortions are material, we can instead build stability into the randomized construction. A simple approach is to use x* only as a target for marginal assignment frequencies: in each period, sample a priority-respecting matching by running deferred acceptance on a perturbed instance where worker utilities are slightly randomized around U(w, a) and where the perturbation is chosen to match the desired marginal probabilities suggested by x*. The perturbation can be tuned so that the resulting matching is ex-post stable for the perturbed instance, and hence typically ε-internally stable for the original instance for a chosen tolerance ε (interpreting ε as a robustness buffer). This produces a sequence of ex-post admissible matchings whose empirical frequencies converge toward a target region, even if exact equality to x* is unattainable.
In the most general case, determining whether x* ∈ conv(Iε)
and, if so, finding an explicit convex decomposition can require
numerical methods beyond a closed-form decomposition routine. A clean
formulation is to optimize directly over lotteries with a finite
support:
maxpr, μr ∈ Iε log tW + log tA s.t. ∑rprU(w, μr(w)) ≥ tWU*(w), ∑rprQ(a, μr(a)) ≥ tAQ*(a),
with ∑rpr = 1,
pr ≥ 0.
The computational difficulty is that Iε can be
enormous. Practically, we can address this with column generation: maintain a working set
of admissible matchings, solve the restricted master LP over their
probabilities, and then search for a new matching μ ∈ Iε
that most improves the dual objective. The ``pricing’’ subproblem is a
combinatorial optimization over internally stable matchings and may
itself be nontrivial; in some priority structures it reduces to running
a variant of deferred acceptance, while in others it resembles a
mixed-integer program. We flag this as the point where purely analytical
guarantees give way to instance-specific computation.
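As an illustration of the restricted-master step only, the sketch below solves the scalarization ``maximize tW subject to a job-side floor τ’’ over a working set of matchings using scipy.optimize.linprog; the pricing subproblem is deliberately left out, and all names are ours.

```python
import numpy as np
from scipy.optimize import linprog

def restricted_master(matchings, U, Q, U_star, Q_star, tau):
    """LP over probabilities p_r on a working set of matchings.

    matchings: list of arrays mu with mu[w] = job assigned to worker w.
    Maximizes t_W s.t. worker shares >= t_W * U*, job shares >= tau * Q*, sum p = 1.
    """
    R, N = len(matchings), U.shape[0]
    c = np.zeros(R + 1); c[-1] = -1.0          # variables (p_1..p_R, t_W); minimize -t_W
    A_ub, b_ub = [], []
    for w in range(N):                          # -sum_r p_r U(w, mu_r(w)) + t_W U*(w) <= 0
        row = np.zeros(R + 1)
        for r, mu in enumerate(matchings):
            row[r] = -U[w, mu[w]]
        row[-1] = U_star[w]
        A_ub.append(row); b_ub.append(0.0)
    for a in range(N):                          # sum_r p_r Q(a, mu_r^{-1}(a)) >= tau Q*(a)
        row = np.zeros(R + 1)
        for r, mu in enumerate(matchings):
            assigned = np.where(mu == a)[0]
            if assigned.size:
                row[r] = -Q[a, assigned[0]]
        A_ub.append(row); b_ub.append(-tau * Q_star[a])
    A_eq, b_eq = [np.append(np.ones(R), 0.0)], [1.0]
    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  A_eq=np.array(A_eq), b_eq=np.array(b_eq),
                  bounds=[(0, None)] * (R + 1))
    if not res.success:
        raise ValueError("restricted master infeasible for this working set")
    return res.x[:R], res.x[-1]                 # probabilities and achieved t_W
```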
From a policy perspective, the virtue of an explicit decomposition is that it yields a transparent rotation schedule: matchings μ1, …, μR with weights (pr) can be implemented as a deterministic rotation over R days (or weeks), with each pattern used approximately a fraction pr of the time. This makes the randomization interpretable to participants and auditable to regulators. When we must rely on repair or sampling methods, we recommend publishing (i) the ex-post admissibility criterion (here, ε-internal stability), (ii) a clear fairness/quality dashboard tracking realized shares, and (iii) the chosen ε as an explicit ``instability budget.’’ In our view, the key limitation to acknowledge is conceptual: in markets with one-sided ties, there may be no mechanism that simultaneously delivers the frontier-level guarantees and insists on strong ex-post stability in every realization. Our implementation layer is therefore designed to make the trade-off explicit, controllable, and empirically checkable, rather than hidden in tie-breaking or ad hoc overrides.
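Converting the weights (pr) into an R-period rotation is a one-step rounding exercise; the helper below (largest-remainder rounding, our own naming) is one simple way to do it.

```python
def rotation_schedule(weights, periods):
    """Assign each matching r a number of periods approximately proportional
    to weights[r], using largest-remainder rounding."""
    raw = [w * periods for w in weights]
    counts = [int(x) for x in raw]
    leftover = periods - sum(counts)
    # Give the remaining periods to the largest fractional remainders.
    order = sorted(range(len(raw)), key=lambda r: raw[r] - counts[r], reverse=True)
    for r in order[:leftover]:
        counts[r] += 1
    return counts   # counts[r] = number of periods to run matching r
```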
The frontier and Nash constructions above are intentionally general: they treat the planner’s choice as a convex optimization over a (possibly relaxed) stability-aware polytope. In many operational settings, however, the institutional environment imposes additional structure—often precisely the structure that makes implementation transparent. In this subsection we record a set of special cases where (i) the admissible matchings have a simple combinatorial description, and (ii) the induced (tW, tA) frontier can be characterized in closed form (or via a one-dimensional parametric rule), yielding a particularly interpretable policy object.
Suppose all jobs share a common strict priority order over workers,
written
w1 ≻ w2 ≻ ⋯ ≻ wN,
so that for every job a we
have ≻a ≡ ≻. This
is the canonical
``common ranking’’ environment in school choice (e.g., test-score priorities) and in staffing systems with a single seniority list. In this case the set of weakly stable matchings has a simple description: every stable matching can be generated by a serial dictatorship in which workers are processed in priority order, and worker wi is assigned her utility-maximizing job among those not yet taken (with arbitrary tie-breaking inside indifference classes). Formally, if 𝒯i(J) denotes worker wi’s set of maximizers in a remaining job set J ⊆ A, then any stable matching μ can be obtained by iterating
$$
\mu(w_i)\in \mathcal{T}_i\!\Bigl(A\setminus \{\mu(w_1),\ldots,\mu(w_{i-1})\}\Bigr).
$$
Conversely, every outcome of this procedure is weakly stable because no job can ``reach up’’ to a higher-priority worker who is willing to move: every higher-priority worker already holds a (weakly) best available job at the moment she is assigned.
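A minimal sketch of this serial dictatorship, with a tie_break hook where the planner's distributive choices enter (names are ours and the default tie-break is an arbitrary deterministic one):

```python
def serial_dictatorship(priority, U, tie_break=None):
    """priority: list of workers from highest to lowest priority.
    U[w][a]: worker w's utility for job a (all workers rate the same job set).
    tie_break(w, ties): picks one job from w's indifference class `ties`.
    Returns a dict worker -> assigned job.
    """
    remaining = set(U[priority[0]].keys())
    assignment = {}
    for w in priority:
        if not remaining:
            break
        best = max(U[w][a] for a in remaining)
        ties = sorted(a for a in remaining if U[w][a] == best)  # indifference class
        a = tie_break(w, ties) if tie_break else ties[0]
        assignment[w] = a
        remaining.remove(a)
    return assignment
```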
Two practical implications follow immediately. First, ex-post stability is easy to certify and to communicate: one can literally publish the priority list and the tie-breaking rule. Second, the planner’s randomization problem collapses to a problem of selecting a distribution over tie-breaking rules (or, equivalently, a distribution over serial-dictatorship choice paths). This is substantially lower-dimensional than optimizing over arbitrary matchings.
Under common priorities, the scalarized frontier programs admit an
especially transparent interpretation. Fix any weight λ ∈ [0, 1] and consider the
per-period objective
$$
\lambda \cdot \min_{w}\frac{U(w,\mu(w))}{U^*(w)} \;+\; (1-\lambda)\cdot
\min_{a}\frac{Q(a,\mu(a))}{Q^*(a)},
$$
or, in the fractional relaxation, the corresponding max-min constraints
with a weighted objective. Because feasible integral outcomes correspond
to serial dictatorship realizations, any optimal lottery can be viewed
as mixing among finitely many tie-breaking rules. As λ varies,
the identity of the optimal tie-breaking rule changes only when two such
rules yield the same weighted objective. Consequently, the induced
frontier in (tW, tA)
is piecewise linear: it is the upper envelope of finitely many linear segments obtained
by mixing between a small set of ``extreme’’ tie-breaking regimes
(worker-favoring versus job-favoring) that differ only at a set of
pivotal indifference classes.
This has two policy-facing benefits. Computationally, one can trace the frontier by a simple parametric sweep over λ (or over a target τ for tA) and record the finitely many breakpoints. Substantively, each breakpoint corresponds to a concrete rule change: for a specific priority rank i and a specific indifference class, we switch from assigning wi to her most worker-beneficial option to assigning her to an alternative that protects job quality for a subset of jobs. In other words, the frontier can be explained as a menu of ``which tie(s) we break in which direction.’’
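A minimal sketch of such a sweep, assuming the analyst has already evaluated a finite list of candidate tie-breaking regimes and recorded each regime's worst-off share pair (that list is an input we assume rather than compute); the grid-based envelope is an approximation of the exact piecewise-linear frontier.

```python
import numpy as np

def frontier_breakpoints(regimes, grid=1001):
    """regimes: list of (t_W, t_A) pairs, one per deterministic tie-breaking rule.
    Returns the lambda values at which the weighted-objective argmax changes."""
    lambdas = np.linspace(0.0, 1.0, grid)
    best = []
    for lam in lambdas:
        scores = [lam * tW + (1 - lam) * tA for tW, tA in regimes]
        best.append(int(np.argmax(scores)))
    return [lambdas[i] for i in range(1, grid) if best[i] != best[i - 1]]
```

Each recorded breakpoint is then a candidate rule change of the kind described above.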
Closed-form expressions become even sharper when worker utilities are
binary. Let U(w, a) ∈ {0, 1}
and define the acceptable set L(w) = {a : U(w, a) = 1}.
Suppose additionally that each job’s benchmark quality is attained by
receiving any worker above a rank cutoff, i.e.,
Q(a, w) = 1{w ∈ T(a)},
where T(a) is the set
of workers above job a’s
quality cutoff (consistent with ≻a), so that Q*(a) = 1
whenever T(a) ≠ ∅. In
this case, the share constraints become literal probability
guarantees:
$$
\sum_{a\in L(w)} x_{wa}\ \ge\ t_W\quad \forall w,
\qquad
\sum_{w\in T(a)} x_{wa}\ \ge\ t_A\quad \forall a,
$$
together with the matching constraints ∑a xwa ≤ 1 and ∑w xwa ≤ 1.
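As an illustration, these constraints can be handed directly to a generic LP solver; the sketch below uses scipy.optimize.linprog to maximize tW for a fixed job-side floor tA, with L and T supplied as 0/1 arrays (all names are ours).

```python
import numpy as np
from scipy.optimize import linprog

def max_worker_share(L, T, t_A):
    """Maximize t_W subject to the binary/threshold constraints.

    L[w, a] = 1 if job a is acceptable to worker w;
    T[a, w] = 1 if worker w clears job a's quality cutoff.
    Variables: x (N*N, row-major, x[w*N + a]) and t_W.
    """
    L = np.asarray(L, dtype=float); T = np.asarray(T, dtype=float)
    N = L.shape[0]; nx = N * N
    c = np.zeros(nx + 1); c[-1] = -1.0               # minimize -t_W
    A_ub, b_ub = [], []
    for w in range(N):                               # -sum_{a in L(w)} x_wa + t_W <= 0
        row = np.zeros(nx + 1)
        row[w * N:(w + 1) * N] = -L[w]; row[-1] = 1.0
        A_ub.append(row); b_ub.append(0.0)
    for a in range(N):                               # -sum_{w in T(a)} x_wa <= -t_A
        row = np.zeros(nx + 1)
        row[a:nx:N] = -T[a]
        A_ub.append(row); b_ub.append(-t_A)
    for w in range(N):                               # sum_a x_wa <= 1
        row = np.zeros(nx + 1); row[w * N:(w + 1) * N] = 1.0
        A_ub.append(row); b_ub.append(1.0)
    for a in range(N):                               # sum_w x_wa <= 1
        row = np.zeros(nx + 1); row[a:nx:N] = 1.0
        A_ub.append(row); b_ub.append(1.0)
    res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=[(0, None)] * (nx + 1))
    if not res.success:
        raise ValueError("infeasible at this t_A")
    return res.x[:nx].reshape(N, N), res.x[-1]       # allocation and maximal t_W
```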
In this binary/threshold regime, the frontier can be characterized by a family of Hall-type bottlenecks: to guarantee a worker share tW, every subset of workers must collectively have access to enough acceptable capacity once enough high-priority mass has been reserved to satisfy the job constraints at level tA. Concretely, a necessary condition (and, under standard flow formulations, essentially sufficient in the fractional model) is that for every R ⊆ W,
$$
|R|\cdot t_W\ \le\ \sum_{a\in N_L(R)}\bigl(1-\mathrm{reserved}_a(t_A)\bigr),
$$
where NL(R) = ∪w ∈ R L(w) and reserveda(tA) is the amount of job a’s unit capacity that must be allocated to workers in T(a) to meet the job-side guarantee. One may view this as an explicit statement of the two-sided capacity conflict: worker fairness consumes capacity of a small set of popular jobs, while job quality consumes capacity of jobs with strict cutoffs, and feasibility is governed by a finite collection of subset constraints (equivalently, min-cuts in an associated flow network). As a result, the frontier in (tW, tA) can be computed by a parametric max-flow routine with finitely many breakpoints, which in practice behaves like a closed-form rule: as we increase tA, a small number of cuts become tight and determine the maximal feasible tW.
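The min-cut view can also be checked computationally. Under the simplifying assumptions that reserveda(tA) = tA whenever T(a) is nonempty and that the reserved mass itself is deliverable from T(a), the Hall-type condition reduces to a single max-flow computation; the sketch below uses networkx and is meant as an illustration rather than a full parametric routine.

```python
import networkx as nx

def worker_share_feasible(L, has_cutoff, t_W, t_A):
    """Check |R| * t_W <= sum_{a in N_L(R)} (1 - reserved_a) for all R,
    with reserved_a = t_A when job a has a nonempty cutoff set, else 0.

    L: dict worker -> iterable of acceptable jobs.
    has_cutoff: dict job -> bool (whether T(a) is nonempty).
    """
    G = nx.DiGraph()
    workers = list(L.keys())
    jobs = {a for accs in L.values() for a in accs}
    for w in workers:
        G.add_edge("s", ("w", w), capacity=t_W)
        for a in L[w]:
            # Capacity t_W on middle edges suffices: no worker routes more than t_W.
            G.add_edge(("w", w), ("a", a), capacity=t_W)
    for a in jobs:
        residual = 1.0 - (t_A if has_cutoff.get(a, False) else 0.0)
        G.add_edge(("a", a), "t", capacity=max(residual, 0.0))
    flow_value, _ = nx.maximum_flow(G, "s", "t")
    return flow_value >= len(workers) * t_W - 1e-9
```

Sweeping tA over a grid and recording the largest feasible tW then traces the breakpoint structure described above.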
If mixing preserves the identity of the worst-off agents (so that the binding constraints remain the same along the segment), then for the mixture Dα = αDW + (1 − α)DA of the worker-favoring lottery DW and the job-favoring lottery DA we obtain the linear shares
$$
t_W(\alpha)=\alpha\cdot 1+(1-\alpha)\underline{\sigma},
\qquad
t_A(\alpha)=\alpha\underline{\rho}+(1-\alpha)\cdot 1.
$$
The two-sided Nash solution (maximizing log tW + log tA)
is then obtained by a one-line calculation: choose α to satisfy the first-order
condition that the frontier’s slope equals −tA/tW,
which here reduces to solving a scalar equation in α. Operationally, this says that the
Nash compromise selects a mixing weight between two interpretable regimes (``how often
we run the worker-favoring tie-break’’), rather than an opaque
high-dimensional randomization.
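For this one-dimensional regime, the Nash point can be computed by a scalar search; a minimal sketch (sigma_lo and rho_lo stand for σ̲ and ρ̲, both assumed strictly positive, and the function name is ours):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def nash_alpha(sigma_lo, rho_lo):
    """Maximize log t_W(alpha) + log t_A(alpha) over alpha in [0, 1], with
    t_W(alpha) = alpha + (1 - alpha) * sigma_lo,
    t_A(alpha) = alpha * rho_lo + (1 - alpha).
    Assumes 0 < sigma_lo, rho_lo <= 1 so both shares stay positive."""
    def neg_log_product(alpha):
        t_W = alpha + (1 - alpha) * sigma_lo
        t_A = alpha * rho_lo + (1 - alpha)
        return -(np.log(t_W) + np.log(t_A))
    res = minimize_scalar(neg_log_product, bounds=(0.0, 1.0), method="bounded")
    alpha = res.x
    return alpha, alpha + (1 - alpha) * sigma_lo, alpha * rho_lo + (1 - alpha)
```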
The value of these closed-form regimes is not only computational. They turn the abstract worst-off shares into objects that stakeholders recognize. In the binary-utility case, tW is literally a floor on each worker’s probability of receiving an acceptable job (relative to what she could obtain in some stable outcome), while tA is a floor on each job’s probability of receiving a worker meeting its service-level cutoff. Under common priorities, moving along the frontier corresponds to changing a small set of tie-break decisions for high-priority agents—exactly the kind of design lever that agencies already use (e.g., controlled lotteries within priority classes). The Nash point then has a natural ``fair compromise’’ narrative: it is the unique point on the feasible frontier that maximizes the product of the two worst-off guarantees, and therefore resists unit changes (cardinal utilities versus quality scores) and rhetorical reframings (worker-centric versus job-centric reporting).
At the same time, these special cases clarify a limitation that is easy to miss in purely numerical treatments. When the frontier is driven by a handful of tight bottlenecks (popular jobs, strict cutoffs, or pivotal ties), small modeling changes (for example, redefining Q from a cutoff metric to a steep rank-based metric) can move the breakpoints and hence the policy prescription discontinuously. The closed-form view therefore complements our general framework: it tells the planner where the trade-offs come from, which constraints are truly binding, and which ``tie-break rules’’ are doing the distributive work.
Our baseline formulation treats the job side via a single cardinal quality score Q(a, w) ∈ [0, 1] that is strictly decreasing in the priority ranking ≻a. This is deliberately flexible: it allows us to compare mechanisms even when jobs care about priorities for institutional reasons (e.g., seniority, credentials) but the planner wants to report outcomes in a policy-relevant metric (e.g., expected service quality). In practice, however, different applications call for different job-facing metrics and constraints. We briefly record several extensions that fit naturally into the same convex-analytic template, and we emphasize where the mapping is exact (i.e., a linear program) versus where it introduces mild approximations.
In some settings, jobs do not have meaningful ``cardinal’’ evaluations of workers but do have a strong preference for higher-ranked workers, with diminishing sensitivity beyond the top ranks. A parsimonious way to encode this is
$$
Q(a,w)\;=\;g_a\bigl(r_a(w)\bigr),
$$
where ra(w) ∈ {1, …, N} is the rank of worker w under ≻a (with ra(w) = 1 best) and ga: {1, …, N} → [0, 1] is strictly decreasing. If ga is approximately linear in rank, then our existing linear objective qa(x) = ∑w xwa Q(a, w) already captures the job side. If ga is convex or concave in rank and the analyst prefers to report a nonlinear summary (e.g., expected rank rather than expected score), then there are two common approaches. First, one can define Q directly as a normalized rank utility, e.g.,
$$
Q(a,w)=1-\frac{r_a(w)-1}{N-1},
$$
which preserves linearity of qa(x) and gives a transparent interpretation: qa(x) is one minus the expected normalized rank. Second, for more complex ga one can approximate ga by a piecewise-linear function of rank, which amounts to replacing a nonlinear quality curve with a small number of linear segments; the resulting programs remain linear after introducing standard auxiliary variables. The upshot is that most ``rank-sensitivity’’ desiderata on the job side can be incorporated while keeping the frontier computation tractable and the shares interpretable.
A different, operationally salient job metric is a service-level
guarantee: a job wants a high probability of being assigned a worker
from its top-k priority set.
This can be expressed as a threshold quality function,
Qk(a, w) = 1{ra(w) ≤ k},
or equivalently via a direct constraint on the fractional
allocation:
$$
\sum_{w:\ r_a(w)\le k} x_{wa}\ \ge\ t_A \qquad \forall a.
$$
This form is attractive for policy because it reads as ``each job
receives a top-k worker with
probability at least tA (relative to
its best stable benchmark).’’ Importantly, it also clarifies what the
planner is trading off: raising tA concentrates
probability mass on each job’s small set of high-priority workers, which
mechanically reduces the slack available to equalize worker-side
guarantees. Multiple service levels can be layered by imposing several
constraints (e.g., top-k1 with probability ≥ tA(1)
and top-k2 with
probability ≥ tA(2)),
producing a multi-dimensional frontier. When a single scalar summary is
needed, one can either (i) take the minimum across the service levels as
the job-side objective, or (ii) embed them in a weighted Nash objective;
both preserve convexity because the constraints are linear and the
objective is concave in the share variables.
Many applications are not one-to-one: jobs may have capacities
(multiple seats), and sometimes workers can accept multiple assignments
over time. Let ca ∈ ℤ+
be job a’s capacity. In the
fractional model, we simply replace ∑wxwa ≤ 1
by
$$
\sum_{w\in W} x_{wa}\ \le\ c_a \qquad \forall a,
$$
and, if each worker can hold at most one unit per period, keep ∑a xwa ≤ 1.
The principal modeling change is on the stability side. For
hospital–resident style stability, a job with capacity ca can be
blocked by a worker w if (i)
w prefers a to her current assignment and (ii)
a either has unused capacity
or prefers w to at least one
currently assigned worker. In the fractional setting, a standard
linearization uses cumulative priority mass: define $W_a^{\succeq w}=\{w':\,w'\succ_a w\ \text{or}\ w'=w\}$. A capacity-aware
analogue of our stability-style constraint is
$$
\sum_{b:\, U(w,b)\ge U(w,a)} x_{wb} \;+\; \frac{1}{c_a}\sum_{w'\in
W_a^{\succeq w}} x_{w'a} \;\ge\; 1
\qquad \forall (w,a),
$$
which can be read as: either worker w already receives enough
probability on jobs she weakly prefers to a, or job a allocates sufficient probability to workers it weakly prefers to w so that w cannot ``profitably displace’’
capacity in expectation. This constraint is conservative (it is tailored
to preserve linearity), but it is typically tight in the environments
where we expect fractional allocations to be implementable by
time-sharing across internally stable integral outcomes. More generally,
one can work with the known polyhedral descriptions of the stable
matching polytope in many-to-one markets under strict preferences, and
then add the share constraints exactly as before.
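For readers who want to wire the capacity-aware inequality into a solver, the helper below assembles one constraint row per (w, a) pair; it is a mechanical transcription of the displayed inequality, with our own names and data layout.

```python
import numpy as np

def capacity_stability_rows(U, rank, cap):
    """Rows A of the capacity-aware stability-style constraints A x >= 1.

    U: (N_w, N_a) worker utilities; rank[a][w]: job a's priority rank of w
    (smaller = higher priority, strict); cap[a]: capacity c_a.
    x is vectorized row-major as x[w * N_a + a].
    """
    N_w, N_a = U.shape
    rows = []
    for w in range(N_w):
        for a in range(N_a):
            row = np.zeros(N_w * N_a)
            # Worker term: jobs b that w weakly prefers to a (includes b = a).
            for b in range(N_a):
                if U[w, b] >= U[w, a]:
                    row[w * N_a + b] += 1.0
            # Job term: workers job a weakly prefers to w, scaled by 1/c_a.
            for wp in range(N_w):
                if rank[a][wp] <= rank[a][w]:
                    row[wp * N_a + a] += 1.0 / cap[a]
            rows.append(row)
    return np.array(rows)    # enforce rows @ x >= 1 for every (w, a)
```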
Our use of ex-post ε-internal stability is motivated by
institutional reality: small perturbations in utilities, tie-breaking,
or feasibility (e.g., last-minute withdrawals) should not create large
incentives to deviate among the agents who are actually matched.
Operationally, allowing ε > 0 expands the set of
implementable lotteries and can smooth discontinuities caused by
knife-edge ties. There are two complementary ways to incorporate ε into the convex programs. The
first is ex-post: restrict the support of the lottery D to matchings in Iε, and then
optimize over distributions. The second is ex-ante, via relaxed blocking inequalities: if a worker-job pair counts as a blocking pair only when it yields a utility gain exceeding ε, the stability-style constraints can be weakened so that smaller gains are not treated as destabilizing. Concretely, define the ε-satisficing set Sε(w, a) = {b ∈ A : U(w, b) ≥ U(w, a) − ε}, and replace the worker term ∑_{b : U(w, b) ≥ U(w, a)} xwb by ∑_{b ∈ Sε(w, a)} xwb in the stability constraint. Since Sε(w, a) weakly grows with ε, this weakens each inequality while maintaining linearity, and it captures the intended behavioral restriction: only deviations with gain larger than ε are treated as
destabilizing. The planner then obtains a clear policy knob: increasing
ε corresponds to tolerating
small envy gaps in exchange for larger simultaneous guarantees (tW, tA).
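A one-line helper makes the relaxed index set concrete (a sketch; U is a worker-by-job utility array and the function name is ours).

```python
def eps_satisficing_set(U, w, a, eps):
    """Jobs within eps of job a for worker w:
    S_eps(w, a) = {b : U[w, b] >= U[w, a] - eps}.
    Summing x_{wb} over this (weakly larger) set weakens the stability
    constraint as eps grows, which is the intended relaxation."""
    return [b for b in range(U.shape[1]) if U[w, b] >= U[w, a] - eps]
```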
In many deployments, U(w, a) is not
known exactly at design time: it is estimated from historical choices,
surveys, or noisy preference elicitation. A natural extension is
therefore robust optimization. Suppose we observe a baseline estimate
Û and posit an uncertainty set
𝒰, for example an ℓ∞ ball
𝒰 = {U: |U(w, a) − Û(w, a)| ≤ ηwa ∀(w, a)},
or a scenario set {U(1), …, U(M)}.
A conservative planner may then choose a single lottery (or fractional
allocation) to maximize worst-case guarantees:
$$
\max_{x\in\mathcal{P}}\ \min_{U\in\mathcal{U}}\ \min_{w}\
\frac{u_w(x;U)}{U^*(w;U)},
$$
possibly subject to a job-side constraint that also holds uniformly over
U ∈ 𝒰. Two comments are
important. First, the ratio form is intrinsically nonlinear because the
benchmark U*(w; U)
depends on U through the
stable set. A tractable substitute is to benchmark against a fixed
reference stable set (computed under Û) or to use lower bounds on U*(w; U)
that are valid on 𝒰. Second, if the
analyst is willing to treat benchmarks as fixed (e.g., because job
priorities are the primary institutionally relevant side and worker
utilities only enter through acceptability), then the robust counterpart
reduces to a standard robust LP: it suffices to enforce $u_w(x;U)\ge t_W \underline{U}^*(w)$ for all
U ∈ 𝒰, which becomes linear
under ℓ∞-type sets.
In either case, the conceptual message is stable: robustness shifts the
frontier inward, and the Nash compromise selects a point that is stable
against both sides’ worst-case realizations.
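When benchmarks are treated as fixed, the ℓ∞ robust counterpart amounts to evaluating each worker's utilities at their elementwise lower envelope; the sketch below builds the corresponding constraint rows (names and data layout are ours).

```python
import numpy as np

def robust_worker_rows(U_hat, eta, U_lower_star, t_W):
    """Rows for the robust worker-share constraints
    u_w(x; U) >= t_W * U_lower_star[w] for all U with |U - U_hat| <= eta.

    Because x >= 0, the worst case is U = U_hat - eta, so each robust
    constraint is the single linear inequality
    sum_a x_wa * (U_hat - eta)[w, a] >= t_W * U_lower_star[w].
    Returns (A, b) with A x >= b, x vectorized row-major.
    """
    N_w, N_a = U_hat.shape
    U_worst = U_hat - eta                    # elementwise lower envelope
    A = np.zeros((N_w, N_w * N_a))
    for w in range(N_w):
        A[w, w * N_a:(w + 1) * N_a] = U_worst[w]
    b = t_W * np.asarray(U_lower_star, dtype=float)
    return A, b
```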
Finally, while our core results are offline, the lottery interpretation is inherently dynamic: the planner implements D repeatedly over periods. This creates a natural interface with online learning. If utilities (or participation) are learned over time, the planner can maintain confidence sets 𝒰t around Ût and recompute a robust Nash point (xt, tW,t, tA,t) each period. Two design principles are worth highlighting. First, treat the two sides asymmetrically: if service-level constraints for jobs are hard requirements, then one should use pessimistic (lower) confidence bounds for job-quality constraints and exploit optimism only on the worker side, where the planner is trading off welfare rather than feasibility. Second, limit policy churn: frequent changes in tie-breaking or mixing weights can be difficult to communicate and may create perceived unfairness. A simple remedy is to update slowly, e.g., via a mirror-descent or primal–dual averaging scheme on the share variables, which yields a gradually drifting mixture over a small set of interpretable regimes. These learning-oriented variants raise additional questions (regret relative to an omniscient frontier, sample complexity under strategic behavior, and the interaction between exploration and blocking incentives) that we view as promising directions, but they do not alter the central economic lesson of the model: once we commit to explicit worst-off guarantees on both sides, the planner’s problem remains a disciplined choice of a transparent compromise point on a convex (often polyhedral) feasibility frontier.
Our goal in this paper has been as pragmatic as it is
theoretical: when one side of a matching market is governed by hard
priorities while the other side expresses (possibly coarse) cardinal
utilities, the natural policy question is not ``which stable matching should we pick?’’ but rather ``what guarantees can we credibly offer to both sides, and how should we choose among the feasible compromises?’’ The language of benchmarked worst-off guarantees makes this question
operational. By benchmarking each worker against her best stable utility
U*(w) and
each job against its best stable quality Q*(a), we
normalize away scale and heterogeneity, and we can interpret a pair
(tW, tA)
as a clear promise: every worker receives at least a tW fraction of
what she could hope for within stability, and every job receives at
least a tA
fraction of its corresponding benchmark.
The first design lesson is that feasibility is best understood through a convex, fractional lens. Once we allow the planner to time-share across matchings,
the set of attainable expected outcomes becomes convex, and the
constraints that encode feasibility and ``no strong blocking incentives’’ can often be written linearly in the fractional allocation variables xwa. This yields a tractable description of an instance-level feasible set
$$
\mathcal{F}=\Bigl\{(t_W,t_A):\ \exists x\in\mathcal{P}\ \text{s.t.}\ u_w(x)\ge t_W U^*(w)\ \forall w,\ \ q_a(x)\ge t_A Q^*(a)\ \forall a\Bigr\},
$$
where 𝒫 incorporates matching constraints and the stability-style inequalities appropriate for the implementation notion (e.g., ε-internal stability). The practical implication is straightforward: rather than arguing abstractly about ``fairness versus priorities,’’ a designer can compute (or approximate) the set of
feasible guarantees, trace the Pareto boundary by scalarizations (fix
tA = τ
and maximize tW, or vice
versa), and present the trade-off to stakeholders in a common unit.
The second lesson is equally important: strong guarantees for both sides simultaneously can be out of reach, even with randomization and
even if we are generous about stability. In markets with one-sided ties,
we show a worst-case two-sided obstruction: there exist instances in
which any lottery over matchings must satisfy a product-type upper bound
of the form
$$
s_W(D)\cdot s_A(D)\ \le\ \frac{c}{N\log N},
$$
and, correspondingly, any attempt to ensure a worker guarantee on the
order of 1/log N forces the
job guarantee down to the order of 1/N (up to constants). This is not a
statement about a particular algorithm failing; it is a statement about
the geometry of the problem. Ties on one side create large stable sets
and make it possible to protect many workers only by cycling probability
mass through assignments that repeatedly place some jobs with
low-priority workers. When job quality drops steeply with rank, this
cycling shows up as an unavoidable deterioration in job-side guarantees.
The value of a worst-case result here is diagnostic: it tells the
planner when an aspirational policy target (e.g., ``everyone gets at
least half of their stable benchmark’’) is not a matter of better
engineering but is incompatible with the primitives of the market.
At the same time, the third lesson is constructive: the impossibility trade-off is tight. On the same hard families, there exist explicit lotteries supported on ex-post ε-internally stable matchings achieving sW(D) = Θ(1/log N) and sA(D) = Θ(1/N). This matters for design because it separates what is truly impossible from what is merely nontrivial. If a planner is willing to commit to a transparent, repeated time-sharing schedule (the natural interpretation of a lottery), then one can meet the best-possible asymptotic guarantees simultaneously. In other words, randomization is not a panacea, but it is an essential instrument for reaching the frontier: it converts a brittle selection problem among discrete matchings into a controlled policy choice among convex combinations.
The remaining question, and the one that most directly speaks to
practice, is how to select a point on the frontier in a way that is
intelligible and defensible. Our preferred recommendation is the
two-sided Nash (or weighted Nash) compromise, which solves
$$
\max_{x\in\mathcal{P},\ t_W,\,t_A>0}\ \log t_W+\log t_A,
$$
or, when policy dictates, the weighted variant max β log tW + γ log tA.
We emphasize two reasons. First, this choice is Pareto-efficient in the
space of guarantees: there is no alternative feasible policy that raises
both sides’ minimal shares. Second, it is ``scale-free’’ and symmetric
in the relevant sense: maximizing log tW + log tA
is equivalent to maximizing the product tWtA,
so the mechanism does not implicitly privilege one side simply because
its benchmarks are numerically larger or more dispersed. For
communication, this is valuable: the planner can describe the selected
policy as the one that maximizes the joint strength of the two
guarantees, rather than as an arbitrary weighted average of incomparable
welfare measures.
A further point of design guidance concerns explainability. Because 𝒫 can be written down explicitly, the planner
can explain which constraints are binding at the chosen compromise. When
the job-side guarantee is low in a given instance, the explanation is
not ``the algorithm chose to hurt jobs,’’ but rather ``given
priorities, capacity, and the requirement that no strongly profitable
blocking deviations exist among matched agents, there is not enough
probability mass to keep every job frequently matched to top-priority
workers while also protecting the worst-off workers relative to their
stable opportunities.’’ Conversely, if the worker-side guarantee is low,
it becomes visible whether this is due to exceptionally stringent
job-side service requirements, to steep quality curvature, or to a
deliberate choice of bargaining weights.
Implementation is the final step where theory must acknowledge limitations. A fractional solution x is only as useful as our ability to realize it as a lottery over matchings that satisfy the intended ex-post constraints. In settings where the relevant polytope is integral, one can appeal to standard decomposition arguments to obtain a small-support lottery that exactly implements x. In more general environments (notably with one-sided ties), the relaxation can be imperfect, and decomposition may require either numerical methods or a controlled rounding step that introduces additional, instance-dependent distortion in the achieved shares. Our guidance is therefore to treat implementation as part of the policy choice: if exact ex-post internal stability is a hard requirement, one should design 𝒫 to align as closely as possible with the convex hull of implementable matchings; if small violations are institutionally acceptable, an ε-stability budget can be used explicitly to enlarge feasibility and to make rounding errors negligible relative to the guarantees being targeted.
Stepping back, the central message is that two-sided fairness in priority-based markets is not well captured by selecting a single stable matching. It is better captured by (i) stating explicit, benchmarked worst-off guarantees for both sides, (ii) computing the feasible frontier of such guarantees under the stability and feasibility constraints the institution actually cares about, and (iii) selecting a compromise point by an objective—such as the (weighted) Nash product—that is both economically grounded and easy to explain. The model does not remove value judgments: bargaining weights, stability tolerances, and metric choices remain policy parameters. What it does provide is discipline. It tells us which ambitions are infeasible, which compromises are efficient, and how to implement and communicate a chosen policy as a transparent point on a well-defined trade-off frontier.