Introduction
In my last article I presented the fundamentals of the sandwich attack, whereby an exploiter can siphon value away from a naïve token swap by constructing a series of transactions during block creation that flank that of their victim. The process occurs in three steps: 1) front run the user’s trade, creating a modified liquidity pool state with an arbitrarily reduced-price quote with respect to the token the user is attempting to sell, 2) allow the user’s trade through at the reduced rate, and 3) back run the prior two steps, returning the liquidity pool to a comparatively normal state. It was demonstrated that the overall process is financially indistinguishable from an effective theft by the attacker, followed by a significant swap fee hike on the liquidity pool prior to the swap of what is left of the user’s tokens. Should one wish it, the formulae presented there, including an OptimumSandwich Python class, are fertile ground to continue with independent study. I concluded the article by challenging the reader to derive a mathematical description for an “un-sandwichable” trade. That is, to show algebraically that for any set of state variables and user inputs wherein a sandwich attack can be performed, there exists a similar set where the modification of at most one of those elements will render a sandwich attack impossible. The purpose of this article is to describe the “un-sandwichable” set.
The conventions used here are the same as those established previously; to differentiate the inputs and outputs belonging to the attacker and the user, I’ll continue to use the subscripts a and u, respectively. Additionally, x will consistently represent the token that is being sent to the liquidity pool by either the attacker or the user, while y will invariably represent the token that is being transferred from the liquidity pool to either the user or the attacker. The lowercase Greek letter δ represents the liquidity pool swap fee, and the uppercase form, Δ, denotes trade quantities. Assume x, y, δ, Δx, Δy, are always positive real numbers, with Δx and Δy being the quantities of tokens that are taken from- and added to the wallets of the user or attacker, respectively (implying they’re added to- and removed from the liquidity pool balances in a corresponding manner). The difference between the number of tokens received by the attacker from the back running trade, and the same token sent to the liquidity pool during the front running trade is denoted with the letter Q.
Breaking the Sandwich Vending Machine
The critical piece of information in the prior article is that the attacker’s profit, Q, is maximized at a precise front running trade quantity, Δxₐ, given the token reserve of the liquidity pool, x, its fee level, δ, and the quantity of tokens the user is attempting to swap, Δxᵤ. The optimal front running trade quantity, Δxₐ, can be expressed as one of the roots of a quartic polynomial. Whereas the prior focus was to determine the value of Δxₐ while treating the other variables as constants, the aim here is to determine the values of the other variables, x, δ, and Δxᵤ, when the attacker’s optimum front running trade is zero. In other words, what combination of pool state and user inputs causes the attacker to decide to do nothing, and leave the user’s transaction alone? Compared to this article’s predecessor, these solutions are markedly easier to find. Take the previously defined quartic and set the indeterminate, Δxₐ, to zero. This causes all but the constant term (i.e. the D coefficient) to be eliminated (eqn 1).
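For readers without the previous article to hand, the same boundary can be reached by a shorter route. What follows is a sketch rather than the quartic itself. It assumes the swap rule Δy = (1 − δ)·y·Δx/(x + Δx), with the fee deducted from the output token, which is the convention consistent with the quotes in the worked example later in this article. Under that assumption the attacker's profit and its slope at a zero front run are:

```latex
\begin{aligned}
Q(\Delta x_a) &= \frac{(1-\delta)^2\,\Delta x_a\,(x+\Delta x_a+\Delta x_u)^2}
                     {(x+\Delta x_a)^2+\delta\,x\,\Delta x_u+(1-\delta+\delta^2)\,\Delta x_a\,\Delta x_u}
                 \;-\; \Delta x_a \\[6pt]
\left.\frac{\partial Q}{\partial \Delta x_a}\right|_{\Delta x_a = 0}
              &= \frac{(1-\delta)^2\,(x+\Delta x_u)^2}{x\,(x+\delta\,\Delta x_u)} \;-\; 1 \;\le\; 0
\end{aligned}
```

When the slope at zero is non-positive, the first unit of front running already loses money, so the boundary of the un-sandwichable set is the equality (1 − δ)²(x + Δxᵤ)² = x(x + δΔxᵤ). This expression is quadratic in each of x, δ and Δxᵤ, which is in keeping with the remark that only a quadratic remains once the trivial roots are factored out.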
While the x term is still quartic, two of its roots are trivial (refer to the factorization below), and the other two are the solutions to a relatively benign quadratic. The Δxᵤ and δ terms are both cubic, but again, one root is trivial and the others follow from application of the quadratic formula. As before, I am only presenting the roots that are contextually relevant. That is, for the infimum δ: 0 < δ < 1, the supremum Δxᵤ: Δxᵤ > 0, and the infimum x: x > 0 where the attacker’s optimal front running trade is Δxₐ ≤ 0 (eqns 2–4). For the benefit of the majority of the readership, the meaning of these expressions can be understood as follows:
Given a constant product liquidity pool with a token reserve of x ETH, where the user has nominated to swap Δxᵤ ETH for the counterpart token, the minimum swap fee that nullifies all value of a sandwich attack is inf δ (eqn 2). Therefore, any δ ≥ inf δ will also make a sandwich attack impossible (figure 1).
Given a constant product liquidity pool with a token reserve of x ETH, and a swap fee of δ, the maximum amount of ETH the user can swap before exposing a sandwich attack opportunity is sup Δxᵤ (eqn 3). Therefore, any Δxᵤ ≤ sup Δxᵤ will also make a sandwich attack impossible (figure 2).
Given a constant product liquidity pool with a swap fee of δ, where the user has nominated to swap Δxᵤ ETH for the counterpart token, the minimum token reserve of ETH in the constant product liquidity pool required to nullify the sandwich attack is inf x (eqn 4). Therefore, any x ≥ inf x will also make a sandwich attack impossible (figure 3).
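The three bounds above can be sketched in a few lines of Python. The closed forms below are not copied from eqns 2–4; they are re-derived under the assumption that the fee is deducted from the output token, Δy = (1 − δ)·y·Δx/(x + Δx), which gives the boundary condition (1 − δ)²(x + Δxᵤ)² = x(x + δΔxᵤ). The function names are hypothetical and are not part of the OptimumSandwich class.

```python
import math


def inf_delta(x, dx_user):
    """Smallest fee for which the attacker's best front run is zero.

    Assumes the fee-on-output swap rule described in the lead-in.
    """
    r = x / dx_user
    disc = r * (2 * r + 1) * (2 * r * r + 5 * r + 4)
    return ((2 * r + 1) * (r + 2) - math.sqrt(disc)) / (2 * (r + 1) ** 2)


def sup_dx_user(x, fee):
    """Largest user trade that leaves no profitable sandwich at this fee."""
    f = 1.0 - fee
    b = 2 * f * f - fee
    return x * (math.sqrt(fee * fee + 4 * f ** 3) - b) / (2 * f * f)


def inf_x(dx_user, fee):
    """Smallest reserve of the sold token that protects a trade of dx_user."""
    f = 1.0 - fee
    b = 2 * f * f - fee
    return dx_user * (math.sqrt(fee * fee + 4 * f ** 3) + b) / (2 * fee * (2 - fee))


if __name__ == "__main__":
    # Values taken from the worked example later in the article.
    print(f"inf delta   (x=500, dx_u=20)     : {inf_delta(500.0, 20.0):.8f}")
    print(f"sup dx_user (x=500, fee=0.3%)    : {sup_dx_user(500.0, 0.003):.6f}")
    print(f"inf x       (dx_u=20, fee=0.3%)  : {inf_x(20.0, 0.003):.6f}")
```

With x = 500 and Δxᵤ = 20 the first function returns a fee of about 3.7736%, and with x = 500 and δ = 0.3% the second returns about 1.5068 ETH, matching the figures quoted in the worked example. The second and third functions are reciprocal views of the same root, so sup_dx_user(inf_x(u, δ), δ) returns u.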
Figure 1: Analysis of inf δ with respect to x and Δxᵤ. In the context of a constant product liquidity pool holding a token reserve of x ETH, when the user chooses to exchange Δxᵤ ETH for its counterpart token, the lowest swap fee that renders a sandwich attack valueless is represented by inf δ (eqn 2). Swap fees greater than or equal to inf δ safeguard against the possibility of a sandwich attack. The visualizations are a) a three-dimensional (3D) surface plot on the left, and b) a corresponding heatmap on the right.
Figure 2: Analysis of sup Δxᵤ in relation to x and δ. Given a constant product liquidity pool with a token reserve of x ETH and a swap fee of δ, the maximum ETH amount the user can exchange without risking a sandwich attack is represented by sup Δxᵤ (eqn 3). Any Δxᵤ less than or equal to sup Δxᵤ ensures immunity from the sandwich attack. The visualizations are a) a three-dimensional (3D) surface plot on the left, and b) a corresponding heatmap on the right.
Figure 3: Analysis of inf x in relation to δ and Δxᵤ. Within a constant product liquidity pool operating with a swap fee of δ, when the user opts to exchange Δxᵤ ETH for the associated counterpart token, the least token reserve of ETH required in the liquidity pool to neutralize the potential for a sandwich attack is depicted by inf x (eqn 4). Reserves of x greater than or equal to inf x ensure a sandwich attack is unfeasible. The inf x axis is presented using a log10 scale. The visualizations are a) a three-dimensional (3D) surface plot on the left, and b) a corresponding heatmap on the right.
Cursory examination of eqns 3 and 4 reveals an opportunity to reduce the x and Δxᵤ dimensions into a single variable, r = x/Δxᵤ. This is an intuitive simplification; it is not the absolute size of the user’s trade, but its size relative to the token reserve of the liquidity pool that matters (eqn 5).
The new minimum values that describe a sandwich attack-resistant trade, inf δ and inf r, can then be defined (eqns 6 and 7).
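As a reference point, the reduced forms can be written out explicitly. This is a reconstruction under the assumption that the fee is deducted from the output token, Δy = (1 − δ)·y·Δx/(x + Δx), as the quotes in the worked example suggest; the layout may differ from eqns 6 and 7 as originally typeset, and the first line is the boundary condition from which the other two follow.

```latex
\begin{aligned}
(1-\delta)^2\,(r+1)^2 &= r\,(r+\delta), \qquad r = \frac{x}{\Delta x_u} \\[6pt]
\inf \delta &= \frac{(2r+1)(r+2) \;-\; \sqrt{\,r\,(2r+1)\,(2r^2+5r+4)\,}}{2\,(r+1)^2} \\[6pt]
\inf r &= \frac{2\,(1-\delta)^2 \;-\; \delta \;+\; \sqrt{\,\delta^2 + 4\,(1-\delta)^3\,}}{2\,\delta\,(2-\delta)}
\end{aligned}
```

Both closed forms are the contextually relevant roots of the same quadratic, solved for δ and for r respectively. For δ = 0.003 the last expression gives r ≈ 331.83, which for a 500 ETH reserve corresponds to a maximum protected trade of about 1.5068 ETH.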
The interrogation of the plot of inf δ versus r challenges my intuition (figure 4). It is obvious that for x, Δxᵤ ∈ ℝ+, r → ∞ as x → ∞, r → 0 as Δxᵤ → ∞, and r → 1 as Δxᵤ → x. From a geometric perspective r = 1 is the “middle” of the range, as 0 and ∞ are in some sense equidistant from the limit at Δxᵤ → x. Nothing surprising so far. The limits of inf δ are also trivial: inf δ → 0 as r → ∞, inf δ → 1 as r → 0, and the natural “middle” of the range is δ = ½. I expected these midpoints to coincide with each other, but they do not. The r value that corresponds to δ = ½ is r = 1/√3, and the δ value that corresponds to r = 1 is δ = (9 − √33)/8. There is nothing apparently useful in this fact; I raise it only for curiosity’s sake. However, the analysis continues to bear fruit. The function that defines inf δ also exhibits asymptotically limiting behavior. The inf δ function is asymptotically equivalent to 2/(2r + 3) as r becomes arbitrarily large. It is also asymptotically equivalent to 1 − √r as r becomes arbitrarily close to but greater than 0 (eqns 8 and 9). An interactive plot is provided for the reader’s convenience via Desmos. The former (eqn 8) has more practical significance, as we seldom expect a user to be attempting a swap with a token quantity exceeding that of the entire reserve of the pool.
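The midpoints and limiting behavior described above are easy to check numerically. The sketch below assumes the closed form for inf δ that follows from a fee deducted from the output token, (1 − δ)²(r + 1)² = r(r + δ); the function name is hypothetical.

```python
import math


def inf_delta_r(r):
    """inf delta as a function of r = x / dx_user (assumed fee-on-output model)."""
    disc = r * (2 * r + 1) * (2 * r * r + 5 * r + 4)
    return ((2 * r + 1) * (r + 2) - math.sqrt(disc)) / (2 * (r + 1) ** 2)


if __name__ == "__main__":
    # The two "midpoints" that fail to coincide.
    print("r = 1       ->", inf_delta_r(1.0), "vs", (9 - math.sqrt(33)) / 8)
    print("r = 1/sqrt3 ->", inf_delta_r(1 / math.sqrt(3)))

    # Large r: compare against 2 / (2r + 3).
    for r in (10.0, 100.0, 1000.0):
        print(r, inf_delta_r(r), 2 / (2 * r + 3))

    # Small r: compare against 1 - sqrt(r).
    for r in (1e-2, 1e-4, 1e-6):
        print(r, inf_delta_r(r), 1 - math.sqrt(r))
```

Comparing the two columns in each loop shows which approximation is appropriate in which regime: the rational form tracks inf δ closely once r exceeds a few tens, while the square-root form is the relevant one only for trades far larger than the reserve.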
Figure 4: Analysis of inf δ with respect to r, where r = x/Δxᵤ. In the context of a constant product liquidity pool holding a token reserve of x ETH, when the user chooses to exchange Δxᵤ ETH for its counterpart token, and where the quotient of x and Δxᵤ is denoted as r, the lowest swap fee that renders a sandwich attack valueless is represented by inf δ (eqn 6). Swap fees greater than or equal to inf δ safeguard against the possibility of a sandwich attack. The visualizations are log-scaled plots highlighting the relationship between r and inf δ and showcasing a) key intersection points corresponding to the heuristic midpoints of the function domain, and b) asymptotic approximation of inf δ.
Synthesizing an Inedible Sandwich
For the sake of consistency, this demonstration will re-use the scenarios introduced in the “Expected Behavior” and “A Delicious Sandwich for One” sections from the preceding article, “The Optimum Sandwich: How to Exploit Blockchain Enthusiasts with Arbitrary Precision”. Assume a liquidity pool exists with 500 ETH (x) and 1,000,000 USDC (y), representing a combined total value of approximately $2M USD, from which a market price of ETH near $2,000 can be inferred. Additionally, assume the pool fee level, δ, is fixed at 0.003 (i.e. 0.3%, or 30 basis points) in the standard case.
The intent of this section is to navigate through the revelations arising from the prior discourse. I must underscore, though, that our journey here is largely scholastic. My primary objective is to enhance the reader’s grasp of the principles elaborated above, as well as those of the prior article. Existing methods, such as the minReturn criterion, which present reasonably adept solutions against sandwich attacks, should not be overlooked. I promise a closer examination of the minReturn criterion in an upcoming piece. For now, let the scope of this section be limited to the concepts covered thus far. Humor me.
First, consider the case where the user elects to swap 20 ETH for USDC:
The user observes a pool with 500.000000 ETH tokens, x, and 1000000.000000 USDC tokens, y, and a swap fee of 0.300000%, δ.
The user elects to swap 20.000000 ETH tokens, Δxᵤ, and expects to receive 38346.153846 USDC tokens, Δyᵤ.
First, the attacker front runs the user’s trade by swapping 681.367696 ETH tokens, Δxₐ, for 575031.461640 USDC tokens, Δyₐ.
Then, the user’s trade is allowed through; the user swaps 20.000000 ETH tokens, Δxᵤ, for 7053.521318 USDC tokens, Δyᵤ.
Finally, the attacker back runs both of the previous trades by swapping 575031.461640 USDC tokens, Δxₐ, for 693.644385 ETH tokens, Δyₐ.
Therefore, the attacker has extracted a total of 12.276689 ETH tokens, Q, from the user’s transaction.
The overall process is equivalent to the user giving away 12.276689 ETH tokens, Q, to the attacker, then swapping the remaining 7.723311 ETH tokens, Δxᵤ, with the pool.
In addition to the sacrificed ETH token quantity, Q, the pool fee also appears to be increased from 0.300000% to 53.630805%, δ* (i.e. 17776.934872% increase).
At the end of the process, the liquidity pool contains 507.723311 ETH tokens, x, and 992946.478682 USDC tokens, y.
The user’s losses are -81.605662% with respect to the expected outcome.
The maximum unattackable trade at a 0.300000% fee level is to swap 1.506781 ETH tokens, sup Δxᵤ, to receive 2995.493230 USDC tokens, Δyᵤ.
Alternatively, if the fee level was changed to 3.773612%, inf δ, the user could have swapped all 20.000000 ETH tokens, Δxᵤ, for 37010.149326 USDC tokens, Δyᵤ, with no risk of attack.
The adjusted fee level translates to a mere -3.484064% difference compared to the naive swap, and a +424.704579% difference compared to the attacked transaction.
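The bulleted figures above can be replayed step by step. The sketch below is not the OptimumSandwich class; it is a minimal stand-in that assumes the fee is deducted from the output token, Δy = (1 − δ)·y·Δx/(x + Δx), and it takes the attacker's front run of 681.367696 ETH from the text rather than solving for it. The function names and dictionary keys are hypothetical.

```python
def swap_out(reserve_in, reserve_out, amount_in, fee):
    """Constant product quote with the fee taken from the output (assumed convention)."""
    return (1.0 - fee) * reserve_out * amount_in / (reserve_in + amount_in)


def replay_sandwich(x, y, fee, dx_user, dx_front):
    """Replay front run, user trade and back run; return the headline figures."""
    expected = swap_out(x, y, dx_user, fee)            # quote the user saw

    dy_front = swap_out(x, y, dx_front, fee)           # 1) attacker front runs
    x1, y1 = x + dx_front, y - dy_front

    dy_user = swap_out(x1, y1, dx_user, fee)           # 2) user trade goes through
    x2, y2 = x1 + dx_user, y1 - dy_user

    dx_back = swap_out(y2, x2, dy_front, fee)          # 3) attacker back runs
    q = dx_back - dx_front                             # attacker's profit in ETH

    # Equivalent view: the user hands Q to the attacker, then swaps the
    # remainder against the untouched pool at an inflated fee, fee_star.
    feeless = swap_out(x, y, dx_user - q, 0.0)
    fee_star = 1.0 - dy_user / feeless

    return {
        "expected": expected,
        "dy_front": dy_front,
        "dy_user": dy_user,
        "dx_back": dx_back,
        "q": q,
        "fee_star": fee_star,
        "loss_pct": 100.0 * (dy_user / expected - 1.0),
        "pool_x": x2 - dx_back,
        "pool_y": y2 + dy_front,
    }


if __name__ == "__main__":
    result = replay_sandwich(500.0, 1_000_000.0, 0.003, 20.0, 681.367696)
    for key, value in result.items():
        print(f"{key:>9}: {value:.6f}")
```

Because the front run is supplied to six decimal places, the replayed values agree with the bullets only to within rounding. The effective fee is computed from the equivalent view given in the bullets, in which the user surrenders Q and swaps the remaining 7.723311 ETH against the untouched pool.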
Even as the author of this analysis, I find myself continually astonished by the apparent falsidical paradox revealed by performing it. The illustration above shows that for a pool fee setting of 0.3%, the cost of executing a sandwich attack is sufficiently low to allow the exploiter to reduce the initial transaction value by an abhorrent 81% margin. However, increasing the pool fee from 0.300% to 3.774% (inf δ evaluated for x = 500, Δxᵤ = 20) makes the transaction worthless to the would-be exploiter, and the transaction value is improved for the user by 424.705% compared to its exploited alternative. The difference is colossal, beyond anything I would suspect one might be able to arrive at by guessing alone. This serves as an [unwelcome?] reminder that although heuristics and intuition play a significant part in developing a solid theoretical framework, there comes a point where the need to carry out a concrete calculation becomes unavoidable. The data above are tabulated in figure 5.
Figure 5: Tabulated results of the sandwich attack illustration. a) Details of the sandwich attack and b) the adjusted swap quantity or pool fee setting that would have prevented it from occurring.
The upper limit of a non-attackable token swap, sup Δxᵤ, might initially seem mundane. Both equation 3 and figure 2 provide a comprehensive understanding of how variables x and δ affect its value, and the subsequent financial implications appear straightforward. However, there’s more under the surface that warrants a closer look.
The variable fee technique, which uses a dynamic inf δ calculation, presents a puzzle. The above example, in light of an almost infinite range of user inputs, is incomplete. Referring to the fee curve in figure 4, notice that when r values are minuscule (meaning Δxᵤ greatly exceeds the liquidity pool’s token reserve, x) the pool fee nears 100%. As r approaches zero, both the inf δ function and its asymptotic approximation tend to unity, as shown in equations 6 and 9.
Acknowledging these traits, one might argue that this mechanism can’t maintain user value for exceptionally large swap sizes, possibly even for just moderately large ones. When inf δ values verge on 100% for substantial swaps, it’s logical to deduce that tokens transferred from the liquidity pool to the user would dwindle to almost nothing. There seems to be a paradox: as Δxᵤ tends toward infinity, Δyᵤ gravitates towards zero. Yet, the same is true when Δxᵤ is virtually non-existent. Put simply, swapping an endless amount of ETH into the pool yields the same result as swapping almost nothing — virtually no return. However, as already demonstrated, there are specific Δxᵤ values that produce very reasonable outputs for Δyᵤ, suggesting there exists a certain amount of ETH that maximizes the USDC return under inf δ’s effect. This can be proven algebraically (eqns. 10–13).
At first glance, the overarching swap function (eqn 10) appears more complex than what we’re traditionally accustomed to. Thankfully, the intricacies of implementation are irrelevant, given the exploratory nature of this exercise. The somewhat daunting partial derivative (eqn 11) simplifies when evaluated at Δxᵤ = 0 to the familiar [feeless] constant product AMM marginal price formula (eqn 12), as expected. The derivative also has an easily identifiable root at Δxᵤ = 2x. This marks the precise quantity of ETH one can swap to maximize the USDC received from the pool (i.e. the point at which an additional, infinitesimal amount of ETH fails to yield any additional USDC to the swapper).
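The location of the peak can be checked without any calculus. The sketch below evaluates the user's output when the pool charges inf δ for each trade size and scans an integer grid of Δxᵤ values for the 500 ETH / 1,000,000 USDC pool. It assumes the fee is deducted from the output token and uses the closed form for inf δ that follows from that assumption; the function names are hypothetical.

```python
import math


def inf_delta(x, dx_user):
    """Minimum sandwich-resistant fee (assumed fee-on-output model)."""
    r = x / dx_user
    disc = r * (2 * r + 1) * (2 * r * r + 5 * r + 4)
    return ((2 * r + 1) * (r + 2) - math.sqrt(disc)) / (2 * (r + 1) ** 2)


def dy_protected(x, y, dx_user):
    """User output when the pool charges exactly inf delta for this trade."""
    return (1.0 - inf_delta(x, dx_user)) * y * dx_user / (x + dx_user)


x, y = 500.0, 1_000_000.0

# Scan whole-ETH trade sizes from 1 to 5,000 ETH and keep the best one.
best = max(range(1, 5001), key=lambda dx: dy_protected(x, y, float(dx)))
print("best dx_user on the grid:", best)
print("USDC received at the peak:", dy_protected(x, y, float(best)))

# Marginal price for a vanishingly small trade versus the feeless price y / x.
h = 1e-3
print("marginal price near zero:", dy_protected(x, y, h) / h, "vs", y / x)
```

On this grid the maximum falls at 1,000 ETH, that is, at Δxᵤ = 2x. Under the stated assumption the output at the peak works out to y(2√7 − 1)/13.5, roughly 31.8% of the USDC reserve, and the marginal price for a vanishingly small trade approaches the feeless price y/x.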
The prior deduction primarily highlights the imprudence of swapping more than double the pool’s ETH reserve in a single move. However, its dynamics in relation to the sandwich attack have yet to be addressed. It’s vital to note that this method ensures an unprofitable venture for any would-be exploiter attempting a sandwich attack, but it doesn’t promise a superior amount of USDC for the user compared to if they had been sandwiched instead. The question, then, is whether there exists a point where preventing a sandwich attack is more costly than simply allowing one to happen.
To delve deeper and truly gauge the breadth of this method, I subjected it to r values as low as 0.001, translating to trade volumes up to 500 times the aggregate value of both token reserves in the pool. Imagine, for the sake of our prior example, executing a staggering $1 billion ETH trade into a pool whose contents are merely $2 million, divided evenly between ETH and USDC. This is the most extreme case presently under examination (figure 6).
Figure 6: Analysis of Δyᵤ with respect to Δxᵤ employing the minimum sandwich attack-resistant pool fee, inf δ. When a user opts to exchange Δxᵤ ETH for its counterpart token, the potential value obtained by the user, Δyᵤ, is represented under normal conditions when no sandwich attack is performed (white trace), when the optimal sandwich attack is performed at the 0.3% fee level (red trace), and when the minimum sandwich attack-resistant pool fee, inf δ, is employed (blue trace). All three conditions assume a liquidity pool reserve balance of 500 ETH. The visualizations are a) log-log-scaled, up to and including Δxᵤ inputs 1,000× the liquidity pool reserve balance of ETH (i.e. 500,000 ETH), and b) linear in both dimensions, up to and including Δxᵤ inputs equal to 1× the liquidity pool reserve balance of ETH (i.e. 500 ETH). The local maxima of the inf δ method (blue trace) and the sandwich-attacked trade (red trace), and the intersection point of these two curves, are depicted with broken lines and labelled according to their x- and y-coordinates (Δxᵤ and Δyᵤ, respectively).
It is crucial to first acknowledge that employing inf δ in place of δ, with the aim of averting a sandwich attack, enhances the user’s value retention at an accelerating rate. This happens with an unexpected persistence as Δxᵤ values ascend, but only up to a point. The peak at Δxᵤ = 2x, deduced above (eqn 13), stands out clearly. A similar peak can be discerned for the sandwiched transaction curve, a feature not discussed previously. But the crux of our observation lies in the intersection of the blue and red traces. At this juncture, falling prey to a sandwich at a 0.3% fee matches the outcome of an anticipatory fee surge. For any swap exceeding this threshold, the irony is palpable: succumbing to a sandwich attack becomes more economical than its prevention.
Unfortunately, if it is possible to describe algebraically the local maximum of the red trace, or its intersection point with the blue trace, it is beyond my abilities at present. I suspect it may be impossible, but I can’t be sure without committing more time to this question than it is worth. I’ll offer a 100 USDC bounty to the first person that can provide a purely symbolic solution to this problem, or proof that one can’t exist. The red trace maximum (Δxᵤ = 49,518.49922993397 ETH, Δyᵤ = 57,986.60716050453 USDC) and the red-blue trace intersection (Δxᵤ = 181,608.08402209895 ETH, Δyᵤ = 50,907.34540591974 USDC) were determined numerically.
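For readers who wish to explore these points themselves, one possible numerical approach is sketched below. It is not the procedure used to produce the values quoted above. It finds the attacker's best front run for a given Δxᵤ with a coarse scan followed by a golden-section refinement, assuming the fee is deducted from the output token. The function names, the default search bracket of ten times (x + Δxᵤ), and the grid size are all assumptions.

```python
import math


def swap_out(reserve_in, reserve_out, amount_in, fee):
    """Constant product quote with the fee taken from the output (assumed)."""
    return (1.0 - fee) * reserve_out * amount_in / (reserve_in + amount_in)


def sandwich(x, y, fee, dx_user, dx_front):
    """Return (attacker profit Q, user output) for a given front run size."""
    dy_front = swap_out(x, y, dx_front, fee)
    x1, y1 = x + dx_front, y - dy_front
    dy_user = swap_out(x1, y1, dx_user, fee)
    x2, y2 = x1 + dx_user, y1 - dy_user
    q = swap_out(y2, x2, dy_front, fee) - dx_front
    return q, dy_user


def best_front_run(x, y, fee, dx_user, hi=None, n=4000, iters=120):
    """Front run size that maximizes Q on [0, hi] (coarse scan + golden section)."""
    hi = 10.0 * (x + dx_user) if hi is None else hi

    def profit(a):
        return sandwich(x, y, fee, dx_user, a)[0]

    # Coarse scan to bracket the best grid point.
    step = hi / n
    i = max(range(n + 1), key=lambda k: profit(k * step))
    a, b = max(0.0, (i - 1) * step), min(hi, (i + 1) * step)

    # Golden-section refinement inside the bracket.
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
        if profit(c) > profit(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)


if __name__ == "__main__":
    a_opt = best_front_run(500.0, 1_000_000.0, 0.003, 20.0)
    q, dy_user = sandwich(500.0, 1_000_000.0, 0.003, 20.0, a_opt)
    print(f"front run {a_opt:.4f} ETH, Q = {q:.6f} ETH, user receives {dy_user:.4f} USDC")
```

For the 20 ETH example this recovers the front run of roughly 681.37 ETH and the profit of roughly 12.2767 ETH listed in the bullets. The red trace is the user's output at the best front run for each Δxᵤ, so its maximum and its crossing with the blue trace could in principle be located by wrapping this routine in a one-dimensional search over Δxᵤ; very large trades may need a wider bracket than the default.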
This analysis considers only the situation where both the user and the attacker observe the same pool fee level, either the de facto 0.3% in the case of emulating a sandwiched trade, or whatever the calculated inf δ value is, appropriate to nullify the attack. While the variable fee technique used here is referred to as “dynamic”, it should be stressed that this refers to the model itself, and not a hypothetical CFMM design. While the dynamics of an “on-the-fly” fee calculation, and especially its impact on sandwich MEV is interesting enough to warrant further investigation, that deep dive remains outside the purview of the present discussion.
Conclusion
The theory presented here is motivated by a desire for a more robust analytical foundation for describing sandwich attacks in relation to arbitrary fee levels, liquidity depth, and user transaction sizes. Nothing presented here should necessarily be conflated with the fundamentals of CFMM design. As noted above, the minReturn criterion is a perfectly serviceable answer to the challenge of sandwich attack mitigation and will be the focus of a future analysis. Instead, let the distribution of value between the pool’s liquidity providers, its swapper, and his adversary be the subject of your attention. If nothing else, these models cast new light on an old problem, and illuminate the previously uncharacterized, bounded nature of the industry’s most popular exploit.
Postscript
This piece emerges alongside Stefan Loesch’s recent exploration into how Carbon, Bancor’s trading protocol, stands up to sandwich attacks. Loesch delves into the practicality of these attacks, and comments specifically on their practicability under different fee structures. His insights corroborate the findings presented in the present analysis.
Updated Code Block: OptimumSandwich Class (python)
The OptimumSandwich class (below) is updated from that in the prior publication, and outputs the bulleted text from the previous section, and the tabulated data in figure 5, to a text file.