Introduction
I often have opportunity to discuss various aspects of our industry with a studious and enthusiastic audience. Anecdotally, sandwich attacks are among the more frequently requested discussion topics. I suspect the interest in this specific attack vector is owed to its ominous and threatening nature. Indeed, sandwich attacks are one of the more credible fears shared by the DeFi community; however, it is remarkable how few of its members can articulate exactly what a sandwich attack is, let alone derive the equations that describe how one is performed. I am forced to concede that I must shoulder at least part of the blame for the disappointing levels of erudition exhibited by the typical DeFi pedestrian. Experience demonstrates that I have failed to guide an otherwise capable and willing audience to the understanding they have politely asked for, and rightfully deserve. My resolve is to expose here the full itinerary for performing a sandwich attack, and the mathematical methods that allow for the industry’s top exploiters to wring dry every available wei of value (quite literally) from vulnerable trades. Under any other circumstance there may be a reasonable question of ethics with respect to publishing information that is so easily weaponizable. Unfortunately for us, those who would seek to weaponize it, have already, and so I see no adverse consequences to emptying the black boxes of our adversaries and exploring their contents together.
Expected Behavior
At the risk of sounding naïve, I want to begin by examining the prototypical case where a user trades against a generic constant product liquidity pool and expects to be delivered the tokens they surmised from the algorithms employed in the smart contracts with which they are interacting. That is, the result of the familiar swap function (eqn 1). To distinguish between inputs and outputs associated with the attacker and the user, I will use the subscripts a and u, respectively. Further, let x always denote the token that is being sent to the liquidity pool from either the attacker or the user, and let y always denote the token that is being sent from the liquidity pool to either the attacker or the user. I will also observe the old convention of denoting the liquidity pool swap fee with the lowercase Greek letter δ, and trade amounts with the uppercase form, Δ. Assume x, y, δ, Δx, Δy, are always positive real numbers, where Δx and Δy are quantities of tokens subtracted from- and added to the user or attacker wallets, respectively (and by inference, added to- and subtracted from the liquidity pool balances, respectively).
Assume a liquidity pool exists with 500 ETH and 1,000,000 USDC, representing a combined total value of approximately $2M USD, from which a market price of ETH near $2,000 can be inferred. For convenience, assume the pool fee level, δ, is fixed at 0.003 (i.e. 0.3%, or 30 basis points). Then, assume that a user is motivated to swap 20 ETH (i.e. ~$40K nominal value) with the pool. The expected result after the effect of price slippage and after accounting for the trading fee is summarized in figure 1.
Figure 1: The (naïve) expected result. 20 ETH is exchanged for 38,346.1538 USDC from a constant product liquidity pool comprising 500 ETH and 1,000,000 USDC, and near $2M USD total liquidity.
What is depicted above is the desired outcome, if we allow ourselves the dubious assumption that the user is familiar with the basics of constant-function market making (CFMM).
A Delicious Sandwich for One
Armed with only a modest working knowledge of the CFMM algorithm and the requisite skills for its algebraic manipulation, and perhaps a reckless disregard for the wellbeing of the user, it is possible to intervene in the expected process and siphon significant value away from the transaction and into your own wallet. Before I provide the instructions for performing such an attack under arbitrary conditions, I first want to demonstrate the attack using the expected behavior depicted in figure 1 as an example specimen.
Step 1: Front Run the Trade
After observing the user’s transaction in the mempool, the attacker can very quickly calculate a precise number of tokens to trade ahead of the user’s expected swap to maximize the effectiveness of the exploit. For now, take me at my word that the most efficient exploit is to front run the user’s 20 ETH trade with a swap in the same direction for 575,031.4616 USDC (figure 2). To the uninitiated, this may seem like a rather impulsive act. It has often been my experience that newcomers to DeFi, and especially those without prior knowledge in CeFi, are under the misapprehension that front running is akin to being beaten to an attractive price by another participant who is equally, if not more interested in the purchase of the same token as they are. Let me put that fantasy to rest. Save for a vanishing minority of cases, the person front running you has absolutely no interest in the token you are attempting to purchase.
Figure 2: Step 1 — Front run the trade. The attacker performs a large trade in the same direction as the user and makes certain that their swap is ahead of the user’s in the transaction sequence; 681.3677 ETH is swapped by the attacker for 575,031.4616 USDC.
Step 2: Allow the User’s Swap to Proceed on the Rigged Liquidity Pool
With the prices inside the liquidity pool now appropriately adjusted to set the stage for the exploit, it is now critical that the user’s swap transaction is executed immediately following the front run (figure 3). This would be a risky endeavor, were it not for the fact that these exploits are typically performed during block creation. That is, the person who is determining the order of transactions within the block is in a privileged position to decide the fate of the attempted exploit; if the exploit is their own, they can guarantee its success. If there is a margin for chance, it is sufficiently small as to be negligible.
Figure 3: Step 2 — Allow the user’s trade through. The attacker forces the user’s trade immediately after the front running transaction. 20 ETH is exchanged for 7,053.5213 USDC by the user; previously, in the unexploited example, the user received 38,346.1538 USDC for the same trade on the same liquidity pool (therefore suffering an 81.6% decrease).
Step 3: Back Run the Trade
In the third and final step, the attacker reverses their initial front run, by swapping all the USDC they had purchased only moments ago, back for ETH.
Figure 4: Step 3 — Back run the trade. The attacker forces their own back running transaction immediately after the user’s trade has been executed. The 575,031.4616 USDC acquired in the front running step is swapped back in its entirety for 693.6444 ETH, netting the attacker 12.2767 ETH in the round trip.
The important thing to note here is that the attacker has risked essentially nothing, and provided nothing of value to the system, save maybe for their involvement in “securing the blockchain” as part of the block building process. If this latter argument is to be entertained, I doubt many would hesitate to characterize the relationship between those participating in consensus and those inhabiting the chain for its own sake as abusive.
The Overall Result
The steps described above detail the mechanism of the attack (figures 2–4), and for many of you there will be very little offered in the way of new information thus far. However, I am hopeful that the following perspective regarding the overall process is sufficiently novel to earn your consideration, regardless of your prior familiarity with the subject.
Imagine for a moment that we become so exhausted and complacent as a community that we simply accept that these kinds of attack vectors are insurmountable — or worse, serve the purpose of a built-in tax vehicle in addition to gas costs demanded by those governing the infrastructure. Further, imagine that it is realized that there are a lot of unnecessary steps involved in these sandwich attacks, and that it would be preferable to arrive at the same result without the superfluous ritual of performing it. I am being deliberately hyperbolic, but I promise it doesn’t matter. The point is to consider the reductionist perspective.
A user decides to trade ETH for USDC on a liquidity pool with a fee rate of 0.3% and nearly $2M in liquidity.
They are expecting to send 20 ETH to the liquidity pool and receive 38,346.1538 USDC, and then commit to the transaction.
A stranger arrives and takes 12.2767 ETH without warning and for no reason.
The remaining 7.7233 ETH are sent to the liquidity pool to perform the swap.
The liquidity pool then immediately changes its fee rate to 53.63% (i.e. a 17,776.9349% increase).
The swap is performed, and the user receives 7,053.5213 USDC (in contrast to the 38,346.1538 USDC they were expecting).
Finally, the liquidity pool then immediately reverts its fee level to that prior to the user’s exchange (i.e. back to 0.3%).
The steps above are completely redundant with the steps of performing a sandwich attack. The two have different mechanisms but arrive at precisely the same result; they are financially and, to some extent, computationally identical. While I would prefer everyone with an interest in the topic have a firm basis of understanding in the true mechanism, I’ll settle for a heuristic one so long as the facts regarding the precise transfer of value, and to whom, are consistent.
Figure 5: The overall result. The user sends 20 ETH, which is intercepted by the attacker, who removes 12.2767 ETH from the transaction, leaving only 7.7233 ETH for the trade. Then, the liquidity pool changes its fee level from 0.3% to 53.63%, before processing the exchange. The user then receives 7,053.5213 USDC according to the familiar swap algorithm (eqn 1) at the new fee level.
Precise Sandwich Elucidation
The sandwich attack detailed above is optimal for the state as it is presented, save for the lack of decimal places. That is, the 681.3677 ETH (in fact, 681.367696126344533020 ETH) used in the front running transaction is not an approximation, and the 12.2767 ETH (in fact, 12.276688934466993032 ETH) extracted by the attacker is the absolute maximum achievable.
Let Q be the total amount of ETH extracted by the attacker before gas is considered. It is a relatively easy task to nest the three swap equations together to calculate Q in terms of the swap fee, δ, the two token reserves of the liquidity pool, x and y, the number of ETH tokens sent to the liquidity pool by the user, Δxᵤ, and the amount of ETH tokens sent to the liquidity pool by the attacker during the front run, Δxₐ (eqn 2). However, the result is a little ugly.
Reducing eqn 2 and cancelling the y terms yields eqn 3.
The plot of Q versus Δxₐ reveals the attack profits profile with respect to the number of tokens committed during the front running step. At first, visual inspection of the graph is seemingly unrevealing; however, depicting the Δxₐ dimension on a logarithmic scale helps to exaggerate the feature of interest (figure 6). An interactive plot is provided for the reader’s convenience via desmos.
Figure 6: Q versus Δxₐ. Both plots are obtained from eqn 3; a) depicted on linear scales and b) the Δxₐ dimension on a logarithmic scale. The peak depicted in b) denotes the optimal Δxₐ token quantity to trade during the front running component of the attack.
The local maxima around the peak featured in figure 5 b) can be calculated from the partial derivative of eqn 3 with respect to Δxₐ, evaluated at 0 (eqn 4). The coefficients A, B, C, D, are substituting for the other terms (x, δ, and Δxᵤ), which are treated as constant for the purpose of computing the front running transaction (eqns 5–8).
Note that each of the coefficients A-D have the same denominator, 1-(1-δ)², which is an artefact of dividing through by the leading coefficient to force reduction to the monic equation type (i.e. leading coefficient = 1). Fortunately, eqn 4 is quartic, so solving for Δxₐ is not impossible, albeit inconvenient. For all realistic values (e.g. swap fee, token balances, swap amounts are greater than zero) eqn 4 has two real, and two complex roots. Only the root at Δxₐ > 0 (i.e. the part needed to perform the exploit), if one exists, is presented here. Note that depending on the fee level, δ, liquidity depth, x, and the user’s trade amount, Δxᵤ, some sandwich attacks are simply not profitable for any front running trade quantity, Δxₐ.
As is characteristic of quartics, the solution is best broken into smaller pieces for readability and clarity of presentation. The solution is thus broken up into ten constituent parts, represented by the Greek letters gamma (γ), nu (ν), omega (ω), phi (φ), tau (τ), mu (µ), eta (η), rho (ρ), beta (β) and alpha (α) (eqns 9–18). These labels have no standard meaning — they are a vestige of the method I used to solve and simplify the quartic.
The optimum front running transaction, which determines the total returns of the sandwich attack, Q, can be calculated by composing eqn 19 with the quartic intermediates γ, ν, ω, φ, τ, µ, η, ρ, β and α (eqns 9–18), and the coefficients A, B, C, and D (eqns 5–8):
The method described here is generic. For example, implementing the formulae above for Δxᵤ = 12.345678910, x = 123.45678910, and δ = 0.012345678910 yields an optimal front running trade of Δxₐ = 121.983219009865770262 and net profits of Q= 6.075819297992183389 tokens. In general, the larger the user’s intended trade compared to the depth of the liquidity pool, and the lower the swap fee, the more profitable the exploit. At δ = 0 the exploit becomes asymptotic at Q = Δxᵤ as Δxₐ → ∞, as expected. Correcting for gas consumption is trivial, as the modelled gas cost can simply be subtracted from Q to complete the analysis from the attacker’s perspective.
Calculating the user’s tokens obtained from the pool following the attack, Δyᵤ, is relatively straight forward (eqn 20).
The “effective” fee level, δ*, which is the apparent fee charged on the swap amount after accounting for the attacker’s rake, Δxᵤ-Q, is similarly trivial to calculate (eqn 21).
I’ll save you the hassle of transcribing these equations if you promise not to use them for evil. The OptimumSandwich
class (below) computes the optimum front running trade, Δxₐ, and prints an analysis sufficient to recreate figures 1–5. I consider the lack of documentation, tests, and error handling in the code block below to be appropriately compensated by the context and commentary provided by this article. Come at me (jk).
Conclusion
The purpose of this article is to provide an authoritative overview of sandwich attacks, and to service a somewhat selfish need to have a convenient flag in the literature to which I can refer in future discussions. However, this is only the most fundamental case; there are more complex methods that are outside of the scope of this article. For example, sophisticated exploits where the attacker controls the majority of the lp token supply belonging to the liquidity pool wherein the attack is performed, deliberate “baiting” by creating an attractive price compared to competing liquidity pools, and simultaneous sandwich & market arbitrage tactics are commonplace, but not explicitly discussed here. Regardless, the material provided is complete with respect to the scope of the discussion, and ought to satisfy the needs of almost any informed conversation on the topic.
I want to end this article with a challenge to the reader — both in the spirit of finding a silver lining, while also appealing to my own pedagogical ambitions. Given the mathematical descriptions presented here, it is possible to derive an expression for an “un-sandwichable” trade. That is, given the state of the pool (x, δ), there is a maximum quantity of tokens being swapped by the user, Δxᵤ, for which any attempted sandwich attack will net zero in profit for the exploiter (i.e. Q = 0 and by inference, Δxₐ = 0). It stands to reason that such a trade amount can and should be exposed to users via the front-end for DeFi applications running a typical constant product CFMM at the smart contract level. If you have made it this far — the challenge I issue to you is this:
For a generic liquidity pool with a reserve balance of x ETH and a swap fee of δ, what quantity of ETH, Δxᵤ, can the user swap without exposing a sandwich attack opportunity?
For a generic liquidity pool with a reserve balance of x ETH, where the user is determined to trade a certain quantity of ETH, Δxᵤ, what swap fee setting, δ, would nullify the sandwich attack opportunity?
Postscript
This article is adjacent a recent contribution from Stefan Loesch regarding the resistance of Carbon, a trading protocol from Bancor, to sandwich attacks. Loesch’s article discusses the viability of sandwich attacks in the context of implied fee levels in the concentrated liquidity paradigm and speaks directly to the concepts embedded in the challenge questions above.