Swimming Safely In The Public Mempool: MEV Smart Contract Obfuscation Techniques
In blockchain, generalized frontrunners in MEV abuse the well-known standards of smart contracts, so obfuscation techniques are essential for MEV operators. Over the past year I've created some zero-day techniques for avoiding these sharks in the mempool of Ethereum and all other existing blockchains in web3. Whether you're a semi-vet like DeGatchi or are just starting out in the MEV world of ETH, you can deploy these obfuscation techniques on day 1 to potentially gain an edge on your competition. Godspeed, anon!
Disclaimer
The information provided in this article is intended solely for educational purposes and should not be used for illegal activities. The focus is on enhancing awareness and understanding of smart contract security and best practices in blockchain technology. This article does not endorse or encourage using these techniques for exploiting vulnerabilities in real-world smart contract applications. The author and publisher of this article disclaim any liability for the misuse of the information contained herein and any damages that may arise from such misuse.
Preface
I originally made an article on smart contract obfuscation techniques and how generalized frontrunners (GFs) work. This is the second edition, combining the knowledge from smart contract obfuscation and GFs (I'm a single virgin dev, btw) to create a (not anymore) zero-day technique for avoiding those pesky buggers in public mempools so we can actually execute our transactions.
This article will provide actual code examples, in opcodes, so you can both learn these techniques in their original form and also learn how someone like me likes to spend their day writing smart contracts — improving your understanding of low-level programming alongside attempting to upskill you into the next level of smart contract development :)
So, how do you avoid GFs (in my case, how do I get one…)?
There are 2 options: you use private transactions (if you’re on Ethereum), or you come up with some tactics to make your transactions unfrontrunable. We will be focusing on the latter.
Unfrontrunable Obfuscation
During 2023 I spent the entire year writing an exploit suite for smart contracts and did bytecode smart contract development for around 5 months straight. I progressively gained intuition for what makes a contract hyper efficient. Simultaneously, with my reverse engineering background, I thought of ways to avoid frontrunning in a public mempool, which consists of analyzing a transaction’s calldata and contract to mimic and yoink it for yourself. This requires a sophisticated system that is fast enough to simulate all the possible pathways a transaction can take during its pending state. Evidently, you are able to create an exploit frontrunner and do good instead of wreckin’ loser bot operators. But, in all fairness, that is good fun so I understand.
Address Type Check
Enough lore, let’s get into gameplay. Some backstory on how normie frontrunners build their shitty bots: they simulate the tx by replacing the addresses with their own and checking if they made a profit. If that didn’t work, they would check if any of the addresses are contracts instead of EOAs, because maybe there was a smart contract requirement check. For example, CALLER EXTCODESIZE would check if the CALLER is a smart contract (so a contract has to call this contract), or you could use <address> EXTCODESIZE to make sure the <address> is a contract. Alternatively, you could check that the CALLER is an EOA by making sure its EXTCODESIZE is 0. Note that CODESIZE on its own returns the size of the currently executing contract’s code, not the caller’s, and that EXTCODESIZE also returns 0 for a contract whose constructor is still running, so a 0 is a strong hint rather than proof of an EOA.
It gets more advanced! This is just the tip of the iceberg. I realized that all bots were checking for either PUSH32 <address> or PUSH20 <address> to easily replace the address(es) with their own. Let me show you in code:
This is a standard way to execute a function. However, we can perform some bytecode obfuscation, that I purposely didn’t add to my smart contract obfuscation article because I thought I would use it later on (spoiler, I didn’t). It ended up being a constant backthought because I was going to add it into the exploit-suite to basically create the most advanced zero-day attacks seen to-date. After some conversations with frens, I decided I didn’t want to become a prison tenant and deal with the annoying roomies, so here we are…leaking alfa for funsises…
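The naive address swap described above can be sketched roughly like this. It is a minimal Python model, not any real bot's code: replace_pushed_addresses, VICTIM and ATTACKER are hypothetical names, and it only handles the two patterns mentioned (a bare PUSH20 and a left-padded PUSH32).

```python
PUSH1, PUSH32 = 0x60, 0x7F  # PUSH1..PUSH32 occupy opcodes 0x60..0x7f


def replace_pushed_addresses(code: bytes, victim: bytes, attacker: bytes) -> bytes:
    """Walk the bytecode and swap `victim` for `attacker` inside PUSH20/PUSH32 immediates."""
    out = bytearray(code)
    pc = 0
    while pc < len(code):
        op = code[pc]
        if PUSH1 <= op <= PUSH32:
            n = op - PUSH1 + 1                      # size of the immediate
            imm = code[pc + 1 : pc + 1 + n]
            if n == 20 and imm == victim:
                out[pc + 1 : pc + 21] = attacker
            elif n == 32 and imm == b"\x00" * 12 + victim:
                out[pc + 13 : pc + 33] = attacker   # keep the 12 zero-padding bytes
            pc += 1 + n                             # skip over the immediate
        else:
            pc += 1
    return bytes(out)


VICTIM = bytes.fromhex("11" * 20)    # hypothetical wallet
ATTACKER = bytes.fromhex("22" * 20)  # hypothetical frontrunner

# PUSH20 <victim> CALLER EQ
plain = bytes([0x73]) + VICTIM + bytes([0x33, 0x14])
patched = replace_pushed_addresses(plain, VICTIM, ATTACKER)
print(patched.hex())
```

Anything that keeps the address out of a clean PUSH20/PUSH32 immediate slips past a matcher this simple, which is the whole point of the tricks that follow.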
Address Shift
So this is our beautiful wallet, oh how glorious it is!
A piece of shit left bell-curve frontrunner would replace our precious wallet with the stinky dirty 20-bytes they call their address. But fear not, anon. We can do some bit-magic to our address to fuck with their targeting algorithm so we can keep our hard-earned (or not) profits. The first technique I’ll mention is changing the address position by adding a byte to the far right, causing a direct PUSH32 replacement to be futile. Let’s see what this looks like:
This is quite simple, though, and I wouldn’t expect it to work on its own; it’s merely a brain warm-up for you. The real alfa comes in the form of splitting up the address to make the operators think to themselves “wtf is going on” without staring at etherscan for some hours or using a tool that can show you the stack and memory. So, let’s make some reverse engineers cry.
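One way the address shift could look, as a sketch under my own assumptions: the address gets a junk byte appended on the right and is pushed as a 21-byte immediate, then shifted back with SHR. The opcode sequence, ADDR and PAD are illustrative choices, not the original listing.

```python
ADDR = 0x1111111111111111111111111111111111111111  # hypothetical wallet
PAD = 0xAA                                          # arbitrary junk byte on the far right

shifted = (ADDR << 8) | PAD   # 21-byte value that goes into the bytecode
recovered = shifted >> 8      # what SHR by 8 leaves on the stack at runtime

# PUSH21 <addr || pad>  PUSH1 0x08  SHR
bytecode = bytes([0x74]) + shifted.to_bytes(21, "big") + bytes([0x60, 0x08, 0x1C])

print(bytecode.hex())
print(hex(recovered))
```

The 20 address bytes still sit contiguously inside the immediate, so a raw byte search would find them; only matchers keyed on PUSH20/PUSH32 miss it.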
Address Split
We can go one step further, because what we’ve done here looks a bit too obvious. To a simple tool, though, this wouldn’t be easily recognizable without memory identification and taint analysis.
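A rough sketch of the address split, assuming the two halves are glued back together on the stack with SHL and OR (the original may well have used memory instead). ADDR and the opcode layout are hypothetical.

```python
ADDR = 0x00112233445566778899AABBCCDDEEFF01234567  # hypothetical 20-byte wallet

hi = ADDR >> 80                # top 10 bytes
lo = ADDR & ((1 << 80) - 1)    # bottom 10 bytes

# PUSH10 <hi>  PUSH1 0x50  SHL  PUSH10 <lo>  OR
bytecode = (
    bytes([0x69]) + hi.to_bytes(10, "big")
    + bytes([0x60, 0x50, 0x1B])            # shift the high half left by 80 bits
    + bytes([0x69]) + lo.to_bytes(10, "big")
    + bytes([0x17])                        # OR the halves together
)

recovered = (hi << 80) | lo    # same arithmetic the opcodes perform
print(bytecode.hex())
print(hex(recovered))
```

The full 20-byte address no longer appears as one contiguous run in the bytecode, so both opcode-level and raw byte-level replacement miss it.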
Split Normalization
To continue to mindfuck some more, for shits and gigs, let’s elaborate on this concept:
This is the exact same thing as before, but we separate the bytes into PUSH4 chunks, which are commonly read as function selectors in contracts. We’re hiding our address in plain sight! We do this twice; however, this might spark some interest in any snoopy devs. Nonetheless, this is quite advanced and, to my knowledge, has never been done or documented before.
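To sanity-check the PUSH4 idea, here is a sketch that emits the address as five selector-sized chunks and runs the result through a toy stack machine. The run helper only knows PUSHn, OR and SHL, and the chunk layout is my assumption, not the original bytecode.

```python
MASK256 = (1 << 256) - 1


def run(code: bytes) -> list:
    """Execute a tiny EVM subset (PUSH1..PUSH32, OR, SHL) and return the final stack."""
    stack = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        pc += 1
        if 0x60 <= op <= 0x7F:                 # PUSH1..PUSH32
            n = op - 0x5F
            stack.append(int.from_bytes(code[pc:pc + n], "big"))
            pc += n
        elif op == 0x17:                       # OR
            stack.append(stack.pop() | stack.pop())
        elif op == 0x1B:                       # SHL: shift amount on top, value below
            shift = stack.pop()
            value = stack.pop()
            stack.append((value << shift) & MASK256)
        else:
            raise ValueError(f"unsupported opcode 0x{op:02x}")
    return stack


ADDR = 0x00112233445566778899AABBCCDDEEFF01234567  # hypothetical wallet
raw = ADDR.to_bytes(20, "big")
chunks = [raw[i:i + 4] for i in range(0, 20, 4)]   # five "selectors"

code = bytes([0x63]) + chunks[0]                   # PUSH4 <chunk0>
for chunk in chunks[1:]:
    code += bytes([0x60, 0x20, 0x1B])              # PUSH1 32, SHL
    code += bytes([0x63]) + chunk                  # PUSH4 <chunk>
    code += bytes([0x17])                          # OR

result = run(code)
print(hex(result[0]))
```

To a disassembler every immediate here looks like a plain 4-byte selector; the address only exists once the code has actually been executed.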
XORy Frontrunners
To extend even further, let’s utilize the wonders of bit-magic, adding some XORs to make the operators really work for the frontrun.
How? Why would this make them work harder?
A simple copy and paste won’t be able to replace our address since it is being XORed. Effectively, we’ve “faked” 2 function selectors that are combined via XOR to create our actual address. At face value this looks super weird. My suggestion would be to add some bloater SWAPs and JUMPIs to make it harder to physically analyze w/o tools. Enough of the explaining though, let’s implement our new friend, the XOR operator.
For some initial understanding, 0xdeadbeef can be read as the following bits: 1101 1110 1010 1101 1011 1110 1110 1111
Now, in order to know how to utilize XOR (or eXclusive or) we need to know how it works:
- different inputs == 1 output, e.g. [0, 1] XOR == 1
- same inputs == 0 output, e.g. [0, 0] or [1, 1] XOR == 0
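The two rules above, plus the property that makes XOR useful for scrambling (applying the same key twice gives the original back), in a few lines of Python. The key value is arbitrary.

```python
# Truth table for XOR on single bits.
table = {(a, b): a ^ b for a in (0, 1) for b in (0, 1)}
for (a, b), out in table.items():
    print(f"[{a}, {b}] XOR == {out}")

# XOR is its own inverse: scramble with a key, XOR with the same key to undo it.
value = 0b1101_1110   # 0xDE
key = 0b1010_0101     # arbitrary key
scrambled = value ^ key
roundtrip = scrambled ^ key
print(f"{value:08b} ^ {key:08b} = {scrambled:08b}")
```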
Perfect, now let’s implement an example using our 0xDE with a XOR scramble:
You can see how batshit annoying this is to read. Additionally, you may think, “why don’t we just XOR the entire thing?”. That’s a good question, but it’s probably better if we only XOR slices of it to make the operators add more functionality to their algos, e.g. more downtime and complexity to account for, which we can constantly change and fuck around with. The frontrunner operator(s) will eventually come down to writing a generalized security tool…which is good? So, we build some random algorithm that adjusts the encoding of the XOR. I’ll leave it to you if you want to do the work (also it’s good experience to learn bit-magic to fully understand it!).
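XORing only a slice might look something like this. xor_slice is a hypothetical helper and the key and byte position are arbitrary; the idea is just that which slice gets scrambled can change from one deployment to the next.

```python
VALUE = 0xDEADBEEF
KEY = 0xA5  # arbitrary one-byte key


def xor_slice(value: int, key: int, byte_index: int, width: int = 4) -> int:
    """XOR `key` into one byte of a `width`-byte value; byte_index 0 is the leftmost byte."""
    shift = 8 * (width - 1 - byte_index)
    return value ^ (key << shift)


scrambled = xor_slice(VALUE, KEY, 0)     # only the 0xDE byte changes
restored = xor_slice(scrambled, KEY, 0)  # same call again undoes it

print(f"{VALUE:032b}")
print(f"{scrambled:032b}")
print(hex(scrambled), hex(restored))
```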
So how do we implement it into our last modification?
Let’s replace the 0xdeadbeef (1101 1110 1010 1101 1011 1110 1110 1111) in our obfuscated bytecode:
Now that we have our scrambled code, we can place it into the mix:
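A sketch of what the scrambled chunk could look like once it is back in bytecode form: the PUSH4 0xdeadbeef becomes two innocent-looking PUSH4s and a XOR. KEY is an arbitrary value I picked; the original article's constants are not recoverable here.

```python
TARGET = 0xDEADBEEF
KEY = 0x12345678          # arbitrary; could be re-rolled for every deployment
masked = TARGET ^ KEY     # the value that actually appears in the bytecode

# PUSH4 <masked>  PUSH4 <key>  XOR
bytecode = (
    bytes([0x63]) + masked.to_bytes(4, "big")
    + bytes([0x63]) + KEY.to_bytes(4, "big")
    + bytes([0x18])
)

recovered = masked ^ KEY  # what the XOR opcode leaves on the stack
print(bytecode.hex())
print(hex(recovered))
```

Neither immediate contains the original bytes, so a copy and paste replacement has nothing to latch onto.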
I think you might be getting the gist of obfuscation. None of it is really too hard to figure out; it just makes it a lengthy process. You can put SWAPs everywhere to make the code actually unreadable without a tool to compute the stack and memory. Mixing that in with some JUMPIs to random places, creating chaos in the code, will cause any CFG to explode in pathways and make it damn hard to actually understand, especially if the JUMPDESTs are all dependent on calldata. These could even be bluffs or do some bitwise operations (like XOR) to really make reverse engineering a living hell. The only real downside is the extra gas you pay: for example, replacing our PUSH20 with [PUSH4, PUSH4, XOR] above only added an extra 6 gas. This is really nothing for the benefits you get, especially on non-Ethereum blockchains where there is no private mempool to protect chud operators. Ultimately, these tactics can be bypassed by a memory analyzer that synthesizes a contract by understanding how ours works. Maybe I’ll write an article on this, as I’ve written one of these as well (for more info, check out here).
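The 6 gas figure checks out if you count execution gas only: every PUSHn (PUSH1 through PUSH32) and XOR costs 3 gas each. A quick tally, ignoring the one-off deployment cost of the extra bytes:

```python
GAS = {"PUSH": 3, "XOR": 3}  # execution cost per opcode


def cost(ops):
    """Sum the execution gas of a list of opcode names."""
    return sum(GAS[op] for op in ops)


plain = cost(["PUSH"])                       # PUSH20 <address>
obfuscated = cost(["PUSH", "PUSH", "XOR"])   # PUSH4, PUSH4, XOR
extra = obfuscated - plain
print(plain, obfuscated, extra)
```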
I truly hope flashbots ceases to exist so we can go back to the good ol’ days of open PVP without bs private orderflow creating an unfair game.
Vanity Addresses
The final end game for address obfuscation lies in the vanity address realm, where you can mine for an address like 0x000000363aac8ec1027502a4904bd9247675b42f or 0x000000363aac8ec1027502a4904bd92476000000. These are the hardest to automatically account for, since they require you to either generate an address with the exact same amount of zeros or create a custom contract synthesizer to bypass all the bullshit people like me will add. Although this is a challenging task, it’s a much more fruitful one than adding each tiny heuristic you’ll come across in your life as a generalized dickhead.
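To make "the exact same amount of zeros" concrete, here is a small sketch that measures the zero-byte profile of an address. zero_byte_profile is a hypothetical helper; the actual mining loop (key generation and hashing) is left out since it needs libraries outside the standard library.

```python
def zero_byte_profile(address: str) -> tuple:
    """Return (leading_zero_bytes, trailing_zero_bytes) for a hex address string."""
    hex_part = address[2:] if address.startswith("0x") else address
    raw = bytes.fromhex(hex_part)
    leading = len(raw) - len(raw.lstrip(b"\x00"))
    trailing = len(raw) - len(raw.rstrip(b"\x00"))
    return leading, trailing


a = zero_byte_profile("0x000000363aac8ec1027502a4904bd9247675b42f")
b = zero_byte_profile("0x000000363aac8ec1027502a4904bd92476000000")
print(a, b)
```

A frontrunner who wants to drop in their own address would need one with a matching profile wherever the contract leans on those zero bytes.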
Auto Adjustable Pending Tx
There are actually some cool tricks done in current-day contracts in relation to COINBASE. You can update your execution mid-transaction by implementing switch cases! I didn’t come up with this genius tech, but I thought I’d explain it since we’re on the topic of flashbots.
In each of these parts you could adjust your strategy on the fly depending on which builder picks it up in the end. Maybe there is some nuance nuggie you can pick up during a period a specific builder wins in, since their algorithms would be different for each kind of environment (for more info check out this podcast w/ Titan Builder).
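The switch-case idea, modeled in Python rather than opcodes. On chain this would be a series of COINBASE comparisons followed by conditional jumps; the builder addresses and strategy names below are made up purely for illustration.

```python
# Hypothetical builder fee-recipient addresses (placeholders, not real builders).
BUILDER_A = "0x" + "aa" * 20
BUILDER_B = "0x" + "bb" * 20

STRATEGIES = {
    BUILDER_A: "aggressive_tip",
    BUILDER_B: "minimal_tip",
}


def pick_strategy(coinbase: str) -> str:
    """Mimic a COINBASE switch: choose a branch based on who is building the block."""
    return STRATEGIES.get(coinbase.lower(), "default")


print(pick_strategy(BUILDER_A))
print(pick_strategy("0x" + "cc" * 20))
```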
Final
If this article sparked some ideas, feel free to tag me on the bird app (twitter)! I would love to see the smart contract arena evolve, despite its limited potential. Hopefully you can find some use out of these techniques, for the greater good (looking at you, whitehats).
I appreciate you for taking the time to read this article. I hope you found value in this!
Godspeed, anon.