Swimming Safely In The Public Mempool: MEV Smart Contract Obfuscation Techniques

Disclaimer

The information provided in this article is intended solely for educational purposes and should not be used for illegal activities. The focus is on enhancing awareness and understanding of smart contract security and best practices in blockchain technology. This article does not endorse or encourage using these techniques for exploiting vulnerabilities in real-world smart contract applications. The author and publisher of this article disclaim any liability for the misuse of the information contained herein and any damages that may arise from such misuse.

Preface

I originally wrote an article on smart contract obfuscation techniques and how generalized frontrunners (GFs) work. This is the second edition, combining the knowledge of smart contract obfuscation and GFs (I’m a single virgin dev, btw) to create a (not anymore) zero-day technique for avoiding those pesky buggers in public mempools so we can actually execute our transactions.

This article will provide actual code examples, in opcodes, so you can both learn these techniques in their original form and also learn how someone like me likes to spend their day writing smart contracts — improving your understanding of low-level programming alongside attempting to upskill you into the next level of smart contract development :)

So, how do you avoid GFs (in my case, how do I get one…)?

There are two options: you use private transactions (if you’re on Ethereum), or you come up with some tactics to make your transactions unfrontrunable. We will be focusing on the latter.

Unfrontrunable Obfuscation

During 2023 I spent the entire year writing an exploit suite for smart contracts, including around 5 months straight of bytecode-level smart contract development. I progressively gained intuition for what makes a contract hyper efficient. Simultaneously, with my reverse engineering background, I thought of ways to avoid frontrunning in a public mempool, which consists of analyzing a transaction’s calldata and contract to mimic it and yoink it for yourself. This requires a sophisticated system that is fast enough to simulate all the possible pathways a transaction can take during its pending state. Evidently, you could build an exploit frontrunner and do good instead of wreckin’ loser bot operators. But, in all fairness, that is good fun so I understand.

Address Type Check

Enough lore, let’s get into gameplay. Some backstory on how normie frontrunners build their shitty bots: they simulate the tx by replacing the addresses with their own and checking if they made a profit. If that didn’t work, they check whether any of the addresses are contracts instead of EOAs, because maybe there was a smart contract requirement check, e.g.:

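CALLER EXTCODESIZE or <address> EXTCODESIZE
ISZERO
PUSH2 <dest> // placeholder jump target, taken when the code size is 0
JUMPI
...
JUMPDEST
STOP

CALLER EXTCODESIZE checks whether the CALLER is a smart contract (so a contract has to call this contract), or you could use <address> EXTCODESIZE to make sure the <address> is a contract. Plain CODESIZE only returns the size of the code that is currently executing, so it says nothing about the caller. Alternatively, you could check that the CALLER is an EOA by making sure its EXTCODESIZE is 0, though a contract calling from inside its own constructor also reports 0.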

It gets more advanced! This is just the tip of the iceberg. I realized that all bots were checking for either PUSH32 <address> or PUSH20 <address> to easily replace the address(es) with their own. Let me show you in code:

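// fn selector, shifted to the top of the word so it lands in bytes 0x00..0x03
PUSH4 0xabcd1234
PUSH1 0xe0
SHL
PUSH0
MSTORE

// calldata (our wallet), a 32-byte word at 0x04..0x23
PUSH20 0x699694BaaB565C1557AcE8CAde5a7c71deadbeef
PUSH1 0x04
MSTORE

// fn call (CALL pops: gas, address, value, argsOffset, argsSize, retOffset, retSize)
PUSH0 PUSH0      // return spec: retSize, retOffset
PUSH1 0x24 PUSH0 // calldata spec: argsSize (4 + 32 bytes), argsOffset
PUSH0            // callvalue
PUSH20 0x3fC91A3afd70395Cd496C647d5a6CC9D4B2b7FAD // uni router
GAS              // fart

CALL // execute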
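To make the bot's side of this concrete, here is a minimal Python sketch of the naive PUSH20 swap described above. It is an assumption about what such a matcher could look like, not any real bot's code: `swap_push20` and the `attacker` address are made up for illustration.

```python
PUSH1, PUSH20, PUSH32, MSTORE = 0x60, 0x73, 0x7F, 0x52


def swap_push20(code: bytes, victim: bytes, attacker: bytes) -> bytes:
    """Walk the bytecode opcode by opcode and rewrite PUSH20 <victim> into PUSH20 <attacker>."""
    out = bytearray(code)
    pc = 0
    while pc < len(code):
        op = code[pc]
        # PUSH1..PUSH32 carry 1..32 immediate bytes that must be skipped, not decoded.
        width = op - PUSH1 + 1 if PUSH1 <= op <= PUSH32 else 0
        if op == PUSH20 and code[pc + 1 : pc + 21] == victim:
            out[pc + 1 : pc + 21] = attacker
        pc += 1 + width
    return bytes(out)


victim = bytes.fromhex("699694BaaB565C1557AcE8CAde5a7c71deadbeef")
attacker = bytes.fromhex("11" * 20)  # hypothetical bot address

# PUSH20 <wallet> PUSH1 0x04 MSTORE, the calldata step from the listing above
plain = bytes([PUSH20]) + victim + bytes([PUSH1, 0x04, MSTORE])

print(swap_push20(plain, victim, attacker).hex())
```

Anything that keeps the wallet out of a clean PUSH20 immediate slips past a matcher this simple, which is the whole game in the sections that follow.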

This is a standard way to execute a function. However, we can perform some bytecode obfuscation that I purposely didn’t add to my smart contract obfuscation article because I thought I would use it later on (spoiler, I didn’t). It ended up being a constant backthought because I was going to add it into the exploit suite to basically create the most advanced zero-day attacks seen to date. After some conversations with frens, I decided I didn’t want to become a prison tenant and deal with the annoying roomies, so here we are…leaking alfa for funsises…

Address Shift

So this is our beautiful wallet, oh how glorious it is!

// Our wallet
PUSH20 0x699694BaaB565C1557AcE8CAde5a7c71deadbeef

A piece of shit left bell-curve frontrunner would replace our precious wallet with the stinky dirty 20-bytes they call their address. But fear not, anon. We can do some bit-magic to our address to fuck with their targeting algorithm so we can keep our hard-earned (or not) profits. The first technique I’ll mention is changing the address position by adding a byte to the far right, so a direct PUSH20 or PUSH32 address replacement is futile. Let’s see what this looks like:

// Removes the final 0s           -->            00
PUSH21 0x699694baab565c1557ace8cade5a7c71deadbeef00
PUSH1 0x08
SHR
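A quick Python check of the shift trick, as a sketch only. It assumes the padded value is pushed as a 21-byte immediate (PUSH21, opcode 0x74) and compares two hypothetical matchers: one that looks for the exact PUSH20 pattern and one that just searches for the raw 20 bytes.

```python
PUSH1, PUSH20, PUSH21, SHR = 0x60, 0x73, 0x74, 0x1C

wallet = bytes.fromhex("699694baab565c1557ace8cade5a7c71deadbeef")
padded = wallet + b"\x00"  # the extra byte on the far right

# PUSH21 <wallet || 00> PUSH1 0x08 SHR
shifted_code = bytes([PUSH21]) + padded + bytes([PUSH1, 0x08, SHR])

# What SHR leaves on the stack: the padded value shifted right by 8 bits
on_stack = int.from_bytes(padded, "big") >> 8

# Matcher 1: exact "PUSH20 <wallet>" byte pattern
opcode_match = (bytes([PUSH20]) + wallet) in shifted_code
# Matcher 2: raw 20-byte search anywhere in the bytecode
raw_match = wallet in shifted_code

print(hex(on_stack), opcode_match, raw_match)
```

The pattern matcher misses it, but the raw search still finds the 20 contiguous bytes, which is why this one is only a warm-up.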

Although this is quite simple and I wouldn’t expect it to work, it’s merely a brain warm-up for you. The real alfa comes in the form of splitting up the address to make the operators think to themselves “wtf is going on” unless they stare at etherscan for some hours or use a tool that can show them the stack and memory. So, let’s make some reverse engineers cry.

Address Split

// We split the 20-byte "0x699694baab565c1557ace8cade5a7c71deadbeef" in half (two 10-byte chunks)
PUSH10 0x699694BaaB565C1557Ac
PUSH1 0x50 // shift left by 80 bits (10 bytes) to make room for the low half
SHL
PUSH10 0xE8CAde5a7c71deadbeef
// Then combine them
OR
PUSH0
MSTORE // Creates, 0x00: 0x00..699694baab565c1557ace8cade5a7c71deadbeef
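The SHL/OR recombination is easy to sanity-check off-chain. A small Python sketch follows; `join_halves` is a made-up helper name. The high half has to move left by the full width of the low half, which is 10 bytes, i.e. 80 bits (0x50).

```python
WALLET = 0x699694BaaB565C1557AcE8CAde5a7c71deadbeef
HI = 0x699694BaaB565C1557Ac  # first 10 bytes
LO = 0xE8CAde5a7c71deadbeef  # last 10 bytes


def join_halves(hi: int, lo: int, lo_bytes: int = 10) -> int:
    """Mirror PUSH10 hi, PUSH1 shift, SHL, PUSH10 lo, OR with shift = 8 * lo_bytes."""
    return (hi << (8 * lo_bytes)) | lo


rebuilt = join_halves(HI, LO)
print(hex(rebuilt))

# A shift of only 64 bits (0x40) makes the halves overlap and corrupts the address.
overlapped = (HI << 64) | LO
print(rebuilt == WALLET, overlapped == WALLET)
```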

We can go one step further, since what we’ve done here still looks a bit too obvious, although a simple tool wouldn’t easily recognize it without memory identification and taint analysis.

Split Normalization

To continue to mindfuck some more, for shits and gigs, let’s elaborate on this concept:

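PUSH4 0x699694Ba
PUSH1 0xe0 // left-align the 4 bytes in the 32-byte word
SHL
PUSH0
MSTORE

PUSH12 0xaB565C1557AcE8CAde5a7c71
PUSH1 0xa0 // left-align the 12 bytes
SHL
PUSH1 0x04
MSTORE

PUSH4 0xdeadbeef
PUSH1 0xe0
SHL
PUSH1 0x10 // byte 16, right after the first 4 + 12 bytes
MSTORE

// Creates, 0x00: 0x699694baab565c1557ace8cade5a7c71deadbeef..00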
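Since MSTORE always writes a full 32-byte word with the value right-aligned, the resulting layout is worth checking with a tiny memory model. The Python sketch below is not a real EVM; `mstore` and `store_chunk` are illustrative helpers, and it assumes each chunk is shifted to the top of its word before being stored.

```python
def mstore(mem: bytearray, offset: int, value: int) -> None:
    """Model MSTORE: write value as a 32-byte big-endian word at mem[offset:offset+32]."""
    if len(mem) < offset + 32:
        mem.extend(bytes(offset + 32 - len(mem)))  # memory expands with zero bytes
    mem[offset : offset + 32] = value.to_bytes(32, "big")


def store_chunk(mem: bytearray, offset: int, chunk: int, size: int) -> None:
    """Model PUSH<size> chunk, SHL by 8 * (32 - size), then MSTORE at offset."""
    mstore(mem, offset, chunk << (8 * (32 - size)))


mem = bytearray()
store_chunk(mem, 0x00, 0x699694BA, 4)
store_chunk(mem, 0x04, 0xAB565C1557ACE8CADE5A7C71, 12)
store_chunk(mem, 0x10, 0xDEADBEEF, 4)
print(bytes(mem[:20]).hex())

# Without the shift, a bare PUSH4 + MSTORE at 0x00 lands in bytes 28..31 instead.
naive = bytearray()
mstore(naive, 0x00, 0x699694BA)
print(bytes(naive).hex())
```

Writing left to right matters here: each store zeroes the 32 bytes it covers, so a later chunk must never sit to the left of an earlier one.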

This is the exact same thing as before, but we separate the bytes into PUSH4s, which commonly represent function selectors in contracts. We’re hiding our address in plain sight! We do this twice, which might spark some interest in any snoopy devs. Nonetheless, this is quite advanced and, to my knowledge, has never been done or documented before.

XORy Frontrunners

To extend even further, let’s utilize the wonders of bit-magic, adding some XORs to make the operators really work for the frontrun.

How? Why would this make them work harder?

A simple copy and paste won’t be able to replace our address since it is being XORed. Effectively, we’ve “faked” 2 function selectors that are combined via XOR to create part of our actual address. At face value this looks super weird. My suggestion would be to add some bloater SWAPs and JUMPIs to make it harder to physically analyze w/o tools. Enough of the explaining though, let’s implement our new friend, the XOR operator.

For some initial understanding, 0xdeadbeef can be read as the following bits:

// D    E    A    D    B    E    E    F
1101 1110 1010 1101 1011 1110 1110 1111

Now in order to know how to utilize XOR (or eXclusive or) we need to know how it works:

different inputs == 1 output, e.g. [0, 1] XOR == 1
same inputs == 0 output, e.g. [0, 0] or [1, 1] XOR == 0

Perfect, now let’s implement an example using our 0xDE with a XOR scramble:

// Combining these two give us the original 0xDE (1101 1110)
0111 0101 // 0x 7 5
1010 1011 // 0x A B
XOR

You can see how batshit annoying this is to read. Additionally, you may think, “why don’t we just XOR the entire thing?”. That’s a good question, but it’s probably better to XOR only slices of it so the operators have to add more functionality to their algos, i.e. more downtime and complexity to account for, which we can constantly change and fuck around with. The frontrunner operator(s) will eventually end up writing a generalized security tool…which is good? So, we build some random algorithm that adjusts the encoding of the XOR. I’ll leave it to you if you want to do the work (it’s also good experience for learning bit-magic and fully understanding it!).

So how do we implement it into our last modification?

Let’s replace the 0xdeadbeef (1101 1110 1010 1101 1011 1110 1110 1111) in our obfuscated bytecode:

...
// PUSH4 0xdeadbeef: 1101 1110 1010 1101 1011 1110 1110 1111
PUSH4 0xB32B76F2  // 1011 0011 0010 1011 0111 0110 1111 0010
PUSH4 0x6D86C81D  // 0110 1101 1000 0110 1100 1000 0001 1101
XOR
...
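For the "random algorithm" part, one possible starting point is sketched below in Python. `xor_split` is a made-up helper, not something from a library: it picks a random mask of the same width and returns the pair whose XOR gives back the original slice, so every build can ship different constants.

```python
import secrets


def xor_split(value: int, size: int) -> tuple[int, int]:
    """Return (a, b), each `size` bytes wide, such that a ^ b == value."""
    mask = secrets.randbits(8 * size)
    return mask, value ^ mask


# The hand-picked pairs from the text decode as expected.
assert 0x75 ^ 0xAB == 0xDE
assert 0xB32B76F2 ^ 0x6D86C81D == 0xDEADBEEF

# A fresh random pair for the same 4-byte slice, printed as opcodes.
a, b = xor_split(0xDEADBEEF, 4)
print(f"PUSH4 0x{a:08X}")
print(f"PUSH4 0x{b:08X}")
print("XOR")
```

The pair changes on every run, so a frontrunner can't hardcode the constants, only the pattern.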

Now that we have our scrambled code, we can place it into the mix:

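PUSH4 0x699694Ba
PUSH1 0xe0 // left-align the 4 bytes in the 32-byte word
SHL
PUSH0
MSTORE

PUSH12 0xaB565C1557AcE8CAde5a7c71
PUSH1 0xa0 // left-align the 12 bytes
SHL
PUSH1 0x04
MSTORE

// Creates: 0xdeadbeef
PUSH4 0xB32B76F2
PUSH4 0x6D86C81D
XOR

PUSH1 0xe0
SHL
PUSH1 0x10 // byte 16, right after the first 4 + 12 bytes
MSTORE

// Creates, 0x00: 0x699694baab565c1557ace8cade5a7c71deadbeef..00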

I think you might be getting the gist of obfuscation. None of it is too hard to figure out; it just makes reversing a lengthy process. You can put SWAPs everywhere to make the code actually unreadable without a tool to compute the stack and memory. Mixing in some JUMPIs to random places creates chaos in the code and causes any CFG to explode in pathways, making it damn hard to actually understand, especially if the jump destinations all depend on calldata. These could even be bluffs, or do some bitwise operations (like XOR), to really make reverse engineering a living hell. The only real downside is the extra gas you pay: in the example above, replacing the PUSH4 0xdeadbeef with [PUSH4, PUSH4, XOR] only added an extra 6 gas. This is really nothing for the benefits you get, especially on non-Ethereum EVM chains where there is no private mempool to protect chud operators. Ultimately, these tactics can be bypassed by a memory analyzer that synthesizes a contract by understanding how ours works. Maybe I’ll write an article on this, as I’ve built this as well (for more info, check out here)
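The 6 gas figure can be reproduced with a back-of-the-envelope tally. The Python sketch below only counts static opcode costs (3 gas each for PUSH4 and XOR) and bytecode size; it ignores memory expansion and anything else in the transaction, and `static_gas` and `byte_size` are illustrative helpers.

```python
# Static gas cost and encoded size (opcode byte + immediate bytes) per opcode
GAS = {"PUSH4": 3, "XOR": 3}
SIZE = {"PUSH4": 1 + 4, "XOR": 1}


def static_gas(ops: list[str]) -> int:
    return sum(GAS[op] for op in ops)


def byte_size(ops: list[str]) -> int:
    return sum(SIZE[op] for op in ops)


plain = ["PUSH4"]                    # PUSH4 0xdeadbeef
scrambled = ["PUSH4", "PUSH4", "XOR"]  # PUSH4 a, PUSH4 b, XOR

extra_gas = static_gas(scrambled) - static_gas(plain)
extra_bytes = byte_size(scrambled) - byte_size(plain)
print(extra_gas, extra_bytes)
```

Each scrambled slice also makes the contract a few bytes longer, which costs a little more at deployment, so it is worth scrambling selectively rather than everywhere.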

I truly hope flashbots ceases to exist so we can go back to the good ol’ days of open PVP without bs private orderflow creating an unfair game.

Vanity Addresses

The final end game for address obfuscation lies in the vanity address realm, where you can mine for an address like 0x000000363aac8ec1027502a4904bd9247675b42f or 0x000000363aac8ec1027502a4904bd92476000000. These are the hardest to account for automatically, since the frontrunner either has to generate an address with the exact same number of zeros or create a custom contract synthesizer to bypass all the bullshit people like me will add. Although the latter is a challenging task, it’s a much more fruitful one than adding each tiny heuristic you’ll come across in your life as a generalized dickhead.
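To get a feel for what matching a vanity pattern costs a copycat, here is a rough Python estimate. It assumes mined addresses behave like uniformly random 20-byte strings, so each required zero byte multiplies the expected number of attempts by 256; `zero_pattern` and `expected_attempts` are made-up helpers, and no actual key or address derivation happens here.

```python
def zero_pattern(addr: str) -> tuple[int, int]:
    """Count leading and trailing zero bytes of a hex address."""
    raw = bytes.fromhex(addr.removeprefix("0x"))
    leading = len(raw) - len(raw.lstrip(b"\x00"))
    trailing = len(raw) - len(raw.rstrip(b"\x00"))
    return leading, trailing


def expected_attempts(zero_bytes: int) -> int:
    """Average tries to hit `zero_bytes` fixed zero bytes, assuming uniform addresses."""
    return 256 ** zero_bytes


for addr in (
    "0x000000363aac8ec1027502a4904bd9247675b42f",
    "0x000000363aac8ec1027502a4904bd92476000000",
):
    lead, trail = zero_pattern(addr)
    print(addr, lead, trail, expected_attempts(lead + trail))
```

Three zero bytes is cheap to mine; six is already in the hundreds of trillions of attempts on average, which is not something a bot does per pending transaction.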

Auto Adjustable Pending Tx

There are actually some cool tricks done in current-day contracts in relation to COINBASE. You can update your execution mid-transaction by implementing switch cases! I didn’t come up with this genius tech, but I thought I’d explain it since we’re on the topic of flashbots.

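// <..._dest> are placeholder jump targets, one JUMPDEST per builder-specific path

// Titan Builder
COINBASE
PUSH20 0x4838B106FCe9647Bdf1E7877BF73cE8B0BAD5f97
EQ PUSH2 <titan_dest> JUMPI

// Beaver Builder
COINBASE
PUSH20 0x95222290DD7278Aa3Ddd389Cc1E1d165CC4BAfe5
EQ PUSH2 <beaver_dest> JUMPI

// Rsync Builder
COINBASE
PUSH20 0x1f9090aaE28b8a3dCeaDf281B0F12828e676c326
EQ PUSH2 <rsync_dest> JUMPI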

...

In each of these parts you could adjust your strategy on the fly depending on what builder picks it up in the end. Maybe there is some nuance nuggie you can pick up during a period a specific builder wins in — their algorithms would be different for each kind of environment (for more info check out this podcast w/ Titan Builder).
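The switch itself is just a lookup keyed on the block's fee recipient. A Python sketch of the same dispatch follows; the strategy names are purely hypothetical labels, and only the three builder addresses come from the listing above.

```python
# Builder fee-recipient addresses from the COINBASE switch; strategy labels are made up.
STRATEGIES = {
    "0x4838b106fce9647bdf1e7877bf73ce8b0bad5f97": "titan_path",
    "0x95222290dd7278aa3ddd389cc1e1d165cc4bafe5": "beaver_path",
    "0x1f9090aae28b8a3dceadf281b0f12828e676c326": "rsync_path",
}


def pick_strategy(coinbase: str, default: str = "default_path") -> str:
    """Mirror the chain of COINBASE / EQ / JUMPI checks, falling through to a default."""
    return STRATEGIES.get(coinbase.lower(), default)


print(pick_strategy("0x4838B106FCe9647Bdf1E7877BF73cE8B0BAD5f97"))
print(pick_strategy("0x0000000000000000000000000000000000000000"))
```

On-chain the fall-through case is whatever code follows the last JUMPI, so it is worth deciding up front what the contract does when none of the known builders lands the block.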

Final

If this article sparked some ideas, feel free to tag me on the bird app (twitter)! I would love to see the smart contract arena evolve, despite its limited potential. Hopefully you can find some use out of these techniques, for the greater good (looking at you, whitehats).

I appreciate you for taking the time to read this article. I hope you found value in this!

Godspeed, anon.