Reversing The EVM: Raw Calldata

You may have wondered how to decipher and read EVM calldata, then attempted to read the transaction calldata of an Ethereum smart contract, only to become confused at a certain point. The EVM (and EVM-compatible chains) encodes and decodes calldata in a specific way for static and dynamic types, which can be confusing at first. In this article, we will delve into the encoding sequence of calldata so that you can comprehend any verified or unverified smart contract transaction and understand the bytes. By doing so, I hope to empower you to create your own raw calldata.

What Is Calldata?

Calldata is the encoded parameters that we send to functions, in this case to smart contracts on the Ethereum Virtual Machine (EVM). ABI-encoded calldata is laid out in 32-byte words, or 64 hex characters each, with a 4-byte function selector in front when a function is being called. There are two kinds of encoded types: static and dynamic.

Static variables are fairly straightforward to understand. Dynamic variables, on the other hand, are much more complex, and this is likely the reason why you may have difficulty reading raw calldata intuitively. However, once we go through how dynamic variables work, you will be able to read raw calldata with ease.

To begin, let’s understand how calldata is encoded and decoded to establish a foundation of how it all works.

Encoding Calldata

To encode types, you can pass them into the `abi.encode(parameters)` method to generate raw calldata.

If you want to encode calldata for a specific interfaced function, you can use abi.encodeWithSelector(selector, parameters). This will be the same as passing in the function and its parameters directly.

For example:

interface A {
  function transfer(uint256[] memory ids, address to) external;
}

contract B {
  function a(uint256[] memory ids, address to) external pure returns (bytes memory) {
    return abi.encodeWithSelector(A.transfer.selector, ids, to);
  }
}

The .selector member returns the 4 bytes that identify that function on the interface. We use it to tell the EVM which function our calldata is meant for. Handing encoded bytes to another contract like this is also how UniswapV2 flash swaps pass data to their callback!
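Off-chain, the same layout can be put together by hand. Here is a minimal sketch in plain Python (standard library only), assuming every argument is a static value that fits in one 32-byte word. The helper names and the example values are made up for illustration and are not part of any library. It shows that the output of `abi.encodeWithSelector` is just the 4-byte selector followed by what `abi.encode` would produce.

```python
def encode_word(value: int) -> bytes:
    """Left-pad an unsigned integer (or an address read as an integer) to 32 bytes."""
    return value.to_bytes(32, "big")


def encode_with_selector(selector: bytes, *static_args: int) -> bytes:
    """Concatenate a 4-byte selector with one 32-byte word per static argument."""
    if len(selector) != 4:
        raise ValueError("a selector is exactly 4 bytes")
    return selector + b"".join(encode_word(arg) for arg in static_args)


# a9059cbb is the well-known selector of ERC-20 transfer(address,uint256).
selector = bytes.fromhex("a9059cbb")
# Illustrative recipient and amount, chosen only for this example.
recipient = int("68b3465833fb72A70ecDF485E0e4C7bD8665Fc45", 16)
calldata = encode_with_selector(selector, recipient, 1000)
print("0x" + calldata.hex())
```

Dynamic arguments such as `uint256[]` need offsets and lengths as well, so this helper deliberately stops at static types.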

There is also abi.encodePacked(...), which packs the values tightly together, dropping the zero padding (and, for dynamic types, the offset and length words). The problem is that it doesn't prevent collisions: different inputs can produce the same bytes, so it should only be used when you know the types and lengths of the parameters for certain.
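To see why packed encoding can collide, here is a small Python sketch. Both helpers are simplified stand-ins written for this example: the first mimics `abi.encodePacked` for string arguments (raw bytes, nothing else), and the second keeps a length word and padding per string, leaving out the offsets that real `abi.encode` would add.

```python
def encode_packed_strings(*parts: str) -> bytes:
    """Mimic abi.encodePacked for strings: raw UTF-8 bytes, no length, no padding."""
    return b"".join(part.encode("utf-8") for part in parts)


def encode_padded_strings(*parts: str) -> bytes:
    """Simplified stand-in for abi.encode: each string keeps its length word and padding."""
    out = b""
    for part in parts:
        raw = part.encode("utf-8")
        padding = (-len(raw)) % 32  # bytes needed to reach the next 32-byte boundary
        out += len(raw).to_bytes(32, "big") + raw + b"\x00" * padding
    return out


# Two different argument lists collapse into the same packed bytes...
print(encode_packed_strings("a", "bc") == encode_packed_strings("ab", "c"))
# ...while the length-prefixed form keeps them apart.
print(encode_padded_strings("a", "bc") == encode_padded_strings("ab", "c"))
```

Once the boundaries between values are gone, nothing in the bytes says where one argument ended and the next began, which is exactly the collision risk described above.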

Decoding calldata

So you have your calldata, how do you decode it?

If the calldata was created with abi.encode(...) then we can decode the parameters with abi.decode(...) by passing in the parameters we want to decode the calldata into.

For example:

(uint256 a, uint256 b) = abi.decode(data, (uint256, uint256));

Where data represents the calldata being passed in.
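For intuition, decoding two `uint256` values amounts to reading two big-endian 32-byte words. A rough Python equivalent of the `abi.decode` call above, with a hypothetical helper name and sample values picked for this sketch:

```python
def decode_uint256_pair(data: bytes) -> tuple[int, int]:
    """Read two consecutive 32-byte big-endian words as unsigned integers."""
    if len(data) < 64:
        raise ValueError("expected at least 64 bytes")
    a = int.from_bytes(data[0:32], "big")
    b = int.from_bytes(data[32:64], "big")
    return a, b


# Sample input: the values 7 and 42, each padded to 32 bytes.
data = (7).to_bytes(32, "big") + (42).to_bytes(32, "big")
print(decode_uint256_pair(data))
```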

Now that we understand how to encode and decode parameters, we can move onto the different variable types and how they are reflected in the calldata output.

Static Variables

Static variables are simply the encoded representation of the following types: uints, ints, address, bool, bytes1 to bytes32 (including function selectors, which are bytes4), and tuples or fixed-size arrays whose members are all static (a tuple that contains a dynamic type is itself dynamic).

For example, let's say we're interacting with the following contract:

pragma solidity 0.8.17;
contract Example {
    function transfer(uint256 amount, address to) external {}
}

With the input parameters:

amount: 1300655506
to: 0x68b3465833fb72A70ecDF485E0e4C7bD8665Fc45

We would generate the calldata: 0x000000000000000000000000000000000000000000000000000000004d866d9200000000000000000000000068b3465833fb72a70ecdf485e0e4c7bd8665fc45

But… how tf do we read this?

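Well, let's chop it up into readable pieces, by first removing the prefix 0x and then breaking it into lines of 64 characters (32 bytes) each:

000000000000000000000000000000000000000000000000000000004d866d92 // uint256
00000000000000000000000068b3465833fb72a70ecdf485e0e4c7bd8665fc45 // address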

Cool, now we know the first 32-bytes is the uint256 amount variable and the 2nd 32-bytes is the address to one.
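The chopping can also be done in a few lines of Python. This is a sketch using only the standard library; `split_words` is a name invented here, and the calldata is rebuilt from the two values in the example so the zero padding is explicit.

```python
# Same bytes as the calldata above: each value left-padded with zeros to 64 hex characters.
CALLDATA = (
    "0x"
    + "4d866d92".zfill(64)
    + "68b3465833fb72a70ecdf485e0e4c7bd8665fc45".zfill(64)
)


def split_words(hex_data: str) -> list[str]:
    """Drop the 0x prefix and cut the rest into 64-character (32-byte) words."""
    body = hex_data[2:] if hex_data.startswith("0x") else hex_data
    if len(body) % 64 != 0:
        raise ValueError("expected a whole number of 32-byte words")
    return [body[i:i + 64] for i in range(0, len(body), 64)]


words = split_words(CALLDATA)
amount = int(words[0], 16)       # uint256: the whole word read as a number
to = "0x" + words[1][-40:]       # address: the last 20 bytes of the word
print(amount, to)
```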


But what if we wanted to call the transfer function directly?

We would need to know the function's name and the types of its parameters, in order, and hash them with keccak256, a hash function that turns the input into a 32-byte hash.

In this case, to hash:

function transfer(uint256 amount, address to) external;

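We would do:

keccak256("transfer(uint256,address)")

Only the function name and the parameter types are hashed, with no parameter names and no spaces. This returns a 32-byte hash.

To get the function selector we only need the first 4 bytes of that hash (8 characters, excluding the 0x prefix): b7760c8f.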

This 4-byte selector, b7760c8f, is how we tell the EVM which function we're interacting with; the calldata that follows it is passed in as the parameters.
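If you want to compute selectors off-chain without extra dependencies, here is a compact pure-Python Keccak-256. It is a sketch written for readability rather than speed, and in practice you would probably reach for a maintained library. Note that `hashlib.sha3_256` is not a substitute: SHA-3 uses different padding, so it produces different digests. The function names are my own.

```python
MASK64 = (1 << 64) - 1


def _rol64(value: int, shift: int) -> int:
    """Rotate a 64-bit lane left by `shift` bits."""
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK64


def _keccak_f1600(lanes: list[list[int]]) -> list[list[int]]:
    """Apply the 24-round Keccak-f[1600] permutation to a 5x5 array of lanes."""
    lfsr = 1
    for _ in range(24):
        # theta: mix each column's parity into its neighbours
        c = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4] for x in range(5)]
        d = [c[(x + 4) % 5] ^ _rol64(c[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ d[x] for y in range(5)] for x in range(5)]
        # rho and pi: rotate lanes and move them to new positions
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], _rol64(current, (t + 1) * (t + 2) // 2)
        # chi: non-linear step along each row
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & row[(x + 2) % 5])
        # iota: add the round constant, generated by a small LFSR
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    return lanes


def keccak256(data: bytes) -> bytes:
    """Keccak-256 with the original padding (0x01 ... 0x80), as used by the EVM."""
    rate = 136  # bytes absorbed per block for a 256-bit digest
    padded = bytearray(data)
    pad_len = rate - (len(data) % rate)
    padded += b"\x01" + b"\x00" * (pad_len - 1)
    padded[-1] |= 0x80
    lanes = [[0] * 5 for _ in range(5)]
    for offset in range(0, len(padded), rate):
        block = padded[offset:offset + rate]
        for i in range(rate // 8):
            lanes[i % 5][i // 5] ^= int.from_bytes(block[8 * i:8 * i + 8], "little")
        lanes = _keccak_f1600(lanes)
    return b"".join(lanes[i % 5][i // 5].to_bytes(8, "little") for i in range(4))


def selector(signature: str) -> str:
    """Return the first 4 bytes of keccak256(signature) as hex."""
    return keccak256(signature.encode("ascii"))[:4].hex()


print(selector("transfer(uint256,address)"))
print(selector("transfer(address,uint256)"))
```

A handy sanity check is the ERC-20 signature `transfer(address,uint256)`, whose selector is widely documented as a9059cbb.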

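For example, if we were to call transfer with the same parameters as before in the Static Variables section, we would grab the existing calldata:

0x000000000000000000000000000000000000000000000000000000004d866d9200000000000000000000000068b3465833fb72a70ecdf485e0e4c7bd8665fc45

And prepend the 4-byte selector b7760c8f, so that it sits in front of the first 32-byte word:

0xb7760c8f000000000000000000000000000000000000000000000000000000004d866d9200000000000000000000000068b3465833fb72a70ecdf485e0e4c7bd8665fc45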


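You may be wondering: how does the function actually receive its parameters when the selector is embedded in front of them?

The answer is that the contract's bytecode reads the first 4 bytes of the calldata, compares them against the selectors it knows (here b7760c8f), and jumps to the matching function. The function then reads its parameters from the bytes after the selector, so the first parameter starts at byte 4 of the calldata.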
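As a loose analogy only, here is that dispatch step sketched in Python. Real contracts do this in bytecode with comparisons and jumps; the `HANDLERS` table and the function names here are hypothetical, and b7760c8f is the selector used in this article.

```python
def transfer(amount: int, to: str) -> str:
    """Stand-in for the contract function; it just reports what it received."""
    return f"transfer {amount} to {to}"


# Hypothetical dispatch table: selector bytes -> handler.
HANDLERS = {bytes.fromhex("b7760c8f"): transfer}


def dispatch(calldata: bytes) -> str:
    """Route on the first 4 bytes, then read the arguments from byte 4 onward."""
    selector, args = calldata[:4], calldata[4:]
    handler = HANDLERS.get(selector)
    if handler is None:
        raise ValueError("unknown selector")
    amount = int.from_bytes(args[0:32], "big")
    to = "0x" + args[32:64][-20:].hex()
    return handler(amount, to)


calldata = (
    bytes.fromhex("b7760c8f")
    + (1300655506).to_bytes(32, "big")
    + bytes.fromhex("68b3465833fb72a70ecdf485e0e4c7bd8665fc45").rjust(32, b"\x00")
)
print(dispatch(calldata))
```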

Dynamic Variables

Dynamic variables are non-fixed-size types, including bytes, string, and dynamic arrays <T>[]. Fixed-size arrays <T>[N] and tuples are also dynamic when the type they contain is dynamic (for example string[2]).

The encoding of a dynamic type always starts with an offset: a 32-byte word giving the position, in bytes, where the dynamic data begins, counted from the start of the encoded parameters. For example, a hex offset of 20 means 32 bytes. At that position sits another 32-byte word holding the length of the type, followed by its contents.

TL;DR: the offset points to the length, and the elements follow the length.

For arrays, this length represents the number of elements contained in the array. For bytes and string types, it represents the length of the data in bytes. For example, the string "Hello World!" is 12 bytes long, with each character being 1 byte. Keep in mind that bytes and string data is left-aligned and padded with zeros on the right, whereas numbers and addresses are right-aligned and padded on the left.

For example, here's the string "Hello World!" encoded:

0000000000000000000000000000000000000000000000000000000000000020
000000000000000000000000000000000000000000000000000000000000000c
48656c6c6f20576f726c64210000000000000000000000000000000000000000

Observe how the first 32 bytes hold the hexadecimal offset 20, which is 32 in decimal. So we skip 32 bytes from the start of the data, bringing us to the next line with a hex of 0c, or 12 in decimal, the length of our string in bytes. When we convert the next 12 bytes, 48656c6c6f20576f726c6421, to a string, we get our original value back; the trailing zeros are just padding.
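The same walk (offset, then length, then contents) can be written as a short Python round trip. This sketch assumes a single `string` parameter, so the offset is always 32; the function names are my own.

```python
def encode_string(text: str) -> bytes:
    """Encode one string parameter: offset word, length word, right-padded data."""
    raw = text.encode("utf-8")
    padding = (-len(raw)) % 32       # zeros needed to fill the last 32-byte word
    offset = (32).to_bytes(32, "big")  # data starts right after the offset word
    length = len(raw).to_bytes(32, "big")
    return offset + length + raw + b"\x00" * padding


def decode_string(data: bytes) -> str:
    """Follow the offset to the length word, then read that many bytes."""
    offset = int.from_bytes(data[0:32], "big")
    length = int.from_bytes(data[offset:offset + 32], "big")
    start = offset + 32
    return data[start:start + length].decode("utf-8")


encoded = encode_string("Hello World!")
for i in range(0, len(encoded), 32):
    print(encoded[i:i + 32].hex())
print(decode_string(encoded))
```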

Congrats! Now you know how to read dynamic types.

Decoding Static And Dynamic Parameters

Let's say we're interacting with the following contract:

pragma solidity 0.8.17;
contract Example {
    function transfer(uint256[] memory ids, address to) external {}
}

With the following parameters for transfer:

ids: [1234, 4567, 8910]
to: 0xf8e81D47203A594245E36C48e151709F0C19fBe8

We would generate the calldata: 0x8229ffb60000000000000000000000000000000000000000000000000000000000000040000000000000000000000000f8e81d47203a594245e36c48e151709f0c19fbe8000000000000000000000000000000000000000000000000000000000000000300000000000000000000000000000000000000000000000000000000000004d200000000000000000000000000000000000000000000000000000000000011d700000000000000000000000000000000000000000000000000000000000022ce

We can chop this up into a more readable form:

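0x // prefix we discard
8229ffb6 // fn selector we're calling (`transfer(uint256[],address)`)
0000000000000000000000000000000000000000000000000000000000000040 // `uint256[] ids` param array offset (64 bytes below, counted from the start of this line)
000000000000000000000000f8e81d47203a594245e36c48e151709f0c19fbe8 // `address to` param
0000000000000000000000000000000000000000000000000000000000000003 // length of `ids` array; 3 inputs
00000000000000000000000000000000000000000000000000000000000004d2 // 1st `ids` param
00000000000000000000000000000000000000000000000000000000000011d7 // 2nd `ids` param
00000000000000000000000000000000000000000000000000000000000022ce // 3rd `ids` param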

Notice how the array parameter is represented by an offset to where the array begins. We then read the second param, the address, before finishing with the array itself: its length followed by its elements.
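Here is the same reading order as a Python sketch, assuming the layout `(uint256[] ids, address to)` behind a 4-byte selector. The helper names are invented for this example, and the calldata is rebuilt from the values above.

```python
def word(data: bytes, position: int) -> int:
    """Read the 32-byte big-endian word that starts at byte `position`."""
    return int.from_bytes(data[position:position + 32], "big")


def decode_transfer(calldata: bytes) -> tuple[str, list[int], str]:
    """Decode selector + (uint256[] ids, address to)."""
    selector, args = calldata[:4], calldata[4:]
    ids_offset = word(args, 0)                 # where the array begins, in bytes
    to = "0x" + args[32:64][-20:].hex()        # static param sits right in the head
    length = word(args, ids_offset)            # first word at the offset: element count
    first = ids_offset + 32
    ids = [word(args, first + 32 * i) for i in range(length)]
    return selector.hex(), ids, to


calldata = (
    bytes.fromhex("8229ffb6")
    + (0x40).to_bytes(32, "big")
    + bytes.fromhex("f8e81d47203a594245e36c48e151709f0c19fbe8").rjust(32, b"\x00")
    + (3).to_bytes(32, "big")
    + b"".join(n.to_bytes(32, "big") for n in (1234, 4567, 8910))
)
print(decode_transfer(calldata))
```

Offsets are counted from the start of the arguments, which is why the selector is sliced off before anything else is read.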

Now that we know how to read both static + dynamic parameters, let’s dissect a more complex example!

Decoding A Multicall’s Calldata

We’re going to decode a UniswapV3 multicall’s input calldata from this transaction. Here the user calls 3 different functions through the multicall function.

Etherscan is nice enough to give us a simple decoded version:

MethodID: 0xac9650d8

We will modify this a bit and expand upon this line-by-line to make it even more readable. Keep in mind, each value is in hexadecimal format and 20 hex == 32-bytes for quick reference.

MethodID: 0xac9650d8
// offset of array_1 (starting next line)
// length of array_1 (how many elements in array)
// offset of 1st element in array_1, array_1A (96-bytes / 32 = 3)
// offset of 2nd element in array_1, array_1B (288-bytes / 32 = 9)
// offset of 3rd element in array_1, array_1C (704-bytes / 32 = 22)

// length 1st element of array_1, array_1A (132-bytes (inc. selector))

// here we'll read the next 132-bytes
// fn selector; 4 of 132
// 1st param; 36 of 132
// 2nd param; 68 of 132
// 3rd param; 100 of 132
// 4th param; 132 of 132
// this marks the end of array_1A

// 32-bytes of `0`: 28 bytes of padding (132 is not a multiple of 32) plus the first 4 bytes of the next length word
// length 2nd element of array_1, array_1B (356-bytes (inc. selector))
// we have 4-bytes missing due to the embedded fn selector, 13ead562
// the next fn selector, 88316456, will be inserted here

// here we'll read the next 356-bytes
// fn selector; 4 of 356
// 1st param; 36 of 356
// 2nd param; 68 of 356
// 3rd param; 100 of 356
// 4th param; 132 of 356
// notice how all the leading `0`s are `f`s. this indicates a negative `int` value (two's complement)!
// 5th param; 164 of 356
// we have 32-bytes of `0`, but since we're still reading the bytes
// we know this is a parameter, representing 0 of a type
// 6th param; 196 of 356
// 7th param; 228 of 356
// 8th param; 260 of 356
// 9th param; 292 of 356
// 10th param; 324 of 356
// 11th param; 356 of 356
// this marks the end of array_1B

// 32-bytes of `0` again: padding (356 is not a multiple of 32) plus the start of the next length word
// this is the same thing as before, the length!
// we can see there's only 32-bytes left so we can conclude
// that it's going to be a fn with no inputs

// a call to the fn selector 12210e8a; 4 of 4

Now you’re able to read raw embedded dynamic types!
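To check the offsets in the walkthrough (96, 288 and 704 bytes), here is a Python sketch that lays out a `bytes[]` whose elements are 132, 356 and 4 bytes long. The payloads are placeholders: only the selectors named in the comments above are real, the rest is zero filler, and the function names are my own.

```python
def pad32(raw: bytes) -> bytes:
    """Right-pad with zeros up to the next 32-byte boundary."""
    return raw + b"\x00" * ((-len(raw)) % 32)


def encode_bytes_array(items: list[bytes]) -> bytes:
    """Lay out a bytes[]: count, one offset per element, then length + padded data each."""
    tails = [len(item).to_bytes(32, "big") + pad32(item) for item in items]
    offsets = []
    position = 32 * len(items)  # offsets are counted from the word after the count
    for tail in tails:
        offsets.append(position)
        position += len(tail)
    head = len(items).to_bytes(32, "big") + b"".join(o.to_bytes(32, "big") for o in offsets)
    return head + b"".join(tails)


def decode_bytes_array(body: bytes) -> list[bytes]:
    """Follow each offset to its length word and slice out the element."""
    count = int.from_bytes(body[0:32], "big")
    base = 32
    items = []
    for i in range(count):
        offset = int.from_bytes(body[base + 32 * i:base + 32 * i + 32], "big")
        start = base + offset
        length = int.from_bytes(body[start:start + 32], "big")
        items.append(body[start + 32:start + 32 + length])
    return items


# Placeholder calls with the same sizes as in the walkthrough.
calls = [
    bytes.fromhex("13ead562") + bytes(128),  # 132 bytes
    bytes.fromhex("88316456") + bytes(352),  # 356 bytes
    bytes.fromhex("12210e8a"),               # 4 bytes, selector only
]
body = encode_bytes_array(calls)
offsets = [int.from_bytes(body[32 * i:32 * i + 32], "big") for i in (1, 2, 3)]
print(offsets)
```

Each element takes one length word plus its data padded to a multiple of 32 bytes, which is where the gaps of 192 and 416 bytes between the offsets come from.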


I hope this information has helped you understand how calldata is encoded, decoded, and read. It took me some time to research and experiment with it all in order to learn, but it was worth it. The next step from here is to learn how to read bytecode in order to understand the EVM at its lowest level (Then everything becomes open-source >:D).

I appreciate you for taking the time to read this article. I hope you found value in this, anon!
