A Low-Level Guide To Solidity's Storage Management

Learn how the EVM's storage system works by interacting with it through smart contracts using Solidity's inline assembly (Yul), taking you a step closer to bridging the gap between high- and low-level programming! We'll walk through each situation you'll encounter and learn to deal with it using bitwise operations alongside SLOAD and SSTORE to control the EVM's storage at will.

What Is Storage?

Storage is a persistent mapping of 2^256 slots (numbered 0 to 2^256 - 1), each holding a 32 byte word, and every contract has its own. When you set a state variable’s value it’s written to the variable’s assigned slot, where it remains until it’s overwritten with another value.

When To Use Storage Vs Memory

You may have seen that it’s better to use memory over storage to optimise your smart contract. If you don’t know exactly why let me explain it to you, anon. Having a deep understanding of this is critical to writing optimised contracts at any language level, whether it’s solidity or assembly.

When we first load a storage slot in a transaction it’s cold, which is the expensive case at 2100 gas. Whenever we load that same slot again in the transaction it’s warm and costs 100 gas. That’s still nowhere near as cheap as a memory load, which is 3 gas at minimum but can go higher if memory expansion occurs!

Let’s see correct use of storage and memory assignments with a contract that is poorly written and one that optimises it.

Unoptimised contract:

contract C {
    struct S {
        uint256 a;
        uint256 b;
        address c;
    }

    S public s;

    function foo(uint256 input) external {
        // First `s.b` SLOAD: the slot is cold, so this warms it up.
        if (input < s.b) revert();
        // Second `s.b` SLOAD: warm now, but still a storage read.
        if (s.b > 50) revert();
    }
}

Notice how s.b is loaded twice from storage? We could optimise this by reading s.b once into a local variable and then using that variable for both checks.

But why?

Because we only use a storage load (SLOAD) once and cache the result, instead of SLOADing twice to perform the checks. Reading the cached copy costs about 3 gas (a value-type local like this actually sits on the stack, which is priced like an MLOAD), while even a warm SLOAD costs 100.

Knowing this, the optimised contract would look like:

function foo(uint256 input) external {
    // Single SLOAD: cache `s.b` in a local variable.
    uint256 b = s.b;
    // Both comparisons reuse the cached value, no further SLOADs!
    if (input < b) revert();
    if (b > 50) revert();
}

The trick to remember is that initially copying a storage variable into a local or memory variable calls SLOAD behind the scenes. So if you use a variable more than once, it’s better to SLOAD it once and reuse the cached copy.

With structs and arrays, copying to memory loads every child of the type: S memory memory_s = s performs an SLOAD for each slot of s. That only pays off if you’re using each child more than once; otherwise you should specifically choose which child to cache, like what we did with uint256 b = s.b.
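To make the difference concrete, here is a minimal sketch comparing a full struct copy with a targeted read. The contract name `CacheExample` and the functions `wasteful` and `targeted` are hypothetical, made up for illustration only:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract for illustration; not part of the article's examples.
contract CacheExample {
    struct S {
        uint256 a;
        uint256 b;
        address c;
    }

    S public s;

    constructor() {
        s = S({a: 1, b: 40, c: address(this)});
    }

    // Copies all three slots of `s` into memory (three SLOADs)
    // even though only `b` is ever used.
    function wasteful(uint256 input) external view returns (bool) {
        S memory memory_s = s;
        return input >= memory_s.b && memory_s.b <= 50;
    }

    // Reads only the slot holding `b` (one SLOAD) and reuses the cached value.
    function targeted(uint256 input) external view returns (bool) {
        uint256 b = s.b;
        return input >= b && b <= 50;
    }
}
```

Both functions return the same result; the second one should simply touch fewer storage slots to get there.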

Manually Assigning Storage

The most glorified devs in crypto all have low-level granular expertise when it comes to smart contracts. They are able to manipulate the state at the lowest level, with either inline assembly/yul or huff. You need to know this in order to become an actual master of smart contract development, contrary to what most courses teach when they cover the basics without diving into the roots of the language.

So, let’s take a look at a new smart contract using structs as the main variable to analyse. I want to teach with structs because they’ll be the most complex type you need to deal with the majority of the time, and with this knowledge the basic types will be easy to understand.

contract C {
    struct S {
        uint16 a;  // 2 bytes,  2 bytes total
        uint24 b;  // 3 bytes,  5 bytes total
        address c; // 20 bytes, 25 bytes total + end of slot 0x01
        address d; // 20 bytes, slot 0x02
    }

    // I've noted the storage slots each state is located at.
    // A single slot is 32 bytes :)
    uint256 boring;              // 0x00
    S s_struct;                  // 0x01, 0x02
    S[] s_array;                 // 0x03
    mapping(uint256 => S) s_map; // 0x04

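    // NOTE: solc only accepts the EIP-55 checksummed (mixed-case) spelling of
    // an address literal, so checksum the addresses below before compiling.
    constructor() {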
        s_struct = S({
            a: 10,
            b: 20,
            c: 0x047b37ef4d76c2366f795fb557e3c15e0607b7d8,
            d: 0x047b37ef4d76c2366f795fb557e3c15e0607b7d8
        });
    }
}

Basic Types

If we wanted to access uint256 boring it’s quite simple. We only need the slot it’s assigned, which in this case is 0x00 since it’s the first state variable declared.

assembly {
    let x := sload(0x00)
}
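Writing works the same way with SSTORE. The sketch below is a hypothetical standalone contract (the names `BasicSlot`, `set_boring` and `get_boring` are made up) that uses Solidity's `.slot` suffix so the compiler fills in the slot number instead of us hardcoding 0x00:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract for illustration.
contract BasicSlot {
    uint256 boring; // slot 0x00, the first state variable declared

    function set_boring(uint256 value) external {
        assembly {
            // `boring.slot` resolves to 0x00 at compile time.
            sstore(boring.slot, value)
        }
    }

    function get_boring() external view returns (uint256 x) {
        assembly {
            x := sload(boring.slot)
        }
    }
}
```

Using `.slot` keeps the assembly correct if state variables are later reordered, whereas a hardcoded slot number would silently point at the wrong variable.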

Bitpack Loading

Let’s step it up a notch with a bitpacked struct! Bitpacked means storing multiple variables in a single slot (32 bytes) by ordering the variables so that the combined byte size of neighbouring variables is 32 bytes or less. In this case we pack a total of 25 bytes into a single slot at 0x01 using:

  • uint16 a (2 bytes).
  • uint24 b (3 bytes).
  • address c (20 bytes).

s_struct’s slots would be:

// 0x01 00000000000000047b37ef4d76c2366f795fb557e3c15e0607b7d8000014000a
// 0x02 000000000000000000000000047b37ef4d76c2366f795fb557e3c15e0607b7d8

s_struct.d isn’t contained in the same slot because it would overflow it: 25 + 20 = 45 but the maximum size is 32. That’s why it’s given its own separate slot.

We can break up the slot values into a readable format:

//       unused bytes                     c                        b    a
// 0x01 00000000000000 047b37ef4d76c2366f795fb557e3c15e0607b7d8 000014 000a

//           unused bytes                             d
// 0x02 000000000000000000000000 047b37ef4d76c2366f795fb557e3c15e0607b7d8

Notice how a, b, c are in the same slot. The way we grab any of these values is by shifting the bits and using masking to grab a specific run of bits in a slot.

First, it’s good to know how masking is done. In the next example we use 0xFFFFFF which represents:

// 0000000000000000000000000000000000000000000000000000000000FFFFFF
// Where 0x0F hex == 15 decimal == 1111 bits.

When we state 0xFFFFFF we mean 24 set bits (1111 1111 1111 1111 1111 1111), i.e. 3 full bytes, which we use bitwise operations on. A single byte is made up of 8 bits represented with 1s and/or 0s. In a mask, the 1s mark the bits we want to keep and the 0s mark the bits we want to clear.
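As a rough sketch of shifting plus masking, the hypothetical pure function below (the names `MaskExample` and `unpack` are made up) pulls all three packed values out of a word laid out like slot 0x01 of s_struct:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract for illustration.
contract MaskExample {
    // Unpacks a word laid out as: 7 unused bytes | c (20) | b (3) | a (2).
    function unpack(bytes32 word)
        external
        pure
        returns (uint16 a, uint24 b, address c)
    {
        assembly {
            // `a` is the lowest 2 bytes: no shift, 16-bit mask.
            a := and(word, 0xffff)
            // `b` sits above `a`: shift right 16 bits, 24-bit mask.
            b := and(shr(16, word), 0xffffff)
            // `c` sits above `a` and `b`: shift right 40 bits, 160-bit mask.
            c := and(shr(40, word), 0xffffffffffffffffffffffffffffffffffffffff)
        }
    }
}
```

Feeding it the slot 0x01 value shown earlier should give a = 10 and b = 20, with c being the address stored in the struct.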

Let’s go through a practical example to solidify our new knowledge by grabbing s_struct.b:

function view_b() external view returns (uint24) {
    assembly {
        // before: 00000000000000 047b37ef4d76c2366f795fb557e3c15e0607b7d8 000014 000a
        //                                                                         ^
        // after:  0000 00000000000000 047b37ef4d76c2366f795fb557e3c15e0607b7d8 000014
        //          ^
        let v := shr(0x10, sload(0x01))

        // AND keeps a bit only where both the mask and the value have a 1. Otherwise, it's set to 0.
        // mask:   0000000000000000000000000000000000000000000000000000000000 FFFFFF
        // v:      000000000000000000047b37ef4d76c2366f795fb557e3c15e0607b7d8 000014
        // result: 0000000000000000000000000000000000000000000000000000000000 000014
        v := and(0xffffff, v)

        // Store in memory bc return uses memory.
        // We write to the scratch space at 0x00 so the free memory pointer at 0x40 stays intact.
        mstore(0x00, v)

        // Return reads left to right.
        // Our value is right-aligned in the word, so we return all 32 bytes starting at 0x00.
        return(0x00, 0x20)
    }
}

Bitpack Setting

But what if we wanted to change the value of s_struct.b?

We have a few extra steps to add to viewing it but we can do it like this (keep in mind I haven’t optimised it at all):

//          unused bytes                     c                        b    a
// before: 00000000000000 047b37ef4d76c2366f795fb557e3c15e0607b7d8 000014 000a
//          unused bytes                     c                        b    a
// after:  00000000000000 047b37ef4d76c2366f795fb557e3c15e0607b7d8 0001F4 000a
function set_b(uint24 b) external {
    assembly {
        // Removing the `uint16` from the right.
        // before: 00000000000000 047b37ef4d76c2366f795fb557e3c15e0607b7d8 000014 000a
        //                                                                         ^
        // after:  0000 00000000000000 047b37ef4d76c2366f795fb557e3c15e0607b7d8 000014
        //          ^
        let new_v := shr(0x10, sload(0x01))

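        // Mask out everything but the current `b`.
        new_v := and(0xffffff, new_v)

        // XOR the current `b` with the new `b`: the 1s mark the bits that differ.
        new_v := xor(b, new_v)

        // Shift the difference back up to `b`'s position, above `a`.
        new_v := shl(0x10, new_v)

        // Flip exactly the differing bits: replaces original 32 bytes' `000014` with `0001F4`.
        new_v := xor(new_v, sload(0x01))

        // Store our new value back in `s_struct`'s first slot (0x01, not 0x00 where `boring` lives).
        sstore(0x01, new_v)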
    }
}

Now when we run set_b(500) it will change s_struct.b to 500!
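The XOR approach is not the only way to do this. A common alternative, sketched below as a hypothetical pure helper (the names `PackExample` and `with_b` are made up), is to clear the old field with an inverted mask and then OR the new value in:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract for illustration.
contract PackExample {
    // Returns `word` with the 3-byte field at bit offset 16 replaced by `b`.
    function with_b(bytes32 word, uint24 b) external pure returns (bytes32 result) {
        assembly {
            // 1s over `b`'s 24 bits, shifted past the 16 bits of `a`.
            let mask := shl(16, 0xffffff)
            // Zero out the old `b`, leave every other bit untouched.
            let cleared := and(word, not(mask))
            // Drop the new `b` into the gap (masked again to be safe).
            result := or(cleared, shl(16, and(b, 0xffffff)))
        }
    }
}
```

To use this against real storage you would pass in `sload(0x01)` and `sstore` the result back to 0x01. It needs only a single SLOAD, at the cost of building the mask.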

Special thanks to vectorized.eth for helping me out with this!

Arrays

A fixed array’s length is known so it takes up a predetermined number of slots. However, a dynamic array’s length is unknown at compile time and new elements are assigned slots after deployment, so Solidity handles this with keccak256 hashing! The array’s own slot stores its length and the elements start at the hash of that slot.

For our example contract, we can access any element in s_array with the formula:

  • keccak256(array_slot) + (index * 2) + var_slot, where 2 is the number of slots each S takes up

Let’s say we want to access s_array[0].d:

// keccak256(array_slot) + var_slot
// keccak256(0x03) + 1
// Remember how `s_struct` takes up 2 slots?
// The `+ 1` indicates the second slot allocation in S
// For the bitpacked slot in S we don't need the add
// The next element's first slot would be `+ 2`
function get_element() external view returns(bytes32) {
    assembly {
        // Store array slot in scratch memory (0x00), leaving the free memory pointer at 0x40 alone.
        mstore(0x00, 0x03)
        // Keccak hashes straight from memory so we give it the memory location and length.
        let slot := add(keccak256(0x00, 0x20), 1)
        // Store the return value.
        mstore(0x00, sload(slot))
        // Return `d`.
        return(0x00, 0x20)
    }
}
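For elements past index 0 the same idea applies, you just skip over the earlier elements first. Here is a minimal sketch in plain Solidity, assuming the example layout (array at slot 0x03, two slots per S); the names `ArraySlot`, `ARRAY_SLOT`, `SLOTS_PER_ELEMENT` and `member_slot` are hypothetical:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical helper for illustration.
contract ArraySlot {
    uint256 constant ARRAY_SLOT = 0x03;     // slot of `s_array` in the example contract
    uint256 constant SLOTS_PER_ELEMENT = 2; // each `S` takes up two slots

    // Slot of `s_array[index]` plus `member_offset` (0 for a/b/c, 1 for d).
    function member_slot(uint256 index, uint256 member_offset)
        external
        pure
        returns (uint256)
    {
        // abi.encode pads the slot number to 32 bytes, same as the mstore above.
        uint256 base = uint256(keccak256(abi.encode(ARRAY_SLOT)));
        // Slot arithmetic wraps around in the EVM, so skip overflow checks.
        unchecked {
            return base + index * SLOTS_PER_ELEMENT + member_offset;
        }
    }
}
```

For example, `member_slot(0, 1)` is the slot read by get_element(), and `member_slot(1, 0)` would be the packed slot of the second element.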

Mappings

Since mappings are a dynamic type and we will never know the length of the mapping, Solidity uses a hash of the key to locate the corresponding value in order to avoid state collisions, similar to arrays.

There are 2 formulas we need to know for mappings:

  • keccak256(mapping_key . mapping_slot)
  • keccak256(mapping_key . mapping_slot) + i, for when the struct uses multiple slots we just add the offset i of the slot we need.

The . means concatenating the left and right values together, each as a 32 byte word here, into a single 64 byte input.

Let’s say we want to access s_map[2].b from our example contract:

// keccak256(mapping_key . mapping_slot)
// keccak256(0x02 . 0x04)
function v() external view returns(bytes32) {
    assembly{
        // Store map key (the element we want).
        mstore(0, 0x02)
        // Store map slot location.
        mstore(0x20, 0x04)
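        // `b` lives in the struct's first slot, so no offset is added.
        let slot := keccak256(0, 0x40)
        // The slot is bitpacked, so shift out `a` and mask to leave only `b` for return.
        mstore(0, and(0xffffff, shr(0x10, sload(slot))))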
        return(0, 0x20)
    }
}

What about accessing s_map[4].d?

// keccak256(mapping_key . mapping_slot) + i
// keccak256(0x04 . 0x04) + 1
function v_d() external view returns(bytes32) {
    assembly{
        // Store map key (the element we want).
        mstore(0, 0x04)
        // Store map slot location.
        mstore(0x20, 0x04)
        // We want `d` which is 1 slot more.
        let slot := add(keccak256(0, 0x40), 0x01)
        // Store our value for return.
        mstore(0, sload(slot))
        return(0, 0x20)
    }
}
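If you want to sanity check a slot without writing assembly, the same hash can be computed in plain Solidity. This is a sketch assuming the example layout (mapping at slot 0x04, uint256 keys); the names `MapSlot`, `MAP_SLOT` and `member_slot` are hypothetical:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical helper for illustration.
contract MapSlot {
    uint256 constant MAP_SLOT = 0x04; // slot of `s_map` in the example contract

    // Slot of `s_map[key]` plus `member_offset` (0 for a/b/c, 1 for d).
    function member_slot(uint256 key, uint256 member_offset)
        external
        pure
        returns (uint256)
    {
        // abi.encode lays out key then slot as two 32 byte words,
        // matching the two mstores in the assembly version.
        unchecked {
            return uint256(keccak256(abi.encode(key, MAP_SLOT))) + member_offset;
        }
    }
}
```

Comparing this helper's output against the slot computed in assembly is a cheap way to catch a swapped key and slot, which is an easy mistake to make.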

Special thanks to 0xKitetsu for breaking this down for me!

Strings & Bytes

string and bytes have identical encoding types that are very annoying to deal with:

  1. When the length of the string is 31 bytes or less it’s stored in a single slot starting from the left side and the length * 2 is stored in the final byte on the right.
// For example, `string below31 = "reeeee";`:
//    string                       unused bytes                    length of string * 2
// 726565656565 00000000000000000000000000000000000000000000000000 0c
  2. For anything larger than 31 bytes the storage process is similar to an array: the slot of the string stores length * 2 + 1 and the data is stored at keccak256(slot) + i, where i counts the 32 byte chunks of the string.

This way you can tell which type of string it is by checking whether the lowest bit (the last bit of the far right byte) is set.

You honestly won’t encounter the long string type frequently, probably once in a blue moon, but if you do, you can check whether a string is the short version by testing if and(0x1, <value>) equals zero.
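Putting the two encodings together, here is a rough sketch of reading a string's length straight from its slot. The contract `StringLength`, its `stored` variable and the `length` function are hypothetical names used only for this example:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical contract for illustration.
contract StringLength {
    string stored = "reeeee"; // slot 0x00

    function length() external view returns (uint256 len) {
        assembly {
            let v := sload(stored.slot)
            // The lowest bit tells us which encoding is in use.
            switch and(v, 1)
            case 0 {
                // Short string: the far right byte holds length * 2.
                len := shr(1, and(v, 0xff))
            }
            default {
                // Long string: the whole slot holds length * 2 + 1.
                len := shr(1, v)
            }
        }
    }
}
```

For the stored "reeeee" the far right byte is 0x0c, so the short branch should return 6.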

Final

This knowledge bridges the gap from being a normie solidity dev to becoming a low-level chad. Now that you understand how to interact with storage using assembly and how it functions you’ll be able to write in any low-level smart contract language! I’m proud of you for taking another step closer to writing raw bytecode >:D

I would like to thank everyone who’s answered my questions and helped me with learning this mountain of a challenge. Special thank you to the Huff Discord, especially 0xKitetsu, Franfran, Godel, vectorized, jtriley, devtooligan, merkleplant and Sabnock for dealing with my infinite questions :,) (sorry if I forgot you, dm me!).

This article took a while to research and write in a very digestible way. For any future article suggestions, please dm me.

I appreciate you for taking the time to read this article and hope you found value in this, anon!
