Indexer

The MUD Indexer

The MUD Indexer is an off-chain indexer for on-chain applications built with MUD.

Why an off-chain indexer?

Reads with on-chain apps can be tricky. What does it mean to be able to query the Ethereum network? Technically, given a node with a fully synced state, we can explore just about everything using the EVM, but the “exploring” would be looking at raw storage slots for accounts corresponding to smart contracts. A way around this exists by providing view functions on smart contracts: these effectively are just wrappers around raw storage and expose a more friendly API. Instead of having to figure out where the balances for an account are stored in the storage tree, we now can call a function that does the lookup via Solidity via an RPC endpoint.

The issue with view functions is that complexity in having to generate and call these to get the “full picture” of the state from the chain explodes pretty quickly. For a web app client wanting to present the user with an up-to-date view of the on-chain state, the task of calling the different view functions becomes unnecessarily hard very fast. This can also quickly accelerate the need to run a set of dedicated nodes just to be able to service the needed number of RPC requests.

Installation

These are the steps to install and start the MUD indexer. They are written under the assumption you are using anvil for your test chain, which is what the getting started package's pnpm dev does.

  1. Build the repository as explained here.

  2. Ensure you have jq (opens in a new tab) installed.

  3. Run a test world. The easiest way to do this is to follow these directions in a separate command line window.

  4. Start the indexer.

    cd packages/store-indexer
    pnpm start:sqlite:local

    Note: If you have already ran the MUD indexer, and then restarted the blockchain, make sure to delete the old database:

    rm anvil.db
  5. Run this command (in a separate command line window) as a sanity check to verify the indexer is working correctly:

    curl 'http://localhost:3001/trpc/findAll?batch=1&input=%7B%220%22%3A%7B%22json%22%3A%7B%22chainId%22%3A31337%2C%22address%22%3A%220x5FbDB2315678afecb367f032d93F642f64180aa3%22%7D%7D%7D' | jq

    The result should be nicely formatted (and long) JSON output.

Data access and interpretation

Using the indexer with TypeScript

You can see an example of an indexer client in the MUD repo (opens in a new tab). To execute it:

  1. Download and build the example.

    git clone https://@github.com/latticexyz/mud.git
    cd mud/examples/indexer-client
    pnpm install
  2. Run the example.

    pnpm read-indexer

If your indexer is indexing the minimal template, the expected output should be similar to:

Block number: 2042
Tables: schema,Hooks,StoreMetadata,NamespaceOwner,InstalledModules,ResourceAccess,Systems,FunctionSelector,SystemHooks,SystemRegistry,ResourceType,Counter
Information about Counter
{
  id: '0x5FbDB2315678afecb367f032d93F642f64180aa3____Counter',
  address: '0x5FbDB2315678afecb367f032d93F642f64180aa3',
  tableId: '0x00000000000000000000000000000000436f756e746572000000000000000000',
  namespace: '',
  name: 'Counter',
  keySchema: {},
  valueSchema: { value: 'uint32' },
  lastUpdatedBlockNumber: 2042n,
  records: [ { key: {}, value: [Object] } ]
}
The actual value: 2

Detailed explanation

import { createIndexerClient } from "@latticexyz/store-sync/trpc-indexer";
const indexer = createIndexerClient({
  url: "http://127.0.0.1:3001/trpc",
});

Create an indexer client. The URL, http://127.0.0.1/3001/trpc, is the one for the indexer you run on the local computer. If the indexer is elsewhere, modify the URL as appropriate.

const result = await indexer.findAll.query({
  chainId: 31337,
  address: "0x5FbDB2315678afecb367f032d93F642f64180aa3",
});

Connect to the indexer. The chainId value is for a sanity check, to ensure we are connecting to the correct indexer. The address parameter is the address of the World contract whose information we are reading. The same indexer can read from multiple World objects, as long as they are on the same blockchain.

console.log(`Block number: ${result.blockNumber}`);

The result has two fields:

  • blockNumber, the data is correct as of that block (it could change later).
  • tables, the tables of the World we are reading.
console.log(`Tables: ${result.tables.map((t) => t.name)}`);

Use map (opens in a new tab) to get the names of the tables.

const counterTable = result.tables.filter((t) => t.name == "Counter")[0];

Use filter (opens in a new tab) to get the Counter table, assuming there is at least one.

console.log("Information about Counter");
console.log(counterTable);

Log the information we have about the table. It contains these fields:

FieldMeaning
idIinternal ID for uniqueness in the context of SQL
addressWorld address
tableIdbytes32 hex encoded table ID (a concat of bytes16(namespace) and bytes16(name))
namespaceTable's namespace
nameTable's name
keySchemaSchema of the table key
valueSchemaSchema of the table value
lastUpdatedBlockNumberBlock number for which the data is correct
recordsActual table data
console.log(`The actual value: ${counterTable.records[0].value.value}`);

The actual value in Counter, which is in the first (and only) record. Every record has two fields:

  • key
  • value

Both key and value can have multiple fields inside them, one for each field in the appropriate schema. In the case of Counter there is nothing in the key, but value has a single field, also called value. So the counter value is value.value of the record.

Using HTTP

URLs and parameters may vary

The parameters, especially the URL, may change as we adjust our tRPC configuration. If you can use the TypeScript methods, those are more stable.

The indexer is built on top of tRPC (opens in a new tab), which is primarily designed to be used between TypeScript (opens in a new tab) programs. As a result, the data format is more complicated than REST APIs.

For queries, the indexer expects to get two parameters in a query string (opens in a new tab):

  • batch, the batch identifier. This parameter can contain any string, as long as it is present.

  • input, the actual query. This query is a URLEncoded (opens in a new tab) JSON (opens in a new tab) object. To play with possible object values, you can use this online calculator (opens in a new tab).

    For example, the input object used in the sanity check above is:

    {
      "0": {
        "json": {
          "chainId": 31337,
          "address": "0x5FbDB2315678afecb367f032d93F642f64180aa3"
        }
      }
    }

    An HTTP request can include multiple queries, so the query object includes a numbered key (or numbered keys) with the actual queries. Also, in the future other formats than JSON might be supported, so each query specifies what format it uses.

The one type of query currently supported is findAll. It gives you all the information stored on the specific world in the indexer. The query is available at path /findAll and have two parameters:

  • chainId: The chain ID of the blockchain to query.
  • address: The address of the World. Based on the paramters in the development environment, pnpm dev uses the address in the sanity check.

To understand the result, it is easiest to look at it inside Node.

  1. Read the result and enter node:

    curl 'http://localhost:3001/findAll?batch=wd&input=%7B%220%22%3A%7B%22json%22%3A%7B%22chainId%22%3A31337%2C%22address%22%3A%220x5FbDB2315678afecb367f032d93F642f64180aa3%22%7D%7D%7D' > res.json
    node
  2. Read the result into a JavaScript object.

    res = JSON.parse(fs.readFileSync("res.json"));
  3. res is actually a list of results, which only has one item because we only had one key in the query. This item has one key, record, whose value also has only one key, data. Run these commands to verify the statements above and then cut res to the actual result.

    res.length;
    Object.keys(res[0]);
    Object.keys(res[0].result);
    res = res[0].result.data;
  4. Inside the result there are two fields:

    • meta, the metadata (opens in a new tab). In the case of JSON the field names are part of the data, so this primarily contains data types.
    • json, the data read from the World. This field itself is divided into two values:
      • blockNumber, the block number in which these query results apply.
      • tables, a list of the tables in the World and their data.

    You can use the filter function (opens in a new tab) to get a specific table. In this case, we are getting the Counter table.

    counterTable = res.json.tables.filter((t) => t.name === "Counter")[0];
  5. Finally, to get the actual record with the counter value, use:

    counterTable.records[0].value.value;

Debugging

See here.

Indexing other blockchains

To index a different blockchain, you specify these environment variables:

VariableValue
RPC_HTTP_URLThe URL to access the blockchain using HTTP
RPC_WS_URLThe URL to access the blockchain using WebSocket
START_BLOCK1The block to start indexing from. The block in which the World contract was deployed is a good choice.
SQLITE_FILENAME<blockchain name>.db

(1) This value is optional, but highly recommended in chains where the world was deployed after a large number of blocks had already passed.

After you do that, start the indexer with this command:

pnpm start:sqlite

Note that in your queries you need to specify the correct chainId for the chain you are indexing.

Using the indexer with PostgreSQL

To run the indexer with PostgreSQL (opens in a new tab), follow these steps:

  1. Download and install PostgreSQL (opens in a new tab).

  2. To use the default setting (local PostgreSQL, using the postgres database, indexing the local anvil chain), run:

    pnpm start:postgres:local

    Otherwise, set these environment variables:

    VariableValue
    CHAIN_IDChainID for the chain you index
    RPC_HTTP_URL1The URL to access the blockchain using HTTP
    RPC_WS_URL1The URL to access the blockchain using WebSocket
    START_BLOCK2The block to start indexing from. The block in which the World contract was deployed is a good choice.
    DATABASE_URLURL for the database, of the form postgres://<host>/<database>

    (1) Not necessary in the case of a Lattice testnet. (2) This value is optional, but highly recommended in chains where the world was deployed after a large number of blocks had already passed for performance reasons.

    And then run:

    pnpm start:postgres

Reading information from PostgreSQL

You can now use psql (opens in a new tab) to view the indexer information.

  1. Get the list of schemas to verify __mud_internal is there.

    \dn
  2. Get the list of relations in the schema.

    \dt __mud_internal.*

    You should get two tables, chain and tables.

  3. Get the list of chains (there should be just one).

    select * from __mud_internal.chain;
  4. The tables table is a lot bigger, so first get the list of columns within the table.

    \d+ __mud_internal.tables
  5. Get the list of MUD table names.

    select name from __mud_internal.tables;
  6. Get the key and value schemas for Counter.

    select key_schema, value_schema from __mud_internal.tables where name='Counter';
  7. Get the list of tables for a specific world (if your world address is different, use that instead of 0x5FbDB2315678afecb367f032d93F642f64180aa3).

    \dt "0x5FbDB2315678afecb367f032d93F642f64180aa3__".*
  8. Specify that the default schema is the world being indexed.

    set search_path to "0x5FbDB2315678afecb367f032d93F642f64180aa3__";
  9. Get the value of the counter.

    select * from "Counter";