Mempool Dumpster is a free mempool transaction archive, brought to you by Flashbots in collaboration with various transaction providers (Infura, bloXroute, Chainbound, Eden, as well as local geth nodes):
- Data: mempool-dumpster.flashbots.net
- The data is published under the CC-0 public domain license (allowing any use whatsoever)
- Source code: GitHub - flashbots/mempool-dumpster: Dump all the mempool transactions 🗑️ ♻️ (in Parquet + CSV)
The core output is a daily Parquet and CSV file, which contains:
- Transaction metadata
- Information about which sources it was received from
- Raw transaction (RLP-encoded, only in the Parquet file)
You can find the files on the website for a particular month, like this one: 2023-09.
- CSV example: Mempool Dumpster - Daily CSV Example · GitHub
- Daily summary file, like 2023-09-14_summary.txt
- For fast query execution, we recommend working with the Parquet output files (more details below).
ClickHouse
We strongly recommend ClickHouse to work with the datasets - it’s versatile, performant, and optimized for large datasets.
- You can use ClickHouse locally or also ClickHouse Cloud, where you can upload the CSV files and have a nice user interface.
- ClickHouse is open-source, and you can install it based on the installation instructions, or using Homebrew:
brew install clickhouse
See also:
- ClickHouse SQL Statements
clickhouse-local
- DuckDB is pretty good too
Download data
Grab a dataset for a particular day from here:
wget https://mempool-dumpster.flashbots.net/ethereum/mainnet/2023-09/2023-09-14.parquet
Explore the dataset
# Show the schema
$ clickhouse local -q "DESCRIBE TABLE '2023-09-14.parquet';"
# Get exclusive transactions from bloxroute
$ clickhouse local -q "SELECT COUNT(*) FROM '2023-09-14.parquet' WHERE length(sources) == 1 AND sources[1] == 'bloxroute';"
# Count exclusive transactions from all sources (query takes <1 sec)
$ clickhouse local -q "SELECT sources[1], COUNT(*) FROM '2023-09-14.parquet' WHERE length(sources) == 1 GROUP BY sources[1];"
eden 22460
local 309
infura 133
apool 86
bloxroute 29871
chainbound 20008
# Get details for a particular hash
$ clickhouse local -q "SELECT timestamp,hash,from,to,hex(rawTx) FROM '2023-09-14.parquet' WHERE hash='0x09ec56414462c3980772999e0d27fa0a79dcb667a156e1a1532ed0f5eaa672f3';"
These are the available fields:
$ clickhouse local -q "DESCRIBE TABLE '2023-09-14.parquet';"
timestamp Nullable(DateTime64(3))
hash Nullable(String)
chainId Nullable(String)
from Nullable(String)
to Nullable(String)
value Nullable(String)
nonce Nullable(String)
gas Nullable(String)
gasPrice Nullable(String)
gasTipCap Nullable(String)
gasFeeCap Nullable(String)
dataSize Nullable(Int64)
data4Bytes Nullable(String)
sources Array(Nullable(String))
rawTx Nullable(String)
Have fun, and please share your results, insights and feature requests!