Rust CLI / Object Storage
s3-turbo-list is for the moment when a bucket is too large, too slow, or too important to inspect with a single sequential listing. It turns object metadata into an auditable Parquet inventory and keeps enough trace data to explain the run later.
$ s3-turbo-list --dry-run --agent \
--output-dir out --delimiter '' \
list --region us-east-2 --bucket my-bucket
plan.segments 64
plan.output out/*.parquet
trace.enabled true
checkpoint.resume ready
$ s3-turbo-list manifest-summary run.json --check
artifacts ok
parquet.rows verified
exit_code 0
Positioning
Use it when you need a complete bucket inventory, a migration baseline, a before/after diff, or evidence that a provider is listing objects slowly or unevenly.
It is not a backup system or storage browser. It focuses on listing, tracing, export, and comparison so the inventory can be analyzed elsewhere.
Each run produces a manifest, checkpoint state, trace JSONL for S3 calls, and analysis-ready output such as Parquet for DuckDB or pandas.
Capabilities
Probe real CommonPrefixes at startup, cache boundaries, and begin parallel recursive listing without hand tuning.
Record every S3 API call as structured JSONL so provider behavior is observable after the run.
Resume interrupted scans from saved segment progress with identity checks to avoid mismatched runs.
Write streaming Parquet by default, or use NDJSON, TSV, summary-only, and dry-run modes for shell and CI workflows.
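The resume-with-identity-check idea above can be illustrated with a small sketch. The tool's actual checkpoint format is not described here, so everything below is an assumption made for illustration: the parameter fields (`bucket`, `prefix`, `delimiter`), the record layout, and the helper names are all hypothetical. The sketch only shows the general technique of fingerprinting the run parameters and refusing to resume when the fingerprint differs.

```python
import hashlib
import json


def run_identity(params: dict) -> str:
    """Fingerprint the parameters that define a run (hypothetical field choice)."""
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def make_checkpoint(params: dict, done_segments: list) -> dict:
    """Build a checkpoint record: run identity plus the finished segment ids."""
    return {"identity": run_identity(params), "done": sorted(done_segments)}


def segments_to_resume(checkpoint: dict, params: dict, all_segments: list) -> list:
    """Return the segments still to list, or raise if the checkpoint is from another run."""
    if checkpoint["identity"] != run_identity(params):
        raise ValueError("checkpoint does not match this run's parameters")
    done = set(checkpoint["done"])
    return [s for s in all_segments if s not in done]
```

With a checkpoint that records segments 0 and 2 as done out of four, `segments_to_resume` would return segments 1 and 3; changing any run parameter, such as the bucket name, makes it raise instead of silently mixing two runs.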
Why it exists
Standard listing tools are often bound to a single ListObjectsV2 pagination chain. That is fine for small buckets, but painful when object counts move into millions or hundreds of millions.
s3-turbo-list slices the key space, runs listing work in parallel, and keeps the run auditable through manifests, traces, checkpoints, and Parquet artifacts.
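As a rough illustration of slicing the key space, the sketch below simulates a bucket as a sorted in-memory list of keys and lists it segment by segment in parallel. It is not the tool's implementation: the boundary values, the page size, and the function names are assumptions, and no real S3 calls are made. The point is only that disjoint key ranges can each run their own pagination chain and still add up to the complete listing.

```python
from bisect import bisect_left
from concurrent.futures import ThreadPoolExecutor

PAGE_SIZE = 3  # deliberately tiny so each segment needs several pages


def list_segment(bucket, start, end):
    """Walk one segment [start, end) page by page, like a pagination chain.

    `bucket` is a sorted list of keys; end=None means "to the end of the bucket".
    """
    keys = []
    pos = bisect_left(bucket, start)
    stop = len(bucket) if end is None else bisect_left(bucket, end)
    while pos < stop:
        page = bucket[pos:min(pos + PAGE_SIZE, stop)]
        keys.extend(page)
        pos += len(page)
    return keys


def list_parallel(bucket, boundaries, workers=4):
    """List every segment concurrently and concatenate the results in key order."""
    edges = [""] + sorted(boundaries) + [None]
    segments = list(zip(edges[:-1], edges[1:]))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves segment order, so the output stays sorted
        parts = pool.map(lambda seg: list_segment(bucket, *seg), segments)
        return [key for part in parts for key in part]
```

Because the segments are half-open ranges that share boundaries, every key falls into exactly one segment, which is what makes the parallel result comparable to a single sequential listing.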
Workflow
Run doctor or dry-run to confirm output paths, options, warnings, and whether the planned job makes sense before touching the bucket.
Probe prefix structure and key-space boundaries so the scan can start with parallel work instead of one long pagination chain.
List buckets recursively with bounded memory, splitting skewed segments at runtime, while writing checkpoint state, a manifest, and trace output.
Query Parquet with DuckDB or pandas, compare inventories, or inspect trace JSONL when provider behavior needs explanation.
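The comparison step can be sketched in a few lines. In practice the rows would come from the Parquet output via DuckDB or pandas; to stay dependency-free, this sketch takes plain lists of dicts instead. The column names `key`, `size`, and `etag` are assumptions about what an inventory row contains, not the tool's documented schema.

```python
def diff_inventories(before, after):
    """Compare two inventories given as lists of row dicts.

    Rows are assumed (hypothetically) to carry "key", "size", and "etag".
    """
    old = {row["key"]: (row["size"], row["etag"]) for row in before}
    new = {row["key"]: (row["size"], row["etag"]) for row in after}
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }
```

A key counts as changed when it exists in both inventories but its size or ETag differs, which is usually enough for a migration baseline check without reading object bodies.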
Compatibility
Endpoint presets, compat-probe, trace output, and validation reports make it easier to understand how each provider behaves before scaling up a scan.
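One way to read provider behavior out of trace output is to aggregate call latency per prefix. The sketch below assumes each JSONL record has a `prefix` and a `duration_ms` field; those field names are hypothetical, since the trace schema is not specified here, and would need to be adapted to the real records.

```python
import json
from collections import defaultdict
from statistics import median


def summarize_trace(lines):
    """Group trace records by prefix and report call count and median latency."""
    durations = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the JSONL stream
        record = json.loads(line)
        durations[record["prefix"]].append(record["duration_ms"])
    return {
        prefix: {"calls": len(values), "median_ms": median(values)}
        for prefix, values in durations.items()
    }
```

A prefix whose median latency sits far above the others is a reasonable first place to look when a provider appears to list unevenly.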