Rust CLI / Object Storage

List huge S3-compatible buckets without waiting on one slow chain.

s3-turbo-list is for the moment when a bucket is too large, too slow, or too important to inspect with a single sequential listing. It turns object metadata into an auditable Parquet inventory and keeps enough trace data to explain the run later.

Rust · MIT License · S3-compatible · Parquet · Trace JSONL

$ s3-turbo-list --dry-run --agent \
  --output-dir out --delimiter '' \
  list --region us-east-2 --bucket my-bucket

plan.segments        64
plan.output          out/*.parquet
trace.enabled        true
checkpoint.resume    ready

$ s3-turbo-list manifest-summary run.json --check
artifacts            ok
parquet.rows         verified
exit_code            0

Automatic key-space discovery · Runtime segment splitting · Parquet / NDJSON / TSV / summary modes · Dry-run and agent-friendly plans

Positioning

Use it when

You need a complete bucket inventory, a migration baseline, a before/after diff, or evidence that a provider is listing objects slowly or unevenly.

It is not

A backup system or storage browser. It focuses on listing, tracing, export, and comparison so the inventory can be analyzed elsewhere.

What you keep

A manifest for the run, checkpoint state, trace JSONL for S3 calls, and analysis-ready output such as Parquet for DuckDB or pandas.

Capabilities

Auto-discovered segments

Probe real CommonPrefixes at startup, cache boundaries, and begin parallel recursive listing without hand tuning.
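
To make the idea concrete, here is a minimal sketch of how a set of discovered prefixes could be turned into segments that cover the whole key space. The `Segment` type and `segments_from_prefixes` function are hypothetical names chosen for illustration, not the tool's actual internals, and the half-open range convention is an assumption.

```rust
/// A half-open slice of the key space: `start <= key < end`.
/// `None` means "unbounded" on that side. Hypothetical type, for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub struct Segment {
    pub start: Option<String>,
    pub end: Option<String>,
}

impl Segment {
    /// True when `key` falls inside this segment.
    pub fn contains(&self, key: &str) -> bool {
        let after_start = self.start.as_deref().map_or(true, |s| key >= s);
        let before_end = self.end.as_deref().map_or(true, |e| key < e);
        after_start && before_end
    }
}

/// Turn discovered prefixes into segments that cover the whole key space
/// with no gaps and no overlaps. Prefixes are sorted and deduplicated first.
pub fn segments_from_prefixes(prefixes: &[&str]) -> Vec<Segment> {
    let mut bounds: Vec<String> = prefixes.iter().map(|p| p.to_string()).collect();
    bounds.retain(|b| !b.is_empty()); // an empty boundary would yield an empty segment
    bounds.sort();
    bounds.dedup();

    let mut segments = Vec::with_capacity(bounds.len() + 1);
    let mut start: Option<String> = None;
    for bound in bounds {
        segments.push(Segment { start: start.clone(), end: Some(bound.clone()) });
        start = Some(bound);
    }
    // The last segment is open-ended so keys after the final prefix are not lost.
    segments.push(Segment { start, end: None });
    segments
}
```

With N distinct boundaries this yields N + 1 segments, and every possible key falls into exactly one of them, which is the property a parallel lister needs before handing segments to workers.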

Trace JSONL

Record every S3 API call as structured JSONL so provider behavior is observable after the run.
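
As a rough illustration of what one trace record might look like, the sketch below formats a single API call as one JSON object per line. The field names (`op`, `prefix`, `status`, `elapsed_ms`, `keys`) and the `TraceEvent` type are assumptions made for this example; the tool's real trace schema may differ.

```rust
/// Escape a string for use inside a JSON string literal.
fn json_escape(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for c in input.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// One S3 API call as it might appear in a trace. Hypothetical schema.
pub struct TraceEvent<'a> {
    pub op: &'a str,
    pub prefix: &'a str,
    pub status: u16,
    pub elapsed_ms: u64,
    pub keys: u32,
}

/// Render the event as a single JSONL line (no trailing newline).
pub fn trace_line(e: &TraceEvent<'_>) -> String {
    format!(
        "{{\"op\":\"{}\",\"prefix\":\"{}\",\"status\":{},\"elapsed_ms\":{},\"keys\":{}}}",
        json_escape(e.op),
        json_escape(e.prefix),
        e.status,
        e.elapsed_ms,
        e.keys
    )
}
```

Because each record is a complete JSON object on its own line, a trace can be appended to during the run and filtered afterwards with line-oriented tools.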

Checkpoint / resume

Resume interrupted scans from saved segment progress with identity checks to avoid mismatched runs.
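
A minimal sketch of such an identity check follows, assuming a checkpoint stores the parameters that define a run alongside per-segment progress. `RunIdentity`, `Checkpoint`, and `resume_points` are hypothetical names, and the set of fields compared is an assumption rather than the tool's documented behavior.

```rust
/// The parameters that must match for a checkpoint to be reusable.
/// Hypothetical: the real tool may compare a different set of fields.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct RunIdentity {
    pub endpoint: String,
    pub bucket: String,
    pub prefix: String,
    pub delimiter: String,
}

/// Saved progress for an interrupted scan.
#[derive(Debug, Clone)]
pub struct Checkpoint {
    pub identity: RunIdentity,
    /// Last key fully written per segment; `None` means the segment has not started.
    pub last_key: Vec<Option<String>>,
}

/// Return the per-segment resume points, or refuse if the checkpoint
/// was written by a run with different parameters.
pub fn resume_points(
    saved: &Checkpoint,
    current: &RunIdentity,
) -> Result<Vec<Option<String>>, String> {
    if &saved.identity != current {
        return Err(format!(
            "checkpoint was written for bucket '{}' at '{}'; refusing to resume",
            saved.identity.bucket, saved.identity.endpoint
        ));
    }
    Ok(saved.last_key.clone())
}
```

Refusing on any mismatch is the conservative choice: resuming a scan of one bucket with progress saved from another would silently skip keys.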

Analysis-ready output

Write streaming Parquet by default, or use NDJSON, TSV, summary-only, and dry-run modes for shell and CI workflows.

Why it exists

Sequential listing works until the bucket is large enough to become an operation.

Standard listing tools are often bound to a single ListObjectsV2 pagination chain. That is fine for small buckets, but painful when object counts move into millions or hundreds of millions.
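
The sketch below shows why a single chain is inherently sequential: each request needs the continuation token returned by the previous one. It uses a sorted in-memory vector as a stand-in for a bucket, so `list_page` and `list_all` are illustrative mocks, not real S3 client calls.

```rust
/// One page of a mock listing: the keys plus the token for the next page.
pub struct Page {
    pub keys: Vec<String>,
    pub next_token: Option<usize>,
}

/// Stand-in for a single ListObjectsV2 request against a sorted in-memory "bucket".
/// The token here is just an index; real continuation tokens are opaque strings.
pub fn list_page(bucket: &[String], token: Option<usize>, max_keys: usize) -> Page {
    let max_keys = max_keys.max(1); // guard against a zero page size looping forever
    let start = token.unwrap_or(0).min(bucket.len());
    let end = (start + max_keys).min(bucket.len());
    Page {
        keys: bucket[start..end].to_vec(),
        next_token: if end < bucket.len() { Some(end) } else { None },
    }
}

/// Follow the chain to the end, returning all keys and the number of requests made.
pub fn list_all(bucket: &[String], max_keys: usize) -> (Vec<String>, usize) {
    let mut keys = Vec::new();
    let mut requests = 0;
    let mut token = None;
    loop {
        // The next request cannot be issued until this one has returned its token.
        let page = list_page(bucket, token, max_keys);
        requests += 1;
        keys.extend(page.keys);
        match page.next_token {
            Some(next) => token = Some(next),
            None => break,
        }
    }
    (keys, requests)
}
```

ListObjectsV2 returns at most 1,000 keys per response, so a bucket with 100 million objects needs at least 100,000 round trips on one chain, each waiting on the one before it.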

s3-turbo-list slices the key space, runs listing work in parallel, and keeps the run auditable through manifests, traces, checkpoints, and Parquet artifacts.

Workflow

01

Preflight

Run doctor or dry-run to confirm output paths, options, warnings, and whether the planned job makes sense before touching the bucket.

02

Discover

Probe prefix structure and key-space boundaries so the scan can start with parallel work instead of one long pagination chain.

03

Scan

List the bucket recursively with bounded memory, runtime splitting of skewed segments, checkpoint state, a manifest, and trace output.
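
One way to picture runtime splitting: when a segment turns out to hold far more keys than its neighbors, its range is cut at a pivot key so a second worker can take the upper half. The sketch below is an illustration under that assumption; `KeyRange` and `split_at` are hypothetical names, and how the tool actually picks a pivot is not described here.

```rust
/// A half-open key range: `start <= key < end`, with `None` meaning unbounded.
/// Hypothetical type, for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub struct KeyRange {
    pub start: Option<String>,
    pub end: Option<String>,
}

/// Split `range` into two adjacent ranges at `pivot`.
/// Returns `None` when the pivot is not strictly inside the range,
/// since that would produce an empty or invalid half.
pub fn split_at(range: &KeyRange, pivot: &str) -> Option<(KeyRange, KeyRange)> {
    let above_start = range.start.as_deref().map_or(true, |s| pivot > s);
    let below_end = range.end.as_deref().map_or(true, |e| pivot < e);
    if !(above_start && below_end) {
        return None;
    }
    Some((
        KeyRange { start: range.start.clone(), end: Some(pivot.to_string()) },
        KeyRange { start: Some(pivot.to_string()), end: range.end.clone() },
    ))
}
```

The two halves share the pivot as a boundary, so together they cover exactly the original range and no key is listed twice.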

04

Analyze

Query Parquet with DuckDB or pandas, compare inventories, or inspect trace JSONL when provider behavior needs explanation.

Compatibility

Built for real S3-compatible endpoints, not only the happy path.

AWS S3 · MinIO · Cloudflare R2 · BOS · OSS · B2

Endpoint presets, compat-probe, trace output, and validation reports make it easier to understand how each provider behaves before scaling up a scan.

Start with a dry-run, then keep the trace.
