s3-turbo-list · Rust CLI

List / diff at scale, no scanner to build

A CLI that discovers parallel work for large S3-compatible list / diff runs.


Rust · Apache-2.0 · v0.23.0 · S3-compatible · Parquet · Agent-safe JSON

v0.21.0 field benchmark

1M objects in 18.8s

2 vCPU host · default list QPS · OSS Beijing. Observed under constraint, not a throughput ceiling.

1,000,413 objects · Alibaba Cloud OSS · Beijing · v0.21.0

Elapsed time by concurrency

c=8 is the Parquet reference run. c=24 is a scheduling outlier, not a general limit.

2 vCPU · 3.4 GiB · same-region ECS → OSS · default list QPS · 2026-06-18

Automatic list segmentation · DiffFlag Parquet · Resume for list

What it does

Answers for large buckets

One CLI for the two questions that matter before a migration or investigation: what is here, and what changed?

LIST

Know what is in a bucket

Turn a large S3-compatible bucket into an analysis-ready object list without hand-tuning a scanner first.

DIFF

Know what changed between two buckets

Compare source and target object metadata, then work from one ordered difference dataset.

Not a sync engine. s3-turbo-list does not copy objects or replace a storage browser. It produces a reliable basis for the next decision.

Why it exists

The useful work starts after the scan

A large-bucket scan is rarely the end of the task. You need an object list for analysis, or a difference set to decide whether a migration is complete.

s3-turbo-list makes those answers its default output. It keeps the command surface small and exposes planning, recovery, and evidence only when the run needs them.

How list adapts

The scan plan follows the bucket

The first list run reads the bucket's structure, opens parallel work where that structure exists, then splits only the segments that remain long.

01 · Read the shape

Probe real prefix boundaries before recursive work begins.

02 · Run segments

Use those boundaries to open parallel list work without a prepared hints file.

03 · Split the long tail

Only fan out segments that remain long while useful capacity is available.

04 · Write the list

Finish with an analysis-ready object list instead of terminal-only output.
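
The first three steps can be illustrated with a small in-memory sketch. It is not the tool's actual planner: a sorted `BTreeSet` stands in for the bucket, the names `common_prefixes`, `count_under`, and `plan_segments` are hypothetical, and the exact key counts used here are an assumption, since a real run only learns that a segment is long while listing it.

```rust
use std::collections::BTreeSet;

/// Distinct next-level prefixes under `prefix`, similar to what a delimiter
/// listing reports as CommonPrefixes. Keys with no further delimiter are skipped.
fn common_prefixes(keys: &BTreeSet<String>, prefix: &str, delimiter: char) -> Vec<String> {
    let mut out: Vec<String> = Vec::new();
    for key in keys.range(prefix.to_string()..) {
        if !key.starts_with(prefix) {
            break; // keys are sorted, so we have left the prefix
        }
        let rest = &key[prefix.len()..];
        if let Some(pos) = rest.find(delimiter) {
            let child = format!("{}{}", prefix, &rest[..pos + delimiter.len_utf8()]);
            if out.last() != Some(&child) {
                out.push(child);
            }
        }
    }
    out
}

/// Number of keys under `prefix` (assumption: known up front in this sketch).
fn count_under(keys: &BTreeSet<String>, prefix: &str) -> usize {
    keys.range(prefix.to_string()..)
        .take_while(|k| k.starts_with(prefix))
        .count()
}

/// Keep a prefix as one segment if it is short enough or cannot be split;
/// otherwise descend into its child prefixes. Objects sitting directly under
/// a split prefix would come back from the probe listing itself, so this
/// sketch does not give them a segment of their own.
fn plan_segments(keys: &BTreeSet<String>, prefix: &str, max_keys: usize, out: &mut Vec<String>) {
    let children = common_prefixes(keys, prefix, '/');
    if count_under(keys, prefix) <= max_keys || children.is_empty() {
        out.push(prefix.to_string());
        return;
    }
    for child in children {
        plan_segments(keys, &child, max_keys, out);
    }
}
```

With two keys under `img/`, three under `logs/2024/`, one under `logs/2025/`, and `max_keys = 2`, the plan is `img/`, `logs/2024/`, `logs/2025/`: the `logs/` prefix is split because it is long, while `logs/2024/` stays whole because it has no deeper boundary to split on.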

What remains

List and diff leave different answers

Both paths write results for downstream work. They are deliberately not the same operation.

LIST

A usable object list

One bucket → Parquet object list

Checkpoint and resume keep a long list recoverable; optional trace and manifest files preserve how it ran.

DIFF

An ordered difference set

Source bucket + target bucket → Parquet with DiffFlag

Both sides are listed in parallel and merged in key order, so downstream tools can filter what is equal, missing, or changed.
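
Merging two key-ordered listings is a classic two-pointer merge. The sketch below shows the idea under stated assumptions: the `ObjectMeta` fields, the `DiffFlag` variant names, and the rule that "changed" means a differing size or ETag are all hypothetical and may not match the tool's Parquet schema or comparison logic.

```rust
use std::cmp::Ordering;

/// Hypothetical metadata record; the real columns may differ.
#[derive(Debug, Clone, PartialEq)]
struct ObjectMeta {
    key: String,
    size: u64,
    etag: String,
}

/// Hypothetical flag values covering "equal, missing, or changed".
#[derive(Debug, Clone, Copy, PartialEq)]
enum DiffFlag {
    Equal,
    Changed,
    OnlyInSource,
    OnlyInTarget,
}

/// Small constructor to keep examples short.
fn obj(key: &str, size: u64, etag: &str) -> ObjectMeta {
    ObjectMeta { key: key.to_string(), size, etag: etag.to_string() }
}

/// Merge two listings that are already sorted by key into one ordered
/// difference set. Each input is walked exactly once.
fn merge_diff(source: &[ObjectMeta], target: &[ObjectMeta]) -> Vec<(String, DiffFlag)> {
    let (mut i, mut j) = (0, 0);
    let mut out = Vec::with_capacity(source.len().max(target.len()));
    while i < source.len() && j < target.len() {
        let (s, t) = (&source[i], &target[j]);
        match s.key.cmp(&t.key) {
            Ordering::Less => {
                out.push((s.key.clone(), DiffFlag::OnlyInSource));
                i += 1;
            }
            Ordering::Greater => {
                out.push((t.key.clone(), DiffFlag::OnlyInTarget));
                j += 1;
            }
            Ordering::Equal => {
                // Assumption: equality is judged on size and ETag only.
                let same = s.size == t.size && s.etag == t.etag;
                out.push((s.key.clone(), if same { DiffFlag::Equal } else { DiffFlag::Changed }));
                i += 1;
                j += 1;
            }
        }
    }
    // Whatever is left on one side has no counterpart on the other.
    out.extend(source[i..].iter().map(|s| (s.key.clone(), DiffFlag::OnlyInSource)));
    out.extend(target[j..].iter().map(|t| (t.key.clone(), DiffFlag::OnlyInTarget)));
    out
}
```

Because the output stays in key order, a downstream filter such as "everything that is not `Equal`" is a single pass over the result.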

Compatibility

Check the endpoint before scaling up

Verified: AWS S3 · Verified: MinIO · Verified: Baidu BOS · Preset: Cloudflare R2 · Preset: OSS · Preset: B2

AWS S3, MinIO, and Baidu BOS are covered by the project’s verified compatibility path. R2, OSS, and B2 have presets; run `compat-probe` before trusting a long job on any endpoint.

Questions & how-to

Why not just use `aws s3 ls`?

For a quick, small inspection, `aws s3 ls` is often enough. s3-turbo-list is for large inventories and repeatable comparisons where output and recovery matter.

Do I need to prepare hints first?

No. List mode probes real `CommonPrefixes` boundaries on the first run. Hints files remain an optional control for repeated inventories.

How does startup discovery work?

Startup discovery pre-partitions flat namespaces with parallel single-key probes when a list run starts, so the first run can open parallel work earlier instead of relying only on runtime splits.
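
One way to picture single-key probing is the sketch below. It is an illustration under assumptions, not the tool's algorithm: a sorted `BTreeSet` stands in for the bucket, each probe mimics a list request that starts after a given key and returns at most one key, the candidate boundaries are supplied by the caller, and the probes run sequentially here rather than in parallel. The names `probe_first_key_after`, `KeyRange`, and `discover_ranges` are hypothetical.

```rust
use std::collections::BTreeSet;
use std::ops::Bound::{Excluded, Unbounded};

/// Stand-in for one single-key probe: the first key strictly after
/// `start_after`, or `None` if nothing follows it.
fn probe_first_key_after<'a>(keys: &'a BTreeSet<String>, start_after: &str) -> Option<&'a String> {
    keys.range::<str, _>((Excluded(start_after), Unbounded)).next()
}

/// A slice of the key space: keys greater than `start_after` and, when
/// `end` is set, less than or equal to `end`.
#[derive(Debug, PartialEq)]
struct KeyRange {
    start_after: String,
    end: Option<String>,
}

/// Cut the key space at the given sorted boundaries and keep only the
/// ranges whose probe finds a key inside them.
fn discover_ranges(keys: &BTreeSet<String>, boundaries: &[&str]) -> Vec<KeyRange> {
    let mut ranges = Vec::new();
    let mut lo = String::new();
    let uppers = boundaries
        .iter()
        .map(|b| Some(b.to_string()))
        .chain(std::iter::once(None)); // the last range is open-ended
    for hi in uppers {
        let non_empty = match (probe_first_key_after(keys, &lo), &hi) {
            (Some(first), Some(end)) => first <= end,
            (Some(_), None) => true,
            (None, _) => false,
        };
        if non_empty {
            ranges.push(KeyRange { start_after: lo.clone(), end: hi.clone() });
        }
        if let Some(end) = hi {
            lo = end;
        }
    }
    ranges
}
```

Each surviving range could then be handed to its own list worker, so a bucket with no usable delimiter structure still starts with several segments.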

Start with a dry-run plan
