The whole course as one decision flow, one cheat sheet, and two tools that cover everything.
You’ve got the model now: the index makes selective extraction possible, the streaming/seekable split says when it’s cheap and when it isn’t, and code adds the in-memory buffer. This capstone compresses all of it into a field guide you’ll actually keep — and ends with a task to lock it in.
One archive, one question at a time:
The three verbs across every format you’ll meet. Bookmark this one.
| Format | List | Stream one → stdout | Bulk extract |
|---|---|---|---|
| .zip | unzip -l a.zip | unzip -p a.zip f | unzip -d out/ a.zip |
| .rar | unrar l a.rar | unrar p a.rar f | unrar x a.rar out/ |
| .tar | tar -tf a.tar | tar -xO -f a.tar f | tar -xf a.tar -C out/ |
| .tar.gz | tar -tzf a.tgz | tar -xzO -f a.tgz f | tar -xzf a.tgz -C out/ |
| .gz (one file) | n/a | zcat f.gz | gunzip f.gz |
| .7z | 7z l a.7z | 7z e -so a.7z f | 7z x a.7z |
| any (universal) | bsdtar -tf a.* | bsdtar -xOf a.* f | bsdtar -xf a.* |
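The same three verbs exist in Python's standard library. What follows is a minimal sketch for the .zip row only, assuming Python 3 and the stdlib zipfile module. The function names and the sample members (prices.csv, notes.txt) are hypothetical, made up for illustration, and the archive is built in a temporary directory so the block runs anywhere.

```python
import tempfile
import zipfile
from pathlib import Path


def list_members(archive):
    """List: like `unzip -l`, this only needs the archive's index."""
    with zipfile.ZipFile(archive) as z:
        return z.namelist()


def read_member(archive, name):
    """Stream one: like `unzip -p`, returns the member's bytes in memory."""
    with zipfile.ZipFile(archive) as z:
        return z.read(name)


def extract_all(archive, out_dir):
    """Bulk extract: writes every member under out_dir."""
    with zipfile.ZipFile(archive) as z:
        z.extractall(out_dir)


def demo():
    """Build a tiny sample archive (hypothetical contents) and run all three verbs."""
    with tempfile.TemporaryDirectory() as tmp:
        archive = Path(tmp) / "a.zip"
        with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as z:
            z.writestr("prices.csv", "date,close\n2024-01-01,100\n")
            z.writestr("notes.txt", "hello\n")
        names = list_members(archive)
        data = read_member(archive, "prices.csv")
        out = Path(tmp) / "out"
        extract_all(archive, out)
        extracted = sorted(p.name for p in out.iterdir())
        return names, data, extracted


if __name__ == "__main__":
    print(demo())
```

Only extract_all touches the disk; the first two verbs leave nothing behind, which is the property the cheat sheet's middle columns are after.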
Don’t want to remember per-format syntax? Two readers swallow most formats behind one interface:
7z (p7zip): one binary for .7z, .zip, .tar, .gz, and more. l lists, x extracts keeping paths, and -so writes a member to stdout. [1]
bsdtar (libarchive): “compression and format are always detected automatically, and the same API is used for all formats.” [2] It reads tar, zip, 7-zip, cpio, iso and more with the same -tf / -xf flags. The closest thing to one tool for all of them.
The catch worth remembering from Lesson 2: a universal tool still obeys the format’s physics. bsdtar -tf on a .tar.gz is still a sequential walk; on a .zip it’s still a seek. One interface, not one cost.
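The "one interface, not one cost" point can be made visible by counting how many bytes a listing actually pulls from the archive. This is a rough sketch using Python's stdlib zipfile and tarfile rather than bsdtar itself; CountingBytesIO, build_archives, and listing_cost are hypothetical helper names, and the members are random bytes so both archives end up roughly the same size.

```python
import io
import os
import tarfile
import zipfile


class CountingBytesIO(io.BytesIO):
    """In-memory file that tallies how many bytes readers pull from it."""

    def __init__(self, data):
        super().__init__(data)
        self.bytes_read = 0

    def read(self, size=-1):
        chunk = super().read(size)
        self.bytes_read += len(chunk)
        return chunk


def build_archives(n_members=20, member_size=50_000):
    """Build a .zip and a .tar.gz holding the same incompressible members."""
    payloads = {f"f{i:02d}.bin": os.urandom(member_size) for i in range(n_members)}
    zip_buf, tgz_buf = io.BytesIO(), io.BytesIO()
    with zipfile.ZipFile(zip_buf, "w", zipfile.ZIP_DEFLATED) as z:
        for name, data in payloads.items():
            z.writestr(name, data)
    with tarfile.open(fileobj=tgz_buf, mode="w:gz") as t:
        for name, data in payloads.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            t.addfile(info, io.BytesIO(data))
    return zip_buf.getvalue(), tgz_buf.getvalue()


def listing_cost(zip_bytes, tgz_bytes):
    """List both archives; return (names, bytes read) for each."""
    zsrc, tsrc = CountingBytesIO(zip_bytes), CountingBytesIO(tgz_bytes)
    with zipfile.ZipFile(zsrc) as z:
        zip_names = z.namelist()   # seeks to the index at the end of the file
    with tarfile.open(fileobj=tsrc, mode="r:gz") as t:
        tar_names = t.getnames()   # decompresses and walks header by header
    return (zip_names, zsrc.bytes_read), (tar_names, tsrc.bytes_read)


if __name__ == "__main__":
    (zn, zr), (tn, tr) = listing_cost(*build_archives())
    print(f"zip listing read {zr} bytes; tar.gz listing read {tr} bytes")
```

Both calls return the same member names, but the zip count should be a small fraction of the tar.gz count, because the tar.gz listing has to decompress its way through the member data to find each header.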
This is the part that builds the skill. Do it now on a real archive (one of your BSE zips is perfect):
1. unzip -l yourfile.zip and find one member you want.
2. unzip -p yourfile.zip THAT.csv | head, and confirm nothing was written to disk.
3. unzip -p yourfile.zip THAT.csv | wc -l to count its rows.
4. In Python: open the archive with zipfile, call z.read("THAT.csv"), and print the first line, with no file on disk.
5. Grab a .tar.gz; time tar -tzf on it vs unzip -l on a zip. Notice the difference.

You started by asking “how did you extract those without unzipping them?” You can now answer it cold (the index, the seek, the streaming split, the in-memory buffer) and reach into any .zip, .rar, .tar(.gz), .7z three ways, choosing the cheap one on sight. That’s the fluency the mission was after.
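The Python step in the checklist, combined with the in-memory buffer, might look like the sketch below. It assumes the archive's bytes are already in memory (for example from a download) and that the member is UTF-8 text. first_line_and_row_count and make_sample_zip are hypothetical names, and the sample rows are invented for the demo. Instead of z.read, it uses z.open so the member is decompressed lazily, line by line.

```python
import io
import zipfile


def first_line_and_row_count(archive_bytes, member):
    """Return (first line, line count) of one member, entirely in memory."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as z:
        with z.open(member) as raw:
            # Wrap the binary member stream so it can be read as text lines.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            header = text.readline()
            if not header:
                return "", 0          # empty member
            count = 1 + sum(1 for _ in text)
    return header.rstrip("\r\n"), count


def make_sample_zip():
    """Build a small zip in memory (invented sample rows)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as z:
        z.writestr("THAT.csv", "date,close\n2024-01-01,100\n2024-01-02,101\n")
    return buf.getvalue()


if __name__ == "__main__":
    print(first_line_and_row_count(make_sample_zip(), "THAT.csv"))
```

For a file whose last line ends in a newline, the count matches what wc -l reports in step 3, so the two steps can be checked against each other.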
This one mixes all four lessons on purpose — interleaving is what proves the knowledge is yours, not just fresh. Retrieve each from memory.
libarchive / bsdtar — the universal reader; skim the front page on automatic format detection. And the 7-Zip command-line manual for l / x / e -so. Together they’re the two-tool kit behind the cheat sheet above.
zip64 for >4 GB, or wiring all of this into your backtester’s loader. You pick the direction.
-so (write data to stdout) and the command reference (l list, e/x extract).