# disk-reclaim — find reclaimable disk space
`disk-reclaim` scans a directory tree for generated/cache directories that
regenerate from source — `node_modules`, `target`, `__pycache__`, build
outputs, IDE state — and reports them sorted by size. Read-only by design;
it names paths, it never deletes.

The intended workflow: run it, pick out the entries for projects you aren't
actively building, and `rm -rf` those yourself. The script stays out of the
deletion decision because the right answer is project-by-project (a
`target/` you'll need to recompile in 10 minutes is different from one on a
project you haven't touched in a year).
## Usage
```sh
disk-reclaim # scan $HOME, default --min 100M
disk-reclaim ~/Code # scope to a subtree
disk-reclaim --min 1G # only the worst offenders
disk-reclaim --all # no minimum filter
disk-reclaim --no-summary # skip the per-category totals
```
Sample output (first real run on this machine, `~/Code --min 500M`):
```
scanning /Users/natalie/Code (min size: 500M)...
SIZE PATH
---- ----
15.7G /Users/natalie/Code/@projects/@magic-civilization/src/simulator/target
5.6G /Users/natalie/Code/@projects/@magic-civilization/.local/build
1.2G /Users/natalie/Code/@projects/@lilith/lilith-platform.live/node_modules
top-level cache roots:
1.5G /Users/natalie/.npm
868M /Users/natalie/Library/Caches
579M /Users/natalie/.cargo/registry
totals by category:
15.7G target
5.6G build
1.2G node_modules
```
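
The size column and `--min` cutoff shown above could be produced roughly as follows. This is a minimal sketch, not the script's actual code: `filter_by_size` is a hypothetical helper, and it takes the threshold in KiB rather than parsing suffixes like `100M`.

```sh
# Hypothetical helper (not taken from bin/disk-reclaim).
# Reads directory paths on stdin and prints "KiB<TAB>path" for each one
# whose disk usage is at or above the threshold, largest first.
filter_by_size() {
  min_kib=$1
  while IFS= read -r dir; do
    # du -sk reports total usage in KiB; treat unreadable paths as size 0
    kib=$(du -sk "$dir" 2>/dev/null | cut -f1)
    [ "${kib:-0}" -ge "$min_kib" ] && printf '%s\t%s\n' "$kib" "$dir"
  done | sort -rn
}
```

Assuming `M` means MiB, piping candidate paths into `filter_by_size 102400` would approximate the default `--min 100M`.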
## What it scans
**Project-scoped patterns (via `find ... -prune`):**
| Ecosystem | Patterns |
|---|---|
| JS/TS | `node_modules`, `.next`, `.nuxt`, `.turbo`, `.vite`, `.parcel-cache`, `.svelte-kit`, `.astro`, `.cache`, `dist`, `build`, `out` |
| Python | `__pycache__`, `.pytest_cache`, `.mypy_cache`, `.ruff_cache`, `.tox`, `.venv` |
| Rust | `target` |
| Other | `_build`, `Pods`, `DerivedData`, `.gradle`, `.android` |

`-prune` matters: once a `node_modules` is matched, `find` doesn't descend
into it looking for nested matches. Without it, scans are slow and nested
build dirs get double-counted.

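
The effect of `-prune` can be seen in a small sketch. The function name `find_reclaimable` and the shortened pattern list are illustrative assumptions; the real script's `find` invocation may be structured differently.

```sh
# Hypothetical sketch of the project-scoped sweep (pattern list shortened).
# A directory whose name matches is printed and pruned, so find never
# descends into it; a nested node_modules inside a matched node_modules
# is therefore not reported a second time.
find_reclaimable() {
  find "$1" -type d \
    \( -name node_modules -o -name target -o -name __pycache__ \) \
    -prune -print
}
```

Calling `find_reclaimable ~/Code` lists only the outermost match on each branch of the tree.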
**Top-level cache roots (checked once each, not via find):**
- `~/Library/Caches`, `~/Library/Developer/Xcode/DerivedData` (macOS)
- `~/.cache` (XDG)
- `~/.npm`, `~/.pnpm-store`, `~/.yarn/cache`
- `~/.cargo/registry`, `~/.cargo/git`

Each of these is *the* cache root for its tool, so including them in the
`find` sweep would be wrong: the script would match every nested `.cache`
inside them.
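
Checking each root once, outside the `find` sweep, might look like the sketch below. `report_cache_roots` is a hypothetical name, and the real script may format or collect its output differently.

```sh
# Hypothetical sketch: size each known cache root exactly once.
# Roots that do not exist on this machine are skipped silently.
report_cache_roots() {
  for root in "$@"; do
    [ -d "$root" ] || continue
    # -s gives one summary line per root instead of one per subdirectory
    du -sh "$root" 2>/dev/null
  done
}
```

For example, `report_cache_roots ~/.npm ~/.cargo/registry ~/.cache` prints one line per root that is present.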
## What it deliberately does NOT scan
| Pattern | Why excluded |
|---|---|
| `vendor/` | Usually committed (Go) or required at runtime (PHP). Not generated. |
| `.git` | User data. Never delete. |
| Top-level `tmp`, `Downloads`, `Movies` | User-created content, not regenerable from source. |
| Docker images/volumes | Use `docker system prune` instead — separate workflow with its own safety story. |
## Caveats before `rm -rf`
| Pattern | Cost to rebuild |
|---|---|
| `node_modules` | `pnpm install` / `npm install` — seconds to minutes depending on cache |
| `target` (Rust) | Full `cargo build` — minutes |
| `.venv` | `uv sync` / `pip install -r requirements.txt` — depends on wheel availability |
| `.next`, `.nuxt`, `dist`, `build` | Single build command — usually fast |
| `DerivedData`, `.gradle` | First build after deletion is slow; subsequent builds fine |

The script prints these caveats as a warning at the bottom of every run.
Cache roots (`~/.npm`, `~/.cargo/registry`, etc.) are safe to nuke: they're
pure caches that get repopulated lazily on the next install.
## Files
| Path | Role |
|---|---|
| `bin/disk-reclaim` | The script |
## Related
- [[lan-power-ctrl]] — when *apricot's* disk fills and it wedges, `power-cycle apricot` is the recovery path