Pack Build Optimization

29% faster builds  ·  84% smaller cache  ·  fleet-wide

neeto-deploy  ·  May 2026


TL;DR

Metric              Before → After        Change
Build time (warm)   668s → 473s           −195s  ·  −29%
Export phase        371s → 235s           −136s  ·  −37%
:cache ECR image    3,236 MB → 531 MB     −2,705 MB  ·  −84%
build-gems layer    2,652 MB → 125 MB     −2,527 MB  ·  −95%

Measured on neeto-planner-web-staging (representative Rails monorepo). Source: ClickHouse logs.app_logs + ECR manifest inspection.

The Problem

1,854 pack builds in 7 days hit the EKS arm-builds nodes:

Percentile   Duration             What you felt
p50          624 s (10.4 min)     typical (median) wait per deploy
p90          905 s (15.1 min)     slow deploys
p95          961 s (16.0 min)     painful
p99          1,403 s (23.4 min)   brutal
max          1,895 s (31.6 min)   hit the timeout zone

The EXPORT phase was the dominant cost, averaging ~370 s per build, with no obvious cause on the surface.

Pack Build — 5 lifecycle phases

1. ANALYZE — pull previous-image manifest, decide which layers to reuse
2. DETECT — each buildpack votes "I apply"
3. RESTORE — pull :cache, untar layers into build container
4. BUILD — each buildpack's bin/build runs (bundle install, assets:precompile, …)
5. EXPORT ← dominant cost — walk dirs, tar, gzip, SHA256, push to ECR

For every layer: tar(dir) → gzip → sha256 → upload-or-reuse. Layer size drives the wall-clock cost.
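To make the per-layer pipeline concrete, here is a minimal Go sketch of tar → gzip → sha256 as one streaming pass. It is not the lifecycle's code: `exportLayer`, `countingWriter`, and `sampleDir` are hypothetical names, only regular files are archived, and the digest is taken over the compressed stream as a simplifying assumption.

```go
package main

import (
	"archive/tar"
	"compress/gzip"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"io/fs"
	"os"
	"path/filepath"
	"time"
)

// countingWriter counts the bytes written through it.
type countingWriter struct{ n int64 }

func (c *countingWriter) Write(p []byte) (int, error) {
	c.n += int64(len(p))
	return len(p), nil
}

// exportLayer streams dir through tar -> gzip -> sha256 and returns the hex
// digest and size of the compressed stream. Every byte of every file is read,
// so the cost grows with the size of the layer.
func exportLayer(dir string) (string, int64, error) {
	hasher := sha256.New()
	counter := &countingWriter{}
	gz := gzip.NewWriter(io.MultiWriter(hasher, counter))
	tw := tar.NewWriter(gz)

	walkErr := filepath.WalkDir(dir, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if !d.Type().IsRegular() {
			return nil // sketch: skip directories, symlinks, devices
		}
		info, err := d.Info()
		if err != nil {
			return err
		}
		hdr, err := tar.FileInfoHeader(info, "")
		if err != nil {
			return err
		}
		rel, err := filepath.Rel(dir, path)
		if err != nil {
			return err
		}
		hdr.Name = filepath.ToSlash(rel)
		hdr.ModTime = time.Unix(0, 0) // fixed timestamp keeps the digest stable
		if err := tw.WriteHeader(hdr); err != nil {
			return err
		}
		f, err := os.Open(path)
		if err != nil {
			return err
		}
		defer f.Close()
		_, err = io.Copy(tw, f)
		return err
	})
	if walkErr != nil {
		return "", 0, walkErr
	}
	if err := tw.Close(); err != nil {
		return "", 0, err
	}
	if err := gz.Close(); err != nil {
		return "", 0, err
	}
	return hex.EncodeToString(hasher.Sum(nil)), counter.n, nil
}

// sampleDir creates a throwaway directory with one small file for the demo.
func sampleDir() (string, error) {
	dir, err := os.MkdirTemp("", "layer")
	if err != nil {
		return "", err
	}
	err = os.WriteFile(filepath.Join(dir, "Gemfile.lock"), []byte("GEM\n"), 0o644)
	return dir, err
}

func main() {
	dir, err := sampleDir()
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)
	digest, size, err := exportLayer(dir)
	if err != nil {
		panic(err)
	}
	fmt.Printf("sha256:%s (%d compressed bytes)\n", digest, size)
}
```

Note that the digest is only known after the whole directory has been read and compressed, so even a layer that ends up being reused pays the tar + gzip + hash cost first.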

Where time was actually going

Per-phase duration from the before baseline (deployment 7fdc4f7c):

Phase        Duration
setup_env    4.7s
clone_repo   1.1s
analyzing    42.7s
detecting    1.8s
restoring    59.5s
building     134.5s
exporting    371.0s

EXPORT alone = 56% of total build time. So we needed to know what was happening inside it.
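A minimal sketch of how per-phase numbers like these can be pulled out of build logs. The log line shape `[Build][phase=exporting][duration_ms=371000]` is an assumption for illustration (the deck elides the actual values), and `parsePhases`, `share`, and `totalBuild` are hypothetical names. The 56% figure corresponds to 371.0 s measured against the 668 s end-to-end build time from the TL;DR; the phases listed above sum to 615.3 s.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
	"time"
)

// totalBuild is the end-to-end warm build time quoted in the TL;DR.
const totalBuild = 668 * time.Second

// phaseLine matches the assumed log format, e.g.
// [Build][phase=exporting][duration_ms=371000]
var phaseLine = regexp.MustCompile(`\[Build\]\[phase=([a-z_]+)\]\[duration_ms=(\d+)\]`)

// parsePhases sums the duration per phase across all matching log lines.
func parsePhases(logs string) map[string]time.Duration {
	out := make(map[string]time.Duration)
	for _, line := range strings.Split(logs, "\n") {
		m := phaseLine.FindStringSubmatch(line)
		if m == nil {
			continue
		}
		ms, err := strconv.ParseInt(m[2], 10, 64)
		if err != nil {
			continue // value out of range; skip the line
		}
		out[m[1]] += time.Duration(ms) * time.Millisecond
	}
	return out
}

// share returns part/total as a percentage.
func share(part, total time.Duration) float64 {
	return 100 * float64(part) / float64(total)
}

func main() {
	logs := `[Build][phase=analyzing][duration_ms=42700]
[Build][phase=restoring][duration_ms=59500]
[Build][phase=building][duration_ms=134500]
[Build][phase=exporting][duration_ms=371000]`
	phases := parsePhases(logs)
	fmt.Printf("exporting: %.1f%% of the %v build\n",
		share(phases["exporting"], totalBuild), totalBuild)
}
```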

Bundle-install creates two layers per build

build-gems
  flags: build:true cache:true
  lives in: <app>:cache ECR image
  used during: build phase only
  contents: all gems incl. dev/test
  purpose: speed up future bundle install

launch-gems
  flags: launch:true
  lives in: <app>:<deploy-id> app image
  used during: runtime (in the running pod)
  contents: only :default + :production
  purpose: what the app actually loads

Critical detail: launch-gems is created by copying build-gems, then running bundle install --without development:test --clean true to strip the dev/test gems out.

How the cache flows across builds

Build 1 (cold)
  RESTORE: no :cache exists → no-op
  BUILD: bundle install (no --without) → installs every gem into build-gems
  EXPORT: push fresh :cache with build-gems blob. Push :latest with launch-gems.

Build 2+ (warm)
  RESTORE: pull :cache → untar build-gems into the build container
  BUILD: buildpack reads cache_sha metadata. If Gemfile.lock unchanged → "Reusing cached layer", and the install is skipped entirely
  EXPORT: re-tar + gzip + hash the same build-gems content → push same :cache

Consequence: build-gems content is frozen from Build 1. Whatever leaked in on day one (dev/test gems, old wkhtmltopdf-binary versions) gets dragged forward forever and is re-hashed on every export.
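The warm-build reuse decision can be sketched as a checksum comparison. This is an illustration under assumptions, not the buildpack's code: `layerMetadata`, `checksum`, and `canReuse` are hypothetical names, and the real cache_sha may be computed over more inputs than Gemfile.lock alone.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// layerMetadata is a hypothetical stand-in for the metadata stored with the
// restored build-gems layer.
type layerMetadata struct {
	CacheSHA string
}

// checksum returns the hex SHA-256 of the lockfile contents.
func checksum(lockfile []byte) string {
	sum := sha256.Sum256(lockfile)
	return hex.EncodeToString(sum[:])
}

// canReuse reports whether the restored layer is reused as-is. It looks only
// at the lockfile, never at what is actually inside the layer directory.
func canReuse(meta layerMetadata, lockfile []byte) bool {
	return meta.CacheSHA != "" && meta.CacheSHA == checksum(lockfile)
}

func main() {
	lock := []byte("GEM\n  specs:\n")
	meta := layerMetadata{CacheSHA: checksum(lock)}

	fmt.Println("unchanged lockfile, reuse:", canReuse(meta, lock))
	fmt.Println("changed lockfile, reuse:", canReuse(meta, append(lock, "    rails\n"...)))
}
```

Because the check never inspects the layer contents, a layer that was polluted on Build 1 keeps passing it for as long as the lockfile is unchanged.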

Quick win #1 — lifecycle cache duplicate-layer bug

Problem. When a developer deploys an app without changing any dependencies, the build pipeline should reuse the cached gem and asset layers from the previous build — not re-upload them.

But every build — even ones where the only change was app code — was pushing all 9 cache layers to ECR again.

Inspecting the actual :cache image revealed why: its manifest listed 18 layer references, but only 9 unique data blobs existed — every blob was being uploaded twice.

Cost: ~80 s of duplicate network transfer per build, and the :cache image kept growing unnecessarily.
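The duplicate check on a manifest is simple enough to sketch. The snippet below assumes the image manifest JSON has already been fetched and relies only on the standard `layers[].digest` fields of a Docker v2 / OCI manifest; `countLayers` is a hypothetical helper and the sample digests are made up.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// manifest holds only the fields of an image manifest needed here.
type manifest struct {
	Layers []struct {
		Digest string `json:"digest"`
	} `json:"layers"`
}

// countLayers returns how many layer references the manifest lists and how
// many distinct blobs those references point to.
func countLayers(raw []byte) (total, unique int, err error) {
	var m manifest
	if err = json.Unmarshal(raw, &m); err != nil {
		return 0, 0, err
	}
	seen := make(map[string]struct{})
	for _, l := range m.Layers {
		seen[l.Digest] = struct{}{}
	}
	return len(m.Layers), len(seen), nil
}

func main() {
	// Made-up manifest: two blobs, each referenced twice.
	raw := []byte(`{"layers":[
		{"digest":"sha256:aaa"},{"digest":"sha256:aaa"},
		{"digest":"sha256:bbb"},{"digest":"sha256:bbb"}]}`)
	total, unique, err := countLayers(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d layer references, %d unique blobs\n", total, unique)
}
```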


Quick win #1 — cause + fix

Cause. A missing return in neeto-deploy-lifecycle/phase/cache.go. When a layer's SHA matched the previous build's (reuse path), ReuseLayer() ran correctly — but execution then fell through and also ran AddLayerFile(), re-uploading the same blob:

if layer.Digest == previousSHA {
    if err = cache.VerifyLayer(previousSHA); err == nil {
        if err = cache.ReuseLayer(previousSHA); err != nil { /* handle */ }
        // ← MISSING `return` here. Falls through.
    }
}
return layer.Digest, cache.AddLayerFile(layer.TarPath, layer.Digest)
//                   ↑ called even when ReuseLayer already succeeded

Fix. Add return layer.Digest, nil after the successful ReuseLayer call. One line of code.
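A self-contained sketch of the corrected control flow, runnable against a fake cache. The method names `VerifyLayer`, `ReuseLayer`, and `AddLayerFile` come from the snippet above; the `layerCache` interface, the `layer` struct, `fakeCache`, and the choice to return the `ReuseLayer` error are assumptions made for illustration.

```go
package main

import "fmt"

// layer and layerCache are hypothetical shapes modelled on the snippet above.
type layer struct {
	Digest  string
	TarPath string
}

type layerCache interface {
	VerifyLayer(sha string) error
	ReuseLayer(sha string) error
	AddLayerFile(tarPath, sha string) error
}

// addOrReuseCacheLayer reuses the previous blob when the digest matches and
// only falls back to AddLayerFile when reuse is not possible.
func addOrReuseCacheLayer(c layerCache, l layer, previousSHA string) (string, error) {
	if l.Digest == previousSHA {
		if err := c.VerifyLayer(previousSHA); err == nil {
			if err := c.ReuseLayer(previousSHA); err != nil {
				return "", err // assumption: surface the error to the caller
			}
			return l.Digest, nil // the added early return: no second upload
		}
		// Verification failed: fall through and add the layer again.
	}
	return l.Digest, c.AddLayerFile(l.TarPath, l.Digest)
}

// fakeCache counts calls so the behaviour can be observed.
type fakeCache struct {
	reused, added int
}

func (f *fakeCache) VerifyLayer(string) error          { return nil }
func (f *fakeCache) ReuseLayer(string) error           { f.reused++; return nil }
func (f *fakeCache) AddLayerFile(string, string) error { f.added++; return nil }

func main() {
	c := &fakeCache{}
	l := layer{Digest: "sha256:aaa", TarPath: "/tmp/build-gems.tar"}
	if _, err := addOrReuseCacheLayer(c, l, "sha256:aaa"); err != nil {
		panic(err)
	}
	fmt.Printf("reused=%d added=%d\n", c.reused, c.added)
}
```

Without the early return, the same call would leave both counters at 1, which is the double entry seen in the manifest.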

Result: cache_add dropped from 9 → 0 on warm rebuilds; EXPORT phase 453 s → 371 s (−82 s/build), recovered before the bundle-install work even started.

PR (merged): neeto-deploy-lifecycle#2 · Released as lifecycle:0.2

Quick win #2 — --previous-image launch reuse

Problem. Every build's EXPORT phase tars + gzips + uploads each layer of the app image to ECR. Some of those layers — launcher, config, process-types — come from the CNB lifecycle itself and only change when we bump the lifecycle version (months apart).

But pack had no way to look at the previous build's app image. So it treated every layer as new, re-uploading bytes that were byte-for-byte identical to last week's build. Every build's logs were spammed with Adding layer 'buildpacksio/lifecycle:launcher' when they could have been Reusing layer ….

Fix. In neeto-deploy-slug-compiler-web/.docker/pack-build/build.sh: look up :latest via aws ecr describe-images, then pass it to pack build as --previous-image:

previous_image_args=()   # stays empty on the first build, when :latest does not exist yet
if aws ecr describe-images --repository-name "$APP_IMAGE_REPOSITORY" \
     --image-ids imageTag=latest --region us-east-1 > /dev/null 2>&1; then
  previous_image_args=('--previous-image' "$LATEST_IMAGE_TAG")
fi
pack build "$APP_IMAGE_TAG" "${previous_image_args[@]}" --tag "$LATEST_IMAGE_TAG" …

Effect. The exporter now compares each new layer's SHA against the previous image's manifest. Same SHA → reference the existing blob by digest, skip the upload entirely. Logs flip from Adding layer … to Reusing layer ….
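The comparison the exporter makes can be pictured as a set lookup against the previous image's layer digests. This is a simplified sketch, not lifecycle code: `planExport` is a hypothetical helper and the digests are placeholders.

```go
package main

import "fmt"

// planExport splits the new image's layer digests into those already present
// in the previous image (reference by digest) and those that need an upload.
func planExport(newLayers, previousLayers []string) (reuse, add []string) {
	prev := make(map[string]bool, len(previousLayers))
	for _, d := range previousLayers {
		prev[d] = true
	}
	for _, d := range newLayers {
		if prev[d] {
			reuse = append(reuse, d)
		} else {
			add = append(add, d)
		}
	}
	return reuse, add
}

func main() {
	previous := []string{"sha256:launcher", "sha256:config", "sha256:app-v1"}
	current := []string{"sha256:launcher", "sha256:config", "sha256:app-v2"}

	reuse, add := planExport(current, previous)
	for _, d := range reuse {
		fmt.Println("Reusing layer", d)
	}
	for _, d := range add {
		fmt.Println("Adding layer", d)
	}
}
```

Without a previous image the lookup set is empty, so every layer lands in the add list, which matches the behaviour described before the fix.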

Investigation — the smoking gun

Pulled the :cache ECR image, aggregated uncompressed bytes by category:

Category                                            Size (MB)   % of cache
wkhtmltopdf-binary (in :development, :test group)   1,134       35%
Other production gems                               300         9%
Dev/test gems (brakeman, faker, rbs, …)             152         5%
Duplicate gem versions                              120         4%
Native ext sources + bundler cache                  1,338       41%
Misc                                                216         6%
Total                                               3,236       100%

Finding: 1.13 GB of wkhtmltopdf-binary in the cache. That gem is in the :development, :test group of Gemfile.common.rb. It should NEVER ship to production. Yet there it was, in every neeto product's build-gems cache.
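A rough sketch of how uncompressed bytes in a layer tarball can be bucketed by category. The path rules in `categorize` are illustrative assumptions, not the rules used for the table above, and `sampleTar` builds a tiny in-memory archive with made-up paths so the example runs without access to ECR.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
	"strings"
)

// categorize maps a path inside the layer to a bucket (assumed rules).
func categorize(name string) string {
	switch {
	case strings.Contains(name, "/gems/wkhtmltopdf-binary-"):
		return "wkhtmltopdf-binary"
	case strings.Contains(name, "/cache/"):
		return "bundler cache"
	case strings.Contains(name, "/gems/"):
		return "other gems"
	default:
		return "misc"
	}
}

// sizeByCategory reads an uncompressed layer tarball and sums the sizes of
// regular files per bucket, using the tar headers only.
func sizeByCategory(r io.Reader) (map[string]int64, error) {
	totals := make(map[string]int64)
	tr := tar.NewReader(r)
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return totals, nil
		}
		if err != nil {
			return nil, err
		}
		if hdr.Typeflag == tar.TypeReg {
			totals[categorize(hdr.Name)] += hdr.Size
		}
	}
}

// sampleTar builds a small in-memory tarball with made-up paths and sizes.
func sampleTar() (*bytes.Buffer, error) {
	files := []struct {
		name string
		size int
	}{
		{"ruby/gems/wkhtmltopdf-binary-0.0.0/bin/wkhtmltopdf", 300},
		{"ruby/gems/rails-0.0.0/lib/rails.rb", 100},
		{"ruby/cache/rails-0.0.0.gem", 50},
	}
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	for _, f := range files {
		hdr := &tar.Header{Name: f.name, Mode: 0o644, Size: int64(f.size), Typeflag: tar.TypeReg}
		if err := tw.WriteHeader(hdr); err != nil {
			return nil, err
		}
		if _, err := tw.Write(make([]byte, f.size)); err != nil {
			return nil, err
		}
	}
	return &buf, tw.Close()
}

func main() {
	buf, err := sampleTar()
	if err != nil {
		panic(err)
	}
	totals, err := sizeByCategory(buf)
	if err != nil {
		panic(err)
	}
	for _, k := range []string{"wkhtmltopdf-binary", "other gems", "bundler cache"} {
		fmt.Printf("%-20s %d bytes\n", k, totals[k])
	}
}
```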

Root cause

Upstream Paketo bundle-install in build.go — asymmetric configs:

// BUILD layer install
installProcess.Execute(..., map[string]string{
    "path":  layer.Path,
    "clean": "true",
    // ← NO "without". Installs every group.
})

// LAUNCH layer install
installProcess.Execute(..., map[string]string{
    "path":    layer.Path,
    "without": "development:test",  // ← hardcoded, launch only
    "clean":   "true",
})

First build (no cache) → build-gems gets all gems. Subsequent builds "Reuse cached layer" without re-installing → dev/test gems live in the cache forever.

The launch layer's install config was fine. But the cache layer still had to be tarred + gzipped + hashed on every export: ~85 s of wasted work per build.

And the bloat reached production app images

Even though launch install runs bundle install --without development:test --clean true:

  1. Launch-gems starts as a copy of build-gems (which has dev/test).
  2. Pack's --previous-image optimization tells the exporter: "if this layer's SHA matches the previous build's, reference that blob — don't re-upload".
  3. When Gemfile.lock doesn't change, the buildpack logs "Reusing cached layer …/launch-gems" → exporter re-uses the same SHA from the previous build's manifest.
  4. Result: the polluted launch-gems blob created on the very first build is referenced by every subsequent app image. The 2.5+ GB launch-gems persists in production indefinitely.

Fleet audit found: 3 production app images at 2.5–2.8 GB each. After :latest + :cache delete + 1 cold rebuild → 700–900 MB each (−69%).

The fix — bundle-install:0.9.0

Added BP_BUNDLE_WITHOUT env var (+ RAILS_ENV/RACK_ENV-derived default) honored by both layer installs:

// environment.go
switch railsEnv {
case "production", "staging": return "development:test"
case "development":           return "production:test:staging:heroku"
case "test":                  return "production:development:staging:heroku"
default:                      return ""           // legacy
}

// build.go — same logic now applied to BOTH layers
if environment.BundleWithout != "" {
    buildConfig["without"] = environment.BundleWithout
}

Defaults preserve back-compat. Apps with RAILS_ENV=production (i.e., every neeto product) automatically get the savings.
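A runnable sketch of how the override and the derived default might combine. The mapping in `defaultWithout` mirrors the switch above; the precedence in `resolveWithout` (explicit BP_BUNDLE_WITHOUT first, then RAILS_ENV, then RACK_ENV) is an assumption, since the slide only says both are honored. Both function names are hypothetical.

```go
package main

import "fmt"

// defaultWithout mirrors the environment-derived defaults shown above.
func defaultWithout(env string) string {
	switch env {
	case "production", "staging":
		return "development:test"
	case "development":
		return "production:test:staging:heroku"
	case "test":
		return "production:development:staging:heroku"
	default:
		return "" // legacy behaviour: install every group
	}
}

// resolveWithout picks the groups to skip. Assumed precedence: an explicit
// BP_BUNDLE_WITHOUT wins, then RAILS_ENV, then RACK_ENV.
func resolveWithout(getenv func(string) string) string {
	if v := getenv("BP_BUNDLE_WITHOUT"); v != "" {
		return v
	}
	env := getenv("RAILS_ENV")
	if env == "" {
		env = getenv("RACK_ENV")
	}
	return defaultWithout(env)
}

func main() {
	env := map[string]string{"RAILS_ENV": "production"}
	getenv := func(k string) string { return env[k] }
	fmt.Printf("without=%q\n", resolveWithout(getenv))
}
```

Passing `os.Getenv` as the `getenv` argument would read the real process environment; the map-backed lookup just keeps the example deterministic.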

PR (merged): neeto-deploy-paketo-bundle-install-buildpack#12 · Tracking issue: neeto-deploy-web#7146 · 11 new tests, 0 regressions

Result #1 — cache image size

Before (3,236 MB) → After (531 MB)

Layer 5 (build-gems) dropped from 2,652 MB → 125 MB. All other layers unchanged.

Result #2 — per-phase timing

Phase        Before    After (warm)
setup_env    4.6s      4.6s
analyzing    42.7s     42.7s
restoring    59.5s     20.5s  (−65%)
building     134.5s    116.2s  (−14%)
exporting    371.0s    235.2s  (−37%)

Restoring dropped because the cache image is 84% smaller → less to pull + untar. Export dropped because there's less data to tar+gzip+hash.

Result #3 — fleet-wide impact

Top 5 apps per env, builds before vs after 2026-05-11 14:00 UTC:

Staging

App                 Before   After   Δ
neeto-cal-web       937s     487s    −48%
neeto-desk-web      865s     454s    −48%
neeto-chat-web      899s     536s    −40%
neeto-git-web       752s     463s    −38%
neeto-planner-web   728s     468s    −36%

Production

App                 Before   After   Δ
neeto-git-web       710s     442s    −38%
neeto-engage-web    752s     514s    −32%
neeto-pay-web       741s     521s    −30%
neeto-tower-web     691s     492s    −29%
neeto-deploy-web    841s     614s    −27%

Staging median: −41%  ·  Production median: −27%

Result #4 — production image sizes

3 production apps audited (had bloated 2.5+ GB launch-gems from dev/test gems):

After deleting :latest + :cache and forcing one cold rebuild: combined size 8,315 MB → 2,541 MB (−5.8 GB / −69%).

Result #5 — fleet-wide image savings

Across the apps that have rebuilt since the fleet-wide tag clear, 17 of 72 apps got measurably smaller (the other 55 simply haven't redeployed yet). Top 12 by absolute MB saved:

App                           Before     After    Δ MB     Δ %
neeto-record-web-prod         2,801 MB   752 MB   −2,049   −73%
neeto-cal-web-prod            2,829 MB   870 MB   −1,959   −69%
neeto-form-web-prod           2,685 MB   920 MB   −1,765   −66%
neeto-form-web-stag           1,143 MB   647 MB   −496     −43%
neeto-chat-web-stag           1,126 MB   630 MB   −495     −44%
neeto-chat-web-prod           950 MB     630 MB   −320     −34%
neeto-editor-prod             740 MB     462 MB   −278     −38%
bigbinary-website-stag        1,161 MB   899 MB   −262     −23%
neeto-git-web-prod / stag     851 MB     660 MB   −191     −22%
neeto-tower-web-prod / stag   670 MB     521 MB   −148     −22%

Aggregate so far: ~8.3 GB freed across just 17 apps. The remaining 55 prod/staging apps will follow the same pattern on their next natural deploy; expected fleet-wide reclaim is ~30–60 GB.

Why smaller images matter operationally

Image size isn't just a storage line item. Every byte gets pulled by the kubelet onto every node, on every cold start.

Benefit                       Figure              Detail
Faster pod cold start         ~30 s → ~9 s        Cold image pull (2.8 GB → 870 MB)
Faster horizontal scale-out   ~21 s/pod saved     HPA scales 1 → N replicas faster
Faster rolling deploys        ~21 s × replicas    Each pod rotation is quicker
Lower ECR storage             ~$0.10 / GB-month   ~30–60 GB freed fleet-wide

Compounding effect on traffic spikes: when an app gets a sudden burst, the autoscaler responds faster because new pods come ready in seconds instead of half a minute. Especially impactful for the 3 worst-offender apps (cal-web, record-web, form-web) which each had ~21 s of unnecessary pull latency on every pod startup.
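The figures above can be sanity-checked with a back-of-the-envelope model. The sketch assumes pull time scales linearly with image size at roughly 95 MB/s, a throughput back-solved from the ~30 s quoted for a 2.8 GB image rather than a measured value; real pulls also depend on layer parallelism and decompression. `pullSeconds` and `monthlyStorageUSD` are hypothetical helpers.

```go
package main

import "fmt"

// pullSeconds estimates a cold image pull as size divided by throughput.
func pullSeconds(sizeMB, throughputMBps float64) float64 {
	return sizeMB / throughputMBps
}

// monthlyStorageUSD converts freed storage into a monthly saving.
func monthlyStorageUSD(freedGB, usdPerGBMonth float64) float64 {
	return freedGB * usdPerGBMonth
}

func main() {
	const throughput = 95.0 // MB/s, assumed (inferred from ~30 s for 2.8 GB)

	before := pullSeconds(2829, throughput) // neeto-cal-web-prod before
	after := pullSeconds(870, throughput)   // neeto-cal-web-prod after
	fmt.Printf("pull: %.1f s -> %.1f s (%.1f s saved per pod)\n", before, after, before-after)

	for _, gb := range []float64{30, 60} {
		fmt.Printf("%.0f GB freed = $%.2f per month\n", gb, monthlyStorageUSD(gb, 0.10))
	}
}
```

Under these assumptions the storage saving is a few dollars a month, so the pull-latency reduction is the operationally interesting part.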

What we actually shipped

SHIPPED   Per-phase instrumentation: [Build][phase=…][duration_ms=…] log lines, foundation for everything else
SHIPPED   --previous-image launch reuse: exporter references existing app-image blobs by digest
SHIPPED   lifecycle:0.2 (PR #2): fixed missing return in addOrReuseCacheLayer (duplicate manifest entries bug)
SHIPPED   lifecycle:0.3 (PR #4): 6 SBOM placeholder files, eliminates exporter warnings
SHIPPED   slug-compiler (PR #316): removed no-op --trust-builder, bumped lifecycle ref to 0.3
FLAGSHIP  bundle-install:0.9.0 (PR #12): env-aware --without on both build and launch layers. THE big lever.
SHIPPED   ruby:0.47.21 composite: references bundle-install:0.9.0, propagates fix to every Ruby app
OPS       Fleet-wide tag clear: 250 :cache + 118 prod/staging :latest deleted, forcing fresh clean rebuilds

Tracking issue: neeto-deploy-web#7146

What we evaluated and skipped (honestly)

Option                        Original estimate   Honest estimate   Verdict
zstd compression              30–50 s             8–15 s            SKIP: lifecycle doesn't support it, code change required
Parallel layer export         "cut in half"       15–25 s           SKIP: diminishing returns post-fix, race-condition risk
Bigger build pods (3→6 CPU)   30–60 s             <5 s              SKIP: gzip is single-threaded, no win
Reduce launch-layer count     "modest"            ~3 s/layer        SKIP: UX risk for tiny gain
PVC local layer cache         200–300 s           ~40 s             SKIP: 5–7 days of ops work for 40 s
arm-builds image pre-pull     30–60 s saved       n/a               WRONG: Pack uses podman in-pod, not containerd, so node-level pre-pull is invisible

The bundle-install fix already extracted the easy multi-minute wins. Every remaining option had diminishing returns relative to its implementation cost. Knowing when to stop is part of the work.

Artifacts shipped to ECR

348674388966.dkr.ecr.us-east-1.amazonaws.com/
├── neeto-deploy/paketo/lifecycle:0.2          (cache duplicate-layer fix)
│   ├── 0.2-amd64    sha256:8d6cd7…
│   └── 0.2-arm64    sha256:d79c82…
│
├── neeto-deploy/paketo/lifecycle:0.3          (SBOM placeholder files)
│   └── multi-arch   sha256:17637a…
│
├── neeto-deploy/paketo/buildpack/bundle-install:0.9.0    ← FLAGSHIP
│   ├── 0.9.0-amd64
│   ├── 0.9.0-arm64
│   └── multi-arch   sha256:736355…
│
└── neeto-deploy/paketo/buildpack/ruby:0.47.21
    └── multi-arch   sha256:fffc12…   (composite — pulls bundle-install:0.9.0)
PRs (all merged):
· neeto-deploy-paketo-bundle-install-buildpack #12 — env-aware --without (flagship)
· neeto-deploy-slug-compiler-web #316 — remove --trust-builder, bump lifecycle
· neeto-deploy-lifecycle #4 — SBOM placeholder files (0.3)
· neeto-deploy-lifecycle #2 — cache duplicate-layer fix (0.2)
Tracking issue: neeto-deploy-web #7146

Bottom line

29% faster builds
84% smaller cache
95% smaller build-gems layer
8 apps: 7 PRs merged, 1 open
368 tags cleared fleet-wide
~6 GB freed from just 3 apps

Thanks 🙌

Questions?

Detailed report: pack-build-optimization-results.md
Deck source: github.com/vishal24367/pack-build-optimization-deck
Gist: gist.github.com/vishal24367/e06ad…