300 Commits

Author SHA1 Message Date
Claude
cc46a915dd refactor: complete removal of buildkit and sticky disk management
This commit completes the refactoring of build-push-action to focus solely on
Docker build reporting and metrics, with all infrastructure management moved
to the separate setup-docker-builder action.

Changes:
- Remove all setupOnly references from context.ts, main.ts, and state-helper.ts
- Rename startBlacksmithBuilder to reportBuildMetrics to better reflect its purpose
- Remove exposeId from all function signatures and state management
- Remove sticky disk commit logic from reporter.ts
- Update tests to match new function names and signatures
- Clean up unused imports and fix linting issues

The action now assumes that a Docker builder has already been configured
(either via setup-docker-builder or existing setup) and focuses only on:
- Running Docker builds with the configured builder
- Reporting build metrics and status to Blacksmith API
- Managing build outputs and metadata

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-01 14:25:57 -04:00
Claude
877a04de98 refactor: Remove sticky disk management while keeping build reporting
- Remove sticky disk mounting and unmounting logic
- Remove sticky disk commit logic from both main and post actions
- Replace setupStickyDisk with reportBuildStart to only report build start
- Update build completion reporting to not depend on exposeId
- Keep build tracking and reporting functionality intact

The sticky disk lifecycle is now fully managed by setup-docker-builder
2025-08-01 14:10:11 -04:00
Claude
7894682343 refactor: Remove buildkit management from build-push-action
- Remove buildkitd startup and configuration logic
- Remove buildkitd shutdown and cleanup from both main and post actions
- Remove buildkitd-related imports and helper functions
- Update startBlacksmithBuilder to check for existing builder from setup-docker-builder
- Keep sticky disk setup and build reporting functionality intact

BREAKING CHANGE: This action now requires setup-docker-builder to be run first to manage the Docker builder lifecycle
2025-08-01 14:06:43 -04:00
Claude
f9f71c9f11 src: only prune if buildkitd was spun up 2025-06-17 14:36:14 -04:00
Claude
a037e6f634 src: use BLACKSMITH prefixed VM ID env var 2025-06-16 16:12:28 -04:00
Claude
616bee01ad test: fix platform test to work on both ARM and AMD runners
The test was hardcoded to expect arm64 platform, causing failures
on AMD runners. Now checks actual host architecture dynamically.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-06-11 13:54:31 -04:00
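
A minimal sketch of the dynamic check this commit describes, assuming the test derives the expected Docker platform from the architecture string Node.js reports (`process.arch`). The helper name `expectedPlatform` and the mapping table are illustrative and are not taken from the repository's test code.

```typescript
// Hypothetical helper: map a Node.js architecture string (as reported by
// process.arch) to the Docker platform a test should expect on that host.
const ARCH_TO_PLATFORM: Record<string, string> = {
  x64: 'linux/amd64',
  arm64: 'linux/arm64'
};

function expectedPlatform(nodeArch: string): string {
  const platform = ARCH_TO_PLATFORM[nodeArch];
  if (platform === undefined) {
    // Fail loudly instead of silently assuming one architecture.
    throw new Error(`unsupported architecture: ${nodeArch}`);
  }
  return platform;
}
```

A test would then compare the detected platform against `expectedPlatform(process.arch)` rather than a hardcoded arm64 value.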
Claude
28c244705c *: allow users to pass in a buildx version 2025-05-30 12:31:51 -04:00
Claude
9dbab7fbd2 src: add a retry with backoff to combat 429s when downloading buildkit 2025-05-18 16:22:27 -04:00
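
One way to sketch a retry with backoff for rate-limited downloads is shown below. The function names, the error shape, and the delay values are assumptions for illustration; the commit does not state the actual parameters it uses.

```typescript
// Hypothetical error shape: anything carrying an HTTP status code.
interface HttpError {
  status?: number;
}

// Exponential backoff: baseMs, 2*baseMs, 4*baseMs, ... capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

const sleep = (ms: number): Promise<void> =>
  new Promise(resolve => setTimeout(resolve, ms));

// Retry `download` only when it fails with HTTP 429; rethrow anything else,
// and rethrow the 429 itself once maxAttempts has been reached.
async function downloadWithRetry<T>(
  download: () => Promise<T>,
  maxAttempts = 5
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await download();
    } catch (err) {
      const status = (err as HttpError | undefined)?.status;
      const isLastAttempt = attempt === maxAttempts - 1;
      if (status !== 429 || isLastAttempt) {
        throw err;
      }
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

Restricting the retry to 429 responses keeps genuine failures (for example a 404 on a bad version string) from being retried pointlessly.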
Claude
1868624b97 src: add ping before get stickydisk 2025-05-16 13:41:46 -04:00
Claude
e84bc1a88e src: more debug logs 2025-05-14 14:04:56 -04:00
Claude
41a36ac067 src: print the port bpa is trying to hit 2025-05-14 13:43:57 -04:00
Claude
296109dd1e src: only commit stickydisk in post step if in setup-only
First, this fixes a bug where we tried to commit in the post
step even if we had already committed at the end of the main step in
a non-setup-only invocation.

Second, if the action is canceled before the exposeID is set in the main
process, we don't want to send a commit request with an empty exposeID.
2025-04-29 17:01:42 -04:00
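
The two conditions in this commit can be sketched as a single guard. The `PostStepState` shape and the function name are hypothetical; only the two rules themselves come from the commit message.

```typescript
// Hypothetical snapshot of the state the post step can read.
interface PostStepState {
  setupOnly: boolean; // the action was invoked in setup-only mode
  exposeId: string; // empty if the main step was canceled before it was set
}

// Decide whether the post step should send a sticky disk commit request.
function shouldCommitInPostStep(state: PostStepState): boolean {
  // Non-setup-only runs already committed at the end of the main step.
  if (!state.setupOnly) {
    return false;
  }
  // Never send a commit request with an empty expose ID.
  return state.exposeId !== '';
}
```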
Claude
c80185915d src: move buildkit prune to cleanup stage and invoke it inline
Previously, we were firing off an async buildkit prune to clean
up layers unused in 14 days. This changes that to clean up layers
unused in 7 days and fires it off inline on cleanup. It just seems
easier to reason about that way.
2025-04-22 16:31:23 -04:00
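
A rough sketch of how the 7-day window might be turned into prune arguments, assuming the prune is issued through `buildctl prune` with its `--keep-duration` flag. The commit does not show the actual command, and the helper names and the example address are hypothetical.

```typescript
// Convert a retention window in days to an hour-based duration string,
// e.g. 7 -> "168h".
function keepDuration(days: number): string {
  return `${days * 24}h`;
}

// Assumed invocation: buildctl --addr <addr> prune --keep-duration <duration>
function pruneArgs(buildkitAddr: string, days = 7): string[] {
  return ['--addr', buildkitAddr, 'prune', '--keep-duration', keepDuration(days)];
}
```

Running the command inline would mean awaiting it during cleanup instead of spawning it as a detached background task, so any failure surfaces in the same step.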
Claude
11ec21ffed src: use port from env 2025-04-15 18:23:28 -07:00
Claude
ab514e31b5 *: introduce a setup-only mode to the build-push-action
This setup-only mode will set up a Docker builder with the stickydisk
mounted but will not run a Docker build. The use case here is to allow
customers to then run their custom Tilt files or Docker commands against
our builder. The other subtle change is that we only clean up in the post
step of this builder action. It remains to be seen whether you can start several
of these builders at the same time in a workflow, but we can address that as a
follow-up.
2025-04-14 16:36:36 -07:00
Aayush Shah
f8d1c2e2ae *: normalize file paths in all cases (#104) 2025-03-06 17:24:56 -05:00
Aditya Maru
6fd13769ac src: disable native multi-arch builds 2025-03-04 15:53:15 -05:00
Aditya Maru
feb3751245 src: only log fatal errors in tailscale teardown 2025-03-03 22:55:54 -05:00
Aditya Maru
4a3e86e9c9 src: add scaffolding to support multi-platform builds 2025-02-17 05:25:52 +05:30
Aayush
1390f95565 *: bind to localhost over TCP instead of using a unix socket 2025-02-10 23:06:21 -05:00
Aditya Maru
2331ad873b src: add sync before umount 2025-01-21 19:34:23 -05:00
Aditya Maru
f440133b20 wip 2025-01-10 15:52:55 -05:00
Aayush
8554acbf59 src: prevent path duplication when dockerfile is within context 2025-01-09 10:03:58 -05:00
Aditya Maru
5ac445ae84 src: fix error message 2025-01-08 07:14:25 -05:00
Aayush
0e4788906e src: bump buildkit startup timeout to 30sec 2025-01-07 21:18:32 -05:00
Aayush Shah
d8a061af73 src: update timeout on setupStickyDisk (#91) 2025-01-01 15:09:21 -05:00
Aditya Maru
34ea2f79e5 src: change warning to debug 2025-01-01 13:16:46 +04:00
Aayush Shah
4ed3ba5c73 src: ignore unset sentinel value for tailscale token (#89) 2025-01-01 02:05:30 -05:00
Aditya Maru
42b59d67c9 src: bump timeout from 30s to 45s 2025-01-01 09:25:31 +04:00
Aayush Shah
c03b613806 use local dockerfile path over git context (#86) 2024-12-31 13:08:49 -05:00
Aditya Maru
aa6b213b0b src: join and leave tailnet on start and cleanup of builder 2024-12-31 15:52:49 +04:00
Aditya Maru
9fdeb57c53 src: disable automatic buildkit GC
We have reason to believe that automatic GC is affecting
daemon startup times. In this patch we disable automatic GC
and instead rely on manual pruning of the buildkit cache.
Once the daemon is ready we spawn an async task to run prune
on any objects older than 14 days. We are already managing the
Ceph volume approaching its size limit ourselves in the VM
Agent.

Patch also adds some alerting when inode usage is high on a mountpoint.
2024-12-23 09:15:34 -05:00
Aditya Maru
61713d1849 src: print api url in debug info 2024-12-21 23:42:52 -05:00
Aditya Maru
6fe2467492 src: silence metric warnings for now 2024-12-21 23:12:08 -05:00
Aditya Maru
4759d93c12 src: use the plumbed BLACKSMITH_BACKEND_URL if present 2024-12-21 12:08:11 -05:00
Aditya Maru
def1585067 *: report metrics to the VM agent 2024-12-20 17:43:40 -05:00
Aditya Maru
4723a2a346 src: stop spurious warnings on buildkit shutdown 2024-12-19 19:04:07 -05:00
Aditya Maru
1672d6fbad src: fix shutdown retry behavior 2024-12-19 13:04:09 -05:00
Aditya Maru
9302d2aea9 src: stop running process as nohup to avoid missing logs 2024-12-19 12:44:35 -05:00
Aditya Maru
ac42783fa9 src: cleanup flakiness in different parts of the action 2024-12-18 09:58:15 -05:00
Aditya Maru
54bc4e0788 src: refactor cleanup logic to expose buildkitd.log
Previously, we only killed the buildkitd process and unmounted
if builderInfo was non-null. This was wrong because we could have set up
buildkitd but failed after that step. This would then rely on the last-ditch
effort by the post action to clean up. We now change the proc kill
and unmount to happen on any build error.
2024-12-16 19:25:47 -05:00
Aditya Maru
d43ee61bb7 *: move to grpc backed communication for the agent 2024-12-16 15:29:30 -05:00
Aditya Maru
53000f0f59 ignore error when nothing is mounted 2024-12-15 17:16:24 -05:00
Aditya Maru
1df1b3c361 src: ignore error when there's nothing mounted 2024-12-13 12:32:05 -05:00
Aditya Maru
de0451e517 src: make post unmount even if buildkitd is no longer present
Also increase retries when trying to unmount the buildkit directory.
We now retry for up to 3 seconds; previously we were only retrying 3 times
with a 100ms backoff.
2024-12-10 21:26:18 -05:00
Aditya Maru
0f99a0b1c7 src: start sending get request with query params
We are incorrectly using formData in a GET request. To move
away from this, we send both query params and formData until
the server is fully upgraded, after which we can stop sending
formData.
2024-12-09 13:01:35 -05:00
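
The transitional approach in this commit, sending the same fields both as query params and as form data, might look roughly like the sketch below. The `DualGetRequest` type, the function name, and the field names in the usage note are placeholders; the real request fields are not listed in the commit.

```typescript
// Hypothetical request description used during the migration window.
interface DualGetRequest {
  url: string; // carries the fields as query params (the target behavior)
  formFields: Record<string, string>; // same fields for servers not yet upgraded
}

function buildDualGetRequest(
  baseUrl: string,
  fields: Record<string, string>
): DualGetRequest {
  const query = Object.entries(fields)
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join('&');
  const url = query === '' ? baseUrl : `${baseUrl}?${query}`;
  // Copy the fields so later mutation of the input cannot alter the request.
  return { url, formFields: { ...fields } };
}
```

For example, `buildDualGetRequest('http://localhost/stickydisks', { region: 'us east' })` yields a URL ending in `?region=us%20east` alongside the unchanged form fields. Once the server reads only query params, the `formFields` half can be dropped.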
Aditya Maru
0186286e06 *: use axios-retry instead of handrolled retry methods 2024-12-09 13:01:20 -05:00
Aayush Shah
7b8642822f src: make getDockerfilePath return the full path to the dockerfile (#64)
Previously we were just returning the path to the dir containing the dockerfile
in most cases.
2024-12-09 12:20:46 -05:00
Aditya Maru
f06a558c36 src: alert if an exception is thrown on cleanup 2024-12-08 19:21:46 -05:00
Aditya Maru
b76cd7bf3b src: fix bug in conditional that zero'd out expose ID 2024-12-08 18:44:36 -05:00