Exit gateway configuration failures with EX_CONFIG and teach generated systemd units not to restart on that exit status.\n\nCo-authored-by: neo1027144-creator <neo1027144-creator@users.noreply.github.com>
The test previously asserted that a valid snapshot without gateway.mode
blocks startup. After defaulting gateway.mode to 'local' when unset,
the gateway should start successfully in this scenario — update the
test to verify the new expected behavior.
Co-authored-by: Brad Groux <bradgroux@users.noreply.github.com>
After v2026.3.24 introduced a gateway.mode guard, startup fails on
Windows (and other platforms) when the config file exists but doesn't
contain an explicit gateway.mode value. This happens after 'openclaw
onboard' writes a minimal config without gateway settings.
Default to 'local' when the mode is unset, restoring pre-3.24 behavior
where the gateway started without requiring an explicit mode.
Fixes#54801
Co-authored-by: Brad Groux <bradgroux@users.noreply.github.com>
Fix stale lock files from crashed gateway processes blocking new invocations on Windows/macOS. Detect PID recycling to avoid false positive lock conflicts, and add startup progress indicator.
Thanks @TonyDerek-dot
- Release gateway lock when in-process restart fails, so daemon
restart/stop can still manage the process (Codex P2)
- P1 (env mismatch) already addressed: best-effort by design, documented
in JSDoc
- Remove dead 'return false' in runServiceStart (Greptile)
- Include stack trace in run-loop crash guard error log (Greptile)
- Only catch startup errors on subsequent restarts, not initial start (Codex P1)
- Add JSDoc note about env var false positive edge case (Codex P1)
When an in-process restart (SIGUSR1) triggers a config-triggered restart
and the new config is invalid, params.start() throws and the while loop
exits, killing the process. On macOS this loses TCC permissions.
Wrap params.start() in try/catch: on failure, set server=null, log the
error, and wait for the next SIGUSR1 instead of crashing.
When a config-change restart hits the force-exit timeout, exit with
code 1 instead of 0 so launchd/systemd treats it as a failure and
triggers a clean process restart. Stop-timeout stays at exit(0)
since graceful stops should not cause supervisor recovery.
Closes#36822