Before this fix, devcontainer.status was a read-only DB query that
returned whatever state the row currently held. The state only flips
provisioning→running via touchActivity() inside execInDevContainer.
That created a deadlock: the AI polls devcontainer.status waiting
for 'running', but the row cannot flip until something execs in the
container, and polling status never does.
Caught live in the smoke test on 2026-05-01 (manifest project): the
AI fired devcontainer.status three times in a row, hit the loop
guard, and surfaced the dead-end to the user.
Two fixes:
1. getDevContainerStatus() now does a cheap 'true' exec probe when
   the row says 'provisioning'. If the probe lands, it flips the
   row to 'running' via touchActivity and reports selfHealed=true.
   If the probe fails AND the row is older than 120s, it reports
   likelyFailed=true so callers can stop polling and escalate.
   It also returns ageSeconds so the AI can reason about wait
   windows. Coolify's own service status is not used because dev
   containers have no fqdn/healthcheck, so Coolify reports
   running:unknown for them indefinitely.
2. A new error-recovery rule, 'devcontainer-still-provisioning',
   fires whenever a status response contains state:'provisioning'.
   It tells the AI to send one status message, wait 15s, and prefer
   shell.exec (which lazy-provisions and proves reachability) over
   another devcontainer.status call. Explicit antipattern: do not
   poll status in a tight loop.
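
For reviewers, a minimal sketch of the decision logic described in
fix 1. It is an illustration under stated assumptions, not the code in
this commit: statusWithProbe, Row, StatusReport, probe, markRunning
and createdAtMs are hypothetical names, the probe and the
touchActivity stand-in are injected so the sketch runs without a DB or
container, and "older than 120s" is assumed to mean age since the row
was created.

```typescript
// Hypothetical shapes for illustration; only the field names selfHealed,
// likelyFailed and ageSeconds come from the commit message.
type State = "provisioning" | "running";

interface Row {
  state: State;
  createdAtMs: number; // assumption: row creation time, epoch milliseconds
}

interface StatusReport {
  state: State;
  ageSeconds: number;
  selfHealed: boolean;
  likelyFailed: boolean;
}

// Threshold from the commit message: a failed probe on a row older than
// 120s is reported as a likely failure.
const LIKELY_FAILED_AFTER_SECONDS = 120;

async function statusWithProbe(
  row: Row,
  probe: () => Promise<boolean>,    // assumption: true if the `true` exec succeeded
  markRunning: () => Promise<void>, // assumption: stands in for touchActivity
  nowMs: number = Date.now(),
): Promise<StatusReport> {
  const ageSeconds = Math.floor((nowMs - row.createdAtMs) / 1000);
  const report: StatusReport = {
    state: row.state,
    ageSeconds,
    selfHealed: false,
    likelyFailed: false,
  };

  // Only a 'provisioning' row is probed; any other state is returned as stored.
  if (row.state !== "provisioning") return report;

  // A probe that throws is treated the same as a probe that fails.
  let reachable = false;
  try {
    reachable = await probe();
  } catch {
    reachable = false;
  }

  if (reachable) {
    // The container answered, so persist the flip and report the self-heal.
    await markRunning();
    return { ...report, state: "running", selfHealed: true };
  }

  // Unreachable: flag a likely failure only once the row is old enough.
  return { ...report, likelyFailed: ageSeconds > LIKELY_FAILED_AFTER_SECONDS };
}
```

A caller would stop polling when likelyFailed is true and otherwise
use ageSeconds to decide how long to keep waiting.
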
Co-authored-by: Cursor <cursoragent@cursor.com>