Job monitoring

Job monitoring tracks the execution of automated jobs. Each job reports its lifecycle (start, success, failure) through pings to a monitor endpoint. The platform records each execution’s status, start time, duration, and output, and flags runs that do not check in on schedule as missed.

A job can be anything that runs on a schedule or on demand: a cron task, a systemd timer, a CI step, or a one-off command. The monitoring itself is scheduler-agnostic, so the same mechanism works regardless of what triggers the job.

How it works

Each monitored job has a public key and a ping endpoint. A run ping marks the execution as started, and a complete or fail ping marks it as finished. The server computes the duration from the two pings and compares check-ins against the job’s expected schedule to detect missed runs.
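The two server-side calculations described above can be sketched with plain shell arithmetic. This is only an illustration, not the platform's implementation: the timestamps are made-up Unix epoch seconds, and the is_missed helper, the expected interval, and the grace period are all hypothetical.

```shell
# Duration is the gap between the run ping and the closing ping.
run_ping_at=1700000000        # made-up timestamp of the run ping
complete_ping_at=1700000042   # made-up timestamp of the complete ping
duration=$((complete_ping_at - run_ping_at))
echo "duration: ${duration}s"

# Hypothetical missed-run check. The grace period is an assumption;
# the source does not say how much lateness the platform tolerates.
# $1 = last check-in, $2 = expected interval, $3 = grace period, $4 = now
is_missed() {
  [ "$4" -gt $(( $1 + $2 + $3 )) ]
}

# A daily job (86400 s) with a 5-minute grace period, checked 25 hours later.
if is_missed 1700000000 86400 300 1700090000; then
  echo "run missed"
fi
```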

Pings can be sent in two ways: the simob run command wraps a command and reports each state automatically, or a script can call the ping endpoint directly over HTTP.

With the runner

Info

The simob run command requires simob agent version 0.8.0 or newer. The HTTP API works independently of the agent version.

simob run wraps a command and reports its execution lifecycle to the ping endpoint. It sends a run ping before the command starts, and a complete or fail ping when it exits, based on the exit code. The command’s stdout and stderr pass through unchanged, so existing pipes and redirects keep working.

simob run <job-key> -- /usr/local/bin/backup.sh

Inside a crontab, this becomes:

0 0 * * * simob run Vp4s8S0SsnMo -- /usr/local/bin/backup.sh

The same pattern applies to any scheduler.

Capturing output

Add --capture-output to send stdout and stderr to the platform alongside the execution.

simob run Vp4s8S0SsnMo --capture-output -- /usr/local/bin/backup.sh

Output is captured line by line. The terminal still receives the output as normal, because the runner writes to both the capture buffer and the inherited stdout and stderr.
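The dual-write behaviour is similar in spirit to piping a command through tee. The sketch below is an analogy only, not how the runner is implemented: the job function is a stand-in for the wrapped command, and unlike the runner it merges stderr into stdout before copying.

```shell
# Temporary file standing in for the capture buffer.
capture_file=$(mktemp)

# Stand-in for the wrapped command: one line on stdout, one on stderr.
job() {
  echo "backup started"
  echo "warning: slow disk" >&2
}

# 2>&1 merges stderr into the pipe; tee writes each line to the file
# and also passes it through to the terminal.
job 2>&1 | tee "$capture_file"
```

After the pipeline finishes, the same two lines are on the terminal and in the file named by capture_file.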

Exit codes and signals

simob run exits with the same code as the wrapped command. If your script exits with code 2, the runner exits with code 2, so schedulers and wrappers that rely on exit codes behave as expected.

With the HTTP API

The ping endpoint accepts a single GET request per state transition. It requires no authentication header: the job key in the URL identifies the monitor.

GET {api-url}/jobs/p/{key}?state={state}
Parameter  Required  Description
key        yes       The public key of the job, part of the URL path.
state      yes       The execution state. One of run, complete, or fail (case-insensitive).
v          no        Set to 1 to receive the execution ID in the response body.
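As a minimal sketch of how these parameters combine into a URL, the helper below builds a ping URL and rejects unknown states before any request is made. The ping_url name and the api.example.com base URL are hypothetical. Lowercasing the state is optional, since the endpoint is case-insensitive.

```shell
# $1 = API base URL, $2 = job key, $3 = state, $4 = optional "1" to request the execution ID
ping_url() {
  # Normalize the state so RUN, Run, and run all produce the same URL.
  state=$(printf '%s' "$3" | tr '[:upper:]' '[:lower:]')
  case "$state" in
    run|complete|fail) ;;
    *) echo "state must be one of run, complete, or fail" >&2; return 1 ;;
  esac
  url="$1/jobs/p/$2?state=$state"
  # Append v=1 only when the caller asks for the execution ID.
  if [ "${4:-}" = "1" ]; then
    url="$url&v=1"
  fi
  printf '%s\n' "$url"
}

ping_url "https://api.example.com" "abc123" RUN
ping_url "https://api.example.com" "abc123" fail 1
```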

A run ping starts a new execution. If the previous execution is still open (still running), it is marked as timed out before the new one starts. A complete ping closes the open execution as successful, and a fail ping closes it as failed. If a complete or fail ping arrives with no open execution, a synthetic execution is created to record the outcome.
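These transition rules can be modelled as a small state machine. The sketch below is a toy model for reasoning about ping sequences, not the server's code: the apply_ping function and the execution names (exec1, synthetic3) are made up for illustration.

```shell
open_execution=""   # empty means no execution is currently open
closed_log=""       # space-separated list of closed executions
next_id=1

record() {
  closed_log="${closed_log}${closed_log:+ }$1"
}

apply_ping() {
  case "$1" in
    run)
      # A still-open execution is closed as timed out before the new one starts.
      if [ -n "$open_execution" ]; then
        record "$open_execution:timed_out"
      fi
      open_execution="exec$next_id"
      next_id=$((next_id + 1))
      ;;
    complete|fail)
      # With nothing open, a synthetic execution records the outcome.
      if [ -z "$open_execution" ]; then
        open_execution="synthetic$next_id"
        next_id=$((next_id + 1))
      fi
      if [ "$1" = "complete" ]; then outcome="success"; else outcome="failed"; fi
      record "$open_execution:$outcome"
      open_execution=""
      ;;
    *)
      echo "invalid state: $1" >&2
      return 1
      ;;
  esac
}

for ping in run run complete fail; do
  apply_ping "$ping"
done
echo "$closed_log"
# prints: exec1:timed_out exec2:success synthetic3:failed
```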

MONITOR_URL="https://api.simpleobservability.com/jobs/p/<job-key>"
curl "$MONITOR_URL?state=run"
# ... your script logic ...
curl "$MONITOR_URL?state=complete"

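To report a failure in bash, set an ERR trap before the script logic so that a failing command sends a fail ping. Add set -e so the script stops at the failure instead of continuing to the complete ping:

set -e
trap 'curl "$MONITOR_URL?state=fail"' ERR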
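For scripts that should behave like the runner without the agent installed, the run, complete, and fail pings can be folded into one function. The sketch below is one possible arrangement, not an official helper: send_ping and run_monitored are hypothetical names, and treating exit code 0 as success is an assumption of this example.

```shell
MONITOR_URL="https://api.simpleobservability.com/jobs/p/<job-key>"

# Send one ping. A failed or slow ping must never break the job itself,
# so errors are ignored and the request is capped at 10 seconds.
send_ping() {
  curl -fsS --max-time 10 -o /dev/null "$MONITOR_URL?state=$1" || true
}

# Run the given command between a run ping and a complete or fail ping,
# then return the command's own exit code.
run_monitored() {
  send_ping run
  job_status=0
  "$@" || job_status=$?
  if [ "$job_status" -eq 0 ]; then
    send_ping complete
  else
    send_ping fail
  fi
  return "$job_status"
}
```

A script would then call run_monitored /usr/local/bin/backup.sh in place of the separate curl lines. Because the function returns the command's exit code, a scheduler sees the same result it would see without monitoring.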

Adding v=1 returns the execution ID in the response, which lets you correlate captured logs or external events with a specific run:

curl "$MONITOR_URL?state=run&v=1"
# {"execution": "d3b1..."}
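To use the execution ID in a script, it has to be pulled out of the response body. The sketch below assumes the body looks like the example above, a JSON object with a single execution field. The extract_execution_id helper and the sample ID are made up, and a JSON tool such as jq would be more robust than sed if it is available.

```shell
# Read a response body on stdin and print the value of the "execution" field.
extract_execution_id() {
  sed -n 's/.*"execution"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Sample body with a made-up ID, standing in for a real response.
sample='{"execution": "example-id-123"}'
execution_id=$(printf '%s\n' "$sample" | extract_execution_id)
echo "execution: $execution_id"
# prints: execution: example-id-123
```

In a real script the sample would be replaced by the output of curl -fsS "$MONITOR_URL?state=run&v=1".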

The endpoint returns 400 if the state parameter is missing or invalid, listing the accepted states in the error message. It returns 404 if no job matches the given key.
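A script can branch on these status codes rather than ignoring the response. The sketch below is a hedged example: ping_status and explain_ping_status are hypothetical helper names, and treating any 2xx code as success is an assumption, since the source does not state the success code.

```shell
# Fetch only the HTTP status code of a ping; the response body is discarded.
ping_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# Map a status code to a short explanation.
explain_ping_status() {
  case "$1" in
    2??) echo "ping accepted" ;;
    400) echo "state parameter is missing or not one of run, complete, fail" ;;
    404) echo "no job matches this key" ;;
    000) echo "no HTTP response (network or DNS problem)" ;;
    *)   echo "unexpected status $1" ;;
  esac
}

explain_ping_status 404
```

Calling explain_ping_status "$(ping_status "$MONITOR_URL?state=run")" checks a live monitor. That request is a real run ping, so it starts an execution.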

Tip

You can mix both approaches. Use simob run on servers where the agent is installed, and fall back to curl for jobs running elsewhere.

Troubleshooting

Command not found

simob run requires a -- separator before the command, after the job key and any flags such as --capture-output. Everything after the -- is treated as the command to run. If the runner reports that no command was provided, check that the -- is present.
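The role of the separator can be illustrated with a small argument parser. This is a hypothetical stand-in for reasoning about the rule, not simob's actual parsing code, and the command_after_separator name and its error text are invented for the example.

```shell
# Print everything after the first "--", or fail if there is no command.
command_after_separator() {
  while [ "$#" -gt 0 ]; do
    if [ "$1" = "--" ]; then
      shift
      # A "--" with nothing after it still means no command was given.
      if [ "$#" -eq 0 ]; then
        break
      fi
      printf '%s\n' "$*"
      return 0
    fi
    shift
  done
  echo "no command provided" >&2
  return 1
}

# With the separator, the command is found after the key and flags.
command_after_separator run KEY --capture-output -- /usr/local/bin/backup.sh --full
# prints: /usr/local/bin/backup.sh --full
```

Without the separator, the same arguments give the parser no way to tell where the runner's own options end, so it reports that no command was provided.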

Ping errors

If the ping endpoint returns 404, the job key in the URL does not match an existing monitor. Verify the key from the dashboard. A 400 response means the state parameter is missing or not one of run, complete, or fail.