Scheduled jobs and APScheduler

Published 2026-04-28 · Updated 2026-05-18

Periodic tasks show up in any backend: nightly aggregates, external data collection, expired-token cleanup. At small scale, cron or an in-process scheduler is enough; at larger scale, distributed task queues and dedicated workers take over.

1. About APScheduler

APScheduler is a Python library started by Alex Grönholm and reportedly first published around 2008 — a long-standing project. It is an in-process scheduler that bundles cron expressions, intervals, and date triggers behind a single API.

| Trigger | Meaning |
| --- | --- |
| cron | Cron expressions like "every day at 03:00." |
| interval | Fixed intervals like "every 30 seconds." |
| date | A one-shot at a specific time. |
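
To make the three trigger types concrete, here is a minimal standard-library sketch of how each one might pick its next fire time. The function names are hypothetical and this is not APScheduler's implementation; it only illustrates the meaning of each row.

```python
from datetime import datetime, timedelta
from typing import Optional


def next_daily(now: datetime, hour: int, minute: int) -> datetime:
    """Cron-style 'every day at hour:minute': first such time strictly after now."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    return candidate if candidate > now else candidate + timedelta(days=1)


def next_interval(now: datetime, start: datetime, seconds: int) -> datetime:
    """Interval-style: first start + k * seconds strictly after now."""
    if now < start:
        return start
    elapsed = (now - start).total_seconds()
    periods = int(elapsed // seconds) + 1
    return start + timedelta(seconds=periods * seconds)


def next_date(now: datetime, run_at: datetime) -> Optional[datetime]:
    """Date-style one-shot: fires once, then never again."""
    return run_at if run_at > now else None


now = datetime(2026, 4, 28, 12, 0, 10)
daily = next_daily(now, hour=3, minute=0)
tick = next_interval(now, start=datetime(2026, 4, 28, 12, 0, 0), seconds=30)
once = next_date(now, run_at=datetime(2026, 4, 28, 11, 0, 0))
print(daily, tick, once)
```

The daily job has already passed 03:00 today, so it lands on the next day; the one-shot is in the past, so it has nothing left to fire.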

There are several scheduler types: BlockingScheduler (occupies the main thread), BackgroundScheduler (its own thread), AsyncIOScheduler (event loop), and others. Job stores can be in-memory (the default) or persistent (SQLAlchemy, MongoDB, Redis); a persistent store lets jobs be restored after a restart.

2. Triggers and jobs

from apscheduler.schedulers.background import BackgroundScheduler

sched = BackgroundScheduler(timezone='Asia/Seoul')

@sched.scheduled_job('cron', hour=3, minute=0, id='daily-aggregate')
def daily_aggregate():
    ...

@sched.scheduled_job('interval', seconds=30, id='heartbeat')
def heartbeat():
    ...

sched.start()

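Giving each job a stable id lets the scheduler recognize it across restarts. The scheduled_job decorator always registers with replace_existing=True, so when the code changes while state is preserved in a persistent job store, the job with the same id is updated in place instead of being registered a second time. With add_job, pass replace_existing=True explicitly; otherwise re-adding an existing id is rejected as a conflict.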

3. Single-instance assumption and idempotency

When several processes share the same job store, APScheduler does not coordinate between them, so the same job can fire in more than one process unless a separate distributed lock is added. Running the scheduler in a single worker is the simplest assumption. When multiple workers are needed, choose one of:

  • Job store + DB row locks so only one worker runs at a time.
  • Redis-based distributed lock (algorithms like Redlock).
  • An operational convention of "the scheduler runs on exactly one instance."

By design, assume the same job may run twice and write the body idempotently (the same input gives the same result, or a safe no-op).
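
One way to read "the same input gives the same result" is a write keyed by the business date, so a second run overwrites instead of duplicating. A minimal sketch with sqlite3; the table name daily_totals and the function write_daily_total are hypothetical.

```python
import sqlite3
from datetime import date


def write_daily_total(conn: sqlite3.Connection, day: date, total: int) -> None:
    """Store the aggregate for `day`; re-running overwrites instead of duplicating."""
    # INSERT OR REPLACE relies on the PRIMARY KEY on `day`.
    conn.execute(
        "INSERT OR REPLACE INTO daily_totals (day, total) VALUES (?, ?)",
        (day.isoformat(), total),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_totals (day TEXT PRIMARY KEY, total INTEGER NOT NULL)")

# Simulate the same job firing twice for the same date.
write_daily_total(conn, date(2026, 4, 28), 120)
write_daily_total(conn, date(2026, 4, 28), 120)

rows = conn.execute("SELECT day, total FROM daily_totals").fetchall()
print(rows)  # [('2026-04-28', 120)]
```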

4. misfire and coalesce

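These two options decide what happens to scheduled times that pass while the worker is down or busy.

  • misfire_grace_time: how many seconds late a run may start and still be executed.
  • coalesce: whether to collapse accumulated executions into one.

In APScheduler 3.x, misfire_grace_time defaults to 1 second, so a run that starts more than a second late is skipped. Set both options explicitly, per job or through the scheduler's job_defaults, so that a backlog neither disappears silently nor strains the system when the worker comes back.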
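
The interaction of the two options is easier to see with numbers. The following is a simplified model written with the standard library, not APScheduler's code; runs_to_execute is a hypothetical helper that collapses the backlog first and then applies the grace time.

```python
from datetime import datetime, timedelta
from typing import List


def runs_to_execute(
    missed: List[datetime],
    now: datetime,
    misfire_grace_time: timedelta,
    coalesce: bool,
) -> List[datetime]:
    """Return the scheduled times that would still run when the worker wakes at `now`."""
    due = sorted(t for t in missed if t <= now)
    if coalesce and due:
        due = due[-1:]  # collapse the backlog into the most recent run
    # A run is dropped as a misfire if it is later than the grace time allows.
    return [t for t in due if now - t <= misfire_grace_time]


now = datetime(2026, 4, 28, 12, 0, 0)
# A 30-second job whose worker was down for the last two minutes.
missed = [now - timedelta(seconds=s) for s in (120, 90, 60, 30, 0)]

all_runs = runs_to_execute(missed, now, timedelta(seconds=300), coalesce=False)
collapsed = runs_to_execute(missed, now, timedelta(seconds=300), coalesce=True)
short_grace = runs_to_execute(missed, now, timedelta(seconds=45), coalesce=False)
print(len(all_runs), len(collapsed), len(short_grace))  # 5 1 2
```

With a generous grace time and no coalescing, all five missed runs fire at once, which is the "strain" case; coalescing reduces that to one run, and a short grace time drops everything older than 45 seconds.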

5. Other tools

| Tool | First appeared | Model |
| --- | --- | --- |
| cron (Unix) | 1975 | OS-level scheduler. The simplest. |
| systemd timers | 2010s | systemd's cron replacement. Single host. |
| APScheduler | 2008 | Python in-process. |
| Celery | 2009, Ask Solem | Python distributed task queue (broker: RabbitMQ/Redis). celery beat schedules. |
| RQ | 2012 | Python + Redis. Simpler than Celery. |
| Sidekiq | 2012, Mike Perham | Ruby + Redis. A standard for large-scale operation. |
| BullMQ | 2018 (formerly Bull) | Node + Redis. |
| Quartz | 2001 | Long-standing JVM scheduler. Standard Spring integration. |
| Temporal | 2019 (Cadence fork) | Workflow engine. State, retries, and timers are first-class. |
| AWS EventBridge / GCP Cloud Scheduler | late 2010s | Managed cron. |

One axis of choice is "must execution be distributed across hosts?" If it must, a queue-and-worker model fits; if a single host is enough, an in-process scheduler works.

6. Combining with FastAPI

from contextlib import asynccontextmanager
from fastapi import FastAPI
from apscheduler.schedulers.asyncio import AsyncIOScheduler

sched = AsyncIOScheduler()

@asynccontextmanager
async def lifespan(app: FastAPI):
    sched.start()
    yield
    sched.shutdown(wait=False)

app = FastAPI(lifespan=lifespan)

The lifespan context manager ties the scheduler's lifetime to the app's: the scheduler starts before the app begins serving and shuts down when the app stops.
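
The ordering can be checked without FastAPI or APScheduler installed. In this sketch StubScheduler is a hypothetical stand-in that only records calls, and the body of the async with block stands in for the serving phase. The try/finally is an addition of this sketch: it makes sure shutdown runs even if an exception reaches the context manager.

```python
import asyncio
from contextlib import asynccontextmanager


class StubScheduler:
    """Stand-in for a real scheduler; records calls instead of running jobs."""

    def __init__(self) -> None:
        self.events = []

    def start(self) -> None:
        self.events.append("start")

    def shutdown(self, wait: bool = True) -> None:
        self.events.append("shutdown")


sched = StubScheduler()


@asynccontextmanager
async def lifespan(app: object):
    sched.start()
    try:
        yield
    finally:
        # Runs even if the body raises, so the scheduler is not left running.
        sched.shutdown(wait=False)


async def main() -> None:
    try:
        async with lifespan(app=None):
            sched.events.append("serving")
            raise RuntimeError("simulated crash while serving")
    except RuntimeError:
        pass


asyncio.run(main())
print(sched.events)  # ['start', 'serving', 'shutdown']
```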

7. Guards in job bodies

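from datetime import date

def daily_aggregate():
    # already_done, do_work, and mark_done are application-specific helpers.
    today = date.today()  # read once so a run crossing midnight uses one date
    if already_done(today):
        return
    do_work()
    mark_done(today)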

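This shape is the starting point of idempotency. For runs that happen one after another, a simple DB flag ensures that waking up twice for the same date does the work only once. Two runs that overlap can both pass the check before either sets the flag, so concurrent workers also need a unique constraint or a lock.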
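
A check followed by a later write leaves a window in which two workers can both see "not done". One possible tightening is to claim the date with a single insert against a unique key, so the database decides the winner. A minimal sqlite3 sketch; the job_runs table and try_claim helper are hypothetical.

```python
import sqlite3
from datetime import date


def try_claim(conn: sqlite3.Connection, job_id: str, day: date) -> bool:
    """Return True for exactly one caller per (job_id, day)."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO job_runs (job_id, run_date) VALUES (?, ?)",
        (job_id, day.isoformat()),
    )
    conn.commit()
    return cur.rowcount == 1  # 0 means the row already existed


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE job_runs (job_id TEXT, run_date TEXT, PRIMARY KEY (job_id, run_date))"
)

work_done = []


def daily_aggregate(day: date) -> None:
    if not try_claim(conn, "daily-aggregate", day):
        return  # another run already claimed this date
    work_done.append(day)  # placeholder for the real work


daily_aggregate(date(2026, 4, 28))
daily_aggregate(date(2026, 4, 28))
print(len(work_done))  # 1
```

The trade-off is that the claim happens before the work: if the process crashes mid-run, the date stays claimed, so a real implementation would also need a status column or a way to release stale claims.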

8. Distributed locks (Redis)

When several instances run the scheduler and at most one should execute a job at a time, the Redis SET key value NX EX <ttl> pattern is common: the key is set only if it does not already exist, and it expires after the TTL. The Redlock debate, kicked off by Martin Kleppmann's article and the response from Redis author antirez (2016), is well known. When strong consistency is required, consensus-based stores such as ZooKeeper or etcd, or DB row locks, are generally seen as more appropriate.
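
The acquire and release steps of the SET NX EX pattern can be sketched as follows. To keep the example runnable without a Redis server, FakeRedis is an in-memory stand-in whose set keyword arguments are modeled on redis-py's; delete_if_equal is a hypothetical helper, and with real Redis that compare-and-delete has to be atomic (commonly done with a small Lua script).

```python
import time
import uuid
from typing import Dict, Optional, Tuple


class FakeRedis:
    """In-memory stand-in that mimics only the calls used below."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[str, float]] = {}

    def set(self, key: str, value: str, nx: bool = False, ex: Optional[int] = None):
        now = time.monotonic()
        current = self._data.get(key)
        if current is not None and current[1] <= now:
            del self._data[key]  # the old entry has expired
            current = None
        if nx and current is not None:
            return None  # key exists, NX refuses to overwrite
        expires = now + ex if ex is not None else float("inf")
        self._data[key] = (value, expires)
        return True

    def delete_if_equal(self, key: str, value: str) -> bool:
        current = self._data.get(key)
        if current is not None and current[0] == value:
            del self._data[key]
            return True
        return False


def acquire(client, key: str, ttl_seconds: int) -> Optional[str]:
    """Try to take the lock; return an owner token on success, else None."""
    token = uuid.uuid4().hex
    if client.set(key, token, nx=True, ex=ttl_seconds):
        return token
    return None


def release(client, key: str, token: str) -> bool:
    """Release only if we still own the lock, so we never delete someone else's."""
    return client.delete_if_equal(key, token)


r = FakeRedis()
first = acquire(r, "lock:daily-aggregate", ttl_seconds=60)
second = acquire(r, "lock:daily-aggregate", ttl_seconds=60)
released_by_stranger = release(r, "lock:daily-aggregate", "not-the-owner")
released_by_owner = release(r, "lock:daily-aggregate", first)
print(first is not None, second is None, released_by_stranger, released_by_owner)
```

The random token is what makes release safe: a worker whose lock already expired cannot delete the lock a different worker acquired afterwards.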

9. Common pitfalls

Auto-reload during development: modes like uvicorn --reload run a reloader process alongside the server process, so the scheduler can end up started twice and jobs registered twice. Either disable the scheduler in development, or give every job a stable id and pass replace_existing=True when adding it.

Missing timezone configuration: without an explicit timezone, APScheduler falls back to the host's local timezone, which is often UTC in containers, so schedule times drift from intent. Specify timezone when constructing the scheduler.

Long-running job overlapping the next trigger: the next run comes due before the previous one finishes. max_instances caps concurrent runs of one job (at 1, the overlapping run is skipped), and coalesce=True collapses any backlog into a single run.

Lock TTL shorter than the work duration: when the work outlives the lock's TTL, the lock expires mid-run and another worker can acquire it, so the job runs twice. Measure the distribution of work durations and set the TTL conservatively.
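
For the last pitfall, one rough way to turn measured durations into a TTL is to take a high percentile and multiply by a safety factor. This is only a sketch: suggest_lock_ttl, the 99th percentile, and the factor of 2 are assumptions for illustration, not a recommendation from any library.

```python
import math
import statistics
from typing import Sequence


def suggest_lock_ttl(durations_s: Sequence[float], safety_factor: float = 2.0) -> int:
    """Suggest a lock TTL in seconds from observed run durations."""
    # 99th percentile of the observed durations (needs at least two samples).
    p99 = statistics.quantiles(durations_s, n=100, method="inclusive")[98]
    return math.ceil(p99 * safety_factor)


# Hypothetical measurements in seconds, including one slow outlier.
durations = [12.0, 14.5, 13.2, 15.1, 40.0, 13.8, 12.9, 14.0]
ttl = suggest_lock_ttl(durations)
print(ttl)
```

A TTL derived this way still cannot cover a run that is slower than anything measured, which is why the job body should stay idempotent even with a lock in place.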

Closing thoughts

An in-process scheduler is an efficient stopping point before bigger systems become necessary. Starting with a few lines of APScheduler and holding to idempotent job bodies plus the single-instance assumption keeps the operational burden small. When distribution becomes necessary, moving to tools like Celery or Temporal is a natural next step.

See APScheduler · APScheduler GitHub · Celery · Sidekiq · Quartz · Temporal · Redlock debate (Martin Kleppmann).