Rate Limiting
This page shows you how to throttle incoming requests with the built-in wheels.middleware.RateLimiter. You’ll pick a strategy (fixed window, sliding window, or token bucket), decide between in-memory and database-backed storage, key requests by something other than client IP when needed, understand the X-RateLimit-* response headers the middleware writes, and scope strict limits to sensitive routes like /login.
You’ll learn:
- The three throttling strategies and when each one fits
- How to enable wheels.middleware.RateLimiter globally
- Memory vs database storage, and why multi-node deployments need the latter
- How to key by API token instead of client IP
- What response headers the middleware writes (and what clients see on a 429)
- How to apply tighter limits to individual routes
Three strategies
The strategy argument picks the throttling algorithm. All three enforce a budget of maxRequests per windowSeconds, but they behave differently at boundaries and under bursts.
- Fixed window (default, "fixedWindow"): the simplest. Counts requests per discrete windowSeconds-long bucket and resets at every boundary. Cheap on memory and trivial to reason about, but a client can burst at the very end of one window and the very start of the next, effectively getting 2 × maxRequests for a moment.
- Sliding window ("slidingWindow"): smoother. Maintains a timestamp log per client and prunes entries older than windowSeconds. Eliminates the boundary-burst problem at the cost of more memory per client (one timestamp per recent request).
- Token bucket ("tokenBucket"): burst-friendly. A bucket holds up to maxRequests tokens and refills at maxRequests / windowSeconds tokens per second. Each request costs one token. Idle clients accumulate a full bucket and can then burst; sustained high rates are capped at the refill rate. Best for APIs where occasional spikes are fine but long-running high rates should throttle.
Basic config — fixed window
Register the middleware globally in config/settings.cfm:

<cfscript>
set(middleware = [
    new wheels.middleware.RateLimiter(
        maxRequests=60,
        windowSeconds=60
    )
]);
</cfscript>

With no other arguments you get: strategy="fixedWindow", storage="memory", and keying by client IP. Every client gets 60 requests per 60 seconds.
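To make the boundary burst concrete, here is a minimal sketch of fixed-window counting. It is an illustration under stated assumptions, not the middleware's own code: fixedWindowAllow, the counters struct, and its key layout are hypothetical names invented for this example.

```cfml
<cfscript>
// Illustrative sketch only: not the middleware's actual implementation.
// counters is a plain struct keyed by "<clientKey>:<windowIndex>" (a hypothetical layout).
function fixedWindowAllow(
    required struct counters,
    required string clientKey,
    required numeric maxRequests,
    required numeric windowSeconds,
    required numeric nowSeconds
) {
    // Every timestamp inside the same windowSeconds-long slice maps to one index.
    var windowIndex = int(arguments.nowSeconds / arguments.windowSeconds);
    var bucketKey = arguments.clientKey & ":" & windowIndex;

    if (!structKeyExists(arguments.counters, bucketKey)) {
        arguments.counters[bucketKey] = 0;
    }
    if (arguments.counters[bucketKey] >= arguments.maxRequests) {
        return false; // budget for this window is spent
    }
    arguments.counters[bucketKey] = arguments.counters[bucketKey] + 1;
    return true;
}

// Boundary burst: 60 requests at second 59, then 60 more at second 60.
counters = {};
allowed = 0;
for (i = 1; i <= 60; i++) {
    if (fixedWindowAllow(counters, "203.0.113.7", 60, 60, 59)) allowed++;
}
for (i = 1; i <= 60; i++) {
    if (fixedWindowAllow(counters, "203.0.113.7", 60, 60, 60)) allowed++;
}
// allowed is now 120: both windows accepted a full budget about one second apart.
</cfscript>
```

Seconds 59 and 60 fall into different window indexes (0 and 1), so the second batch starts from a fresh counter. That is the 2 × maxRequests effect described under "Three strategies".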
Sliding window
Swap the strategy for smoother enforcement:

<cfscript>
set(middleware = [
    new wheels.middleware.RateLimiter(
        maxRequests=100,
        windowSeconds=120,
        strategy="slidingWindow"
    )
]);
</cfscript>

100 requests per 120-second sliding window. A client is throttled whenever 100 requests have been made in the past 120 seconds, regardless of clock boundaries.
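The timestamp-log idea can be sketched in a few lines. This is a simplified illustration, not the middleware's implementation: slidingWindowAllow and the logs struct are hypothetical, and a small budget (2 requests per 10 seconds) keeps the example easy to follow.

```cfml
<cfscript>
// Illustrative sketch only: not the middleware's actual implementation.
// logs maps each client key to an array of request timestamps (in seconds).
function slidingWindowAllow(
    required struct logs,
    required string clientKey,
    required numeric maxRequests,
    required numeric windowSeconds,
    required numeric nowSeconds
) {
    var cutoff = arguments.nowSeconds - arguments.windowSeconds;
    var recent = [];
    if (structKeyExists(arguments.logs, arguments.clientKey)) {
        // Prune: keep only timestamps newer than the cutoff.
        recent = arrayFilter(arguments.logs[arguments.clientKey], function(ts) {
            return ts > cutoff;
        });
    }
    var allowed = arrayLen(recent) < arguments.maxRequests;
    if (allowed) {
        arrayAppend(recent, arguments.nowSeconds);
    }
    arguments.logs[arguments.clientKey] = recent;
    return allowed;
}

logs = {};
r1 = slidingWindowAllow(logs, "client-a", 2, 10, 9);   // true
r2 = slidingWindowAllow(logs, "client-a", 2, 10, 9);   // true
r3 = slidingWindowAllow(logs, "client-a", 2, 10, 10);  // false: two requests in the last 10 seconds
r4 = slidingWindowAllow(logs, "client-a", 2, 10, 19);  // true: both earlier entries have aged out
</cfscript>
```

A fixed window would have accepted the request at second 10 because a new bucket starts there. The log-based check looks back from the current moment instead, which is why it costs one stored timestamp per recent request.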
Token bucket
Token bucket allows bursts up to capacity, then throttles to the refill rate:

<cfscript>
set(middleware = [
    new wheels.middleware.RateLimiter(
        maxRequests=50,
        windowSeconds=60,
        strategy="tokenBucket"
    )
]);
</cfscript>

Bucket capacity: 50. Refill rate: 50 / 60 ≈ 0.83 tokens per second. A fresh client can burst 50 requests in a second, then must wait about 1.2 seconds (1 / 0.83) for each subsequent request while the bucket stays drained.
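The refill arithmetic can be checked with a small sketch. It is a simplified model, not the middleware's own code: tokenBucketAllow and the bucket struct's tokens and lastRefill fields are hypothetical names, and time is passed in explicitly so the numbers are reproducible.

```cfml
<cfscript>
// Illustrative sketch only: not the middleware's actual implementation.
// bucket is a struct with two hypothetical fields: tokens and lastRefill (seconds).
function tokenBucketAllow(
    required struct bucket,
    required numeric maxRequests,
    required numeric windowSeconds,
    required numeric nowSeconds
) {
    var refillPerSecond = arguments.maxRequests / arguments.windowSeconds;
    var elapsed = arguments.nowSeconds - arguments.bucket.lastRefill;

    // Top up for the time that has passed, never beyond capacity.
    arguments.bucket.tokens = min(
        arguments.maxRequests,
        arguments.bucket.tokens + elapsed * refillPerSecond
    );
    arguments.bucket.lastRefill = arguments.nowSeconds;

    if (arguments.bucket.tokens >= 1) {
        arguments.bucket.tokens = arguments.bucket.tokens - 1;
        return true;
    }
    return false;
}

bucket = { tokens = 50, lastRefill = 0 };
burst = 0;
for (i = 1; i <= 60; i++) {
    if (tokenBucketAllow(bucket, 50, 60, 0)) burst++;
}
// burst is 50: the full bucket is spent and the remaining 10 calls are rejected.
afterOneSecond = tokenBucketAllow(bucket, 50, 60, 1);  // false: only about 0.83 tokens so far
afterTwoSeconds = tokenBucketAllow(bucket, 50, 60, 2); // true: about 1.67 tokens accumulated
</cfscript>
```

One second of refill yields less than a whole token, so the first retry fails. By the second retry enough has accumulated, which matches the roughly 1.2-second spacing described above.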
Storage backends
storage defaults to "memory" and holds counters in a per-JVM ConcurrentHashMap. It is fast and requires zero setup, but each Wheels node has its own counters. Behind a load balancer, a client whose requests land on two nodes can get up to double the limit, because each node throttles against its own map.
For multi-node deployments, use storage="database":
<cfscript>
set(middleware = [
    new wheels.middleware.RateLimiter(
        maxRequests=100,
        windowSeconds=60,
        storage="database"
    )
]);
</cfscript>

The middleware auto-creates a wheels_rate_limits table on first use (it probes for the table and creates it with engine-appropriate DDL when missing). All nodes read and write to the same table, so the budget is shared. Database storage is slower per request than memory, so use it only when you need shared state.
Engine support
Counter updates and cross-node locking use each engine’s native primitives, detected automatically from the datasource:
| Engine | Counter increment | Cross-node locking |
|---|---|---|
| MySQL / MariaDB | Native atomic upsert (ON DUPLICATE KEY UPDATE) | SELECT ... FOR UPDATE |
| PostgreSQL | Native atomic upsert (ON CONFLICT ... DO UPDATE) | SELECT ... FOR UPDATE |
| SQLite | Native atomic upsert (ON CONFLICT ... DO UPDATE) | In-process lock only |
| SQL Server | Unique-constraint-backed insert-retry | WITH (UPDLOCK, ROWLOCK) |
| Oracle | Unique-constraint-backed insert-retry | SELECT ... FOR UPDATE |
| H2 | Unique-constraint-backed insert-retry | SELECT ... FOR UPDATE |
| Unrecognized engines | Unique-constraint-backed insert-retry | In-process lock only |
Every row in wheels_rate_limits carries a globally unique store_key (enforced by a unique index), so the fixed-window counter is created-or-incremented race-free, and the sliding-window and token-bucket strategies run their read-modify-write sequences inside a transaction holding a row lock. On SQLite and unrecognized engines the serialization comes from an in-process lock instead of a SQL row lock — correct on a single node, but not a multi-node guarantee. For multi-node deployments use one of the engines with a real row lock.
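Since the paragraph above describes sliding-window and token-bucket behavior on the database backend, the strategy and storage arguments can presumably be combined. A possible multi-node configuration, assuming the datasource points at one of the engines with a real row lock:

```cfml
<cfscript>
set(middleware = [
    new wheels.middleware.RateLimiter(
        maxRequests=100,
        windowSeconds=60,
        strategy="slidingWindow", // read-modify-write runs under a row lock on supported engines
        storage="database"        // shared wheels_rate_limits table across all nodes
    )
]);
</cfscript>
```

Every argument here appears elsewhere on this page; only the combination is an assumption, so test it against your own datasource before relying on it.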
Custom key function
By default each client is keyed by IP address. For token-authenticated APIs you usually want to key by the API token so a client behind a shared NAT doesn’t share a budget with unrelated traffic:

<cfscript>
set(middleware = [
    new wheels.middleware.RateLimiter(
        maxRequests=1000,
        windowSeconds=60,
        keyFunction=function(req) {
            var apiKey = req.cgi.http_x_api_key ?: "";
            return Len(apiKey) ? apiKey : "anonymous";
        }
    )
]);
</cfscript>

The closure receives the middleware request context (a struct of params, route, pathInfo, method, and cgi) and returns a unique-per-client string. The cgi member carries the standard CGI keys plus every inbound HTTP header under its CGI-style http_* name, so arbitrary headers like X-Api-Key resolve per client. Keep the Len() guard: a client can send the header with an empty value, which reads as an empty string, not undefined. Fall back to an explicit constant like "anonymous" so unauthenticated traffic still hits a limit you chose deliberately; otherwise every header-less request shares whatever empty-string key your code returns, a merged budget you never intended.
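The same pattern works for other headers. As a sketch, assuming clients authenticate with an Authorization: Bearer header (an assumption for this example, not something the middleware requires), the key function below hashes the token so the raw credential is not used as a storage key. The "token:" prefix is an arbitrary choice for this example.

```cfml
<cfscript>
set(middleware = [
    new wheels.middleware.RateLimiter(
        maxRequests=1000,
        windowSeconds=60,
        keyFunction=function(req) {
            // Authorization arrives under its CGI-style name, like any other inbound header.
            var authHeader = req.cgi.http_authorization ?: "";
            // Strip a leading "Bearer " (case-insensitive), if present.
            var token = Trim(REReplaceNoCase(authHeader, "^Bearer\s+", ""));
            // Hash the token so the credential itself never becomes a counter key.
            return Len(token) ? "token:" & Hash(token, "SHA-256") : "anonymous";
        }
    )
]);
</cfscript>
```

Hashing matters most with storage="database", where keys are written to a table that other tooling may read.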
Response headers
On every request (allowed or rejected), the middleware writes three headers:

X-RateLimit-Limit: 60
X-RateLimit-Remaining: 42
X-RateLimit-Reset: 1713589200

- X-RateLimit-Limit: the maxRequests you configured.
- X-RateLimit-Remaining: how many requests this client has left in the current window.
- X-RateLimit-Reset: Unix timestamp when the client’s budget refreshes.
The header prefix is configurable via the headerPrefix constructor argument (defaults to "X-RateLimit").
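A custom prefix might be configured like this. The prefix value is a made-up example, and the resulting header names (X-MyApp-RateLimit-Limit and so on) are an assumption about how the prefix is applied, so confirm them by inspecting a real response.

```cfml
<cfscript>
set(middleware = [
    new wheels.middleware.RateLimiter(
        maxRequests=60,
        windowSeconds=60,
        headerPrefix="X-MyApp-RateLimit" // hypothetical prefix; default is "X-RateLimit"
    )
]);
</cfscript>
```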
When a client exceeds the limit, the middleware returns 429 Too Many Requests with Retry-After added:
HTTP/1.1 429 Too Many Requests
Retry-After: 47
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1713589247

Retry-After is the whole-second wait until the budget refreshes, per RFC 7231. Well-behaved clients honor it automatically.
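On the client side, honoring these headers could look like the sketch below. secondsToWait is a hypothetical helper, not part of Wheels. It takes the status code and a struct of response headers (for example, the responseHeader struct that cfhttp returns) and prefers Retry-After, falling back to the reset timestamp.

```cfml
<cfscript>
// Hypothetical client-side helper: how long to wait before retrying a request.
function secondsToWait(
    required numeric statusCode,
    required struct headers,
    required numeric nowEpochSeconds
) {
    if (arguments.statusCode != 429) {
        return 0; // not throttled, no wait needed
    }
    // Prefer Retry-After when it is a whole number of seconds.
    if (structKeyExists(arguments.headers, "Retry-After") && isNumeric(arguments.headers["Retry-After"])) {
        return max(0, int(arguments.headers["Retry-After"]));
    }
    // Otherwise derive the wait from the reset timestamp.
    if (structKeyExists(arguments.headers, "X-RateLimit-Reset") && isNumeric(arguments.headers["X-RateLimit-Reset"])) {
        return max(0, int(arguments.headers["X-RateLimit-Reset"]) - arguments.nowEpochSeconds);
    }
    return 1; // conservative default when neither header is usable
}

// Using the 429 response shown above, observed at Unix time 1713589200:
waitSeconds = secondsToWait(
    429,
    { "Retry-After" = "47", "X-RateLimit-Reset" = "1713589247" },
    1713589200
);
// waitSeconds is 47
</cfscript>
```

The sketch only handles the delay-in-seconds form of Retry-After, which is the form the middleware is described as sending. The HTTP specification also allows a date value, which this helper would skip in favor of the reset timestamp.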
Per-route rate limiting
Global limits are fine for most traffic, but sensitive endpoints (login, password reset, signup) deserve much stricter budgets. Apply a second RateLimiter to a route scope:

<cfscript>
mapper()
    .scope(
        path="/login",
        middleware=[
            new wheels.middleware.RateLimiter(
                maxRequests=5,
                windowSeconds=60
            )
        ]
    )
        .post(name="authenticate", pattern="/", to="sessions##create")
    .end()
    .resources("users")
    .wildcard()
.end();
</cfscript>

Declare the scoped routes between .scope() and its matching .end(), then close the scope before any routes that shouldn’t share the limit. Without that .end(), every subsequent route (including .wildcard()) nests under /login, breaking the rest of the app’s routing. scope() also accepts a callback= argument that declares the scoped routes and closes the scope automatically when it returns (#3072). On releases before that fix (4.0.3 and earlier), callback= was silently ignored, so use the explicit .end() form there.
Global middleware from config/settings.cfm still runs — it composes with the scope-level one. The pattern is: a permissive global limit (say 60/min) plus a tight limit on sensitive routes (5/min on login) so a brute-force attempt trips the narrow limit long before the broad one.
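The same approach extends to several sensitive endpoints, each with its own budget. In the sketch below, the /password-reset path, its route name, its controller target, and its 3-per-5-minutes budget are all hypothetical values chosen for illustration.

```cfml
<cfscript>
mapper()
    .scope(
        path="/login",
        middleware=[ new wheels.middleware.RateLimiter(maxRequests=5, windowSeconds=60) ]
    )
        .post(name="authenticate", pattern="/", to="sessions##create")
    .end() // close the /login scope before opening the next one
    .scope(
        path="/password-reset",
        middleware=[ new wheels.middleware.RateLimiter(maxRequests=3, windowSeconds=300) ]
    )
        .post(name="requestPasswordReset", pattern="/", to="passwordResets##create")
    .end() // close the /password-reset scope before the unscoped routes
    .resources("users")
    .wildcard()
.end();
</cfscript>
```

Each scope is closed with its own .end() before the next begins, so neither limit leaks onto the other scope or onto the unscoped routes.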
Debugging unexpected rejections
If clients complain about being throttled when they shouldn’t be, walk this checklist:
- Check the server-seen IP vs the real client IP. Behind a load balancer or reverse proxy, every request looks like it came from the proxy’s IP, so all traffic shares one budget. Enable trustProxy=true to read X-Forwarded-For, but only when your proxy sanitizes that header (otherwise clients can spoof it).
- Use a keyFunction that reads the real identity. For APIs, key by API token. For logged-in users, key by session or user ID. IP-only keying fails badly behind NATs and shared networks.
- Inspect X-RateLimit-Remaining in responses to see exactly where the client is in its budget. If it hits zero unexpectedly, either the client is making more requests than you thought or the key is collapsing traffic from multiple clients into one bucket.
- Temporarily raise the budget for debugging: maxRequests=10000, windowSeconds=1 effectively disables throttling while you isolate the real issue. Remember to put it back.
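For the first item in the checklist, the proxy setting would presumably sit alongside the other constructor arguments. The checklist names trustProxy=true but does not show where it goes, so treat its placement here as an assumption:

```cfml
<cfscript>
set(middleware = [
    new wheels.middleware.RateLimiter(
        maxRequests=60,
        windowSeconds=60,
        // Assumed to be a constructor argument. Enable it only behind a proxy
        // that overwrites X-Forwarded-For rather than passing client values through.
        trustProxy=true
    )
]);
</cfscript>
```

After enabling it, send requests from two different client addresses and compare their X-RateLimit-Remaining values. If the two counts decrease independently, the clients are no longer sharing the proxy's budget.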