Best practices for designing public APIs for external integrations
A public API is a programmatic interface exposed to external developers, partners, and third-party applications—not just your own frontend clients. Public APIs power Stripe's payment processing (used by millions of businesses), Twilio's messaging (billions of texts monthly), and GitHub's developer platform (100M+ developers). Designing a public API is fundamentally different from designing an internal service-to-service API: once external clients integrate, every endpoint becomes a contract you cannot easily change. Breaking changes cascade across thousands of integrations simultaneously. In system design interviews, API design is increasingly tested as its own dedicated round—Meta's "Pirate X" loop and Amazon's API-focused rounds evaluate whether you understand versioning, backward compatibility, rate limiting, idempotency, and pagination. These are not afterthoughts; they are the design decisions that determine whether your API is adopted by thousands of developers or abandoned after the first integration attempt.
Key Takeaways
- A public API is a contract. Once published, changing it breaks external integrations. Design for stability from day one: strict versioning, backward compatibility, and deprecation policies.
- REST is the default for public APIs in 2026—simple, cacheable, well-documented. Use GraphQL when clients need flexible queries over complex data graphs. Use gRPC for internal service-to-service communication where latency matters most.
- Rate limiting protects your system and your customers. Implement per-client limits with clear error responses (429 status) and rate limit headers so developers can self-regulate.
- Idempotency keys prevent duplicate operations when clients retry failed requests. This is non-negotiable for APIs that modify state—especially payment, order, and booking endpoints.
- Cursor-based pagination is required for any dataset that changes frequently. Offset pagination causes duplicate or skipped records when new items are inserted during paging.
API Style Selection: REST vs GraphQL vs gRPC
| Style | Best For | Protocol | Caching | Client Complexity | Public API Suitability |
|---|---|---|---|---|---|
| REST | CRUD operations, public APIs | HTTP/1.1 | Native (HTTP caching) | Low | Excellent (industry standard) |
| GraphQL | Complex data graphs, mobile apps | HTTP/1.1 | Requires custom caching | Medium | Good (Shopify, GitHub) |
| gRPC | Internal microservices, low latency | HTTP/2 + Protobuf | Not built-in | High (requires code generation) | Poor (not browser-friendly) |
| WebSocket | Real-time bidirectional communication | TCP | Not applicable | Medium | Niche (chat, live updates) |
REST is the default for public APIs because it maps cleanly to HTTP semantics, supports native caching, and requires no specialized tooling from clients. Developers can test REST APIs with a simple curl command. Documentation is standardized through OpenAPI/Swagger. Nearly every major public API (Stripe, Twilio, GitHub, Slack) is primarily REST.
GraphQL for public APIs works when clients need flexible queries—Shopify's Storefront API and GitHub's v4 API use GraphQL because integrators query vastly different data shapes. The trade-off: GraphQL adds client-side complexity, makes caching harder, and requires rate limiting by query complexity rather than simple request count.
Interview application: "For the public developer API, I would use REST. It is the standard external developers expect, supports HTTP caching natively, and can be tested with curl. For the internal feed aggregation between microservices, I would use gRPC for lower latency and type-safe contracts."
Versioning: Protecting External Integrations
Versioning is the most critical public API design decision. Without versioning, every change risks breaking thousands of integrations simultaneously.
Versioning Strategies
URL path versioning (recommended for public APIs): /v1/users, /v2/users. The version is visible, explicit, and trivially routable. Stripe, Twilio, and most major public APIs use this approach.
Header versioning: Pass the version via a custom header (X-API-Version: 2). Keeps URLs clean but hides the version from developers inspecting request URLs. GitHub uses calendar-based header versioning (X-GitHub-Api-Version: 2022-11-28).
Query parameter versioning: /users?version=2. Simple but clutters query strings and can interfere with caching.
Backward Compatibility Rules
Never remove a field from an API response. Clients may depend on every field you return. Instead, add new fields alongside existing ones.
Never change a field's type. If created_at is a string, do not change it to an integer in a new version.
Never change the meaning of an existing endpoint. If POST /orders creates an order, do not repurpose it to create a quote.
Add new optional parameters—never make existing optional parameters required. Clients built against the current contract should continue working without modification.
Interview application: "I would use URL path versioning: /v1/orders. When we need breaking changes, we publish /v2/orders alongside /v1 and give clients a 12-month migration window. Non-breaking changes—adding new fields, adding optional parameters—are applied to the current version without a version bump."
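To make the "publish /v2 alongside /v1" idea concrete, here is a minimal dispatch sketch; the handler names, response shapes, and routing table are invented for illustration, and a real service would do this in its web framework or API gateway:

```python
# Hypothetical handlers: both versions run side by side during the migration window.
def list_orders_v1() -> dict:
    return {"orders": [], "api_version": "v1"}

def list_orders_v2() -> dict:
    # v2 adds a field; it never removes or retypes the fields v1 clients rely on.
    return {"orders": [], "next_cursor": None, "api_version": "v2"}

ROUTES = {
    ("v1", "orders"): list_orders_v1,
    ("v2", "orders"): list_orders_v2,
}

def dispatch(path: str) -> dict:
    # "/v1/orders" -> ("v1", "orders"): the version is explicit and trivially routable.
    _, version, resource = path.split("/", 2)
    handler = ROUTES.get((version, resource))
    if handler is None:
        return {"error": {"code": "NOT_FOUND"}}  # would map to HTTP 404
    return handler()
```

Because the version is the first path segment, a gateway can route /v1 and /v2 to entirely different backend deployments during the 12-month migration window.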
Rate Limiting: Protecting Your System and Your Clients
Rate limiting prevents abuse, protects backend services from overload, and ensures fair resource distribution across API consumers.
Implementation
Token bucket algorithm: Each client has a bucket that fills at a fixed rate (e.g., 100 tokens per minute). Each request consumes one token. When the bucket is empty, requests are rejected with 429 Too Many Requests. The bucket refills continuously.
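A minimal in-process sketch of the token bucket above (a production deployment would keep one bucket per API key in a shared store such as Redis, not in process memory):

```python
import time

class TokenBucket:
    """Bucket holds at most `capacity` tokens, refilled at `refill_rate` tokens/second."""

    def __init__(self, capacity: int, refill_rate: float):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = float(capacity)
        self.last_refill = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Continuous refill: add tokens for the elapsed time, capped at capacity.
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1  # each request consumes one token
            return True
        return False  # caller should respond 429 Too Many Requests

# 100 requests/minute for a free-tier key: capacity 100, refilling at 100/60 per second.
free_tier_bucket = TokenBucket(capacity=100, refill_rate=100 / 60)
```

The capacity also acts as a burst allowance: a quiet client can momentarily spend up to 100 requests at once before the steady refill rate becomes the limit.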
Rate limit headers: Return current limits in every response so clients can self-regulate:
```
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 73
X-RateLimit-Reset: 1714060800
```
Per-client limits: Assign different rate limits based on the client's plan tier. Free tier: 100 requests/minute. Pro tier: 1,000 requests/minute. Enterprise: 10,000 requests/minute.
Interview application: "I would implement rate limiting at the API gateway using a token bucket algorithm. Each API key receives 100 requests per minute for the free tier and 1,000 for the paid tier. Rate limit headers are returned with every response. When the limit is exceeded, the client receives a 429 response with a Retry-After header indicating when they can resume."
Idempotency: Safe Retries for State-Changing Operations
When a client sends a POST /payments request and the network drops before the response arrives, the client does not know whether the payment was processed. Without idempotency, retrying creates a duplicate charge.
Solution: Require clients to include an idempotency key (a unique identifier) with every state-changing request. The server checks if a request with that key has already been processed. If yes, return the original response without re-executing the operation.
Stripe pioneered this pattern for public APIs. Every POST request to Stripe accepts an Idempotency-Key header. The server stores the result for 24 hours, returning the cached response for duplicate keys.
Interview application: "For the payment endpoint, I would require an Idempotency-Key header on every POST request. The server stores the key and response in Redis with a 24-hour TTL. If a duplicate key arrives, the server returns the stored response without re-processing. This prevents double-charging when clients retry after network failures."
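The server-side check can be sketched as follows, with an in-memory dict standing in for the Redis store described above; the TTL and response shape are illustrative:

```python
import time

IDEMPOTENCY_TTL = 24 * 3600  # 24 hours, mirroring Stripe's documented retention window
_store: dict[str, tuple[float, dict]] = {}  # key -> (stored_at, cached response)

def process_payment(idempotency_key: str, request_body: dict) -> dict:
    now = time.time()
    cached = _store.get(idempotency_key)
    if cached is not None and now - cached[0] < IDEMPOTENCY_TTL:
        # Duplicate retry: return the original response without re-executing.
        return cached[1]
    # First time this key is seen: execute the state-changing operation exactly once.
    response = {"status": "charged", "amount": request_body["amount"]}
    _store[idempotency_key] = (now, response)
    return response
```

Note that the key must be generated by the client (one fresh key per logical operation, reused across retries of that operation); if the server generated it, a retried request would look like a new operation.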
Pagination: Handling Large Datasets
Cursor-Based Pagination (Required for Public APIs)
Returns a cursor (opaque token) pointing to the next page of results. The cursor encodes the position in the dataset, making it immune to insertions and deletions between page requests.
```json
{
  "data": [...],
  "meta": {
    "next_cursor": "eyJpZCI6MTIzNH0=",
    "has_more": true
  }
}
```
Why cursor pagination is mandatory: Offset pagination (?page=2&limit=20) breaks when new records are inserted. If 5 new items are added between page 1 and page 2, the client sees duplicates on page 2. For a social feed, product catalog, or any dataset that changes in real time, this produces a broken user experience.
Interview application: "I would implement cursor-based pagination for the feed endpoint. Each response includes a next_cursor token. The client passes this token to fetch the next page. This avoids the duplicate-record problem of offset pagination and maintains constant database query performance regardless of page depth."
Offset Pagination (Acceptable for Static Data)
?limit=20&offset=40 is simpler and acceptable for admin dashboards, static reports, or datasets that rarely change. Do not use it for user-facing, real-time data.
Authentication and Authorization
OAuth 2.0 is the standard for public APIs. It enables third-party applications to access user data without handling user credentials directly. Scopes limit what each token can do (read:users, write:orders).
API keys for server-to-server integrations where user context is not needed. API keys identify the client application, not the end user.
JWT (JSON Web Tokens) for stateless authentication. The server validates the token signature without a database lookup. Include expiration timestamps and refresh token rotation.
Interview application: "For the public API, I would use OAuth 2.0 with scoped access tokens. Third-party apps request specific scopes (read:products, write:orders) during authorization. Tokens expire after 1 hour with rotatable refresh tokens. For server-to-server integrations (webhooks, batch processing), I would issue API keys with per-key rate limits."
Error Handling: Predictable, Actionable Responses
Public API errors must be consistent, machine-readable, and actionable. External developers cannot call your support team for every error—the error response itself must explain what went wrong and how to fix it.
Standard error format:
```json
{
  "error": {
    "code": "INVALID_PARAMETER",
    "message": "The 'email' field must be a valid email address.",
    "field": "email",
    "documentation_url": "https://api.example.com/docs/errors#invalid-parameter"
  }
}
```
HTTP status codes to know for interviews: 200 (success), 201 (created), 204 (no content), 400 (bad request), 401 (unauthorized—authentication failed), 403 (forbidden—authenticated but not authorized), 404 (not found), 409 (conflict), 422 (unprocessable entity), 429 (rate limited), 500 (internal server error), 503 (service unavailable).
The 401 vs 403 distinction: 401 means "I do not know who you are" (missing or invalid token). 403 means "I know who you are, but you are not allowed to do this" (valid token, insufficient permissions). Interviewers test this distinction frequently.
Webhooks: Push Notifications for External Integrations
Instead of forcing clients to poll your API for changes, webhooks push event notifications to client-specified URLs when relevant events occur.
Design considerations: Require HTTPS endpoints. Include a signature header (HMAC-SHA256) so clients can verify the webhook originated from your system. Implement retry with exponential backoff for failed deliveries (3 retries over 24 hours). Include an event type and timestamp in every payload. Provide a webhook testing endpoint in your developer dashboard.
Interview application: "For order status updates, I would implement webhooks. Clients register a callback URL and select event types (order.created, order.shipped, order.delivered). When an event occurs, we POST the payload to their URL with an HMAC-SHA256 signature for verification. Failed deliveries retry 3 times with exponential backoff."
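Both sides of the HMAC-SHA256 signature scheme can be sketched in a few lines; the secret, header name, and event shape here are placeholders for the example:

```python
import hashlib
import hmac
import json

WEBHOOK_SECRET = b"whsec_placeholder"  # shared with the client at registration time

def sign_payload(payload: bytes) -> str:
    """Sender side: compute the value of the signature header for an outgoing webhook."""
    return hmac.new(WEBHOOK_SECRET, payload, hashlib.sha256).hexdigest()

def verify_signature(payload: bytes, signature_header: str) -> bool:
    """Receiver side: recompute over the raw body and compare in constant time."""
    expected = hmac.new(WEBHOOK_SECRET, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_header)

# Every payload carries an event type and timestamp, per the design considerations above.
event = json.dumps({
    "type": "order.shipped",
    "created": 1714060800,
    "data": {"order_id": "ord_123"},
}).encode()
signature = sign_payload(event)
```

The receiver must verify against the raw request bytes, not a re-serialized JSON object, since any re-serialization can reorder keys or change whitespace and break the HMAC.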
For structured practice applying API design principles across complete system design problems, Grokking the System Design Interview covers API design as a core component of every solution.
For advanced API patterns including API gateway design, rate limiting at scale, and webhook infrastructure, Grokking the Advanced System Design Interview builds the depth required for L6+ interviews. The system design interview guide provides the broader framework for integrating API design into every interview phase.
Frequently Asked Questions
Why is API design tested separately in system design interviews?
API design is the contract between your system and external consumers. Companies like Meta ("Pirate X" round), Stripe, and Amazon test it as a dedicated round because public API decisions—versioning, backward compatibility, rate limiting—are irreversible once external developers integrate. Poor API design creates permanent technical debt.
Should I use REST or GraphQL for a public API?
REST for most public APIs—it is the industry standard, supports native HTTP caching, and requires no specialized client tooling. GraphQL when clients need flexible queries over complex data graphs (Shopify, GitHub). gRPC for internal service-to-service communication where latency is critical. Never use gRPC for browser-facing public APIs.
What is the best API versioning strategy?
URL path versioning (/v1/users) for public APIs—it is explicit, visible, and trivially routable. This is what Stripe, Twilio, and most major APIs use. Header versioning keeps URLs clean but hides version information. Query parameter versioning clutters URLs and interferes with caching.
How do I implement rate limiting for a public API?
Use a token bucket algorithm at the API gateway. Assign per-client limits based on plan tier (free: 100/min, pro: 1,000/min). Return rate limit headers with every response (X-RateLimit-Remaining, X-RateLimit-Reset). Respond with 429 Too Many Requests when limits are exceeded, including a Retry-After header.
What is an idempotency key and why does it matter?
A client-provided unique identifier included with state-changing requests. The server checks if a request with that key was already processed and returns the cached response without re-executing. This prevents duplicate charges, orders, or bookings when clients retry after network failures. Stripe pioneered this pattern with a 24-hour TTL.
Should I use cursor or offset pagination?
Cursor-based pagination for any dataset that changes in real time (feeds, catalogs, search results)—it prevents duplicate records during paging. Offset pagination only for static, rarely-changing datasets (admin dashboards, reports). For public APIs, cursor pagination is the safe default.
How do I handle API deprecation?
Announce deprecation 12+ months in advance. Use a Sunset header with the deprecation date. Provide migration guides documenting exactly what changes. Run both old and new versions simultaneously during the transition. Monitor which clients are still on the old version and reach out directly.
What authentication should a public API use?
OAuth 2.0 for user-context APIs (third-party apps accessing user data). API keys for server-to-server integrations without user context. JWTs for stateless token validation. All three should be used over HTTPS only, with token expiration and rotation.
How should I design webhooks for external integrations?
Require HTTPS callback URLs. Include HMAC-SHA256 signatures for payload verification. Retry failed deliveries 3 times with exponential backoff. Include event type and timestamp in every payload. Provide a webhook testing endpoint in the developer dashboard.
What is the most common API design mistake in interviews?
Designing an internal-facing API when the question asks for a public API. Public APIs require strict versioning, backward compatibility guarantees, rate limiting, idempotency, comprehensive error messages, and documentation. Internal APIs can evolve freely. Always clarify "Who consumes this API?" during requirements.
TL;DR
Public API design is a contract that cannot be easily changed once external developers integrate. Use REST as the default (cacheable, simple, industry standard), with URL path versioning (/v1/users) and a 12-month deprecation policy for breaking changes. Implement rate limiting at the API gateway using token bucket with per-client limits and rate limit headers. Require idempotency keys on all state-changing endpoints to prevent duplicate operations during retries—Stripe's 24-hour TTL pattern is the standard. Use cursor-based pagination for all dynamic datasets to avoid duplicate records. Authenticate with OAuth 2.0 for user-context APIs and API keys for server-to-server integrations. Return consistent, actionable error responses with specific codes, messages, and documentation links. Implement webhooks with HMAC signatures and retry backoff for push notifications. In interviews, always clarify whether the API is public or internal—this distinction changes every design decision.