Best practices for designing public APIs for external integrations
A public API is a programmatic interface exposed to external developers, partners, and third-party applications—not just your own frontend clients. Public APIs power Stripe's payment processing (used by millions of businesses), Twilio's messaging (billions of texts monthly), and GitHub's developer platform (100M+ developers). Designing a public API is fundamentally different from designing an internal service-to-service API: once external clients integrate, every endpoint becomes a contract you cannot easily change. Breaking changes cascade across thousands of integrations simultaneously. In system design interviews, API design is increasingly tested as its own dedicated round—Meta's "Pirate X" loop and Amazon's API-focused rounds evaluate whether you understand versioning, backward compatibility, rate limiting, idempotency, and pagination. These are not afterthoughts; they are the design decisions that determine whether your API is adopted by thousands of developers or abandoned after the first integration attempt.
Key Takeaways
- A public API is a contract. Once published, changing it breaks external integrations. Design for stability from day one: strict versioning, backward compatibility, and deprecation policies.
- REST is the default for public APIs in 2026—simple, cacheable, well-documented. Use GraphQL when clients need flexible queries over complex data graphs. Use gRPC for internal service-to-service communication where latency matters most.
- Rate limiting protects your system and your customers. Implement per-client limits with clear error responses (429 status) and rate limit headers so developers can self-regulate.
- Idempotency keys prevent duplicate operations when clients retry failed requests. This is non-negotiable for APIs that modify state—especially payment, order, and booking endpoints.
- Cursor-based pagination is required for any dataset that changes frequently. Offset pagination causes duplicate or skipped records when new items are inserted during paging.
API Style Selection: REST vs GraphQL vs gRPC
| Style | Best For | Protocol | Caching | Client Complexity | Public API Suitability |
|---|---|---|---|---|---|
| REST | CRUD operations, public APIs | HTTP/1.1 | Native (HTTP caching) | Low | Excellent (industry standard) |
| GraphQL | Complex data graphs, mobile apps | HTTP/1.1 | Requires custom caching | Medium | Good (Shopify, GitHub) |
| gRPC | Internal microservices, low latency | HTTP/2 + Protobuf | Not built-in | High (requires code generation) | Poor (not browser-friendly) |
| WebSocket | Real-time bidirectional communication | TCP | Not applicable | Medium | Niche (chat, live updates) |
REST is the default for public APIs because it maps cleanly to HTTP semantics, supports native caching, and requires no specialized tooling from clients. Developers can test REST APIs with a simple curl command. Documentation is standardized through OpenAPI/Swagger. Nearly every major public API (Stripe, Twilio, GitHub, Slack) is primarily REST.
GraphQL for public APIs works when clients need flexible queries—Shopify's Storefront API and GitHub's v4 API use GraphQL because integrators query vastly different data shapes. The trade-off: GraphQL adds client-side complexity, makes caching harder, and requires rate limiting by query complexity rather than simple request count.
Interview application: "For the public developer API, I would use REST. It is the standard external developers expect, supports HTTP caching natively, and can be tested with curl. For the internal feed aggregation between microservices, I would use gRPC for lower latency and type-safe contracts."
Versioning: Protecting External Integrations
Versioning is the most critical public API design decision. Without versioning, every change risks breaking thousands of integrations simultaneously.
Versioning Strategies
URL path versioning (recommended for public APIs): /v1/users, /v2/users. The version is visible, explicit, and trivially routable. Stripe, Twilio, and most major public APIs use this approach.
Header versioning: Pass the version via a custom header (X-API-Version: 2). Keeps URLs clean but hides the version from developers inspecting request URLs. GitHub uses calendar-based header versioning (X-GitHub-Api-Version: 2022-11-28).
Query parameter versioning: /users?version=2. Simple but clutters query strings and can interfere with caching.
Backward Compatibility Rules
Never remove a field from an API response. Clients may depend on every field you return. Instead, add new fields alongside existing ones.
Never change a field's type. If created_at is a string, do not change it to an integer in a new version.
Never change the meaning of an existing endpoint. If POST /orders creates an order, do not repurpose it to create a quote.
Add new optional parameters—never make existing optional parameters required. Clients built against the current contract should continue working without modification.
Interview application: "I would use URL path versioning: /v1/orders. When we need breaking changes, we publish /v2/orders alongside /v1 and give clients a 12-month migration window. Non-breaking changes—adding new fields, adding optional parameters—are applied to the current version without a version bump."
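To make the "publish /v2 alongside /v1" idea concrete, here is a minimal dispatch sketch; the handler names, response shapes, and routing table are invented for illustration, and a real service would do this in its web framework or API gateway:

```python
# Hypothetical handlers: both versions run side by side during the migration window.
def list_orders_v1() -> dict:
    return {"orders": [], "api_version": "v1"}

def list_orders_v2() -> dict:
    # v2 adds a field; it never removes or retypes the fields v1 clients rely on.
    return {"orders": [], "next_cursor": None, "api_version": "v2"}

ROUTES = {
    ("v1", "orders"): list_orders_v1,
    ("v2", "orders"): list_orders_v2,
}

def dispatch(path: str) -> dict:
    # "/v1/orders" -> ("v1", "orders"): the version is explicit and trivially routable.
    _, version, resource = path.split("/", 2)
    handler = ROUTES.get((version, resource))
    if handler is None:
        return {"error": {"code": "NOT_FOUND"}}  # would map to HTTP 404
    return handler()
```

Because the version is the first path segment, a gateway can route /v1 and /v2 to entirely different backend deployments during the 12-month migration window.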
Rate Limiting: Protecting Your System and Your Clients
Rate limiting prevents abuse, protects backend services from overload, and ensures fair resource distribution across API consumers.
Implementation
Token bucket algorithm: Each client has a bucket that fills at a fixed rate (e.g., 100 tokens per minute). Each request consumes one token. When the bucket is empty, requests are rejected with 429 Too Many Requests. The bucket refills continuously.
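A minimal in-process sketch of the token bucket above (a production deployment would keep one bucket per API key in a shared store such as Redis, not in process memory):

```python
import time

class TokenBucket:
    """Bucket holds at most `capacity` tokens, refilled at `refill_rate` tokens/second."""

    def __init__(self, capacity: int, refill_rate: float):
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = float(capacity)
        self.last_refill = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Continuous refill: add tokens for the elapsed time, capped at capacity.
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1  # each request consumes one token
            return True
        return False  # caller should respond 429 Too Many Requests

# 100 requests/minute for a free-tier key: capacity 100, refilling at 100/60 per second.
free_tier_bucket = TokenBucket(capacity=100, refill_rate=100 / 60)
```

The capacity also acts as a burst allowance: a quiet client can momentarily spend up to 100 requests at once before the steady refill rate becomes the limit.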
Rate limit headers: Return current limits in every response so clients can self-regulate:
```
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 73
X-RateLimit-Reset: 1714060800
```
Per-client limits: Assign different rate limits based on the client's plan tier. Free tier: 100 requests/minute. Pro tier: 1,000 requests/minute. Enterprise: 10,000 requests/minute.
Interview application: "I would implement rate limiting at the API gateway using a token bucket algorithm. Each API key receives 100 requests per minute for the free tier and 1,000 for the paid tier. Rate limit headers are returned with every response. When the limit is exceeded, the client receives a 429 response with a Retry-After header indicating when they can resume."
Idempotency: Safe Retries for State-Changing Operations
When a client sends a POST /payments request and the network drops before the response arrives, the client does not know whether the payment was processed. Without idempotency, retrying creates a duplicate charge.
Solution: Require clients to include an idempotency key (a unique identifier) with every state-changing request. The server checks if a request with that key has already been processed. If yes, return the original response without re-executing the operation.
Stripe pioneered this pattern for public APIs. Every POST request to Stripe accepts an Idempotency-Key header. The server stores the result for 24 hours, returning the cached response for duplicate keys.
Interview application: "For the payment endpoint, I would require an Idempotency-Key header on every POST request. The server stores the key and response in Redis with a 24-hour TTL. If a duplicate key arrives, the server returns the stored response without re-processing. This prevents double-charging when clients retry after network failures."
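The server-side check can be sketched as follows, with an in-memory dict standing in for the Redis store described above; the TTL and response shape are illustrative:

```python
import time

IDEMPOTENCY_TTL = 24 * 3600  # 24 hours, mirroring Stripe's documented retention window
_store: dict[str, tuple[float, dict]] = {}  # key -> (stored_at, cached response)

def process_payment(idempotency_key: str, request_body: dict) -> dict:
    now = time.time()
    cached = _store.get(idempotency_key)
    if cached is not None and now - cached[0] < IDEMPOTENCY_TTL:
        # Duplicate retry: return the original response without re-executing.
        return cached[1]
    # First time this key is seen: execute the state-changing operation exactly once.
    response = {"status": "charged", "amount": request_body["amount"]}
    _store[idempotency_key] = (now, response)
    return response
```

Note that the key must be generated by the client (one fresh key per logical operation, reused across retries of that operation); if the server generated it, a retried request would look like a new operation.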
Pagination: Handling Large Datasets
Cursor-Based Pagination (Required for Public APIs)
Returns a cursor (opaque token) pointing to the next page of results. The cursor encodes the position in the dataset, making it immune to insertions and deletions between page requests.
```json
{
  "data": [...],
  "meta": {
    "next_cursor": "eyJpZCI6MTIzNH0=",
    "has_more": true
  }
}
```
Why cursor pagination is mandatory: Offset pagination (?page=2&limit=20) breaks when new records are inserted. If 5 new items are added between page 1 and page 2, the client sees duplicates on page 2. For a social feed, product catalog, or any dataset that changes in real time, this produces a broken user experience.
Interview application: "I would implement cursor-based pagination for the feed endpoint. Each response includes a next_cursor token. The client passes this token to fetch the next page. This avoids the duplicate-record problem of offset pagination and maintains constant database query performance regardless of page depth."
Offset Pagination (Acceptable for Static Data)
?limit=20&offset=40 is simpler and acceptable for admin dashboards, static reports, or datasets that rarely change. Do not use it for user-facing, real-time data.
Authentication and Authorization
OAuth 2.0 is the standard for public APIs. It enables third-party applications to access user data without handling user credentials directly. Scopes limit what each token can do (read:users, write:orders).
API keys for server-to-server integrations where user context is not needed. API keys identify the client application, not the end user.
JWT (JSON Web Tokens) for stateless authentication. The server validates the token signature without a database lookup. Include expiration timestamps and refresh token rotation.
Interview application: "For the public API, I would use OAuth 2.0 with scoped access tokens. Third-party apps request specific scopes (read:products, write:orders) during authorization. Tokens expire after 1 hour with rotatable refresh tokens. For server-to-server integrations (webhooks, batch processing), I would issue API keys with per-key rate limits."
Error Handling: Predictable, Actionable Responses
Public API errors must be consistent, machine-readable, and actionable. External developers cannot call your support team for every error—the error response itself must explain what went wrong and how to fix it.
Standard error format:
```json
{
  "error": {
    "code": "INVALID_PARAMETER",
    "message": "The 'email' field must be a valid email address.",
    "field": "email",
    "documentation_url": "https://api.example.com/docs/errors#invalid-parameter"
  }
}
```
HTTP status codes to know for interviews: 200 (success), 201 (created), 204 (no content), 400 (bad request), 401 (unauthorized—authentication failed), 403 (forbidden—authenticated but not authorized), 404 (not found), 409 (conflict), 422 (unprocessable entity), 429 (rate limited), 500 (internal server error), 503 (service unavailable).
The 401 vs 403 distinction: 401 means "I do not know who you are" (missing or invalid token). 403 means "I know who you are, but you are not allowed to do this" (valid token, insufficient permissions). Interviewers test this distinction frequently.
Webhooks: Push Notifications for External Integrations
Instead of forcing clients to poll your API for changes, webhooks push event notifications to client-specified URLs when relevant events occur.
Design considerations: Require HTTPS endpoints. Include a signature header (HMAC-SHA256) so clients can verify the webhook originated from your system. Implement retry with exponential backoff for failed deliveries (3 retries over 24 hours). Include an event type and timestamp in every payload. Provide a webhook testing endpoint in your developer dashboard.
Interview application: "For order status updates, I would implement webhooks. Clients register a callback URL and select event types (order.created, order.shipped, order.delivered). When an event occurs, we POST the payload to their URL with an HMAC-SHA256 signature for verification. Failed deliveries retry 3 times with exponential backoff."
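Both sides of the HMAC-SHA256 signature scheme can be sketched in a few lines; the secret, header name, and event shape here are placeholders for the example:

```python
import hashlib
import hmac
import json

WEBHOOK_SECRET = b"whsec_placeholder"  # shared with the client at registration time

def sign_payload(payload: bytes) -> str:
    """Sender side: compute the value of the signature header for an outgoing webhook."""
    return hmac.new(WEBHOOK_SECRET, payload, hashlib.sha256).hexdigest()

def verify_signature(payload: bytes, signature_header: str) -> bool:
    """Receiver side: recompute over the raw body and compare in constant time."""
    expected = hmac.new(WEBHOOK_SECRET, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_header)

# Every payload carries an event type and timestamp, per the design considerations above.
event = json.dumps({
    "type": "order.shipped",
    "created": 1714060800,
    "data": {"order_id": "ord_123"},
}).encode()
signature = sign_payload(event)
```

The receiver must verify against the raw request bytes, not a re-serialized JSON object, since any re-serialization can reorder keys or change whitespace and break the HMAC.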
For structured practice applying API design principles across complete system design problems, Grokking the System Design Interview covers API design as a core component of every solution.
For advanced API patterns including API gateway design, rate limiting at scale, and webhook infrastructure, Grokking the Advanced System Design Interview builds the depth required for L6+ interviews. The system design interview guide provides the broader framework for integrating API design into every interview phase.
Frequently Asked Questions
Why is API design tested separately in system design interviews?
API design is the contract between your system and external consumers. Companies like Meta ("Pirate X" round), Stripe, and Amazon test it as a dedicated round because public API decisions—versioning, backward compatibility, rate limiting—are irreversible once external developers integrate. Poor API design creates permanent technical debt.
Should I use REST or GraphQL for a public API?
REST for most public APIs—it is the industry standard, supports native HTTP caching, and requires no specialized client tooling. GraphQL when clients need flexible queries over complex data graphs (Shopify, GitHub). gRPC for internal service-to-service communication where latency is critical. Never use gRPC for browser-facing public APIs.
What is the best API versioning strategy?
URL path versioning (/v1/users) for public APIs—it is explicit, visible, and trivially routable. This is what Stripe, Twilio, and most major APIs use. Header versioning keeps URLs clean but hides version information. Query parameter versioning clutters URLs and interferes with caching.
How do I implement rate limiting for a public API?
Use a token bucket algorithm at the API gateway. Assign per-client limits based on plan tier (free: 100/min, pro: 1,000/min). Return rate limit headers with every response (X-RateLimit-Remaining, X-RateLimit-Reset). Respond with 429 Too Many Requests when limits are exceeded, including a Retry-After header.
What is an idempotency key and why does it matter?
A client-provided unique identifier included with state-changing requests. The server checks if a request with that key was already processed and returns the cached response without re-executing. This prevents duplicate charges, orders, or bookings when clients retry after network failures. Stripe pioneered this pattern with a 24-hour TTL.
Should I use cursor or offset pagination?
Cursor-based pagination for any dataset that changes in real time (feeds, catalogs, search results)—it prevents duplicate records during paging. Offset pagination only for static, rarely-changing datasets (admin dashboards, reports). For public APIs, cursor pagination is the safe default.
How do I handle API deprecation?
Announce deprecation 12+ months in advance. Use a Sunset header with the deprecation date. Provide migration guides documenting exactly what changes. Run both old and new versions simultaneously during the transition. Monitor which clients are still on the old version and reach out directly.
What authentication should a public API use?
OAuth 2.0 for user-context APIs (third-party apps accessing user data). API keys for server-to-server integrations without user context. JWTs for stateless token validation. All three should be used over HTTPS only, with token expiration and rotation.
How should I design webhooks for external integrations?
Require HTTPS callback URLs. Include HMAC-SHA256 signatures for payload verification. Retry failed deliveries 3 times with exponential backoff. Include event type and timestamp in every payload. Provide a webhook testing endpoint in the developer dashboard.
What is the most common API design mistake in interviews?
Designing an internal-facing API when the question asks for a public API. Public APIs require strict versioning, backward compatibility guarantees, rate limiting, idempotency, comprehensive error messages, and documentation. Internal APIs can evolve freely. Always clarify "Who consumes this API?" during requirements.
TL;DR
Public API design is a contract that cannot be easily changed once external developers integrate. Use REST as the default (cacheable, simple, industry standard), with URL path versioning (/v1/users) and a 12-month deprecation policy for breaking changes. Implement rate limiting at the API gateway using token bucket with per-client limits and rate limit headers. Require idempotency keys on all state-changing endpoints to prevent duplicate operations during retries—Stripe's 24-hour TTL pattern is the standard. Use cursor-based pagination for all dynamic datasets to avoid duplicate records. Authenticate with OAuth 2.0 for user-context APIs and API keys for server-to-server integrations. Return consistent, actionable error responses with specific codes, messages, and documentation links. Implement webhooks with HMAC signatures and retry backoff for push notifications. In interviews, always clarify whether the API is public or internal—this distinction changes every design decision.