Rate Limiting
1. Algos
1.3. Fixed Window Counter
1.4. Sliding window log
1.5. Sliding window counter
2. Rate Limiting at different OSI Layers
2.1. Layer 1 - Physical Layer:
- Rate limiting is not applicable as this layer deals with the raw bit transmission over physical media.
2.2. Layer 2 - Data Link Layer:
- Rate limiting can be implemented through traffic shaping protocols (e.g., IEEE 802.1Q for VLAN tagging).
- Ethernet frames can be managed for flow control using techniques like pause frames.
2.3. Layer 3 - Network Layer:
- Routers can implement rate limiting via Access Control Lists (ACLs) and Quality of Service (QoS) configurations, thereby managing the rate of IP packets sent or received.
- Protocols like ICMP can be rate limited to mitigate network attacks.
2.4. Layer 4 - Transport Layer:
- Rate limiting can be enforced using TCP Window Scaling and congestion control mechanisms to regulate flow.
- Rate limiting tools such as firewall rules can be applied on protocols like TCP and UDP to limit connection rates.
2.5. Layer 5 - Session Layer:
- Rate limiting at this layer can be implemented for managing session initiation requests, particularly in VoIP applications.
- Techniques such as SIP rate limiting can help to prevent abuse and manage resource allocation.
2.6. Layer 6 - Presentation Layer:
- Not typically associated with direct rate limiting; occasional measures might be taken on data serialization/deserialization performance.
2.7. Layer 7 - Application Layer:
- Rate limiting is most commonly applied here, using middleware, API gateways, or web servers.
- Common methods include token bucket algorithm, leaky bucket algorithm, and request throttling mechanisms.
3. How to avoid hitting rate limits?
- Understand Rate Limits: Familiarize yourself with the specific rate limits imposed by the API or service you are using, including the number of requests allowed in a given time frame.
- Implement Backoff Strategies: Use exponential backoff strategies to gradually increase the wait time between requests when receiving errors related to rate limits.
- Optimize Request Frequency: Reduce the frequency of your requests by batching operations or consolidating data where possible : Client caches and throttles could help
- Prioritize Requests: Identify and prioritize the most important requests, postponing less critical ones to a later time.
- Use Caching: Implement caching mechanisms to store frequently requested data, thereby reducing the number of requests sent to the server.
- Monitor Usage: Keep track of your request counts and usage patterns to avoid unexpected spikes that could lead to rate limiting.
- Leverage Webhooks: If available, use webhooks to receive data in real-time instead of polling for updates, thus minimizing request volume.
- Error Handling: Implement robust error handling to manage rate limit responses gracefully and to retry requests after cooldown periods.
4. What to do in case of hitting rate limits?
- Optimize API Calls:
- Reduce the frequency of requests by merging multiple operations into a single call if possible.
- Use batch processing to minimize the number of calls.
- Cache Responses:
- Store responses temporarily to reduce the need for repeated requests.
- Consider using a caching strategy based on certain time intervals or specific query parameters.
- Monitor Usage:
- Track and log the number of API calls made to stay within the limits.
- Use tools or libraries that assist in monitoring API traffic.
- Request Higher Limits:
- Some services allow users to request an increase in their rate limits.
- Be prepared to justify your need for more limits.
- Distribute Requests:
- If permitted, distribute requests across multiple user accounts or environments.
- This method can help spread the load and reduce the likelihood of hitting limits.
- Use Alternative Endpoints:
- Check if the service provides alternative endpoints which may have different limits or better performance characteristics.