The IBM Application Gateway (IAG) allows incoming requests to be limited based on a user-defined set of criteria and policy.
Rate limiting helps protect against the following:
- Brute force attacks on sensitive information such as passwords or PINs;
- Denial of Service attacks on a server or the Web Reverse Proxy.
Rate limiting is performed by taking the incoming request and identifying the parts of the request that make it unique to a client. Information such as the IP address that made the connection, a session cookie, other header information, or the URL and HTTP method that were used can all be included to identify a client. Each repeated request from the same client increments a counter. When the counter is full, the IAG returns an error to the client or drops the incoming connection.
Rate limiting can be deployed with two types of end users in mind.
The first is the malicious end user who tries to damage the service or steal a credential by brute force. There are two approaches to handling a malicious user:
- The first approach is to handle them as efficiently as possible by closing the connection without performing any further request processing;
- The second approach is to mislead the malicious user by returning false information, so that it is unclear to the user whether the operation was successful or even processed.
The second is the valid user who needs to be stopped from making too many requests, in order to prevent service degradation or the overloading of a back-end server:
- A page that provides more information is more suitable for an end user who is intended to use the service.
A critical aspect of the rate limiting capability is being able to uniquely identify the client that originated the request. The IAG container that performs the rate limiting might not be the perimeter device in a connection from a client, so the incoming IP address might not be the IP address of the end user. In this instance, the device that is at the network perimeter should include the client address information in a header such as X-Forwarded-For, which can then be used by the IAG to identify the source of the request.
Rate limiting is configured by authoring a policy. This policy contains the following items:
- The requests that the policy applies to. A request is identified by its HTTP method and path;
- A rate limiting rule, which defines how the rate limiting policy should be applied. The rate limiting rule consists of:
  - The information in the request that identifies the client. This includes cookies, headers, query string parameters, the client IP address and, if available, credential attributes;
  - The threshold for how many identical requests a client can make before they need to be rate limited, and how long a client remains rate limited;
  - How to respond when a client is rate limited.
A rate limiting policy is defined in the IAG configuration YAML file.
Rate limiting can occur in two places during the IAG processing flow:
If the rate limiting policy does not include credential attributes as a value to use in the request, the rate limiting processing occurs very early when processing a request:
- This is before the authentication and authorization processing.
If the rate limiting policy includes credential attributes, the rate limiting processing takes place after the user's session has been verified:
- This is after HTTP transformations and authentication, but prior to authorization checks.
Please note that if rate limiting is applied early in the processing and the URL is re-written, the rate limiting check is not performed again.
All matching rate limiting policies are applied to each request. If a request satisfies multiple policies, they are all applied until one results in a reaction. This allows rate limiting policies to be layered. An example of a layered rate limiting policy, which can be used to rate limit a user account on a device, requires two policies, which are described below:
- The first policy limits a user's ability to request resources from a protected application based on client IP and HTTP method;
- Then you can use a second policy to rate limit the use of a session cookie, with a reaction which will invoke the logout page.
This way, when a user abuses the session, they are logged out and must log in again. With the first policy in place, authentication is also a limited operation.
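The layered policy described above might look like the following sketch. The policy names, the paths, and the capacity and interval values are assumptions chosen for illustration. The keys, the PD-S-SESSION-ID cookie name and the /pkmslogout URL are the ones used elsewhere in this topic.

```yaml
policies:
  rate_limiting:
    # First layer (hypothetical name): limit requests per client IP and
    # HTTP method. The path "*" is an assumption so that requests to the
    # login endpoint are covered as well.
    - name: "limit_by_ip"
      methods:
        - "GET"
        - "POST"
      paths:
        - "*"
      rule: |
        ip: true
        capacity: 100
        interval: 60
        reaction: TEMPLATE
    # Second layer (hypothetical name): limit the use of a single session
    # cookie and send the client to the logout page when it is exceeded.
    - name: "limit_by_session"
      methods:
        - "*"
      paths:
        - "/my_app*"
      rule: |
        cookie: 'PD-S-SESSION-ID: "*"'
        ip: false
        capacity: 50
        interval: 60
        reaction: "/pkmslogout"
```

The thresholds shown are placeholders. Suitable values depend on the traffic pattern of the protected application.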
A rate limiting policy is specified in the IAG configuration YAML file under the 'policies' section. These policies can be applied to gateway resources.
For example, to limit 'GET' requests to 3 requests per second, from each client IP address, for all resources under the path "/my_app":
resource_servers:
  - path: "/my_app"
    connection_type: "tcp"
    servers:
      - host: "10.10.10.200"
        port: 1337
    transparent_path: false
policies:
  rate_limiting:
    - name: "limited_by_ip"
      methods:
        - "GET"
      paths:
        - "/my_app*"
      rule: |
        ip: true
        capacity: 3
        interval: 1
        reaction: TEMPLATE
See Rate Limiting Example for an example IAG configuration YAML file with rate limiting policy.
The two parts of the matching criteria are HTTP methods and the request path. If a request matches the criteria the rate limiting policy is applied.
- The request path is specified with the 'paths' key value, as shown in the examples. The path pattern supports wild cards. Only one request path can be specified;
- The HTTP method is a list of methods to match on, or the value of '*' to indicate that all HTTP methods will match. Multiple methods can be specified.
- All matching on path and method is case insensitive.
Each set of matching criteria corresponds to a single rate limiting bucket. In other words, the access rate for each set of matching criteria is tracked separately.
An example which will limit attempts to POST to an application path is:
policies:
  rate_limiting:
    - name: "limited_by_ip"
      methods:
        - "POST"
      paths:
        - "/my_app*"
If you want to match on every request the following can be specified:
policies:
  rate_limiting:
    - name: "limited_by_ip"
      methods:
        - "*"
      paths:
        - "*"
To match all POST and GET requests regardless of URL:
policies:
  rate_limiting:
    - name: "limited_by_ip"
      methods:
        - "GET"
        - "POST"
      paths:
        - "*"
An example, which will apply to all requests to a given application path:
policies:
  rate_limiting:
    - name: "limited_by_ip"
      methods:
        - "*"
      paths:
        - "/my_app"
Trace records for rate limiting can be sent to a file within the IAG container. The trace component is 'pdweb.http.ratelimit'.
The trace level governs how much detail is logged. A level of '9' provides the most detailed output.
logging:
  tracing:
    - file_name: /var/tmp/http-ratelimit.log
      component: pdweb.http.ratelimit
      level: 9
See Enable Tracing for more information on configuring tracing.
The rate limiting rule consists of three key parts: the attributes that identify the client, the counter settings and the reaction.
The following attributes can be used to match an incoming request:
- headers;
- cookies;
- query string parameters;
- client IP;
- credential attributes.
Wild card characters can be used to match an attribute. However, when a match occurs the actual value rather than the configured pattern is used to identify the request. All matching is case insensitive.
If the request does not contain the specified header, cookie, query string parameter, or credential attribute, the request is not rate limited.
An example of limiting incoming requests based on user credentials would be:
rule: |
  header: 'Authorization: "Bearer *"'
  ip: true
  capacity: 3
  interval: 60
  reaction: TEMPLATE
An example, which limits the number of sessions created, involves rate limiting on the PD-S-SESSION-ID cookie:
rule: |
  cookie: 'PD-S-SESSION-ID: "*"'
  ip: true
  capacity: 3
  interval: 60
  reaction: TEMPLATE
An example of rate limiting based on a query string (for example: /my_app?resource=123) would be:
rule: |
  query: 'resource: 123'
  ip: true
  capacity: 3
  interval: 60
  reaction: TEMPLATE
An example of limiting access to any resource would be:
rule: |
  query: 'resource: "*"'
  ip: true
  capacity: 3
  interval: 60
  reaction: TEMPLATE
An example which uses a credential attribute to limit access of users based on username would be:
rule: |
  credential: 'AZN_CRED_PRINCIPAL_NAME: "*"'
  ip: true
  capacity: 3
  interval: 60
  reaction: TEMPLATE
A boolean flag, 'ip', controls whether the client IP address is also used to match the request. For example, to match on the credential attribute alone:
rule: |
  credential: 'AZN_CRED_PRINCIPAL_NAME: "*"'
  ip: false
  capacity: 3
  interval: 60
  reaction: TEMPLATE
Once a request has been matched, the identifying attributes (for example, URL, method, IP address, cookies, headers and query parameters) are built into a consistent lookup key.
A counter is kept for this key. This counter has two properties, the maximum value and how often the count is reset back to zero. These values are controlled by two configuration properties:
- A capacity which indicates the number of requests that are allowed until rate limiting occurs;
- An interval which is the number of seconds that must pass until the capacity is reset.
For example, to allow 100 requests per minute, set the following:
rule: |
  ip: true
  capacity: 100
  interval: 60
  reaction: TEMPLATE
The final part of the configuration is the reaction method. There are three ways to react:
- CLOSE: Close the connection without sending any information back to the client;
- TEMPLATE: Send the rate limiting template, with the status code 429 'Too Many Requests', back to the client. The default page returned to the client is the management page entitled 'ratelimit.html'. See Defining Custom Responses for more information on customizing IAG pages if a customized page is required;
- A URL: The request URL is rewritten to the specified value:
  - This can be used to route a rate limited request to a dummy resource which appears to provide the same functionality as the actual resource. The intent is to mislead a malicious client into thinking they are performing numerous operations while having no negative impact on the system;
  - This can also be used to log a user out. For example, you can direct them to /pkmslogout to terminate their session.
Please note that providing a reaction is optional. If no reaction is provided the 'TEMPLATE' reaction will be used.
For example, to close a rate limited connection:
rule: |
  ip: true
  capacity: 100
  interval: 60
  reaction: CLOSE
To return the template response for a rate limited connection:
rule: |
  ip: true
  capacity: 100
  interval: 60
  reaction: TEMPLATE
To re-write the URL for a rate limited connection:
rule: |
  ip: true
  capacity: 100
  interval: 60
  reaction: "/dummy-login"
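Putting the matching criteria, the counter and the reaction together, a policy aimed at the brute force case mentioned at the start of this topic might look like the sketch below. The policy name, the login path "/my_app/login*" and the threshold of 5 attempts per 5 minutes are hypothetical values chosen for illustration, not recommendations.

```yaml
policies:
  rate_limiting:
    # Hypothetical policy: slow down password guessing against a login form.
    - name: "limit_login_attempts"
      methods:
        - "POST"            # only the form submission is counted
      paths:
        - "/my_app/login*"  # assumed login path, adjust to the application
      rule: |
        ip: true
        capacity: 5
        interval: 300
        reaction: CLOSE
```

CLOSE is used here because it is the cheapest reaction for a suspected attacker. TEMPLATE may be the better choice if legitimate users are likely to reach the limit.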
When performing rate limiting, some information about the number of requests made for each client and policy must be stored. This information is stored in a cache which has a size limit. When this limit is exceeded, the oldest entry is evicted. This effectively resets the rate limiting counters for that client.
It is important to ensure that you set the configuration entry, 'max-ratelimiting-buckets', to a suitably high number so that a malicious client cannot saturate the cache. This number needs to be higher than the number of requests being rate limited across a refresh interval. If this value is not set it defaults to '16384'. Refer to the following YAML reference page for details on setting this configuration entry: max-ratelimiting-buckets.
When a load balancer or other network device terminates the connection before forwarding the request on to the IAG, the IP address flag is not useful. The IP address is effectively static, as it will always be set to the address of the network terminating device. In order to rate limit on an IP address in this situation, have the network terminating device include the client IP address as a header, and include this header in the rate limiting configuration. For example:
rule: |
  header: 'X-Forwarded-For: "*"'
  ip: true
  capacity: 3
  interval: 60
  reaction: TEMPLATE
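Because the connecting IP address is the same for every request in this deployment, it adds nothing to the lookup key. A variant that relies on the header alone, reusing the 'ip: false' flag shown earlier, could be sketched as follows. This assumes that the perimeter device always sets X-Forwarded-For and overwrites any value supplied by the client. As noted above, a request that does not contain the header is not rate limited by this rule.

```yaml
# Sketch only: identify the client by the forwarded address and ignore
# the (static) address of the network terminating device.
rule: |
  header: 'X-Forwarded-For: "*"'
  ip: false
  capacity: 3
  interval: 60
  reaction: TEMPLATE
```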
When rate limiting occurs the username which is found in the corresponding request log entry will always be set to 'unauthenticated'. This is because the rate limiting occurs before the authenticated user is identified. With no username value being present in the request log entry it is often useful to include the Client IP address or the X-Forwarded-For header in the request log to help correlate and identify rate limited entries.
When a connection is closed due to rate limiting, the status code which is found in the corresponding request log entry will be set to '-1'. This signifies that the request was dropped without a status.