6. HTTP header manipulation
- In HTTP mode, it is possible to rewrite, add or delete some of the request and
- response headers based on regular expressions. It is also possible to block a
- request or a response if a particular header matches a regular expression,
- which is enough to stop most elementary protocol attacks, and to protect
- against information leaks from the internal network.
-
- If HAProxy encounters an "Informational Response" (status code 1xx), it is able
- to process all rsp* rules which can allow, deny, rewrite or delete a header,
- but it will refuse to add a header to any such messages as this is not
- HTTP-compliant. The reason for still processing headers in such responses is to
- stop and/or fix any possible information leak which may happen, for instance
- because a downstream device would unconditionally add a header, or if
- a server name appears there. When such messages are seen, normal processing
- still occurs on the next non-informational messages.
-
- This section covers common usage of the following keywords, described in detail
- in section 4.2 :
-
- - reqadd <string>
- - reqallow <search>
- - reqiallow <search>
- - reqdel <search>
- - reqidel <search>
- - reqdeny <search>
- - reqideny <search>
- - reqpass <search>
- - reqipass <search>
- - reqrep <search> <replace>
- - reqirep <search> <replace>
- - reqtarpit <search>
- - reqitarpit <search>
- - rspadd <string>
- - rspdel <search>
- - rspidel <search>
- - rspdeny <search>
- - rspideny <search>
- - rsprep <search> <replace>
- - rspirep <search> <replace>
-
- With all these keywords, the same conventions are used. The <search> parameter
- is a POSIX extended regular expression (regex) which supports grouping through
- parentheses (without the backslash). Spaces and other delimiters must be
- prefixed with a backslash ('\') to avoid confusion with a field delimiter.
- Other characters may be prefixed with a backslash to change their meaning :
-
- \t for a tab
- \r for a carriage return (CR)
- \n for a new line (LF)
- \ to mark a space and differentiate it from a delimiter
- \# to mark a sharp and differentiate it from a comment
- \\ to use a backslash in a regex
- \\\\ to use a backslash in the text (*2 for regex, *2 for haproxy)
- \xXX to write the ASCII hex code XX as in the C language
-
- The <replace> parameter contains the string to be used to replace the largest
- portion of text matching the regex. It can make use of the special characters
- above, and can reference a substring which is delimited by parentheses in the
- regex, by writing a backslash ('\') immediately followed by one digit from 0 to
- 9 indicating the group position (0 designating the entire line). This practice
- is very common to users of the "sed" program.
-
- The <string> parameter represents the string which will systematically be added
- after the last header line. It can also use special character sequences above.
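
The conventions above can be illustrated with a short sketch. The proxy name,
server name and addresses below are placeholders chosen for illustration, not
values taken from this document; the rules themselves only use keywords listed
in this section.

```
# Hypothetical proxy, for illustration only.
listen www
    bind :80
    mode http
    server srv1 192.168.0.10:80

    # <search> holds two groups: \1 is the method, \2 is whatever follows
    # "/static/". The space is escaped so that it is not taken as a field
    # delimiter.
    reqrep  ^([^\ :]*)\ /static/(.*)    \1\ /\2

    # Same convention, ignoring case when matching the header name.
    reqirep ^Host:\ www.mydomain.com    Host:\ www

    # <string> is added after the last response header line.
    rspadd  X-Via:\ proxy1

    # Delete a response header which would reveal the server software.
    rspidel ^Server:.*
```

In the first rule, the escaped space in <replace> plays the same role as in
<search> : it keeps the replacement string in a single field.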
Notes related to these keywords :
- - these keywords are not always convenient to allow/deny based on header
- contents. It is strongly recommended to use ACLs with the "block" keyword
- instead, resulting in far more flexible and manageable rules.
-
- - lines are always considered as a whole. It is not possible to reference
- a header name only or a value only. This is important because of the way
- headers are written (notably the number of spaces after the colon).
-
- - the first line is always considered as a header, which makes it possible to
- rewrite or filter HTTP requests URIs or response codes, but in turn makes
- it harder to distinguish between headers and the request line. The regex prefix
- ^[^\ \t]*[\ \t] matches any HTTP method followed by a space, and the prefix
- ^[^ \t:]*: matches any header name followed by a colon.
-
- - for performance reasons, the number of characters added to a request or to
- a response is limited at build time to values between 1 and 4 kB. This
- should normally be far more than enough for most usages. If it is too short
- on occasional usages, it is possible to gain some space by removing some
- useless headers before adding new ones.
-
- - keywords beginning with "reqi" and "rspi" are the same as their counterparts
- without the 'i' letter except that they ignore case when matching patterns.
-
- - when a request passes through a frontend then a backend, all req* rules
- from the frontend will be evaluated, then all req* rules from the backend
- will be evaluated. The reverse path is applied to responses : all rsp* rules
- from the backend are evaluated first, then those from the frontend.
-
- - req* statements are applied after "block" statements, so that "block" is
- always the first one, but before "use_backend" in order to permit rewriting
- before switching.
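
The notes about ACLs and evaluation order can be sketched as follows. This is
only an illustration under assumptions : the frontend and backend names, the
"badbot" pattern and the "/old/" and "/new/" paths are all hypothetical.

```
# Hypothetical frontend, for illustration only.
frontend public
    bind :80
    mode http

    # Filtering on header contents with an ACL and "block", as recommended
    # above, instead of a reqdeny regex. "block" is evaluated first.
    acl bad_agent hdr_sub(User-Agent) -i badbot
    block if bad_agent

    # req* rules come next : the request line is rewritten here...
    reqrep ^([^\ :]*)\ /old/(.*)    \1\ /new/\2

    # ...and "use_backend" comes after them, so switching is expected to
    # be based on the rewritten request.
    acl is_new path_beg /new/
    use_backend new_servers if is_new
    default_backend old_servers
```

With this layout, the order in which the statements are written matters less
than the order described in the notes : "block", then req* rules, then
"use_backend".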