6. HTTP header manipulation

In HTTP mode, it is possible to rewrite, add or delete some of the request and
response headers based on regular expressions. It is also possible to block a
request or a response if a particular header matches a regular expression,
which is enough to stop most elementary protocol attacks, and to protect
against information leaks from the internal network.

If HAProxy encounters an "Informational Response" (status code 1xx), it is able
to process all rsp* rules which can allow, deny, rewrite or delete a header,
but it will refuse to add a header to any such message as this is not
HTTP-compliant. The reason for still processing headers in such responses is to
stop and/or fix any possible information leak which may happen, for instance
because another piece of downstream equipment unconditionally adds a header, or
because a server name appears there. When such messages are seen, normal
processing still occurs on the next non-informational messages.

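As a purely illustrative sketch (the header names and values are made up), the
two rules below behave differently when a 1xx message goes by : the deletion is
applied to every response, informational ones included, while the addition is
only performed on non-informational responses.

        # strip the Server header from all responses, 1xx ones included
        rspidel ^Server:.*
        # never added to a 1xx message
        rspadd  X-Via:\ proxy1
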
This section covers common usage of the following keywords, described in detail
in section 4.2 :

  - reqadd     <string>
  - reqallow   <search>
  - reqiallow  <search>
  - reqdel     <search>
  - reqidel    <search>
  - reqdeny    <search>
  - reqideny   <search>
  - reqpass    <search>
  - reqipass   <search>
  - reqrep     <search> <replace>
  - reqirep    <search> <replace>
  - reqtarpit  <search>
  - reqitarpit <search>
  - rspadd     <string>
  - rspdel     <search>
  - rspidel    <search>
  - rspdeny    <search>
  - rspideny   <search>
  - rsprep     <search> <replace>
  - rspirep    <search> <replace>

With all these keywords, the same conventions are used. The <search> parameter
is a POSIX extended regular expression (regex) which supports grouping through
parentheses (without the backslash). Spaces and other delimiters must be
prefixed with a backslash ('\') to avoid confusion with a field delimiter.
Other characters may be prefixed with a backslash to change their meaning :

  \t   for a tab
  \r   for a carriage return (CR)
  \n   for a new line (LF)
  \    to mark a space and differentiate it from a delimiter
  \#   to mark a sharp and differentiate it from a comment
  \\   to use a backslash in a regex
  \\\\ to use a backslash in the text (*2 for regex, *2 for haproxy)
  \xXX to write the ASCII hex code XX as in the C language

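For instance, assuming hypothetical header names and values chosen only to show
the escaping, such sequences could be used as follows :

        # the space after the colon is escaped, otherwise it would end the field
        reqidel   ^X-Forwarded-For:\ 10\.
        # "\#" is a literal sharp, not the beginning of a comment
        reqideny  ^X-Note:.*\#
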
The <replace> parameter contains the string to be used to replace the largest
portion of text matching the regex. It can make use of the special characters
above, and can reference a substring which is delimited by parentheses in the
regex, by writing a backslash ('\') immediately followed by one digit from 0 to
9 indicating the group position (0 designating the entire line). This practice
is very common among users of the "sed" program.

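The following lines sketch how group references are typically written. The
path and host names are only examples :

        # "\1" is the method, "\2" what follows "/static/" on the request line
        reqrep  ^([^\ :]*)\ /static/(.*)    \1\ /\2
        # no group is needed when the whole line is rebuilt
        reqirep ^Host:\ www\.example\.com   Host:\ www
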
The <string> parameter represents the string which will systematically be added
after the last header line. It can also use the special character sequences
above.
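
A minimal illustration, assuming a made-up header name and value. Note that
the space following the colon still has to be escaped :

        # append "X-Proto: SSL" after the last header of each request
        reqadd  X-Proto:\ SSL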

Notes related to these keywords :

  - these keywords are not always convenient to allow/deny based on header
    contents. It is strongly recommended to use ACLs with the "block" keyword
    instead, resulting in far more flexible and manageable rules.

  - lines are always considered as a whole. It is not possible to reference
    a header name only or a value only. This is important because of the way
    headers are written (notably the number of spaces after the colon).

  - the first line is always considered as a header, which makes it possible to
    rewrite or filter HTTP request URIs or response codes, but in turn makes
    it harder to distinguish between headers and the request line. The regex
    prefix ^[^\ \t]*[\ \t] matches any HTTP method followed by a space, and the
    prefix ^[^\ \t:]*: matches any header name followed by a colon.

  - for performance reasons, the number of characters added to a request or to
    a response is limited at build time to values between 1 and 4 kB. This
    should normally be far more than enough for most usages. If it
    occasionally proves too small, it is possible to gain some space by
    removing some useless headers before adding new ones.

  - keywords beginning with "reqi" and "rspi" are the same as their counterparts
    without the 'i' letter except that they ignore case when matching patterns.

  - when a request passes through a frontend then a backend, all req* rules
    from the frontend will be evaluated, then all req* rules from the backend
    will be evaluated. The reverse path is applied to responses.

  - req* statements are applied after "block" statements, so that "block" is
    always evaluated first, but before "use_backend" in order to permit
    rewriting before switching.
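
The notes above can be put together in a small sketch. All proxy names,
addresses, paths and header names below are assumptions made for the example,
not recommended settings :

        frontend www
            bind :80
            mode http
            # ACL + "block" instead of reqdeny ; evaluated before any req* rule
            acl local_dst hdr(host) -i localhost
            block if local_dst
            # request line only : method, space, then the URI
            reqrep  ^([^\ \t]*)[\ \t]/old/(.*)   \1\ /new/\2
            # header lines only : a name followed by a colon
            reqidel ^X-Debug:
            # switching is decided after the frontend's req* rules were applied
            acl is_new path_beg /new/
            use_backend app_new if is_new
            default_backend app

        backend app_new
            mode http
            # evaluated after the frontend's req* rules
            reqadd  X-Rewritten:\ yes
            # on the way back, backend rsp* rules run before the frontend's
            rspidel ^Server:.*
            server new1 192.0.2.11:8080

        backend app
            mode http
            server app1 192.0.2.10:8080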