Configuring HTTP requests

Go also supplies a lower-level interface for user agents to communicate with HTTP servers. As you might expect, it gives you more control over client requests but requires more effort to build them. The extra effort is small, however.

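The data type used to build requests is Request. This is a complex type. The listing that follows is from the documentation of an early Go release; in current releases Header has the type http.Header (a map[string][]string with methods such as Add and Get), the RawURL field is gone, and Referer and UserAgent are methods rather than fields: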

    type Request struct {
        Method     string // GET, POST, PUT, etc.
        RawURL     string // The raw URL given in the request.
        URL        *URL   // Parsed URL.
        Proto      string // "HTTP/1.0"
        ProtoMajor int    // 1
        ProtoMinor int    // 0

        // A header maps request lines to their values.
        // If the header says
        //
        //    accept-encoding: gzip, deflate
        //    Accept-Language: en-us
        //    Connection: keep-alive
        //
        // then
        //
        //    Header = map[string]string{
        //        "Accept-Encoding": "gzip, deflate",
        //        "Accept-Language": "en-us",
        //        "Connection": "keep-alive",
        //    }
        //
        // HTTP defines that header names are case-insensitive.
        // The request parser implements this by canonicalizing the
        // name, making the first character and any characters
        // following a hyphen uppercase and the rest lowercase.
        Header map[string]string

        // The message body.
        Body io.ReadCloser

        // ContentLength records the length of the associated content.
        // The value -1 indicates that the length is unknown.
        // Values >= 0 indicate that the given number of bytes may be read from Body.
        ContentLength int64

        // TransferEncoding lists the transfer encodings from outermost to innermost.
        // An empty list denotes the "identity" encoding.
        TransferEncoding []string

        // Whether to close the connection after replying to this request.
        Close bool

        // The host on which the URL is sought.
        // Per RFC 2616, this is either the value of the Host: header
        // or the host name given in the URL itself.
        Host string

        // The referring URL, if sent in the request.
        //
        // Referer is misspelled as in the request itself,
        // a mistake from the earliest days of HTTP.
        // This value can also be fetched from the Header map
        // as Header["Referer"]; the benefit of making it
        // available as a structure field is that the compiler
        // can diagnose programs that use the alternate
        // (correct English) spelling req.Referrer but cannot
        // diagnose programs that use Header["Referrer"].
        Referer string

        // The User-Agent: header string, if sent in the request.
        UserAgent string

        // The parsed form. Only available after ParseForm is called.
        Form map[string][]string

        // Trailer maps trailer keys to values. Like for Header, if the
        // response has multiple trailer lines with the same key, they will be
        // concatenated, delimited by commas.
        Trailer map[string]string
    }
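
The canonicalization rule described in the Header comment can be tried out directly. The following is a minimal sketch, assuming a current Go release in which headers have the type http.Header; the helper name canonical is made up for this illustration.

```go
package main

import (
	"fmt"
	"net/http"
)

// canonical is a hypothetical helper: it returns the canonical form of an
// HTTP header name, with the first character and any character following
// a hyphen in upper case and the rest in lower case.
func canonical(name string) string {
	return http.CanonicalHeaderKey(name)
}

func main() {
	h := http.Header{}
	// Set canonicalizes the key before storing the value.
	h.Set("accept-encoding", "gzip, deflate")

	fmt.Println(canonical("accept-encoding")) // prints "Accept-Encoding"
	// Get canonicalizes its argument too, so lookups ignore case.
	fmt.Println(h.Get("ACCEPT-ENCODING")) // prints "gzip, deflate"
}
```

Because both Set and Get canonicalize the name, a program rarely needs to call http.CanonicalHeaderKey itself; it matters mainly when indexing the underlying map directly.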

There is a lot of information that can be stored in a request. You do not need to fill in all fields, only those of interest. The simplest way to create a request with default values is to call http.NewRequest, for example

    request, err := http.NewRequest("GET", url.String(), nil)
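
In a complete program the URL has to be parsed first and both errors checked. The sketch below shows one way this might look; the helper name newGet and the address www.example.com are assumptions made for the example, not part of the original text.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// newGet is a hypothetical helper: it parses rawURL and builds a GET
// request for it with no body.
func newGet(rawURL string) (*http.Request, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return nil, err
	}
	// A nil body is appropriate for GET requests.
	return http.NewRequest("GET", u.String(), nil)
}

func main() {
	request, err := newGet("http://www.example.com/index.html")
	if err != nil {
		fmt.Println("cannot build request:", err)
		return
	}
	// NewRequest has filled in the method, the parsed URL and the host.
	fmt.Println(request.Method, request.URL.Host, request.URL.Path)
}
```

Nothing is sent at this point; the request is only a value in memory until it is passed to a client, for instance with the Do method of http.Client.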

Once a request has been created, you can modify its fields. For example, to specify that you only wish to receive UTF-8, add an “Accept-Charset” header to the request with

    request.Header.Add("Accept-Charset", "UTF-8;q=1, ISO-8859-1;q=0")

(Note that the default charset ISO-8859-1 always gets a quality value of 1 unless it is mentioned explicitly in the list.)

Setting a charset preference in a client request is simple, as shown above. There is, however, some confusion about the charset of the server’s response. The returned resource should have a Content-Type that specifies the media type of the content, such as text/html. Where appropriate, the media type should also state the charset, as in text/html; charset=UTF-8. If no charset is specified, then according to the HTTP specification the content should be treated as being in the default ISO-8859-1 charset. But the HTML 4 specification states that, since many servers do not conform to this, you cannot make any assumptions.

If a charset is specified in the server’s Content-Type, assume it is correct. If none is specified, then since 50% of pages are in UTF-8 and 20% are in ASCII (which is a subset of UTF-8), it is reasonably safe to assume UTF-8. Only 30% of pages may be wrong :-(.
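
This rule of thumb can be sketched in a few lines using the standard mime package. The function name charsetOf is hypothetical, and treating a malformed Content-Type the same as a missing charset is an assumption of this sketch rather than something any specification requires.

```go
package main

import (
	"fmt"
	"mime"
	"strings"
)

// charsetOf is a hypothetical helper: it returns the lower-cased charset
// named in a Content-Type value, or "utf-8" when the value is malformed
// or names no charset.
func charsetOf(contentType string) string {
	_, params, err := mime.ParseMediaType(contentType)
	if err != nil {
		// Assumption: fall back to UTF-8 for unparsable values.
		return "utf-8"
	}
	if cs, ok := params["charset"]; ok && cs != "" {
		return strings.ToLower(cs)
	}
	return "utf-8"
}

func main() {
	fmt.Println(charsetOf("text/html; charset=ISO-8859-1")) // prints "iso-8859-1"
	fmt.Println(charsetOf("text/html"))                     // prints "utf-8"
}
```

In a client, the argument would typically be the Content-Type header of the response, obtained with response.Header.Get("Content-Type").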