An advanced Go web client

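Because the web client of the previous section is relatively simplistic and offers no flexibility, in this section you will learn how to read a URL more elegantly, without using the http.Get() function and with more options. The program is named advancedWebClient.go and is presented in five parts.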

The first part of advancedWebClient.go contains the following code:

    package main

    import (
        "fmt"
        "net/http"
        "net/http/httputil"
        "net/url"
        "os"
        "path/filepath"
        "strings"
        "time"
    )

The second part of advancedWebClient.go is as follows:

    func main() {
        if len(os.Args) != 2 {
            fmt.Printf("Usage: %s URL\n", filepath.Base(os.Args[0]))
            return
        }

        // Parse the command-line argument into a url.URL value.
        URL, err := url.Parse(os.Args[1])
        if err != nil {
            fmt.Println("Error in parsing:", err)
            return
        }

The third part of advancedWebClient.go contains the following code:

        // Use a client with a timeout; a zero Timeout means no timeout at all.
        c := &http.Client{
            Timeout: 15 * time.Second,
        }

        request, err := http.NewRequest("GET", URL.String(), nil)
        if err != nil {
            fmt.Println("Get:", err)
            return
        }

        httpData, err := c.Do(request)
        if err != nil {
            fmt.Println("Error in Do():", err)
            return
        }
        // Close the response body when main() returns.
        defer httpData.Body.Close()

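The http.NewRequest() function returns an http.Request object, given a request method, a URL, and an optional body. The Do() method of http.Client (called as c.Do() here) sends that HTTP request (http.Request) using the client and returns an HTTP response (http.Response). In other words, Do() does the job of http.Get() in a more comprehensive way, because you control both the client and the request.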

The "GET" string passed to http.NewRequest() can be replaced with the http.MethodGet constant.
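
The "more options" that a request object gives you can be illustrated with a minimal sketch. It assumes a local test server created with httptest, so that it runs without network access; the fetch() and demo() helpers, the echoUserAgent handler, and the User-Agent value are hypothetical names chosen for this illustration and are not part of advancedWebClient.go.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// fetch sends a GET request with a custom User-Agent header and returns
// the status code and the response body. The header value is made up.
func fetch(c *http.Client, rawURL string) (int, string, error) {
	request, err := http.NewRequest(http.MethodGet, rawURL, nil)
	if err != nil {
		return 0, "", err
	}
	// Options such as headers are set on the request before it is sent.
	request.Header.Set("User-Agent", "advancedWebClient-sketch/0.1")

	response, err := c.Do(request)
	if err != nil {
		return 0, "", err
	}
	defer response.Body.Close()

	body, err := io.ReadAll(response.Body)
	if err != nil {
		return response.StatusCode, "", err
	}
	return response.StatusCode, string(body), nil
}

// echoUserAgent is a hypothetical handler that writes back the
// User-Agent header it received.
func echoUserAgent(w http.ResponseWriter, r *http.Request) {
	fmt.Fprint(w, r.UserAgent())
}

// demo starts a local test server, fetches it once, and shuts it down.
func demo() (int, string, error) {
	server := httptest.NewServer(http.HandlerFunc(echoUserAgent))
	defer server.Close()

	c := &http.Client{Timeout: 15 * time.Second}
	return fetch(c, server.URL)
}

func main() {
	code, body, err := demo()
	if err != nil {
		fmt.Println("Error:", err)
		return
	}
	fmt.Println(code, body)
}
```

The same pattern should apply to a real URL: build the request, adjust it, then pass it to c.Do().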

The fourth part of advancedWebClient.go contains the following code:

        fmt.Println("Status code:", httpData.Status)

        // Dump the status line and headers only (false skips the body).
        header, _ := httputil.DumpResponse(httpData, false)
        fmt.Println(string(header))

        // Extract whatever follows "charset=" in the Content-Type header.
        contentType := httpData.Header.Get("Content-Type")
        characterSet := strings.SplitAfter(contentType, "charset=")
        if len(characterSet) > 1 {
            fmt.Println("Character Set:", characterSet[1])
        }

        // ContentLength is -1 when the server did not announce a length.
        if httpData.ContentLength == -1 {
            fmt.Println("ContentLength is unknown!")
        } else {
            fmt.Println("ContentLength:", httpData.ContentLength)
        }

In the preceding code, you can see how to start searching the server response to find what you want.
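
Splitting on "charset=" returns everything after that string, including any parameters that follow it. As a possible alternative, the sketch below uses mime.ParseMediaType() from the standard library to pick out the charset parameter on its own; the charsetOf() helper is a hypothetical name used only for this illustration.

```go
package main

import (
	"fmt"
	"mime"
)

// charsetOf returns the charset parameter of a Content-Type value, or
// an empty string when the value is malformed or carries no charset.
func charsetOf(contentType string) string {
	_, params, err := mime.ParseMediaType(contentType)
	if err != nil {
		return ""
	}
	return params["charset"]
}

func main() {
	fmt.Println(charsetOf("text/html; charset=utf-8"))
	fmt.Println(charsetOf("text/html; charset=ISO-8859-7; boundary=x"))
	fmt.Printf("%q\n", charsetOf("application/json"))
}
```

In the fourth part of the program, the call would take httpData.Header.Get("Content-Type") as its argument.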

The last part of advancedWebClient.go is as follows:

        length := 0
        var buffer [1024]byte
        r := httpData.Body
        for {
            n, err := r.Read(buffer[0:])
            // Count the bytes first: Read() may return data together with an error such as EOF.
            length = length + n
            if err != nil {
                fmt.Println(err)
                break
            }
        }
        fmt.Println("Calculated response data length:", length)
    }

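In the preceding code, you can see a technique for calculating the size of the server's HTTP response. If you want to display the HTML output on your screen, you can print the bytes that each call to r.Read() places in the buffer variable, that is, buffer[0:n], inside the loop.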
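
The counting technique can be tried without a network connection. The following is a minimal sketch, assuming any io.Reader can stand in for the response body; countBytes() and countSample() are hypothetical helpers, and the input is generated with strings.Repeat() rather than fetched from a server.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// countBytes reads r to the end and returns the number of bytes read.
func countBytes(r io.Reader) (int, error) {
	length := 0
	var buffer [1024]byte
	for {
		n, err := r.Read(buffer[:])
		// Add n before looking at err: a Reader is allowed to return
		// n > 0 together with io.EOF on its final call.
		length += n
		if err == io.EOF {
			return length, nil
		}
		if err != nil {
			return length, err
		}
	}
}

// countSample counts the bytes of a generated string of the given size.
func countSample(size int) (int, error) {
	return countBytes(strings.NewReader(strings.Repeat("x", size)))
}

func main() {
	n, err := countSample(2500)
	if err != nil {
		fmt.Println("Error:", err)
		return
	}
	fmt.Println("Calculated data length:", n)
}
```

If the data itself is not needed, io.Copy(io.Discard, r) should give the same count with less code.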

Using advancedWebClient.go to visit a web page generates the following output, which is richer than before:

    $ go run advancedWebClient.go http://www.mtsoukalos.eu
    Status code: 200 OK
    HTTP/1.1 200 OK
    Accept-Ranges: bytes
    Age: 0
    Cache-Control: no-cache, must-revalidate
    Connection: keep-alive
    Content-Language: en
    Content-Type: text/html; charset=utf-8
    Date: Sat, 24 Mar 2018 18:52:17 GMT
    Expires: Sun, 19 Nov 1978 05:00:00 GMT
    Server: Apache/2.4.25 (Debian) PHP/5.6.33-0+deb8u1 mod_wsgi/4.5.11 Python/2.7
    Vary: Accept-Encoding
    Via: 1.1 varnish (Varnish/5.0)
    X-Content-Type-Options: nosniff
    X-Frame-Options: SAMEORIGIN
    X-Generator: Drupal 7 (http://drupal.org)
    X-Powered-By: PHP/5.6.33-0+deb8u1
    X-Varnish: 886025
    Character Set: utf-8
    ContentLength is unknown!
    EOF
    Calculated response data length: 50176

Executing advancedWebClient.go against a different URL returns a slightly different output:

    $ go run advancedWebClient.go http://www.google.com
    Status code: 200 OK
    HTTP/1.1 200 OK
    Cache-Control: private, max-age=0
    Content-Type: text/html; charset=ISO-8859-7
    Date: Sat, 24 Mar 2018 18:52:38 GMT
    Expires: -1
    P3p: CP="This is not a P3P policy! See g.co/p3phelp for more info."
    Server: gws
    Set-Cookie: 1P_JAR=2018-03-24-18; expires=Mon, 23-Apr-2018 18:52:38 GMT; path=/;domain=.google.gr
    Set-Cookie: NID=126=csX1_koD30SJcC_1jAfcM2V8kTfRkppmAdmLjINLfclracMxuk6JGe4glc0Pjs8uD00bqGaxkSW-J-ZNDJexG2ZX9pNB9E_dRc2y1KZ05V7pk0boczE2FtS1zb50Uof1; expires=Sun, 23-Sep-2018 18:52:38 GMT; path=/; domain=.google.gr; HttpOnly
    X-Frame-Options: SAMEORIGIN
    X-Xss-Protection: 1; mode=block
    Character Set: ISO-8859-7
    ContentLength is unknown!
    EOF
    Calculated response data length: 10240

If you try to fetch an erroneous URL with advancedWebClient.go, you will get the following kind of output:

    $ go run advancedWebClient.go http://www.google
    Error in Do(): Get http://www.google: dial tcp: lookup www.google: no such host
    $ go run advancedWebClient.go www.google.com
    Error in Do(): Get www.google.com: unsupported protocol scheme ""
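
The second error shows that url.Parse() accepts a value without a scheme, so the problem only surfaces when the request is sent. One way to catch it earlier is sketched below, assuming that only http and https URLs are of interest; checkURL() is a hypothetical helper that is not part of advancedWebClient.go.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// checkURL parses rawURL and rejects values that lack an http or
// https scheme or that have no host part.
func checkURL(rawURL string) (*url.URL, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return nil, err
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return nil, fmt.Errorf("unsupported scheme %q in %q", u.Scheme, rawURL)
	}
	if u.Host == "" {
		return nil, errors.New("missing host in " + rawURL)
	}
	return u, nil
}

func main() {
	for _, raw := range []string{"http://www.google.com", "www.google.com"} {
		u, err := checkURL(raw)
		if err != nil {
			fmt.Println("Rejected:", err)
			continue
		}
		fmt.Println("Accepted:", u.String())
	}
}
```

A check like this cannot detect the first error, because http://www.google is syntactically valid and fails only at DNS lookup time.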

Feel free to modify advancedWebClient.go to make its output match your requirements!