Go web客户端进阶
由于上一节的 web 客户端相当简单并没有任何灵活性,在这节您将学习如何更优雅的读取一个 URL,不使用 http.Get() 函数并且没有更多选项。这个演示程序命名为 advancedWelClient.go
,并分为五个部分展示。
advancedWebClient.go
的第一部份包含如下代码:
package main
import (
"fmt"
"net/http"
"net/http/httputil"
"net/url"
"os"
"path/filepath"
"strings"
"time"
)
advancedWebClient.go
的第二部分如下:
func main() {
if len(os.Args) != 2 {
fmt.Printf("Usage: %s URL\n", filepath.Base(os.Args[0]))
return
}
URL, err := url.Parse(os.Args[1])
if err != nil {
fmt.Println("Error in parsing:", err)
return
}
advancedWebClient.go
的第三部分代码如下:
c := &http.Client{
Timeout: 15 * time.Second,
}
request, err := http.NewRequest("GET", URL.String(), nil)
if err != nil {
fmt.Println("Get:", err)
return
}
httpData, err := c.Do(request)
if err != nil {
fmt.Println("Error in Do():", err)
return
}
http.NewRequest()
函数返回一个 http.Request
对象,它被赋予一个请求方法,一个 URL 和一个可选的消息体。http.Do()
函数使用 http.Client
对象发送一个 HTTP 请求(htt.Request),并获得一个 HTTP 响应(http.Response)。http.Do()
以一种更易理解的方式做了 http.Get()
的工作。
在 http.NewRequest()
使用的 GET
字符串可以用 http.MethodGet
替换。
advancedWebClient.go
的第四部分包含代码如下:
fmt.Println("Status code:", httpData.Status)
header, _ := httputil.DumpResponse(httpData, false)
fmt.Println(string(header))
contentType := httpData.Header.Get("Content-Type")
characterSet := strings.SplitAfter(contentType, "charset=")
if len(characterSet) > 1 {
fmt.Println("Character Set:", characterSet[1])
}
if httpData.ContentLength == -1 {
fmt.Println("ContentLength is unknown!")
} else {
fmt.Println("ContentLength:", httpData.ContentLength)
}
上面这段代码,您能看到如何开始搜索服务器响应来找到我们想要的。
advancedWebClient.go
的最后一部分如下:
length := 0
var buffer [1024]byte
r := httpData.Body
for {
n, err := r.Read(buffer[0:])
if err != nil {
fmt.Println(err)
break
}
length = length + n
}
fmt.Println("Calculated response data length:", length)
}
上面这段代码,您能看到一个计算服务器 HTTP 响应大小的技巧。如果您想显示这个 HTML 输出在您的屏幕上,您可以打印这个 r
buffer 变量内容。
使用 advancedWebClient.go
访问一个 web 页面将产生如下比之前更丰富的输出:
$ go run advancedWebClient.go http://www.mtsoukalos.eu
Status code: 200 OK
HTTP/1.1 200 OK
Accept-Ranges: bytes
Age: 0
Cache-Control: no-cache, must-revalidate
Connection: keep-alive
Content-Language: en
Content-Type: text/html; charset=utf-8
Date: Sat, 24 Mar 2018 18:52:17 GMT
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Server: Apache/2.4.25 (Debian) PHP/5.6.33-0+deb8u1 mod_wsgi/4.5.11 Python/2.7
Vary: Accept-Encoding
Via: 1.1 varnish (Varnish/5.0)
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-Generator: Drupal 7 (http://drupal.org)
X-Powered-By: PHP/5.6.33-0+deb8u1
X-Varnish: 886025
Character Set: utf-8
ContentLength is unknown!
EOF
Calculated response data length: 50176
执行 advancedWebClient.go
访问一个不同的 URL 将返回一个稍有不同的输出:
$ go run advancedWebClient.go http://www.google.com
Status code: 200 OK
HTTP/1.1 200 OK
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-7
Date: Sat, 24 Mar 2018 18:52:38 GMT
Expires: -1
P3p: CP="This is not a P3P policy! See g.co/p3phelp for more info."
Server: gws
Set-Cookie: 1P_JAR=2018-03-24-18; expires=Mon, 23-Apr-2018 18:52:38 GMT; path=/;domain=.google.gr
Set-Cookie:
NID=126=csX1_koD30SJcC_1jAfcM2V8kTfRkppmAdmLjINLfclracMxuk6JGe4glc0Pjs8uD00bqGaxkSW-J-ZNDJexG2ZX9pNB9E_dRc2y1KZ05V7pk0boczE2FtS1zb50Uof1; expires=Sun, 23-Sep-2018 18:52:38 GMT; path=/; domain=.google.gr; HttpOnly
X-Frame-OPtions: SAMEORIGIN
X-Xss-Protection: 1; mode=block
Character Set: ISO-8859-7
ContentLength in unknown!
EOF
Calculated response data length: 10240
如果您使用 advancedWebClient.go
试图获取一个错误的 URL,将获得以下输出:
$ go run advancedWebClient.go http://www.google
Error in Do(): Get http://www.google: dial tcp: lookup www.google: no such host
$ go run advancedWebClient.go www.google.com
Error in Do(): Get www.google.com: unsupported protocol scheme ""
随意修改 advancedWebClient.go
以达到您想要的输出!