http库源码解析

"go Study"

Posted by ming on April 28, 2022

“The hard thing to do and the right thing to do are usually the same thing.”

1. golang搭建Http服务的原理

首先,我们给出一个简单的HTTP服务器的例子:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
package main

import (
    "fmt"
    "net/http"
    "log"
)

func sayHelloWorld(w http.ResponseWriter, r *http.Request)  {
	r.ParseForm() // 解析参数
	fmt.Println(r.Form) // 在服务端打印请求参数
	fmt.Println("URL:", r.URL.Path)  // 请求 URL
	fmt.Println("Scheme:", r.URL.Scheme)
	for k, v := range r.Form {
		fmt.Println(k, ":", strings.Join(v, ""))
	}
	fmt.Fprintf(w, "你好!")  // 发送响应到客户端
}

func main()  {
	http.HandleFunc("/", sayHelloWorld)
	err := http.ListenAndServe(":9091", nil)
	if err != nil {
		log.Fatal("ListenAndServe: ", err)
	}
}

对于一个HTTP服务器,其实本质上就是其如何接收处理客户端传递过来的request,并向客户端返回相应的response的过程。而在接收request的过程中,最重要的莫过于路由(router),即如何实现一个Multiplexer。对于GO而言,我们既可以使用内置的mutilplexer--DefaultServeMux,也可以自定义。Multiplexer路由的目的就是为了找到处理器函数(hander),后者将对request进行处理,同时构建response。

简单地总结下请求的整个流程如下:

Client -> Request -> [Multiplexer(router)] -> handler -> Response -> Client

因为,了解golang服务http的整体细节,最重要的就是要理解Multiplexerhandler。其实,Golang中的Multiplexer基于ServerMux结构,同时也实现了Handler接口。

1
2
3
4
5
6
7
8
9
10
type ServeMux struct {
	mu    sync.RWMutex
	m     map[string]muxEntry
	es    []muxEntry // slice of entries sorted from longest to shortest.
	hosts bool       // whether any patterns contain hostnames
}

type Handler interface {
	ServeHTTP(ResponseWriter, *Request)
}

1.1 概念介绍

在介绍具体的源码实现细节之前,我们先来讲述几个概念:

  • 路由:实际开发中一个HTTP接口会有许多的URL对应的Handler,如何根据URL找到对应的Handler就是路由所起的作用。
  • handler函数:具有func(w http.ResponseWriter, r *http.Requests)签名的函数,例如我们上面例子定义的sayHelloWorld函数就是。
  • handler处理器(函数):经过HanderFunc结构包装的handler函数,它实现了ServeHTTP接口方法的函数。调用handler处理器的ServeHTTP方法时,即调用handler函数本身.

    其实这句话挺绕的,让人比较不容易理解。其实func(w http.ResponseWriter, r *http.Requests)签名也可以被看成一种类型(http库里面就是对这种类型重定义了下,定义成了HandlerFunc这种类型),因为也可以为其实现相应的方法。直接上代码就比较好理解了。

1
2
3
4
5
6
7
8
9
10
// The HandlerFunc type is an adapter to allow the use of
// ordinary functions as HTTP handlers. If f is a function
// with the appropriate signature, HandlerFunc(f) is a
// Handler that calls f.
type HandlerFunc func(ResponseWriter, *Request)

// ServeHTTP calls f(w, r).
func (f HandlerFunc) ServeHTTP(w ResponseWriter, r *Request) {
	f(w, r)
}
  • handler对象:实现了Handler接口ServerHttp方法的结构。

Golang的http处理流程可以用下面的一张图来表示:

http流程

1.1.1 路由

net/http库中ServeMux(mux是multiplexor缩写)帮我们实现了这个路由功能。我们可以自定义相关的路由进行处理:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
package main

import (
	"io"
	"net/http"
)

func header(w http.ResponseWriter, r *http.Request) {
	encoding := r.Header["Accept-Encoding"]
	var encodingStr []byte
	for _, value := range encoding {
		encodingStr = append(encodingStr, []byte(value)...)
	}
	w.Write(encodingStr)
}

func hello(w http.ResponseWriter, r *http.Request) {
	io.WriteString(w, "hello world\n")
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/header", header)
	mux.HandleFunc("/hello", hello)
	http.ListenAndServe(":8080", mux)
}

分析下代码,先通过http.NewServeMux()返回一个ServeMux结构体,然后调mux.HandleFunc()绑定Handler到对应的URL上。我们之前的例子调用的是http.HandleFunc(),其实它内部调用的是DefaultServeMux.HandleFunc(pattern, handler),这个net库提供的一个默认的ServerMux

1
2
3
4
// DefaultServeMux is the default ServeMux used by Serve.
var DefaultServeMux = &defaultServeMux

var defaultServeMux ServeMux

1.1.2 ServeMux

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
type ServeMux struct {
	mu    sync.RWMutex
	m     map[string]muxEntry
	es    []muxEntry // slice of entries sorted from longest to shortest.
	hosts bool       // whether any patterns contain hostnames
}

type muxEntry struct {
	h       Handler
	pattern string
}

// ServeHTTP dispatches the request to the handler whose
// pattern most closely matches the request URL.
func (mux *ServeMux) ServeHTTP(w ResponseWriter, r *Request) {
	if r.RequestURI == "*" {
		if r.ProtoAtLeast(1, 1) {
			w.Header().Set("Connection", "close")
		}
		w.WriteHeader(StatusBadRequest)
		return
	}
	h, _ := mux.Handler(r)
	h.ServeHTTP(w, r)
}

ServeMux结构中最重要的字段就是map结构m,其key是一些url模式,value是一个muxEntry结构,后者里面存储了具体的url模式和相关的handler。

从上面的源码我们也可以看到,ServeMux也实现了ServeHTTP接口,也算是一个handler,不过ServeMuxServeHTTP方法并不是用来处理request和response的,而是用来找到路由注册的handler(本质上是用来透传相应的参数)。

1.1.3 Server

Server也是很重要的一个结构,从之前http.ListenAndServe的源码可以看出,它创建了一个server对象,并调用server对象的ListenAndServe方法:

1
2
3
4
func ListenAndServe(addr string, handler Handler) error {
	server := &Server{Addr: addr, Handler: handler}
	return server.ListenAndServe()
}

给出Server的结构如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
type Server struct {
	Addr string

	Handler Handler // handler to invoke, http.DefaultServeMux if nil
	TLSConfig *tls.Config
	ReadTimeout time.Duration
	ReadHeaderTimeout time.Duration
	WriteTimeout time.Duration
	IdleTimeout time.Duration
	MaxHeaderBytes int
	TLSNextProto map[string]func(*Server, *tls.Conn, Handler)
	ConnState func(net.Conn, ConnState)
	ErrorLog *log.Logger
	BaseContext func(net.Listener) context.Context
	ConnContext func(ctx context.Context, c net.Conn) context.Context

	disableKeepAlives int32     // accessed atomically.
	inShutdown        int32     // accessed atomically (non-zero means we're in Shutdown)
	nextProtoOnce     sync.Once // guards setupHTTP2_* init
	nextProtoErr      error     // result of http2.ConfigureServer if used

	mu         sync.Mutex
	listeners  map[*net.Listener]struct{}
	activeConn map[*conn]struct{}
	doneChan   chan struct{}
	onShutdown []func()
}

server结构存储了服务器处理请求常见的字段。其中Handler字段也保留Hander接口。如果Server接口没有提供Handler结构对象,那么会使用DefaultServeMux做Multiplexer。

2.源码解析

2.1 路由注册部分

我们先从http.HandleFunc()方法看起:

1
2
3
4
5
6
// HandleFunc registers the handler function for the given pattern
// in the DefaultServeMux.
// The documentation for ServeMux explains how patterns are matched.
func HandleFunc(pattern string, handler func(ResponseWriter, *Request)) {
	DefaultServeMux.HandleFunc(pattern, handler)
}

这里面只有一行,调用DefaultServeMux.HandleFunc(pattern, handler):

1
2
3
4
5
6
7
// HandleFunc registers the handler function for the given pattern.
func (mux *ServeMux) HandleFunc(pattern string, handler func(ResponseWriter, *Request)) {
	if handler == nil {
		panic("http: nil handler")
	}
	mux.Handle(pattern, HandlerFunc(handler))
}

上述代码中,HandlerFunc是一个函数类型。同时实现了Handler接口的ServeHTTP方法。使用HandlerFunc类型包装一下路由定义的indexHandler函数,其目的就是为了让这个函数也实现ServeHTTP方法,即转变成一个handler处理器(函数)。

其实,本质上就是让我们的indexHandler函数也有了ServeHTTP方法。

此外,ServeMux的Handle方法,将会对pattern和handler函数做一个map映射:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
// Handle registers the handler for the given pattern.
// If a handler already exists for pattern, Handle panics.
func (mux *ServeMux) Handle(pattern string, handler Handler) {
	mux.mu.Lock()
	defer mux.mu.Unlock()

	if pattern == "" {
		panic("http: invalid pattern")
	}
	if handler == nil {
		panic("http: nil handler")
	}
	if _, exist := mux.m[pattern]; exist {
		panic("http: multiple registrations for " + pattern)
	}

	if mux.m == nil {
		mux.m = make(map[string]muxEntry)
	}
	e := muxEntry{h: handler, pattern: pattern}
	mux.m[pattern] = e
	if pattern[len(pattern)-1] == '/' {
		mux.es = appendSorted(mux.es, e)
	}

	if pattern[0] != '/' {
		mux.hosts = true
	}
}

通过上面的代码,我们可以看到Handle函数的主要目的就是在于把handler和pattern模式绑定到map[string]muxEntry的map上,URL其实就是作为map的key。这里面还有一个细节点,appendSorted()函数:

1
2
3
if pattern[len(pattern)-1] == '/' {
	mux.es = appendSorted(mux.es, e)
}

判断pattern是否是以 '/'结尾,如果是的话表示后面可以跟其他子路径,如果不是表示的是一个叶子节点。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
func appendSorted(es []muxEntry, e muxEntry) []muxEntry {
	n := len(es)
	i := sort.Search(n, func(i int) bool {
		return len(es[i].pattern) < len(e.pattern)
	})
	if i == n {
		return append(es, e)
	}
	// we now know that i points at where we want to insert
	es = append(es, muxEntry{}) // try to grow the slice in place, any entry works.
	copy(es[i+1:], es[i:])      // Move shorter entries down
	es[i] = e
	return es
}

分析下,sort.Search()方法是找es中是否有比e.pattern中长度更短的,如果找到返回第一次出现的下标,没找到返回n;如果没有找到就添加到muxEntry中返回,如果有就将e.pattern插入找到的i的位置。

es是ServeMux用来进行路由匹配的,采用的是最长匹配原则。当存在有多个匹配项时,是会优先匹配最长的pattern。

2.2 路由监听部分

http.ListenAndServe()方法看起:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
func ListenAndServe(addr string, handler Handler) error {
	server := &Server{Addr: addr, Handler: handler}
	return server.ListenAndServe()
}


func (srv *Server) ListenAndServe() error {
	if srv.shuttingDown() {
		return ErrServerClosed
	}
	addr := srv.Addr
	if addr == "" {
		addr = ":http"
	}
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		return err
	}
	return srv.Serve(ln)
}

代码很简单,就是监听了TCP连接,然后调用server对象的Serve()方法来处理这次请求。来细看下处理请求的逻辑:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
// Serve accepts incoming connections on the Listener l, creating a
// new service goroutine for each. The service goroutines read requests and
// then call srv.Handler to reply to them.
//
// HTTP/2 support is only enabled if the Listener returns *tls.Conn
// connections and they were configured with "h2" in the TLS
// Config.NextProtos.
//
// Serve always returns a non-nil error and closes l.
// After Shutdown or Close, the returned error is ErrServerClosed.
func (srv *Server) Serve(l net.Listener) error {
	if fn := testHookServerServe; fn != nil {
		fn(srv, l) // call hook with unwrapped listener
	}

	origListener := l
	l = &onceCloseListener{Listener: l}
	defer l.Close()

	...

	var tempDelay time.Duration // how long to sleep on accept failure

    // 处理请求核心部分
	ctx := context.WithValue(baseCtx, ServerContextKey, srv)
	for {
		rw, err := l.Accept()
		if err != nil {
			select {
			case <-srv.getDoneChan():
				return ErrServerClosed
			default:
			}
			if ne, ok := err.(net.Error); ok && ne.Temporary() {
				if tempDelay == 0 {
					tempDelay = 5 * time.Millisecond
				} else {
					tempDelay *= 2
				}
				if max := 1 * time.Second; tempDelay > max {
					tempDelay = max
				}
				srv.logf("http: Accept error: %v; retrying in %v", err, tempDelay)
				time.Sleep(tempDelay)
				continue
			}
			return err
		}
		connCtx := ctx
		if cc := srv.ConnContext; cc != nil {
			connCtx = cc(connCtx, rw)
			if connCtx == nil {
				panic("ConnContext returned nil")
			}
		}
		tempDelay = 0
		c := srv.newConn(rw)
		c.setState(c.rwc, StateNew) // before Serve can return
		go c.serve(connCtx)
	}
}

略去一些不必要的细节,Serve()方法的功能主要就是创建了一个上下文对象,然后调用Listener的Accept方法用来获取连接数据并使用newConn方法创建连接对象,然后通过go c.serve(connCtx)处理连接。可以看到每一个连接都开启了一个协程,请求的上下文都不同。来看下serve()方法的细节:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
// Serve a new connection.
func (c *conn) serve(ctx context.Context) {
	c.remoteAddr = c.rwc.RemoteAddr().String()
	ctx = context.WithValue(ctx, LocalAddrContextKey, c.rwc.LocalAddr())
	defer func() {
		if err := recover(); err != nil && err != ErrAbortHandler {
			const size = 64 << 10
			buf := make([]byte, size)
			buf = buf[:runtime.Stack(buf, false)]
			c.server.logf("http: panic serving %v: %v\n%s", c.remoteAddr, err, buf)
		}
		if !c.hijacked() {
			c.close()
			c.setState(c.rwc, StateClosed)
		}
	}()

	...

	for {
		w, err := c.readRequest(ctx)
		if c.r.remain != c.server.initialReadLimitSize() {
			// If we read any bytes off the wire, we're active.
			c.setState(c.rwc, StateActive)
		}
		
        ...

		// HTTP cannot have multiple simultaneous active requests.[*]
		// Until the server replies to this request, it can't read another,
		// so we might as well run the handler in this goroutine.
		// [*] Not strictly true: HTTP pipelining. We could let them all process
		// in parallel even if their responses need to be serialized.
		// But we're not going to implement HTTP pipelining because it
		// was never deployed in the wild and the answer is HTTP/2.
		serverHandler{c.server}.ServeHTTP(w, w.req)

		...
	}
}

serve()方法的逻辑比较长,但总结下其实也还好:

  • 使用defer定义了函数退出时,连接关闭相关的处理。
  • 然后就是读取连接的网络数据,并处理读取完毕时候的状态。
  • 接下来就是调用serverHandler{c.server}.ServeHTTP(w,w.req)方法处理请求了。
  • 最后就是请求处理完毕的逻辑。

serverHandler是一个比较重要的结构体,它仅有一个字段,即Server结构,同时它也实现了Hander接口方法ServeHTTP

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
// serverHandler delegates to either the server's Handler or
// DefaultServeMux and also handles "OPTIONS *" requests.
type serverHandler struct {
	srv *Server
}

func (sh serverHandler) ServeHTTP(rw ResponseWriter, req *Request) {
	handler := sh.srv.Handler
	if handler == nil {
		handler = DefaultServeMux
	}
	if req.RequestURI == "*" && req.Method == "OPTIONS" {
		handler = globalOptionsHandler{}
	}
	handler.ServeHTTP(rw, req)
}

在该接口方法中做了一个重要的事情,初始化multiplexer路由多路复用器。如果server对象没有制定Handler,则使用默认的DefaultServeMux作为路由Multiplexer。并调用初始化Handler的ServeHTTP方法,我们以DefaultServeMux为例,进行详细的讲解:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
// ServeHTTP dispatches the request to the handler whose
// pattern most closely matches the request URL.
func (mux *ServeMux) ServeHTTP(w ResponseWriter, r *Request) {
	if r.RequestURI == "*" {
		if r.ProtoAtLeast(1, 1) {
			w.Header().Set("Connection", "close")
		}
		w.WriteHeader(StatusBadRequest)
		return
	}
	h, _ := mux.Handler(r)
	h.ServeHTTP(w, r)
}

// net/http/server.go:L1963-1965
func (f HandlerFunc) ServeHTTP(w ResponseWriter, r *Request) {
    f(w, r)
}

mux的ServeHTTP方法通过调用其Handler方法寻找注册到路由上的handler函数,并调用该函数的ServeHTTP方法,依次看下这两个方法具体干了些什么,先看看Handler()函数:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
func (mux *ServeMux) Handler(r *Request) (h Handler, pattern string) {

	// CONNECT requests are not canonicalized.
	if r.Method == "CONNECT" {
		// If r.URL.Path is /tree and its handler is not registered,
		// the /tree -> /tree/ redirect applies to CONNECT requests
		// but the path canonicalization does not.
		if u, ok := mux.redirectToPathSlash(r.URL.Host, r.URL.Path, r.URL); ok {
			return RedirectHandler(u.String(), StatusMovedPermanently), u.Path
		}

		return mux.handler(r.Host, r.URL.Path)
	}

	// All other requests have any port stripped and path cleaned
	// before passing to mux.handler.
	host := stripHostPort(r.Host)
	path := cleanPath(r.URL.Path)

	// If the given path is /tree and its handler is not registered,
	// redirect for /tree/.
	if u, ok := mux.redirectToPathSlash(host, path, r.URL); ok {
		return RedirectHandler(u.String(), StatusMovedPermanently), u.Path
	}

	if path != r.URL.Path {
		_, pattern = mux.handler(host, path)
		url := *r.URL
		url.Path = path
		return RedirectHandler(url.String(), StatusMovedPermanently), pattern
	}

	return mux.handler(host, r.URL.Path)
}

// handler is the main implementation of Handler.
// The path is known to be in canonical form, except for CONNECT methods.
func (mux *ServeMux) handler(host, path string) (h Handler, pattern string) {
	mux.mu.RLock()
	defer mux.mu.RUnlock()

	// Host-specific pattern takes precedence over generic ones
	if mux.hosts {
		h, pattern = mux.match(host + path)
	}
	if h == nil {
		h, pattern = mux.match(path)
	}
	if h == nil {
		h, pattern = NotFoundHandler(), ""
	}
	return
}

// Find a handler on a handler map given a path string.
// Most-specific (longest) pattern wins.
func (mux *ServeMux) match(path string) (h Handler, pattern string) {
	// Check for exact match first.
	v, ok := mux.m[path]
	if ok {
		return v.h, v.pattern
	}

	// Check for longest valid match.  mux.es contains all patterns
	// that end in / sorted from longest to shortest.
	for _, e := range mux.es {
		if strings.HasPrefix(path, e.pattern) {
			return e.h, e.pattern
		}
	}
	return nil, ""
}

mux.Handler(r)这个是完成对于路径的匹配,返回相关的Handler。该函数首先对URL进行简单的处理(RedirectHandler()相关的逻辑注释很明显了,不再解释了);然后调用handler方法,后者会创建一个锁,同时调用match方法返回一个handler和pattern。

match()方法中,mux的m字段是map[string]muxEntry,后者存储了pattern和handler处理器函数,因此通过迭代m寻找出注册路由的pattern模式与实际url匹配的handler函数并返回。

返回的结构一直传递到mux的ServeHTTP方法,接下来调用handler函数的ServeHTTP方法,即IndexHandler函数,然后把response写到http.RequestWirter对象返回给客户端。

上述函数运行结束即serverHandler{c.server}.ServeHTTP(w, w.req)运行结束。接下来就是对请求处理完毕之后上和连接断开的相关逻辑了。

参考文章: golang 构建HTTP服务

Go的http库详解

2. gin框架源码阅读

我们从gin框架官网demo开始讲起:

1
2
3
4
5
6
7
8
9
10
11
12
13
package main

import "github.com/gin-gonic/gin"

func main() {
    r := gin.Default()
    r.GET("/ping", func(c *gin.Context) {
        c.JSON(200, gin.H{
            "message": "go语言中文文档www.topgoer.com",
        })
    })
    r.Run() // listen and serve on 0.0.0.0:8080
}

看下r.Run()的源码:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
// Run attaches the router to a http.Server and starts listening and serving HTTP requests.
// It is a shortcut for http.ListenAndServe(addr, router)
// Note: this method will block the calling goroutine indefinitely unless an error happens.
func (engine *Engine) Run(addr ...string) (err error) {
	defer func() { debugPrintError(err) }()

	if engine.isUnsafeTrustedProxies() {
		debugPrint("[WARNING] You trusted all proxies, this is NOT safe. We recommend you to set a value.\n" +
			"Please check https://pkg.go.dev/github.com/gin-gonic/gin#readme-don-t-trust-all-proxies for details.")
	}

	address := resolveAddress(addr)
	debugPrint("Listening and serving HTTP on %s\n", address)
	err = http.ListenAndServe(address, engine)
	return
}

可以看到其本质上也是调用了net/http库的ListenAndServe(),所以核心请求数据流转的流程还是依赖了基础的net/http库。在这里,其实gin源码的阅读主要是要搞懂以下几个问题:

  • request数据是如何流转的
  • gin框架到底扮演了什么角色
  • 请求从gin流入net/http, 最后又是如何回到gin中
  • gin的context为何能承担起来复杂的需求
  • gin的路由算法
  • gin的中间件是什么

2.1 request数据是如何流转的与gin框架的角色

这个问题上面已经详细分析了下net/http库的源码,现在来思考一个问题:为什么目前会出现很多的go框架

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
// Find a handler on a handler map given a path string.
// Most-specific (longest) pattern wins.
func (mux *ServeMux) match(path string) (h Handler, pattern string) {
	// Check for exact match first.
	v, ok := mux.m[path]
	if ok {
		return v.h, v.pattern
	}

	// Check for longest valid match.  mux.es contains all patterns
	// that end in / sorted from longest to shortest.
	for _, e := range mux.es {
		if strings.HasPrefix(path, e.pattern) {
			return e.h, e.pattern
		}
	}
	return nil, ""
}

从上面匹配路由的函数其实可以得到这个答案,因为匹配的规则过于简单,其无法满足复杂的需求开发。因此,基本所有的go框架干的最主要的一件事情就是重写net/httproute逻辑。

2.2 gin框架的数据流转流程