Reading the Golang httprouter Source Code

田曜瑞

2023-12-01

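1. Features of httprouter

A lightweight, high-performance routing framework
Only explicit matches
No need to care about a trailing slash (/) on the request path
Automatic path correction
Parameters in the URL path
Zero bytes of garbage generated while matching and dispatching
Excellent performance
No server crashes (panics can be recovered)
Well suited to building APIs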

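2. How httprouter works

The router relies on a tree structure that makes heavy use of common prefixes; it is essentially a compressed prefix tree, also called a radix tree. Nodes with a common parent share a common prefix. Here is what the routing tree for the GET request method could look like: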

Priority   Path             Handle
9          /                *<1>
3          ├s               nil
2          |├earch/         *<2>
1          |└upport/        *<3>
2          ├blog/           *<4>
1          |    └:post      nil
1          |         └/     *<5>
2          ├about-us/       *<6>
1          |        └team/  *<7>
1          └contact/        *<8>

// The tree above is what you get after registering the following routes
GET("/", func1)
GET("/search/", func2)
GET("/support/", func3)
GET("/blog/", func4)
GET("/blog/:post/", func5)
GET("/about-us/", func6)
GET("/about-us/team/", func7)
GET("/contact/", func8)

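Each *<num> stands for the memory address of a handler function. If you follow the tree from the root to a leaf, you get a complete route path, for example /blog/:post/, where :post is just a placeholder for the actual name of a post. Unlike a hash map, this tree structure lets us use dynamic parts such as :post, because httprouter matches against the request path itself rather than a hash of the path. The benchmarks show that this is very efficient.

Because URL paths are hierarchical and use only a limited set of characters, they are very likely to share many common prefixes, which lets us reduce routing to ever smaller sub-problems. The router also keeps a separate tree for each request method. For one thing, this is more space-efficient than storing a method-to-handle map in every node; it also narrows the routing problem considerably before the lookup in the prefix tree even starts.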
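
To make the idea of a shared prefix concrete, here is a minimal sketch of the comparison that decides where two routes diverge. The function name longestCommonPrefix is an assumption made for this illustration, not a name taken from httprouter.

```go
package main

import "fmt"

// longestCommonPrefix returns the number of leading bytes shared by a and b.
// (Hypothetical helper written for this article.)
func longestCommonPrefix(a, b string) int {
	limit := len(a)
	if len(b) < limit {
		limit = len(b)
	}
	i := 0
	for i < limit && a[i] == b[i] {
		i++
	}
	return i
}

func main() {
	existing, incoming := "/search/", "/support/"
	i := longestCommonPrefix(existing, incoming)
	// The shared part stays in the parent node; the remainders become children.
	fmt.Printf("parent=%q children=%q,%q\n", existing[:i], existing[i:], incoming[i:])
}
```

For /search/ and /support/ the shared part is /s, which matches the s node with the children earch/ and upport/ in the tree above.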

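For better scalability, the child nodes on each tree level are ordered by priority, where the priority is the number of handles registered in the sub-nodes (children, grandchildren, and so on):

Nodes that are part of the most routing paths are searched first. This helps a handle to be found as quickly as possible.
It is a kind of cost compensation: the longest reachable path (the highest cost) is always searched first. The following scheme visualizes the tree structure. Nodes are searched from top to bottom and from left to right.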
```
├------------
├---------
├-----
├----
├--
├--
└-
```
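
The reordering can be sketched as follows. This is a simplified illustration under my own assumptions (the child type and the bump function are hypothetical); in httprouter the corresponding work happens in incrementChildPrio, which also has to keep the indices string in step with the children.

```go
package main

import "fmt"

// child is a hypothetical, stripped-down stand-in for a tree node.
type child struct {
	path     string
	priority uint32
}

// bump increments the priority of children[pos] and moves it towards the
// front while it outranks its left neighbour. It returns the new position.
func bump(children []child, pos int) int {
	children[pos].priority++
	for pos > 0 && children[pos-1].priority < children[pos].priority {
		children[pos-1], children[pos] = children[pos], children[pos-1]
		pos--
	}
	return pos
}

func main() {
	kids := []child{{"s", 2}, {"blog/", 2}, {"contact/", 1}}
	// Two more routes registered below "contact/" move it to the front.
	p := bump(kids, 2)
	p = bump(kids, p)
	fmt.Println(p, kids)
}
```

Children with equal priority keep their relative order, so only a child that actually gains routes moves forward.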

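3. How httprouter is initialized (building the tree)

Two things matter most in a routing framework: how it is initialized, and how incoming requests are dispatched (other aspects, such as middleware, are important too).
Let's start with the official server demo to see how initialization works: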

package main

import (
    "fmt"
    "github.com/julienschmidt/httprouter"
    "net/http"
    "log"
)

func Index(w http.ResponseWriter, r *http.Request, _ httprouter.Params) {
    fmt.Fprint(w, "Welcome!\n")
}

func Hello(w http.ResponseWriter, r *http.Request, ps httprouter.Params) {
    fmt.Fprintf(w, "hello, %s!\n", ps.ByName("name"))
}

func main() {
    router := httprouter.New()
    router.GET("/", Index)
    router.GET("/hello/:name", Hello)

    log.Fatal(http.ListenAndServe(":8080", router))
}

Tracing in from httprouter.New() reveals two important structs: Router and node.
First, Router:

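// Router is an http.Handler that dispatches requests to different
// functions through the registered routes
type Router struct {
    trees map[string]*node

    // Whether a trailing / on the request path is handled automatically.
    // Leaving this set to true is usually fine.
    // Example: /foo/ is requested and no /foo/ route is defined, but a /foo
    // route is. The request is then redirected to /foo (HTTP 301 for GET
    // requests, HTTP 307 for all other request methods).
    RedirectTrailingSlash bool

    // Whether the router tries to fix the path when no route is found.
    // First, superfluous path elements such as ../ or // are removed.
    // Then a case-insensitive lookup of the cleaned path is made. If a route
    // is found, the request is redirected to it (301 for GET, 307 otherwise).
    RedirectFixedPath bool

    // Works together with MethodNotAllowed below: if enabled and the request
    // cannot be routed, the router checks whether another method is allowed
    // for the path and, if so, answers with 405 Method Not Allowed.
    HandleMethodNotAllowed bool

    // If true, the router replies to OPTIONS requests automatically.
    // A custom OPTIONS route takes priority over the automatic reply.
    HandleOPTIONS bool

    // Handler called when no route matches.
    // If it is not set, http.NotFound from the standard library is used.
    NotFound http.Handler

    // Handler called when a request method is not allowed and
    // HandleMethodNotAllowed above is true.
    // If it is not set, http.Error with http.StatusMethodNotAllowed is used.
    // The "Allow" header with the allowed request methods is set before the
    // handler is called.
    MethodNotAllowed http.Handler

    // Function used to recover from panics. It should answer with HTTP error
    // code 500 (Internal Server Error) and keeps the server from crashing
    // when a panic occurs.
    PanicHandler func(http.ResponseWriter, *http.Request, interface{})
}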
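
The RedirectTrailingSlash behaviour described in the comments can be sketched with the standard library alone. The helper trailingSlashRedirect is a hypothetical name used for this illustration; it only computes the target path and status code and performs no lookup in the tree.

```go
package main

import (
	"fmt"
	"net/http"
)

// trailingSlashRedirect returns the path with its trailing slash toggled,
// plus the status code described above: 301 for GET, 307 for other methods.
// (Hypothetical helper, not part of httprouter's API.)
func trailingSlashRedirect(method, path string) (target string, code int) {
	code = http.StatusMovedPermanently
	if method != http.MethodGet {
		// 307 asks the client to repeat the request with the same method and body.
		code = http.StatusTemporaryRedirect
	}
	if len(path) > 1 && path[len(path)-1] == '/' {
		return path[:len(path)-1], code
	}
	return path + "/", code
}

func main() {
	fmt.Println(trailingSlashRedirect("GET", "/foo/"))
	fmt.Println(trailingSlashRedirect("POST", "/foo"))
}
```

A real router would only send such a redirect after confirming that the toggled path has a registered handle, which is what the tsr return value of getValue (shown later) reports.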

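Now look at node:

type node struct {
    // The URL path fragment held by this node.
    // In the diagram above the root holds /, its children hold
    // paths such as s, blog/ and so on, and s in turn has the
    // children earch/ and upport/.
    path      string

    // Whether the child of this node is a wildcard (parameter) node.
    // In the diagram above, blog/, the parent of :post, is such a node.
    wildChild bool

    // Node type: static, root, param, catchAll
    // static: a plain static node (the default), such as the parent node s above
    // root: the first node inserted into a tree
    // catchAll: a node that matches a * wildcard
    // param: a parameter node, such as :post above
    nType     nodeType

    // Maximum number of parameters in a path. The field is an unsigned
    // 8-bit integer, so the limit is 255 (more than that is hard to imagine
    // in practice).
    maxParams uint8

    // First character of each child's path, in the same order as children below.
    // For the s node above this is "eu" (from earch/ and upport/).
    indices   string

    // All direct children of this node
    children  []*node

    // The handler registered for this node
    handle    Handle

    // Priority, used to order lookups: the number of handles registered
    // on this node and on all of its descendants
    priority  uint32
}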
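
The pairing of indices and children is easier to see in a small example. The miniNode type and its next method are hypothetical names for this sketch; it keeps only the three fields needed to show the lookup.

```go
package main

import "fmt"

// miniNode is a reduced, hypothetical version of node with just the
// fields needed to show how indices selects a child.
type miniNode struct {
	path     string
	indices  string
	children []*miniNode
}

// next returns the child whose path starts with byte c, or nil if none does.
func (n *miniNode) next(c byte) *miniNode {
	for i := 0; i < len(n.indices); i++ {
		if n.indices[i] == c {
			return n.children[i]
		}
	}
	return nil
}

func main() {
	s := &miniNode{
		path:     "s",
		indices:  "eu",
		children: []*miniNode{{path: "earch/"}, {path: "upport/"}},
	}
	fmt.Println(s.next('u').path)
	fmt.Println(s.next('x') == nil)
}
```

Comparing one byte per child is cheap, and because children are kept in priority order the most heavily used branch is compared first.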

Continuing from httprouter.New(), the route registration methods look like this:

func (r *Router) GET(path string, handle Handle) {
	r.Handle("GET", path, handle)
}

func (r *Router) Handle(method, path string, handle Handle) {
	if path[0] != '/' {
		panic("path must begin with '/' in path '" + path + "'")
	}

	if r.trees == nil {
		r.trees = make(map[string]*node)
	}

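	// Each HTTP method (custom methods included) gets its own tree;
	// the roots are stored in r.trees, keyed by method name.
	root := r.trees[method]
	if root == nil {
		// Create the root node for this method
		root = new(node)
		r.trees[method] = root
	}
	// addRoute attaches the handle to the node that corresponds to the path
	root.addRoute(path, handle)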
}
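
The lazy creation of one tree per method can be tried in isolation. In this sketch registry and methodTree are hypothetical stand-ins for Router and node, and a plain slice replaces the radix tree. The sketch also checks the path length before indexing, an addition of mine, since path[0] on an empty string would fail with an index-out-of-range panic rather than the intended message.

```go
package main

import "fmt"

// methodTree is a hypothetical stand-in for the root of a per-method tree.
type methodTree struct {
	routes []string
}

// registry mimics the shape of Router.trees: one root per HTTP method.
type registry struct {
	trees map[string]*methodTree
}

// handle records path under method, creating the method's tree on first use.
func (r *registry) handle(method, path string) {
	if len(path) == 0 || path[0] != '/' {
		panic("path must begin with '/' in path '" + path + "'")
	}
	if r.trees == nil {
		r.trees = make(map[string]*methodTree)
	}
	root := r.trees[method]
	if root == nil {
		root = new(methodTree)
		r.trees[method] = root
	}
	root.routes = append(root.routes, path)
}

func main() {
	var r registry
	r.handle("GET", "/")
	r.handle("GET", "/hello/:name")
	r.handle("POST", "/hello/:name")
	fmt.Println(len(r.trees), len(r.trees["GET"].routes), len(r.trees["POST"].routes))
}
```

The same path can be registered under GET and POST without conflict, because the two registrations end up in different trees.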

Next, addRoute:

func (n *node) addRoute(path string, handle Handle) {
	fullPath := path
	n.priority++
	numParams := countParams(path)

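	// The tree is not empty
	if len(n.path) > 0 || len(n.children) > 0 {
	walk:
		for {
			// Update the maximum parameter count of the current node
			if numParams > n.maxParams {
				n.maxParams = numParams
			}

			// Find the longest common prefix.
			// The longest common prefix contains no ':' or '*'
			i := 0

			// The shorter of the incoming path and the current node's path
			max := min(len(path), len(n.path))
			// Find the split point
			for i < max && path[i] == n.path[i] {
				i++
			}

			// If the split point falls inside n.path, a child node has to be split off the current node.
			// Example: the current node is /search and the incoming path is /support, so the child earch is split off /search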
			if i < len(n.path) {
				child := node{
					// The split-off part, e.g. earch when /search meets /support
					path:      n.path[i:],
					// Same as the original node
					wildChild: n.wildChild,
					nType:     static,
					// Same as the original node
					indices:   n.indices,
					// The original node's children now belong to the split-off node
					children:  n.children,
					handle:    n.handle,
					priority:  n.priority - 1,
				}

				// Update maxParams of the split-off node (maximum over its children)
				for i := range child.children {
					if child.children[i].maxParams > child.maxParams {
						child.maxParams = child.children[i].maxParams
					}
				}

				// The split-off node becomes the only child of the current node
				n.children = []*node{&child}
				// Store the first character of the child's path
				n.indices = string([]byte{n.path[i]})
				n.path = path[:i]
				// After the split the current node no longer has a handle
				n.handle = nil
				n.wildChild = false
			}

			// Add the new node as a child of the current node
			if i < len(path) {
				// Strip the common prefix and keep the remaining characters
				path = path[i:]

				if n.wildChild {
					n = n.children[0]
					n.priority++

					// Update the maximum parameter count of the current node
					if numParams > n.maxParams {
						n.maxParams = numParams
					}
					numParams--

					// Check whether the wildcard matches
					if len(path) >= len(n.path) && n.path == path[:len(n.path)] &&
						// Check for a longer wildcard name, e.g. :name and :names
						(len(n.path) >= len(path) || path[len(n.path)] == '/') {
						continue walk
					} else {
						// Wildcard conflict
						var pathSeg string
						if n.nType == catchAll {
							pathSeg = path
						} else {
							pathSeg = strings.SplitN(path, "/", 2)[0]
						}
						prefix := fullPath[:strings.Index(fullPath, pathSeg)] + n.path
						panic("'" + pathSeg +
							"' in new path '" + fullPath +
							"' conflicts with existing wildcard '" + n.path +
							"' in existing prefix '" + prefix +
							"'")
					}
				}

				c := path[0]

				// A slash right after a parameter
				if n.nType == param && c == '/' && len(n.children) == 1 {
					n = n.children[0]
					n.priority++
					continue walk
				}

				// Check whether a child starting with the next path byte exists
				for i := 0; i < len(n.indices); i++ {
					if c == n.indices[i] {
						i = n.incrementChildPrio(i)
						n = n.children[i]
						continue walk
					}
				}

				// Otherwise insert a new child
				if c != ':' && c != '*' {
					// Convert via []byte so the raw byte is kept as-is
					n.indices += string([]byte{c})
					child := &node{
						maxParams: numParams,
					}
					n.children = append(n.children, child)
					n.incrementChildPrio(len(n.indices) - 1)
					n = child
				}
				n.insertChild(numParams, path, fullPath, handle)
				return

			} else if i == len(path) { // Make the current node a leaf by attaching the handle
				if n.handle != nil {
					panic("a handle is already registered for path '" + fullPath + "'")
				}
				n.handle = handle
			}
			return
		}
	} else { // Empty tree
		n.insertChild(numParams, path, fullPath, handle)
		n.nType = root
	}
}
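
addRoute calls countParams, which is not shown in this article. The following is a minimal sketch of what such a function might look like, assuming it simply counts the ':' and '*' markers and caps the result so that it fits the uint8 maxParams field; treat it as an illustration rather than the library's exact code.

```go
package main

import "fmt"

// countParams counts the wildcard markers (':' and '*') in path.
// The result is capped at 255 so that it fits into a uint8.
// (Sketch written for this article.)
func countParams(path string) uint8 {
	var n uint
	for i := 0; i < len(path); i++ {
		if path[i] == ':' || path[i] == '*' {
			n++
		}
	}
	if n >= 255 {
		return 255
	}
	return uint8(n)
}

func main() {
	for _, p := range []string{"/", "/blog/:post/", "/src/*filepath", "/a/:b/:c/*d"} {
		fmt.Println(p, countParams(p))
	}
}
```

Counting in a wider integer first and capping afterwards avoids a silent wrap-around for paths with more than 255 markers.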

Next, let's look at insertChild:


func (n *node) insertChild(numParams uint8, path, fullPath string, handle Handle) {
	var offset int // Number of path bytes already handled

	// Find the position of each ':' or '*'
	for i, max := 0, len(path); numParams > 0; i++ {
		c := path[i]
		if c != ':' && c != '*' {
			continue
		}

		// Find where the wildcard ends: at the next '/' or at the end of the path
		end := i + 1
		for end < max && path[end] != '/' {
			switch path[end] {
			// No further ':' or '*' may follow a wildcard within the same segment
			case ':', '*':
				panic("only one wildcard per path segment is allowed, has: '" +
					path[i:] + "' in path '" + fullPath + "'")
			default:
				end++
			}
		}

		// Check whether this node already has children
		if len(n.children) > 0 {
			panic("wildcard route '" + path[i:end] +
				"' conflicts with existing children in path '" + fullPath + "'")
		}

		// A wildcard must have a name
		if end-i < 2 {
			panic("wildcards must be named with a non-empty name in path '" + fullPath + "'")
		}

		if c == ':' { // Named parameter
			// Split the path where the wildcard starts
			if i > 0 {
				n.path = path[offset:i]
				offset = i
			}

			child := &node{
				nType:     param,
				maxParams: numParams,
			}
			n.children = []*node{child}
			n.wildChild = true
			n = child
			n.priority++
			numParams--

			// The path continues after the wildcard
			if end < max {
				n.path = path[offset:end]
				offset = end

				child := &node{
					maxParams: numParams,
					priority:  1,
				}
				n.children = []*node{child}
				n = child
			}

		} else { // catchAll
			if end != max || numParams > 1 {
				panic("catch-all routes are only allowed at the end of the path in path '" + fullPath + "'")
			}

			if len(n.path) > 0 && n.path[len(n.path)-1] == '/' {
				panic("catch-all conflicts with existing handle for the path segment root in path '" + fullPath + "'")
			}

			i--
			if path[i] != '/' {
				panic("no / before catch-all in path '" + fullPath + "'")
			}

			n.path = path[offset:i]

			child := &node{
				wildChild: true,
				nType:     catchAll,
				maxParams: 1,
			}
			n.children = []*node{child}
			n.indices = string(path[i])
			n = child
			n.priority++
			
			child = &node{
				path:      path[i:],
				nType:     catchAll,
				maxParams: 1,
				handle:    handle,
				priority:  1,
			}
			n.children = []*node{child}

			return
		}
	}

	n.path = path[offset:]
	n.handle = handle
}
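
The scanning part of insertChild can be isolated into a small function. findWildcard is a hypothetical helper written for this article; it returns errors instead of panicking and reports only the first wildcard segment, whereas the loop above keeps going until numParams reaches zero.

```go
package main

import (
	"errors"
	"fmt"
)

// findWildcard locates the first wildcard segment (":name" or "*name") in path.
// It returns the segment and its start index, or ("", -1, nil) if there is none.
// (Hypothetical helper; the checks mirror the ones made in insertChild.)
func findWildcard(path string) (seg string, start int, err error) {
	for i := 0; i < len(path); i++ {
		if path[i] != ':' && path[i] != '*' {
			continue
		}
		// The wildcard ends at the next '/' or at the end of the path.
		end := i + 1
		for end < len(path) && path[end] != '/' {
			if path[end] == ':' || path[end] == '*' {
				return "", i, errors.New("only one wildcard per path segment is allowed")
			}
			end++
		}
		if end-i < 2 {
			return "", i, errors.New("wildcards must be named with a non-empty name")
		}
		return path[i:end], i, nil
	}
	return "", -1, nil
}

func main() {
	for _, p := range []string{"/hello/:name", "/src/*filepath", "/static", "/a/:b:c"} {
		seg, start, err := findWildcard(p)
		fmt.Printf("%q -> %q %d %v\n", p, seg, start, err)
	}
}
```

Returning an error keeps the sketch easy to test. httprouter itself panics at registration time, so a misconfigured route fails at startup rather than while serving requests.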

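Now let's see how an incoming request is dispatched to its handle. The code is in getValue:

// Returns the handle and the parameters for the requested path
// (tsr reports whether a trailing-slash redirect is recommended)
func (n *node) getValue(path string) (handle Handle, p Params, tsr bool) {
walk: // Walk down the tree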
	for {
		if len(path) > len(n.path) {
			if path[:len(n.path)] == n.path {
				path = path[len(n.path):]
				// If this node does not have a wildcard (param or catchAll)
				// child, we can just look up the next child node and continue
				// to walk down the tree
				if !n.wildChild {
					c := path[0]
					for i := 0; i < len(n.indices); i++ {
						if c == n.indices[i] {
							n = n.children[i]
							continue walk
						}
					}

					// Nothing found. Recommend a redirect to the same path without
					// the trailing slash if a handle exists for that path
					tsr = (path == "/" && n.handle != nil)
					return

				}

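				// Handle the wildcard child
				n = n.children[0]
				switch n.nType {
				case param:
					// Find the end of the parameter value (the next '/' or the end of the path)
					end := 0
					for end < len(path) && path[end] != '/' {
						end++
					}

					// Save the parameter value
					if p == nil {
						// Lazy allocation
						p = make(Params, 0, n.maxParams)
					}
					i := len(p)
					p = p[:i+1] // Expand the slice within its preallocated capacity
					p[i].Key = n.path[1:]
					p[i].Value = path[:end]

					// Look further down the tree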
					if end < len(path) {
						if len(n.children) > 0 {
							path = path[end:]
							n = n.children[0]
							continue walk
						}

						// There is more path left but no child to match it
						tsr = (len(path) == end+1)
						return
					}

					if handle = n.handle; handle != nil {
						return
					} else if len(n.children) == 1 {
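						// No handle found. Check whether a handle exists for this
						// path plus a trailing slash, to recommend a redirect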
						n = n.children[0]
						tsr = (n.path == "/" && n.handle != nil)
					}

					return

				case catchAll:
					// Save the parameter value
					if p == nil {
						// Lazy allocation
						p = make(Params, 0, n.maxParams)
					}
					i := len(p)
					p = p[:i+1] // Expand the slice within its preallocated capacity
					p[i].Key = n.path[2:]
					p[i].Value = path

					handle = n.handle
					return

				default:
					panic("invalid node type")
				}
			}
		} else if path == n.path {
			// We should have reached the node that holds the handle
			if handle = n.handle; handle != nil {
				return
			}

			if path == "/" && n.wildChild && n.nType != root {
				tsr = true
				return
			}
		
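			// No handle found. Check whether a handle exists for this path
			// plus a trailing slash, to recommend a redirect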
			for i := 0; i < len(n.indices); i++ {
				if n.indices[i] == '/' {
					n = n.children[i]
					tsr = (len(n.path) == 1 && n.handle != nil) ||
						(n.nType == catchAll && n.children[0].handle != nil)
					return
				}
			}

			return
		}
		
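		// Nothing found. Recommend a redirect to the same path with an
		// extra trailing slash if a handle exists for that path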
		tsr = (path == "/") ||
			(len(n.path) == len(path)+1 && n.path[len(path)] == '/' &&
				path == n.path[:len(n.path)-1] && n.handle != nil)
		return
	}
}
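
The param branch of getValue boils down to cutting the request path at the next slash and storing the piece under the parameter's name. The sketch below reproduces that step with local stand-in types; Param, Params and ByName are modelled on what the demo uses (ps.ByName("name")), and captureParam is a hypothetical helper.

```go
package main

import "fmt"

// Param and Params are local stand-ins for the key-value pairs that are
// passed to a handle.
type Param struct {
	Key   string
	Value string
}

type Params []Param

// ByName returns the value of the first parameter whose key matches name,
// or an empty string if there is none.
func (ps Params) ByName(name string) string {
	for _, p := range ps {
		if p.Key == name {
			return p.Value
		}
	}
	return ""
}

// captureParam stores everything up to the next '/' in path under key and
// returns the extended Params together with the unconsumed rest of the path.
// (Hypothetical helper for this article.)
func captureParam(ps Params, key, path string) (Params, string) {
	end := 0
	for end < len(path) && path[end] != '/' {
		end++
	}
	return append(ps, Param{Key: key, Value: path[:end]}), path[end:]
}

func main() {
	ps, rest := captureParam(nil, "post", "hello-world/comments")
	fmt.Println(ps.ByName("post"), rest)
}
```

The sketch uses append for brevity. getValue instead allocates the slice once with capacity n.maxParams and then reslices it, so a lookup makes at most one allocation.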
