goproxy source code analysis

422 views  |  Published 3 years ago

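How go get fetches packages

1 Step one: derive the package's query path by regex matching

go get can be given a package's import path explicitly, or it can work out which packages to fetch by analyzing the import statements in the code. The import path, however, is not directly the package's query path. In go get's source, the query path is derived by matching the import path against a set of regular expressions. In other words, an import path must match one of these expressions; if it does not, go get cannot resolve the package and the code will not build.

go get then appends the go-get parameter and sends a request such as https://github.com/goproxyio/goproxy?go-get=1 to the remote VCS host.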
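As a rough illustration of this step, the sketch below matches an import path against one host-specific pattern and builds the go-get=1 URL. The pattern is a simplified stand-in modeled on the kind of expression go get keeps for github.com; the names githubPath, repoRoot and goGetURL are made up for this example and are not taken from the go command.

```go
package main

import (
	"fmt"
	"regexp"
)

// githubPath is an illustrative pattern (an assumption, not go get's real
// table): host, owner, repository, then any number of sub-directories.
var githubPath = regexp.MustCompile(`^(github\.com/[\w.\-]+/[\w.\-]+)(/[\w.\-]+)*$`)

// repoRoot returns the repository root of an import path, or false when
// the path does not match the pattern.
func repoRoot(importPath string) (string, bool) {
	m := githubPath.FindStringSubmatch(importPath)
	if m == nil {
		return "", false
	}
	return m[1], true
}

// goGetURL builds the discovery URL for a path by adding the go-get=1 query.
func goGetURL(p string) string {
	return "https://" + p + "?go-get=1"
}

func main() {
	root, ok := repoRoot("github.com/goproxyio/goproxy/proxy")
	fmt.Println(root, ok)
	fmt.Println(goGetURL(root))
}
```

A path with only one element after the host, such as github.com/goproxyio, does not match, which mirrors the article's point that import paths have to fit the expected shape.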

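2 Step two: look up the package's remote repository address

The remote repository address is read from the content attribute of the go-import meta tag in the response to that go-get request.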

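3 Step three: clone the repository locally

Version control systems differ in many ways, but their basic operations are mostly alike. During the clone, go get runs the commands that correspond to the specific VCS in use.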

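How go get fetches packages through a proxy

With the basic fetch flow understood, here is the complete flow once Go Modules are enabled.

Use go get -x to see the fetch process in detail: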

 go get -x github.com/goproxyio/goproxy   
# get https://goproxy.cn/github.com/goproxyio/@v/list
# get https://goproxy.cn/github.com/@v/list
# get https://goproxy.cn/github.com/goproxyio/goproxy/@v/list
# get https://goproxy.cn/github.com/@v/list: 404 Not Found (0.686s)
# get https://goproxy.cn/github.com/goproxyio/@v/list: 404 Not Found (0.754s)
# get https://goproxy.cn/github.com/goproxyio/goproxy/@v/list: 200 OK (0.855s)
# get https://goproxy.cn/github.com/goproxyio/goproxy/@v/v1.0.0.info
# get https://goproxy.cn/github.com/goproxyio/goproxy/@v/v1.0.0.info: 200 OK (0.117s)
go: downloading github.com/goproxyio/goproxy v1.0.0
# get https://goproxy.cn/github.com/goproxyio/goproxy/@v/v1.0.0.zip
# get https://goproxy.cn/github.com/goproxyio/goproxy/@v/v1.0.0.zip: 200 OK (0.228s)
# get https://goproxy.cn/sumdb/sum.golang.org/supported
# get https://goproxy.cn/sumdb/sum.golang.org/supported: 200 OK (0.032s)
# get https://goproxy.cn/sumdb/sum.golang.org/lookup/github.com/goproxyio/goproxy@v1.0.0
# get https://goproxy.cn/sumdb/sum.golang.org/lookup/github.com/goproxyio/goproxy@v1.0.0: 200 OK (0.414s)
# get https://goproxy.cn/sumdb/sum.golang.org/tile/8/0/x014/109
# get https://goproxy.cn/sumdb/sum.golang.org/tile/8/0/x014/199.p/195
# get https://goproxy.cn/sumdb/sum.golang.org/tile/8/1/055.p/119
# get https://goproxy.cn/sumdb/sum.golang.org/tile/8/0/x014/109: 200 OK (0.028s)
# get https://goproxy.cn/sumdb/sum.golang.org/tile/8/1/055.p/119: 200 OK (0.040s)
# get https://goproxy.cn/sumdb/sum.golang.org/tile/8/0/x014/199.p/195: 200 OK (0.057s)
# get https://goproxy.cn/sumdb/sum.golang.org/tile/8/0/324
# get https://goproxy.cn/sumdb/sum.golang.org/tile/8/0/324: 200 OK (0.226s)
go: github.com/goproxyio/goproxy upgrade => v1.0.0
# get https://goproxy.cn/github.com/goproxyio/goproxy/@v/v1.0.0.mod
# get https://goproxy.cn/github.com/goproxyio/goproxy/@v/v1.0.0.mod: 200 OK (0.093s)
go: finding module for package github.com/goproxyio/goproxy/internal/module
# get https://goproxy.cn/github.com/goproxyio/goproxy/internal/module/@v/list
# get https://goproxy.cn/github.com/goproxyio/goproxy/internal/@v/list
go: finding module for package github.com/goproxyio/goproxy/internal/cfg
# get https://goproxy.cn/github.com/goproxyio/goproxy/internal/cfg/@v/list
go: finding module for package github.com/goproxyio/goproxy/internal/modfetch
# get https://goproxy.cn/github.com/goproxyio/goproxy/internal/modfetch/@v/list
go: finding module for package github.com/goproxyio/goproxy/internal/modload
# get https://goproxy.cn/github.com/goproxyio/goproxy/internal/modload/@v/list
go: finding module for package github.com/goproxyio/goproxy/internal/modfetch/codehost
# get https://goproxy.cn/github.com/goproxyio/goproxy/internal/modfetch/codehost/@v/list
# get https://goproxy.cn/github.com/goproxyio/goproxy/internal/module/@v/list: 404 Not Found (2.579s)
# get https://goproxy.cn/github.com/goproxyio/goproxy/internal/modfetch/codehost/@v/list: 404 Not Found (2.474s)
# get https://goproxy.cn/github.com/goproxyio/goproxy/internal/modfetch/@v/list: 404 Not Found (2.882s)
# get https://goproxy.cn/github.com/goproxyio/goproxy/internal/@v/list: 404 Not Found (2.984s)
# get https://goproxy.cn/github.com/goproxyio/goproxy/internal/cfg/@v/list: 404 Not Found (3.339s)
# get https://goproxy.cn/github.com/goproxyio/goproxy/internal/modload/@v/list: 404 Not Found (3.353s)
go: finding module for package github.com/goproxyio/goproxy/internal/modload
go: finding module for package github.com/goproxyio/goproxy/internal/module
go: finding module for package github.com/goproxyio/goproxy/internal/modfetch/codehost
go: finding module for package github.com/goproxyio/goproxy/internal/cfg
go: finding module for package github.com/goproxyio/goproxy/internal/modfetch
../../../../pkg/mod/github.com/goproxyio/goproxy@v1.0.0/pkg/proxy/proxy.go:12:2: module github.com/goproxyio/goproxy@latest found (v1.0.0), but does not contain package github.com/goproxyio/goproxy/internal/cfg
../../../../pkg/mod/github.com/goproxyio/goproxy@v1.0.0/pkg/proxy/proxy.go:13:2: module github.com/goproxyio/goproxy@latest found (v1.0.0), but does not contain package github.com/goproxyio/goproxy/internal/modfetch
../../../../pkg/mod/github.com/goproxyio/goproxy@v1.0.0/pkg/proxy/proxy.go:14:2: module github.com/goproxyio/goproxy@latest found (v1.0.0), but does not contain package github.com/goproxyio/goproxy/internal/modfetch/codehost
../../../../pkg/mod/github.com/goproxyio/goproxy@v1.0.0/pkg/proxy/proxy.go:15:2: module github.com/goproxyio/goproxy@latest found (v1.0.0), but does not contain package github.com/goproxyio/goproxy/internal/modload
../../../../pkg/mod/github.com/goproxyio/goproxy@v1.0.0/pkg/proxy/proxy.go:16:2: module github.com/goproxyio/goproxy@latest found (v1.0.0), but does not contain package github.com/goproxyio/goproxy/internal/module
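The first three requests in the trace ask for @v/list at every prefix of the import path, and only the full path answers 200. The sketch below reproduces that list of candidate URLs; candidateModulePaths and listURL are names invented for this example, and the real go command's lookup logic is more involved than this.

```go
package main

import (
	"fmt"
	"path"
)

// candidateModulePaths returns pkg and each of its parent paths, longest
// first. Any of them could be the module that contains the package.
func candidateModulePaths(pkg string) []string {
	var out []string
	for p := path.Clean(pkg); p != "." && p != "/"; p = path.Dir(p) {
		out = append(out, p)
	}
	return out
}

// listURL builds the version-list URL for a module on a proxy.
func listURL(proxy, mod string) string {
	return proxy + "/" + mod + "/@v/list"
}

func main() {
	for _, m := range candidateModulePaths("github.com/goproxyio/goproxy") {
		fmt.Println(listURL("https://goproxy.cn", m))
	}
}
```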

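With Go Modules enabled, go get gains a new environment variable, GOPROXY. Once it is set, go get switches entirely to a new fetch flow, the GOPROXY flow.

The GOPROXY flow is built on a set of proxy endpoints defined in the official documentation: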

https://tip.golang.org/cmd/go/#hdr-Module_proxy_protocol

GET $GOPROXY/<module>/@v/list returns a list of all known versions of the given module, one per line.
GET $GOPROXY/<module>/@v/<version>.info returns JSON-formatted metadata about that version of the given module.
GET $GOPROXY/<module>/@v/<version>.mod returns the go.mod file for that version of the given module.
GET $GOPROXY/<module>/@v/<version>.zip returns the zip archive for that version of the given module.
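To make the four endpoints concrete, the sketch below builds their URLs for one module version. It assumes the proxy protocol's case-escaping rule, in which each uppercase letter in a module path is written as an exclamation mark followed by the lowercase letter; escapePath and endpoints are illustrative helpers, not functions from goproxy.

```go
package main

import (
	"fmt"
	"strings"
)

// escapePath rewrites uppercase letters as '!' plus the lowercase letter,
// so that paths stay distinct on case-insensitive file systems.
func escapePath(s string) string {
	var b strings.Builder
	for _, r := range s {
		if 'A' <= r && r <= 'Z' {
			b.WriteByte('!')
			b.WriteRune(r + 'a' - 'A')
		} else {
			b.WriteRune(r)
		}
	}
	return b.String()
}

// endpoints returns the list, info, mod and zip URLs, in that order.
func endpoints(proxy, mod, version string) []string {
	base := proxy + "/" + escapePath(mod) + "/@v/"
	v := escapePath(version)
	return []string{
		base + "list",
		base + v + ".info",
		base + v + ".mod",
		base + v + ".zip",
	}
}

func main() {
	for _, u := range endpoints("https://goproxy.cn", "github.com/goproxyio/goproxy", "v1.0.0") {
		fmt.Println(u)
	}
}
```

The URLs printed for this input are the same ones that appear in the go get -x trace earlier in the article.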

In fact, this set of endpoints mirrors the file layout under $GOPATH/pkg/mod/cache/download, so that directory can be used directly as a proxy with the following command:

export GOPROXY=file://$GOPATH/pkg/mod/cache/download

goproxy itself is quite simple: it implements a proxy for the four endpoints above.

% ls
Dockerfile              contrib                 main.go                 scripts
LICENSE                 docker-compose.yaml     proxy                   sumdb
Makefile                go.mod                  renameio                test
README.md               go.sum                  robustio

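Start with main.go:

func main() {
    // ...
    var handle http.Handler
    if proxyHost != "" {
        // An upstream proxy is configured: go through the Router.
        handle = &logger{proxy.NewRouter(proxy.NewServer(new(ops)), &proxy.RouterOptions{
            Pattern:      excludeHost,
            Proxy:        proxyHost,
            DownloadRoot: downloadRoot,
        })}
    } else {
        // No upstream proxy: serve everything directly from ops.
        handle = &logger{proxy.NewServer(new(ops))}
    }

    server := &http.Server{Addr: listen, Handler: handle}
    // ...
}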

This registers a server backed by ops.

ops implements the methods the protocol requires:

type ops struct{}

func (*ops) List(ctx context.Context, mpath string) (proxy.File, error)

func (*ops) Latest(ctx context.Context, path string) (proxy.File, error) {
    d, err := download(module.Version{Path: path, Version: "latest"})
    // ...
}

func (*ops) Info(ctx context.Context, m module.Version) (proxy.File, error)

func (*ops) GoMod(ctx context.Context, m module.Version) (proxy.File, error)

func (*ops) Zip(ctx context.Context, m module.Version) (proxy.File, error)

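Next, look at proxy/router.go:

func NewRouter(srv *Server, opts *RouterOptions) *Router {
    rt := &Router{
        opts: opts,
        srv:  srv,
    }
    // ...
    remote, err := url.Parse(opts.Proxy)
    // ... (error handling elided)
    proxy := httputil.NewSingleHostReverseProxy(remote)
    director := proxy.Director // keep the default director so it can be wrapped
    proxy.Director = func(r *http.Request) {
        director(r)
        r.Host = remote.Host // present the upstream's own host name
    }
    rt.proxy = proxy

    // Note: InsecureSkipVerify disables TLS certificate verification
    // for requests sent to the upstream proxy.
    rt.proxy.Transport = &http.Transport{
        Proxy:           http.ProxyFromEnvironment,
        TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
    }
    // ...
    return rt
}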

The router is built on the httputil.NewSingleHostReverseProxy function.

func (rt *Router) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    // ... (mw is defined in code elided from this excerpt)
    if strings.HasPrefix(r.URL.Path, "/sumdb/") {
        sumdb.Handler(mw, r)
        return
    }
    // ...
    if strings.HasSuffix(r.URL.Path, "/@latest") {
        // ...
    }
    // ...
    rt.proxy.ServeHTTP(mw, r)
}

func GlobsMatchPath(globs, target string) bool {
    // ... (glob and prefix are computed in code elided from this excerpt)
    matched, _ := path.Match(glob, prefix)
    // ...
}
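The excerpt above shows only the final path.Match call. One way to fill in the rest, assuming globs is a comma-separated list and each glob is compared against the leading path elements of target, is sketched below. It follows the approach of the helper of the same name in the Go toolchain, but treat it as a reconstruction rather than goproxy's exact code.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// GlobsMatchPath reports whether any glob in the comma-separated list
// matches a leading prefix of target made of whole path elements.
func GlobsMatchPath(globs, target string) bool {
	for globs != "" {
		// Take the next glob off the comma-separated list.
		var glob string
		if i := strings.Index(globs, ","); i >= 0 {
			glob, globs = globs[:i], globs[i+1:]
		} else {
			glob, globs = globs, ""
		}
		if glob == "" {
			continue
		}

		// A glob with n slashes is matched against the first n+1 path
		// elements of target, so truncate target at the (n+1)th slash.
		n := strings.Count(glob, "/")
		prefix := target
		for i := 0; i < len(target); i++ {
			if target[i] == '/' {
				if n == 0 {
					prefix = target[:i]
					break
				}
				n--
			}
		}
		if n > 0 {
			continue // target has fewer elements than the glob
		}
		if matched, _ := path.Match(glob, prefix); matched {
			return true
		}
	}
	return false
}

func main() {
	globs := "*.corp.example.com,github.com/private"
	fmt.Println(GlobsMatchPath(globs, "github.com/private/repo"))
	fmt.Println(GlobsMatchPath(globs, "github.com/goproxyio/goproxy"))
}
```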

Finally, look at proxy/server.go.

First, ops is injected:

func NewServer(ops ServerOps) *Server {
  return &Server{ops: ops}
}

ServeHTTP then wraps the ops methods, routing each incoming request to the matching one:

func (s *Server) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    // ...
    if strings.HasPrefix(r.URL.Path, "/sumdb/") {
        sumdb.Handler(w, r)
        return
    }

    // Everything before "/@" is the escaped module path.
    i := strings.Index(r.URL.Path, "/@")
    // ...
    modPath, err := module.UnescapePath(strings.TrimPrefix(r.URL.Path[:i], "/"))
    // ... (what holds the part of the path after "/@")

    switch what {
    case "latest":
        ctype = contentTypeJSON
        f, openErr = s.ops.Latest(ctx, modPath)
    case "v/list":
        ctype = contentTypeText
        f, openErr = s.ops.List(ctx, modPath)
    default:
        what = strings.TrimPrefix(what, "v/")
        // ... (ext and m are derived from what in elided code)
        switch ext {
        case ".info":
            ctype = "application/json"
            f, openErr = s.ops.Info(ctx, m)
        case ".mod":
            ctype = "text/plain; charset=UTF-8"
            f, openErr = s.ops.GoMod(ctx, m)
        case ".zip":
            ctype = "application/octet-stream"
            f, openErr = s.ops.Zip(ctx, m)
        default:
            http.Error(w, "request not recognized", http.StatusNotFound)
            return
        }
    }
    // ...
    http.ServeContent(w, r, what, info.ModTime(), f)
}
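The routing in ServeHTTP can be condensed into a small, testable function. The sketch below splits a request path into module path, version and resource kind in the same order as the handler; splitRequest is a hypothetical helper, and it deliberately skips the case-unescaping that module.UnescapePath performs in the real code.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// splitRequest parses a proxy request path such as
// "/github.com/goproxyio/goproxy/@v/v1.0.0.info" into its parts.
// kind is one of "latest", "list", "info", "mod" or "zip".
func splitRequest(urlPath string) (mod, version, kind string, err error) {
	i := strings.Index(urlPath, "/@")
	if i < 0 {
		return "", "", "", fmt.Errorf("no /@ in path %q", urlPath)
	}
	mod = strings.TrimPrefix(urlPath[:i], "/")
	what := urlPath[i+len("/@"):]
	switch what {
	case "latest":
		return mod, "", "latest", nil
	case "v/list":
		return mod, "", "list", nil
	}
	what = strings.TrimPrefix(what, "v/")
	ext := path.Ext(what)
	switch ext {
	case ".info", ".mod", ".zip":
		return mod, strings.TrimSuffix(what, ext), ext[1:], nil
	}
	return "", "", "", fmt.Errorf("request not recognized: %q", urlPath)
}

func main() {
	for _, p := range []string{
		"/github.com/goproxyio/goproxy/@v/list",
		"/github.com/goproxyio/goproxy/@v/v1.0.0.info",
		"/github.com/goproxyio/goproxy/@latest",
	} {
		mod, version, kind, err := splitRequest(p)
		fmt.Println(mod, version, kind, err)
	}
}
```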
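The signature of http.ServeContent is:

func ServeContent(w ResponseWriter, req *Request, name string, modtime time.Time, content io.ReadSeeker)

ServeContent replies to the request using the content in the provided io.ReadSeeker. Its main advantage over io.Copy is that it handles Range requests properly, sets the MIME type, and handles If-Modified-Since requests. If the Content-Type header has not been set, ServeContent first tries to deduce the type from the file extension of name; if that fails, it reads the first block of the content and passes it to DetectContentType. The name is otherwise unused; in particular it can be empty and is never sent in the response. If modtime is not the zero time, ServeContent reports it in the Last-Modified header of the response. If the request includes an If-Modified-Since header, ServeContent uses modtime to decide whether the content needs to be sent at all. ServeContent also relies on Seek to determine the size of the content.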
