What is the minimum network traffic required to update a git repo?

Posted 2024-06-25 06:42:36


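git is slow, let's automate

I want to write a script that updates roughly 150 git repositories. Compared to our earlier subversion installation, Gitlab/hub is almost an order of magnitude slower over the network, e.g.: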

(dev) go|c:\srv\lib\examradar> python -c "import time;start=time.time();import os;os.system('svn up');print time.time() - start"
Updating '.':
At revision 31294.
0.559000015259

(dev) go|c:\srv\lib\code\dkrepo> python -c "import time;start=time.time();import os;os.system('git pull');print time.time() - start"
Already up to date.
Current branch master is up to date.
4.31999993324

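That is, 150 svn repos would take at least 84 seconds, while 150 git repos would take more than 10 minutes (!) (Running the same commands on WSL on Win10 gives 0.48 s and 1.52 s, go figure ;-)

With a script we can run all the "simple" updates in parallel and bring the git case down to ~100 seconds. Unfortunately, we are hitting timeouts (often while running git rev-parse @{u}), so I'm looking for the most efficient way to update a git repo while being nice to the git server.

I'm open to "cheating", i.e. working outside of git, if there is a way to know (with high probability) that a repo doesn't need updating (webhooks? a background fetch daemon?).

Messing up a repo is very disruptive, so the script should bail if a pull would cause a merge conflict.

Current code

I'm using the Python invoke package to make calling commands easier. I would be just as happy with answers that use only raw git commands. This is what I've got so far...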
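One way to stay parallel while being gentler on the server is to cap the number of simultaneous git processes. The following is only a sketch: the names `git_fetch` and `run_bounded`, the limit of 4 workers, and the 60 second timeout are assumptions for illustration, not values taken from the setup described above.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor, as_completed


def git_fetch(repo_dir, timeout=60):
    """Run `git fetch` in repo_dir and return its exit code (hypothetical helper)."""
    proc = subprocess.run(
        ["git", "fetch"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        timeout=timeout,  # raises subprocess.TimeoutExpired if the server stalls
    )
    return proc.returncode


def run_bounded(repos, worker=git_fetch, max_workers=4):
    """Apply worker to every repo with at most max_workers running at once.

    Returns {repo: result}; if a worker raised, the exception object is
    stored as that repo's result so one failure does not stop the rest.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(worker, repo): repo for repo in repos}
        for future in as_completed(futures):
            repo = futures[future]
            try:
                results[repo] = future.result()
            except Exception as exc:
                results[repo] = exc
    return results
```

Lowering `max_workers` trades wall-clock time for fewer concurrent connections, which may be enough to avoid the timeouts.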

First, a convenience function that prints the command being run along with its output, and returns the output as a string:

from invoke import task

def runner(c):
    return lambda cmd: c.run(cmd, hide=False, echo=True).stdout.strip()

Next is the task/function that gets the status of a repo. I believe only git fetch and git rev-parse @{u} touch the network (?):

@task
def get_status(c, wc):
    """Return a set containing the strings

          local-clean     if there are no local changes
          local-dirty     if there are local changes
          untracked       if there are files that haven't been added to git
          remote-change   if upstream has changed
          local-change    if there are local committed (but not pushed) changes
          diverged        if local and upstream have diverged

    """
    run = runner(c)

    with c.cd(wc):
        status = []
        porcelain = run('git status --porcelain')
        if porcelain == "":
            status.append('local-clean')
        else:
            status.append('local-dirty')
        untracked = run('git ls-files --others --exclude-standard')
        if untracked:
            status.append('untracked')
        run('git fetch')    # only interested in current branch so not using `git remote update`
        local = run('git rev-parse @')  # get local hash
        try:
            remote = run('git rev-parse @{u}')  # get upstream hash
        except Exception:  # invoke raises UnexpectedExit when the command fails
            remote = local  # repo doesn't have an upstream
        if local != remote:
            base = run('git merge-base @ @{u}')  # common ancestor
            if local == base:
                status.append('remote-change')
            elif remote == base:
                status.append('local-change')
            else:
                status.append('diverged')

    print("STATUS:", status)
    return set(status)
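A possible simplification of the comparison step: `git rev-list --left-right --count HEAD...@{u}` prints how many commits each side has that the other lacks, so a single command could replace the two `rev-parse` calls and the `merge-base` call. To my knowledge it reads only local refs (the remote-tracking branch that the preceding `git fetch` updated), so it should add no network traffic. The function names below are made up for this sketch.

```python
import subprocess


def parse_ahead_behind(output):
    """Parse 'AHEAD<TAB>BEHIND' as printed by
    `git rev-list --left-right --count HEAD...@{u}`."""
    ahead, behind = (int(n) for n in output.split())
    return ahead, behind


def classify(ahead, behind):
    """Map the two counts onto the labels used by get_status."""
    if ahead and behind:
        return 'diverged'
    if behind:
        return 'remote-change'
    if ahead:
        return 'local-change'
    return None  # local and upstream point at the same commit


def upstream_state(repo_dir):
    """Return a label for repo_dir, or None if in sync or no upstream is set."""
    try:
        out = subprocess.run(
            ["git", "rev-list", "--left-right", "--count", "HEAD...@{u}"],
            cwd=repo_dir, capture_output=True, text=True, check=True,
        ).stdout
    except subprocess.CalledProcessError:
        return None  # treated like "no upstream", as in get_status
    return classify(*parse_ahead_behind(out))
```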

On an up-to-date repo, this prints:

(dev) go|c:\srv\tmp\tstgitup> inv get-status \srv\lib\core\ttcal
cd \srv\lib\core\ttcal && git status --porcelain
cd \srv\lib\core\ttcal && git ls-files --others --exclude-standard
cd \srv\lib\core\ttcal && git fetch
cd \srv\lib\core\ttcal && git rev-parse @
eb17f1a9723c992b265b9dce0ffb85274e956538
cd \srv\lib\core\ttcal && git rev-parse @{u}
eb17f1a9723c992b265b9dce0ffb85274e956538
STATUS: ['local-clean']

From the status, I generate an update policy (see the semantics below):

def update_policy(status):
    policy = None

    if 'diverged' in status or 'untracked' in status:
        policy = 'BAIL'
    elif 'local-dirty' in status:           # local uncommitted changes
        if 'remote-change' in status:       # remote changes
            if 'local-change' in status:    # local committed changes
                policy = 'BAIL'
            else:
                policy = 'STASH'  # local uncommitted changes and remote changes
        else:
            policy = 'NOOP'     # local uncommitted changes, no remote changes
    elif 'remote-change' in status:
        if 'local-change' in status and 'local-clean' in status:
            policy = 'PULL'     # remote changes and local committed changes
        elif 'local-dirty' in status:
            policy = 'STASH'    # remote changes and local uncommitted changes
        else:
            policy = 'PULL'     # remote change, no local changes
    elif 'local-change' in status:
        if 'remote-change' in status:
            policy = 'PULL'     # local committed changes and remote changes
        else:
            policy = 'NOOP'     # no remote changes
    else:
        policy = 'NOOP'  # no local/remote changes, no untracked files.
    return policy    
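Several branches of the function above can never be reached, because an earlier test already rules them out. As a reading aid, here is a condensed sketch that, as far as I can tell, returns the same policy for every combination of status strings. `condensed_policy` is a hypothetical name, not part of the original code.

```python
def condensed_policy(status):
    """Condensed sketch of update_policy; returns BAIL, STASH, PULL or NOOP."""
    status = set(status)
    if status & {'diverged', 'untracked'}:
        return 'BAIL'
    if 'remote-change' in status:
        if 'local-dirty' in status:
            # Uncommitted changes plus upstream changes: stash around the
            # merge, unless there are also unpushed local commits.
            return 'BAIL' if 'local-change' in status else 'STASH'
        return 'PULL'
    # Nothing new upstream, so there is nothing to fetch into the work tree.
    return 'NOOP'
```

The checks below replay the status lists that appear in the transcripts in this post.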

Finally, the gitup command that will perform the update:

@task
def gitup(c, wc):
    run = runner(c)
    status = get_status(c, wc)
    policy = update_policy(status)
    print("UPDATE:POLICY:", policy)

    if policy == 'BAIL':
        print("don't know what to do, bailing..")
    elif policy == 'NOOP':
        print("nothing to do..")
    elif policy == 'PULL':
        # run('git pull')
        print("RUN: git merge FETCH_HEAD")  # we've already done a `git fetch` so don't call `git pull`
    elif policy == 'STASH':
        # print("RUN: git stash clear")  #..?
        print("RUN: git stash push")
        print("RUN: git merge FETCH_HEAD")
        print("RUN: git stash pop -q")
    else:
        print("UNKNOWN POLICY:", policy)
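Since the goal is to bail rather than risk a merge conflict, one option is to have the merge step refuse anything that is not a fast-forward. The sketch below maps each policy to a command list; using `git merge --ff-only @{u}` instead of `git merge FETCH_HEAD` is my assumption, not something the code above does, and `plan` is a hypothetical helper.

```python
def plan(policy):
    """Return the git commands to run for a policy (sketch, nothing is executed)."""
    # --ff-only makes git abort instead of creating a merge commit when
    # the branch cannot simply be fast-forwarded to its upstream.
    merge = 'git merge --ff-only @{u}'
    plans = {
        'BAIL': [],
        'NOOP': [],
        'PULL': [merge],
        'STASH': ['git stash push', merge, 'git stash pop -q'],
    }
    if policy not in plans:
        raise ValueError("unknown policy: %r" % (policy,))
    return plans[policy]
```

Note that `git stash pop` can itself conflict when the stashed edits overlap with the incoming changes, so the STASH case would still need its exit status checked.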

Status/policy/actions

For a repo with local uncommitted changes:

(dev) go|c:\srv\tmp\tstgitup> inv gitup \srv\lib\almanac
cd \srv\lib\almanac && git status --porcelain
 M .gitignore
cd \srv\lib\almanac && git ls-files --others --exclude-standard
cd \srv\lib\almanac && git fetch
cd \srv\lib\almanac && git rev-parse @
e23bfd8a03432ca02cfb0e31bb229d2bb53dfc4f
cd \srv\lib\almanac && git rev-parse @{u}
e23bfd8a03432ca02cfb0e31bb229d2bb53dfc4f
STATUS: ['local-dirty']
UPDATE:POLICY: NOOP
nothing to do..

Checking in the file

(dev) go|c:\srv\tmp\tstgitup> pushd \srv\lib\almanac && git commit -am "update gitignore" && popd
[master 21a727f] update gitignore
 1 file changed, 1 insertion(+)

gives

(dev) go|c:\srv\tmp\tstgitup> inv gitup \srv\lib\almanac
cd \srv\lib\almanac && git status --porcelain
cd \srv\lib\almanac && git ls-files --others --exclude-standard
cd \srv\lib\almanac && git fetch
cd \srv\lib\almanac && git rev-parse @
21a727fd31357e585f4de6f6af6d3fef87da4dee
cd \srv\lib\almanac && git rev-parse @{u}
e23bfd8a03432ca02cfb0e31bb229d2bb53dfc4f
cd \srv\lib\almanac && git merge-base @ @{u}
e23bfd8a03432ca02cfb0e31bb229d2bb53dfc4f
STATUS: ['local-clean', 'local-change']
UPDATE:POLICY: NOOP
nothing to do..

Pushing the change:

(dev) go|c:\srv\tmp\tstgitup> pushd \srv\lib\almanac && git push && popd
Counting objects: 3, done.
Delta compression using up to 12 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 285 bytes | 285.00 KiB/s, done.
Total 3 (delta 2), reused 0 (delta 0)
To https://gitlab.com/norsktest/almanac.git
   e23bfd8..21a727f  master -> master

(dev) go|c:\srv\tmp\tstgitup> inv gitup \srv\lib\almanac
cd \srv\lib\almanac && git status --porcelain
cd \srv\lib\almanac && git ls-files --others --exclude-standard
cd \srv\lib\almanac && git fetch
cd \srv\lib\almanac && git rev-parse @
21a727fd31357e585f4de6f6af6d3fef87da4dee
cd \srv\lib\almanac && git rev-parse @{u}
21a727fd31357e585f4de6f6af6d3fef87da4dee
STATUS: ['local-clean']
UPDATE:POLICY: NOOP
nothing to do..

After a change has been made upstream:

(dev) go|c:\srv\tmp\tstgitup> inv gitup \srv\lib\almanac
cd \srv\lib\almanac && git status --porcelain
cd \srv\lib\almanac && git ls-files --others --exclude-standard
cd \srv\lib\almanac && git fetch
From https://gitlab.com/norsktest/almanac
   21a727f..1f0c065  master     -> origin/master
cd \srv\lib\almanac && git rev-parse @
21a727fd31357e585f4de6f6af6d3fef87da4dee
cd \srv\lib\almanac && git rev-parse @{u}
1f0c06576ee6eb2deb159f5d6d4b54c14867ca3a
cd \srv\lib\almanac && git merge-base @ @{u}
21a727fd31357e585f4de6f6af6d3fef87da4dee
STATUS: ['local-clean', 'remote-change']
UPDATE:POLICY: PULL
RUN: git merge FETCH_HEAD

Changing an existing file:

(dev) go|c:\srv\tmp\tstgitup> echo foobar >> \srv\lib\almanac\.gitignore

(dev) go|c:\srv\tmp\tstgitup> inv gitup \srv\lib\almanac
cd \srv\lib\almanac && git status --porcelain
 M .gitignore
cd \srv\lib\almanac && git ls-files --others --exclude-standard
cd \srv\lib\almanac && git fetch
cd \srv\lib\almanac && git rev-parse @
21a727fd31357e585f4de6f6af6d3fef87da4dee
cd \srv\lib\almanac && git rev-parse @{u}
1f0c06576ee6eb2deb159f5d6d4b54c14867ca3a
cd \srv\lib\almanac && git merge-base @ @{u}
21a727fd31357e585f4de6f6af6d3fef87da4dee
STATUS: ['local-dirty', 'remote-change']
UPDATE:POLICY: STASH
RUN: git stash push
RUN: git merge FETCH_HEAD
RUN: git stash pop -q

...and finally(?) adding a new untracked file:

(dev) go|c:\srv\tmp\tstgitup> touch \srv\lib\almanac\foo.bar

(dev) go|c:\srv\tmp\tstgitup> inv gitup \srv\lib\almanac
cd \srv\lib\almanac && git status --porcelain
 M .gitignore
?? foo.bar
cd \srv\lib\almanac && git ls-files --others --exclude-standard
foo.bar
cd \srv\lib\almanac && git fetch
cd \srv\lib\almanac && git rev-parse @
21a727fd31357e585f4de6f6af6d3fef87da4dee
cd \srv\lib\almanac && git rev-parse @{u}
1f0c06576ee6eb2deb159f5d6d4b54c14867ca3a
cd \srv\lib\almanac && git merge-base @ @{u}
21a727fd31357e585f4de6f6af6d3fef87da4dee
STATUS: ['local-dirty', 'untracked', 'remote-change']
UPDATE:POLICY: BAIL
don't know what to do, bailing..

1 answer

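You don't have to ask git to synchronize complete repository histories; it is just generally the most convenient thing to do, and so cheap that you may as well do it while you're there. Before comparing svn and git, try having them do the same thing. svn up is only concerned with the current tip and does no checking of anything else.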

$ time git ls-remote git://github.com/torvalds/linux refs/heads/master
6e8ba0098e241a5425f7aa6d950a5a00c44c9781        refs/heads/master

real    0m0.536s
user    0m0.004s
sys     0m0.007s
$

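Unsurprisingly, checking a single remote tip takes about the same time with svn and with git.

The short name of your current branch is given by git symbolic-ref -q --short HEAD (which fails if you are not on a branch).

So a closer equivalent to what your svn up is doing is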

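# Ask the server for the upstream tip only; pull only when it differs
# from the locally recorded upstream commit.
if branch=$(git symbolic-ref -q --short HEAD) &&
    remote=$(git config "branch.$branch.remote") &&
    merge=$(git config "branch.$branch.merge") &&
    upstreamtip=$(git ls-remote "$remote" "$merge" | cut -f1) &&
    test "$upstreamtip" != "$(git rev-parse @{u})"
        then git pull "$remote" "$merge"
fi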
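Since the question's tooling is Python, the same check could be sketched with `subprocess`. This is only an illustration under assumptions: `git`, `parse_ls_remote` and `needs_pull` are hypothetical helper names, and error handling is left to the caller (for example, `git symbolic-ref` exits non-zero on a detached HEAD, which surfaces here as `subprocess.CalledProcessError`).

```python
import subprocess


def git(*args, cwd=None):
    """Run a git command and return its stripped stdout; raises on failure."""
    return subprocess.run(
        ["git", *args], cwd=cwd, capture_output=True, text=True, check=True
    ).stdout.strip()


def parse_ls_remote(output):
    """Return the hash on the first non-empty line of `git ls-remote` output."""
    for line in output.splitlines():
        if line.strip():
            return line.split()[0]
    return None  # the remote has no matching ref


def needs_pull(repo_dir):
    """True if the upstream tip differs from the locally recorded one."""
    branch = git("symbolic-ref", "-q", "--short", "HEAD", cwd=repo_dir)
    remote = git("config", "branch.%s.remote" % branch, cwd=repo_dir)
    merge = git("config", "branch.%s.merge" % branch, cwd=repo_dir)
    # ls-remote is the only call here that contacts the server.
    tip = parse_ls_remote(git("ls-remote", remote, merge, cwd=repo_dir))
    return tip is not None and tip != git("rev-parse", "@{u}", cwd=repo_dir)
```

A caller would run the pull (or the full status check) only for repos where `needs_pull` returns true.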
