Python Git模块的经验?[关闭]

问题:Python Git模块的经验?[关闭]

人们对Python的任何Git模块有何经验?(我知道GitPython,PyGit和Dulwich-如果您知道其他人,请随意提及。)

我正在编写一个程序,该程序必须与Git存储库进行交互(添加,删除,提交),但是没有使用Git的经验,所以我要寻找的一件事是关于Git的易用性/理解性。

我主要感兴趣的其他内容是库的成熟度和完整性,合理的错误缺失,持续的开发以及文档和开发人员的帮助。

如果您有其他我想/需要知道的事情,请随时提及。

What are people’s experiences with any of the Git modules for Python? (I know of GitPython, PyGit, and Dulwich – feel free to mention others if you know of them.)

I am writing a program which will have to interact (add, delete, commit) with a Git repository, but have no experience with Git, so one of the things I’m looking for is ease of use/understanding with regards to Git.

The other things I’m primarily interested in are maturity and completeness of the library, a reasonable lack of bugs, continued development, and helpfulness of the documentation and developers.

If you think of something else I might want/need to know, please feel free to mention it.


回答 0

虽然这个问题是在不久前提出的,但我当时还不知道库的状态,但是值得搜索的人提到,GitPython在抽象命令行工具方面做得很好,因此您无需使用子流程。您可以使用一些有用的内置抽象,但是对于其他所有事情,您都可以执行以下操作:

import git
repo = git.Repo( '/home/me/repodir' )
print repo.git.status()
# checkout and track a remote branch
print repo.git.checkout( 'origin/somebranch', b='somebranch' )
# add a file
print repo.git.add( 'somefile' )
# commit
print repo.git.commit( m='my commit message' )
# now we are one commit ahead
print repo.git.status()

GitPython中的其他所有功能都使其更易于浏览。我对此库非常满意,并赞赏它是基础git工具的包装。

更新:我已经切换到不仅使用git,而且还使用python中需要的大多数命令行实用程序使用sh模块。为了复制上面的内容,我将改为执行以下操作:

import sh
git = sh.git.bake(_cwd='/home/me/repodir')
print git.status()
# checkout and track a remote branch
print git.checkout('-b', 'somebranch')
# add a file
print git.add('somefile')
# commit
print git.commit(m='my commit message')
# now we are one commit ahead
print git.status()

While this question was asked a while ago and I don’t know the state of the libraries at that point, it is worth mentioning for searchers that GitPython does a good job of abstracting the command line tools so that you don’t need to use subprocess. There are some useful built in abstractions that you can use, but for everything else you can do things like:

import git
repo = git.Repo( '/home/me/repodir' )
print repo.git.status()
# checkout and track a remote branch
print repo.git.checkout( 'origin/somebranch', b='somebranch' )
# add a file
print repo.git.add( 'somefile' )
# commit
print repo.git.commit( m='my commit message' )
# now we are one commit ahead
print repo.git.status()

Everything else in GitPython just makes it easier to navigate. I’m fairly well satisfied with this library and appreciate that it is a wrapper on the underlying git tools.

UPDATE: I’ve switched to using the sh module for not just git but most commandline utilities I need in python. To replicate the above I would do this instead:

import sh
git = sh.git.bake(_cwd='/home/me/repodir')
print git.status()
# checkout and track a remote branch
print git.checkout('-b', 'somebranch')
# add a file
print git.add('somefile')
# commit
print git.commit(m='my commit message')
# now we are one commit ahead
print git.status()

回答 1

我以为我会回答自己的问题,因为我所采取的途径与答案中所建议的不同。尽管如此,感谢那些回答。

首先,简要介绍一下我在GitPython,PyGit和Dulwich的经验:

  • GitPython:下载后,我将其导入并初始化了适当的对象。但是,尝试执行本教程中建议的操作会导致错误。缺乏更多文档,我转向其他地方。
  • PyGit:这甚至都不会导入,而且我找不到任何文档。
  • 德威:似乎是最有前途的(至少就我想要和看到的而言)。与GitPython相比,我在其中取得了一些进步,因为它的卵来自Python源。但是,过了一会儿,我决定尝试做一下可能会更容易。

同样,StGit看起来很有趣,但是我需要将功能提取到一个单独的模块中,并且不希望现在等待它发生。

在比使上面的三个模块正常工作所花费的时间少得多的时间内,我设法通过子流程模块使git命令起作用,例如

def gitAdd(fileName, repoDir):
    cmd = ['git', 'add', fileName]
    p = subprocess.Popen(cmd, cwd=repoDir)
    p.wait()

gitAdd('exampleFile.txt', '/usr/local/example_git_repo_dir')

这还没有完全集成到我的程序中,但是除了速度(我有时会处理数百甚至数千个文件)之外,我没有预料到任何问题。

也许我只是没有耐心让Dulwich或GitPython正常运行。就是说,我希望这些模块能够得到更多的开发并且很快会有用。

I thought I would answer my own question, since I’m taking a different path than suggested in the answers. Nonetheless, thanks to those who answered.

First, a brief synopsis of my experiences with GitPython, PyGit, and Dulwich:

  • GitPython: After downloading, I got this imported and the appropriate object initialized. However, trying to do what was suggested in the tutorial led to errors. Lacking more documentation, I turned elsewhere.
  • PyGit: This would not even import, and I could find no documentation.
  • Dulwich: Seems to be the most promising (at least for what I wanted and saw). I made some progress with it, more than with GitPython, since its egg comes with Python source. However, after a while, I decided it may just be easier to try what I did.

Also, StGit looks interesting, but I would need the functionality extracted into a separate module and do not want wait for that to happen right now.

In (much) less time than I spent trying to get the three modules above working, I managed to get git commands working via the subprocess module, e.g.

def gitAdd(fileName, repoDir):
    cmd = ['git', 'add', fileName]
    p = subprocess.Popen(cmd, cwd=repoDir)
    p.wait()

gitAdd('exampleFile.txt', '/usr/local/example_git_repo_dir')

This isn’t fully incorporated into my program yet, but I’m not anticipating a problem, except maybe speed (since I’ll be processing hundreds or even thousands of files at times).

Maybe I just didn’t have the patience to get things going with Dulwich or GitPython. That said, I’m hopeful the modules will get more development and be more useful soon.


回答 2

我建议pygit2-它使用出色的libgit2绑定

I’d recommend pygit2 – it uses the excellent libgit2 bindings


回答 3

这是一个非常老的问题,在寻找Git库时,我发现了今年(2013年)制造的一个名为Gittle的库

它对我很有用(我尝试过的其他地方都比较薄弱),而且似乎涵盖了大多数常见操作。

自述文件中的一些示例:

from gittle import Gittle

# Clone a repository
repo_path = '/tmp/gittle_bare'
repo_url = 'git://github.com/FriendCode/gittle.git'
repo = Gittle.clone(repo_url, repo_path)

# Stage multiple files
repo.stage(['other1.txt', 'other2.txt'])

# Do the commit
repo.commit(name="Samy Pesse", email="samy@friendco.de", message="This is a commit")

# Authentication with RSA private key
key_file = open('/Users/Me/keys/rsa/private_rsa')
repo.auth(pkey=key_file)

# Do push
repo.push()

This is a pretty old question, and while looking for Git libraries, I found one that was made this year (2013) called Gittle.

It worked great for me (where the others I tried were flaky), and seems to cover most of the common actions.

Some examples from the README:

from gittle import Gittle

# Clone a repository
repo_path = '/tmp/gittle_bare'
repo_url = 'git://github.com/FriendCode/gittle.git'
repo = Gittle.clone(repo_url, repo_path)

# Stage multiple files
repo.stage(['other1.txt', 'other2.txt'])

# Do the commit
repo.commit(name="Samy Pesse", email="samy@friendco.de", message="This is a commit")

# Authentication with RSA private key
key_file = open('/Users/Me/keys/rsa/private_rsa')
repo.auth(pkey=key_file)

# Do push
repo.push()

回答 4

也许有帮助,但是Bazaar和Mercurial都使用dulwich来实现Git的互操作性。

Dulwich在某种意义上可能与另一个有所不同,因为它是python中git的重新实现。另一个可能只是Git命令的包装器(因此从较高的角度来看,它可能更易于使用:commit / add / delete),这可能意味着它们的API与git的命令行非常接近,因此您需要获得有关Git的经验。

Maybe it helps, but Bazaar and Mercurial are both using dulwich for their Git interoperability.

Dulwich is probably different than the other in the sense that’s it’s a reimplementation of git in python. The other might just be a wrapper around Git’s commands (so it could be simpler to use from a high level point of view: commit/add/delete), it probably means their API is very close to git’s command line so you’ll need to gain experience with Git.


回答 5

更新的答案反映了更改的时间:

GitPython当前是最容易使用的。它支持许多git plumbing命令的包装,并具有可插入的对象数据库(其中的一个是德威奇),并且如果未实现命令,则提供了一个简单的api,可以用于命令行。例如:

repo = Repo('.')
repo.checkout(b='new_branch')

这调用:

bash$ git checkout -b new_branch

德威也不错,但水平要低得多。使用它有点痛苦,因为它需要在管道级上对git对象进行操作,并且没有通常想要的精美瓷器。但是,如果您打算修改git的任何部分,或者使用git-receive-pack和git-upload-pack,则需要使用dulwich。

An updated answer reflecting changed times:

GitPython currently is the easiest to use. It supports wrapping of many git plumbing commands and has pluggable object database (dulwich being one of them), and if a command isn’t implemented, provides an easy api for shelling out to the command line. For example:

repo = Repo('.')
repo.checkout(b='new_branch')

This calls:

bash$ git checkout -b new_branch

Dulwich is also good but much lower level. It’s somewhat of a pain to use because it requires operating on git objects at the plumbing level and doesn’t have nice porcelain that you’d normally want to do. However, if you plan on modifying any parts of git, or use git-receive-pack and git-upload-pack, you need to use dulwich.


回答 6

为了完整起见,http://github.com/alex/pyvcs/是所有dvc的抽象层。它使用dulwich,但与其他dvc提供互操作。

For the sake of completeness, http://github.com/alex/pyvcs/ is an abstraction layer for all dvcs’s. It uses dulwich, but provides interop with the other dvcs’s.


回答 7

这是“ git status”的真正快速实现:

import os
import string
from subprocess import *

repoDir = '/Users/foo/project'

def command(x):
    return str(Popen(x.split(' '), stdout=PIPE).communicate()[0])

def rm_empty(L): return [l for l in L if (l and l!="")]

def getUntracked():
    os.chdir(repoDir)
    status = command("git status")
    if "# Untracked files:" in status:
        untf = status.split("# Untracked files:")[1][1:].split("\n")
        return rm_empty([x[2:] for x in untf if string.strip(x) != "#" and x.startswith("#\t")])
    else:
        return []

def getNew():
    os.chdir(repoDir)
    status = command("git status").split("\n")
    return [x[14:] for x in status if x.startswith("#\tnew file:   ")]

def getModified():
    os.chdir(repoDir)
    status = command("git status").split("\n")
    return [x[14:] for x in status if x.startswith("#\tmodified:   ")]

print("Untracked:")
print( getUntracked() )
print("New:")
print( getNew() )
print("Modified:")
print( getModified() )

Here’s a really quick implementation of “git status”:

import os
import string
from subprocess import *

repoDir = '/Users/foo/project'

def command(x):
    return str(Popen(x.split(' '), stdout=PIPE).communicate()[0])

def rm_empty(L): return [l for l in L if (l and l!="")]

def getUntracked():
    os.chdir(repoDir)
    status = command("git status")
    if "# Untracked files:" in status:
        untf = status.split("# Untracked files:")[1][1:].split("\n")
        return rm_empty([x[2:] for x in untf if string.strip(x) != "#" and x.startswith("#\t")])
    else:
        return []

def getNew():
    os.chdir(repoDir)
    status = command("git status").split("\n")
    return [x[14:] for x in status if x.startswith("#\tnew file:   ")]

def getModified():
    os.chdir(repoDir)
    status = command("git status").split("\n")
    return [x[14:] for x in status if x.startswith("#\tmodified:   ")]

print("Untracked:")
print( getUntracked() )
print("New:")
print( getNew() )
print("Modified:")
print( getModified() )

回答 8

PTBNL的答案对我来说非常完美。我为Windows用户提供了更多功能。

import time
import subprocess
def gitAdd(fileName, repoDir):
    cmd = 'git add ' + fileName
    pipe = subprocess.Popen(cmd, shell=True, cwd=repoDir,stdout = subprocess.PIPE,stderr = subprocess.PIPE )
    (out, error) = pipe.communicate()
    print out,error
    pipe.wait()
    return 

def gitCommit(commitMessage, repoDir):
    cmd = 'git commit -am "%s"'%commitMessage
    pipe = subprocess.Popen(cmd, shell=True, cwd=repoDir,stdout = subprocess.PIPE,stderr = subprocess.PIPE )
    (out, error) = pipe.communicate()
    print out,error
    pipe.wait()
    return 
def gitPush(repoDir):
    cmd = 'git push '
    pipe = subprocess.Popen(cmd, shell=True, cwd=repoDir,stdout = subprocess.PIPE,stderr = subprocess.PIPE )
    (out, error) = pipe.communicate()
    pipe.wait()
    return 

temp=time.localtime(time.time())
uploaddate= str(temp[0])+'_'+str(temp[1])+'_'+str(temp[2])+'_'+str(temp[3])+'_'+str(temp[4])

repoDir='d:\\c_Billy\\vfat\\Programming\\Projector\\billyccm' # your git repository , windows your need to use double backslash for right directory.
gitAdd('.',repoDir )
gitCommit(uploaddate, repoDir)
gitPush(repoDir)

PTBNL’s Answer is quite perfect for me. I make a little more for Windows user.

import time
import subprocess
def gitAdd(fileName, repoDir):
    cmd = 'git add ' + fileName
    pipe = subprocess.Popen(cmd, shell=True, cwd=repoDir,stdout = subprocess.PIPE,stderr = subprocess.PIPE )
    (out, error) = pipe.communicate()
    print out,error
    pipe.wait()
    return 

def gitCommit(commitMessage, repoDir):
    cmd = 'git commit -am "%s"'%commitMessage
    pipe = subprocess.Popen(cmd, shell=True, cwd=repoDir,stdout = subprocess.PIPE,stderr = subprocess.PIPE )
    (out, error) = pipe.communicate()
    print out,error
    pipe.wait()
    return 
def gitPush(repoDir):
    cmd = 'git push '
    pipe = subprocess.Popen(cmd, shell=True, cwd=repoDir,stdout = subprocess.PIPE,stderr = subprocess.PIPE )
    (out, error) = pipe.communicate()
    pipe.wait()
    return 

temp=time.localtime(time.time())
uploaddate= str(temp[0])+'_'+str(temp[1])+'_'+str(temp[2])+'_'+str(temp[3])+'_'+str(temp[4])

repoDir='d:\\c_Billy\\vfat\\Programming\\Projector\\billyccm' # your git repository , windows your need to use double backslash for right directory.
gitAdd('.',repoDir )
gitCommit(uploaddate, repoDir)
gitPush(repoDir)

回答 9

StGit的git交互库部分实际上非常好。但是,它不是作为单独的软件包分解的,但是,如果有足够的兴趣,我相信可以解决。

它具有非常好的抽象,用于表示提交,树等,以及用于创建新的提交和树。

The git interaction library part of StGit is actually pretty good. However, it isn’t broken out as a separate package but if there is sufficient interest, I’m sure that can be fixed.

It has very nice abstractions for representing commits, trees etc, and for creating new commits and trees.


回答 10

记录下来,前面提到的Git Python库似乎都没有包含“ git status”等效项,这实际上是我唯一想要的,因为通过子进程处理其余git命令非常容易。

For the record, none of the aforementioned Git Python libraries seem to contain a “git status” equivalent, which is really the only thing I would want since dealing with the rest of the git commands via subprocess is so easy.