Question: Python: Get relative path from comparing two absolute paths

Say I have two absolute paths. I need to check whether the location referred to by one of the paths is a descendant of the other. If so, I need to find the relative path of the descendant from the ancestor. What’s a good way to implement this in Python? Is there any library that I can benefit from?


Answer 0

os.path.commonprefix() and os.path.relpath() are your friends:

>>> import os
>>> os.path.commonprefix(['/usr/var/log', '/usr/var/security'])  # character-wise prefix, so the trailing slash is kept
'/usr/var/'
>>> os.path.commonprefix(['/tmp', '/usr/var'])  # Nothing in common beyond the root
'/'

You can thus test whether the common prefix is one of the paths, i.e. if one of the paths is a common ancestor:

paths = […, …, …]
common_prefix = os.path.commonprefix(paths)
if common_prefix in paths:
    …

You can then find the relative paths:

relative_paths = [os.path.relpath(path, common_prefix) for path in paths]

With this method you can even handle more than two paths and test whether all of them are below one of them.

PS: depending on what your paths look like, you might want to perform some normalization first (this is useful when you do not know whether they always end with ‘/’, or whether some of the paths are relative). Relevant functions include os.path.abspath() and os.path.normpath().
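
As a rough sketch of how the steps above fit together (normalize, take the common prefix, check that it is one of the paths, then compute relative paths), something like the following could work. The helper name relative_to_common_ancestor is made up for illustration, and posixpath (the POSIX flavour of os.path) is used only so that the results do not depend on the host OS. Note that it still relies on commonprefix(), so it inherits the character-wise pitfall described in the PPS.

```python
import posixpath  # POSIX flavour of os.path; an assumption made for reproducible output


def relative_to_common_ancestor(paths):
    """Return each path relative to the path that is their common prefix.

    Returns None when the common prefix is not itself one of the paths.
    Hypothetical helper, not part of the standard library.
    """
    # Strip trailing slashes and resolve '.' and '..' components first.
    normalized = [posixpath.normpath(p) for p in paths]
    common_prefix = posixpath.commonprefix(normalized)
    if common_prefix not in normalized:
        return None
    return [posixpath.relpath(p, common_prefix) for p in normalized]


result = relative_to_common_ancestor(
    ['/usr/var/', '/usr/var/log', '/usr/var/./security/']
)
print(result)
```

With these inputs the common prefix is the first path once it is normalized, so every entry is expressed relative to it.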

PPS: as Peter Briggs mentioned in the comments, the simple approach described above can fail:

>>> os.path.commonprefix(['/usr/var', '/usr/var2/log'])
'/usr/var'

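even though /usr/var is not a common ancestor directory of the paths: commonprefix() compares the strings character by character and knows nothing about path components. Forcing all paths to end with ‘/’ before calling commonprefix() solves this (specific) problem.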

PPPS: as bluenote10 mentioned, adding a slash does not solve the general problem. Here is his followup question: How to circumvent the fallacy of Python’s os.path.commonprefix?

PPPPS: starting with Python 3.4, we have pathlib, a module that provides a saner path manipulation environment. I guess that the common prefix of a set of paths can be obtained by getting all the prefixes of each path (with the PurePath.parents property), taking the intersection of all these parent sets, and selecting the longest common prefix.
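
The idea in this postscript can be sketched roughly as follows. This is only one possible reading of it: the function name common_ancestor is invented for the example, PurePosixPath is used so the result does not depend on the host OS, and all inputs are assumed to be absolute paths (mixing absolute and relative paths would leave the intersection empty).

```python
from pathlib import PurePosixPath


def common_ancestor(paths):
    """Return the deepest path that is an ancestor of, or equal to, every input.

    Hypothetical helper; assumes every input is an absolute POSIX path.
    """
    candidate_sets = []
    for p in map(PurePosixPath, paths):
        # Include the path itself so that '/usr/var' can be the answer
        # for ['/usr/var', '/usr/var/log'].
        candidate_sets.append({p, *p.parents})
    shared = set.intersection(*candidate_sets)
    # All shared candidates lie on one chain, so the one with the most
    # components is the deepest common ancestor.
    return max(shared, key=lambda q: len(q.parts))


print(common_ancestor(['/usr/var', '/usr/var2/log']))
```

Because whole path components are compared, /usr/var and /usr/var2 are not confused with each other here.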

PPPPPS: Python 3.5 introduced a proper solution to this question: os.path.commonpath(), which returns a valid path.
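
For the original two-path question, a minimal sketch built on commonpath() might look like this. The helper name relative_if_descendant is hypothetical, posixpath stands in for os.path so the output is the same on every OS, and both arguments are assumed to be absolute (commonpath() raises ValueError when absolute and relative paths are mixed).

```python
import posixpath  # POSIX flavour of os.path; an assumption made for reproducible output


def relative_if_descendant(ancestor, descendant):
    """Return descendant relative to ancestor, or None if it is not below it.

    Hypothetical helper; a path is treated as a descendant of itself.
    """
    ancestor = posixpath.normpath(ancestor)
    descendant = posixpath.normpath(descendant)
    # commonpath() compares whole components, unlike commonprefix().
    if posixpath.commonpath([ancestor, descendant]) != ancestor:
        return None
    return posixpath.relpath(descendant, ancestor)


print(relative_if_descendant('/usr/var', '/usr/var/log/'))
print(relative_if_descendant('/usr/var', '/usr/var2/log'))
```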


Answer 1

os.path.relpath:

Return a relative filepath to path either from the current directory or from an optional start point.

>>> from os.path import relpath
>>> relpath('/usr/var/log/', '/usr/var')
'log'
>>> relpath('/usr/var/log/', '/usr/var/sad/')
'../log'

So, if the relative path starts with a '..' component, the first path is not a descendant of the second path (the start point).
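
One way to turn that observation into a check is sketched below. The function name is_descendant is invented for the example and posixpath is used so the result does not depend on the host OS. It compares the first component with '..' exactly, because a plain string test such as startswith('..') would also match a directory whose name merely begins with two dots.

```python
import posixpath  # POSIX flavour of os.path; an assumption made for reproducible output


def is_descendant(path, start):
    """Return True if path is start itself or lies somewhere below it.

    Hypothetical helper based on the shape of relpath()'s result.
    """
    rel = posixpath.relpath(path, start)
    # Only the first component matters: '..' means we had to climb out of start.
    first_component = rel.split(posixpath.sep, 1)[0]
    return first_component != posixpath.pardir


print(is_descendant('/usr/var/log/', '/usr/var'))
print(is_descendant('/usr/var/log/', '/usr/var/sad/'))
```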

In Python 3 you can use PurePath.relative_to:

Python 3.5.1 (default, Jan 22 2016, 08:54:32)
>>> from pathlib import Path

>>> Path('/usr/var/log').relative_to('/usr/var/log/')
PosixPath('.')

>>> Path('/usr/var/log').relative_to('/usr/var/')
PosixPath('log')

>>> Path('/usr/var/log').relative_to('/etc/')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pathlib.py", line 851, in relative_to
    .format(str(self), str(formatted)))
ValueError: '/usr/var/log' does not start with '/etc'
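
Since relative_to raises ValueError when the path is not below the given ancestor, the check and the computation can be combined in a small wrapper. The following is a sketch only: the name relative_or_none is made up, and PurePosixPath is used instead of Path so the behaviour is the same on every OS.

```python
from pathlib import PurePosixPath


def relative_or_none(path, ancestor):
    """Return path relative to ancestor, or None if path is not below ancestor.

    Hypothetical helper wrapping PurePath.relative_to.
    """
    try:
        return PurePosixPath(path).relative_to(ancestor)
    except ValueError:
        # relative_to refuses to produce a path that climbs out of ancestor.
        return None


print(relative_or_none('/usr/var/log', '/usr/var/'))
print(relative_or_none('/usr/var/log', '/etc/'))
```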

Answer 2

Another option is

>>> import os
>>> print(os.path.relpath('/usr/var/log/', '/usr/var'))
log

Answer 3

A write-up of jme’s suggestion, using pathlib, in Python 3.

from pathlib import Path
parent = Path(r'/a/b')
son = Path(r'/a/b/c/d')

if parent in son.parents or parent == son:
    print(son.relative_to(parent))  # relative_to() returns a Path equivalent to 'c/d'

Answer 4

Pure Python 2, with no dependencies beyond the standard library:

import os
import sys

def relpath(cwd, path):
    """Create a relative path for path from cwd, if possible"""
    if sys.platform == "win32":
        cwd = cwd.lower()
        path = path.lower()
    _cwd = os.path.abspath(cwd).split(os.path.sep)
    _path = os.path.abspath(path).split(os.path.sep)
    eq_until_pos = None
    for i in xrange(min(len(_cwd), len(_path))):
        if _cwd[i] == _path[i]:
            eq_until_pos = i
        else:
            break
    if eq_until_pos is None:
        return path
    newpath = [".." for i in xrange(len(_cwd[eq_until_pos+1:]))]
    newpath.extend(_path[eq_until_pos+1:])
    return os.path.join(*newpath) if newpath else "."

Answer 5

Edit: See jme’s answer for the best way with Python 3.

Using pathlib, you have the following solution:

Let’s say we want to check whether son is a descendant of parent, and both are Path objects. We can get a list of the parts in the path with list(parent.parts). Then we just check that the beginning of the son’s list is equal to the list of segments of the parent.

>>> lparent = list(parent.parts)
>>> lson = list(son.parts)
>>> if lson[:len(lparent)] == lparent:
...     pass  # parent is an ancestor of son (or the same path) :)

If you want to get the remaining part, you can just do

>>> '/'.join(lson[len(lparent):])

It’s a string, but you can of course pass it to the Path constructor to get another Path object.
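
Put together, the parts-based comparison could be wrapped up roughly like this. The function name remainder_after is invented for illustration, and PurePosixPath is used instead of Path so that the result is the same on every OS. The remaining parts are passed straight to the constructor rather than joined into a string first.

```python
from pathlib import PurePosixPath


def remainder_after(parent, son):
    """Return the part of son that follows parent, or None if son is not below parent.

    Hypothetical helper based on comparing the .parts tuples.
    """
    parent_parts = PurePosixPath(parent).parts
    son_parts = PurePosixPath(son).parts
    # son must start with exactly the same components as parent.
    if son_parts[:len(parent_parts)] != parent_parts:
        return None
    # With no remaining parts, the constructor yields the current directory '.'.
    return PurePosixPath(*son_parts[len(parent_parts):])


print(remainder_after('/a/b', '/a/b/c/d'))
```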

