What is the correct way to reference Bazel data files in Python?

Posted 2024-09-27 07:20:23


Suppose I have the following BUILD file:

py_library(
  name = "foo",
  srcs = ["foo.py"],
  data = ["//bar:data.json"],
)

How should I refer to data.json from inside foo.py? I want something like the following; what should I use for some_path?

with open(some_path, 'r') as fp:
    data = json.load(fp)

I could not find much general documentation about *.runfiles online. Any pointers would be appreciated!


3 Answers

A more appropriate answer is to use the runfiles library that Bazel provides, located at https://github.com/bazelbuild/bazel/blob/master/tools/python/runfiles/runfiles.py

See the usage section at the top of that file.
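
For the example in the question, that usage could be sketched roughly as follows. This is only a sketch under assumptions: load_runfile_json and its runfiles_obj parameter are hypothetical names introduced here (the parameter exists so the function can be exercised outside Bazel), "name_of_workspace" stands for whatever your workspace is called, and the import only resolves when the target depends on "@bazel_tools//tools/python/runfiles", as described in the usage notes of runfiles.py.

```python
import json


def load_runfile_json(rlocation_path, runfiles_obj=None):
    """Load a JSON data dependency through a runfiles object.

    rlocation_path is workspace-prefixed, e.g. "name_of_workspace/bar/data.json".
    runfiles_obj is a hypothetical injection point (handy for tests); when it
    is omitted, Bazel's runfiles library is used.
    """
    if runfiles_obj is None:
        # Only importable when the target lists
        # "@bazel_tools//tools/python/runfiles" in its deps.
        from bazel_tools.tools.python.runfiles import runfiles
        runfiles_obj = runfiles.Create()

    # Rlocation() may return None when the runfile is unknown, and the
    # returned path should still be checked by the caller; open() does that.
    path = runfiles_obj.Rlocation(rlocation_path)
    if path is None:
        raise FileNotFoundError("runfile not found: " + rlocation_path)
    with open(path, "r") as fp:
        return json.load(fp)
```

Inside foo.py you would then call load_runfile_json("name_of_workspace/bar/data.json") and let the library decide whether to use a runfiles directory or a manifest.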

Here is a function that, in every case I know of, should return the runfiles root path for any py_binary:

import os
import re
import sys

def find_runfiles():
    """Find the runfiles tree (useful when _not_ run from a zip file)"""
    # Follow symlinks, looking for my module space
    stub_filename = os.path.abspath(sys.argv[0])
    while True:
        # Found it?
        module_space = stub_filename + '.runfiles'
        if os.path.isdir(module_space):
            break

        runfiles_pattern = r"(.*\.runfiles)"
        matchobj = re.match(runfiles_pattern, os.path.abspath(sys.argv[0]))
        if matchobj:
            module_space = matchobj.group(1)
            break

        raise RuntimeError('Cannot find .runfiles directory for %s' %
                           sys.argv[0])
    return module_space

For the example in the question, you can use it like this:

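with open(os.path.join(find_runfiles(), "name_of_workspace/bar/data.json"), 'r') as fp:
    data = json.load(fp)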

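Note that this function will not help if you build a zipped executable of your Python application (probably using subpar); for those you need more code. The next snippet includes get_resource_filename() and get_resource_directory(), which work both for regular py_binary targets and for .par binaries: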

import atexit
import os
import re
import shutil
import sys
import tempfile
import zipfile


def get_resource_filename(path):
    zip_path = get_zip_path(sys.modules.get("__main__").__file__)
    if zip_path:
        tmpdir = tempfile.mkdtemp()
        atexit.register(lambda: shutil.rmtree(tmpdir, ignore_errors=True))
        zf = BetterZipFile(zip_path)
        zf.extract(member=path, path=tmpdir)
        return os.path.join(tmpdir, path)
    elif os.path.exists(path):
        return path
    else:
        path_in_runfiles = os.path.join(find_runfiles(), path)
        if os.path.exists(path_in_runfiles):
            return path_in_runfiles
        else:
            raise ResourceNotFoundError


def get_resource_directory(path):
    """Find or extract an entire subtree and return its location."""
    zip_path = get_zip_path(sys.modules.get("__main__").__file__)
    if zip_path:
        tmpdir = tempfile.mkdtemp()
        atexit.register(lambda: shutil.rmtree(tmpdir, ignore_errors=True))
        zf = BetterZipFile(zip_path)
        members = []
        for fn in zf.namelist():
            if fn.startswith(path):
                members += [fn]
        zf.extractall(members=members, path=tmpdir)
        return os.path.join(tmpdir, path)
    elif os.path.exists(path):
        return path
    else:
        path_in_runfiles = os.path.join(find_runfiles(), path)
        if os.path.exists(path_in_runfiles):
            return path_in_runfiles
        else:
            raise ResourceNotFoundError


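def get_zip_path(path):
    """If path is inside a zip file, return the zip file's path."""
    parent = os.path.dirname(path)
    if not path or parent == path:
        # Reached the filesystem root (or ran out of relative components)
        # without finding a zip file; stop instead of recursing forever.
        return None
    if zipfile.is_zipfile(path):
        return path
    return get_zip_path(parent)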


class ResourceNotFoundError(RuntimeError):
    pass

def find_runfiles():
    """Find the runfiles tree (useful when _not_ run from a zip file)"""
    # Follow symlinks, looking for my module space
    stub_filename = os.path.abspath(sys.argv[0])
    while True:
        # Found it?
        module_space = stub_filename + '.runfiles'
        if os.path.isdir(module_space):
            break

        runfiles_pattern = r"(.*\.runfiles)"
        matchobj = re.match(runfiles_pattern, os.path.abspath(sys.argv[0]))
        if matchobj:
            module_space = matchobj.group(1)
            break

        raise RuntimeError('Cannot find .runfiles directory for %s' %
                           sys.argv[0])
    return module_space


class BetterZipFile(zipfile.ZipFile):
    """Shim around ZipFile that preserves permissions on extract."""

    def extract(self, member, path=None, pwd=None):

        if not isinstance(member, zipfile.ZipInfo):
            member = self.getinfo(member)

        if path is None:
            path = os.getcwd()

        ret_val = self._extract_member(member, path, pwd)
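        attr = member.external_attr >> 16
        # Archives created without Unix mode bits store 0 here; skip chmod in
        # that case so the extracted file does not end up with mode 000.
        if attr:
            os.chmod(ret_val, attr)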
        return ret_val

With the second snippet, your example would look like this:

with open(get_resource_filename("name_of_workspace/bar/data.json"), 'r') as fp:
    data = json.load(fp)

Short answer: os.path.dirname(__file__)

Here is a complete example:

$ ls
bar/  BUILD  foo.py  WORKSPACE

$ cat BUILD
py_binary(
    name = "foo",
    srcs = ["foo.py"],
    data = ["//bar:data.json"],
)

$ cat foo.py
import json
import os

ws = os.path.dirname(__file__)
with open(os.path.join(ws, "bar/data.json"), 'r') as fp:
  print(json.load(fp))

$ cat bar/BUILD
exports_files(["data.json"])

$ bazel run :foo

EDIT: this does not work well when your package lives in a subdirectory. In that case you may need to walk back up the tree with os.path.dirname.
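
One way to do that walk, as a minimal sketch: workspace_root and package_depth are hypothetical names introduced here, and the approach assumes you know how many directories deep the package sits below the workspace root (for example 2 for //pkg/sub).

```python
import os


def workspace_root(file_path, package_depth):
    """Return the directory package_depth levels above file_path's directory.

    package_depth is the number of path components in the Bazel package,
    e.g. 0 for the root package and 2 for //pkg/sub (an assumption the
    caller has to supply; it is not discovered automatically).
    """
    root = os.path.dirname(os.path.abspath(file_path))
    for _ in range(package_depth):
        # Each dirname() call strips one package component.
        root = os.path.dirname(root)
    return root
```

In a foo.py that lives in //pkg/sub you would write ws = workspace_root(__file__, 2) and then open os.path.join(ws, "bar/data.json") as in the example above. The depth has to be updated by hand if the file moves, which is one reason the runfiles library is the more robust choice.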
