A defined interface for working with a cache of Jupyter notebooks.


jupyter-cache


Note: This package is in an alpha stage and is subject to change.

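Some desired requirements (not yet all implemented):

  • Persistent
  • Separates "edits to content" from "edits to code cells". Cell rearrangements and code cell changes should require a re-execution. Content changes should not.
  • Allow parallel access to notebooks (for execution)
  • Store execution statistics/reports
  • Store external assets: notebooks being executed often require external assets, such as imported scripts or data. These are prepared by the users.
  • Store execution artifacts: files created during execution
  • A transparent and robust cache invalidation: imagine the user updating an external dependency or a Python module, or checking out a different git branch.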

Install

pip install jupyter-cache[cli]

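For development, install from a checked-out copy of the repository, with the extras you need (for example the code_style extra described under Code Style).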

Example API usage

To come...

Example CLI usage

From the checked-out repository folder:

$ jcache --help
Usage: jcache [OPTIONS] COMMAND [ARGS]...

  The command line interface of jupyter-cache.

Options:
  -v, --version       Show the version and exit.
  -p, --cache-path    Print the current cache path and exit.
  -a, --autocomplete  Print the autocompletion command and exit.
  -h, --help          Show this message and exit.

Commands:
  cache    Commands for adding to and inspecting the cache.
  clear    Clear the cache completely.
  config   Commands for configuring the cache.
  execute  Execute staged notebooks that are outdated.
  stage    Commands for staging notebooks to be executed.

Important: Execute this in the terminal to enable auto-completion:

eval "$(_JCACHE_COMPLETE=source jcache)"

Caching Executed Notebooks

$ jcache cache --help
Usage: cache [OPTIONS] COMMAND [ARGS]...

  Commands for adding to and inspecting the cache.

Options:
  --help  Show this message and exit.

Commands:
  add                 Cache notebook(s) that have already been executed.
  add-with-artefacts  Cache a notebook, with possible artefact files.
  cat-artifact        Print the contents of a cached artefact.
  diff-nb             Print a diff of a notebook to one stored in the cache.
  list                List cached notebook records in the cache.
  remove              Remove notebooks stored in the cache.
  show                Show details of a cached notebook in the cache.

The first time the cache is required, it will be lazily created:

$ jcache cache list
Cache path: ../.jupyter_cache
The cache does not yet exist, do you want to create it? [y/N]: y
No Cached Notebooks

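You can add notebooks straight into the cache. When caching, a check is made that the notebook looks to have been executed correctly, i.e. that the cell execution counts go up sequentially from 1.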

$ jcache cache add tests/notebooks/basic.ipynb
Caching: ../tests/notebooks/basic.ipynb
Validity Error: Expected cell 1 to have execution_count 1 not 2
The notebook may not have been executed, continue caching? [y/N]: y
Success!
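
The validity check described above can be pictured with a small stand-alone sketch. It is not jupyter-cache's own implementation: the function name `check_execution_counts` is hypothetical, and the notebook is assumed to be a plain dictionary in the usual .ipynb JSON layout. It only illustrates the rule that code cells should carry execution counts 1, 2, 3, ... in order.

```python
import json


def check_execution_counts(notebook: dict) -> list:
    """Return validity messages; an empty list means the counts look sequential."""
    problems = []
    # Only code cells carry an execution count; markdown cells are ignored.
    code_cells = [c for c in notebook.get("cells", []) if c.get("cell_type") == "code"]
    for expected, cell in enumerate(code_cells, start=1):
        actual = cell.get("execution_count")
        if actual != expected:
            problems.append(
                f"Expected cell {expected} to have execution_count {expected} not {actual}"
            )
    return problems


# A tiny in-memory notebook whose single code cell was run second, not first.
nb = json.loads(
    '{"cells": [{"cell_type": "markdown", "source": "# Title"},'
    ' {"cell_type": "code", "execution_count": 2, "source": "a=1"}]}'
)
print(check_execution_counts(nb))
```

Unlike the CLI, which asks whether to continue, this sketch simply collects every mismatch it finds.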

Or to skip validation:

$ jcache cache add --no-validate tests/notebooks/basic.ipynb tests/notebooks/basic_failing.ipynb tests/notebooks/basic_unrun.ipynb tests/notebooks/complex_outputs.ipynb tests/notebooks/external_output.ipynb
Caching: ../tests/notebooks/basic.ipynb
Caching: ../tests/notebooks/basic_failing.ipynb
Caching: ../tests/notebooks/basic_unrun.ipynb
Caching: ../tests/notebooks/complex_outputs.ipynb
Caching: ../tests/notebooks/external_output.ipynb
Success!

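Once you've cached some notebooks, you can look at the "cache records" for what has been cached.

Each notebook is hashed (code cells and kernel spec only), and this hash is used to compare against "staged" notebooks. Multiple hashes can be added for the same URI (the URI is only there for inspection). The size of the cache is limited (current default 1000), so once that size is reached the least recently accessed records begin to be deleted. You can also remove cached records by their ID.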

$ jcache cache list
  ID  Origin URI                             Created           Accessed
----  -------------------------------------  ----------------  ----------------
   5  tests/notebooks/external_output.ipynb  2020-03-12 17:31  2020-03-12 17:31
   4  tests/notebooks/complex_outputs.ipynb  2020-03-12 17:31  2020-03-12 17:31
   3  tests/notebooks/basic_unrun.ipynb      2020-03-12 17:31  2020-03-12 17:31
   2  tests/notebooks/basic_failing.ipynb    2020-03-12 17:31  2020-03-12 17:31
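
To make the hashing rule concrete, here is a minimal sketch of a hash that depends only on the code cells and the kernel spec. It is an illustration under assumptions, not the package's actual routine: `hash_notebook` is a hypothetical name, notebooks are plain dictionaries, and MD5 is chosen only because the hash keys shown in this document are 32 hexadecimal characters long.

```python
import hashlib
import json


def hash_notebook(notebook: dict) -> str:
    """Hash only the code cell sources and the kernel spec of a notebook dict."""
    code_sources = [
        cell.get("source", "")
        for cell in notebook.get("cells", [])
        if cell.get("cell_type") == "code"
    ]
    kernelspec = notebook.get("metadata", {}).get("kernelspec", {})
    # Serialise deterministically so equal inputs always give equal hashes.
    payload = json.dumps({"code": code_sources, "kernelspec": kernelspec}, sort_keys=True)
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


base = {
    "metadata": {"kernelspec": {"name": "python3"}},
    "cells": [
        {"cell_type": "markdown", "source": "Intro"},
        {"cell_type": "code", "source": "a=1\nprint(a)"},
    ],
}
# Editing prose leaves the hash unchanged; editing code changes it.
prose_edit = {**base, "cells": [{"cell_type": "markdown", "source": "New intro"}, base["cells"][1]]}
code_edit = {**base, "cells": [base["cells"][0], {"cell_type": "code", "source": "a=2\nprint(a)"}]}

print(hash_notebook(base) == hash_notebook(prose_edit))
print(hash_notebook(base) == hash_notebook(code_edit))
```

Because the code sources are hashed as an ordered list, rearranging code cells also changes the hash, which matches the requirement that rearrangements trigger re-execution.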

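Tip: Use the --latest-only option to show only the latest versions of cached notebooks.

You can also cache notebooks with artefacts (external outputs of the notebook execution).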
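
The cache size limit mentioned earlier can be sketched as follows. This assumes the limit works by keeping the most recently accessed records and deleting the rest; the function name `records_to_evict` and the access times are hypothetical, and the real limit (default 1000) is replaced by 3 to keep the example small.

```python
from datetime import datetime, timedelta


def records_to_evict(accessed_by_id: dict, limit: int) -> list:
    """Return the IDs to delete so that only the `limit` most recently accessed remain."""
    newest_first = sorted(accessed_by_id, key=accessed_by_id.get, reverse=True)
    return sorted(newest_first[limit:])


start = datetime(2020, 3, 12, 17, 31)
# Hypothetical access times: record 2 was touched most recently, record 3 least recently.
accessed = {
    2: start + timedelta(minutes=30),
    3: start,
    4: start + timedelta(minutes=10),
    5: start + timedelta(minutes=20),
}
print(records_to_evict(accessed, limit=3))
```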

$ jcache cache add-with-artefacts -nb tests/notebooks/basic.ipynb tests/notebooks/artifact_folder/artifact.txt
Caching: ../tests/notebooks/basic.ipynb
Validity Error: Expected cell 1 to have execution_count 1 not 2
The notebook may not have been executed, continue caching? [y/N]: y
Success!

Show a full description of a cached notebook by referring to its ID:

$ jcache cache show 6
ID: 6
Origin URI: ../tests/notebooks/basic.ipynb
Created: 2020-03-12 17:31
Accessed: 2020-03-12 17:31
Hashkey: 818f3412b998fcf4fe9ca3cca11a3fc3
Artifacts:
- artifact_folder/artifact.txt

Note: artefact paths must be inside the notebook's folder (or one of its sub-folders):

$ jcache cache add-with-artefacts -nb tests/notebooks/basic.ipynb tests/test_db.py
Caching: ../tests/notebooks/basic.ipynb
Artifact Error: Path '../tests/test_db.py' is not in folder '../tests/notebooks''
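
The folder rule behind this error can be checked with the standard library alone. The sketch below is an assumption about how such a check might look, not the package's code; `relative_artifact_path` is a hypothetical helper, and the paths need not exist on disk.

```python
from pathlib import Path


def relative_artifact_path(artifact: str, notebook: str) -> Path:
    """Return the artefact path relative to the notebook's folder, or raise ValueError."""
    folder = Path(notebook).resolve().parent
    target = Path(artifact).resolve()
    try:
        # relative_to() raises ValueError when target is not below folder.
        return target.relative_to(folder)
    except ValueError:
        raise ValueError(f"Path '{artifact}' is not in folder '{folder}'") from None


nb_path = "tests/notebooks/basic.ipynb"
print(relative_artifact_path("tests/notebooks/artifact_folder/artifact.txt", nb_path))

try:
    relative_artifact_path("tests/test_db.py", nb_path)
except ValueError as error:
    print("Artifact Error:", error)
```

Storing the path relative to the notebook folder is what lets a record list an artefact as artifact_folder/artifact.txt regardless of where the repository is checked out.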

To view the contents of an execution artefact:

$ jcache cache cat-artifact 6 artifact_folder/artifact.txt
An artifact

You can directly remove a cached notebook by its ID:

$ jcache cache remove 4
Removing Cache ID = 4
Success!

You can also diff any of the cached notebooks with any (external) notebook:

$ jcache cache diff-nb 2 tests/notebooks/basic.ipynb
nbdiff
--- cached pk=2
+++ other: ../tests/notebooks/basic.ipynb
## inserted before nb/cells/0:
+  code cell:
+    execution_count: 2
+    source:
+      a=1
+      print(a)
+    outputs:
+      output 0:
+        output_type: stream
+        name: stdout
+        text:
+          1

## deleted nb/cells/0:
-  code cell:
-    source:
-      raise Exception('oopsie!')

Success!

Staging Notebooks for Execution

$ jcache stage --help
Usage: stage [OPTIONS] COMMAND [ARGS]...

  Commands for staging notebooks to be executed.

Options:
  --help  Show this message and exit.

Commands:
  add              Stage notebook(s) for execution.
  add-with-assets  Stage a notebook, with possible asset files.
  list             List notebooks staged for possible execution.
  remove-ids       Un-stage notebook(s), by ID.
  remove-uris      Un-stage notebook(s), by URI.
  show             Show details of a staged notebook.

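Staged notebooks are recorded as pointers to their URI, i.e. no physical copying takes place until execution time.

If you stage some notebooks for execution, you can then list them to see which already have records in the cache (matched by hash) and which will require execution: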

$ jcache stage add tests/notebooks/basic.ipynb tests/notebooks/basic_failing.ipynb tests/notebooks/basic_unrun.ipynb tests/notebooks/complex_outputs.ipynb tests/notebooks/external_output.ipynb
Staging: ../tests/notebooks/basic.ipynb
Staging: ../tests/notebooks/basic_failing.ipynb
Staging: ../tests/notebooks/basic_unrun.ipynb
Staging: ../tests/notebooks/complex_outputs.ipynb
Staging: ../tests/notebooks/external_output.ipynb
Success!
$ jcache stage list
  ID  URI                                    Created             Assets    Cache ID
----  -------------------------------------  ----------------  --------  ----------
   5  tests/notebooks/external_output.ipynb  2020-03-12 17:31         0           5
   4  tests/notebooks/complex_outputs.ipynb  2020-03-12 17:31         0
   3  tests/notebooks/basic_unrun.ipynb      2020-03-12 17:31         0           6
   2  tests/notebooks/basic_failing.ipynb    2020-03-12 17:31         0           2
   1  tests/notebooks/basic.ipynb            2020-03-12 17:31         0           6
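
The "Cache ID" column above comes from matching staged notebooks to cache records by hash. A rough sketch of that lookup follows. The function name and the hash strings are hypothetical placeholders, and real hash keys would come from hashing each notebook's code cells and kernel spec.

```python
def cache_ids_for_staged(staged_hashes: dict, cached_hashes: dict) -> dict:
    """Map each staged URI to the ID of a cache record with the same hash, else None."""
    id_by_hash = {hashkey: cache_id for cache_id, hashkey in cached_hashes.items()}
    return {uri: id_by_hash.get(hashkey) for uri, hashkey in staged_hashes.items()}


# Hypothetical hash keys, keyed by cache ID and by staged URI respectively.
cached = {5: "hash-external", 6: "hash-basic", 2: "hash-failing"}
staged = {
    "tests/notebooks/external_output.ipynb": "hash-external",
    "tests/notebooks/complex_outputs.ipynb": "hash-complex",
    "tests/notebooks/basic.ipynb": "hash-basic",
}
matches = cache_ids_for_staged(staged, cached)
# A staged notebook without a matching cache record is the one that needs executing.
needs_execution = [uri for uri, cache_id in matches.items() if cache_id is None]
print(needs_execution)
```

Since only a URI and a hash are compared, nothing has to be copied at staging time, which is consistent with staged notebooks being recorded as pointers.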

You can remove a staged notebook by its URI or ID:

$ jcache stage remove-ids 4
Unstaging ID: 4
Success!

You can then run a basic execution of the required notebooks:

$ jcache cache remove 6 2
Removing Cache ID = 6
Removing Cache ID = 2
Success!
$ jcache execute

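Successfully executed notebooks will be added to the cache, along with any "artefacts" created by the execution that are inside the notebook folder, and data supplied by the executor.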

$ jcache stage list
  ID  URI                                    Created             Assets    Cache ID
----  -------------------------------------  ----------------  --------  ----------
   5  tests/notebooks/external_output.ipynb  2020-03-12 17:31         0           5
   3  tests/notebooks/basic_unrun.ipynb      2020-03-12 17:31         0           6
   2  tests/notebooks/basic_failing.ipynb    2020-03-12 17:31         0
   1  tests/notebooks/basic.ipynb            2020-03-12 17:31         0           6

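Execution data (such as execution time) will be stored in the cache record:

$ jcache cache show 6
ID: 6
Origin URI: ../tests/notebooks/basic_unrun.ipynb
Created: 2020-03-12 17:31
Accessed: 2020-03-12 17:31
Hashkey: 818f3412b998fcf4fe9ca3cca11a3fc3
Data:
  execution_seconds: 1.0559415130000005

Failed notebooks will not be cached, but the exception traceback will be added to the stage record: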

$ jcache stage show 2
ID: 2
URI: ../tests/notebooks/basic_failing.ipynb
Created: 2020-03-12 17:31
Failed Last Execution!
Traceback (most recent call last):
  File "../jupyter_cache/executors/basic.py", line 152, in execute
    executenb(nb_bundle.nb, cwd=tmpdirname)
  File "/anaconda/envs/mistune/lib/python3.7/site-packages/nbconvert/preprocessors/execute.py", line 737, in executenb
    return ep.preprocess(nb, resources, km=km)[0]
  File "/anaconda/envs/mistune/lib/python3.7/site-packages/nbconvert/preprocessors/execute.py", line 405, in preprocess
    nb, resources = super(ExecutePreprocessor, self).preprocess(nb, resources)
  File "/anaconda/envs/mistune/lib/python3.7/site-packages/nbconvert/preprocessors/base.py", line 69, in preprocess
    nb.cells[index], resources = self.preprocess_cell(cell, resources, index)
  File "/anaconda/envs/mistune/lib/python3.7/site-packages/nbconvert/preprocessors/execute.py", line 448, in preprocess_cell
    raise CellExecutionError.from_cell_and_msg(cell, out)
nbconvert.preprocessors.execute.CellExecutionError: An error occurred while executing the following cell:
------------------
raise Exception('oopsie!')
------------------

---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-1-714b2b556897> in <module>
----> 1 raise Exception('oopsie!')

Exception: oopsie!
Exception: oopsie!
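
The two outcomes described above (timing data stored on success, a traceback stored on failure) can be sketched with the standard library. This is only an illustration under assumptions: `run_staged`, the record dictionaries and their keys are hypothetical, and a plain callable stands in for the real notebook executor.

```python
import time
import traceback


def run_staged(stage_record: dict, execute) -> dict:
    """Run `execute` for one staged notebook and return the updated stage record.

    On success a (hypothetical) cache record with timing data is attached;
    on failure only the formatted traceback is stored.
    """
    started = time.perf_counter()
    try:
        execute()
    except Exception:
        return {**stage_record, "cache_record": None, "traceback": traceback.format_exc()}
    elapsed = time.perf_counter() - started
    cache_record = {"uri": stage_record["uri"], "data": {"execution_seconds": elapsed}}
    return {**stage_record, "cache_record": cache_record, "traceback": None}


def failing_notebook():
    # Stands in for a notebook whose first cell raises.
    raise Exception("oopsie!")


ok = run_staged({"uri": "tests/notebooks/basic.ipynb"}, lambda: None)
bad = run_staged({"uri": "tests/notebooks/basic_failing.ipynb"}, failing_notebook)
print(ok["cache_record"]["data"]["execution_seconds"] >= 0)
print(bad["cache_record"] is None, bad["traceback"].strip().splitlines()[-1])
```

Keeping the traceback on the stage record rather than in the cache means a failed run never produces a cache entry that a later staging could match.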

Once executed, you may leave the staged notebooks in place for later re-execution, or remove them:

$ jcache stage remove-ids --all
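Are you sure you want to remove all? [y/N]: y
Unstaging ID: 1
Unstaging ID: 2
Unstaging ID: 3
Unstaging ID: 5
Success!

You can also stage notebooks with assets: external files that the notebook requires during execution. As with artefacts, these files must be in the same folder as the notebook, or in a sub-folder.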

$ jcache stage add-with-assets -nb tests/notebooks/basic.ipynb tests/notebooks/artifact_folder/artifact.txt
Success!
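$ jcache stage show 1
ID: 1
URI: ../tests/notebooks/basic.ipynb
Created: 2020-03-12 17:31
Cache ID: 6
Assets:
- ../tests/notebooks/artifact_folder/artifact.txt

Contributing

jupyter-cache follows the Executable Book Contribution Guide. We'd love your help!

Code Style

Code style is tested using flake8, with the configuration set in .flake8, and code is formatted with black.

Installing with jupyter-cache[code_style] makes the pre-commit package available. It ensures this style is met before commits are submitted, by reformatting the code and testing for lint errors. It can be set up by: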

>> cd jupyter-cache
>> pre-commit install

Optionally, you can run black and flake8 separately:

>> black .
>> flake8 .

Editors like VS Code also have automatic code reformatting utilities, which can adhere to this standard.