Looker cleanup tool

Henry: A Looker Cleanup Tool

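Henry is a command line tool that helps determine model bloat in your Looker instance and identify unused content in models and explores. It is meant to help developers clean up models from unused explores and explores from unused joins and fields, as well as maintain a healthy and user-friendly instance.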

Status and Support

Henry is NOT supported or warranted by Looker in any way. Please do not contact Looker support for issues with Henry. Issues can be logged via https://github.com/looker-open-source/henry/issues

Where to get it

The source code is currently hosted on GitHub at https://github.com/looker-open-source/henry/. The latest released version can be found on PyPI and can be installed using:

$ pip install henry

For a development setup, follow the Development section below.

Usage

In order to display usage information, use:

^{pr 2}$

Storing Credentials

API3 login credentials can be specified at runtime using various flags or more conveniently, using a ^{} having the format shown below.

^{pr 3}$

Make sure that the ^{} file has restricted permissions by running ^{}. The tool will also ensure that this is the case every time it writes to the file.

If ^{} resides in the current working directory, then you don't need to do anything. If not, its location needs to be specified at runtime using the ^{} parameter or in the global config file

Global Config File

A global settings file called settings.json can be defined in ~/.henry. The file can be used to define a number of parameters to be used at runtime:

^{pr 4}$

API timeout settings

The ^{} parameter can be used to specify API call timeout settings. It can take 3 types of values: null, an integer representing connect and read timeouts (in seconds) combined or a list that specifies the connect and read timeouts separately (e.g. "[5, 15]").
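The three accepted shapes of the timeout value can be illustrated with a minimal sketch. The function name `normalize_timeout` is hypothetical and this is not Henry's own code; it only shows how null, a single integer, and a two-element list could be told apart. The `(connect, read)` tuple it returns is the shape the `requests` library accepts for its `timeout` argument; whether Henry passes the value along in that form is an assumption.

```python
from typing import Optional, Tuple, Union

Timeout = Union[None, float, Tuple[float, float]]


def normalize_timeout(value) -> Timeout:
    """Turn a settings value into None, a single number, or a (connect, read) pair.

    Hypothetical helper for illustration only.
    """
    if value is None:
        # null in the settings file: no timeout is applied
        return None
    if isinstance(value, bool):
        # bool is a subclass of int in Python, so reject it explicitly
        raise ValueError("timeout must be null, a number, or a two-element list")
    if isinstance(value, (int, float)):
        if value <= 0:
            raise ValueError("timeout must be positive")
        # one number covers connect and read combined
        return value
    if isinstance(value, (list, tuple)) and len(value) == 2:
        connect, read = value
        for part in (connect, read):
            if isinstance(part, bool) or not isinstance(part, (int, float)) or part <= 0:
                raise ValueError("connect and read timeouts must be positive numbers")
        # separate connect and read timeouts, e.g. [5, 15]
        return (connect, read)
    raise ValueError("timeout must be null, a number, or a two-element list")
```

For example, `normalize_timeout([5, 15])` yields the pair `(5, 15)`, while `normalize_timeout(None)` leaves the timeout unset.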

Config Path

The ^{} parameter defines the absolute location to the API3 credentials file

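In order of precedence, the location of the credentials file is taken from: the path given on the command line, the path configured in ~/.henry/settings.json, and then the default value.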
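The precedence order can be sketched as follows. This is an illustration under stated assumptions, not Henry's implementation: the function name `resolve_credentials_path` and the settings key `config_path` are hypothetical, and the default is passed in by the caller rather than hard-coded.

```python
import json
from pathlib import Path
from typing import Optional


def resolve_credentials_path(
    cli_path: Optional[str],
    settings_file: Path,
    default: Path,
) -> Path:
    """Pick the credentials file location in order of precedence.

    1. A path given on the command line.
    2. A path stored in the global settings file (key name is an assumption).
    3. The default location.
    """
    if cli_path:
        return Path(cli_path)

    if settings_file.is_file():
        settings = json.loads(settings_file.read_text())
        configured = settings.get("config_path")  # hypothetical key name
        if configured:
            return Path(configured)

    return default
```

A caller would pass something like `Path.home() / ".henry" / "settings.json"` as `settings_file`. Checking the command-line value first means a one-off override never requires editing the global file.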

Global Options that apply to many commands

Suppressing Formatted Output

Many commands provide tabular output. For tables the option ^{} will suppress the table headers and format lines, making it easier to use tools like grep, awk, etc. to retrieve values from the output of these commands.

Output to File

Using the ^{} option allows you to specify a path and a file to save the results to. When combined with ^{} the format lines will be suppressed. Example usage:

^{pr 5}$

saves the results to unused_explores.csv in the current working directory.

Pulse Command

The pulse command runs a number of tests that help determine the overall instance health. A healthy Looker instance should pass all the tests. Below is a list of tests currently implemented.

Connection Checks

Runs specific tests for each connection to make sure the connection is in working order. If any tests fail, the output will show which tests passed or failed for that particular connection. Example:

^{pr 6}$

Query Stats

Checks how many queries were run over the past 30 days and how many of them errored or got killed, and reports some statistics around runtimes. The IDs of queries that took more than 5 times the average query runtime are also listed.
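The "more than 5 times the average runtime" rule can be sketched in a few lines. This is an illustrative sketch only: the function name `slow_query_ids` and the input shape (a mapping of query ID to runtime in seconds) are assumptions, and Henry's own calculation may differ in detail.

```python
from statistics import mean
from typing import Dict, List


def slow_query_ids(runtimes: Dict[int, float], factor: float = 5.0) -> List[int]:
    """Return IDs of queries whose runtime exceeds `factor` times the average."""
    if not runtimes:
        return []
    threshold = factor * mean(runtimes.values())
    return sorted(qid for qid, seconds in runtimes.items() if seconds > threshold)
```

Note that a single very slow query raises the average itself, so with only a handful of queries nothing may cross the threshold; the rule is most informative over a large sample such as 30 days of history.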

Scheduled Plans

Determines the number of scheduled jobs that ran in the past 30 days, how many were successful, how many ran but did not deliver or failed to run altogether.

Legacy Features

Outputs a list of legacy features that are still in use, if any. These features have been replaced with improved ones and should be migrated away from.

Version

Checks if the latest Looker version is being used. Looker supports only up to 3 releases back.

Analyze Command

The analyze command is meant to help identify models and explores that have become bloated, so that vacuum can then be used to trim them.

analyze projects

The analyze projects command scans projects for their content and checks the status of features that are essential for success, such as the git connection status and validation requirements.

^{pr 7}$

analyze models

Shows the number of explores in each model as well as the number of queries against that model.

^{pr 8}$

analyze explores

Shows explores and their usage. If the ^{} argument is passed, joins and fields that have been used less than the threshold specified will be considered as unused.

^{pr 9}$

Vacuum Information

The vacuum command outputs a list of unused content, based on predefined criteria, that a developer can then use to clean up models and explores.

vacuum models

The vacuum models command exposes models and the number of queries against them over a predefined period of time. Explores that are listed here have not had the minimum number of queries against them in the timeframe specified. As a result, it is safe to hide them and later delete them.

^{pr 10}$

vacuum explores

The vacuum explores command exposes joins and fields that are below the minimum number of queries threshold (default: 0, can be changed using the ^{} argument) over the specified timeframe (default: 90 days, can be changed using the ^{} argument).
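The threshold rule described above can be sketched as a simple filter. This is an assumption-laden illustration, not Henry's code: the function name `unused_fields` is hypothetical, and the sketch treats a field as unused when its query count is at or below the threshold, which is one reading of a default threshold of 0.

```python
from typing import Dict, List


def unused_fields(query_counts: Dict[str, int], min_queries: int = 0) -> List[str]:
    """Return fields whose query count is at or below `min_queries`.

    `query_counts` maps a fully scoped field name (view.field) to the number
    of queries that used it in the chosen timeframe.
    """
    return sorted(
        field for field, count in query_counts.items() if count <= min_queries
    )
```

Raising `min_queries` widens the list: with a threshold of 3, a field queried three times in the timeframe is reported alongside fields that were never queried.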

Example: from the analyze function run above, we know that the cohorts explore has 4 fields that were not queried once in the past 90 days. Running the following vacuum command:

$ henry vacuum explores --model thelook --explore cohorts

provides the names of the unused fields:

+---------+-----------+----------------+------------------------------+
| model   | explore   | unused_joins   | unused_fields                |
|---------+-----------+----------------+------------------------------|
| thelook | cohorts   | N/A            | order_items.created_date     |
|         |           |                | order_items.id               |
|         |           |                | order_items.total_sale_price |
|         |           |                | users.gender                 |
+---------+-----------+----------------+------------------------------+

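It is important to note that fields vacuumed from one explore are not meant to be removed from the view files altogether, because they might be used in other explores or joins. Instead, either hide those fields (if they are not used anywhere else) or exclude them from the explore using the fields LookML parameter.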
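When the formatted table needs to be processed by a script, it can be parsed along these lines. This is a minimal sketch that assumes the table layout shown in the example output; the function name `parse_table` is hypothetical and not part of Henry.

```python
from typing import Dict, List

TABLE = """\
+---------+-----------+----------------+------------------------------+
| model   | explore   | unused_joins   | unused_fields                |
|---------+-----------+----------------+------------------------------|
| thelook | cohorts   | N/A            | order_items.created_date     |
|         |           |                | order_items.id               |
|         |           |                | order_items.total_sale_price |
|         |           |                | users.gender                 |
+---------+-----------+----------------+------------------------------+
"""


def parse_table(text: str) -> List[Dict[str, str]]:
    """Parse the formatted table into one dict per data line."""
    rows: List[Dict[str, str]] = []
    header = None
    for raw in text.splitlines():
        line = raw.strip()
        # Skip borders ("+---") and the header separator ("|---")
        if not line.startswith("|") or line.startswith("|-"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells
            continue
        row = dict(zip(header, cells))
        # Continuation lines leave model and explore blank: inherit them
        if rows:
            for key in ("model", "explore"):
                if not row.get(key):
                    row[key] = rows[-1][key]
        rows.append(row)
    return rows


rows = parse_table(TABLE)
fields = [row["unused_fields"] for row in rows if row["unused_fields"]]
```

Here `fields` ends up holding the four field names from the table, each still associated with the thelook model and the cohorts explore through its row.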

Logging

The tool logs activity as it's being used. Log files are stored in ^{} in your home directory. Sensitive information such as your client secret is filtered out for security reasons. Moreover, log files have restricted permissions which allow only the owner to read and write.

The logging module utilises a rotating file handler which is currently set to rollover when the current log file reaches 500 KB in size. The system saves old log files by adding the suffix '.1', '.2' etc., to the filename. The file being written to is always named ^{}. No more than 10 log files are kept at any point in time, ensuring logs do not consume more than 5 MB max.
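The rotation scheme described above matches what Python's standard `logging.handlers.RotatingFileHandler` provides. The sketch below is not Henry's logging setup: the function name `build_logger`, the file name `sketch.log`, and the reading of "500 KB" as 500 * 1024 bytes are assumptions. It uses `backupCount=9` so that the active file plus nine backups total ten files.

```python
import logging
from logging.handlers import RotatingFileHandler
from pathlib import Path


def build_logger(
    log_dir: Path,
    name: str = "henry_sketch",
    max_bytes: int = 500 * 1024,  # assumption: 500 KB read as 500 * 1024 bytes
    backup_count: int = 9,        # active file + 9 backups = 10 files at most
) -> logging.Logger:
    """Create a logger that rolls over to .1, .2, ... when the file fills up."""
    log_dir.mkdir(parents=True, exist_ok=True)
    handler = RotatingFileHandler(
        log_dir / "sketch.log",  # hypothetical file name
        maxBytes=max_bytes,
        backupCount=backup_count,
    )
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep records out of the root logger
    logger.addHandler(handler)
    return logger
```

On each rollover the handler renames the current file to `.1`, shifts older backups up by one, and drops the oldest once the backup count is reached, which is what keeps total disk usage bounded.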

Dependencies

Development

To install henry in development mode you need to install the dependencies above and clone the project's repo with:

^{pr 13}$

You can then install using:

^{pr 14}$

Alternatively, you can use pip if you want all the dependencies pulled in automatically (the -e option installs it in development mode):

$ pip install -e .

Authors

Henry has primarily been developed by Joseph Axisa. See all contributors.

Contributing

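Bug reports and pull requests are welcome on GitHub at https://github.com/looker-open-source/henry/issues. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the Contributor Covenant code of conduct.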

Code of Conduct

Everyone interacting in the Henry project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the code of conduct.

Copyright

Copyright (c) 2018 Joseph Axisa for Looker Data Sciences. See MIT License for more details.
