BigQuery-DatasetManager

BigQuery-DatasetManager is a simple file-based CLI management tool for BigQuery Datasets.

Requirements

  • Python 2.7, 3.4, 3.5, 3.6

Installation

$ pip install BigQuery-DatasetManager

Resource representation

The resource representation of datasets and tables is described in YAML format.

Dataset

dataset_id: dataset1
friendly_name: null
description: null
default_table_expiration_ms: null
location: US
access_entries:
  - role: OWNER
    entity_type: specialGroup
    entity_id: projectOwners
  - role: WRITER
    entity_type: specialGroup
    entity_id: projectWriters
  - role: READER
    entity_type: specialGroup
    entity_id: projectReaders
  - role: OWNER
    entity_type: userByEmail
    entity_id: aaa@bbb.gserviceaccount.com
  - role: null
    entity_type: view
    entity_id:
      datasetId: view1
      projectId: project1
      tableId: table1
labels:
  foo: bar

| Key name | Value | Description |
|---|---|---|
| dataset_id | str | ID of the dataset. |
| friendly_name | str | Title of the dataset. |
| description | str | Description of the dataset. |
| default_table_expiration_ms | int | Default expiration time for tables in the dataset. |
| location | str | Location in which the dataset is hosted. |
| access_entries | seq | Represents grant of an access role to an entity. |
| access_entries.role | str | Role granted to the entity. The following string values are supported: `OWNER`, `WRITER`, `READER`. It may also be `null` if the `entity_type` is `view`. |
| access_entries.entity_type | str | Type of entity being granted the role. One of `userByEmail`, `groupByEmail`, `domain`, `specialGroup`, `view`. |
| access_entries.entity_id | str/map | If the `entity_type` is not `view`, the `entity_id` is the `str` ID of the entity being granted the role. If the `entity_type` is `view`, the `entity_id` is a `map` representing the view from a different dataset to grant access to. |
| access_entries.entity_id.datasetId | str | ID of the dataset containing this table. (Specified when `entity_type` is `view`.) |
| access_entries.entity_id.projectId | str | ID of the project containing this table. (Specified when `entity_type` is `view`.) |
| access_entries.entity_id.tableId | str | ID of the table. (Specified when `entity_type` is `view`.) |
| labels | map | Labels for the dataset. |

NOTE: See the official documentation of BigQuery Datasets for details of key names.
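
The rules for access entries in the key table above can be sketched as a small check. This is only an illustration: `check_access_entry` is a hypothetical helper, not part of bqdm, and the tool's own validation may differ.

```python
# Hypothetical helper (not part of bqdm): checks one access entry
# against the rules stated in the dataset key table.
ROLES = {"OWNER", "WRITER", "READER"}
ENTITY_TYPES = {"userByEmail", "groupByEmail", "domain", "specialGroup", "view"}
VIEW_KEYS = {"datasetId", "projectId", "tableId"}


def check_access_entry(entry):
    """Return a list of problems found in a single access entry dict."""
    problems = []
    role = entry.get("role")
    entity_type = entry.get("entity_type")
    entity_id = entry.get("entity_id")

    if entity_type not in ENTITY_TYPES:
        problems.append("unknown entity_type: %r" % (entity_type,))

    if role is None:
        # A null role is only described for view entries.
        if entity_type != "view":
            problems.append("role may only be null when entity_type is view")
    elif role not in ROLES:
        problems.append("unknown role: %r" % (role,))

    if entity_type == "view":
        # For views, entity_id is a map that identifies the view.
        if not isinstance(entity_id, dict) or set(entity_id) != VIEW_KEYS:
            problems.append(
                "entity_id must be a map with keys " + ", ".join(sorted(VIEW_KEYS))
            )
    elif not isinstance(entity_id, str):
        problems.append("entity_id must be a string")

    return problems
```

An empty list means the entry is consistent with the table; each string in a non-empty list names one rule that the entry breaks.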

Table

table_id: table1
friendly_name: null
description: null
expires: null
partitioning_type: null
view_use_legacy_sql: null
view_query: null
schema:
  - name: column1
    field_type: STRING
    mode: REQUIRED
    description: null
    fields: null
  - name: column2
    field_type: RECORD
    mode: NULLABLE
    description: null
    fields:
      - name: column2_1
        field_type: STRING
        mode: NULLABLE
        description: null
        fields: null
      - name: column2_2
        field_type: INTEGER
        mode: NULLABLE
        description: null
        fields: null
      - name: column2_3
        field_type: RECORD
        mode: REPEATED
        description: null
        fields:
          - name: column2_3_1
            field_type: BOOLEAN
            mode: NULLABLE
            description: null
            fields: null
labels:
  foo: bar

table_id: view1
friendly_name: null
description: null
expires: null
partitioning_type: null
view_use_legacy_sql: false
view_query: |
  select * from `project1.dataset1.table1`
schema: null
labels: null

| Key name | Value | Description |
|---|---|---|
| table_id | str | ID of the table. |
| friendly_name | str | Title of the table. |
| description | str | Description of the table. |
| expires | str | Datetime at which the table will be deleted. (ISO8601 format) |
| partitioning_type | str | Time partitioning of the table if it is partitioned. The only partitioning type that is currently supported is `DAY`. |
| view_use_legacy_sql | bool | Specifies whether to use BigQuery's legacy SQL for this view. |
| view_query | str | SQL query defining the table as a view. |
| schema | seq | The schema of the table, as a sequence of field definitions. |
| schema.name | str | The name of the field. |
| schema.field_type | str | The type of the field. One of `STRING`, `BYTES`, `INTEGER`, `INT64` (same as INTEGER), `FLOAT`, `FLOAT64` (same as FLOAT), `BOOLEAN`, `BOOL` (same as BOOLEAN), `TIMESTAMP`, `DATE`, `TIME`, `DATETIME`, `RECORD` (where RECORD indicates that the field contains a nested schema), `STRUCT` (same as RECORD). |
| schema.mode | str | The mode of the field. One of `NULLABLE`, `REQUIRED`, `REPEATED`. |
| schema.description | str | Description for the field. |
| schema.fields | seq | Describes the nested schema fields if the type property is set to `RECORD`. |
| labels | map | Labels for the table. |

NOTE: See the official documentation of BigQuery Tables for details of key names.
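
To make the nesting of `schema` and `fields` concrete, here is a minimal sketch that walks a schema that has already been loaded into Python lists and dicts and lists every field by its dotted path. `field_paths` is a hypothetical name used only for this illustration, and the sample data mirrors the table1 example above.

```python
def field_paths(schema, prefix=""):
    """Yield (dotted path, field_type, mode) for every field, depth first."""
    for field in schema or []:
        path = prefix + field["name"]
        yield path, field["field_type"], field["mode"]
        # Only RECORD (or its alias STRUCT) fields carry a nested schema.
        if field["field_type"] in ("RECORD", "STRUCT"):
            yield from field_paths(field.get("fields"), path + ".")


# The schema of table1 from the example, written as plain Python data.
EXAMPLE_SCHEMA = [
    {"name": "column1", "field_type": "STRING", "mode": "REQUIRED", "fields": None},
    {"name": "column2", "field_type": "RECORD", "mode": "NULLABLE", "fields": [
        {"name": "column2_1", "field_type": "STRING", "mode": "NULLABLE", "fields": None},
        {"name": "column2_2", "field_type": "INTEGER", "mode": "NULLABLE", "fields": None},
        {"name": "column2_3", "field_type": "RECORD", "mode": "REPEATED", "fields": [
            {"name": "column2_3_1", "field_type": "BOOLEAN", "mode": "NULLABLE", "fields": None},
        ]},
    ]},
]

if __name__ == "__main__":
    for path, field_type, mode in field_paths(EXAMPLE_SCHEMA):
        print(path, field_type, mode)
```

The innermost field of the example is reported as `column2.column2_3.column2_3_1`, which shows how each `fields` sequence adds one level under its `RECORD` parent.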

Directory structure
.
├── dataset1        # Directory storing the table configuration file of dataset1.
│   ├── table1.yml  # Configuration file of table1 in dataset1.
│   └── table2.yml  # Configuration file of table2 in dataset1.
├── dataset1.yml    # Configuration file of dataset1.
├── dataset2        # Directory storing the table configuration file of dataset2.
│   └── .gitkeep    # When keeping a directory, dataset2 is empty.
├── dataset2.yml    # Configuration file of dataset2.
└── dataset3.yml    # Configuration file of dataset3. This dataset does not manage the table.

NOTE: If you do not want to manage tables, delete the directory with the same name as the dataset.
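
As a rough illustration of how this layout can be read, the sketch below scans a configuration directory and reports, for each dataset file, which table files sit in the directory of the same name. It reflects only the convention described above; `discover_layout` and `demo` are hypothetical names, and bqdm's actual loader may behave differently.

```python
import tempfile
from pathlib import Path


def discover_layout(conf_dir):
    """Map each dataset to its table names, or None when tables are unmanaged."""
    conf_dir = Path(conf_dir)
    layout = {}
    for dataset_file in sorted(conf_dir.glob("*.yml")):
        dataset = dataset_file.stem
        table_dir = conf_dir / dataset
        if table_dir.is_dir():
            # A directory named after the dataset holds one file per table.
            layout[dataset] = sorted(p.stem for p in table_dir.glob("*.yml"))
        else:
            # No directory: only the dataset itself is managed.
            layout[dataset] = None
    return layout


def demo():
    """Recreate the example tree in a temporary directory and scan it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        for name in ("dataset1", "dataset2", "dataset3"):
            (root / (name + ".yml")).touch()
        (root / "dataset1").mkdir()
        (root / "dataset1" / "table1.yml").touch()
        (root / "dataset1" / "table2.yml").touch()
        (root / "dataset2").mkdir()
        (root / "dataset2" / ".gitkeep").touch()
        return discover_layout(root)
```

For the example tree, `demo()` returns `['table1', 'table2']` for dataset1, an empty list for dataset2 (directory present but no table files), and `None` for dataset3 (no directory).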

Usage

Usage: bqdm [OPTIONS] COMMAND [ARGS]...

Options:
  -c, --credential-file PATH  Location of credential file for service accounts.
  -p, --project TEXT          Project ID for the project which you’d like to manage with.
  --color / --no-color        Enables output with coloring.
  --parallelism INTEGER       Limit the number of concurrent operation.
  --debug                     Debug output management.
  -h, --help                  Show this message and exit.

Commands:
  apply    Builds or changes datasets.
  destroy  Specify subcommand `plan` or `apply`
  export   Export existing datasets into file in YAML format.
  plan     Generate and show an execution plan.

export

Usage: bqdm export [OPTIONS] [OUTPUT_DIR]

  Export existing datasets into file in YAML format.

Options:
  -d, --dataset TEXT          Specify the ID of the dataset to manage.
  -e, --exclude-dataset TEXT  Specify the ID of the dataset to exclude from managed.
  -h, --help                  Show this message and exit.

plan

Usage: bqdm plan [OPTIONS] [CONF_DIR]

  Generate and show an execution plan.

Options:
  --detailed_exitcode         Return a detailed exit code when the command exits.
                              When provided, this argument changes
                              the exit codes and their meanings to provide
                              more granular information about what the
                              resulting plan contains:
                              0 = Succeeded with empty diff
                              1 = Error
                              2 = Succeeded with non-
                              empty diff
  -d, --dataset TEXT          Specify the ID of the dataset to manage.
  -e, --exclude-dataset TEXT  Specify the ID of the dataset to exclude from managed.
  -h, --help                  Show this message and exit.
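
A script that calls `bqdm plan` can branch on the detailed exit code listed above. The following is a minimal sketch, assuming `bqdm` is installed and on the PATH; `describe_plan_exit` and `run_plan` are hypothetical wrapper names, and the flag is spelled exactly as in the help text.

```python
import subprocess

# Meanings of the exit codes, taken from the help text of `bqdm plan`.
PLAN_EXIT_CODES = {
    0: "succeeded with empty diff",
    1: "error",
    2: "succeeded with non-empty diff",
}


def describe_plan_exit(code):
    """Translate a detailed exit code into a short description."""
    return PLAN_EXIT_CODES.get(code, "unexpected exit code %d" % code)


def run_plan(conf_dir):
    """Run `bqdm plan` on conf_dir and return (exit code, description)."""
    result = subprocess.run(["bqdm", "plan", "--detailed_exitcode", str(conf_dir)])
    return result.returncode, describe_plan_exit(result.returncode)
```

In a CI job, for example, exit code 2 could be used to require a manual approval step before `bqdm apply` runs, while exit code 0 lets the job finish without changes.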

apply

Usage: bqdm apply [OPTIONS] [CONF_DIR]

  Builds or changes datasets.

Options:
  -d, --dataset TEXT              Specify the ID of the dataset to manage.
  -e, --exclude-dataset TEXT      Specify the ID of the dataset to exclude from managed.
  -m, --mode [select_insert|select_insert_backup|replace|replace_backup|drop_create|drop_create_backup]
                                  Specify the migration mode when changing the schema.
                                  Choice from `select_insert`,
                                  `select_insert_backup`, `replace`, `replace_backup`,
                                  `drop_create`,
                                  `drop_create_backup`.  [required]
  -b, --backup-dataset TEXT       Specify the ID of the dataset to store the backup at migration
  -h, --help                      Show this message and exit.

NOTE: See migration mode
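
Because `--mode` is required and accepts only the six values listed in the help text, a wrapper script may want to reject a bad value before invoking the tool. The sketch below only builds the argument list; `apply_args` is a hypothetical helper, and whether a backup dataset must accompany the `*_backup` modes is not stated in this document, so the sketch does not enforce it.

```python
# Migration modes as listed in the help text of `bqdm apply`.
MODES = (
    "select_insert",
    "select_insert_backup",
    "replace",
    "replace_backup",
    "drop_create",
    "drop_create_backup",
)


def apply_args(mode, conf_dir=None, backup_dataset=None):
    """Build the argument list for `bqdm apply`, validating the mode first."""
    if mode not in MODES:
        raise ValueError("unknown migration mode: %r" % (mode,))
    args = ["bqdm", "apply", "--mode", mode]
    if backup_dataset is not None:
        args += ["--backup-dataset", backup_dataset]
    if conf_dir is not None:
        # CONF_DIR is the optional positional argument and goes last.
        args.append(str(conf_dir))
    return args
```

The returned list can be passed to `subprocess.run` unchanged, which avoids shell quoting issues with directory names.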

destroy

Usage: bqdm destroy [OPTIONS] COMMAND [ARGS]...

  Specify subcommand `plan` or `apply`

Options:
  -h, --help  Show this message and exit.

Commands:
  apply  Destroy managed datasets.
  plan   Generate and show an execution plan for...
destroy plan
Usage: bqdm destroy plan [OPTIONS] [CONF_DIR]

  Generate and show an execution plan for datasets destruction.

Options:
  --detailed-exitcode         Return a detailed exit code when the command exits.
                              When provided, this argument changes
                              the exit codes and their meanings to provide
                              more granular information about what the
                              resulting plan contains:
                              0 = Succeeded with empty diff
                              1 = Error
                              2 = Succeeded with non-
                              empty diff
  -d, --dataset TEXT          Specify the ID of the dataset to manage.
  -e, --exclude-dataset TEXT  Specify the ID of the dataset to exclude from managed.
  -h, --help                  Show this message and exit.
destroy apply
Usage: bqdm destroy apply [OPTIONS] [CONF_DIR]

  Destroy managed datasets.

Options:
  -d, --dataset TEXT          Specify the ID of the dataset to manage.
  -e, --exclude-dataset TEXT  Specify the ID of the dataset to exclude from managed.
  -h, --help                  Show this message and exit.

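Migration mode

select_insert

  1. TODO

LIMITATIONS: TODO

select_insert_backup

  1. TODO

LIMITATIONS: TODO

replace

  1. TODO

LIMITATIONS: TODO

replace_backup

  1. TODO

LIMITATIONS: TODO

drop_create

  1. TODO

drop_create_backup

  1. TODO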

Authentication

See the authentication section in the official documentation of google-cloud-python.

If you’re running in Compute Engine or App Engine, authentication should “just work”.

If you’re developing locally, the easiest way to authenticate is using the Google Cloud SDK:

$ gcloud auth application-default login

Note that this command generates credentials for client libraries. To authenticate the CLI itself, use:

$ gcloud auth login

Previously, gcloud auth login was used for both use cases. If your gcloud installation does not support the new command, please update it:

$ gcloud components update

If you’re running your application elsewhere, you should download a service account JSON keyfile and point to it using an environment variable:

$ export GOOGLE_APPLICATION_CREDENTIALS="/path/to/keyfile.json"
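
For the keyfile case, a quick pre-flight check can catch a missing or malformed file before the tool is run. This is a minimal sketch using only the standard library; `credentials_problem` is a hypothetical helper, and an unset variable is only a problem when you rely on a keyfile rather than on Compute Engine, App Engine, or gcloud credentials.

```python
import json
import os


def credentials_problem(environ=os.environ):
    """Return a description of what is wrong with the keyfile setting, or None."""
    path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return "GOOGLE_APPLICATION_CREDENTIALS is not set"
    if not os.path.isfile(path):
        return "keyfile not found: %s" % path
    try:
        with open(path) as keyfile:
            json.load(keyfile)
    except ValueError:
        # json.JSONDecodeError is a subclass of ValueError.
        return "keyfile is not valid JSON: %s" % path
    return None
```

The check does not verify that the key is accepted by Google Cloud; it only confirms that the variable points at a readable JSON file.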

Testing

Depends on the following environment variables:

$ export GOOGLE_APPLICATION_CREDENTIALS=/path/to/credentials.json
$ export GOOGLE_CLOUD_PROJECT=YOUR_PROJECT_ID

Run test
$ pip install pipenv
$ pipenv install --dev
$ pipenv run pytest

Run test multiple Python versions
$ pip install pipenv
$ pipenv install --dev
$ pyenv local 3.6.5 3.5.5 3.4.8 2.7.14
$ pipenv run tox

TODO

  1. Support encryption configuration for tables
  2. Support external data configuration for tables
  3. Schema replication
  4. Integration tests
