将csv文件转换为sqlite数据库

csvs-to-sqlite的Python项目详细描述


CSV到SQLite

PyPITravis CILicense

将csv文件转换为sqlite数据库。使用Datasette浏览并发布sqlite数据库。

基本用法:

csvs-to-sqlite myfile.csv mydatabase.db

这将创建一个名为mydatabase.db的新sqlite数据库,其中包含 单个表,myfile,包含csv内容。

您可以提供多个csv文件:

csvs-to-sqlite one.csv two.csv bundle.db

bundle.db数据库将包含两个表,onetwo

这意味着您可以使用通配符:

csvs-to-sqlite ~/Downloads/*.csv my-downloads.db

如果将路径传递到一个或多个目录,则脚本将递归地 在这些目录中搜索csv文件并为每个文件创建表。

csvs-to-sqlite ~/path/to/directory all-my-csvs.db

处理tsv(制表符分隔值)

可以使用-s选项指定不同的分隔符。如果你愿意 要使用制表符,您需要应用shell转义,如下所示:

csvs-to-sqlite my-file.tsv my-file.db -s $'\t'

将列重构为单独的查找表

假设您有一个csv文件,如下所示:

county,precinct,office,district,party,candidate,votes
Clark,1,President,,REP,John R. Kasich,5
Clark,2,President,,REP,John R. Kasich,0
Clark,3,President,,REP,John R. Kasich,7

Real example taken from the Open Elections project

现在可以使用新的 --extract-列选项(简称:-c)-例如:

csvs-to-sqlite openelections-data-*/*.csv \
    -c county:County:name \
    -c precinct:Precinct:name \
    -c office -c district -c party -c candidate \
    openelections.db

格式如下:

column_name:optional_table_name:optional_table_value_column_name

如果只指定列名,例如-c office,下表将 创建时间:

CREATE TABLE "office" (
    "id" INTEGER PRIMARY KEY,
    "value" TEXT
);

如果您指定了这三个选项,例如-c precinct:Precinct:name表 会像这样:

CREATE TABLE "Precinct" (
    "id" INTEGER PRIMARY KEY,
    "name" TEXT
);

原始表的创建方式如下:

CREATE TABLE "ca__primary__san_francisco__precinct" (
    "county" INTEGER,
    "precinct" INTEGER,
    "office" INTEGER,
    "district" INTEGER,
    "party" INTEGER,
    "candidate" INTEGER,
    "votes" INTEGER,
    FOREIGN KEY (county) REFERENCES County(id),
    FOREIGN KEY (party) REFERENCES party(id),
    FOREIGN KEY (precinct) REFERENCES Precinct(id),
    FOREIGN KEY (office) REFERENCES office(id),
    FOREIGN KEY (candidate) REFERENCES candidate(id)
);

它们将填充引用新派生表的id。

安装

$ pip install csvs-to-sqlite

csvs-to-sqlite现在需要python 3。如果您运行的是Python2,则可以安装支持Python2的最新版本:

$ pip install csvs-to-sqlite==0.9.2

CSV到SQLite--帮助

Usage: csvs-to-sqlite [OPTIONS] PATHS... DBNAME

  PATHS: paths to individual .csv files or to directories containing .csvs

  DBNAME: name of the SQLite database file to create

Options:
  -s, --separator TEXT         Field separator in input .csv
  -q, --quoting INTEGER        Control field quoting behavior per csv.QUOTE_*
                               constants. Use one of QUOTE_MINIMAL (0),
                               QUOTE_ALL (1), QUOTE_NONNUMERIC (2) or
                               QUOTE_NONE (3).
  --skip-errors                Skip lines with too many fields instead of
                               stopping the import
  --replace-tables             Replace tables if they already exist
  -t, --table TEXT             Table to use (instead of using CSV filename)
  -c, --extract-column TEXT    One or more columns to 'extract' into a
                               separate lookup table. If you pass a simple
                               column name that column will be replaced with
                               integer foreign key references to a new table
                               of that name. You can customize the name of the
                               table like so:
                                   state:States:state_name
                               This will pull unique values from the 'state'
                               column and use them to populate a new 'States'
                               table, with an id column primary key and a
                               state_name column containing the strings from
                               the original column.
  -d, --date TEXT              One or more columns to parse into ISO formatted
                               dates
  -dt, --datetime TEXT         One or more columns to parse into ISO formatted
                               datetimes
  -df, --datetime-format TEXT  One or more custom date format strings to try
                               when parsing dates/datetimes
  -pk, --primary-key TEXT      One or more columns to use as the primary key
  -f, --fts TEXT               One or more columns to use to populate a full-
                               text index
  -i, --index TEXT             Add index on this column (or a compound index
                               with -i col1,col2)
  --shape TEXT                 Custom shape for the DB table - format is
                               csvcol:dbcol(TYPE),...
  --filename-column TEXT       Add a column with this name and populate with
                               CSV file name
  --no-index-fks               Skip adding index to foreign key columns
                               created using --extract-column (default is to
                               add them)
  --no-fulltext-fks            Skip adding full-text index on values extracted
                               using --extract-column (default is to add them)
  --version                    Show the version and exit.
  --help                       Show this message and exit.

欢迎加入QQ群-->: 979659372 Python中文网_新手群

推荐PyPI第三方库


热门话题
java JPA,playframework,在分组之前选择order by   java如何让安卓app在一天中的特定时间做某事   web服务Java restlet文件上传使用符合RFC 5987的apache commons文件上传   如何使用java在一个用C++和Direct9编码的网页中运行应用程序?   允许生成ddl的java Postgres数据库设置   垃圾收集有哪些好的Java编码实践可以帮助Java GC?   java@RestController将表单返回到同一页面但URL不同   java Eclipse:无法将project facet动态Web模块的版本更改为2.3   无迭代器分配的内存管理Java NodeUpplicate集合   java Jackson:使用默认(反)序列化程序   Java:如何将保存在同一路径上的多个文件夹中的最大名称提取到一个文件夹名称都是数字的变量中?   java将二维字符串数组转换为二维整数数组