Import multiple CSV files into a MySQL database and create a table for each

Posted 2024-09-30 08:26:45


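I have a folder containing hundreds of CSV files. Each file is named by date, because a new file is created in my directory every day, e.g. 2020-01-15.csv, 2020-01-16.csv, 2020-01-17.csv and so on. I am looking for the best way to import the files into a MySQL database daily and create a table for each file (if a table with the file's name already exists, it does not need to be created).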

So far I have used mysqlimport to load files into the MySQL database, but only one file at a time. This time it seems I don't know the tool well enough. This is what I tried in bash:

mysqlimport -h localhost -umyusername -pmypassword database_name /path/to/my/data/*.csv

I got this error:

mysqlimport: Error: 1146, Table 'database_name.2020-01-15' doesn't exist, when using table: 2020-01-15

Can anyone help? Is there an easier way to do this in Python? Thanks in advance.

Structure of a single file:

['date,id,name,gsmCount,userCount,regionCount\n',
 '2020-01-25,g45ddf-54fdfd4,GammaY,22142,3212,132\n', 
 '2020-01-25,g412ddf-54re321d4,BetaT,351871,734,67\n',
 '2020-01-25,fsdsf579hhh-fgd4,LambdaD,367,41,7\n']

Here is my current script:

#!/bin/bash

# show commands being executed, per debug
set -x

# define database connectivity
_db="mydatabasename"
_db_user="myusername"
_db_password="mypassword"


# define directory containing CSV files
_csv_directory="/path/to/my/data"

# go into directory
cd $_csv_directory || exit

# edit file name
rename "s/ //g" *.csv
rename "s/^/tp/g" *.csv

# get a list of CSV files in directory
_csv_files=`ls -1 *.csv`

# loop through csv files
for _csv_file in ${_csv_files[@]}
do

 # remove file extension
 _csv_file_extensionless=`echo "$_csv_file" | sed 's/\(.*\)\..*/\1/'`

 # define table name
 _table_name="${_csv_file_extensionless}"

 # get header columns from CSV file
 _header_columns=`head -1 $_csv_directory/$_csv_file | tr ',' '\n' |   sed 's/"//' | sed 's/ /_/g'`
 _header_columns_string=`head -1 $_csv_directory/$_csv_file | sed 's/ /_/g' | sed 's/"//g' | sed 's/(//g' | sed 's/)//g'`

 # ensure table exists
 mysql -u $_db_user -p$_db_password $_db << eof
     CREATE TABLE IF NOT EXISTS \`$_table_name\` ENGINE=MyISAM DEFAULT CHARSET=utf8
eof

 # loop through header columns
 for _header in "${_header_columns[@]}"
 do

   # add column
   mysql -u $_db_user -p$_db_password $_db --execute="alter table \`$_table_name\` add column IF NOT EXISTS \`$_header\` text"

 done

 # import csv into mysql
 mysqlimport --fields-enclosed-by='"' --fields-terminated-by=',' --lines-terminated-by="\n" --columns=$_header_columns_string -u $_db_user -p$_db_password $_db $_csv_directory/$_csv_file

done
exit

Here are the errors I get when running the script above:

myserver:~ user_name$ bash -c -l "/path/to/my/script/uploadmysql.sh"
+ _db=mydatabasename
+ _db_user=myusername
+ _db_password=mypassword
+ _csv_directory=/path/to/my/data
+ cd /path/to/my/data
+ rename 's/ //g' 2020-01-25.csv 2020-01-26.csv 2020-01-27.csv
/path/to/my/script/uploadmysql.sh: line 19: rename: command not found
+ rename 's/^/tp/g' 2020-01-25.csv 2020-01-26.csv 2020-01-27.csv
/path/to/my/script/uploadmysql.sh: line 20: rename: command not found
++ ls -1 2020-01-25.csv 2020-01-26.csv 2020-01-27.csv
+ _csv_files='2020-01-25.csv
2020-01-26.csv
2020-01-27.csv'
+ for _csv_file in '${_csv_files[@]}'
++ echo 2020-01-25.csv
++ sed 's/\(.*\)\..*/\1/'
+ _csv_file_extensionless=2020-01-25
+ _table_name=2020-01-25
++ head -1 /path/to/my/data/2020-01-25.csv
++ tr , '\n'
++ sed 's/"//'
++ sed 's/ /_/g'
+ _header_columns='date
id
Name
gsmCount
userCount
regionCount'
++ head -1 /path/to/my/data/2020-01-25.csv
++ sed 's/ /_/g'
++ sed 's/"//g'
++ sed 's/(//g'
++ sed 's/)//g'
+ _header_columns_string=date,id,Name,gsmCount,userCount,regionCount
+ mysql -u myusername -pmypassword mydatabase
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1113 (42000) at line 1: A table must have at least 1 column
+ for _header in '"${_header_columns[@]}"'
+ mysql -u myusername -pmypassword mydatabase '--execute=alter table `2020-01-25` add column IF NOT EXISTS `date
id
Name
gsmCount
userCount
regionCount` text'
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1064 (42000) at line 1: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'IF NOT EXISTS `date
id
Name
gsmCount
userCount
regionCount` text' at line 1
+ mysqlimport '--fields-enclosed-by="' --fields-terminated-by=, '--lines-terminated-by=\n' --columns=date,id,Name,gsmCount,userCount,regionCount -u myusername -pmypassword mydatabase /path/to/my/data/2020-01-25.csv
mysqlimport: [Warning] Using a password on the command line interface can be insecure.
mysqlimport: Error: 1146, Table 'mydatabase.2020-01-25' doesn't exist, when using table: 2020-01-25
+ for _csv_file in '${_csv_files[@]}'
++ echo 2020-01-26.csv
++ sed 's/\(.*\)\..*/\1/'
+ _csv_file_extensionless=2020-01-26
+ _table_name=2020-01-26
++ head -1 /path/to/my/data/2020-01-26.csv
++ tr , '\n'
++ sed 's/"//'
++ sed 's/ /_/g'
+ _header_columns='date
id
Name
gsmCount
userCount
regionCount'
++ head -1 /path/to/my/data/2020-01-26.csv
++ sed 's/ /_/g'
++ sed 's/"//g'
++ sed 's/(//g'
++ sed 's/)//g'
+ _header_columns_string=date,id,Name,gsmCount,userCount,regionCount
+ mysql -u myusername -pmypassword mydatabase
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1113 (42000) at line 1: A table must have at least 1 column
+ for _header in '"${_header_columns[@]}"'
+ mysql -u myusername -pmypassword mydatabase '--execute=alter table `2020-01-26` add column IF NOT EXISTS `date
id
Name
gsmCount
userCount
regionCount` text'
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1064 (42000) at line 1: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'IF NOT EXISTS `date
id
Name
gsmCount
userCount
regionCount` text' at line 1
+ mysqlimport '--fields-enclosed-by="' --fields-terminated-by=, '--lines-terminated-by=\n' --columns=date,id,Name,gsmCount,userCount,regionCount -u myusername -pmypassword mydatabase /path/to/my/data/2020-01-26.csv
mysqlimport: [Warning] Using a password on the command line interface can be insecure.
mysqlimport: Error: 1146, Table 'mydatabase.2020-01-26' doesn't exist, when using table: 2020-01-26
+ for _csv_file in '${_csv_files[@]}'
++ echo 2020-01-27.csv
++ sed 's/\(.*\)\..*/\1/'
+ _csv_file_extensionless=2020-01-27
+ _table_name=2020-01-27
++ head -1 /path/to/my/data/2020-01-27.csv
++ tr , '\n'
++ sed 's/"//'
++ sed 's/ /_/g'
+ _header_columns='date
id
Name
gsmCount
userCount
regionCount'
++ head -1 /path/to/my/data/2020-01-27.csv
++ sed 's/ /_/g'
++ sed 's/"//g'
++ sed 's/(//g'
++ sed 's/)//g'
+ _header_columns_string=date,id,Name,gsmCount,userCount,regionCount
+ mysql -u myusername -pmypassword mydatabase
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1113 (42000) at line 1: A table must have at least 1 column
+ for _header in '"${_header_columns[@]}"'
+ mysql -u myusername -pmypassword mydatabase '--execute=alter table `2020-01-27` add column IF NOT EXISTS `date
id
Name
gsmCount
userCount
regionCount` text'
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1064 (42000) at line 1: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'IF NOT EXISTS `date
id
Name
gsmCount
userCount
regionCount` text' at line 1
+ mysqlimport '--fields-enclosed-by="' --fields-terminated-by=, '--lines-terminated-by=\n' --columns=date,id,Name,gsmCount,userCount,regionCount -u myusername -pmypassword mydatabase /path/to/my/data/2020-01-27.csv
mysqlimport: [Warning] Using a password on the command line interface can be insecure.
mysqlimport: Error: 1146, Table 'mydatabase.2020-01-27' doesn't exist, when using table: 2020-01-27
+ exit

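I am still trying to eliminate the errors and import multiple CSV files into MySQL as tables. Can someone tell me how to fix these problems? Thanks in advance.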
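
The trace above seems to show three separate problems. ERROR 1113 comes from a CREATE TABLE statement that lists no columns. ERROR 1064 has two likely causes: MySQL's ALTER TABLE ... ADD COLUMN does not accept IF NOT EXISTS, and `_header_columns` is one newline-separated string rather than a bash array, so the whole header ends up inside a single identifier. Error 1146 then follows because the table was never created, and mysqlimport only loads into existing tables, named after the file without its extension. One way around all three is to build a single CREATE TABLE statement from the header row. The sketch below does that in Python, since the question asks about Python. The function names `quote_identifier` and `create_table_sql` are made up for illustration, and typing every column as TEXT is simply carried over from the script.

```python
import csv
import io


def quote_identifier(name):
    """Backtick-quote a MySQL identifier, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def create_table_sql(table_name, header_line):
    """Build one CREATE TABLE statement declaring every CSV column as TEXT."""
    # csv.reader splits the header into separate names and handles quoted fields.
    columns = next(csv.reader(io.StringIO(header_line)))
    column_defs = ", ".join(
        # Spaces in column names become underscores, as in the original script.
        f"{quote_identifier(col.strip().replace(' ', '_'))} TEXT"
        for col in columns
    )
    return f"CREATE TABLE IF NOT EXISTS {quote_identifier(table_name)} ({column_defs})"


if __name__ == "__main__":
    header = "date,id,name,gsmCount,userCount,regionCount\n"
    print(create_table_sql("2020-01-25", header))
```

Because the statement uses CREATE TABLE IF NOT EXISTS and declares all columns at once, it can be run on every pass without a separate ALTER TABLE loop.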


1 answer
User
#1 · Posted 2024-09-30 08:26:45

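This execsql.py (https://pypi.org/project/execsql/) example shows how to get all the file names in a directory, loop over them, and import each file into its own table: http://execsql.osdn.io/examples.html#example-13-import-all-the-csv-files-in-a-directory. The example is written for Postgres rather than MySQL, and it places the tables in a staging schema (named "staging"), but it can easily be modified to work with MySQL.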

Disclaimer: I wrote execsql.
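
For readers who would rather stay in plain Python, the same loop (list the CSV files, derive a table name from each file name, create the table if needed, load the rows) might be sketched as follows. This is not the execsql example, only an illustration under stated assumptions: `import_csv_directory` is a hypothetical helper, `cursor` is assumed to be a DB-API cursor from a MySQL driver that uses the `%s` placeholder style, and every column is stored as TEXT.

```python
import csv
from pathlib import Path


def quote_identifier(name):
    """Backtick-quote a MySQL identifier, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def import_csv_directory(cursor, directory):
    """Create one table per CSV file (if missing) and insert the file's rows.

    Assumption: `cursor` is a DB-API cursor whose driver uses %s placeholders.
    Returns the list of table names that were processed.
    """
    imported = []
    for path in sorted(Path(directory).glob("*.csv")):
        table = quote_identifier(path.stem)  # 2020-01-15.csv -> `2020-01-15`
        with path.open(newline="", encoding="utf-8") as handle:
            reader = csv.reader(handle)
            header = next(reader, None)
            if not header:
                continue  # skip empty files
            columns = ", ".join(quote_identifier(c) for c in header)
            column_defs = ", ".join(f"{quote_identifier(c)} TEXT" for c in header)
            placeholders = ", ".join(["%s"] * len(header))
            cursor.execute(f"CREATE TABLE IF NOT EXISTS {table} ({column_defs})")
            rows = [row for row in reader if row]  # drop blank lines
            if rows:
                cursor.executemany(
                    f"INSERT INTO {table} ({columns}) VALUES ({placeholders})",
                    rows,
                )
        imported.append(path.stem)
    return imported
```

With a driver such as PyMySQL, the cursor would come from something like `pymysql.connect(...).cursor()`, followed by a `commit()` on the connection after the call. Note that this sketch inserts the rows again each time it runs, so a daily job would probably need to skip files whose table already exists or move processed files elsewhere.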
