Generating Zipfian-distributed data and populating a MySQL database with it

Posted 2024-10-01 02:20:08


I need to generate data that follows a Zipfian distribution and then populate a database with the generated values. Suppose I have this MySQL table:

CREATE TABLE table1(
   id INT(11) PRIMARY KEY AUTO_INCREMENT,
   x INT(11) NOT NULL,
   ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

I want the values of the column "x" to follow a Zipfian distribution. This variable ranges from 1 to 10. Using this post, I wrote the following Python script:

 import numpy as np
 import pymysql


 def Zipf(a: np.float64, min: np.uint64, max: np.uint64, size=None):
     """
     Generate Zipf-like random variables,
     but in inclusive [min...max] interval
     """
     if min == 0:
         raise ZeroDivisionError("")

     v = np.arange(min, max+1) # values to sample
     p = 1.0 / np.power(v, a)  # probabilities
     p /= np.sum(p)            # normalized

     return np.random.choice(v, size=size, replace=True, p=p)

 min = np.uint64(1)
 max = np.uint64(10)

 q = Zipf(1.2, min, max, 100)
 # print(q)

 db = pymysql.connect(host="localhost",    # your host, usually localhost
                 user="root",         # your username
                 passwd="password",  # your password
                 db="db2")        # name of the data base

 # you must create a Cursor object. It will let
 #  you execute all the queries you need
 cur = db.cursor()
 for i in q:
     cur.execute('INSERT INTO table1 (x) VALUES("%x")' % (int(i)) )
     db.commit()

This gives me the following error:

  File "/Users/alfie/PycharmProjects/zipfian/zipf.py", line 36, in <module>
    cur.execute('INSERT INTO table1 (x) VALUES("%x")' % (int(i)) )
  File "/Users/alfie/PycharmProjects/zipfian/venv/lib/python3.7/site-packages/pymysql/cursors.py", line 170, in execute
    result = self._query(query)
  File "/Users/alfie/PycharmProjects/zipfian/venv/lib/python3.7/site-packages/pymysql/cursors.py", line 328, in _query
    conn.query(q)
  File "/Users/alfie/PycharmProjects/zipfian/venv/lib/python3.7/site-packages/pymysql/connections.py", line 517, in query
    self._affected_rows = self._read_query_result(unbuffered=unbuffered)
  File "/Users/alfie/PycharmProjects/zipfian/venv/lib/python3.7/site-packages/pymysql/connections.py", line 732, in _read_query_result
    result.read()
  File "/Users/alfie/PycharmProjects/zipfian/venv/lib/python3.7/site-packages/pymysql/connections.py", line 1075, in read
    first_packet = self.connection._read_packet()
  File "/Users/alfie/PycharmProjects/zipfian/venv/lib/python3.7/site-packages/pymysql/connections.py", line 684, in _read_packet
    packet.check_error()
  File "/Users/alfie/PycharmProjects/zipfian/venv/lib/python3.7/site-packages/pymysql/protocol.py", line 220, in check_error
    err.raise_mysql_exception(self._data)
  File "/Users/alfie/PycharmProjects/zipfian/venv/lib/python3.7/site-packages/pymysql/err.py", line 109, in raise_mysql_exception
    raise errorclass(errno, errval)
pymysql.err.InternalError: (1366, "Incorrect integer value: 'a' for column 'x' at row 1")

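If I use 3 and 8 as the minimum and maximum, as in the post I linked, there is no error and everything works. print(q) also works even when I change the range. Any help would be greatly appreciated.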
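
A likely cause, judging from the error message: in Python's % string formatting, %x is the hexadecimal conversion, so the integer 10 is rendered as 'a'. Values from 3 to 8 look the same in hexadecimal and decimal, which would explain why that range works. The sketch below reproduces the formatting problem without a database and shows one possible fix that passes the values as query parameters. The helper name insert_values is hypothetical and not part of the original script; it assumes a cursor like the one returned by db.cursor() above.

```python
# %x is the hexadecimal conversion in Python's % formatting,
# so 10 becomes 'a', which MySQL rejects for an INT column.
bad_sql = 'INSERT INTO table1 (x) VALUES("%x")' % 10
print(bad_sql)  # INSERT INTO table1 (x) VALUES("a")


def insert_values(cur, values):
    """Hypothetical helper: insert every element of `values` into table1.x.

    The SQL text contains only a %s placeholder; the values are passed
    separately so the driver does the quoting and conversion.
    """
    rows = [(int(v),) for v in values]  # NumPy scalars -> plain Python ints
    cur.executemany("INSERT INTO table1 (x) VALUES (%s)", rows)
    return len(rows)
```

In the script above this would replace the for loop: call insert_values(cur, q) and then db.commit() once, instead of committing after every row.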

