psycopg2: How to add data from a Python application to PostgreSQL

Published 2024-06-01 14:08:34


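This is part of a project that captures news headlines, tracks political lean, and returns sentiment analysis (i.e. positive, negative, and neutral). Everything in the code below works except for simply adding the captured data to the PostgreSQL table (FYI: CNN and USAToday are also in the works, as I respect all news sources).

Currently, the following error occurs:

Exception has occurred: SyntaxError syntax error at or near "%"

I have tried to find examples for my specific situation, but every example I have seen puts literal values in the INSERT rather than Python variables containing the desired data. I have also attached the original table creation code: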

--create table
create table news_data (
timestamp timestamp,
"user" varchar(75), --user is a reserved word in PostgreSQL, so it is quoted
url text,
site varchar(75),
lean varchar(75),
source varchar(75),
headline varchar(1000),
results text,
positive float,
negative float,
neutral float
);

Python code:

#import the needed libraries
import os
import nltk
#nltk.download() #Only use once
import datetime
import getpass
import requests
import time
import numpy as np
import pandas as pd
import pandasql as psql
import lxml
from bs4 import BeautifulSoup
from nltk.sentiment.vader import SentimentIntensityAnalyzer as SIA
#from nltk.corpus.reader.plaintext import PlaintextCorpusReader as PCR
import psycopg2 as p2

#Application process
ts = datetime.datetime.now() #full datetime (a bare date has no time part); converted to a string below
timestamp = ts.strftime("%d-%b-%Y (%H:%M:%S.%f)")
user = getpass.getuser()
url = 'https://www.foxnews.com/'
site = 'Fox News'
lean = 'Conservative'
source = requests.get(url)
soup = BeautifulSoup(source.content, 'lxml') #'source.content' is critical to success here
headline = soup.find('h2', class_='title title-color-default').text #apply sentiment analysis to headline
vader = SIA()
sentiment_dict = vader.polarity_scores(headline)
results = vader.polarity_scores(headline)
positive = (sentiment_dict['pos']*100)
negative = (sentiment_dict['neg']*100)
neutral = (sentiment_dict['neu']*100)


#lists
#sql_list = (timestamp, user, url, site, lean, headline, results, positive, negative, neutral)

#Postgresql connection
db_connection = p2.connect(user = 'PLACEHOLDER', password='PLACEHOLDER', database='PLACEHOLDER')
cursor = db_connection.cursor()
insertion = cursor.execute('''
INSERT INTO news_data VALUES (%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s)
''')

record_to_insert = (timestamp, user, url, site, lean, headline, source, results, positive, negative, neutral)
cursor.execute(insertion, record_to_insert)


cursor.close()
db_connection.close()

        
#Testing
#print(headline)
#print(negative)

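Even after changing the last lines to the following, I still get this error:

Exception has occurred: SyntaxError syntax error at or near "%" LINE 1: ...ce, results, positive, negative, neutral) VALUES (%s, %s, %s

Changed code: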

cursor.execute("INSERT INTO news_data (timestamp, url, site, lean, headline, source, results, positive, negative, neutral) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s), (timestamp, url, site, lean, headline, results, positive, negative, neutral)")

cursor.commit()

cursor.close()

db_connection.close()
        
#Testing
#print(headline)
#print(negative)

3 answers

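Without checking every data type in the statement, I can see that some of them are strings. Make sure to enclose string-type values in single quotes where necessary.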

insertion = cursor.execute('''
INSERT INTO news_data VALUES ('%s','%s', ...)
''')
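
Note that the psycopg2 documentation says placeholders must not be quoted: the driver adds quoting itself when it adapts each Python value, and the values are passed as a separate argument to `cursor.execute`. A minimal sketch of building the statement with bare placeholders follows. The helper name `build_insert` is hypothetical, and the table and column names are assumed to come from trusted code, never from user input.

```python
def build_insert(table, columns):
    """Return an INSERT statement with one bare %s placeholder per column.

    Identifiers are double-quoted so reserved words such as "user" are accepted.
    They must come from trusted code: only values go through placeholders.
    """
    column_list = ", ".join(f'"{name}"' for name in columns)
    placeholders = ", ".join(["%s"] * len(columns))
    return f"INSERT INTO {table} ({column_list}) VALUES ({placeholders})"


# Build the statement once; the values are supplied separately at execute time.
query = build_insert("news_data", ["timestamp", "user", "url", "site"])
print(query)
```

The values would then be passed on their own, for example `cursor.execute(query, (timestamp, user, url, site))`, so that psycopg2 handles the quoting and escaping.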

Ah, this is the relevant part of the code:

#Postgresql connection
db_connection = p2.connect(user = 'PLACEHOLDER', password='PLACEHOLDER', database='PLACEHOLDER')
cursor = db_connection.cursor()

# hmm, this won't work.  You need to provide values for all those placeholders
insertion = cursor.execute('''
INSERT INTO news_data VALUES (%s,%s,%s,%s,%s,%s,%s,%s,%s,%s,%s)
''')

# oh, i see what you're up to.  You want this:

insertion = 'INSERT INTO news_data....'

record_to_insert = (timestamp, user, url, site, lean, headline, source, results, positive, negative, neutral)

cursor.execute(insertion, record_to_insert)

cursor.close()

# most likely, you'll also need
db_connection.commit()

db_connection.close()
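
Putting this answer together, a fuller sketch might look like the following. It assumes the column order from the table definition in the question, stores the VADER score dictionary as JSON text, and takes the source as a plain string, since psycopg2 cannot adapt a dict or a `requests` Response object on its own. The names `INSERT_SQL`, `to_record`, and `save_record` are hypothetical.

```python
import json

# Explicit column list, so the tuple order does not depend on the table layout.
# "user" is quoted because it is a reserved word in PostgreSQL.
INSERT_SQL = (
    'INSERT INTO news_data ("timestamp", "user", url, site, lean, source, '
    "headline, results, positive, negative, neutral) "
    "VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)"
)


def to_record(timestamp, user, url, site, lean, source, headline, scores):
    """Convert the scraped values into a tuple of types psycopg2 can adapt."""
    return (
        timestamp,            # datetime object, adapted to a timestamp
        user,
        url,
        site,
        lean,
        str(source),          # plain string, not the HTTP response object
        headline,
        json.dumps(scores),   # dict -> JSON text for the text column
        scores["pos"] * 100,
        scores["neg"] * 100,
        scores["neu"] * 100,
    )


def save_record(connection, record):
    """Run the INSERT; the connection context manager commits on success
    and rolls back if an exception is raised (it does not close the connection)."""
    with connection:
        with connection.cursor() as cursor:
            cursor.execute(INSERT_SQL, record)
```

With the question's variables this could be called as `save_record(db_connection, to_record(datetime.datetime.now(), user, url, site, lean, source.url, headline, sentiment_dict))`. Passing `source.url` is only an assumption about what the source column is meant to hold.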

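Basically, I could not get psycopg2 to add the values stored in Python variables directly. The workaround is to first save the data to a CSV file, have the code read the data back from the CSV file, upload it to PostgreSQL, and then delete the CSV file once the process is complete.

Hopefully this will help anyone trying to accomplish a similar task in the future.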
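
The CSV route described in this answer can be sketched with an in-memory buffer and psycopg2's `copy_expert`, which avoids creating and deleting a file on disk. This is only a sketch under assumptions: the column list is taken from the table definition in the question, and the helper names `rows_to_csv_buffer` and `copy_rows` are hypothetical. For a single row, passing the values as the second argument to `cursor.execute` is the documented way to insert Python values, so the CSV detour is optional.

```python
import csv
import io

# COPY reads CSV rows from the file-like object handed to copy_expert.
COPY_SQL = (
    'COPY news_data ("timestamp", "user", url, site, lean, source, headline, '
    "results, positive, negative, neutral) FROM STDIN WITH (FORMAT csv)"
)


def rows_to_csv_buffer(rows):
    """Write rows to an in-memory CSV buffer and rewind it for reading."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    buffer.seek(0)
    return buffer


def copy_rows(connection, rows):
    """Bulk-load rows with COPY; commits when the block exits without error."""
    with connection:
        with connection.cursor() as cursor:
            cursor.copy_expert(COPY_SQL, rows_to_csv_buffer(rows))
```

Each row is a sequence in the same column order as `COPY_SQL`, with every value already converted to something that prints as valid CSV text (for example, the score dictionary serialized to a string first).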
