How do I update PostgreSQL with a modified CSV DataFrame?

Posted 2024-09-30 10:40:32

The following code works:

import pandas as pd
import csv
import psycopg2

df = pd.read_csv(r'https://developers.google.com/adwords/api/docs/appendix/geo/geotargets-2021-02-24.csv')
df=df.rename(columns = {'Criteria ID':'Criteria_ID','Canonical Name':'Canonical_Name','Parent ID':'Parent_ID','Country Code':'Country_Code','Target Type':'Target_Type'})
df = df.loc[df['Country_Code']=='IN']
df.to_csv(r'C:\Users\Harshal\Desktop\tar.csv',index=False)

conn = psycopg2.connect(host='1.11.11.111',
                   dbname='postgres',
                   user='postgres',
                   password='myPassword',
                   port='1234')  
cur = conn.cursor()
f = open(r'C:\Users\Harshal\Desktop\tar.csv', 'r')
cur.copy_expert("""copy geotargets_india from stdin with (format csv, header, delimiter ',', quote '"')""", f)
conn.commit()
conn.close()
f.close()

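However, I don't want to save the modified DataFrame to disk; I want to upload it directly to the PostgreSQL table. I tried cur.copy_expert("""copy geotargets_india from stdin with (format csv, header, delimiter ',', quote '"')""", df), but it raised an error.

Note: cur.copy_expert("""copy geotargets_india from stdin with (format csv, header, delimiter ',', quote '"')""", f) cannot be avoided, because I save the CSV with some conditions applied.

My table structure: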

create table public.geotargets_india(
Criteria_ID integer not null,
Name character varying(50) COLLATE pg_catalog."default" NOT NULL,
Canonical_Name character varying(100) COLLATE pg_catalog."default" NOT NULL,
Parent_ID NUMERIC(10,2),
Country_Code character varying(10) COLLATE pg_catalog."default" NOT NULL,
Target_Type character varying(50) COLLATE pg_catalog."default" NOT NULL,
Status character varying(50) COLLATE pg_catalog."default" NOT NULL
)

Edit: I tried:

import pandas as pd
import csv
import psycopg2
from sqlalchemy import create_engine

df = pd.read_csv(r'https://developers.google.com/adwords/api/docs/appendix/geo/geotargets-2021-02-24.csv')
df=df.rename(columns = {'Criteria ID':'Criteria_Id','Canonical         Name':'Canonical_Name','Parent ID':'Parent_ID','Country Code':'Country_Code','Target Type':'Target_Type'})
df = df.loc[df['Country_Code']=='IN']
df['Canonical_Name']=df['Canonical_Name'].str.replace(',', " ")
engine = create_engine('postgresql+psycopg2://postgres:myPassword@1.11.11.111:1234/postgres')
df.to_sql(
 'geotargets_india',
  con=engine,
  schema=None, 
  if_exists='append', 
  index=False
)

But I get the error: UndefinedColumn: column "Criteria_Id" of relation "geotargets_india" does not exist LINE 1: INSERT INTO geotargets_india ("Criteria_Id", "Name", "Canoni...

EDIT2: If I drop my table and let a new one be created, the code above works. The new table's creation script looks like this:

CREATE TABLE public.geotargets_india
(
"Criteria_Id" bigint,
"Name" text COLLATE pg_catalog."default",
"Canonical_Name" text COLLATE pg_catalog."default",
"Parent_ID" double precision,
"Country_Code" text COLLATE pg_catalog."default",
"Target_Type" text COLLATE pg_catalog."default",
"Status" text COLLATE pg_catalog."default"
)

Why can't it use the predefined table schema?


2 Answers

I suggest using SQLAlchemy together with pandas' to_sql; it is simple:

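    import pandas as pd
    from sqlalchemy import create_engine
    df = pd.read_csv(r'https://developers.google.com/adwords/api/docs/appendix/geo/geotargets-2021-02-24.csv')
    engine = create_engine('postgresql+psycopg2://user:password@host:port/database')
    # the first argument of to_sql is the target table name, not the database name
    df.to_sql(table_name, engine, if_exists='append', index=False)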
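A likely reason the append fails against the predefined table, offered as a sketch rather than a confirmed diagnosis: PostgreSQL folds unquoted identifiers to lowercase, so a column declared as `Criteria_ID` without quotes is stored as `criteria_id`, while the error message in the question shows `to_sql` sending the quoted, case-sensitive name `"Criteria_Id"`. Assuming that is the cause, lowercasing the DataFrame's column names before calling `to_sql` should let them match. The helper name `normalize_columns` and the sample row are hypothetical and only there for illustration.

```python
import pandas as pd


def normalize_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy whose column names are lowercase with underscores.

    Hypothetical helper: 'Criteria ID' becomes 'criteria_id', which is the
    name PostgreSQL stores for an unquoted Criteria_ID column.
    """
    renamed = df.copy()
    renamed.columns = (
        renamed.columns.str.strip()
        .str.replace(r"\s+", "_", regex=True)  # runs of spaces -> one underscore
        .str.lower()
    )
    return renamed


# Tiny stand-in for the geotargets CSV (values are made up for illustration).
sample = pd.DataFrame(
    {"Criteria ID": [1], "Canonical Name": ["A B"], "Country Code": ["IN"]}
)
normalized = normalize_columns(sample)
print(list(normalized.columns))

# With a real engine, the append would then look like:
# normalized.to_sql('geotargets_india', con=engine, if_exists='append', index=False)
```

The alternative is to keep the mixed-case names and quote them in the CREATE TABLE statement, which is what the table generated by `to_sql` in EDIT2 does.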

I tried your code and fixed a few lines; my version ran successfully:

import pandas as pd
from sqlalchemy import create_engine

df = pd.read_csv(r'https://developers.google.com/adwords/api/docs/appendix/geo/geotargets-2021-02-24.csv', delimiter=',')
print(df)
df=df.rename(columns = {'Criteria ID':'Criteria_Id','Canonical Name':'Canonical_Name','Parent ID':'Parent_ID','Country Code':'Country_Code','Target Type':'Target_Type'})
df = df.loc[df['Country_Code']=='IN']
df['Canonical_Name']=df['Canonical_Name'].str.replace(',', " ")
engine = create_engine('postgresql+psycopg2://collaborateur1:nG@e3P@tapp581lv:2345/base_project')
df.to_sql('geotargets_india',con = engine,schema=None,if_exists='append',index=False)

I added the delimiter ',' and corrected 'Canonical Name'.
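If `copy_expert` has to stay, as the question says, one option is to hand it an in-memory file instead of the DataFrame itself. `copy_expert` expects a file-like object, which is presumably why passing `df` directly raised an error. The sketch below writes the DataFrame to an `io.StringIO` buffer and streams that to the same COPY statement used in the question. The function names `dataframe_to_csv_buffer` and `copy_dataframe` are hypothetical, `conn` is assumed to be an open psycopg2 connection, and the sample row is made up.

```python
import io

import pandas as pd

# Same COPY statement as in the question.
COPY_SQL = """copy geotargets_india from stdin with (format csv, header, delimiter ',', quote '"')"""


def dataframe_to_csv_buffer(df: pd.DataFrame) -> io.StringIO:
    """Serialize df as CSV into an in-memory text buffer, rewound to the start."""
    buffer = io.StringIO()
    df.to_csv(buffer, index=False)  # header row is kept; COPY skips it via HEADER
    buffer.seek(0)
    return buffer


def copy_dataframe(conn, df: pd.DataFrame) -> None:
    """Stream df into geotargets_india without writing a file to disk.

    Assumes conn is an open psycopg2 connection and that the DataFrame's
    column order matches the table's column order.
    """
    buffer = dataframe_to_csv_buffer(df)
    with conn.cursor() as cur:
        cur.copy_expert(COPY_SQL, buffer)
    conn.commit()


# Preview of what would be sent to the server (no database needed here).
sample = pd.DataFrame({"Criteria_ID": [1], "Name": ["Example"]})
preview = dataframe_to_csv_buffer(sample).getvalue()
print(preview)
```

Because COPY matches columns by position and ignores the header names, this route is not affected by the identifier-case mismatch that `to_sql` runs into.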
