Python code in Apache NiFi ExecuteStreamCommand raises UnicodeEncodeError: surrogates not allowed

Posted 2024-10-01 17:22:16


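I have some Chinese words in a CSV file. I feed the CSV file to a Python script through the NiFi ExecuteStreamCommand processor, and I get "UnicodeEncodeError: surrogates not allowed". I have made sure the CSV file is UTF-8. The problem seems to be related to the Chinese words, because when I remove them I get no error.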

Sample CSV:

SAP_MATERIAL,GENERIC_ARTICLE,DIM1,DIM2,EAN_UPC,CURRENT_YEAR,CURRENT_SEASON,ARTICLE_DESC,CHINESE_DESC,STYLE,GENDER_CODE,LOCAL_GENDER,LOCAL_GENDER_DESC,SBU_CODE,SBU_DESC,COLLECTION,COLLECTION_DESC,BRAND,SBU_SUB_CODE,SBU_SUB_DESC,CN_RETAIL_PRODUCT_TYPE_DESC,SBU_DESC_CN,SBU_SUB_DESC_CN,TBL_CATEGORY,AGE_GROUP,SAP_GENDER_DESC,COLOR_GROUP_DESC,keep,df_cal
TB027657626105500M,TB0276576261,055,00M,885641855420,2012,SS,"[TB0276576261]KENNBNK GLADIATOR RED,FQ",,027657,W,,,906,TB Tree Footwear,X,,TI,9AR,TB_FT_Women,,TB Tree 鞋履,TB_鞋履_女款,Footwear,,W - WOMEN,RED,1,333
TB027657626105500M,TB0276576261,055,00M,885641855420,2012,SS,"[TB0276576261]KENNBNK GLADIATOR RED,FQ",,027657,W,,,906,TB Tree Footwear,X,,TI,9AR,TB_FT_Women,,TB Tree 鞋履,TB_鞋履_女款,Footwear,,W - WOMEN,RED,2,333
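
One plausible way this error can arise, shown here as a minimal sketch rather than a confirmed diagnosis: if the interpreter decodes stdin as ASCII with the `surrogateescape` error handler (an assumption about the environment NiFi launches the script in; Python 3.6 does this under the C locale), each non-ASCII byte of a value such as 鞋履 becomes a lone surrogate code point, and any later strict UTF-8 encode of that text fails.

```python
# Bytes of one Chinese value from the sample rows, as stored in a UTF-8 CSV file.
raw = "鞋履".encode("utf-8")

# Assumption: stdin is decoded as ASCII with the "surrogateescape" handler,
# which is what Python 3.6 does when it runs under the C locale.
decoded = raw.decode("ascii", errors="surrogateescape")

# Each non-ASCII byte is now a lone surrogate in the range U+DC80..U+DCFF.
print([hex(ord(ch)) for ch in decoded])

# A strict UTF-8 encode of such text fails with the message from the question.
try:
    decoded.encode("utf-8")
except UnicodeEncodeError as exc:
    error_reason = exc.reason
    print(error_reason)

# The original bytes are not lost: the same handler restores them.
restored = decoded.encode("ascii", errors="surrogateescape")
print(restored == raw)
```

If this is what happens in the NiFi environment, the file itself is valid UTF-8 and the problem lies in how the script's standard streams are decoded.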

Here is my code:

#!/usr/bin/python3.6

import sys
import pandas as pd
import numpy as np
import io

df = pd.read_csv(sys.stdin)
df = df.drop_duplicates(
        subset=df.columns.difference(['keep']),keep = False)
df = df[(df.keep == '2')]
df.drop(['keep','df_cal'],axis = 1,inplace = True)


df.to_csv(sys.stdout,index = None)

Here are some screenshots to help show the situation.

The file before it enters the ExecuteStreamCommand processor: [image]
The error: [image]
ExecuteStreamCommand processor settings: [image]
MergeContent processor settings: [image]

I tried updating my code, but I still get the same error:

#!/usr/bin/python3.6

import sys
import pandas as pd
import numpy as np
import io

df = pd.read_csv(sys.stdin)
df = df.drop_duplicates(
        subset=df.columns.difference(['keep']),keep = False)
df = df[(df.keep == '2')]
df.drop(['keep','df_cal'],axis = 1,inplace = True)
for column in df:
    df[column] = df[column].astype(str).str.encode('utf-8')

df.to_csv(sys.stdout,index = None)

How can I fix this error? Any help would be greatly appreciated.
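
A possible fix, sketched under the assumption that the error comes from the locale-dependent encoding of `sys.stdin` and `sys.stdout` rather than from the file: read and write the underlying binary streams and state UTF-8 explicitly. The function name `dedupe_csv` is hypothetical, and `dtype=str` is an added assumption so that the comparison `keep == '2'` is made against text.

```python
import io

import pandas as pd


def dedupe_csv(in_bytes, out_bytes):
    """Read UTF-8 CSV bytes, apply the question's filtering, write UTF-8 CSV bytes."""
    # Passing a binary stream lets pandas decode the bytes itself, so the
    # locale-dependent encoding of sys.stdin is never involved.
    # Assumption: dtype=str keeps values like '027657' and keep == '2' as text.
    df = pd.read_csv(in_bytes, encoding="utf-8", dtype=str)

    # Same steps as in the question: drop rows duplicated on every column
    # except 'keep', keep the rows marked '2', then remove the helper columns.
    subset = list(df.columns.difference(["keep"]))
    df = df.drop_duplicates(subset=subset, keep=False)
    df = df[df["keep"] == "2"]
    df = df.drop(columns=["keep", "df_cal"])

    # Wrap the binary output in an explicit UTF-8 text layer for to_csv.
    out_text = io.TextIOWrapper(out_bytes, encoding="utf-8", newline="")
    df.to_csv(out_text, index=False)
    out_text.flush()
    out_text.detach()  # leave the underlying binary stream open for the caller


# In the NiFi script this would be called as (needs `import sys`):
#     dedupe_csv(sys.stdin.buffer, sys.stdout.buffer)
```

The loop with `.str.encode('utf-8')` from the second attempt is left out: it turns every cell into a `bytes` object and does not change how the standard streams are decoded. Setting the environment variable `PYTHONIOENCODING=utf-8` for the interpreter is another commonly used option, provided the environment that launches the script can be configured that way.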

