Pandas CSV to DataFrame (with a Base64-encoded column)

Posted 2024-07-08 15:03:20


The code below uses Pandas to ingest firewall logs from a CSV file into a DataFrame.

df = pd.read_csv('/Users/alistairgillespie/Documents/Projects/COMP5310/Akamai Data/FINAL/data.csv', dtype = {"_time": str, "city": str,"country": str,"lat": str,"long": str,"region": str,"UA": str,"bytes": str,"cliIP": str,"reqHost": str, "reqMethod": str, "reqPath": str,"reqPort": str,"respCT": str,"respLen": str,"status": str,"referer": str,"date": str,"conn": str,"denyData": str,"denyRules": str,"policy": str,"ruleSet": str,"warnRules": str,"warnData": str,"warnSlrs": str,"warnTags": str})

*Please excuse the long list of columns.
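If every column is meant to be read as a string, the per-column mapping can probably be shortened, since `read_csv` also accepts a single type for `dtype`. A minimal sketch, using a small in-memory CSV invented here for illustration (the three columns are a subset of the real ones and the values are made up):

```python
import io

import pandas as pd

# Hypothetical two-row sample standing in for the real firewall log.
sample_csv = io.StringIO(
    "status,reqPort,denyData\n"
    "403,443,aGVsbG8%3D\n"
    "200,80,\n"
)

# A single dtype applies to every column, so no per-column dict is needed.
df_sample = pd.read_csv(sample_csv, dtype=str)

# Empty cells are still parsed as missing values, even with dtype=str.
print(df_sample["status"].tolist())
print(df_sample["denyData"].isna().tolist())
```

Note that the empty `denyData` cell still comes back as a missing value rather than an empty string, which matters for the decoding step.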

In the DataFrame, I want to iterate over each row and decode the "denyData" field with calls to unquote and base64decode (if it is not NaN). I tried to do this with the following code:

[code block not preserved in the source]

This produces the following error:

TypeError: argument of type 'float' is not iterable
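This message is consistent with a NaN reaching `unquote`: Pandas represents a missing cell as the float NaN, and `urllib.parse.unquote` runs a substring test on its argument, which a float does not support. A minimal reproduction, assuming that `parse` in the question refers to `urllib.parse`:

```python
from urllib import parse

missing = float("nan")  # how Pandas represents an empty CSV cell

try:
    parse.unquote(missing)
    raised = False
except TypeError as exc:
    # unquote() checks for '%' in its argument, which fails for a float
    raised = True
    print(f"TypeError: {exc}")
```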

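What is the correct way to handle a column of bytes from a CSV in a Pandas DataFrame? Is this the right way to clean this data? A sample of the data is shown below.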

Example of the column with encoded data


1 answer
User
#1 · Posted 2024-07-08 15:03:20

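You can try an if-else, since the error evidently means that the NaNs cannot be handled. Note that the check must be on the denyData cell itself, not on the whole row: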

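import base64
from urllib import parse
import numpy as np

for i, row in df.iterrows():
    # Check the denyData cell itself; pd.notnull(row) would return a whole Series
    if pd.notnull(row['denyData']):
        df.loc[i, 'denyData'] = base64.b64decode(parse.unquote(row['denyData']))
    else:
        df.loc[i, 'denyData'] = np.nan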
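
As an alternative to the row loop, the same decoding can be sketched as a helper applied to the column, which avoids writing back cell by cell. The helper name `decode_deny_data` and the sample values are made up for illustration, and the final `.decode("utf-8")` assumes the payload is UTF-8 text, which the question does not confirm:

```python
import base64
from urllib import parse

import numpy as np
import pandas as pd


def decode_deny_data(value):
    """URL-unquote then Base64-decode one cell; pass missing values through."""
    if pd.isna(value):
        return np.nan
    raw = base64.b64decode(parse.unquote(value))
    # Assumption: the decoded bytes are UTF-8 text.
    return raw.decode("utf-8")


# Hypothetical sample: "aGVsbG8%3D" is URL-quoted Base64 for "hello".
df_demo = pd.DataFrame({"denyData": ["aGVsbG8%3D", np.nan]})
df_demo["denyData"] = df_demo["denyData"].apply(decode_deny_data)
print(df_demo)
```

If the decoded payload is binary rather than text, dropping the `.decode("utf-8")` call and keeping the `bytes` object is the safer choice.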
