Update a trimatch pandas script to update columns instead of appending them

Posted 2024-10-03 21:24:57


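I'm looking for some help updating my Python script so that the pandas "match" updates the existing columns instead of creating new ones. I've included all the details below, along with the incorrect and the correct results.

Any help would be greatly appreciated.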

test.csv (original CSV file)

MATCH1,MATCH2,TITLE,TITLE,TITLE,TITLE,TITLE,TITLE,MATCH3,DATA,TITLE,TITLE
DMATCH1,MData (N/A),data,data,data,data,data,data,Tommy,55,data,data
DMATCH1,MData (N/A),data,data,data,data,data,data,Ben,54,data,data
DMATCH1,MData (N/A),data,data,data,data,data,data,Jim,52,data,data
DMATCH1,MData (N/A),data,data,data,data,data,data,Elz M,22,data,data
DMATCH2,MData (B/B),data,data,data,data,data,data,James Smith,15,data,data
DMATCH2,MData (B/B),data,data,data,data,data,data,Jessica Long,224,data,data
DMATCH2,MData (B/B),data,data,data,data,data,data,Mike,62,data,data
DMATCH3,Mdata,data,data,data,data,data,data,Joe Reane,66,data,data
DMATCH3,Mdata,data,data,data,data,data,data,Peter Jones,256,data,data
DMATCH3,Mdata,data,data,data,data,data,data,Lesley Lope,5226,data,data

test.txt (original text file)

^{pr2}$

test.py (run me)

^{3}$

test.csv (CSV file as updated after running test.py)

MATCH1,MATCH2,TITLE,TITLE,TITLE,TITLE,TITLE,TITLE,MATCH3,DATA,TITLE,TITLE,,,,,
DMATCH1,MData (N/A),data,data,data,data,data,data,Tommy,55,data,data,3,144512/23332,Data $50.90,misc2 $10.40,bla3 $20.20
DMATCH1,MData (N/A),data,data,data,data,data,data,Ben,54,data,data,1,90000/222311,,,
DMATCH1,MData (N/A),data,data,data,data,data,data,Jim,52,data,data,1,90000/222311,,,
DMATCH1,MData (N/A),data,data,data,data,data,data,Elz M,22,data,data,1,90000/222311,,,
DMATCH2,MData (B/B),data,data,data,data,data,data,James Smith,15,data,data,4,2333/114441,Data $50.90,,bla3 $5.44
DMATCH2,MData (B/B),data,data,data,data,data,data,Jessica Long,224,data,data,4,2333/114441,,,
DMATCH2,MData (B/B),data,data,data,data,data,data,Mike,62,data,data,4,90000/222311,,,
DMATCH3,Mdata,data,data,data,data,data,data,Joe Reane,66,data,data,,,,,
DMATCH3,Mdata,data,data,data,data,data,data,Peter Jones,256,data,data,,,,,
DMATCH3,Mdata,data,data,data,data,data,data,Lesley Lope,5226,data,data,,,,,

test.txt (updated text file)

Mdata
DMATCH3
5 Joe Reane 0/0
5 Peter Jones 90000/222311
Data $10.91
misc2 $420.00
bla3 $210.00

Because the text file has been updated, we need to re-run test.py, which produces the following incorrect output: test.csv (updated)

MATCH1,MATCH2,TITLE,TITLE.1,TITLE.2,TITLE.3,TITLE.4,TITLE.5,MATCH3,DATA,TITLE.6,TITLE.7,TITLE01_x,TITLE02_x,Data_x,misc2_x,bla3_x,TITLE01_y,TITLE02_y,Data_y,misc2_y,bla3_y
DMATCH1,MData (N/A),data,data,data,data,data,data,Tommy,55,data,data,3.0,144512/23332,$50.90,$10.40,$20.20,3,144512/23332,$50.90,$10.40,$20.20
DMATCH1,MData (N/A),data,data,data,data,data,data,Ben,54,data,data,1.0,90000/222311,$50.90,$10.40,$20.20,1,90000/222311,$50.90,$10.40,$20.20
DMATCH1,MData (N/A),data,data,data,data,data,data,Jim,52,data,data,1.0,90000/222311,$50.90,$10.40,$20.20,1,90000/222311,$50.90,$10.40,$20.20
DMATCH1,MData (N/A),data,data,data,data,data,data,Elz M,22,data,data,1.0,90000/222311,$50.90,$10.40,$20.20,1,90000/222311,$50.90,$10.40,$20.20
DMATCH2,MData (B/B),data,data,data,data,data,data,James Smith,15,data,data,4.0,2333/114441,$50.90,,$5.44,4,2333/114441,$50.90,,$5.44
DMATCH2,MData (B/B),data,data,data,data,data,data,Jessica Long,224,data,data,4.0,2333/114441,$50.90,,$5.44,4,2333/114441,$50.90,,$5.44
DMATCH2,MData (B/B),data,data,data,data,data,data,Mike,62,data,data,4.0,90000/222311,$50.90,,$5.44,4,90000/222311,$50.90,,$5.44
DMATCH3,Mdata,data,data,data,data,data,data,Joe Reane,66,data,data,,,,,,,,,,
DMATCH3,Mdata,data,data,data,data,data,data,Peter Jones,256,data,data,,,,,,,,,,
DMATCH3,Mdata,data,data,data,data,data,data,Lesley Lope,5226,data,data,,,,,,,,,,
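
The doubled `_x`/`_y` columns above are what `pandas.merge` produces when both frames carry a non-key column with the same name, which is the case on a second run because the CSV already contains the columns added by the first run. A minimal sketch of that effect, using hypothetical miniature frames (the column `TITLE01` and the sample rows are borrowed from the post; the frame names are made up for illustration):

```python
import pandas as pd

# Hypothetical miniature of the CSV after the first run: TITLE01 already exists.
csv_after_first_run = pd.DataFrame({
    "MATCH1": ["DMATCH1", "DMATCH3"],
    "MATCH3": ["Tommy", "Joe Reane"],
    "TITLE01": ["3", None],
})

# Hypothetical miniature of the freshly parsed text file.
text_rows = pd.DataFrame({
    "MATCH1": ["DMATCH3"],
    "MATCH3": ["Joe Reane"],
    "TITLE01": ["5"],
})

# TITLE01 is in both frames but is not a join key, so merge keeps both
# copies and disambiguates them with the default suffixes _x and _y.
merged = pd.merge(csv_after_first_run, text_rows,
                  how="left", on=["MATCH1", "MATCH3"])
print(list(merged.columns))
# ['MATCH1', 'MATCH3', 'TITLE01_x', 'TITLE01_y']
```

Merging never overwrites a column in place, so a second run needs a different operation (or a post-merge step that folds `_y` back into `_x`).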

The correct output should be an updated file: test.csv

MATCH1,MATCH2,TITLE,TITLE,TITLE,TITLE,TITLE,TITLE,MATCH3,DATA,TITLE,TITLE,,,,,
DMATCH1,MData (N/A),data,data,data,data,data,data,Tommy,55,data,data,3,144512/23332,Data $50.90,misc2 $10.40,bla3 $20.20
DMATCH1,MData (N/A),data,data,data,data,data,data,Ben,54,data,data,1,90000/222311,,,
DMATCH1,MData (N/A),data,data,data,data,data,data,Jim,52,data,data,1,90000/222311,,,
DMATCH1,MData (N/A),data,data,data,data,data,data,Elz M,22,data,data,1,90000/222311,,,
DMATCH2,MData (B/B),data,data,data,data,data,data,James Smith,15,data,data,4,2333/114441,Data $50.90,,bla3 $5.44
DMATCH2,MData (B/B),data,data,data,data,data,data,Jessica Long,224,data,data,4,2333/114441,,,
DMATCH2,MData (B/B),data,data,data,data,data,data,Mike,62,data,data,4,90000/222311,,,
DMATCH3,Mdata,data,data,data,data,data,data,Joe Reane,66,data,data,5,0/0,,misc2 $420.00,bla3 $210.00
DMATCH3,Mdata,data,data,data,data,data,data,Peter Jones,256,data,data,5,90000/222311,,,
DMATCH3,Mdata,data,data,data,data,data,data,Lesley Lope,5226,data,data,,,,,
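
Note that the desired header above keeps the repeated TITLE names, while the incorrect output shows TITLE.1, TITLE.2 and so on. That renaming comes from `pandas.read_csv`, which de-duplicates repeated header names by default. One possible way to round-trip the header unchanged, sketched here on a made-up three-column sample rather than the real file, is to read the header line as an ordinary data row:

```python
import io
import pandas as pd

# Made-up sample with a repeated header name, standing in for test.csv.
raw = "MATCH1,TITLE,TITLE\nDMATCH1,data,data\n"

# Default read: pandas renames the second TITLE to keep column names unique.
mangled = pd.read_csv(io.StringIO(raw))
print(list(mangled.columns))

# header=None treats the header line as the first data row; dtype=str
# keeps every cell as text so nothing is reformatted on the way out.
verbatim = pd.read_csv(io.StringIO(raw), header=None, dtype=str)

# Writing without a generated header reproduces the original lines.
out = io.StringIO()
verbatim.to_csv(out, header=False, index=False)
print(out.getvalue())
```

The trade-off is that the columns are then addressed by position (0, 1, 2, ...) rather than by name, so the match columns would have to be looked up by index.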

Thanks in advance
-海弗莱克斯


Traceback from jsexauer's code

Traceback (most recent call last):
  File "C:\test.py", line 62, in <module>
    mergeddata = pandas.merge(csvdata, textdata, how='right', on=mergecols, sort=False)
  File "C:\Python27\lib\site-packages\pandas\tools\merge.py", line 37, in merge
    return op.get_result()
  File "C:\Python27\lib\site-packages\pandas\tools\merge.py", line 197, in get_result
    self._maybe_add_join_keys(result, left_indexer, right_indexer)
  File "C:\Python27\lib\site-packages\pandas\tools\merge.py", line 222, in _maybe_add_join_keys
    right_na_indexer))
ValueError: could not convert string to float:

1 answer

Answer 1 · Posted 2024-10-03 21:24:57

Update

I think I understand it now. I believe you want to use the DataFrame.update method. Is this what you were after?

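import pandas

# CSV_IN, CSV_OUT and table are defined elsewhere in test.py
textcols = ['MATCH2', 'MATCH1', 'TITLE01', 'MATCH3', 'TITLE02', 'Data', 'misc2', 'bla3']
csvdata = pandas.read_csv(CSV_IN)
textdata = pandas.DataFrame(table, columns=textcols)

# Add any columns that exist only in the text data, keeping their order
# (subtracting one Index from another is not a set difference in current pandas)
newCols = [c for c in textdata.columns if c not in csvdata.columns]
for c in newCols:
    csvdata[c] = None

# Index both frames on the three match keys so update() aligns rows by key
mergecols = ['MATCH2', 'MATCH1', 'MATCH3']
csvdata.set_index(mergecols, inplace=True, drop=False)
textdata.set_index(mergecols, inplace=True, drop=False)
# Overwrite matching cells in place; missing values in textdata are skipped
csvdata.update(textdata)
csvdata.to_csv(CSV_OUT, index=False)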
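
To see what the update step does in isolation, here is a small self-contained sketch. The frames and the two-column key are hypothetical stand-ins for csvdata and textdata, and it assumes each key combination appears only once in the text data. Unlike the snippet above it drops the key columns into the index and restores them with reset_index afterwards, which is just one way to arrange it:

```python
import pandas as pd

keys = ["MATCH1", "MATCH3"]

# Hypothetical stand-in for csvdata after an earlier run.
csv_df = pd.DataFrame({
    "MATCH1": ["DMATCH1", "DMATCH3", "DMATCH3"],
    "MATCH3": ["Tommy", "Joe Reane", "Lesley Lope"],
    "TITLE01": ["3", None, None],
}).set_index(keys)

# Hypothetical stand-in for textdata parsed from the updated text file.
text_df = pd.DataFrame({
    "MATCH1": ["DMATCH3"],
    "MATCH3": ["Joe Reane"],
    "TITLE01": ["5"],
}).set_index(keys)

# update() aligns on the index and overwrites csv_df in place where
# text_df has a non-missing value; no new columns or rows are created.
csv_df.update(text_df)

result = csv_df.reset_index()
print(result)
```

After the call, the Joe Reane row carries the new TITLE01 value, the Tommy row keeps its old one, and the Lesley Lope row stays empty, so re-running the script on a changed text file no longer piles up `_x`/`_y` columns.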
