How do I compare two CSV files and export the similarities and differences in Python?

Name Serial Models
computer1 serial1 model1
computer2 serial2 model2
computer3 serial3 model3
computer4 serial4 model4
computerH serialN/A
computerP serialN/A
computer2 serialN/A
computer3 serialN/A
computer4 serialN/A

3 Answers

User

Answer 1 · edited 2024-06-25 23:01:49

This may help with various kinds of comparison:

import numpy as np
import pandas as pd

# file_name = "list.xlsx"
# df = pd.read_excel(file_name, sheet_name=0)  # alternatively, load the two lists from Excel
df = pd.DataFrame({'List1': [1, 2, 3, 4, 5, 5, 11, 4], 'List 2': [3, 5, 6, 8, 9, 3, 4, 9]})
print(df)
# df.to_excel("list1.xlsx", header=True, index=False)

a, b = df['List1'], df['List 2']
# Results shorter than the frame are wrapped in pd.Series so they align on the index and pad with NaN.
df['Intersect'] = pd.Series(np.intersect1d(a, b))  # unique values common to both lists
df['commonin1'] = a[a.isin(b)]                     # List1 items also in List 2 (duplicates kept)
df['commonin2'] = b[b.isin(a)]                     # List 2 items also in List1 (duplicates kept)
df['1not2'] = pd.Series(np.setdiff1d(a, b))        # unique values in List1 but not in List 2
df['2not1'] = pd.Series(np.setdiff1d(b, a))        # unique values in List 2 but not in List1
df['1not2NU'] = a[~a.isin(b)]                      # in List1 but not in List 2 (duplicates kept)
df['2not1NU'] = b[~b.isin(a)]                      # in List 2 but not in List1 (duplicates kept)
df['exclusive'] = pd.Series(np.setxor1d(a, b))     # values found in exactly one of the two lists
# The union can be longer than the frame, so concat is used to extend the index.
df = pd.concat([df, pd.Series(np.union1d(a, b), name='Union')], axis=1)  # all unique values
print(df)
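Applied to the question, the same set operations can be run on the Name columns of the two files. The following is only a minimal sketch: the two small frames are made-up stand-ins for netscan.csv and computer_list.csv, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the two CSV files.
netscan = pd.DataFrame({"Name": ["computer1", "computer2", "computer3", "computer4"]})
computer_list = pd.DataFrame({"Name": ["computer2", "computer3", "computerH", "computerP"]})

a = netscan["Name"].tolist()
b = computer_list["Name"].tolist()

in_both = np.intersect1d(a, b)      # names present in both files
only_netscan = np.setdiff1d(a, b)   # names found only in the network scan
only_list = np.setdiff1d(b, a)      # names found only in the computer list

print(in_both.tolist())
print(only_netscan.tolist())
print(only_list.tolist())
```

Each result is a sorted array of unique names, so it can be wrapped in a DataFrame and written out with to_csv if a file is needed.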

User

Answer 2 · edited 2024-06-25 23:01:49

Take a look at this:

import pandas as pd

# Read both files; column names are taken from the first row of each CSV
netscan = pd.read_csv('netscan.csv', header=0)
computer_list = pd.read_csv('computer_list.csv', header=0)

# An inner merge keeps only the rows found in both DataFrames.
# Suffixes are appended to any overlapping column names other than 'Name'.
computer_match = netscan.merge(right=computer_list, how='inner', on='Name', suffixes=('_netscan', '_computer_list'))

# Get the list of computer names that matched
match_list = computer_match.Name.unique().tolist()

# Get the rows of computers that did NOT match (note the ~ to negate the mask)
computer_no_match = computer_list.loc[~computer_list.Name.isin(match_list), :]

# Finally, save everything to CSV
computer_match.to_csv('computer_match.csv', index=False)
computer_no_match.to_csv('computer_no_match.csv', index=False)
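As a variation on the same idea, a single merge with indicator=True can label each row as matched or unmatched, which avoids building the list of matched names by hand. This is a sketch under assumptions: the inline CSV text is invented sample data, and in practice the two read_csv calls would point at the real files.

```python
import io
import pandas as pd

# Hypothetical file contents; replace with pd.read_csv('netscan.csv') etc.
netscan = pd.read_csv(io.StringIO(
    "Name,Serial,Models\n"
    "computer1,serial1,model1\n"
    "computer2,serial2,model2\n"
))
computer_list = pd.read_csv(io.StringIO("Name\ncomputer2\ncomputerH\n"))

# indicator=True adds a '_merge' column holding 'both', 'left_only' or 'right_only'.
merged = computer_list.merge(netscan, on="Name", how="left", indicator=True)

# Rows found in both files, and names found only in the computer list.
computer_match = merged[merged["_merge"] == "both"].drop(columns="_merge")
computer_no_match = merged.loc[merged["_merge"] == "left_only", ["Name"]]

print(computer_match)
print(computer_no_match)
```

Both frames can then be saved with to_csv(..., index=False) exactly as in the answer above.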

User

Answer 3 · edited 2024-06-25 23:01:49

You can merge the netscan and computer DataFrames, then fill the missing values in the Serial column with SerialN/A.

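import pandas as pd

netscan = pd.read_csv('netscan.csv')
computer = pd.read_csv('computer_list.csv', usecols=['Name'])  # only the Name column is needed

# Strip trailing whitespace so that otherwise identical names match
for df in [netscan, computer]:
    df['Name'] = df['Name'].str.rstrip()

# An outer merge keeps every name from either file
result = pd.merge(netscan, computer, on='Name', how='outer')
# Names that appear only in computer_list.csv have no serial number
result['Serial'] = result['Serial'].fillna('SerialN/A')
result.to_csv('result.csv', index=False)
print(result)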

This writes the merged table to a CSV file (result.csv) and prints it.
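The rstrip step matters because merge compares the Name values exactly. The sketch below uses made-up data, with one deliberately padded name, to show how a trailing space would otherwise split one computer into two rows.

```python
import pandas as pd

# Hypothetical data: the second frame has a trailing space in one name.
netscan = pd.DataFrame({"Name": ["computer1", "computer2"],
                        "Serial": ["serial1", "serial2"]})
computer = pd.DataFrame({"Name": ["computer2 ", "computerH"]})

# Without cleaning, 'computer2 ' and 'computer2' are treated as different keys.
raw = pd.merge(netscan, computer, on="Name", how="outer")

# After rstrip the keys match, so computer2 appears only once.
computer["Name"] = computer["Name"].str.rstrip()
clean = pd.merge(netscan, computer, on="Name", how="outer")
clean["Serial"] = clean["Serial"].fillna("SerialN/A")

print(len(raw), len(clean))
print(clean)
```

If names may also carry leading whitespace, str.strip() could be used in place of str.rstrip().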
