How to compare data from two different DataFrames

Posted 2024-09-28 03:20:06


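I am trying to compare data from two DataFrames in Python. They share a common column, but it has a different name in each one: "File" in the first and "Código da transação" in the second. I wrote the function below to compare the data, but the lines marked #ERROR raise an error. Why does this happen?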

def checar_valor(a, b):
    # The key column has a different name in each DataFrame.
    col_a = a['File']
    col_b = b['Código da transação']

    # DataFrame.isin returns a boolean DataFrame, which `if` cannot reduce
    # to a single True/False; test membership against the column values instead.
    for valor in col_a:
        if valor in col_b.values:
            print("O valor %s está presente nos dois dataframes" % valor)
        else:
            print("O valor %s está presente apenas no dataframe %s" % (valor, "a"))

    for valor in col_b:
        if valor in col_a.values:
            print("O valor %s está presente nos dois dataframes" % valor)
        else:
            print("O valor %s está presente apenas no dataframe %s" % (valor, "b"))


Traceback (most recent call last):
  File "C:/Users/nick/PycharmProjects/WebCrawler/Extranet/testezin.py", line 75, in <module>
    checar_valor(rs, ga)
  File "C:/Users/nick/PycharmProjects/WebCrawler/Extranet/testezin.py", line 64, in checar_valor
    if  b.isin([a['File'][i]]): #ERRO
  File "C:\Users\nick\PycharmProjects\WebCrawler\venv\lib\site-packages\pandas\core\generic.py", line 1576, in __nonzero__
    .format(self.__class__.__name__))
ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

1 answer
Answer 1 · Posted 2024-09-28 03:20:06

With pd.DataFrame.where you can get a DataFrame that keeps only the values that are equal in both DataFrames at the same position:

df1.where(df1.values==df2.values)
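
As a rough illustration of what that call returns, here is a minimal sketch. The data in df1 and df2 is made up for the example, and it assumes both DataFrames have the same shape, since the comparison of .values is element-wise:

```python
import pandas as pd

# Hypothetical sample data; both frames must have the same shape for the
# element-wise comparison of .values to work.
df1 = pd.DataFrame({"Number": [4, 2, 3]})
df2 = pd.DataFrame({"Number": [4, 5, 3]})

# Keep df1's value where both frames agree at the same position;
# every other cell becomes NaN.
same_position = df1.where(df1.values == df2.values)
print(same_position)
```

Only the second row differs between the two frames here, so that is the only cell replaced by NaN. Note that this compares by position, not by value, so it does not tell you whether a value appears somewhere else in the other DataFrame.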

Edit: following your comment, this should work:

import pandas as pd

A = pd.DataFrame([4,2,3], columns = ['Number'])
B = pd.DataFrame([2,5,6], columns = ['Number'])

a = set(A['Number'])
b = set(B['Number'])

my_set = a | b  # union of both columns, so each distinct value is checked only once

for i in sorted(my_set):  # sorted so the output order is deterministic
    if i in a:
        if i in b:
            print(str(i) + ' is in both DataFrames')
        else:
            print(str(i) + ' is in A but not in B')
    else:  # a value from the union that is not in A must be in B
        print(str(i) + ' is in B but not in A')

Output:

2 is in both DataFrames 
3 is in A but not in B 
4 is in A but not in B
5 is in B but not in A 
6 is in B but not in A
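
For larger DataFrames, a loop-free alternative may be worth considering. The sketch below swaps in a different technique from the answer above: an outer merge with indicator=True, which labels each row as "both", "left_only" or "right_only". The column names come from the question; the sample values and the helper column "value" are assumptions made for illustration:

```python
import pandas as pd

# Hypothetical sample data using the column names from the question.
rs = pd.DataFrame({"File": ["A1", "B2", "C3"]})
ga = pd.DataFrame({"Código da transação": ["B2", "C3", "D4"]})

# Outer merge on the differently named key columns; indicator=True adds a
# "_merge" column with the values "both", "left_only" or "right_only".
merged = rs.merge(
    ga,
    left_on="File",
    right_on="Código da transação",
    how="outer",
    indicator=True,
)

# Collapse the two key columns into one ("value" is a made-up helper column);
# every row has at least one of the two keys filled in.
merged["value"] = merged["File"].fillna(merged["Código da transação"])

in_both = set(merged.loc[merged["_merge"] == "both", "value"])
only_rs = set(merged.loc[merged["_merge"] == "left_only", "value"])
only_ga = set(merged.loc[merged["_merge"] == "right_only", "value"])

print("In both:", sorted(in_both))
print("Only in rs:", sorted(only_rs))
print("Only in ga:", sorted(only_ga))
```

The results are collected into sets so that duplicate keys and the row order of the merged frame do not affect what is reported.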
