itertools problem when repeat is 1: error '>=' not supported between instances of 'int' and 'tuple'

Posted 2024-10-02 12:30:40

Male | A programmer who enjoys writing Python code.

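I'm new here, so apologies if this is basic. My code runs fine with two thresholds, but as soon as I change it to a single threshold I get the error "'>=' not supported between instances of 'int' and 'tuple'". I want to use one threshold because two or more create too many combinations and use too much memory.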

Code that runs without error:

import pandas as pd
data = {'Name_Raw':['AECOM TECHNICAL SERVICES', 'AECOM_*', 'AECOM- Amentum', 'AECOM GOVERNMENT SERVICES (Inactive)', 'ADT LLC dba ADT Security Services', 'ADT', 'AAA Call Center', 'AAA of Northern California, Nevada', 'ANHEUSER BUSCH InBev'], 'Name_CleanCorrect':['AECOM', 'AECOM', 'AECOM', 'AECOM', 'ADT SECURITY CORPORATION', 'ADT SECURITY CORPORATION', 'AAA', 'AAA', 'AB InBev'], 'Name_ngram':['AECOM', 'AECOM', 'AECOM', 'AECOM', 'ADT SECURITY CORPORATION', 'ADT SECURITY CORPORATION', 'AAA', 'State Bar of California', 'Ivanhoe Cambridge USA'], 'Score_ngrams':[38, 100, 51, 33, 52, 41, 36, 30, 16], 'Name_Fuzz':['AECOM', 'AECOM', 'AECOM', 'AECOM', 'ADT SECURITY CORPORATION', 'ADT SECURITY CORPORATION', 'AAA', 'State Bar of California', 'AB InBev'], 'Score_fuzz':[100, 100, 100, 100, 65, 85, 85, 37, 65], 'Name_jw':['Chicago Title Insuranc', 'Invesco', 'Heitman', 'Patheon/Thermo Fisher Scientific', 'Securitas Security Service', 'Michael Baker International, LLC', 'Bank of America', 'Ascension Health', 'Frontier Communication'], 'Score_jw':[66, 66, 63, 61, 62, 64, 67, 32, 100]}

df2 = pd.DataFrame(data)

from itertools import product

def f(x, ngram_thresh, leven_thresh):
    if x['Score_ngrams'] >= ngram_thresh : return x['Name_ngram']
    elif x['Score_fuzz'] >= leven_thresh : return x['Name_Fuzz']
    else: return 0

for ngram_t, leven_t in product(range(40,110,5), repeat=2):
    df2[f'Name_Clean_{ngram_t}_{leven_t}'] = df2.apply(f, ngram_thresh=ngram_t, leven_thresh=leven_t, axis=1)

print(df2)

Code that fails after changing repeat to 1:

def f(x, ngram_thresh):
    if x['Score_ngrams'] >= ngram_thresh : return x['Name_ngram']
    else: return 0

for ngram_t in product(range(40,110,5), repeat=1):
    df2[f'Name_Clean_{ngram_t}'] = df2.apply(f, ngram_thresh=ngram_t, axis=1)

Error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-96-9f9d8793169c> in <module>
      5 for ngram_t in product(range(40,110,5), repeat=1):
      6     print(ngram_t)
----> 7     df2[f'Name_Clean_{ngram_t}'] = df2.apply(f, ngram_thresh=ngram_t, axis=1)

e:\Anaconda3\lib\site-packages\pandas\core\frame.py in apply(self, func, axis, raw, result_type, args, **kwds)
   7539             kwds=kwds,
   7540         )
-> 7541         return op.get_result()
   7542 
   7543     def applymap(self, func) -> "DataFrame":

e:\Anaconda3\lib\site-packages\pandas\core\apply.py in get_result(self)
    178             return self.apply_raw()
    179 
--> 180         return self.apply_standard()
    181 
    182     def apply_empty_result(self):

e:\Anaconda3\lib\site-packages\pandas\core\apply.py in apply_standard(self)
    253 
    254     def apply_standard(self):
--> 255         results, res_index = self.apply_series_generator()
    256 
    257         # wrap results

e:\Anaconda3\lib\site-packages\pandas\core\apply.py in apply_series_generator(self)
    282                 for i, v in enumerate(series_gen):
    283                     # ignore SettingWithCopy here in case the user mutates
--> 284                     results[i] = self.f(v)
    285                     if isinstance(results[i], ABCSeries):
    286                         # If we have a view on v, we need to make a copy because

e:\Anaconda3\lib\site-packages\pandas\core\apply.py in f(x)
    107 
    108             def f(x):
--> 109                 return func(x, *args, **kwds)
    110 
    111         else:

<ipython-input-96-9f9d8793169c> in f(x, ngram_thresh)
      1 def f(x, ngram_thresh):
----> 2     if x['Score_ngrams'] >= ngram_thresh : return x['Name_ngram']
      3     else: return 0
      4 
      5 for ngram_t in product(range(40,110,5), repeat=1):

TypeError: '>=' not supported between instances of 'int' and 'tuple'


1 Answer

User

#1 · Posted 2024-10-02 12:30:40

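You forgot the comma. Write for ngram_t, in product... instead. product yields a tuple on every iteration, even with repeat=1, so without the comma ngram_t is a one-element tuple such as (40,) rather than an int, and comparing an int score against it raises the TypeError.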
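To make the point concrete, here is a minimal sketch using only the standard library. The variable names (thresholds, first, unpacked, direct, bad_column, good_column) are made up for illustration and are not from the question. It shows what product yields with repeat=1, the trailing-comma fix, and the simpler option of looping over the range directly, which should be equivalent when there is only one threshold.

```python
from itertools import product

# Same threshold range as in the question: 40, 45, ..., 105.
thresholds = range(40, 110, 5)

# product() yields tuples even when repeat=1, so each item is a 1-tuple.
first = next(iter(product(thresholds, repeat=1)))
print(first)        # (40,)
print(type(first))  # <class 'tuple'>

# Fix 1: the trailing comma unpacks each 1-tuple into a plain int.
unpacked = [ngram_t for ngram_t, in product(thresholds, repeat=1)]

# Fix 2: with a single threshold, product() is not needed at all.
direct = list(thresholds)

print(unpacked == direct)  # True

# Without unpacking, the tuple also leaks into the generated column name.
bad_column = f"Name_Clean_{first}"
good_column = f"Name_Clean_{unpacked[0]}"
print(bad_column)   # Name_Clean_(40,)
print(good_column)  # Name_Clean_40
```

With either fix, ngram_thresh reaches f as an int, so x['Score_ngrams'] >= ngram_thresh compares two numbers.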
