<pre><code>def match_word(ref_row, series):
    """
    Inputs:
        ref_row (str): the reference string
        series (pandas.Series): the strings to cross-check against ref_row
    Outputs:
        series (pandas.Series): booleans, True where a row shares a word with ref_row
    """
    # Convert ref_row into a set of words. split() with no argument splits on
    # any run of whitespace and ignores leading/trailing whitespace.
    ref_row = set(ref_row.split())
    # Convert each string in the series into a set of words.
    series = series.apply(lambda x: set(x.split()))
    # Cross-check each row with the reference row:
    # find the size (number of words) of the intersection.
    series = series.apply(lambda x: len(x.intersection(ref_row)))
    # If the intersection is non-empty, the row shares a word with ref_row.
    series = series > 0
    return series
</code></pre>
<p>You can now call the function above as follows:</p>
^{pr2}$
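<p>A minimal usage sketch, assuming a hypothetical Series called <code>names</code> with made-up sample strings (neither comes from the original question). The function is repeated in compact form so the snippet runs on its own:</p>

```python
import pandas as pd


def match_word(ref_row, series):
    """Return a boolean Series: True where a row shares a word with ref_row."""
    ref_words = set(ref_row.split())
    return series.apply(lambda x: len(set(x.split()) & ref_words) > 0)


# Hypothetical sample data, not taken from the original question.
names = pd.Series(["red apple", "green pear", "apple pie", "blue sky"])

# Compare the first row against every row in the Series (including itself).
mask = match_word(names.iloc[0], names)

# Keep only the rows that share at least one word with the reference row.
matches = names[mask].tolist()
print(matches)
```

<p>The reference row always matches itself, so drop it from the result if you only want the other rows.</p>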
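<p>Note that this is not the most optimized algorithm, but it is a quick and dirty approach. Checking every row against every other row this way is O(n<sup>2</sup>).</p>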
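<p>If the quadratic cost becomes a problem, one possible alternative (a swapped-in technique, not part of the answer above) is to build an inverted index from words to rows, so each row is only compared with rows it actually shares a word with. The sketch below assumes a hypothetical helper name, <code>rows_sharing_a_word</code>, and made-up sample data. It is still quadratic in the worst case, for example when one word appears in every row.</p>

```python
from collections import defaultdict

import pandas as pd


def rows_sharing_a_word(series):
    """Map each row label to the labels of the other rows sharing a word with it.

    Hypothetical helper for illustration only.
    """
    # Inverted index: word -> labels of the rows that contain it.
    index = defaultdict(set)
    for label, text in series.items():
        for word in set(text.split()):
            index[word].add(label)

    # Rows listed under the same word share that word.
    related = {label: set() for label in series.index}
    for labels in index.values():
        for label in labels:
            related[label] |= labels - {label}
    return related


# Hypothetical sample data, not taken from the original question.
names = pd.Series(["red apple", "green pear", "apple pie", "blue sky"])
related = rows_sharing_a_word(names)
print(related)
```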