Count how many characters in one column appear in another column (Pandas)

import pandas as pd
df = pd.DataFrame(data=[["AL0","CP1","NM3","PK9","RM2"],["AL0X24", "CXP44", "MLN", "KKRR9", "22MMRRS"]]).T

3 Answers

User

#1 · Edited 2024-06-25 22:45:13

Given the structure of your DataFrame, you can do the following:

>>> def count_common(s1, s2):
...     return len(set(s1) & set(s2))
...
>>> df["result"] = df.apply(lambda x: count_common(x[0], x[1]), axis=1)
>>> df
     0        1  result
0  AL0   AL0X24       3
1  CP1    CXP44       2
2  NM3      MLN       2
3  PK9    KKRR9       2
4  RM2  22MMRRS       3
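Note that a set keeps each character only once, so this counts distinct shared characters. A minimal sketch in plain Python (no pandas needed) to illustrate; the strings "BOOK" and "LOOK" are made up for this example and are not from the question:

```python
def count_common(s1, s2):
    # set() drops duplicates, so each shared character is counted once
    return len(set(s1) & set(s2))

# Illustrative input: both strings contain "O" twice and "K" once
distinct_shared = count_common("BOOK", "LOOK")
print(distinct_shared)  # 2, because only {"O", "K"} are shared
```

If repeated characters should count more than once, a multiset-style count is needed instead.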

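User

#2 · Edited 2024-06-25 22:45:13

The other solutions fail when the names contain repeated characters, e.g. AAL0 and AAL0X24. Here the result should be 4, because both A's should be counted.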

from collections import Counter

import pandas as pd

df = pd.DataFrame(data=[["AL0","CP1","NM3","PK9","RM2", "AAL0"],
                        ["AL0X24", "CXP44", "MLN", "KKRR9", "22MMRRS", "AAL0X24"]]).T

def num_shared_chars(char_counter1, char_counter2):
    # Count each shared character as often as it occurs in both strings
    shared_chars = set(char_counter1.keys()).intersection(char_counter2.keys())
    return sum(min(char_counter1[k], char_counter2[k]) for k in shared_chars)

# Turn every cell into a Counter of its characters
# (applymap is deprecated since pandas 2.1; df.map(Counter) is the replacement there)
df_counter = df.applymap(Counter)
df['shared_chars'] = df_counter.apply(lambda row: num_shared_chars(row[0], row[1]), axis='columns')

Result:

      0        1  shared_chars
0   AL0   AL0X24             3
1   CP1    CXP44             2
2   NM3      MLN             2
3   PK9    KKRR9             2
4   RM2  22MMRRS             3
5  AAL0  AAL0X24             4
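The per-character minimum used above is what `collections.Counter` computes with its `&` operator (multiset intersection), so the helper can probably be shortened. A sketch under that assumption; `shared_char_count` is a hypothetical name, and it takes the raw strings rather than prebuilt Counters:

```python
from collections import Counter

def shared_char_count(s1, s2):
    # Counter & Counter keeps, for each key, the minimum of the two counts
    overlap = Counter(s1) & Counter(s2)
    return sum(overlap.values())

first = ["AL0", "CP1", "NM3", "PK9", "RM2", "AAL0"]
second = ["AL0X24", "CXP44", "MLN", "KKRR9", "22MMRRS", "AAL0X24"]
counts = [shared_char_count(a, b) for a, b in zip(first, second)]
print(counts)  # [3, 2, 2, 2, 3, 4]
```

With the DataFrame from this answer, the same list comprehension over zip(df[0], df[1]) should be assignable to df['shared_chars'] directly, which skips the intermediate Counter frame.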

User

#3 · Edited 2024-06-25 22:45:13

This looks like a job for set.intersection after zipping the two columns:

[len(set(a).intersection(set(b))) for a,b in zip(df[0],df[1])]
#[3, 2, 2, 2, 3]
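If the title is read literally (how many characters of the first string occur anywhere in the second, counting repeats in the first string), a one-directional count may be closer to what is wanted. This is only one possible reading of the question, and `count_chars_present` is a hypothetical helper name:

```python
def count_chars_present(s1, s2):
    # Build the lookup once, then test every character of s1 against it
    lookup = set(s2)
    return sum(ch in lookup for ch in s1)

first = ["AL0", "CP1", "NM3", "PK9", "RM2"]
second = ["AL0X24", "CXP44", "MLN", "KKRR9", "22MMRRS"]
present = [count_chars_present(a, b) for a, b in zip(first, second)]
print(present)  # [3, 2, 2, 2, 3]
```

On the sample data this agrees with the set-based result. The two only differ when the first string repeats a character, e.g. count_chars_present("AA", "A") gives 2 while the set intersection gives 1.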
