numpy vectorized: check whether the strings in one array end with the strings in another array

import numpy as np

strings = ['val1', 'val2', 'val3']
ends = ['1', '2', 'al1']

def buildFunction(ending):
    return lambda x: x.endswith(ending)

funcs = list(map(buildFunction, ends))

def end_function_vector(val):
    return np.vectorize(lambda f, x: f(x))(funcs, np.repeat(val, len(funcs)))

result = np.array(list(map(end_function_vector, strings)))

2 answers

User

#1 · edited 2024-09-30 01:18:45

Here is an almost* vectorized approach that makes use of broadcasting:

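import numpy as np

# The indexing below needs NumPy arrays, not Python lists
strings = np.asarray(strings)
ends = np.asarray(ends)

# Get lengths of strings in each array
lens_strings = np.array(list(map(len, strings)))
lens_ends = np.array(list(map(len, ends)))

# Get the rightmost index of each match and add the lengths of the ends strings.
# For a true suffix match the sum equals the full length of the string,
# so do a final comparison against those lengths.
# rfind returns -1 when there is no match, so those entries are masked out.
rfind = np.char.rfind  # public alias of np.core.defchararray.rfind
idx = rfind(strings[:, None], ends)
out = (idx >= 0) & (idx + lens_ends == lens_strings[:, None])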

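Sample run: with the inputs from the question, out is a 3x3 boolean array in which out[i, j] is True exactly when strings[i] ends with ends[j].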

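*Almost, because map is used. Since it only serves to get the string lengths of the input elements, its cost should be minimal compared with the other operations needed to solve the problem.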
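As a self-contained illustration of the rfind-plus-lengths idea, here is a minimal sketch run on the arrays from the question. The function name `endswith_via_rfind` is hypothetical, and the `idx >= 0` guard is an addition in this sketch: without it, a non-match (`rfind` returns -1) combined with an ending exactly one character longer than the string would be reported as a match.

```python
import numpy as np

def endswith_via_rfind(strings, ends):
    """Return out[i, j] == strings[i].endswith(ends[j]) via rfind and lengths."""
    strings = np.asarray(strings)
    ends = np.asarray(ends)

    # String lengths of both inputs (the only non-vectorized step)
    lens_strings = np.array(list(map(len, strings)))
    lens_ends = np.array(list(map(len, ends)))

    # Rightmost match position of every ending in every string, shape (n, m);
    # -1 marks "not found"
    idx = np.char.rfind(strings[:, None], ends)

    # A suffix match starts at len(string) - len(ending)
    return (idx >= 0) & (idx + lens_ends == lens_strings[:, None])

strings = ['val1', 'val2', 'val3']
ends = ['1', '2', 'al1']

out = endswith_via_rfind(strings, ends)
print(out)
# [[ True False  True]
#  [False  True False]
#  [False False False]]
```

Row i corresponds to strings[i] and column j to ends[j], so the first row says that 'val1' ends with both '1' and 'al1'.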

User

#2 · edited 2024-09-30 01:18:45

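NumPy has exactly this operation for character arrays: numpy.core.defchararray.endswith(), also available under the public name numpy.char.endswith().

The following code speeds things up considerably, but it does need a lot of memory, because it builds two arrays with the same shape as the output array: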

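import numpy as np

A = np.array(['val1', 'val2', 'val3'])
B = np.array(['1', '2', 'al1'])

# Tile A along columns and B along rows so both have shape (len(A), len(B))
A_matrix = np.repeat(A[:, np.newaxis], len(B), axis=1)
B_matrix = np.repeat(B[:, np.newaxis], len(A), axis=1).transpose()

# Element-wise test: result[i, j] is True when A[i] ends with B[j]
# (np.char is the public alias of np.core.defchararray)
result = np.char.endswith(A_matrix, B_matrix)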

Update:
As Divakar noted, the above code can be condensed into a single expression.
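One way to condense the code, assuming broadcasting is what is intended (a sketch, not necessarily the exact line from the update), is to pass a column view of A and let NumPy pair it with B. This avoids building the two repeated input matrices by hand.

```python
import numpy as np

A = np.array(['val1', 'val2', 'val3'])
B = np.array(['1', '2', 'al1'])

# A[:, None] has shape (3, 1) and B has shape (3,),
# so the comparison broadcasts to shape (3, 3)
result = np.char.endswith(A[:, None], B)

# Cross-check against the explicit np.repeat version
A_matrix = np.repeat(A[:, np.newaxis], len(B), axis=1)
B_matrix = np.repeat(B[:, np.newaxis], len(A), axis=1).transpose()
same = np.array_equal(result, np.char.endswith(A_matrix, B_matrix))

print(result)
print(same)
```

Here result[i, j] is True when A[i] ends with B[j], the same layout as the np.repeat version.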
