How do I use vectorization correctly?

Posted 2024-09-26 18:05:27

User

According to an article, vectorization is much faster than using apply to run a function over a DataFrame column.

But I have a special case:

import pandas as pd

df = pd.DataFrame({'IP': [ '1.0.64.2', '100.23.154.63', '54.62.1.3']})

def compare3rd(ip):
    """Check whether the 3rd part of an IP is greater than 100."""
    ip_3rd = ip.split('.')[2]
    if int(ip_3rd) > 100:
        return True
    else:
        return False


# This works, but it is very slow
df['check_results'] = df.IP.apply(lambda x: compare3rd(x))
print(df)

# This is supposed to be much faster,
# but it doesn't work ...
df['check_results_2'] = compare3rd(df['IP'].values)
print(df)

The full traceback is shown below:

Traceback (most recent call last):
  File "test.py", line 16, in <module>
    df['check_results_2'] = compare3rd(df['IP'].values)
  File "test.py", line 6, in compare3rd
    ip_3rd = ip.split('.')[2]
AttributeError: 'numpy.ndarray' object has no attribute 'split'

My question is: how do I use this vectorization approach correctly in this case?

1 answer

User

#1 · Posted 2024-09-26 18:05:27

Use the str accessor in pandas:

df.IP.str.split('.').str[2].astype(int)>100
0    False
1     True
2    False
Name: IP, dtype: bool
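
The one-liner above chains three steps. Here is a minimal, self-contained sketch that spells them out using the sample data from the question. The intermediate names (parts, third_octet) are only illustrative and do not come from the original post.

```python
import pandas as pd

df = pd.DataFrame({'IP': ['1.0.64.2', '100.23.154.63', '54.62.1.3']})

# Step 1: split every string on '.', giving a Series of lists
parts = df['IP'].str.split('.')

# Step 2: .str[2] picks the element at index 2 of each list (the third part),
# and astype(int) converts those strings to integers
third_octet = parts.str[2].astype(int)

# Step 3: compare the whole integer Series against 100 in one operation
df['check_results_2'] = third_octet > 100

print(df)
```

Because every step operates on the whole column, there is no need to call compare3rd once per row.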

Since you mentioned vectorize, you can also wrap the function with np.vectorize:

import numpy as np
np.vectorize(compare3rd)(df.IP.values)
array([False,  True, False])
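
One caveat: the NumPy documentation describes np.vectorize as a convenience wrapper that is essentially a for loop, so it should not be expected to run as fast as the str accessor approach. The sketch below only checks that the approaches agree on the sample data. The list comprehension and the variable names are my own additions for comparison, and compare3rd is condensed from the question's version.

```python
import numpy as np
import pandas as pd

def compare3rd(ip):
    """Return True if the 3rd part of an IP string is greater than 100."""
    return int(ip.split('.')[2]) > 100

df = pd.DataFrame({'IP': ['1.0.64.2', '100.23.154.63', '54.62.1.3']})

# np.vectorize calls compare3rd once per element behind the scenes
via_vectorize = np.vectorize(compare3rd)(df['IP'].values)

# An explicit Python loop that does the same work
via_listcomp = np.array([compare3rd(ip) for ip in df['IP']])

# The pandas str accessor version from the first part of the answer
via_str = (df['IP'].str.split('.').str[2].astype(int) > 100).to_numpy()

print(via_vectorize)
print(np.array_equal(via_vectorize, via_listcomp))
print(np.array_equal(via_vectorize, via_str))
```

To compare speed on real data, each of the three expressions can be timed with the timeit module on a larger column.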
