Python: how to count values in a Series when one line contains multiple items

f = open("routeviews-rv2-20181110-1200.pfx2as", 'r')
#read file into array, ignore first 6 lines
lines = loadtxt("routeviews-rv2-20181110-1200.pfx2as", dtype='str', delimiter="\t", unpack=False)
#convert to dataframe
df = pd.DataFrame(lines,columns=['IPPrefix', 'PrefixLength', 'AS'])
series = df['AS'].astype(str).str.replace('_', ',').str.split(',')
arr = numpy.array(list(chain.from_iterable(series)))
ASes= pd.Series(numpy.bincount(arr))

Out[94]:
df=
                    A   B                  C
0             1.0.0.0  24              13335
1             1.0.4.0  22              56203
2             1.0.4.0  24              56203
3             1.0.5.0  24              56203
                  ... ...                ...
67820     1.173.142.0  24  31133_65500,65501
                  ... ...                ...
778719  223.255.252.0  24              58519
778720  223.255.254.0  24              55415

1 answer

Anonymous user

#1 · Posted on 2024-09-29 23:28:18


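You can replace _ with , and split each entry on , and then chain the resulting lists into one flat array before calling np.bincount. The split pieces are strings, so cast them to int first; np.bincount only accepts non-negative integers. The output below appears to come from a small sample in which column 'A' holds the values to count, so substitute your own column name (df['AS'] in the question's code):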

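from itertools import chain

import numpy as np
import pandas as pd

# Replace '_' with ',' and split each entry into a list of strings
series = df['A'].astype(str).str.replace('_', ',').str.split(',')
# Flatten the lists into one array and convert the strings to integers
arr = np.array(list(chain.from_iterable(series))).astype(int)

print(pd.Series(np.bincount(arr)))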

0     0
1     0
2     2
3     4
4     1
5     6
6     1
7     0
8     0
9     0
10    1
dtype: int64
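
As a possible alternative (a sketch, not part of the original answer): np.bincount returns one slot for every integer from 0 up to the largest value, so with AS numbers in the tens of thousands most of the result would be zeros. Assuming pandas 0.25 or later, Series.explode followed by value_counts counts only the values that actually occur. The sample frame below is made up for illustration and reuses the column names from the question's code.

```python
import pandas as pd

# Hypothetical sample data mimicking a few rows shown in the question.
df = pd.DataFrame({
    "IPPrefix": ["1.0.0.0", "1.0.4.0", "1.0.4.0", "1.173.142.0"],
    "PrefixLength": [24, 22, 24, 24],
    "AS": ["13335", "56203", "56203", "31133_65500,65501"],
})

# Normalise '_' to ',', split each entry into a list of strings,
# then give every list element its own row and convert it to int.
exploded = (
    df["AS"].astype(str)
    .str.replace("_", ",", regex=False)
    .str.split(",")
    .explode()
    .astype(int)
)

# Count how often each AS number occurs; only observed values appear.
as_counts = exploded.value_counts().sort_index()
print(as_counts)
```

Here as_counts is indexed by AS number, so as_counts.loc[56203] gives the number of prefixes announced by that AS in the sample.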
