Comparing network traffic against an allowed list by domain name

Posted 2024-09-30 00:35:57


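I am trying to parse network traffic and compare the domain names in that traffic against a list of the most common websites. The goal is to print the name of every website that is not in the common-sites list.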


with open('/Users/downloads/scripting_for_security/resources/top_100.txt') as f:
    safeAdd = f.readlines(),


with open('/Users/downloads/scripting_for_security/resources/traffic_log.txt') as n:
    netTraffic = n.readlines(),

domainTraffic = re.findall(r'\s(?:www.)?(\w+.com)', netTraffic)


for i in safeAdd:
    for e in domainTraffic:
        if i != e:
            print(e)

I get a TypeError:

TypeError                                 Traceback (most recent call last)
      8     netTraffic = n.readlines(),
      9
---> 10 domainTraffic = re.findall(r'\s(?:www.)?(\w+.com)', netTraffic)
     11
     12

~/anaconda3/lib/python3.7/re.py in findall(pattern, string, flags)
    221
    222     Empty matches are included in the result."""
--> 223     return _compile(pattern, flags).findall(string)
    224
    225 def finditer(pattern, string, flags=0):

TypeError: expected string or bytes-like object


3 Answers

According to https://docs.python.org/3/tutorial/inputoutput.html, readlines() returns a list, so netTraffic is a list of lines.

findall expects its second argument to be a string: https://docs.python.org/3/library/re.html#re.findall
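A minimal sketch of the difference, using a hypothetical in-memory list in place of the real traffic_log.txt (the sample lines are made up for illustration). Passing the whole list raises the same TypeError as in the question, while passing one line at a time is accepted:

```python
import re

# Hypothetical sample lines standing in for the contents of traffic_log.txt
lines = ["GET www.example.com 200\n", "GET unknown.com 404\n"]
pattern = r'\s(?:www\.)?(\w+\.com)'

try:
    re.findall(pattern, lines)  # a list, not a string
    raised = False
except TypeError:
    raised = True  # "expected string or bytes-like object"

# Passing one string at a time works
matches = [m for line in lines for m in re.findall(pattern, line)]
print(raised, matches)
```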

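As already mentioned, re.findall needs a string, and you are passing it a list. One way to fix this is to iterate over the list of strings (netTraffic) and build up a list of all matches (domainTraffic). I show this below: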

import re

# Read the allowed list; this assumes one bare domain (e.g. example.com) per line.
# No trailing comma after the expression, otherwise the result is wrapped in a tuple.
with open('/Users/downloads/scripting_for_security/resources/top_100.txt') as f:
    safeAdd = [line.strip() for line in f]


with open('/Users/downloads/scripting_for_security/resources/traffic_log.txt') as n:
    netTraffic = n.readlines()

# initialize empty list
domainTraffic = []

# iterate over each line and add its matches to the list
# (dots are escaped so that "." matches a literal dot only)
for net in netTraffic:
    domainTraffic.extend(re.findall(r'\s(?:www\.)?(\w+\.com)', net))

# Use a list comprehension to filter out the safeAdds
filtered_list = [add for add in domainTraffic if add not in safeAdd]

print(filtered_list)

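You could also use str.join to turn the list into one long string and then run re.findall on the combined string. It really depends on what your strings look like.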
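The join-based variant could look roughly like this. It is only a sketch: the sample lines and the allowed list are hypothetical stand-ins for the two files, and the use of a set for the comparison is my own choice rather than something from the question.

```python
import re

# Hypothetical sample data; the real lists would come from the two files
netTraffic = ["10:01 www.google.com ok\n", "10:02 evil.com ok\n", "10:03 evil.com ok\n"]
safeAdd = ["google.com\n", "python.com\n"]

# Join the lines into one string, then run a single findall over it
text = "".join(netTraffic)
domainTraffic = re.findall(r'\s(?:www\.)?(\w+\.com)', text)

# Strip newlines from the allowed list and use a set for the comparison
safe = {s.strip() for s in safeAdd}
unknown = sorted(set(domainTraffic) - safe)
print(unknown)
```

The set difference also removes duplicates, so each unknown domain is reported once even if it appears many times in the log.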

The problem here is that you are passing a list of lines to re.findall instead of a single text string. Use read() instead of readlines():

with open('data.txt') as f:
    print(type(f.readlines()))  # <class 'list'>
    f.seek(0)                   # rewind, because readlines() consumed the file
    print(type(f.read()))       # <class 'str'>, which re.findall accepts

Change the following in your code:

safeAdd = f.read()

netTraffic = n.read()

Also remove the trailing comma (,). With it, netTraffic becomes a tuple containing the list of lines. Compare:

  x = 1,  # equivalent to x = (1,); the result is a tuple
  x = 1   # equivalent to x = (1); without the "," it is an integer
