Can this Python code be shortened further?

Published 2024-10-01 04:53:17

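Below is Python code that finds all subdomains of a domain. It takes as input a file containing the source of a web page. The second argument is the domain name, for example "https://www.sometime.com".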

import re
def getSubDomains(fil,domain):
   with open(fil) as f:
    subDomainLst = []
    for line in f:
      m = re.findall(r'\bhref="\https://[\w+\.*]+%s/'%domain,line)
      if(m):
         for ele in m: subDomainLst.append(ele.split('/')[2])
      else:
            continue
   subDomainLst = list(set(subDomainLst))
   for ele in subDomainLst: print ele 
def main():
    fil1,domain1 = raw_input("Enter the file name\n"),raw_input("Enter the domain\n")
    getSubDomains(fil1,domain1)
if __name__ == '__main__': main()

I tried to shrink the inner if/else statement to

for ele in m: subDomainLst.append(ele.split('/')[2]) if(m) else continue

but this gives an error.

Can the code above be shortened further (it is currently 16 lines) so that it takes as few lines as possible and becomes more Pythonic?


3 answers

You may want to change the if statement to try..except

try:
    for ele in m: subDomainLst.append(ele.split('/')[2])
except TypeError:
    print "OMG m is not iterable!"

or something similar.

You don't need the continue. You could try the following, although I don't recommend it because it makes the code unreadable.

subDomainLst = [ele.split('/')[2] for line in f for ele in re.findall(r'\bhref="\https://[\w+\.*]+%s/' % domain, line)]
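A small sketch of why the `if`/`else: continue` guard can go, and why the one-line attempt in the question fails. It is written in Python 3 syntax (the thread's code is Python 2), and the pattern and sample text are made up for illustration only. `re.findall` returns an empty list when nothing matches, so looping over the result simply does nothing; and `continue` is a statement, so it cannot be one branch of an `a if cond else b` expression.

```python
import re

# Hypothetical pattern and sample text, used only for illustration.
pattern = r'href="https://[\w.]+/'

# With no match, findall returns an empty list rather than None,
# so a loop over the result simply runs zero times.
no_match = re.findall(pattern, "plain text without links")
collected = []
for ele in no_match:
    collected.append(ele.split('/')[2])

# `continue` is a statement, so it cannot be a branch of `a if cond else b`.
attempt = "for m in [[]]:\n    print(m) if m else continue\n"
try:
    compile(attempt, "<attempt>", "exec")
    compiles = True
except SyntaxError:
    compiles = False

print(no_match, collected, compiles)  # prints: [] [] False
```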

By the way, you should indent your code with 4 spaces and try to avoid incomprehensible one-line statements: Pythonic also means readable.
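For comparison, here is one possible readable version, as a sketch rather than a drop-in replacement. It uses Python 3 syntax, and several things in it are assumptions not found in the question: the name `find_subdomains`, passing a bare domain such as `sometime.com`, a tightened pattern with a capturing group, and `re.escape` so that the dots in the domain match literally.

```python
import re


def find_subdomains(lines, domain):
    """Return the set of host names under `domain` that `lines` link to."""
    # Assumption: `domain` is a bare domain such as "sometime.com".
    # re.escape keeps the dots in the domain from acting as wildcards.
    pattern = re.compile(
        r'\bhref="https://((?:[\w-]+\.)+%s)/' % re.escape(domain)
    )
    hosts = set()
    for line in lines:
        # findall returns [] when nothing matches, so no guard is needed;
        # with one capturing group it returns just the host names.
        hosts.update(pattern.findall(line))
    return hosts


# Hypothetical sample input standing in for the lines of a saved page.
sample = [
    '<a href="https://blog.sometime.com/post">post</a>',
    '<a href="https://www.sometime.com/">home</a>',
    '<a href="https://blog.sometime.com/about">about</a>',
    'a line without any links',
]
for host in sorted(find_subdomains(sample, 'sometime.com')):
    print(host)
```

A set removes duplicates as they are collected, and since the function accepts any iterable of lines, an open file object can be passed in directly.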

Full code:

if __name__ == '__main__':
    import re 
    fil, domain = raw_input("Enter the file name\n"), raw_input("Enter the domain\n")
    with open(fil) as f:
        print '\n'.join(set(ele.split('/')[2] for line in f for ele in re.findall(r'\bhref="\https://[\w+\.*]+%s/' % domain, line)))

You have two different goals: reducing the line count and becoming more Pythonic. Here is a one-liner, but it is not Pythonic:

import re;fil,domain = raw_input("Enter the file name\n"),raw_input("Enter the domain\n");print '\n'.join(set(ele.split('/')[2] for line in open(fil) for ele in (re.findall(r'\bhref="\https://[\w+\.*]+%s/'%domain,line) or ())))
