I have a txt file and want to print specific words

Posted 2024-09-19 23:27:05

This is my text file:

10.10.10.10 POST /include/jquery.js HTTP/1.1 233 
192.10.10.12 POST /include/jquery.js HTTP/1.1 232 
10.10.10.12 POST /node/jquery.jshowoff2.js HTTP/1.1 23e
171.1.1.15 POST /include/jquery.min.js HTTP/1.1 121
10.10.10.10 POST /text/jquery.sho.min.js HTTP/1.1 233

I only want to print the file names that end in .js. For example, for the first line it should print: jquery.js

This is what I have so far, but it prints the whole line:

import re
import sys
linenum = 0
substr = '.js'
with open ('access_log.txt', 'rt') as myfile:
    for line in myfile:
        linenum += 1
        if line.find(substr) != -1: 
            print(line, end=' ')

Output:

10.10.10.10 POST /include/jquery.js HTTP/1.1 233 
 192.10.10.12 POST /include/jquery.js HTTP/1.1 232 
 10.10.10.12 POST /node/jquery.jshowoff2.js HTTP/1.1 23e
 171.1.1.15 POST /include/jquery.min.js HTTP/1.1 121
 10.10.10.10 POST /text/jquery.sho.min.js HTTP/1.1 233 

Tags: text, import, node, http, include, line, js, jquery

2 answers

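if __name__ == "__main__":
    unique_files = set()  # a set drops duplicate file names

    with open('data/text-file-specific-words.txt', 'r') as f:
        for line in f:  # iterate lazily instead of loading every line with readlines()
            for word in line.split():
                # split each whitespace-separated word into its path segments
                tokens = word.split('/')
                for token in tokens:
                    if token.endswith('.js'):
                        unique_files.add(token)

    print(sorted(unique_files))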
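
To make the result of this approach easier to see, here is a minimal sketch that runs the same split-and-filter logic on the five sample lines from the question, held in memory instead of read from a file. The names SAMPLE_LINES and unique_js_files are hypothetical and used only for illustration.

```python
# Sample lines copied from the question, kept in memory so no file is needed.
SAMPLE_LINES = [
    "10.10.10.10 POST /include/jquery.js HTTP/1.1 233",
    "192.10.10.12 POST /include/jquery.js HTTP/1.1 232",
    "10.10.10.12 POST /node/jquery.jshowoff2.js HTTP/1.1 23e",
    "171.1.1.15 POST /include/jquery.min.js HTTP/1.1 121",
    "10.10.10.10 POST /text/jquery.sho.min.js HTTP/1.1 233",
]


def unique_js_files(lines):
    """Return the distinct path segments ending in '.js', sorted (hypothetical helper)."""
    found = set()
    for line in lines:
        for word in line.split():
            # A word such as '/include/jquery.js' splits into ['', 'include', 'jquery.js'].
            for token in word.split("/"):
                if token.endswith(".js"):
                    found.add(token)
    return sorted(found)


print(unique_js_files(SAMPLE_LINES))
# ['jquery.js', 'jquery.jshowoff2.js', 'jquery.min.js', 'jquery.sho.min.js']
```

Because a set is used, jquery.js appears only once even though it occurs on two lines. If every occurrence should be printed in file order, a list would be needed instead.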

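Here is how to use the rfind() method to find the index of the last '/' in a string (if there is one). rfind() returns -1 when no '/' is present, so the +1 slice then starts at index 0 and keeps the whole word: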

with open('access_log.txt', 'r') as myfile:
    for line in myfile:
        st = line.split()
        # keep only the text after the last '/' of each word ending in '.js'
        print([a[a.rfind('/')+1:] for a in st if a.endswith('.js')])

Output:

['jquery.js']
['jquery.js']
['jquery.jshowoff2.js']
['jquery.min.js']
['jquery.sho.min.js']
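
The output above is a one-element list per line. If the bare name is wanted, as in the question, one alternative is a regular expression; this is not the method used in the answer above, only a sketch under stated assumptions. It assumes the path is followed by whitespace or the end of the line (so a URL with a query string such as ?v=1 would not match), and the names JS_NAME and js_file_name are hypothetical.

```python
import re

# A '/' followed by a run of characters that are neither '/' nor whitespace,
# ending in '.js', which must be followed by whitespace or the end of the line.
JS_NAME = re.compile(r"/([^/\s]+\.js)(?=\s|$)")


def js_file_name(line):
    """Return the .js file name found in a log line, or None (hypothetical helper)."""
    match = JS_NAME.search(line)
    return match.group(1) if match else None


sample_lines = [
    "10.10.10.10 POST /include/jquery.js HTTP/1.1 233",
    "10.10.10.12 POST /node/jquery.jshowoff2.js HTTP/1.1 23e",
    "171.1.1.15 POST /include/jquery.min.js HTTP/1.1 121",
]

for line in sample_lines:
    name = js_file_name(line)
    if name is not None:
        print(name)  # prints the bare file name, one per line
```

Lines without a .js path return None and are skipped, so nothing is printed for them.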
