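I'm trying to parse txt files that contain listings of directories and files. I'm interested in the '/ACS/SDU_:' and '/ACS/ScienceDataFile:' directories. How can I exclude directories such as /data/foo/bar/ATB6/Science/TGO/ACS: and /data/foo/bar/ATB7B/Science/TGO/ACS:? I tried to do this with if 'ATB6' not in line: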
import glob
import os

filesDict = dict()
for file in glob.glob('/foo/bar/catalog/*.txt'):
    with open(os.path.join('/foo/bar/catalog', file), 'r') as openFile:
        path = None
        files = []
        for line in openFile:
            line = line.rstrip()
            if not line.strip():
                if files and (path is not None):
                    filesDict[path] = files
                path = None
                files = []
                continue
            if line.endswith('/ACS/SDU_:') or line.endswith('/ACS/ScienceDataFile:'):
                # save previous results, if any
                if files and (path is not None):
                    filesDict[path] = files
                path = line[5:-2]
                files = []
                next(openFile)
                continue
            if path:
                if 'manifest' not in line:
                    files.append(line)
        # last folder read from file but not yet stored
        if path:
            filesDict[path] = files
Example txt file content:
/data/foo/bar/Science/TGO/NOMAD/ScienceDataFile:
123992
3766886 2016-02-17 10:44 SCI__DNMD__03000082_2016-048T09-07-27__00001.EXM
5245980 2016-02-17 10:00 SCI__DNMD__03000081_2016-048T08-48-13__00001.EXM
3766570 2016-02-17 09:26 SCI__DNMD__03000080_2016-048T08-20-01__00001.EXM
/data/foo/bar/Science/TGO/CASSIS/SDU_:
208744
26934224 2016-02-17 13:11 SDU__DCAS_0003_01200002_2016-047T15-18-48__00001.EXM
35322818 2016-02-17 13:11 SDU__DCAS_0002_01200002_2016-047T15-03-48__00001.EXM
/data/foo/bar/Science/ACS/SDU_:
68421952
17660866 2021-09-06 09:56 SDU__DACS_69DC_0241DB01_2021-246T08-13-26__00001.EXM
17660866 2021-09-06 09:41 SDU__DACS_69DB_0241DB01_2021-246T08-12-37__00001.EXM
17660866 2021-09-06 09:24 SDU__DACS_69DA_0241DB01_2021-246T08-11-46__00001.EXM
17660866 2021-09-06 08:27 SDU__DACS_69D9_0241DB01_2021-246T08-10-56__00001.EXM
/data/foo/bar/Science/TGO/ACS/ScienceDataFile:
69881252
14759936 2021-09-05 21:51 SCI__DACS__0241DA01_2021-246T04-26-15__00001.EXM
53 2021-09-05 21:51 SCI__DACS__0241DA01_2021-246T04-26-15__00001.EXM.manifest
318758912 2021-09-05 14:42 SCI__DACS__0241D801_2021-246T00-30-32__00001.EXM
/data/foo/bar/ATB6/Science/TGO/ACS/ScienceDataFile:
0
/data/foo/bar/ATB7B/Science/TGO/ACS/SDU_:
4
4
116 2017-07-12 11:59 ScienceDataFile/
4096 2017-07-12 11:56 SDU_/
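Your if line.endswith check is only valid at the moment you see that line. While parsing the file, the condition is therefore evaluated at the wrong time: before you have seen the files that belong to the path you are interested in. Check path instead, against the suffixes you are interested in, and do so every time you "save" that path in the dict. Change endswith(':') back to where it was originally, since it correctly identifies all paths, not just the ones you are interested in. You can extract the ['ACS/SDU_:', '/ACS/ScienceDataFile:'] list into its own variable so it can be reused.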
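A minimal sketch of that approach, not a definitive implementation. The names WANTED_SUFFIXES, EXCLUDED_PARTS, is_wanted and parse_listing are made up for illustration. It assumes that every directory header ends with ':', that the line right after a header is a total to skip (as next(openFile) does in the question), and that a path should be dropped when 'ATB6' or 'ATB7B' appears as one of its components. Unlike the question's line[5:-2], the dict keys here are the full path without the trailing colon.

```python
# Suffixes and excluded directory names taken from the question.
WANTED_SUFFIXES = ('/ACS/SDU_', '/ACS/ScienceDataFile')
EXCLUDED_PARTS = {'ATB6', 'ATB7B'}


def is_wanted(path):
    """True if path ends with a wanted suffix and has no excluded component."""
    if not path.endswith(WANTED_SUFFIXES):  # str.endswith accepts a tuple
        return False
    return EXCLUDED_PARTS.isdisjoint(path.split('/'))


def parse_listing(lines):
    """Map each wanted directory path to its list of file entries."""
    result = {}
    path = None        # directory currently being read, wanted or not
    files = []
    skip_next = False  # assumption: the line after a header is a total

    def save():
        # The filter runs here, at save time, not when the header is seen.
        if path is not None and files and is_wanted(path):
            result[path] = files

    for raw in lines:
        line = raw.strip()
        if line.endswith(':'):  # any directory header, wanted or not
            save()
            path, files, skip_next = line[:-1], [], True
            continue
        if not line:            # a blank line closes the current section
            save()
            path, files, skip_next = None, [], False
            continue
        if skip_next:
            skip_next = False
            continue
        if path is not None and 'manifest' not in line:
            files.append(line)
    save()  # last folder read but not yet stored
    return result
```

To use it on real files, pass the open file object, for example filesDict.update(parse_listing(openFile)) inside the with block from the question. Because every header resets path, entries from a directory you do not care about can no longer end up under the previous wanted path.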