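I have the function below, which reads a previously generated file into a defaultdict. The file it reads is a CSV file containing file sizes and file paths.
If more than one file has the same size as another file, those files are run through a hash function.
My problem is that print gives me the expected output, but the output written to the file does not.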
def loadfiles():
    '''Loads files and identifies potential duplicates'''
    files = defaultdict(list) # uses defaultdict
    with open(tmpfile) as csvfile: # reads the file into a dictionary
        reader = csv.DictReader(csvfile)
        for row in reader:
            files[row['size']].append(row['file'])
        for key, value in files.items():
            if len([item for item in value if item]) > 1:
                with open (reportname, 'w') as fr:
                    writer = csv.writer(fr, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
                    writer.writerow(['size','filename','hash'])
                    for value in value:
                        writer.writerow([key,value,str(md5Checksum(value))])
                        print(key, value, str(md5Checksum(value)))
The output in the file is:
size,filename,hash
43842270,/home/bob/scripts/inprogress_python_scripts/file_dup/testingscript/webwolf-8.0.0.M25.jar,b325dc62d33e2ada19aea07cbcfb237f
43842270,/home/bob/scripts/inprogress_python_scripts/file_dup/testingscript/bkwolf.jar,b325dc62d33e2ada19aea07cbcfb237f
Whereas the output printed to the screen is:
128555 /home/bob/scripts/inprogress_python_scripts/file_dup/testingscript/SN0aaa(1).pdf def426a8dee8f226e40df826fcde9904
128555 /home/bob/scripts/inprogress_python_scripts/file_dup/testingscript/SN0aaa(1) (another copy).pdf def426a8dee8f226e40df826fcde9904
128555 /home/bob/scripts/inprogress_python_scripts/file_dup/testingscript/SN0aaa.pdf def426a8dee8f226e40df826fcde9904
128555 /home/bob/scripts/inprogress_python_scripts/file_dup/testingscript/SN0aaa(1) (copy).pdf def426a8dee8f226e40df826fcde9904
43842270 /home/bob/scripts/inprogress_python_scripts/file_dup/testingscript/webwolf-8.0.0.M25.jar b325dc62d33e2ada19aea07cbcfb237f
43842270 /home/b/scripts/inprogress_python_scripts/file_dup/testingscript/bkwolf.jar b325dc62d33e2ada19aea07cbcfb237f
What is going wrong?
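Opening a file with 'w' opens it in write mode, which overwrites anything already in the file. Because the open() call sits inside the loop, each group of duplicates replaces the previous one, so only the last group ends up in the report. Use 'a' instead to append.
Appending will cause a different problem: the header row (size, filename, hash) will be written multiple times. Consider writing it once, as the first line, rather than inside the loop.
See, for example: https://www.w3schools.com/python/python_file_write.asp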
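One way to apply both suggestions is to open the report a single time, write the header once, and only then loop over the size groups. The following is a minimal sketch, not the asker's actual code: write_report and its parameters are hypothetical names, and md5_checksum is an assumed stand-in for the question's md5Checksum helper, whose implementation is not shown.

```python
import csv
import hashlib
from collections import defaultdict


def md5_checksum(path, chunk_size=65536):
    """Assumed stand-in for the question's md5Checksum helper."""
    digest = hashlib.md5()
    with open(path, 'rb') as fh:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def write_report(tmpfile, reportname):
    """Group paths by size and write every same-size group to one report."""
    files = defaultdict(list)
    with open(tmpfile, newline='') as csvfile:
        for row in csv.DictReader(csvfile):
            files[row['size']].append(row['file'])

    # Open the report once, outside the loop, so later groups
    # cannot overwrite earlier ones.
    with open(reportname, 'w', newline='') as fr:
        writer = csv.writer(fr)
        writer.writerow(['size', 'filename', 'hash'])  # header written once
        for size, paths in files.items():
            candidates = [p for p in paths if p]
            if len(candidates) > 1:
                for path in candidates:
                    writer.writerow([size, path, md5_checksum(path)])
```

With the file opened outside the loop, 'w' is fine again: the report is truncated once per run rather than once per group, so there is no need for 'a' and no repeated header.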