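<p>You can read each file line by line and then write the lines back, leaving out the ones you do not want. Just be sure what exactly you want to delete: is it always exactly the line you wrote? Is it always the second line? Is it every such line? Is it the first line? And so on.</p>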
<pre><code>import os
import sys

# Assumes the first argument when running the script is a directory containing XML files
directory = sys.argv[1] if len(sys.argv) > 1 else "."
files = os.listdir(directory)
for f in files:
    # Ignore non-XML files
    if not f.endswith(".xml"):
        continue
    # os.listdir returns bare file names, so build the full path
    path = os.path.join(directory, f)
    # Read file content
    with open(path, 'r') as f_in:
        content = f_in.readlines()
    # Rewrite the original file
    with open(path, 'w') as f_out:
        for line in content:
            # The condition may differ based on what you really want to delete
            if line != "<!DOCTYPE pdf2xml SYSTEM \"pdf2xml.dtd\">\n":
                f_out.write(line)
</code></pre>
<p><strong>Things to consider:</strong></p>
<ol>
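<li>If the files are large, you probably do not want to load each one into memory in full.</li>
<li>Comparing every line is inefficient if, for example, you always want to delete only the second line of the file.</li>
<li><p>Do you really need or want to use Python? There are better solutions. For example, if you are on Linux or Mac, you can use <code>sed</code> (the <code>-i ''</code> form is the BSD/macOS syntax; with GNU sed on Linux, use <code>-i</code> without the empty string):</p>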
<pre><code>for f in *.xml; do sed -i '' -n '/<!DOCTYPE pdf2xml SYSTEM "pdf2xml.dtd">/!p' "$f"; done
</code></pre></li>
</ol>
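<p>The first two considerations can be addressed by streaming each file instead of reading it whole. The following is only a minimal sketch, not a definitive implementation: <code>remove_lines</code>, <code>should_drop</code> and <code>is_doctype</code> are hypothetical names introduced here for illustration. It assumes it is acceptable to write a temporary file next to the original and then swap it in with <code>os.replace</code>.</p>

```python
import os
import tempfile

# The line targeted in the example above (compared without its trailing newline)
DOCTYPE_LINE = '<!DOCTYPE pdf2xml SYSTEM "pdf2xml.dtd">'


def is_doctype(line_number, line):
    """Hypothetical predicate: drop the DOCTYPE line wherever it appears."""
    return line.strip() == DOCTYPE_LINE


def remove_lines(path, should_drop):
    """Rewrite `path` without the lines for which should_drop(line_number, line) is true.

    Lines are streamed one at a time, so the whole file is never held in memory.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the final rename
    # stays on one filesystem
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f_out, open(path, "r") as f_in:
            for line_number, line in enumerate(f_in, start=1):
                if not should_drop(line_number, line):
                    f_out.write(line)
        # Swap the filtered copy in place of the original
        os.replace(tmp_path, path)
    except BaseException:
        # Leave the original untouched and clean up the partial copy
        os.remove(tmp_path)
        raise
```

<p>With this sketch, <code>remove_lines(path, is_doctype)</code> drops the DOCTYPE line, and <code>remove_lines(path, lambda n, line: n == 2)</code> always drops the second line without comparing any strings. One caveat: the replacement file is created by <code>tempfile.mkstemp</code>, so it may not keep the original file's permissions.</p>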