<p>The <code>docx</code> module doesn't seem to be able to handle <code>EMF</code> files.</p>
<p>As a workaround, you can do something like this:</p>
<pre><code>import shutil
import zipfile

temp_dir = "_temp"
old_docx = "doc.docx"
new_docx = "doc_new.docx"
old_emf = temp_dir + "/word/media/image1.emf"
new_emf = "new_image.emf"

# unpack content of the docx file into the temp folder
with zipfile.ZipFile(old_docx, "r") as z:
    files = z.namelist()
    for f in files:
        z.extract(f, temp_dir)

# replace the image
shutil.copyfile(new_emf, old_emf)

# pack all files from temp folder back into the new docx file
# ("w" rather than "a", so an existing doc_new.docx is overwritten
# instead of getting duplicate entries appended)
with zipfile.ZipFile(new_docx, "w", zipfile.ZIP_DEFLATED) as z:
    for f in files:
        z.write(temp_dir + "/" + f, f)

# remove the temp folder
shutil.rmtree(temp_dir)
</code></pre>
<p>Typical structure of a docx file:</p>
<pre><code>doc.docx
│
├─ [Content_Types].xml
│
├─ _rels
│  └─ .rels
│
├─ docProps
│  ├─ app.xml
│  └─ core.xml
│
└─ word
   ├─ document.xml      &lt; text is here
   ├─ fontTable.xml
   ├─ settings.xml
   ├─ webSettings.xml
   ├─ styles.xml
   │
   ├─ _rels
   │  └─ document.xml.rels
   │
   ├─ theme
   │  └─ theme1.xml
   │
   └─ media
      └─ image1.emf     &lt; your image is here
</code></pre>
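<p>The script unpacks the contents of <code>doc.docx</code> into the temporary folder <code>_temp</code>, then replaces the file <code>image1.emf</code> in that folder with another file, <code>new_image.emf</code>, taken from the current directory. It then packs the contents of the temporary folder back into <code>doc_new.docx</code> and removes the temporary folder.</p>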
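<p>The same replacement can probably be done without a temporary folder, by copying the archive entry by entry and swapping the bytes of one member. The following is only a sketch of that idea: <code>replace_member</code> is a hypothetical helper name, not part of <code>docx</code> or <code>zipfile</code>, and it assumes the image sits at <code>word/media/image1.emf</code> as in the tree above.</p>

```python
import zipfile


def replace_member(src_docx, dst_docx, member, new_bytes):
    """Copy src_docx to dst_docx, replacing the content of one archive member.

    `member` is the path inside the archive, e.g. "word/media/image1.emf".
    Raises KeyError if the archive has no such member.
    """
    with zipfile.ZipFile(src_docx, "r") as zin:
        if member not in zin.namelist():
            raise KeyError(member)
        with zipfile.ZipFile(dst_docx, "w", zipfile.ZIP_DEFLATED) as zout:
            for name in zin.namelist():
                # keep every entry as is, except the one being replaced
                data = new_bytes if name == member else zin.read(name)
                zout.writestr(name, data)


# Hypothetical usage with the file names from the script above:
# with open("new_image.emf", "rb") as f:
#     replace_member("doc.docx", "doc_new.docx", "word/media/image1.emf", f.read())
```

<p>As with the script above, this does not touch the XML, so the replaced image keeps the dimensions of the old one.</p>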
<p>Note: the new image in <code>doc_new.docx</code> will have the same size as the old image.</p>
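<p>So the workflow could be as follows: create a template docx file, manually place a template EMF image in it, and save the docx file. Then take a new EMF image, put it next to the docx file, and run the script. This gives you a new docx file containing the new EMF image.</p>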
<p>I suppose you have many EMF images, so it makes sense to add a few lines to the script so that it can take several images and generate several docx files.</p>
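<p>One possible way to do that is sketched below. It reads the template once and writes one docx per EMF file, named after the image. The function name <code>build_docs</code>, the output naming scheme, and the <code>member</code> default are assumptions made for illustration; adapt them to your own files.</p>

```python
import zipfile
from pathlib import Path


def build_docs(template_docx, emf_paths, out_dir,
               member="word/media/image1.emf"):
    """Write one docx per EMF file, each a copy of the template
    with `member` replaced by that EMF. Returns the created paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # read all parts of the template into memory once
    with zipfile.ZipFile(template_docx, "r") as zin:
        parts = {name: zin.read(name) for name in zin.namelist()}
    if member not in parts:
        raise KeyError(member)

    created = []
    for emf in map(Path, emf_paths):
        target = out_dir / (emf.stem + ".docx")  # img5.emf gives img5.docx
        with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as zout:
            for name, data in parts.items():
                zout.writestr(name, emf.read_bytes() if name == member else data)
        created.append(target)
    return created


# Hypothetical usage: every EMF in the current folder gets its own docx.
# build_docs("doc.docx", sorted(Path(".").glob("*.emf")), "out")
```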
<p>If all the EMF images are the same size, this works fine as is. If their sizes differ, more code is needed to update the size stored in the XML data.</p>
<p><strong>Update</strong></p>
<p>I've figured out how to get the size of an EMF image. Here is the full solution:</p>
<pre><code>from docx import Document  # python-docx
import shutil
import zipfile

temp_dir = "_temp"
old_docx = "doc.docx"
new_docx = "doc_new.docx"
old_emf = temp_dir + "/word/media/image1.emf"  # don't change this line
new_emf = "img5.emf"

# unpack content of the docx file into temp folder
with zipfile.ZipFile(old_docx, "r") as z:
    files = z.namelist()
    for f in files:
        z.extract(f, temp_dir)

# replace the image
shutil.copyfile(new_emf, old_emf)

# pack all files from temp folder back into the new docx file
# ("w" so that an existing doc_new.docx is overwritten, not appended to)
with zipfile.ZipFile(new_docx, "w", zipfile.ZIP_DEFLATED) as z:
    for f in files:
        z.write(temp_dir + "/" + f, f)

# remove temp folder
shutil.rmtree(temp_dir)

# get sizes of the emf image from its header
with open(new_emf, "rb") as f:
    f.read(16)                                # skip to the right edge of the bounds
    w = int.from_bytes(f.read(2), "little")   # low 2 bytes of the right edge
    f.read(2)
    h = int.from_bytes(f.read(2), "little")   # low 2 bytes of the bottom edge

# 762 = 914400 EMU per inch / 1200, i.e. this assumes 1200 units per inch
width = w * 762
height = h * 762

# open the new docx file and set the sizes for the image
doc = Document(new_docx)
img = doc.inline_shapes[0]  # suppose the first image is the image
img.width = width
img.height = height
doc.save(new_docx)
</code></pre>
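<p>The size calculation above relies on the image bounds starting at zero and on a fixed 1200 units per inch. As a possible alternative, the EMF header also stores a frame rectangle in units of 0.01 mm right after the bounds, which can be converted to EMU without assuming a resolution (1 mm = 36000 EMU). The sketch below reads it with <code>struct</code>; <code>emf_size_emu</code> is a hypothetical helper name, and whether the frame matches the size you want for a given file is something to check on your own images.</p>

```python
import struct

EMU_PER_HUNDREDTH_MM = 360  # 914400 EMU per inch, 25.4 mm per inch


def emf_size_emu(header):
    """Return (width, height) in EMU from the first 44 bytes of an EMF file.

    Header layout used here: record type (4 bytes), record size (4 bytes),
    bounds rectangle (16 bytes, device units), frame rectangle
    (16 bytes, 0.01 mm units), signature (4 bytes, b" EMF").
    """
    if len(header) != 44 and len(header) not in range(44, len(header) + 1):
        raise ValueError("need at least 44 bytes of header")
    (rec_type,) = struct.unpack_from("=I", header, 0)
    if rec_type != 1 or header[40:44] != b" EMF":
        raise ValueError("not an EMF header")
    # frame rectangle: left, top, right, bottom as little-endian signed 32-bit
    left, top, right, bottom = struct.unpack_from("=4i", header, 24) \
        if struct.pack("=H", 1) == b"\x01\x00" \
        else struct.unpack_from(">4i", header[::1], 24)
    width = (right - left) * EMU_PER_HUNDREDTH_MM
    height = (bottom - top) * EMU_PER_HUNDREDTH_MM
    return width, height


# Hypothetical usage with the names from the script above:
# with open(new_emf, "rb") as f:
#     img.width, img.height = emf_size_emu(f.read(44))
```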