How do I convert an XHTML nested list to PDF with iText?


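I have XHTML content from which I have to create a PDF file on the fly. I use the iText PDF converter. I tried the simple approach, but I always get a bad result after calling the XMLWorkerHelper parser.

XHTML: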


            <ul>
               <li>First
                   <ol>
                        <li>Second</li>
                        <li>Second</li>
                  </ol>
               </li>
               <li>First</li>
            </ul>

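Expected:

First

Second
Second

First

PDF result:

First Second

First

There is no nested list in the result. I need a solution that calls the parser, not one that creates an iText Document instance.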
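
As a sanity check that the XHTML itself carries the nesting shown under "Expected", the structure can be printed without iText at all. The following is a minimal sketch that uses only the JDK's built-in DOM parser; the class name `ListOutline` and its methods are hypothetical helpers made up for this illustration, not part of iText or XML Worker.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper: prints each <li> indented by its list nesting depth.
class ListOutline {

    static String outline(String xhtml) {
        try {
            org.w3c.dom.Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8)));
            StringBuilder out = new StringBuilder();
            walk(doc.getDocumentElement(), -1, out);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed XHTML", e);
        }
    }

    private static void walk(Node node, int depth, StringBuilder out) {
        String name = node.getNodeName();
        if (name.equals("ul") || name.equals("ol")) {
            depth++; // every list element adds one nesting level
        }
        if (name.equals("li")) {
            for (int i = 0; i < depth; i++) {
                out.append("  ");
            }
            out.append(ownText(node)).append('\n');
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            walk(children.item(i), depth, out);
        }
    }

    // Text that belongs directly to the <li>, excluding nested lists.
    private static String ownText(Node li) {
        StringBuilder text = new StringBuilder();
        NodeList children = li.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.TEXT_NODE) {
                text.append(child.getNodeValue());
            }
        }
        return text.toString().trim();
    }

    public static void main(String[] args) {
        String xhtml = "<ul><li>First<ol><li>Second</li><li>Second</li></ol></li>"
                + "<li>First</li></ul>";
        System.out.print(outline(xhtml));
    }
}
```

If the two "Second" items come out indented below the first "First", the input is nested correctly and the flattening happens during the PDF conversion.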

# Answer 1

Please take a look at the NestedListHtml example.

In this example, I take your code snippet list.html:

<ul>
  <li>First
    <ol>
      <li>Second</li>
      <li>Second</li>
    </ol>
  </li>
  <li>First</li>
</ul>

I parse it into an ElementList:

// CSS
CSSResolver cssResolver =
    XMLWorkerHelper.getInstance().getDefaultCssResolver(true);

// HTML
HtmlPipelineContext htmlContext = new HtmlPipelineContext(null);
htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());
htmlContext.autoBookmark(false);

// Pipelines
ElementList elements = new ElementList();
ElementHandlerPipeline end = new ElementHandlerPipeline(elements, null);
HtmlPipeline html = new HtmlPipeline(htmlContext, end);
CssResolverPipeline css = new CssResolverPipeline(cssResolver, html);

// XML Worker
XMLWorker worker = new XMLWorker(css, true);
XMLParser p = new XMLParser(worker);
p.parse(new FileInputStream(HTML));

Now I can add this list to the Document:

for (Element e : elements) {
    document.add(e);
}

Or I can wrap this list in a Paragraph:

Paragraph para = new Paragraph();
for (Element e : elements) {
    para.add(e);
}
document.add(para);

You will get the desired result, as shown in nested_list.pdf.

You can't add nested lists to a PdfPCell or a ColumnText. For example, this will not work:

PdfPTable table = new PdfPTable(2);
table.addCell("Nested lists don't work in a cell");
PdfPCell cell = new PdfPCell();
for (Element e : elements) {
    cell.addElement(e);
}
table.addCell(cell);
document.add(table);

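This is due to a limitation in the ColumnText class that has been there for years. We have evaluated the problem, and the only way to fix it is to rewrite ColumnText completely. That is not an item on our current technical roadmap.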

# Answer 2

import java.util.HashMap;
import java.util.HashSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import org.jsoup.Jsoup;
import org.jsoup.select.Elements;

// Wraps <li class="ql-indent-N"> items in extra <ul>/<ol> tags so that they
// become nested lists before the HTML is handed to the parser.
public String replaceIndentSubList(String htmlContent) {
    org.jsoup.nodes.Document document = Jsoup.parseBodyFragment(htmlContent);
    Elements element_UL = document.select("ul");
    Elements element_OL = document.select("ol");
    if (!element_UL.isEmpty()) {
        htmlContent = replaceIndents(htmlContent, element_UL, "ul");
    }
    if (!element_OL.isEmpty()) {
        htmlContent = replaceIndents(htmlContent, element_OL, "ol");
    }
    return htmlContent;
}

public String replaceIndents(String htmlContent, Elements element, String tagType) {
    String attributeKey = "class";
    String startingULTgas = "<" + tagType + ">";
    String endingULTags = "</" + tagType + ">";
    int lengthOfQLIndenet = "ql-indent-".length();
    // First and last <li> seen for each indent class.
    HashMap<String, String> startingLiTagMap = new HashMap<String, String>();
    HashMap<String, String> lastLiTagMap = new HashMap<String, String>();
    Pattern regex = Pattern.compile("ql-indent-\\d");
    HashSet<String> hash_Set = new HashSet<String>();
    Elements element_Tag = element.select("li");
    for (org.jsoup.nodes.Element element2 : element_Tag) {
        org.jsoup.nodes.Attributes att = element2.attributes();
        if (att.hasKey(attributeKey)) {
            Matcher matcher = regex.matcher(att.get(attributeKey));
            if (matcher.find()) {
                // Key the maps by the matched class (not the whole attribute)
                // so the lookups below also work when other classes are present.
                String indentClass = matcher.group(0);
                if (!startingLiTagMap.containsKey(indentClass)) {
                    startingLiTagMap.put(indentClass, element2.toString());
                }
                hash_Set.add(indentClass);
                if (!startingLiTagMap.get(indentClass).equalsIgnoreCase(element2.toString())) {
                    lastLiTagMap.put(indentClass, element2.toString());
                }
            }
        }
    }
    System.out.println(htmlContent);
    for (String liAttributeKey : hash_Set) {
        int noOfIndentes = Integer.parseInt(liAttributeKey.substring(lengthOfQLIndenet));
        // One extra opening and closing tag per indent level beyond the first.
        for (int i = 1; i < noOfIndentes; i++) {
            startingULTgas = startingULTgas + "<" + tagType + ">";
            endingULTags = endingULTags + "</" + tagType + ">";
        }
        htmlContent = htmlContent.replace(startingLiTagMap.get(liAttributeKey),
                startingULTgas + startingLiTagMap.get(liAttributeKey));
        if (lastLiTagMap.get(liAttributeKey) != null) {
            System.out.println("Inside last Li Map");
            htmlContent = htmlContent.replace(lastLiTagMap.get(liAttributeKey),
                    lastLiTagMap.get(liAttributeKey) + endingULTags);
        } else {
            htmlContent = htmlContent.replace(startingLiTagMap.get(liAttributeKey),
                    startingLiTagMap.get(liAttributeKey) + endingULTags);
        }
        // Reset the tag strings for the next indent class.
        startingULTgas = "<" + tagType + ">";
        endingULTags = "</" + tagType + ">";
    }
    System.out.println(htmlContent);
    return htmlContent;
}
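
The code above targets list items that mark their depth with a class such as ql-indent-1 instead of real nesting. To make the intended transformation easier to follow, here is a simplified, dependency-free sketch of the same idea. It is not the answer's code: the class `IndentNester` and its method `nest` are hypothetical, it uses a plain regular expression instead of jsoup, and it assumes that each `<li>` has at most that single class attribute and contains no nested markup of its own. Unlike the code above, which inserts wrapper tags between sibling items, this sketch places each sublist inside its parent `<li>`, the way the XHTML in the question is structured.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: turns flat <li class="ql-indent-N"> items into nested lists.
class IndentNester {

    // Group 1: indent level (absent means level 0). Group 2: item content.
    private static final Pattern ITEM = Pattern.compile(
            "<li(?:\\s+class=\"ql-indent-(\\d+)\")?>(.*?)</li>", Pattern.DOTALL);

    static String nest(String flatHtml, String tag) {
        String open = "<" + tag + ">";
        String close = "</" + tag + ">";
        StringBuilder out = new StringBuilder(open);
        Matcher m = ITEM.matcher(flatHtml);
        int depth = 0;
        boolean first = true;
        while (m.find()) {
            int level = m.group(1) == null ? 0 : Integer.parseInt(m.group(1));
            if (first) {
                level = 0; // assumption: the first item has no parent to nest under
            } else if (level > depth) {
                level = depth + 1; // go at most one level deeper per item
                out.append(open);  // the sublist opens inside the still-open <li>
            } else {
                for (int i = depth; i > level; i--) {
                    out.append("</li>").append(close); // leave deeper sublists
                }
                out.append("</li>"); // close the previous sibling
            }
            out.append("<li>").append(m.group(2));
            depth = level;
            first = false;
        }
        if (!first) {
            for (int i = depth; i > 0; i--) {
                out.append("</li>").append(close);
            }
            out.append("</li>");
        }
        return out.append(close).toString();
    }

    public static void main(String[] args) {
        String flat = "<ul><li>First</li><li class=\"ql-indent-1\">Second</li>"
                + "<li class=\"ql-indent-1\">Second</li><li>First</li></ul>";
        System.out.println(nest(flat, "ul"));
    }
}
```

The output is well-formed XHTML with real nested lists, which is the shape Answer 1 parses into an ElementList. Real editor output is usually messier than this sketch assumes, so an HTML parser such as jsoup remains the safer basis for production code.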
