Why does following-sibling return unwanted siblings?

Posted 2024-10-02 22:36:31

I am trying to get the title from the following HTML code:

<FONT COLOR=#5FA505><B>Claim:</B></FONT> &nbsp; Coed makes unintentionally risqu&eacute; remark about professor's "little quizzies."
<BR><BR>
<CENTER><IMG SRC="/images/content-divider.gif"></CENTER>

I am using this code:

^{pr2}$

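This correctly extracts the Claim: value I want from the HTML above, but it also pulls in the HTML below (other markup with a similar structure on the same page). I defined my xpath() to select only the font tag labelled Claim:, so why does it also pull in the Origins text below? How can I fix this? I tried to get only the next node rather than all of them, but that did not work.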

<FONT COLOR=#5FA505 FACE=""><B>Origins:</B></FONT> &nbsp; Print references to the "little quizzies" tale date to 1962, but the tale itself has been around since the early 1950s. It continues to surface among college students to this day. Similar to a number of other college legends

2 Answers

The following-sibling axis returns all siblings that come after the element. If you only need the first sibling, try this XPath expression:

//font[b = "Claim:"]/following-sibling::text()[1]

Or, depending on your specific use case:

^{pr2}$
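
To see the difference between the full axis and the `[1]` form, here is a minimal sketch. It assumes the third-party lxml library is installed and uses an abridged copy of the HTML from the question wrapped in a hypothetical `<div>`; the variable names are illustrative only, and this is not the asker's original code.

```python
from lxml import html  # third-party; assumed installed (pip install lxml)

# Abridged version of the question's markup; the <div> wrapper is an assumption.
PAGE = """<div>
<FONT COLOR=#5FA505><B>Claim:</B></FONT> &nbsp; Coed makes remark about "little quizzies."
<BR><BR>
<CENTER><IMG SRC="/images/content-divider.gif"></CENTER>
<FONT COLOR=#5FA505 FACE=""><B>Origins:</B></FONT> &nbsp; Print references date to 1962.
</div>"""

tree = html.fromstring(PAGE)  # the HTML parser lowercases tag names

# Without a position test, every text node after the Claim <font> is returned,
# including the text that belongs to the later Origins block.
all_text = tree.xpath('//font[b = "Claim:"]/following-sibling::text()')

# [1] keeps only the first following text node, i.e. the claim itself.
first_text = tree.xpath('//font[b = "Claim:"]/following-sibling::text()[1]')

# &nbsp; is parsed as U+00A0, which str.strip() treats as whitespace.
claim = first_text[0].strip()
print(claim)
```

The same expressions should work unchanged with any XPath 1.0 engine; only the parsing call is specific to lxml.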

I think your XPath is missing the text() qualifier (explained here). It should be:

'//font[b/text()="Claim:"]/following-sibling::text()'
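
One way to check what the text() qualifier changes is the sketch below. It assumes the third-party lxml library and a shortened, hypothetical version of the page wrapped in a `<div>`. For this markup the two predicates appear to select the same element, so a position test such as `[1]` is still what limits the result to a single text node.

```python
from lxml import html  # third-party; assumed installed (pip install lxml)

# Shortened stand-in for the page in the question; the <div> is an assumption.
PAGE = """<div>
<FONT COLOR=#5FA505><B>Claim:</B></FONT> &nbsp; Coed makes remark.
<BR><BR>
<FONT COLOR=#5FA505 FACE=""><B>Origins:</B></FONT> &nbsp; Print references date to 1962.
</div>"""

tree = html.fromstring(PAGE)

# Predicate comparing the string value of <b> versus its text node.
by_string_value = tree.xpath('//font[b = "Claim:"]')
by_text_node = tree.xpath('//font[b/text() = "Claim:"]')
same_element = (
    len(by_string_value) == 1
    and len(by_text_node) == 1
    and by_string_value[0] is by_text_node[0]
)

# With the text() qualifier but no position test, later text nodes still match.
unrestricted = tree.xpath('//font[b/text() = "Claim:"]/following-sibling::text()')
restricted = tree.xpath('//font[b/text() = "Claim:"]/following-sibling::text()[1]')

print(same_element, len(unrestricted), len(restricted))
```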
