I have an XML file with the following structure:
<text>
<dialogue>
<pattern>
We're having a {nice|great} time.
</pattern>
<criterion>
<!-- match this tag, get the above pattern -->
average_person, tourist, delighted
</criterion>
</dialogue>
<dialogue>
<pattern>
The service {here stinks|is terrible}!
</pattern>
<criterion>
tourist, disgruntled, average_person
</criterion>
</dialogue>
<dialogue>
<pattern>
They have {smoothies|funny hats}. Neat!
</pattern>
<criterion>
tourist, smoothie_enthusiast
</criterion>
</dialogue>
<dialogue>
<pattern>
I wonder how {expensive|valuable} these resort tickets are?
</pattern>
<criterion>
merchant, average_person
</criterion>
</dialogue>
</text>
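What I want to do is iterate over the dialogue tags, look at each criterion tag, and match it against a list of words. If they match, I use the pattern from that dialogue tag. I'm using Python for this.
What I currently do is walk the tags with lxml's etree, like this: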
from lxml import etree

# personality, tags_to_match, match_tags and tourist_patterns are defined elsewhere.
tree = etree.parse('tourists.xml')
root = tree.getroot()
g = 0
for i in root.iterfind('dialogue/criterion'):
    a = i.text.split(',')
    # The "personality" variable has a value like "delighted" or "disgruntled".
    # "tags_to_match" are the criteria that we want to, well, match. It may
    # have criteria like "merchant", "tourist", or "delighted".
    # When the tags match (the "match_tags" function returns True), the
    # pattern is appended to the "tourist_patterns" list.
    if personality != 'average_person' and match_tags(tags_to_match, a):
        tourist_patterns.append(root[g][0].text)
    g += 1
# When we don't have a match, we just go with the "average_person" tag.
if len(tourist_patterns) == 0:
    # Go through the tags again, choosing the ones that match the
    # 'average_person' personality and put them in the "tourist_patterns" list.
    pass  # placeholder: this fallback pass is described but not shown
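I then go through the elements of the tourist_patterns list and pick out the one I want.
I'm trying to simplify this. How can I iterate over the tags, match the desired text in the criterion tag, and then get the pattern from the pattern tag? I'm also trying to set up a default for when the criteria don't match (hence the "average_person" personality criterion).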
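One possible way to simplify the loop, as a minimal sketch rather than a definitive answer: iterate over the dialogue elements themselves and read both children from each one, so no separate index counter is needed. The sketch uses the standard-library xml.etree.ElementTree so it runs without extra installs; lxml.etree offers the same iterfind and findtext calls. The helper name iter_dialogues and the inlined, trimmed XML (without the comment inside criterion) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the XML from the question, inlined so the sketch runs on its own.
XML = """
<text>
  <dialogue>
    <pattern>
    We're having a {nice|great} time.
    </pattern>
    <criterion>
    average_person, tourist, delighted
    </criterion>
  </dialogue>
  <dialogue>
    <pattern>
    They have {smoothies|funny hats}. Neat!
    </pattern>
    <criterion>
    tourist, smoothie_enthusiast
    </criterion>
  </dialogue>
</text>
"""


def iter_dialogues(root):
    """Yield (criterion_tags, pattern_text) for every <dialogue> element.

    Hypothetical helper, not part of lxml or ElementTree.
    """
    for dialogue in root.iterfind('dialogue'):
        # findtext returns the child's text, or the default if the child is missing.
        criterion = dialogue.findtext('criterion', default='')
        pattern = dialogue.findtext('pattern', default='').strip()
        # Split on commas and strip the whitespace and newlines around each tag.
        tags = {tag.strip() for tag in criterion.split(',') if tag.strip()}
        yield tags, pattern


root = ET.fromstring(XML)
pairs = list(iter_dialogues(root))
for tags, pattern in pairs:
    print(sorted(tags), '->', pattern)
```

Because each pattern is read from the same dialogue element as its criterion, the pairing cannot drift the way root[g][0] can if a dialogue is missing a criterion.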
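Edit: Some commenters asked for a list of what should match. Basically, I want it to match some or all of the words in the criterion tag and give me the text of the pattern tag under that dialogue tag. So if I'm looking for "tourist" and "smoothie_enthusiast", there would be one match in my XML example. I then want to get the pattern tag text "They have {smoothies|funny hats}. Neat!". If the search doesn't match any of the words in the criterion tags, I would fall back to matching only "average_person" and "tourist".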
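The matching rule with its fallback could be sketched as below. This assumes one particular reading of "some or all": a dialogue matches when every wanted tag appears in its criterion, which is the reading that yields exactly one match for "tourist" plus "smoothie_enthusiast". The names select_patterns, DIALOGUES and DEFAULT_TAGS are hypothetical, and the (tags, pattern) pairs are hard-coded from the XML example so the block stands alone.

```python
# (tags, pattern) pairs as they might come out of parsing the XML example.
DIALOGUES = [
    ({'average_person', 'tourist', 'delighted'},
     "We're having a {nice|great} time."),
    ({'tourist', 'disgruntled', 'average_person'},
     "The service {here stinks|is terrible}!"),
    ({'tourist', 'smoothie_enthusiast'},
     "They have {smoothies|funny hats}. Neat!"),
    ({'merchant', 'average_person'},
     "I wonder how {expensive|valuable} these resort tickets are?"),
]

# Assumed default: the tags to fall back on when nothing else matches.
DEFAULT_TAGS = frozenset({'average_person', 'tourist'})


def select_patterns(dialogues, wanted, default=DEFAULT_TAGS):
    """Return patterns whose criterion contains every wanted tag.

    If no dialogue matches, return the patterns whose criterion
    contains every tag in `default` instead.
    """
    wanted = set(wanted)
    # `wanted <= tags` is the subset test: all wanted tags are present.
    matches = [pattern for tags, pattern in dialogues if wanted <= tags]
    if not matches:
        matches = [pattern for tags, pattern in dialogues if default <= tags]
    return matches


print(select_patterns(DIALOGUES, ['tourist', 'smoothie_enthusiast']))
print(select_patterns(DIALOGUES, ['pirate']))
```

Under this reading, the fallback returns the dialogues tagged with both default tags. Swapping the subset test for an overlap test (`wanted & tags`) would give the looser "any word matches" behaviour instead.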
In turn, tourist_patterns would look like this when there is a match:
>>> tourist_patterns
['They have {smoothies|funny hats}. Neat!']
When there is no match, it would fall back to this:
>>> tourist_patterns
['They have {smoothies|funny hats}. Neat!', 'The service {here stinks|is terrible}!']
Hopefully that clears things up.