Removing Wiki markup from a string in Python

Posted 2024-10-01 04:48:57


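I have a string containing information downloaded from a Wikia page.

To parse its contents, how can I strip all Wiki formatting from the page and leave only the raw text?

Here is an example of what might appear: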

#REDIRECT[[Blah]]

{{
I have some stuff in here
}}
[[I also have some stuff in here|and here]]
[[http://blehthisisfake.com Link to a fake website]]

<span class="plainlinks">This is quite useless. Why was [[this page]] even created?</span>

<nowiki>There are more HTML tags, they should probably all be stripped...</nowiki>

There is random text in here. bleh bleh bleh

I'm not sure what single [brackets] do, but they should be stripped too...

Expected output:

^{pr2}$

Is there a module that can do this?


1 answer

Answer 1 · Posted 2024-10-01 04:48:57

A Google search for "python wiki parser" turns up this code, which strips and replaces the markup (see the source code at the link for details).
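Since the linked code is not reproduced here, the following is only a rough sketch of the strip-and-replace idea using the standard library's `re` and `html` modules. It is not the linked parser. The function name `strip_wiki_markup` and the rules it applies are assumptions based on the example in the question: templates are dropped entirely, piped and external links keep only their label, and HTML entities are decoded first in case the downloaded text contains them.

```python
import html
import re


def strip_wiki_markup(text: str) -> str:
    """Best-effort removal of common wiki markup (hypothetical helper)."""
    # Decode HTML entities such as &lt; and &gt; so tags can be matched.
    text = html.unescape(text)
    # Drop redirect directives, e.g. "#REDIRECT[[Blah]]".
    text = re.sub(r"(?im)^#REDIRECT\s*\[\[[^\]]*\]\]\s*$", "", text)
    # Drop templates "{{ ... }}", innermost first so nesting is handled.
    template = re.compile(r"\{\{[^{}]*\}\}")
    while template.search(text):
        text = template.sub("", text)
    # Remove HTML-style tags but keep the text between them.
    text = re.sub(r"</?[A-Za-z][^<>]*>", "", text)
    # "[[http://example.com label]]" -> "label" (assumed behaviour)
    text = re.sub(r"\[\[\s*https?://\S+\s+([^\]]*)\]\]", r"\1", text)
    # "[[target|label]]" -> "label" (assumed behaviour)
    text = re.sub(r"\[\[[^\]|]*\|([^\]]*)\]\]", r"\1", text)
    # "[[target]]" -> "target"
    text = re.sub(r"\[\[([^\]]*)\]\]", r"\1", text)
    # Single "[brackets]" -> "brackets"
    text = re.sub(r"\[([^\[\]]*)\]", r"\1", text)
    # Collapse runs of blank lines left behind by the removals.
    text = re.sub(r"\n\s*\n+", "\n\n", text)
    return text.strip()


if __name__ == "__main__":
    print(strip_wiki_markup("Why was [[this page]] even created?"))
```

Regular expressions cannot handle every corner of wiki syntax (tables, nested links, parser functions), so a dedicated parser is probably more robust for real pages. One option is the third-party mwparserfromhell package, where `mwparserfromhell.parse(text).strip_code()` returns the text with markup removed.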
