Unicode encoding error in Python rdflib output

Posted 2024-10-02 10:29:45

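I am using rdflib to parse CommonCrawl microdata, which comes as one large N-Quads file. Everything works except the final stage, saving to a CSV file or printing to the terminal, which fails with an encoding error.

My current code is: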

import csv
import rdflib
from rdflib import ConjunctiveGraph, URIRef, Namespace, RDF, BNode

g = rdflib.ConjunctiveGraph()
g.parse("nquads.nquads", format="nquads")

with open('list.csv', 'wb') as csvfile:
    csvwriter = csv.writer(csvfile, delimiter=',',
                            quotechar='"', quoting=csv.QUOTE_MINIMAL)
    csvwriter.writerow(['URL'])

    # Each quad is (subject, predicate, object, context)
    for ctx in g.quads():
        s = ctx[3]
        s = s[s.index("<") + 1:s.rindex(">")]  # Gets URL between < and >
        csvwriter.writerow([s])

This runs for about 1000 rows, then breaks at some point.

The error is:

[traceback not preserved]

So far I have tried several approaches:

 s = ctx[3].toPython()
 s = ctx[3].value()
 s = str(ctx[3])
 s = ctx[3].encode('utf-8')
 s = ctx[3].encode('utf-8', 'ignore')

and so on.
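Since the traceback is missing, it may help to find out which characters are actually causing trouble before choosing a fix. The sketch below assumes Python 3 and assumes the failure comes from a non-ASCII character inside one of the URLs; `non_ascii_chars` and the sample URL are made up for illustration and are not part of rdflib.

```python
def non_ascii_chars(text):
    """Return (index, character) pairs for every non-ASCII character in text."""
    return [(i, ch) for i, ch in enumerate(text) if ord(ch) > 127]


# Made-up URL containing one non-ASCII character (e with an acute accent).
url = "http://example.com/caf\u00e9.html"

# Show where the non-ASCII characters sit in the string.
print(non_ascii_chars(url))

# Encoding to ASCII fails on such text, while UTF-8 can represent it.
try:
    url.encode("ascii")
except UnicodeEncodeError as exc:
    print("ASCII encoding failed:", exc)
print(url.encode("utf-8"))
```

Text held in a Python string has no encoding of its own; an encoding only comes into play when the string is turned into bytes for a file or a terminal.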

The ctx[3] data looks like this:

<http://www.serenabakessimplyfromscratch.com/2014/07/blueberry-cinnamon-swirl-crumb.html> a rdfg:Graph;rdflib:storage [a rdflib:Store;rdfs:label 'IOMemory'].
<http://www.seriouseats.com/recipes/2009/01/meat-lite-warm-winter-salad.html?ref=excerpt_readmore> a rdfg:Graph;rdflib:storage [a rdflib:Store;rdfs:label 'IOMemory'].
<http://www.grouprecipes.com/103118/broccoli-rice-casserole.html> a rdfg:Graph;rdflib:storage [a rdflib:Store;rdfs:label 'IOMemory'].
<http://www.grouprecipes.com/67612/asian-chicken-noodle-soup.html> a rdfg:Graph;rdflib:storage [a rdflib:Store;rdfs:label 'IOMemory'].
<http://www.grouprecipes.com/113715/bouillabaisse-style-fish-stew.html> a rdfg:Graph;rdflib:storage [a rdflib:Store;rdfs:label 'IOMemory'].
<http://www.drinksmixer.com/drink15xy188.html> a rdfg:Graph;rdflib:storage [a rdflib:Store;rdfs:label 'IOMemory'].

The code above works in many cases, correctly extracting the URL and writing it to the CSV, but it inevitably breaks on some of the data.
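The sample lines look like the printed form of an rdflib Graph rather than a bare URL, which suggests that ctx[3] is the context graph of each quad. If that is the case, the graph's name is available as its `identifier` attribute, so slicing the text between < and > may not be needed. A minimal sketch, assuming Python 3; `context_url` and `FakeGraph` are hypothetical names used only here, and `FakeGraph` stands in for a real rdflib Graph so the snippet runs without rdflib installed.

```python
def context_url(ctx):
    """Return the name of a quad's context as a plain text string.

    Assumption: ctx is either an rdflib Graph, whose name is available as
    ctx.identifier, or already a URIRef/str holding the URL.
    """
    identifier = getattr(ctx, "identifier", ctx)
    return str(identifier)


class FakeGraph:
    """Hypothetical stand-in for an rdflib Graph, used only for this demo."""

    def __init__(self, identifier):
        self.identifier = identifier


print(context_url(FakeGraph("http://www.drinksmixer.com/drink15xy188.html")))
```

Inside the loop from the question this would be called as `context_url(ctx[3])`.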

How do I correctly get the text content out of rdflib? How can I find out what encoding it is in? Is there another way to output the text content?
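One possible way to output the text, sketched under the assumption that the script runs on Python 3: open the CSV file in text mode with an explicit UTF-8 encoding and let the csv module write ordinary strings. The question's code opens the file with 'wb', which is the Python 2 idiom. `write_urls` is a hypothetical helper, not something taken from the question or from rdflib.

```python
import csv


def write_urls(path, urls):
    """Write one URL per row to a UTF-8 encoded CSV file with a header row."""
    # newline="" lets the csv module control line endings itself;
    # encoding="utf-8" makes non-ASCII characters encodable on write.
    with open(path, "w", newline="", encoding="utf-8") as csvfile:
        writer = csv.writer(csvfile, delimiter=",",
                            quotechar='"', quoting=csv.QUOTE_MINIMAL)
        writer.writerow(["URL"])
        for url in urls:
            writer.writerow([url])
```

A call such as `write_urls("list.csv", urls)` would replace the `with open(...)` block in the question, where `urls` is any iterable of strings taken from the quads.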
