Efficiently storing triples

# <1>
<file:///home//uniprot/uniprot.rdf> <http://www.w3.org/2002/07/owl#imports> <http://purl.uniprot.org/core/> .
# <2>
<http://purl.uniprot.org/uniprot/Q6GZX4> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.uniprot.org/core/Protein> .
# <3>
<http://purl.uniprot.org/uniprot/Q6GZX4> <http://purl.uniprot.org/core/reviewed> "true"^^<http://www.w3.org/2001/XMLSchema#boolean> .

@fileuniprot: <file:///home//uniprot/>.
@owl: <http://www.w3.org/2002/07/owl#>.
@purlUniprot: <http://purl.uniprot.org/>.
@rdfs: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@xsd: <http://www.w3.org/2001/XMLSchema#>.
@xsd:
# <1>
fileuniprot:uniprot.rdf owl:imports purlUniprot:core .
# <2>
purlUniprot:uniprot/Q6GZX4 rdfs:type purlUniprot:core/Protein .
# <3>
purlUniprot:Q6GZX4 purlUniprot:core/reviewed "true"^^ xsd:boolean .

1 Answer

User

#1 · Posted 2024-10-04 09:31:58

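You might want to consider HDT, which gives very good compression. Alternatively, you can convert the UniProt files back to RDF/XML and compress them with gzip, which reduces the size at least 25x (bzip2 will give 30x). I would recommend pbzip2, a parallel implementation of bzip2, for best results.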
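To get a feel for how well plain gzip and bzip2 do on repetitive RDF text, here is a minimal Python sketch using only the standard library. The sample data is synthetic N-Triples modelled on the question's example (the accession numbers and the helper names `make_sample` and `compression_ratios` are made up for illustration), so the ratios it prints will not match the figures above for the real UniProt RDF/XML files.

```python
import bz2
import gzip


def make_sample(n: int = 2000) -> bytes:
    """Build synthetic N-Triples: two statements per hypothetical protein."""
    lines = []
    for i in range(n):
        # Made-up accession numbers, not real UniProt entries.
        subject = f"<http://purl.uniprot.org/uniprot/P{i:05d}>"
        lines.append(
            f"{subject} <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> "
            "<http://purl.uniprot.org/core/Protein> ."
        )
        lines.append(
            f"{subject} <http://purl.uniprot.org/core/reviewed> "
            '"true"^^<http://www.w3.org/2001/XMLSchema#boolean> .'
        )
    return ("\n".join(lines) + "\n").encode("utf-8")


def compression_ratios(data: bytes) -> dict[str, float]:
    """Return original size divided by compressed size for each codec."""
    return {
        "gzip": len(data) / len(gzip.compress(data)),
        "bzip2": len(data) / len(bz2.compress(data)),
    }


if __name__ == "__main__":
    for name, ratio in compression_ratios(make_sample()).items():
        print(f"{name}: {ratio:.1f}x smaller")
```

For a real file you would stream it through `gzip.open` or `bz2.open` rather than hold everything in memory; the in-memory version just keeps the sketch short.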

If you do want to get some compression from Turtle syntax, you can use the existing code in Sesame Rio, Jena RIOT, or rapper from librdf.

The question is: why did you take the files as N-Triples (.nt) to begin with?

The file format you are actually thinking of is called Turtle. N3 is Turtle plus rules, and that rules part is not used in the UniProt dataset; it falls outside plain RDF triples.

rapper -i ntriples -o turtle ~/uniprot.nt > ~/uniprot.ttl
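To make concrete what such a conversion does to the three example statements, here is a toy Python sketch of prefix abbreviation. It is not how rapper works internally, and it is not a real Turtle serializer: the prefix table is an assumption chosen for this example, and the local-name check is deliberately stricter than Turtle's grammar. Note that a prefixed name cannot contain an unescaped `/`, so the sketch uses separate prefixes for `http://purl.uniprot.org/core/` and `http://purl.uniprot.org/uniprot/` rather than one prefix for `http://purl.uniprot.org/`.

```python
import re

# Assumed prefix table; the prefix names are illustrative choices.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "up": "http://purl.uniprot.org/core/",
    "uniprot": "http://purl.uniprot.org/uniprot/",
}

# Match either a quoted literal (left untouched) or an <IRI> (group 1).
TOKEN = re.compile(r'"(?:[^"\\]|\\.)*"|<([^<>\s]*)>')
# Conservative local-name pattern: only a subset of what Turtle permits.
SAFE_LOCAL = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def abbreviate(match: re.Match) -> str:
    """Rewrite <IRI> as prefix:local when a known namespace fits."""
    iri = match.group(1)
    if iri is None:  # a string literal: never rewrite its contents
        return match.group(0)
    for prefix, namespace in PREFIXES.items():
        if iri.startswith(namespace):
            local = iri[len(namespace):]
            if SAFE_LOCAL.fullmatch(local):
                return f"{prefix}:{local}"
    return match.group(0)  # no safe abbreviation: keep the full IRI


def to_turtle(ntriples: str) -> str:
    """Emit @prefix declarations followed by the abbreviated statements."""
    header = [f"@prefix {p}: <{ns}> ." for p, ns in PREFIXES.items()]
    body = []
    for line in ntriples.splitlines():
        if not line.strip():
            continue
        # Comment lines are passed through unchanged.
        body.append(line if line.lstrip().startswith("#") else TOKEN.sub(abbreviate, line))
    return "\n".join(header + [""] + body) + "\n"


SAMPLE = """\
<file:///home//uniprot/uniprot.rdf> <http://www.w3.org/2002/07/owl#imports> <http://purl.uniprot.org/core/> .
<http://purl.uniprot.org/uniprot/Q6GZX4> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.uniprot.org/core/Protein> .
<http://purl.uniprot.org/uniprot/Q6GZX4> <http://purl.uniprot.org/core/reviewed> "true"^^<http://www.w3.org/2001/XMLSchema#boolean> .
"""

if __name__ == "__main__":
    print(to_turtle(SAMPLE))
```

Because every N-Triples statement is already valid Turtle, leaving an IRI unabbreviated is always safe, which is why the sketch falls back to the full form whenever it is unsure. For real data, use one of the tools named above; they also group statements by subject, which this sketch does not attempt.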

Forget N3; read up on Turtle.
