The same SPARQL query gives different results

1 answer

Answer #1 · posted 2024-06-01 12:33:30

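When this question was written, there were several possible problems. Based on the comments, the first issue described here (about lang, langMatches, and so on) seems to be the one you actually ran into, but I'll leave the descriptions of the other possible problems in case someone else finds them useful.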

`lang`, `langMatches`, and the empty string

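lang is defined to return "" for literals that have no language tag. RFC 4647 §2.1 defines a basic language range, which shares the syntax of a language tag, as follows: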

2.1. Basic Language Range
A "basic language range" has the same syntax as an [RFC3066] language tag or is the single character "*". The basic language range was originally described by HTTP/1.1 [RFC2616] and later [RFC3066]. It is defined by the following ABNF [RFC4234]:
language-range   = (1*8ALPHA *("-" 1*8alphanum)) / "*"
alphanum         = ALPHA / DIGIT
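To make the grammar concrete, here is a minimal sketch that transcribes the ABNF above into a Python regular expression. The names `LANGUAGE_RANGE` and `is_basic_language_range` are made up for this illustration and do not come from any SPARQL library.

```python
import re

# Transcription of: language-range = (1*8ALPHA *("-" 1*8alphanum)) / "*"
LANGUAGE_RANGE = re.compile(r"(?:[A-Za-z]{1,8}(?:-[A-Za-z0-9]{1,8})*|\*)")


def is_basic_language_range(text):
    """Return True if text fits the basic language range grammar."""
    return LANGUAGE_RANGE.fullmatch(text) is not None


for candidate in ["en", "de-DE-1996", "*", ""]:
    print(repr(candidate), is_basic_language_range(candidate))
```

The empty string is rejected because the grammar requires at least one letter (or the single character "*").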

This means that "" is not actually a legal language tag. As Jeen Broekstra pointed out on answers.semanticweb.com, the SPARQL recommendation says:

17.2 Filter Evaluation
SPARQL provides a subset of the functions and operators defined by XQuery Operator Mapping. XQuery 1.0 section 2.2.3 Expression Processing describes the invocation of XPath functions. The following rules accommodate the differences in the data and execution models between XQuery and SPARQL: …
Functions invoked with an argument of the wrong type will produce a type error. Effective boolean value arguments (labeled "xsd:boolean (EBV)" in the operator mapping table below), are coerced to xsd:boolean using the EBV rules in section 17.2.2.

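Since "" is not a legal language tag, it could be considered "an argument of the wrong type [that] will produce a type error." In that case the langMatches call would produce an error, and inside a filter expression that error is treated as false. Even if it doesn't return false for that reason, RFC 4647 §3.3.1, which describes how language tags and ranges are compared, doesn't say exactly what should happen in this comparison, because it assumes legal language tags: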

Basic filtering compares basic language ranges to language tags. Each basic language range in the language priority list is considered in turn, according to priority. A language range matches a particular language tag if, in a case-insensitive comparison, it exactly equals the tag, or if it exactly equals a prefix of the tag such that the first character following the prefix is "-". For example, the language-range "de-de" (German as used in Germany) matches the language tag "de-DE-1996" (German as used in Germany, orthography of 1996), but not the language tags "de-Deva" (German as written in the Devanagari script) or "de-Latn-DE" (German, Latin script, as used in Germany).
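The comparison rule quoted above can be sketched in a few lines of Python. This is a literal reading of the RFC text, not the implementation used by any particular SPARQL engine; the function name `basic_filter_matches` is hypothetical.

```python
def basic_filter_matches(language_range, language_tag):
    """Literal reading of RFC 4647 basic filtering for one range and one tag."""
    rng = language_range.lower()
    tag = language_tag.lower()
    if rng == "*":
        # In the RFC, "*" matches any tag. SPARQL's langMatches additionally
        # requires the tag to be non-empty; that refinement is not modeled here.
        return True
    # Match if equal, or if the range is a prefix followed directly by "-".
    return tag == rng or tag.startswith(rng + "-")


print(basic_filter_matches("de-de", "de-DE-1996"))  # match
print(basic_filter_matches("de-de", "de-Deva"))     # no match
print(basic_filter_matches("", ""))                 # the undefined case
```

Under this naive reading, an empty range compared with an empty tag counts as an exact match. An engine that first validates its arguments could just as defensibly raise an error instead, which is why implementations can disagree.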

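Based on your comments and my local experiments, it appears that langMatches(lang(?obj),"") for literals without a language tag returns true in Virtuoso (as installed on DBpedia), Jena's ARQ (from my experiments), and Protégé (from our experiments), while it returns false (or an error that is coerced to false) in RDFlib.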

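In either case, since lang is defined to return "" for literals without a language tag, you should be able to include them in the results reliably by replacing the langMatches(lang(?obj),"") test with lang(?obj) = "".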
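As a rough illustration of why the `lang(?obj) = ""` form is portable, the sketch below mimics the filter `lang(?obj) = "" || langMatches(lang(?obj), "en")` in plain Python. The literals are invented sample data, and `passes_filter` is a hypothetical helper, not part of any SPARQL engine.

```python
def passes_filter(lang_tag):
    """Mimic: FILTER( lang(?obj) = "" || langMatches(lang(?obj), "en") )."""
    tag = lang_tag.lower()
    # Plain string equality for untagged literals: no language-range
    # semantics are involved, so engines have nothing to disagree about.
    return tag == "" or tag == "en" or tag.startswith("en-")


# Invented sample data: (lexical form, language tag); "" means untagged.
literals = [
    ("Johann Sebastian Bach", ""),
    ("Johann Sebastian Bach", "en"),
    ("Jean-Sébastien Bach", "fr"),
    ("J. S. Bach", "en-GB"),
]

kept = [(text, tag) for text, tag in literals if passes_filter(tag)]
print(kept)
```

The untagged literal is kept by an ordinary equality test, so the result does not depend on how an engine treats an empty language range.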

Issues with the data you're using

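You aren't querying the same data. The data you downloaded from

http://dbpedia.org/resource/Johann_Sebastian_Bach

comes from DBpedia, but when you run a query against

http://live.dbpedia.org/sparql

you're running it against DBpedia Live, which may have different data. Running this query on the DBpedia Live endpoint and on the DBpedia endpoint gives a different number of results: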

SELECT count(*) WHERE { 
  dbpedia:Johann_Sebastian_Bach ?pred ?obj
  FILTER( langMatches(lang(?obj), "")  || langMatches(lang(?obj), "EN" ) )
}

DBpedia Live results: 31
DBpedia results: 34
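One way to reproduce such a comparison is to send the same query to both endpoints over the SPARQL protocol. The sketch below uses only the Python standard library. Assumptions: the main DBpedia endpoint URL (`http://dbpedia.org/sparql`) is not given in the text above, the query is rewritten with an explicit `PREFIX` and `(COUNT(*) AS ?n)` so that it is standard SPARQL 1.1, and both endpoints are assumed to honor a JSON `Accept` header. The counts returned today may well differ from the ones reported here.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

QUERY = """
PREFIX dbpedia: <http://dbpedia.org/resource/>
SELECT (COUNT(*) AS ?n) WHERE {
  dbpedia:Johann_Sebastian_Bach ?pred ?obj
  FILTER( langMatches(lang(?obj), "") || langMatches(lang(?obj), "EN") )
}
"""

ENDPOINTS = {
    "DBpedia": "http://dbpedia.org/sparql",  # assumed URL, not from the text
    "DBpedia Live": "http://live.dbpedia.org/sparql",
}


def build_request(endpoint, query):
    """Build a SPARQL protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def count_results(endpoint, query):
    """Send the query and read the single ?n binding from the JSON results."""
    with urlopen(build_request(endpoint, query), timeout=30) as response:
        data = json.load(response)
    return int(data["results"]["bindings"][0]["n"]["value"])


# Usage (performs network requests, so it is left commented out):
# for name, endpoint in ENDPOINTS.items():
#     print(name, count_results(endpoint, QUERY))
```

If the two numbers differ, the difference comes from the datasets behind the endpoints rather than from the query itself.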

Issues with `distinct`

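Another possible problem is that the second query has a distinct modifier and the first does not, which means the second query could return fewer results than the first.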
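The effect of distinct can be pictured with ordinary Python collections: it only changes the count when the result set contains duplicate rows. The rows below are invented for illustration and are not actual DBpedia results.

```python
# Invented (predicate, object) rows; the first two are exact duplicates.
rows_with_duplicates = [
    ("rdfs:label", "Johann Sebastian Bach"),
    ("rdfs:label", "Johann Sebastian Bach"),
    ("foaf:name", "Bach"),
]
rows_without_duplicates = [
    ("rdfs:label", "Johann Sebastian Bach"),
    ("foaf:name", "Bach"),
]


def count_plain(rows):
    """Row count without DISTINCT: every solution is counted."""
    return len(rows)


def count_distinct(rows):
    """Row count with DISTINCT: identical solutions collapse into one."""
    return len(set(rows))


print(count_plain(rows_with_duplicates), count_distinct(rows_with_duplicates))
print(count_plain(rows_without_duplicates), count_distinct(rows_without_duplicates))
```

When no row repeats, both counts agree, which is consistent with getting the same number with or without the modifier.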

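If you run this query against the DBpedia SPARQL endpoint, you should get 34 results whether or not you use the distinct modifier, and that is the number you should also get when you download the data and run the same query on it.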

^{pr2}$

SPARQL results
