[Other] RDF and SPARQL

RDF(Resource Description Framework)

RDF online validator

RDF (Resource Description Framework) is a W3C standard for describing web resources.

<?xml version="1.0"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:si="http://www.runoob.com/rdf/">
<rdf:Description rdf:about="http://www.runoob.com">
  <si:title>runoob.com</si:title>
  <si:author>Jan Egil Refsnes</si:author>
</rdf:Description>
</rdf:RDF>

Definition: an RDF statement is a triple <s, p, o>

  • s: URIs (incl. rdf:type) and Blank nodes
  • p: URIs (incl. rdf:type)
  • o: URIs (incl. rdf:type), Blank nodes, and Literals
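
The position rules above can be sketched in Python. This is only an illustration, not a real RDF library: the `URI`, `BlankNode`, `Literal`, and `make_triple` names are hypothetical helpers introduced for this example, and the predicate IRI is an assumption built from the `si` namespace in the snippet above.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class URI:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str


@dataclass(frozen=True)
class Literal:
    text: str


Term = Union[URI, BlankNode, Literal]


def make_triple(s: Term, p: Term, o: Term) -> tuple:
    """Return (s, p, o) after checking which kinds of term each position allows."""
    if not isinstance(s, (URI, BlankNode)):
        raise TypeError("subject must be a URI or a blank node")
    if not isinstance(p, URI):
        raise TypeError("predicate must be a URI")
    if not isinstance(o, (URI, BlankNode, Literal)):
        raise TypeError("object must be a URI, a blank node, or a literal")
    return (s, p, o)


# Assumed example: the title statement from the runoob.com description.
triple = make_triple(
    URI("http://www.runoob.com"),
    URI("http://www.runoob.com/rdf/title"),
    Literal("runoob.com"),
)
```

A literal in the subject position, such as `make_triple(Literal("x"), ...)`, raises `TypeError` in this sketch.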

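I. Elements

<rdf:RDF> is the root element of an RDF document. It declares the XML document to be an RDF document and also carries the reference to the RDF namespace.

<rdf:Description> identifies a resource through its about attribute and contains the elements that describe that resource.

Example: RDF defines only the framework. Elements such as artist must be defined by someone else, here in the cd namespace.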

<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:cd="http://www.recshop.fake/cd#">

<rdf:Description rdf:about="http://www.recshop.fake/cd/Empire Burlesque">
  <cd:artist>Bob Dylan</cd:artist>
  ...
</rdf:Description>

</rdf:RDF>

A property value can also be another resource, referenced through the rdf:resource attribute:

<rdf:Description rdf:about="http://www.recshop.fake/cd/Empire Burlesque">
  <cd:artist rdf:resource="http://www.recshop.fake/cd/dylan" />
  ...
</rdf:Description>
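
To see how these elements turn into triples, here is a rough sketch using Python's standard `xml.etree.ElementTree`. It is not a full RDF/XML parser: it only handles `rdf:Description` elements with an `rdf:about` attribute and direct property children, and `extract_triples` and the `DOC` sample (which combines the two snippets above) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Sample document combining the literal and the rdf:resource forms shown above.
DOC = """<?xml version="1.0"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:cd="http://www.recshop.fake/cd#">
<rdf:Description rdf:about="http://www.recshop.fake/cd/Empire Burlesque">
  <cd:artist>Bob Dylan</cd:artist>
  <cd:artist rdf:resource="http://www.recshop.fake/cd/dylan" />
</rdf:Description>
</rdf:RDF>"""


def extract_triples(xml_text):
    """Return (subject, predicate, object, kind) tuples for a small RDF/XML subset."""
    root = ET.fromstring(xml_text)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree writes qualified names as "{namespace}local".
            namespace, local = prop.tag[1:].split("}")
            predicate = namespace + local
            resource = prop.get(f"{{{RDF_NS}}}resource")
            if resource is not None:
                triples.append((subject, predicate, resource, "resource"))
            else:
                triples.append((subject, predicate, (prop.text or "").strip(), "literal"))
    return triples


for row in extract_triples(DOC):
    print(row)
```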

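II. RDF serialization formats

With RDF's data model and term types in place, how do we create an RDF dataset and serialize it, that is, how do we store and transmit RDF data? The main RDF serialization formats today are RDF/XML, N-Triples, Turtle, RDFa, and JSON-LD.

  • Turtle is probably the most widely used RDF serialization. It is more compact than RDF/XML and more readable than N-Triples.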
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix ex: <http://www.cs.man.ac.uk/> .
ex:sattler
    foaf:title "Dr." ;
    foaf:knows ex:bparsia ;
    foaf:knows
    [
        foaf:title "Count" ;
        foaf:lastName "Dracula"
    ] .
  • JSON-LD, short for “JSON for Linking Data”, stores RDF data as key-value pairs.
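
As an illustration of the key-value style, the Turtle example above could be written as a JSON-LD document roughly like this. The sketch builds it with Python's `json` module; the exact layout (a compact `@context` with only the `foaf` prefix, full IRIs in `@id`) is one possible choice made for this example, not the only valid one.

```python
import json

# The same data as the Turtle example: ex:sattler knows ex:bparsia and an
# anonymous node (no "@id", so it acts as a blank node).
doc = {
    "@context": {"foaf": "http://xmlns.com/foaf/0.1/"},
    "@id": "http://www.cs.man.ac.uk/sattler",
    "foaf:title": "Dr.",
    "foaf:knows": [
        {"@id": "http://www.cs.man.ac.uk/bparsia"},
        {"foaf:title": "Count", "foaf:lastName": "Dracula"},
    ],
}

text = json.dumps(doc, indent=2)
print(text)
```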

III. RDF Schema

RDF Schema does not provide application-specific classes and properties itself; instead it provides a framework for describing them.

  • rdfs:subClassOf
  • rdfs:subPropertyOf
  • rdfs:domain
  • rdfs:range
<?xml version="1.0"?>

<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xml:base="http://www.animals.fake/animals#">

    <rdfs:Class rdf:ID="animal" />
    <rdfs:Class rdf:ID="horse">
      <rdfs:subClassOf rdf:resource="#animal"/>
    </rdfs:Class>

</rdf:RDF>
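
What `rdfs:subClassOf` buys you is inheritance along the class hierarchy. The sketch below follows subclass links transitively in plain Python. The `superclasses` helper is hypothetical, and the `#pony` class is an invented addition to the animal and horse classes from the schema above, there only to show a two-step chain.

```python
def superclasses(cls, direct):
    """All classes reachable from cls by following rdfs:subClassOf links."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in direct.get(current, ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


# Direct rdfs:subClassOf links; "#pony" is an invented extra class.
direct = {
    "#horse": {"#animal"},
    "#pony": {"#horse"},
}

print(superclasses("#pony", direct))
```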

IV. Advanced RDF usage

Reference: RDF

1. Reification

The RDF reification vocabulary consists of rdf:Statement, rdf:subject, rdf:predicate, and rdf:object.

Take a simple triple: <ex:a> <ex:b> <ex:c> .

Its reification is written as:

_:xxx rdf:type rdf:Statement .
_:xxx rdf:subject <ex:a> .
_:xxx rdf:predicate <ex:b> .
_:xxx rdf:object <ex:c> .

Example:

:Tolkien :wrote :LordOfTheRings .
# A separate resource can stand for the statement itself, so that you can say further things about the statement, for example add "Wikipedia said that"
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
_:x rdf:type rdf:Statement .
_:x rdf:subject :Tolkien .
_:x rdf:predicate :wrote .
_:x rdf:object :LordOfTheRings .
_:x :said :Wikipedia .
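
The expansion is mechanical, so it can be sketched as a small Python function. `reify` is a hypothetical helper, triples are plain tuples of strings, and the blank-node label `_:x` is just the default chosen here.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def reify(triple, node="_:x"):
    """Return the four reification triples that describe `triple` through `node`."""
    s, p, o = triple
    return [
        (node, RDF + "type", RDF + "Statement"),
        (node, RDF + "subject", s),
        (node, RDF + "predicate", p),
        (node, RDF + "object", o),
    ]


statement = (":Tolkien", ":wrote", ":LordOfTheRings")
reified = reify(statement)

# Any further triple about the statement uses the same node.
provenance = ("_:x", ":said", ":Wikipedia")

for t in reified + [provenance]:
    print(t)
```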

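V. SPARQL

References: Wikidata Query Service user manual; Wikidata:SPARQL tutorial

SPARQL is a recursive acronym for SPARQL Protocol and RDF Query Language. It is designed for accessing and manipulating RDF data and is one of the core technologies of the Semantic Web.

Some SPARQL keywords

  • SELECT: specifies the variables to return.
  • WHERE: specifies the graph pattern to match. Its role is analogous to WHERE in SQL.
  • FROM: specifies the RDF dataset to query.
  • PREFIX: declares abbreviations for IRIs.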

1. Wikidata

Subject: Q30 (United States); predicate: P36 (capital); object: Q61 (Washington, D.C.)

wd:Q30  wdt:P36  wd:Q61 .

Prefixes

PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wds: <http://www.wikidata.org/entity/statement/>
PREFIX wdv: <http://www.wikidata.org/value/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX bd: <http://www.bigdata.com/rdf#>

SELECT ?s ?desc WHERE {
  ?s wdt:P279 wd:Q7725634 .
  OPTIONAL {
     ?s rdfs:label ?desc filter (lang(?desc) = "en").
   }
 }
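
A query like this is normally sent to a SPARQL endpoint over HTTP. The sketch below only builds the request URL with the standard library and does not perform any network call. It assumes the public Wikidata endpoint at `https://query.wikidata.org/sparql` and its `query` and `format` parameters; `build_request_url` is a hypothetical helper, and the query text is lightly reformatted from the one above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumed endpoint of the public Wikidata Query Service.
ENDPOINT = "https://query.wikidata.org/sparql"

QUERY = """PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?s ?desc WHERE {
  ?s wdt:P279 wd:Q7725634 .
  OPTIONAL { ?s rdfs:label ?desc FILTER (lang(?desc) = "en") }
}"""


def build_request_url(query, endpoint=ENDPOINT):
    """Percent-encode the query and ask for JSON results."""
    return endpoint + "?" + urlencode({"query": query, "format": "json"})


url = build_request_url(QUERY)

# Decoding the URL again gives back the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(url)
```

The resulting URL could then be fetched with any HTTP client; public endpoints usually expect a descriptive User-Agent header.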

2. Basic usage

Patterns take the subject, predicate, object form: SPO (Subject, Predicate, Object), also known as a semantic triple.

SELECT ?a ?b ?c
WHERE
{
  x y ?a.
  m n ?b.
  ?b f ?c.
}
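
To make the matching concrete, here is a toy evaluator for such a pattern list in Python. It is a simplified sketch of basic graph pattern matching, not how a real SPARQL engine is implemented; `match`, `solve`, and the sample graph values are assumptions for illustration. Terms starting with `?` are variables, everything else must match exactly.

```python
def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`; return None if impossible."""
    new = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if p_term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if new.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None
    return new


def solve(patterns, graph):
    """Return every variable binding that satisfies all patterns at once."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for binding in solutions:
            for triple in graph:
                extended = match(pattern, triple, binding)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


# Invented sample data shaped like the abstract query above.
graph = [
    ("x", "y", "1"),
    ("m", "n", "k"),
    ("k", "f", "2"),
    ("k", "f", "3"),
]
patterns = [("x", "y", "?a"), ("m", "n", "?b"), ("?b", "f", "?c")]

for solution in solve(patterns, graph):
    print(solution)
```

Note how `?b` links the second and third patterns: only triples whose subject equals the value already bound to `?b` can satisfy the last pattern.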

Applied to Wikidata:

SELECT ?child
WHERE
{
# ?child has father Bach
  ?child wdt:P22 wd:Q1339.
}

3. Advanced usage

Multiple conditions

SELECT ?child ?childLabel
WHERE
{
  # P22: father, P25: mother
  ?child wdt:P22 wd:Q1339;
         wdt:P25 wd:Q57487.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
}

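Note the difference between ; and ,:

  • ; continues with the same subject and a new predicate-object pair: romeo loves juliet; kills romeo.
  • , continues with the same subject and predicate and a new object: romeo kills tybalt, romeo.

Nested queries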

SELECT ?grandChild ?grandChildLabel
WHERE
{
 # Bach has a child ?child, ?child has a child ?grandChild.
  wd:Q1339 wdt:P40 [ wdt:P40 ?grandChild ].
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
}

Filtering

SELECT ?book ?title
WHERE {
  ?book dc:title ?title .
  ?book inv:price ?price .
  FILTER ( ?price < 15 )
  ?book inv:quantity ?num .
  FILTER ( ?num > 0 )
}

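Note the difference between NOT IN and MINUS:

# MINUS: removes every ?person that has occupation Q1028181 at all
minus {?person wdt:P106 wd:Q1028181 .}
# NOT IN: removes only the solutions in which ?item is bound to Q1028181; the same person can still appear through another value
FILTER ( ?item not in ( wd:Q1028181 ) )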
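
The difference shows up as soon as one person has two occupations. The sketch below mimics both operators on a small list of (person, occupation) rows in Python. The people and the `wd:Q_other` identifier are invented for illustration, and the rows stand for the solutions of `?person wdt:P106 ?item`.

```python
TARGET = "wd:Q1028181"

# Invented solutions of "?person wdt:P106 ?item"; alice has two occupations.
rows = [
    ("alice", "wd:Q1028181"),
    ("alice", "wd:Q_other"),
    ("bob", "wd:Q_other"),
]

# FILTER (?item NOT IN (...)): drop only the rows whose ?item is the target.
not_in_rows = [(person, item) for person, item in rows if item != TARGET]

# MINUS { ?person wdt:P106 target }: drop every row of any person who has
# the target occupation at all.
excluded = {person for person, item in rows if item == TARGET}
minus_rows = [(person, item) for person, item in rows if person not in excluded]

print(sorted({person for person, _ in not_in_rows}))
print(sorted({person for person, _ in minus_rows}))
```

With NOT IN, alice survives through her other occupation; with MINUS, only bob remains.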

UNION and DISTINCT

# ?person is a student of [x], UNION [x] has student ?person; DISTINCT keeps each ?person only once
SELECT DISTINCT ?person
WHERE
{
  ?person wdt:P31 wd:Q5.
  {?person wdt:P1066 [wdt:P106 wd:Q1028181]}
  UNION {[wdt:P106 wd:Q1028181] wdt:P802 ?person}.
}

OPTIONAL

# OPTIONAL marks a pattern as optional: a solution is kept even if the pattern does not match
# data
person:a foaf:name "Alice" .
person:a foaf:nick "A-online" .
person:b foaf:name "Bob" .
# query 
SELECT ?name ?nick
{
    ?x foaf:name ?name . 
    OPTIONAL {?x foaf:nick ?nick }
}
#answer
?name   | ?nick
"Alice" | "A-online"
"Bob"   |  NULL

ASK: returns true/false

# query
ASK
{ 
	?x foaf:name ?name .
	OPTIONAL { ?x foaf:nick ?nick } 
}
# answer
true
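
The OPTIONAL and ASK examples above behave like a left join and an existence check. The sketch below reproduces both results in Python using the same Alice and Bob data. It is a simplification that assumes at most one nick per person, and `select_name_nick` and `ask` are hypothetical helpers.

```python
# The data from the example: only person:a has a nick.
names = {"person:a": "Alice", "person:b": "Bob"}
nicks = {"person:a": "A-online"}


def select_name_nick(names, nicks):
    """Left join: every name row is kept; the nick is None when unbound."""
    return [(name, nicks.get(x)) for x, name in names.items()]


def ask(names):
    """ASK is true as soon as the required pattern has at least one solution."""
    return len(names) > 0


rows = select_name_nick(names, nicks)
answer = ask(names)

print(rows)
print(answer)
```

The OPTIONAL part never affects the ASK result here, because only the required `foaf:name` pattern decides whether a solution exists.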

CONSTRUCT: returns a new RDF graph built from the template

CONSTRUCT { ?person vc:FN ?name }
WHERE
{
	?person foaf:name ?name .
}
# answer
person:a vc:FN "Alice" . 
person:b vc:FN "Bob" .

Instances and classes

  • instance of (P31)
  • subclass of (P279)
SELECT ?work ?workLabel
WHERE
{
  ?work wdt:P31 wd:Q838948. # instance of work of art
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
}

?item wdt:P31/wdt:P279* ?class.: This means that there’s one “instance of” and then any number of “subclass of” statements between the item and the class.

SELECT ?work ?workLabel
WHERE
{
  ?work wdt:P31/wdt:P279* wd:Q838948. # instance of any subclass of work of art
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
}

Note: property path operators work like regular expressions

  • ‘*’: zero or more
  • ‘+’: one or more
  • ‘|’: alternative (or)
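
The path `wdt:P31/wdt:P279*` can be read as "one instance-of step, then zero or more subclass-of steps". A small Python sketch of that evaluation follows. The class and item names are invented stand-ins rather than real Wikidata identifiers, and `subclass_closure` and `instances` are hypothetical helpers.

```python
def subclass_closure(target, subclass_of):
    """Classes c with c P279* target: the target itself plus all transitive subclasses."""
    closure = {target}
    changed = True
    while changed:
        changed = False
        for child, parent in subclass_of:
            if parent in closure and child not in closure:
                closure.add(child)
                changed = True
    return closure


def instances(target, instance_of, subclass_of):
    """Items i with i P31/P279* target."""
    classes = subclass_closure(target, subclass_of)
    return sorted(item for item, cls in instance_of if cls in classes)


# Invented sample data: (child class, parent class) and (item, class) pairs.
subclass_of = [("painting", "work_of_art"), ("fresco", "painting")]
instance_of = [
    ("MonaLisa", "painting"),
    ("SomeFresco", "fresco"),
    ("Bach", "human"),
]

print(instances("work_of_art", instance_of, subclass_of))
```

Because `*` allows zero steps, a direct instance of the target class would match as well; replacing it with `+` would require at least one subclass step.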

4. Examples

1. Canadian subjects with no English article in Wikipedia

#added before 2019-02

SELECT ?item ?itemLabel ?cnt WHERE {
{
  SELECT ?item (COUNT(?sitelink) AS ?cnt) WHERE { 
  ?item wdt:P27|wdt:P205|wdt:P17 wd:Q16 . #Canadian subjects.
  minus {?item wdt:P106 wd:Q488111 .} #Minus occupations that would be inappropriate in most situations.
  minus {?item wdt:P106 wd:Q3286043 .}
  minus {?item wdt:P106 wd:Q4610556 .}  
  ?sitelink schema:about ?item .
  FILTER NOT EXISTS {
    ?article schema:about ?item .
    ?article schema:isPartOf <https://en.wikipedia.org/> . #Targeting Wikipedia language where subjects has no article.
  }
  } GROUP BY ?item ORDER BY DESC (?cnt) LIMIT 1000 #Sorted by amount of articles in other languages. Result limited to 1000 lines to not have a timeout error.
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en,fr,es,de" }  #Service to resolve labels in (fallback) languages: automatic user language, English, French, Spanish, German.
} ORDER BY DESC (?cnt)