RDF(Resource Description Framework)
RDF (Resource Description Framework) is a W3C standard for describing web resources.
<?xml version="1.0"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:si="http://www.runoob.com/rdf/">
<rdf:Description rdf:about="http://www.runoob.com">
<si:title>runoob.com</si:title>
<si:author>Jan Egil Refsnes</si:author>
</rdf:Description>
</rdf:RDF>
Definition: a triple <s, p, o>
- s (subject): URIs (incl. rdf:type) and blank nodes
- p (predicate): URIs (incl. rdf:type)
- o (object): URIs (incl. rdf:type), blank nodes, and literals
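The position rules above can be sketched in a few lines of Python. This is only an illustration: the class names `IRI`, `BlankNode`, and `Literal` and the helper `is_valid_triple` are hypothetical and do not come from any RDF library, and the predicate IRI is assumed from the `si` namespace of the opening example.

```python
from typing import NamedTuple


class IRI(NamedTuple):
    value: str


class BlankNode(NamedTuple):
    label: str


class Literal(NamedTuple):
    text: str


def is_valid_triple(s, p, o) -> bool:
    """Check each term against the positions allowed by the definition."""
    return (
        isinstance(s, (IRI, BlankNode))               # subject: IRI or blank node
        and isinstance(p, IRI)                        # predicate: IRI only
        and isinstance(o, (IRI, BlankNode, Literal))  # object: any term
    )


page = IRI("http://www.runoob.com")
title = IRI("http://www.runoob.com/rdf/title")  # assumed expansion of si:title

ok = is_valid_triple(page, title, Literal("runoob.com"))
bad = is_valid_triple(Literal("runoob.com"), title, page)  # literal as subject
print(ok, bad)
```

The second call fails because a literal may only appear in the object position.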
I. Elements
<rdf:RDF>
is the root element of an RDF document. It defines the XML document as an RDF document and also contains the reference to the RDF namespace.
<rdf:Description>
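identifies a resource through its about attribute and contains the elements that describe that resource.
Example: RDF only defines the framework. Elements such as artist must be defined by someone else (here, in the cd namespace).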
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:cd="http://www.recshop.fake/cd#">
<rdf:Description rdf:about="http://www.recshop.fake/cd/Empire Burlesque">
<cd:artist>Bob Dylan</cd:artist>
...
</rdf:Description>
</rdf:RDF>
<rdf:Description rdf:about="http://www.recshop.fake/cd/Empire Burlesque">
<cd:artist rdf:resource="http://www.recshop.fake/cd/dylan" />
...
</rdf:Description>
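To see how the two `rdf:Description` forms above turn into triples, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. It only covers the subset of RDF/XML used in these examples (a `rdf:about` subject with either text content or `rdf:resource`), and the function name `triples_from_rdfxml` is made up for illustration. It is not a full RDF/XML parser.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

DOC = """<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:cd="http://www.recshop.fake/cd#">
  <rdf:Description rdf:about="http://www.recshop.fake/cd/Empire Burlesque">
    <cd:artist>Bob Dylan</cd:artist>
  </rdf:Description>
  <rdf:Description rdf:about="http://www.recshop.fake/cd/Empire Burlesque">
    <cd:artist rdf:resource="http://www.recshop.fake/cd/dylan" />
  </rdf:Description>
</rdf:RDF>"""


def triples_from_rdfxml(text):
    """Extract (subject, predicate, object) tuples from simple RDF/XML."""
    root = ET.fromstring(text)
    out = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree spells qualified names as "{namespace}local".
            predicate = prop.tag[1:].replace("}", "", 1)
            resource = prop.get(f"{{{RDF_NS}}}resource")
            # rdf:resource means the object is a resource; otherwise it is
            # the literal text. This sketch returns both as plain strings.
            obj = resource if resource is not None else (prop.text or "")
            out.append((subject, predicate, obj))
    return out


triples = triples_from_rdfxml(DOC)
for t in triples:
    print(t)
```

The first description yields a literal object ("Bob Dylan"), the second an object that is itself a resource that other statements can describe.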
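II. RDF serialization formats
With RDF's data model in place, how do we create an RDF dataset and serialize it? In other words, how do we store and transmit RDF data? The main RDF serialization formats today are RDF/XML, N-Triples, Turtle, RDFa, and JSON-LD.
- Turtle is probably the most widely used RDF serialization. It is more compact than RDF/XML and more readable than N-Triples.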
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix ex: <http://www.cs.man.ac.uk/> .
ex:sattler
    foaf:title "Dr." ;
    foaf:knows ex:bparsia ;
    foaf:knows
        [
            foaf:title "Count" ;
            foaf:lastName "Dracula"
        ] .
- JSON-LD, short for "JSON for Linking Data", stores RDF data as key-value pairs.
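As a rough illustration of the key-value style, the Turtle example about `ex:sattler` could be written as a JSON-LD document along the following lines. This sketch only builds and prints the JSON with Python's `json` module; it assumes compact IRIs with a prefix-style `@context` and does not run a JSON-LD processor.

```python
import json

doc = {
    # The context plays the role of the @prefix lines in Turtle.
    "@context": {
        "foaf": "http://xmlns.com/foaf/0.1/",
        "ex": "http://www.cs.man.ac.uk/",
    },
    "@id": "ex:sattler",
    "foaf:title": "Dr.",
    "foaf:knows": [
        {"@id": "ex:bparsia"},
        # A node object without "@id" corresponds to a blank node.
        {"foaf:title": "Count", "foaf:lastName": "Dracula"},
    ],
}

text = json.dumps(doc, indent=2)
print(text)
```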
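III. RDF Schema
RDF Schema does not provide application-specific classes and properties. Instead, it provides a framework for describing application-specific classes and properties.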
rdfs:subClassOf
rdfs:subPropertyOf
rdfs:domain
rdfs:range
<?xml version="1.0"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xml:base="http://www.animals.fake/animals#">
<rdfs:Class rdf:ID="animal" />
<rdfs:Class rdf:ID="horse">
<rdfs:subClassOf rdf:resource="#animal"/>
</rdfs:Class>
</rdf:RDF>
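Because `rdfs:subClassOf` is transitive, a class hierarchy like the one above implies more superclass relations than are written down. The following sketch computes them over plain tuples. The `pony` class is a hypothetical addition for illustration, and `superclasses` is not a library function.

```python
SUBCLASS_OF = "rdfs:subClassOf"

triples = {
    ("horse", SUBCLASS_OF, "animal"),  # from the schema above
    ("pony", SUBCLASS_OF, "horse"),    # hypothetical extra class
}


def superclasses(cls, triples):
    """Return every class reachable from cls via rdfs:subClassOf."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == SUBCLASS_OF and o not in found:
                found.add(o)
                stack.append(o)
    return found


print(sorted(superclasses("pony", triples)))
```

Here `pony` is inferred to be a subclass of `animal` even though no triple states it directly.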
IV. Advanced RDF usage
Reference: RDF
1. Reification
RDF reification vocabulary
rdf:Statement rdf:subject rdf:predicate rdf:object
Take a simple triple: <ex:a> <ex:b> <ex:c> .
The reification of this triple is written as:
_:xxx rdf:type rdf:Statement .
_:xxx rdf:subject <ex:a> .
_:xxx rdf:predicate <ex:b> .
_:xxx rdf:object <ex:c> .
Example:
:Tolkien :wrote :LordOfTheRings .
# A separate resource can stand for the statement itself, so you can state further things about the statement, for example adding "Wikipedia said that"
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
_:x rdf:type rdf:Statement .
_:x rdf:subject :Tolkien .
_:x rdf:predicate :wrote .
_:x rdf:object :LordOfTheRings .
_:x :said :Wikipedia .
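The reification pattern is mechanical, so it can be sketched as a small helper that turns one triple into the four `rdf:Statement` triples. The function name `reify` is hypothetical, and terms are kept as plain strings for brevity.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def reify(triple, node):
    """Describe `triple` with the RDF reification vocabulary, using `node`."""
    s, p, o = triple
    return [
        (node, RDF + "type", RDF + "Statement"),
        (node, RDF + "subject", s),
        (node, RDF + "predicate", p),
        (node, RDF + "object", o),
    ]


statement = reify((":Tolkien", ":wrote", ":LordOfTheRings"), "_:x")
# With a node standing for the statement, more can be said about it.
statement.append(("_:x", ":said", ":Wikipedia"))

for s, p, o in statement:
    print(s, p, o, ".")
```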
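V. SPARQL
References: the Wikidata Query Service user manual and Wikidata:SPARQL tutorial
SPARQL is a recursive acronym for SPARQL Protocol and RDF Query Language. It is designed for querying and manipulating RDF data and is one of the core technologies of the Semantic Web.
Some SPARQL keywords
- SELECT: specifies the variables to return.
- WHERE: specifies the graph pattern to match. It plays a role similar to WHERE in SQL.
- FROM: specifies the RDF dataset to query.
- PREFIX: declares abbreviations for IRIs.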
1. Wikidata
Subject: Q30 (United States); predicate: P36 (capital); object: Q61 (Washington, D.C.)
wd:Q30 wdt:P36 wd:Q61 .
Prefixes
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wds: <http://www.wikidata.org/entity/statement/>
PREFIX wdv: <http://www.wikidata.org/value/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX p: <http://www.wikidata.org/prop/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX bd: <http://www.bigdata.com/rdf#>
SELECT ?s ?desc WHERE {
?s wdt:P279 wd:Q7725634 .
OPTIONAL {
?s rdfs:label ?desc filter (lang(?desc) = "en").
}
}
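A query like the one above is normally sent to a SPARQL endpoint over HTTP. The sketch below only builds the GET URL with the standard library and does not send it. It assumes the public Wikidata endpoint `https://query.wikidata.org/sparql`, which accepts `query` and `format` parameters and predefines the common prefixes, so the PREFIX lines are omitted here. `build_url` is a made-up helper name.

```python
from urllib.parse import parse_qs, urlencode, urlparse

ENDPOINT = "https://query.wikidata.org/sparql"  # assumed public endpoint

QUERY = """SELECT ?s ?desc WHERE {
  ?s wdt:P279 wd:Q7725634 .
  OPTIONAL {
    ?s rdfs:label ?desc .
    FILTER (lang(?desc) = "en")
  }
}"""


def build_url(query):
    """Return a GET URL that asks the endpoint for JSON results."""
    return ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


url = build_url(QUERY)
print(url)

# Decoding the URL gives back the original query text.
decoded = parse_qs(urlparse(url).query)["query"][0]
```

Fetching the URL (for example with `urllib.request`) is left out so the example stays offline.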
2. Basic usage
Statements take the subject, predicate, object form: SPO (Subject, Predicate, Object), also known as a semantic triple.
SELECT ?a ?b ?c
WHERE
{
x y ?a.
m n ?b.
?b f ?c.
}
Applied to Wikidata:
SELECT ?child
WHERE
{
# ?child has father Bach
?child wdt:P22 wd:Q1339.
}
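To make the idea of matching triple patterns concrete, here is a toy matcher in Python. It is a sketch of the concept only, not how a real SPARQL engine works. The data uses hypothetical `ex:child1` and `ex:child2` identifiers rather than real Wikidata items, and `match` is an illustrative name.

```python
def match(patterns, triples, binding=None):
    """Yield every variable binding that satisfies all triple patterns."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        new = dict(binding)
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # A variable binds on first use and must agree afterwards.
                if new.setdefault(term, value) != value:
                    break
            elif term != value:
                break  # a constant term must match exactly
        else:
            yield from match(rest, triples, new)


data = [
    ("ex:child1", "wdt:P22", "wd:Q1339"),   # hypothetical child, father Q1339
    ("ex:child2", "wdt:P22", "wd:Q1339"),
    ("ex:child1", "wdt:P25", "wd:Q57487"),  # hypothetical mother statement
]

children = sorted(b["?child"] for b in match([("?child", "wdt:P22", "wd:Q1339")], data))
print(children)

both = [
    b["?child"]
    for b in match(
        [("?child", "wdt:P22", "wd:Q1339"), ("?child", "wdt:P25", "wd:Q57487")],
        data,
    )
]
print(both)
```

Adding a second pattern that shares `?child` narrows the result, because the same variable must take the same value in every pattern.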
3. Advanced usage
Multiple conditions
SELECT ?child ?childLabel
WHERE
{
# P22: father, P25: mother
?child wdt:P22 wd:Q1339;
wdt:P25 wd:Q57487.
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
}
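Note the difference between ; and ,:
- ; keeps the subject and starts a new predicate and object: romeo loves juliet; kills romeo.
- , keeps the subject and predicate and adds another object: romeo kills tybalt, romeo.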
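Both abbreviations are shorthand for full triples. The small sketch below spells out what each one expands to; `expand` is a hypothetical helper that takes a subject and a list of (predicate, objects) pairs.

```python
def expand(subject, predicate_objects):
    """Expand (predicate, [objects]) pairs into full triples."""
    return [(subject, p, o) for p, objs in predicate_objects for o in objs]


# "romeo loves juliet; kills romeo." shares the subject only.
semicolon = expand("romeo", [("loves", ["juliet"]), ("kills", ["romeo"])])

# "romeo kills tybalt, romeo." shares the subject and the predicate.
comma = expand("romeo", [("kills", ["tybalt", "romeo"])])

print(semicolon)
print(comma)
```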
Nested queries
SELECT ?grandChild ?grandChildLabel
WHERE
{
# Bach has a child ?child, ?child has a child ?grandChild.
wd:Q1339 wdt:P40 [ wdt:P40 ?grandChild ].
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
}
Filtering
SELECT ?book ?title
WHERE {
?book dc:title ?title .
?book inv:price ?price .
FILTER ( ?price < 15 )
?book inv:quantity ?num .
FILTER ( ?num > 0 ) }
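Note the difference between NOT IN and MINUS:
# MINUS: ?person must not have occupation Q1028181 at all
minus {?person wdt:P106 wd:Q1028181 .}
# NOT IN: the matched value is not Q1028181, so a person's occupation may be something else
FILTER ( ?item not in ( wd:Q1028181 ) )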
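The difference is easiest to see on a tiny data set. The sketch below imitates both forms with Python sets, assuming the NOT IN filter is applied to the occupation value of each (person, occupation) row. All identifiers except `wd:Q1028181` are hypothetical.

```python
EXCLUDED = "wd:Q1028181"

# (person, occupation) rows, as produced by "?person wdt:P106 ?occupation"
rows = [
    ("ex:anna", EXCLUDED),
    ("ex:anna", "ex:otherJob"),
    ("ex:ben", "ex:otherJob"),
    ("ex:cleo", EXCLUDED),
]

people = {person for person, _ in rows}

# MINUS: drop every person who has the excluded occupation at all.
minus_result = people - {person for person, occ in rows if occ == EXCLUDED}

# FILTER NOT IN: drop only the rows whose occupation is the excluded one;
# a person with another occupation still appears.
not_in_result = {person for person, occ in rows if occ not in (EXCLUDED,)}

print(sorted(minus_result))
print(sorted(not_in_result))
```

`ex:anna` survives the filter because one of her rows has a different occupation, but MINUS removes her entirely.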
UNION and DISTINCT
# ?person is a student of [x], UNION [x] has ?person as a student; DISTINCT makes each ?person appear only once
SELECT DISTINCT ?person
WHERE
{
?person wdt:P31 wd:Q5.
{?person wdt:P1066 [wdt:P106 wd:Q1028181]}
UNION {[wdt:P106 wd:Q1028181] wdt:P802 ?person}.
}
OPTIONAL
# OPTIONAL marks a pattern as optional: solutions without a match for it are still returned
# data
person:a foaf:name "Alice" .
person:a foaf:nick "A-online" .
person:b foaf:name "Bob" .
# query
SELECT ?name ?nick
{
?x foaf:name ?name .
OPTIONAL {?x foaf:nick ?nick }
}
#answer
?name | ?nick
"Alice" | "A-online"
"Bob" | NULL
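OPTIONAL behaves much like a left join. The sketch below reproduces the result table with plain dictionaries, using `None` where the variable stays unbound, and contrasts it with what the query would return if the nick pattern were required.

```python
names = {"person:a": "Alice", "person:b": "Bob"}
nicks = {"person:a": "A-online"}

# OPTIONAL: keep every ?name; ?nick is None when there is no match.
optional_rows = [(name, nicks.get(x)) for x, name in names.items()]

# Without OPTIONAL: both patterns must match, so Bob disappears.
required_rows = [(name, nicks[x]) for x, name in names.items() if x in nicks]

print(optional_rows)
print(required_rows)
```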
ASK: returns true/false
# query
ASK
{
?x foaf:name ?name .
OPTIONAL { ?x foaf:nick ?nick }
}
# answer
true
CONSTRUCT: returns an RDF graph built from a template
CONSTRUCT { ?person vc:FN ?name }
WHERE
{
?person foaf:name ?name .
}
# answer
person:a vc:FN "Alice" .
person:b vc:FN "Bob" .
Instances and classes
- instance of (P31)
- subclass of (P279).
SELECT ?work ?workLabel
WHERE
{
?work wdt:P31 wd:Q838948. # instance of work of art
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
}
?item wdt:P31/wdt:P279* ?class.
: This means that there’s one “instance of” and then any number of “subclass of” statements between the item and the class.
SELECT ?work ?workLabel
WHERE
{
?work wdt:P31/wdt:P279* wd:Q838948. # instance of any subclass of work of art
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE]". }
}
Note: property path operators work like regular expressions
- '*': zero or more
- '+': one or more
- '|': alternative (or)
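The path `wdt:P31/wdt:P279*` can be read as "one instance-of step, then any number of subclass-of steps". The following sketch evaluates that path over a toy graph. The `ex:` items are hypothetical stand-ins for Wikidata entities, and `instances_of` is an illustrative helper, not part of any SPARQL library.

```python
def instances_of(cls, triples):
    """Items matching: ?item wdt:P31/wdt:P279* cls."""
    # wdt:P279*: cls itself plus every transitive subclass of it.
    classes = {cls}
    changed = True
    while changed:
        changed = False
        for s, p, o in triples:
            if p == "wdt:P279" and o in classes and s not in classes:
                classes.add(s)
                changed = True
    # wdt:P31: one instance-of step into any of those classes.
    return {s for s, p, o in triples if p == "wdt:P31" and o in classes}


graph = [
    ("ex:painting", "wdt:P279", "wd:Q838948"),      # subclass of work of art
    ("ex:oilPainting", "wdt:P279", "ex:painting"),  # subclass of a subclass
    ("ex:work1", "wdt:P31", "wd:Q838948"),
    ("ex:work2", "wdt:P31", "ex:painting"),
    ("ex:work3", "wdt:P31", "ex:oilPainting"),
]

with_path = instances_of("wd:Q838948", graph)
direct_only = {s for s, p, o in graph if p == "wdt:P31" and o == "wd:Q838948"}

print(sorted(with_path))
print(sorted(direct_only))
```

A plain `wdt:P31` pattern finds only `ex:work1`, while the path also reaches instances of subclasses.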
4. Examples
1.Canadian subjects with no English article in Wikipedia
#added before 2019-02
SELECT ?item ?itemLabel ?cnt WHERE {
{
SELECT ?item (COUNT(?sitelink) AS ?cnt) WHERE {
?item wdt:P27|wdt:P205|wdt:P17 wd:Q16 . #Canadian subjects.
minus {?item wdt:P106 wd:Q488111 .} #Minus occupations that would be inappropriate in most situations.
minus {?item wdt:P106 wd:Q3286043 .}
minus {?item wdt:P106 wd:Q4610556 .}
?sitelink schema:about ?item .
FILTER NOT EXISTS {
?article schema:about ?item .
?article schema:isPartOf <https://en.wikipedia.org/> . #Targeting Wikipedia language where subjects has no article.
}
} GROUP BY ?item ORDER BY DESC (?cnt) LIMIT 1000 #Sorted by amount of articles in other languages. Result limited to 1000 lines to not have a timeout error.
}
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en,fr,es,de" } #Service to resolve labels in (fallback) languages: automatic user language, English, French, Spanish, German.
} ORDER BY DESC (?cnt)