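I have studied SPARQL theory and practice before, but never systematically enough, so here is a fresh summary:
1 Tool environment: Sesame-workbench, used to test the correctness of SPARQL statements directly.
2 Coding environment: Jena, used to test the related application code.
3 Test data: the content below, saved as a UTF-8 encoded file named test.rdf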
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
xmlns:c="http://s.opencalais.com/1/pred/"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about="http://d.opencalais.com/dochash-1/788f2f97-5b54-3202-9db8-61d1834a3f77">
<c:資料類型>a327a4b0-8c6a-b911-3ef0-54c84c22723b</c:資料類型>
<c:externalID>http://www.deri.ie中國</c:externalID>
<c:id>http://id.opencalais.com/k5Mp3gJaH21KD339XKs8PQ</c:id>
<rdf:type rdf:resource="http://s.opencalais.com/1/type/sys/DocInfo"/>
<c:document></c:document>
<c:docTitle></c:docTitle>
<c:docDate>2013-05-29 07:48:08.424</c:docDate>
</rdf:Description>
<rdf:Description rdf:about="http://d.opencalais.com/dochash-1/788f2f97-5b54-3202-9db8-61d1834a3222">
<c:資料類型>a327a4b0-8c6a-b911-3ef0-54c84c227222</c:資料類型>
<c:externalID>http://www.deri.com</c:externalID>
<c:id>http://id.opencalais.com/k5Mp3gJaH21KD339XsssssKs8PQ</c:id>
<rdf:type rdf:resource="http://s.opencalais.com/1/type/sys/DocInfo"/>
<c:document></c:document>
<c:docTitle></c:docTitle>
<c:docDate>2013-05-29 07:48:18.424</c:docDate>
</rdf:Description>
<rdf:Description rdf:about="http://d.opencalais.com/dochash-1/788f2f97-5b54-3202-9db8-61d1834a3f77/meta">
<c:contentType>text/html</c:contentType>
<c:emVer>7.1.1103.5</c:emVer>
<c:langIdVer>DefaultLangId</c:langIdVer>
<c:language>InputTextTooShort</c:language>
<c:processingVer>CalaisJob01</c:processingVer>
<c:submissionDate>2013-05-29 07:48:08.268</c:submissionDate>
<rdf:type rdf:resource="http://s.opencalais.com/1/type/sys/DocInfoMeta"/>
<c:docId rdf:resource="http://d.opencalais.com/dochash-1/788f2f97-5b54-3202-9db8-61d1834a3f77"/>
<c:submitterCode>d98c1dd4-008f-04b2-e980-0998ecf8427e</c:submitterCode>
<c:signature>digestalg-1|AILQiyYhIcdMscM1psx1oyF9Kpg=|dkCpZQrN
+zzh7aOln8p9IRHA7p5hSnFbV2cGGM56f3fkDRod4cl9Ew==</c:signature>
</rdf:Description>
<rdf:Description rdf:about="http://d.opencalais.com/dochash-1/788f2f97-5b54-3202-9db8-61d1834a3f77/lid/DefaultLangId">
<rdf:type rdf:resource="http://s.opencalais.com/1/type/lid/DefaultLangId"/>
<c:docId rdf:resource="http://d.opencalais.com/dochash-1/788f2f97-5b54-3202-9db8-61d1834a3f77"/>
<c:lang rdf:resource="http://d.opencalais.com/lid/DefaultLangId/InputTextTooShort"/>
</rdf:Description>
</rdf:RDF>
4 Test queries:
(1) Query a group of related values (all attached to the same subject, i.e. ?x):
PREFIX c:<http://s.opencalais.com/1/pred/>
PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?externalID ?docDate ?id
WHERE
{
?x c:externalID ?externalID .
?x c:docDate ?docDate .
?x c:id ?id .}
Result: two records, one per document. The same query with a different subject variable in each pattern:
PREFIX c:<http://s.opencalais.com/1/pred/>
PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?externalID ?docDate ?id
WHERE
{
?x c:externalID ?externalID .
?y c:docDate ?docDate .
?z c:id ?id .}
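Result: every combination of the matched values (2 × 2 × 2 = 8 rows), because the three patterns no longer share a subject.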
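The difference between a shared subject variable and three separate ones can be sketched in plain Java, without any RDF library. This is only an illustration under assumptions: the class `JoinSketch`, its methods, and the short subject names `doc1` and `doc2` are made up for this example, and the nested loops stand in for what a real SPARQL engine does far more efficiently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class JoinSketch {
    // Each triple is {subject, predicate, object}.
    // "doc1" and "doc2" abbreviate the two DocInfo URIs of the test data;
    // the Unicode escapes spell the two Chinese characters in the first externalID.
    static final List<String[]> TRIPLES = Arrays.asList(
        new String[] {"doc1", "c:externalID", "http://www.deri.ie\u4e2d\u570b"},
        new String[] {"doc1", "c:docDate", "2013-05-29 07:48:08.424"},
        new String[] {"doc1", "c:id", "http://id.opencalais.com/k5Mp3gJaH21KD339XKs8PQ"},
        new String[] {"doc2", "c:externalID", "http://www.deri.com"},
        new String[] {"doc2", "c:docDate", "2013-05-29 07:48:18.424"},
        new String[] {"doc2", "c:id", "http://id.opencalais.com/k5Mp3gJaH21KD339XsssssKs8PQ"});

    // All triples with the given predicate, like one triple pattern "?s <predicate> ?o".
    static List<String[]> match(String predicate) {
        List<String[]> out = new ArrayList<>();
        for (String[] t : TRIPLES) {
            if (t[1].equals(predicate)) {
                out.add(t);
            }
        }
        return out;
    }

    // sameSubject = true mimics "?x ... ?x ... ?x"; false mimics "?x ... ?y ... ?z".
    static List<String> solve(boolean sameSubject) {
        List<String> rows = new ArrayList<>();
        for (String[] a : match("c:externalID")) {
            for (String[] b : match("c:docDate")) {
                for (String[] c : match("c:id")) {
                    boolean joined = a[0].equals(b[0]) && b[0].equals(c[0]);
                    if (!sameSubject || joined) {
                        rows.add(a[2] + " | " + b[2] + " | " + c[2]);
                    }
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(solve(true).size());   // shared subject variable
        System.out.println(solve(false).size());  // independent subject variables
    }
}
```

With the six triples above, `solve(true)` yields 2 rows and `solve(false)` yields 8, matching the two query results.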
PREFIX c:<http://s.opencalais.com/1/pred/>
PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT *
WHERE
{
?subject ?predicate ?object .}
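The result is every triple in the data.
(4) Query for the subject that holds a given object, i.e. the subject a piece of content belongs to: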
PREFIX c:<http://s.opencalais.com/1/pred/>
SELECT ?x
WHERE
{
?x c:processingVer ?processingVer}
Result:
X |
---|
<http://d.opencalais.com/dochash-1/788f2f97-5b54-3202-9db8-61d1834a3f77/meta> |
PREFIX c:<http://s.opencalais.com/1/pred/>
PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
construct {?x rdf:type ?externalID} where {?x c:externalID ?externalID}
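Result: the query builds one rdf:type triple per c:externalID value (two triples for this data). Restricting the first query with a regular-expression filter: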
PREFIX c:<http://s.opencalais.com/1/pred/>
PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?externalID ?docDate ?id
WHERE
{
?x c:externalID ?externalID .
?x c:docDate ?docDate .
?x c:id ?id .
FILTER regex(?externalID, "com", "i") }
Result:
ExternalID | DocDate | Id |
---|---|---|
"http://www.deri.com" | "2013-05-29 07:48:18.424" | "http://id.opencalais.com/k5Mp3gJaH21KD339XsssssKs8PQ" |
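SPARQL's `regex` tests whether the pattern occurs anywhere in the value, and the `"i"` flag makes the test case-insensitive. A rough Java equivalent is sketched below, assuming a hypothetical helper class `RegexFilterSketch` and using `java.util.regex` in place of the SPARQL engine's own matcher (the two regex dialects are close but not identical).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class RegexFilterSketch {
    // Keeps the values in which the pattern occurs anywhere, like FILTER regex(?v, pattern, "i").
    static List<String> filter(List<String> values, String pattern, boolean ignoreCase) {
        Pattern p = Pattern.compile(pattern, ignoreCase ? Pattern.CASE_INSENSITIVE : 0);
        List<String> kept = new ArrayList<>();
        for (String v : values) {
            // find() looks for a substring match; matches() would require the whole value to match.
            if (p.matcher(v).find()) {
                kept.add(v);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // The two externalID values of the test data (Chinese characters written as escapes).
        List<String> externalIds = Arrays.asList(
            "http://www.deri.ie\u4e2d\u570b",
            "http://www.deri.com");
        System.out.println(filter(externalIds, "com", true));  // prints [http://www.deri.com]
    }
}
```

Only the second value contains "com", which is why the query returns a single row.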
[Before handling missing values (without OPTIONAL)]
PREFIX c:<http://s.opencalais.com/1/pred/>
PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?docId ?contentType
WHERE
{
?x c:docId ?docId .
?x c:contentType ?contentType .
}
Result:
DocId | ContentType |
---|---|
<http://d.opencalais.com/dochash-1/788f2f97-5b54-3202-9db8-61d1834a3f77> | "text/html" |
[After handling missing values (with OPTIONAL)]
PREFIX c:<http://s.opencalais.com/1/pred/>
PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?docId ?contentType
WHERE
{
?x c:docId ?docId .
OPTIONAL { ?x c:contentType ?contentType}
}
Result:
DocId | ContentType |
---|---|
<http://d.opencalais.com/dochash-1/788f2f97-5b54-3202-9db8-61d1834a3f77> | "text/html" |
<http://d.opencalais.com/dochash-1/788f2f97-5b54-3202-9db8-61d1834a3f77> | |
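OPTIONAL behaves like a left outer join: a subject that has a c:docId but no c:contentType still produces a row, with the optional column left empty. The sketch below imitates that in plain Java. It is a simplification under assumptions: the class `OptionalSketch` and its methods are hypothetical, each subject is assumed to carry at most one value per property, and `doc1` abbreviates the document URI.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class OptionalSketch {
    // subject -> c:docId value, for the two subjects that have a docId in the test data.
    static Map<String, String> sampleDocIds() {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("doc1/meta", "doc1");
        m.put("doc1/lid/DefaultLangId", "doc1");
        return m;
    }

    // subject -> c:contentType value; only the /meta subject has one.
    static Map<String, String> sampleContentTypes() {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("doc1/meta", "text/html");
        return m;
    }

    // optional = false: both patterns are required (inner join).
    // optional = true: contentType is optional (left outer join).
    static List<String> rows(Map<String, String> docIds,
                             Map<String, String> contentTypes,
                             boolean optional) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> e : docIds.entrySet()) {
            String contentType = contentTypes.get(e.getKey());
            if (contentType != null) {
                out.add(e.getValue() + " | " + contentType);
            } else if (optional) {
                out.add(e.getValue() + " | ");  // unbound optional variable
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(rows(sampleDocIds(), sampleContentTypes(), false).size());
        System.out.println(rows(sampleDocIds(), sampleContentTypes(), true).size());
    }
}
```

The required form gives one row and the optional form gives two, mirroring the two result tables.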
PREFIX c:<http://s.opencalais.com/1/pred/>
PREFIX rdf:<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?externalID ?docDate ?資料類型
WHERE
{
?x c:externalID ?externalID .
?x c:docDate ?docDate .
?x c:資料類型 ?資料類型 .}
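While running this query I found that the Sesame tool does not support ontologies with Chinese terms, so I implemented the query programmatically instead.
5 Test code:
The program reads the file and uses Jena's programming interface to run the SPARQL query, which also resolves the Chinese-character problem. Example: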
package RWRdfOwl;

import java.io.FileInputStream;
import java.io.InputStreamReader;

import com.hp.hpl.jena.ontology.OntModel;
import com.hp.hpl.jena.query.Query;
import com.hp.hpl.jena.query.QueryExecution;
import com.hp.hpl.jena.query.QueryExecutionFactory;
import com.hp.hpl.jena.query.QueryFactory;
import com.hp.hpl.jena.query.ResultSet;
import com.hp.hpl.jena.query.ResultSetFormatter;
import com.hp.hpl.jena.rdf.model.InfModel;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.reasoner.Reasoner;
import com.hp.hpl.jena.reasoner.ReasonerRegistry;

public class TestSPARQL {

    /**
     * Loads test.rdf and runs a SPARQL query that uses a Chinese predicate name.
     */
    public static void main(String[] args) {
        try {
            OntModel model = ModelFactory.createOntologyModel();
            FileInputStream file = new FileInputStream("G:\\wokspace\\java\\jena\\test.rdf");
            // Decode the file explicitly as UTF-8 so the Chinese terms are read correctly.
            InputStreamReader in = new InputStreamReader(file, "UTF-8");
            model.read(in, null);
            in.close();
            System.out.print("that is ok");
            System.out.println("Ontology loaded successfully");

            // Build the query; each part ends with a space so the parts do not run together.
            String prefix = "PREFIX c: <http://s.opencalais.com/1/pred/> "
                    + "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> ";
            String select = "SELECT ?externalID ?資料類型 ";
            String where = "WHERE { ?k c:externalID ?externalID . "
                    + "?k c:資料類型 ?資料類型 . }";
            Query query = QueryFactory.create(prefix + select + where);

            // Run the query against an inference model backed by the OWL reasoner.
            Reasoner reasoner = ReasonerRegistry.getOWLReasoner();
            InfModel inf = ModelFactory.createInfModel(reasoner, model);
            QueryExecution qe = QueryExecutionFactory.create(query, inf);
            ResultSet results = qe.execSelect();
            System.out.print(ResultSetFormatter.asText(results)); // print the results as a text table
            qe.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
The output is as follows:
-------------------------------------------------------------------
| externalID | 資料類型 |
===================================================================
| "http://www.deri.com" | "a327a4b0-8c6a-b911-3ef0-54c84c227222" |
| "http://www.deri.ie中國" | "a327a4b0-8c6a-b911-3ef0-54c84c22723b" |
-------------------------------------------------------------------
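Why the explicit "UTF-8" passed to `InputStreamReader` matters can be shown with the standard library alone. In this sketch the class `CharsetSketch` and its `decode` method are hypothetical, and the string is the predicate name 資料類型 written with Unicode escapes so that the source stays ASCII. It only illustrates byte decoding in general and makes no claim about how Jena or Sesame handle encodings internally.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetSketch {
    // Reads all characters from the bytes through an InputStreamReader with the given charset.
    static String decode(byte[] bytes, Charset charset) {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            int ch;
            while ((ch = reader.read()) != -1) {
                text.append((char) ch);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        String name = "\u8cc7\u6599\u985e\u578b";  // the Chinese predicate name from test.rdf
        byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);

        // Decoding with the charset the bytes were written in restores the name.
        System.out.println(decode(utf8, StandardCharsets.UTF_8).equals(name));       // prints true
        // Decoding with a single-byte charset turns each byte into a separate wrong character.
        System.out.println(decode(utf8, StandardCharsets.ISO_8859_1).equals(name));  // prints false
    }
}
```

Each of the four characters takes three bytes in UTF-8, so a single-byte decoder produces twelve unrelated characters instead of the predicate name.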