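In the article http://blog.csdn.net/jj380382856/article/details/51603818 we analyzed how the SolrJ source code handles an index update: it ends by sending a /update request to Solr. Below we continue with how Solr processes that request once it arrives.
1. The request is first intercepted by SolrDispatchFilter, which runs its doFilter method.
2. That method calls Action result = call.call();, which enters HttpSolrCall.call(). This method calls the class's init() method, whose main job is to resolve the SolrRequestHandler object for the current request from the servlet and solrconfig configuration. init() calls extractHandlerFromURLPath(parser);, shown below.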
private void extractHandlerFromURLPath(SolrRequestParsers parser) throws Exception {
  if (handler == null && path.length() > 1) { // don't match "" or "/" as valid path
    handler = core.getRequestHandler(path);
    // ...
  }
After this code runs, handler holds the request handler registered for the /update path.
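To make the lookup step concrete, here is a minimal sketch of a path-to-handler registry. It is only an illustration: HandlerLookupSketch, Handler, register and resolve are hypothetical names, not Solr classes, and the sketch keeps only the path-length guard from extractHandlerFromURLPath.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these types are Solr classes.
final class HandlerLookupSketch {
    interface Handler {
        String name();
    }

    private final Map<String, Handler> handlers = new HashMap<>();

    void register(String path, Handler handler) {
        handlers.put(path, handler);
    }

    // Same guard as in the excerpt: "" and "/" are never treated as handler paths.
    Handler resolve(String path) {
        if (path == null || path.length() <= 1) {
            return null;
        }
        return handlers.get(path);
    }

    public static void main(String[] args) {
        HandlerLookupSketch registry = new HandlerLookupSketch();
        registry.register("/update", () -> "update-handler");
        System.out.println(registry.resolve("/update").name());
        System.out.println(registry.resolve("/"));
    }
}
```

An unknown path simply yields null here; what the real code does when no handler matches is part of the elided section and is not modelled.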
After init() completes, action is PROCESS and HttpSolrCall.call() continues with the code below. It mainly wraps the request; the key call is execute(solrRsp);
switch (action) {
  case ADMIN:
    handleAdminRequest();
    return RETURN;
  case REMOTEQUERY:
    remoteQuery(coreUrl + path, resp);
    return RETURN;
  case PROCESS:
    final Method reqMethod = Method.getMethod(req.getMethod());
    HttpCacheHeaderUtil.setCacheControlHeader(config, resp, reqMethod);
    // unless we have been explicitly told not to, do cache validation
    // if we fail cache validation, execute the query
    if (config.getHttpCachingConfig().isNever304() ||
        !HttpCacheHeaderUtil.doCacheHeaderValidation(solrReq, req, reqMethod, resp)) {
      SolrQueryResponse solrRsp = new SolrQueryResponse();
      /* even for HEAD requests, we need to execute the handler to
       * ensure we don't get an error (and to make sure the correct
       * QueryResponseWriter is selected and we get the correct
       * Content-Type)
       */
      SolrRequestInfo.setRequestInfo(new SolrRequestInfo(solrReq, solrRsp));
      execute(solrRsp); // key call
      HttpCacheHeaderUtil.checkHttpCachingVeto(solrRsp, resp, reqMethod);
      Iterator<Map.Entry<String, String>> headers = solrRsp.httpHeaders();
      while (headers.hasNext()) {
        Map.Entry<String, String> entry = headers.next();
        resp.addHeader(entry.getKey(), entry.getValue());
      }
      QueryResponseWriter responseWriter = core.getQueryResponseWriter(solrReq);
      if (invalidStates != null) solrReq.getContext().put(CloudSolrClient.STATE_VERSION, invalidStates);
      writeResponse(solrRsp, responseWriter, reqMethod);
    }
    return RETURN;
  default: return action;
}

protected void execute(SolrQueryResponse rsp) {
  // a custom filter could add more stuff to the request before passing it on.
  // for example: sreq.getContext().put( "HttpServletRequest", req );
  // used for logging query stats in SolrCore.execute()
  solrReq.getContext().put("webapp", req.getContextPath());
  solrReq.getCore().execute(handler, solrReq, rsp);
}
The SolrCore.execute() method called here looks like this:
public void execute(SolrRequestHandler handler, SolrQueryRequest req, SolrQueryResponse rsp) {
  if (handler==null) {
    String msg = "Null Request Handler '" +
        req.getParams().get(CommonParams.QT) + "'";
    if (log.isWarnEnabled()) log.warn(logid + msg + ":" + req);
    throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, msg);
  }
  preDecorateResponse(req, rsp);
  if (requestLog.isDebugEnabled() && rsp.getToLog().size() > 0) {
    // log request at debug in case something goes wrong and we aren't able to log later
    requestLog.debug(rsp.getToLogAsString(logid));
  }
  // TODO: this doesn't seem to be working correctly and causes problems with the example server and distrib (for example /spell)
  // if (req.getParams().getBool(ShardParams.IS_SHARD,false) && !(handler instanceof SearchHandler))
  //   throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,"isShard is only acceptable with search handlers");
  handler.handleRequest(req,rsp); // key call
  postDecorateResponse(handler, req, rsp);
  if (rsp.getToLog().size() > 0) {
    if (requestLog.isInfoEnabled()) {
      requestLog.info(rsp.getToLogAsString(logid));
    }
    if (log.isWarnEnabled() && slowQueryThresholdMillis >= 0) {
      final long qtime = (long) (req.getRequestTimer().getTime());
      if (qtime >= slowQueryThresholdMillis) {
        log.warn("slow: " + rsp.getToLogAsString(logid));
      }
    }
  }
}
The key call above is handler.handleRequest(req,rsp);, which invokes RequestHandlerBase.handleRequest. That method in turn calls the abstract method handleRequestBody, declared as:
public abstract void handleRequestBody( SolrQueryRequest req, SolrQueryResponse rsp ) throws Exception;
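A base class whose concrete handleRequest delegates to an abstract handleRequestBody is the template-method pattern. The sketch below shows that shape with hypothetical classes (BaseHandler, UpdateLikeHandler); the before/after bookkeeping and the error handling are my own simplification, not a copy of RequestHandlerBase.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the template-method shape; not Solr code.
final class TemplateMethodSketch {
    abstract static class BaseHandler {
        final List<String> trace = new ArrayList<>();

        // Fixed skeleton: shared work wraps the subclass-specific body.
        final void handleRequest(String request) {
            trace.add("before");
            try {
                handleRequestBody(request);
            } catch (Exception e) {
                // Assumption: failures are recorded rather than rethrown.
                trace.add("error:" + e.getMessage());
            }
            trace.add("after");
        }

        // Each concrete handler supplies only this part.
        abstract void handleRequestBody(String request) throws Exception;
    }

    static final class UpdateLikeHandler extends BaseHandler {
        @Override
        void handleRequestBody(String request) {
            trace.add("body:" + request);
        }
    }

    public static void main(String[] args) {
        UpdateLikeHandler handler = new UpdateLikeHandler();
        handler.handleRequest("add doc 1");
        System.out.println(handler.trace);
    }
}
```

The benefit of this layout is that every handler gets the same surrounding behaviour without repeating it, and callers such as SolrCore.execute() only need to know about handleRequest.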
ContentStreamHandlerBase implements this method as follows:
@Override
public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throws Exception {
  SolrParams params = req.getParams();
  UpdateRequestProcessorChain processorChain =
      req.getCore().getUpdateProcessorChain(params); // obtain the update processor chain
  UpdateRequestProcessor processor = processorChain.createProcessor(req, rsp);
  try {
    ContentStreamLoader documentLoader = newLoader(req, processor);
    Iterable<ContentStream> streams = req.getContentStreams();
    if (streams == null) {
      if (!RequestHandlerUtils.handleCommit(req, processor, params, false) && !RequestHandlerUtils.handleRollback(req, processor, params, false)) {
        throw new SolrException(SolrException.ErrorCode.BAD_REQUEST, "missing content stream");
      }
    } else {
      for (ContentStream stream : streams) {
        documentLoader.load(req, rsp, stream, processor);
      }
      // Perhaps commit from the parameters
      RequestHandlerUtils.handleCommit(req, processor, params, false);
      RequestHandlerUtils.handleRollback(req, processor, params, false);
    }
  } finally {
    // finish the request
    processor.finish();
  }
}
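The code above first obtains the update processor chain.
An update therefore passes through three stages: update logging (LogUpdateProcessor), distributed forwarding (DistributedUpdateProcessor), and the index update itself (RunUpdateProcessor).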
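The three stages can be pictured as a singly linked chain in which each stage holds a reference to the next one. Below is a minimal sketch of that idea under my own assumptions: Stage and build are hypothetical names, and the stage labels log, distributed and run merely stand in for the three processors.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of a linked processor chain; not Solr code.
final class ProcessorChainSketch {
    static final class Stage {
        private final String name;
        private final Stage next;
        private final List<String> visited;

        Stage(String name, Stage next, List<String> visited) {
            this.name = name;
            this.next = next;
            this.visited = visited;
        }

        // Record the visit, then hand the document to the next stage, if any.
        void processAdd(String doc) {
            visited.add(name + ":" + doc);
            if (next != null) {
                next.processAdd(doc);
            }
        }
    }

    // Build from the last stage backwards so each stage can be given its successor.
    static Stage build(List<String> names, List<String> visited) {
        Stage next = null;
        for (int i = names.size() - 1; i >= 0; i--) {
            next = new Stage(names.get(i), next, visited);
        }
        return next;
    }

    public static void main(String[] args) {
        List<String> visited = new ArrayList<>();
        Stage head = build(Arrays.asList("log", "distributed", "run"), visited);
        head.processAdd("doc1");
        System.out.println(visited);
    }
}
```

Calling processAdd on the head is enough to drive the whole chain, which is why the handler only ever holds the first processor.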
The method also contains the following loop, which loads each content stream of the request with the appropriate document loader:
for (ContentStream stream : streams) {
  documentLoader.load(req, rsp, stream, processor);
}
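Taking XML as an example, this calls XMLLoader's load method, which in turn calls XMLLoader's processUpdate method.
That method calls processAdd on the current processor, starting with LogUpdateProcessor. Its processAdd code is shown below: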
@Override
public void processAdd(AddUpdateCommand cmd) throws IOException {
  if (logDebug) { log.debug("PRE_UPDATE " + cmd.toString() + " " + req); }
  // call delegate first so we can log things like the version that get set later
  if (next != null) next.processAdd(cmd); // hand off to the next processor in the chain
  // Add a list of added id's to the response
  if (adds == null) {
    adds = new ArrayList<>();
    toLog.add("add",adds);
  }
  if (adds.size() < maxNumToLog) {
    long version = cmd.getVersion();
    String msg = cmd.getPrintableId();
    if (version != 0) msg = msg + " (" + version + ')';
    adds.add(msg);
  }
  numAdds++;
}
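The ordering in this method matters: it delegates first and logs afterwards, so the log entry can include a version assigned further down the chain, and it caps the number of logged ids while still counting every add. Here is a small sketch of just that behaviour, with hypothetical types (AddCommand, Next, LoggingStage) and a made-up version value.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of "delegate first, log afterwards"; not Solr code.
final class LogAfterDelegateSketch {
    static final class AddCommand {
        final String id;
        long version; // 0 means no version has been assigned yet

        AddCommand(String id) {
            this.id = id;
        }
    }

    interface Next {
        void processAdd(AddCommand cmd);
    }

    static final class LoggingStage {
        final List<String> adds = new ArrayList<>();
        int numAdds;
        private final Next next;
        private final int maxNumToLog;

        LoggingStage(Next next, int maxNumToLog) {
            this.next = next;
            this.maxNumToLog = maxNumToLog;
        }

        void processAdd(AddCommand cmd) {
            // Delegate first so a version set downstream is visible below.
            if (next != null) {
                next.processAdd(cmd);
            }
            // Only the first maxNumToLog ids are kept for the log line.
            if (adds.size() < maxNumToLog) {
                String msg = cmd.id;
                if (cmd.version != 0) {
                    msg = msg + " (" + cmd.version + ')';
                }
                adds.add(msg);
            }
            numAdds++; // every add is counted, logged or not
        }
    }

    public static void main(String[] args) {
        LoggingStage stage = new LoggingStage(cmd -> { cmd.version = 42L; }, 2);
        stage.processAdd(new AddCommand("a"));
        stage.processAdd(new AddCommand("b"));
        stage.processAdd(new AddCommand("c"));
        System.out.println(stage.adds + " total=" + stage.numAdds);
    }
}
```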
Because this is not SolrCloud, DistributedUpdateProcessor does almost nothing here, so processing moves on to the next processor, RunUpdateProcessor:
@Override
public void processAdd(AddUpdateCommand cmd) throws IOException {
  if (AtomicUpdateDocumentMerger.isAtomicUpdate(cmd)) {
    throw new SolrException
      (SolrException.ErrorCode.BAD_REQUEST,
       "RunUpdateProcessor has received an AddUpdateCommand containing a document that appears to still contain Atomic document update operations, most likely because DistributedUpdateProcessorFactory was explicitly disabled from this updateRequestProcessorChain");
  }
  updateHandler.addDoc(cmd); // key call
  super.processAdd(cmd);
  changesSinceCommit = true;
}
This addDoc call ends up in DirectUpdateHandler2's addDoc0 method, shown below:
private int addDoc0(AddUpdateCommand cmd) throws IOException {
  int rc = -1;
  RefCounted<IndexWriter> iw = solrCoreState.getIndexWriter(core);
  try {
    IndexWriter writer = iw.get();
    addCommands.incrementAndGet();
    addCommandsCumulative.incrementAndGet();
    // if there is no ID field, don't overwrite
    if (idField == null) {
      cmd.overwrite = false;
    }
    try {
      IndexSchema schema = cmd.getReq().getSchema();
      if (cmd.overwrite) {
        // Check for delete by query commands newer (i.e. reordered). This
        // should always be null on a leader
        List<UpdateLog.DBQ> deletesAfter = null;
        if (ulog != null && cmd.version > 0) {
          deletesAfter = ulog.getDBQNewer(cmd.version);
        }
        if (deletesAfter != null) {
          log.info("Reordered DBQs detected. Update=" + cmd + " DBQs="
              + deletesAfter);
          List<Query> dbqList = new ArrayList<>(deletesAfter.size());
          for (UpdateLog.DBQ dbq : deletesAfter) {
            try {
              DeleteUpdateCommand tmpDel = new DeleteUpdateCommand(cmd.req);
              tmpDel.query = dbq.q;
              tmpDel.version = -dbq.version;
              dbqList.add(getQuery(tmpDel));
            } catch (Exception e) {
              log.error("Exception parsing reordered query : " + dbq, e);
            }
          }
          addAndDelete(cmd, dbqList);
        } else {
          // normal update
          Term updateTerm;
          Term idTerm = new Term(cmd.isBlock() ? "_root_" : idField.getName(), cmd.getIndexedId());
          boolean del = false;
          if (cmd.updateTerm == null) {
            updateTerm = idTerm;
          } else {
            // this is only used by the dedup update processor
            del = true;
            updateTerm = cmd.updateTerm;
          }
          if (cmd.isBlock()) {
            writer.updateDocuments(updateTerm, cmd);
          } else {
            Document luceneDocument = cmd.getLuceneDocument();
            // SolrCore.verbose("updateDocument",updateTerm,luceneDocument,writer);
            writer.updateDocument(updateTerm, luceneDocument); // calls updateDocument on Lucene's IndexWriter
          }
          // ... (remainder of addDoc0 not shown)
As you can see, this is where Solr finally talks to Lucene, through IndexWriter. The core call is writer.updateDocument(updateTerm, luceneDocument);
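Lucene documents updateDocument(term, doc) as deleting any documents that contain the term and then adding the new document. The toy model below mimics only those semantics with an in-memory list, so that the effect of using the id term as updateTerm is easy to see. UpdateByTermSketch and its methods are hypothetical and have nothing to do with how Lucene stores data.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy in-memory model of update-by-term semantics; not Lucene code.
final class UpdateByTermSketch {
    private final List<Map<String, String>> docs = new ArrayList<>();

    // Remove every document whose field equals value, then add the new document.
    void updateDocument(String field, String value, Map<String, String> doc) {
        docs.removeIf(existing -> value.equals(existing.get(field)));
        docs.add(doc);
    }

    List<Map<String, String>> docs() {
        return docs;
    }

    // Helper for building a two-field document.
    static Map<String, String> doc(String id, String title) {
        Map<String, String> d = new HashMap<>();
        d.put("id", id);
        d.put("title", title);
        return d;
    }

    public static void main(String[] args) {
        UpdateByTermSketch index = new UpdateByTermSketch();
        index.updateDocument("id", "1", doc("1", "first version"));
        index.updateDocument("id", "1", doc("1", "second version"));
        index.updateDocument("id", "2", doc("2", "another doc"));
        System.out.println(index.docs().size());
    }
}
```

Because the term is built from the unique id field, re-sending a document with the same id replaces the old copy instead of creating a duplicate.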
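For the sake of rich functionality and extensibility, Solr uses a great many design patterns, which can make the code dizzying to follow.
After RunUpdateProcessor finishes, control returns to the LogUpdateProcessor code shown earlier,
which writes the log entry and does some cleanup. At that point the insertion of one document is complete. A lot is involved in this process; in a later post I will walk through indexWriter.updateDocument() in detail.
If anything here is wrong, corrections are welcome. Thanks.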