1. Dimension table join
Stream processing systems often need to interact with external systems, for example querying an external database to enrich a record with extra user information. The usual implementation sends the query for user a to the database and then waits for the result; until it comes back, we cannot send the query for user b. This is a synchronous access pattern, shown on the left of the figure below.
The brown bars in the figure represent waiting time: network wait time badly hurts both throughput and latency. To solve the problem of synchronous access, the asynchronous mode processes multiple requests and responses concurrently. That is, you can send the queries for users a, b, c, and so on back to back, and whichever response returns first gets processed first, so consecutive requests never block on each other, as shown on the right of the figure. This is exactly the principle behind Async I/O.
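To make the contrast concrete, here is a minimal JDK-only sketch (not Flink API code); queryAge is a hypothetical stand-in for a blocking database client, and the latencies are simulated:

import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

public class SyncVsAsyncLookup {

    // Hypothetical stand-in for a blocking database client call.
    static int queryAge(String userId) {
        try {
            Thread.sleep(100); // simulated network round trip
        } catch (InterruptedException ignored) {
        }
        return 22;
    }

    public static void main(String[] args) {
        List<String> users = Arrays.asList("user_a", "user_b", "user_c");

        // Synchronous: each lookup blocks until the previous one returns,
        // so total latency is roughly the sum of all round trips.
        for (String user : users) {
            System.out.println(user + " -> " + queryAge(user));
        }

        // Asynchronous: all lookups are in flight at the same time and each
        // response is handled as soon as it arrives, so total latency is
        // roughly one (the slowest) round trip.
        ExecutorService pool = Executors.newFixedThreadPool(users.size());
        List<CompletableFuture<Void>> pending = users.stream()
                .map(user -> CompletableFuture
                        .supplyAsync(() -> queryAge(user), pool)
                        .thenAccept(age -> System.out.println(user + " -> " + age)))
                .collect(Collectors.toList());
        pending.forEach(CompletableFuture::join);
        pool.shutdown();
    }
}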
2. RichMapFunction
Doing the dimension table join inside a RichMapFunction is the typical synchronous I/O approach: each request blocks until the previous one has returned, so it is unsuitable for high-concurrency scenarios.
2.1 Example
public static final class MapWithSiteInfoFunc
extends RichMapFunction<String, String> {
private static final Logger LOGGER = LoggerFactory.getLogger(MapWithSiteInfoFunc.class);
private static final long serialVersionUID = 1L;
private transient ScheduledExecutorService dbScheduler;
// Cache the dimension data locally to reduce the number of database requests
private Map<Integer, SiteAndCityInfo> siteInfoCache;
@Override
public void open(Configuration parameters) throws Exception {
super.open(parameters);
siteInfoCache = new ConcurrentHashMap<>(1024); // shared by the task thread and the refresh thread, so a thread-safe map is needed
// Use a scheduled thread to refresh the dimension data periodically
dbScheduler = new ScheduledThreadPoolExecutor(1, r -> {
Thread thread = new Thread(r, "site-info-update-thread");
thread.setUncaughtExceptionHandler((t, e) -> {
LOGGER.error("Thread " + t + " got uncaught exception: " + e);
});
return thread;
});
dbScheduler.scheduleWithFixedDelay(() -> {
try {
QueryRunner queryRunner = new QueryRunner(JdbcUtil.getDataSource());
List<Map<String, Object>> info = queryRunner.query(SITE_INFO_QUERY_SQL, new MapListHandler());
for (Map<String, Object> item : info) {
siteInfoCache.put((int) item.get("site_id"), new SiteAndCityInfo(
(int) item.get("site_id"),
(String) item.getOrDefault("site_name", ""),
(long) item.get("city_id"),
(String) item.getOrDefault("city_name", "")
));
}
LOGGER.info("Fetched {} site info records, {} records in cache", info.size(), siteInfoCache.size());
} catch (Exception e) {
LOGGER.error("Exception occurred when querying: " + e);
}
}, 0, 10 * 60, TimeUnit.SECONDS);
}
@Override
public String map(String value) throws Exception {
JSONObject json = JSON.parseObject(value);
int siteId = json.getInteger("site_id");
String siteName = "", cityName = "";
SiteAndCityInfo info = siteInfoCache.getOrDefault(siteId, null);
if (info != null) {
siteName = info.getSiteName();
cityName = info.getCityName();
}
json.put("site_name", siteName);
json.put("city_name", cityName);
return json.toJSONString();
}
@Override
public void close() throws Exception {
// Clear the cache and close the connection
siteInfoCache.clear();
ExecutorUtils.gracefulShutdown(10, TimeUnit.SECONDS, dbScheduler);
JdbcUtil.close();
super.close();
}
private static final String SITE_INFO_QUERY_SQL = "...";
}
3. Async I/O
Flink 1.2 introduced Async I/O to speed up Flink's interaction with external systems and raise throughput. The core of the design is a rework of the flow in which each processed record is sent to the downstream operator. The implementation has a producer part and a consumer part: the producer side introduces an AsyncWaitOperator, whose processElement/processWatermark methods start the dimension table lookup for each record and immediately store the still-pending Future object in a queue; the consumer side introduces an Emitter thread that keeps consuming completed entries from that queue and sends them to the downstream operator.
3.1 Example
In short, Async I/O maps onto the Flink API as the abstract class RichAsyncFunction. You extend it and implement open (initialization), asyncInvoke (the asynchronous call per record), and close (shutdown logic); asyncInvoke is the one that matters most. In the example below, Kafka serves as the stream table, holding user browsing events, and Elasticsearch serves as the dimension table, holding each user's age; Async I/O widens the browsing events.
Stream table: a user behavior log, where each record says that a user clicked or browsed a product at some instant. Hand-made test data; a sample record:
{"userID": "user_1", "eventTime": "2016-06-06 07:03:42", "eventType": "browse", "productID": 2}
Dimension table: basic user information. Hand-made test data stored in ES; a sample document:
GET dim_user/dim_user/user_1
{
"_index": "dim_user",
"_type": "dim_user",
"_id": "user_1",
"_version": 1,
"found": true,
"_source": {
"age": 22
}
}
Implementation:
public class FlinkAsyncIO {
public static void main(String[] args) throws Exception{
String kafkaBootstrapServers = "localhost:9092";
String kafkaGroupID = "async-test";
String kafkaAutoOffsetReset = "latest";
String kafkaTopic = "asyncio";
int kafkaParallelism = 2;
String esHost = "localhost";
Integer esPort = 9200;
String esUser = "";
String esPassword = "";
String esIndex = "dim_user";
String esType = "dim_user";
/** Flink DataStream execution environment */
Configuration config = new Configuration();
config.setInteger(RestOptions.PORT,8081);
config.setBoolean(ConfigConstants.LOCAL_START_WEBSERVER, true);
StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironmentWithWebUI(config);
/** Add the data source */
Properties kafkaProperties = new Properties();
kafkaProperties.put("bootstrap.servers",kafkaBootstrapServers);
kafkaProperties.put("group.id",kafkaGroupID);
kafkaProperties.put("auto.offset.reset",kafkaAutoOffsetReset);
FlinkKafkaConsumer010<String> kafkaConsumer = new FlinkKafkaConsumer010<>(kafkaTopic, new SimpleStringSchema(), kafkaProperties);
kafkaConsumer.setCommitOffsetsOnCheckpoints(true);
SingleOutputStreamOperator<String> source = env.addSource(kafkaConsumer).name("KafkaSource").setParallelism(kafkaParallelism);
// Data transformation
SingleOutputStreamOperator<Tuple4<String, String, String, Integer>> sourceMap = source.map((MapFunction<String, Tuple4<String, String, String, Integer>>) value -> {
Tuple4<String, String, String, Integer> output = new Tuple4<>();
try {
JSONObject obj = JSON.parseObject(value);
output.f0 = obj.getString("userID");
output.f1 = obj.getString("eventTime");
output.f2 = obj.getString("eventType");
output.f3 = obj.getInteger("productID");
} catch (Exception e) {
e.printStackTrace();
}
return output;
}).returns(new TypeHint<Tuple4<String, String, String, Integer>>(){}).name("Map: ExtractTransform");
// Filter out malformed records
SingleOutputStreamOperator<Tuple4<String, String, String, Integer>> sourceFilter = sourceMap.filter((FilterFunction<Tuple4<String, String, String, Integer>>) value -> value.f3 != null).name("Filter: FilterExceptionData");
// Timeout: by default, when an async I/O request times out, an exception is thrown and the job is restarted or stopped. To handle timeouts yourself, override AsyncFunction#timeout.
// Capacity: the maximum number of concurrent (in-flight) async requests
/** Async I/O join between the stream table and the dimension table */
SingleOutputStreamOperator<Tuple5<String, String, String, Integer, Integer>> result = AsyncDataStream.unorderedWait(sourceFilter, new ElasticsearchAsyncFunction(esHost,esPort,esUser,esPassword,esIndex,esType), 500, TimeUnit.MILLISECONDS, 10).name("Join: JoinWithDim");
/** Emit the results */
result.print().name("PrintToConsole");
env.execute();
}
}
ElasticsearchAsyncFunction:
public class ElasticsearchAsyncFunction extends RichAsyncFunction<Tuple4<String, String, String, Integer>, Tuple5<String, String, String, Integer, Integer>> {
private String host;
private Integer port;
private String user;
private String password;
private String index;
private String type;
public ElasticsearchAsyncFunction(String host, Integer port, String user, String password, String index, String type) {
this.host = host;
this.port = port;
this.user = user;
this.password = password;
this.index = index;
this.type = type;
}
private RestHighLevelClient restHighLevelClient;
private Cache<String, Integer> cache;
/**
* Establish the connection to ES
*
* @param parameters
*/
@Override
public void open(Configuration parameters) {
//ES Client
CredentialsProvider credentialsProvider = new BasicCredentialsProvider();
credentialsProvider.setCredentials(AuthScope.ANY, new UsernamePasswordCredentials(user, password));
restHighLevelClient = new RestHighLevelClient(
RestClient
.builder(new HttpHost(host, port))
.setHttpClientConfigCallback(httpAsyncClientBuilder -> httpAsyncClientBuilder.setDefaultCredentialsProvider(credentialsProvider)));
// Initialize the cache
cache = CacheBuilder.newBuilder().maximumSize(2).expireAfterAccess(5, TimeUnit.MINUTES).build();
}
/**
* Close the connection
*
* @throws Exception
*/
@Override
public void close() throws Exception {
restHighLevelClient.close();
}
/**
* Asynchronous invocation
*
* @param input
* @param resultFuture
*/
@Override
public void asyncInvoke(Tuple4<String, String, String, Integer> input, ResultFuture<Tuple5<String, String, String, Integer, Integer>> resultFuture) {
// 1. Try the cache first
Integer cachedValue = cache.getIfPresent(input.f0);
if (cachedValue != null) {
System.out.println("Fetched dimension data from cache: key=" + input.f0 + ",value=" + cachedValue);
resultFuture.complete(Collections.singleton(new Tuple5<>(input.f0, input.f1, input.f2, input.f3, cachedValue)));
// 2. On a cache miss, fetch from the external store
} else {
searchFromES(input, resultFuture);
}
}
/**
* When the cache has no entry, fetch from the external store (ES)
*
* @param input
* @param resultFuture
*/
private void searchFromES(Tuple4<String, String, String, Integer> input, ResultFuture<Tuple5<String, String, String, Integer, Integer>> resultFuture) {
// 1. Build the output record
Tuple5<String, String, String, Integer, Integer> output = new Tuple5<>();
output.f0 = input.f0;
output.f1 = input.f1;
output.f2 = input.f2;
output.f3 = input.f3;
// 2. The key to look up
String dimKey = input.f0;
// 3. Build an IDs query
SearchRequest searchRequest = new SearchRequest();
searchRequest.indices(index);
searchRequest.types(type);
searchRequest.source(SearchSourceBuilder.searchSource().query(QueryBuilders.idsQuery().addIds(dimKey)));
// 4. Query with the asynchronous client
restHighLevelClient.searchAsync(searchRequest, RequestOptions.DEFAULT, new ActionListener<SearchResponse>() {
// Handle a successful response
@Override
public void onResponse(SearchResponse searchResponse) {
SearchHit[] searchHits = searchResponse.getHits().getHits();
if (searchHits.length > 0) {
JSONObject obj = JSON.parseObject(searchHits[0].getSourceAsString());
Integer dimValue = obj.getInteger("age");
output.f4 = dimValue;
cache.put(dimKey, dimValue);
System.out.println("Put dimension data into cache: key=" + dimKey + ",value=" + dimValue);
}
resultFuture.complete(Collections.singleton(output));
}
// Handle a failed response
@Override
public void onFailure(Exception e) {
output.f4 = null;
resultFuture.complete(Collections.singleton(output));
}
});
}
// Handle a timeout by retrying against ES
@Override
public void timeout(Tuple4<String, String, String, Integer> input, ResultFuture<Tuple5<String, String, String, Integer, Integer>> resultFuture) {
searchFromES(input, resultFuture);
}
}
3.2 Ordered mode
Flink Async I/O comes in three variants: the ordered mode, the processing-time unordered mode, and the event-time unordered mode.
The main difference is the order in which results are emitted downstream: the ordered mode emits records in the order they were received, while the unordered modes emit whichever record finishes processing first. The figure below illustrates the ordered mode.
Ordered or unordered, both use the Future/Promise design pattern and roughly follow this logic (a simplified sketch follows the list):
- Producer side: wrap each record in a StreamRecordQueueEntry (which internally holds a Future object) and put it into the StreamElementQueue.
- Producer side: the interaction with the external system goes into the asyncInvoke method, and its result is delivered into the StreamRecordQueueEntry.
- Consumer side: an Emitter thread reads completed StreamRecordQueueEntry objects from the StreamElementQueue and sends their results to the downstream operator.
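The following self-contained sketch (plain JDK, not the actual Flink source) imitates the three points above: a bounded queue of CompletableFuture entries plays the role of the StreamElementQueue, and an emitter thread that joins the head entry reproduces the ordered-mode behavior analyzed below.

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ThreadLocalRandom;

public class OrderedAsyncSketch {

    public static void main(String[] args) throws InterruptedException {
        // Bounded queue: its capacity plays the role of the "capacity"
        // parameter of AsyncDataStream and yields natural backpressure.
        BlockingQueue<CompletableFuture<String>> queue = new ArrayBlockingQueue<>(10);

        // Emitter thread: takes entries in insertion order and blocks (join)
        // until the head entry completes, so emission order == arrival order.
        Thread emitter = new Thread(() -> {
            try {
                while (true) {
                    System.out.println("emit downstream: " + queue.take().join());
                }
            } catch (InterruptedException ignored) {
            }
        });
        emitter.setDaemon(true);
        emitter.start();

        // Producer: wrap each record in a Future (the "StreamRecordQueueEntry"),
        // enqueue it immediately, and let the async lookup complete it later.
        for (int i = 0; i < 5; i++) {
            final int record = i;
            CompletableFuture<String> entry = new CompletableFuture<>();
            queue.put(entry); // blocks while the queue is full (backpressure)
            CompletableFuture.runAsync(() -> {
                try {
                    Thread.sleep(ThreadLocalRandom.current().nextInt(100)); // simulated lookup latency
                } catch (InterruptedException ignored) {
                }
                entry.complete("record-" + record + " widened");
            });
        }
        Thread.sleep(1000); // give the emitter time to drain the queue
    }
}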
Below we analyze the ordered-mode source code from the producer side and the consumer side in turn.
3.2.1 Producer side
AsyncWaitOperator
@Internal
public class AsyncWaitOperator<IN, OUT>
extends AbstractUdfStreamOperator<OUT, AsyncFunction<IN, OUT>>
implements OneInputStreamOperator<IN, OUT>, OperatorActions, BoundedOneInput {
@Override
public void setup(StreamTask<?, ?> containingTask, StreamConfig config, Output<StreamRecord<OUT>> output) {
super.setup(containingTask, config, output);
this.checkpointingLock = getContainingTask().getCheckpointLock();
this.inStreamElementSerializer = new StreamElementSerializer<>(
getOperatorConfig().<IN>getTypeSerializerIn1(getUserCodeClassloader()));
// create the operators executor for the complete operations of the queue entries
this.executor = Executors.newSingleThreadExecutor();
// Depending on whether the job calls AsyncDataStream.orderedWait or AsyncDataStream.unorderedWait, initialize the queue for the ordered or the unordered mode
switch (outputMode) {
case ORDERED:
queue = new OrderedStreamElementQueue(
capacity,
executor,
this);
break;
case UNORDERED:
queue = new UnorderedStreamElementQueue(
capacity,
executor,
this);
break;
default:
throw new IllegalStateException("Unknown async mode: " + outputMode + '.');
}
}
@Override
public void open() throws Exception {
super.open();
// Start the emitter thread
this.emitter = new Emitter<>(checkpointingLock, output, queue, this);
this.emitterThread = new Thread(emitter, "AsyncIO-Emitter-Thread (" + getOperatorName() + ')');
emitterThread.setDaemon(true);
emitterThread.start();
// process stream elements from state, since the Emit thread will start as soon as all
// elements from previous state are in the StreamElementQueue, we have to make sure that the
// order to open all operators in the operator chain proceeds from the tail operator to the
// head operator.
if (recoveredStreamElements != null) {
for (StreamElement element : recoveredStreamElements.get()) {
if (element.isRecord()) {
processElement(element.<IN>asRecord());
}
else if (element.isWatermark()) {
processWatermark(element.asWatermark());
}
else if (element.isLatencyMarker()) {
processLatencyMarker(element.asLatencyMarker());
}
else {
throw new IllegalStateException("Unknown record type " + element.getClass() +
" encountered while opening the operator.");
}
}
recoveredStreamElements = null;
}
}
// The operator's processElement method processes each incoming record one by one
@Override
public void processElement(StreamRecord<IN> element) throws Exception {
// Wrap the record in a StreamRecordQueueEntry
final StreamRecordQueueEntry<OUT> streamRecordBufferEntry = new StreamRecordQueueEntry<>(element);
addAsyncBufferEntry(streamRecordBufferEntry);
// Invoke asyncInvoke of the user's AsyncFunction implementation (ElasticsearchAsyncFunction here); the user code delivers the result through an asynchronous callback into the Future held by the StreamRecordQueueEntry
userFunction.asyncInvoke(element.getValue(), streamRecordBufferEntry);
}
private <T> void addAsyncBufferEntry(StreamElementQueueEntry<T> streamElementQueueEntry) throws InterruptedException {
assert(Thread.holdsLock(checkpointingLock));
pendingStreamElementQueueEntry = streamElementQueueEntry;
// Try to put the StreamRecordQueueEntry into the queue
while (!queue.tryPut(streamElementQueueEntry)) {
// we wait for the emitter to notify us if the queue has space left again
checkpointingLock.wait();
}
pendingStreamElementQueueEntry = null;
}
}
OrderedStreamElementQueue
@Internal
public class OrderedStreamElementQueue implements StreamElementQueue {
// Insert a StreamRecordQueueEntry into the OrderedStreamElementQueue
@Override
public <T> boolean tryPut(StreamElementQueueEntry<T> streamElementQueueEntry) throws InterruptedException {
lock.lockInterruptibly();
try {
// capacity limits the number of concurrent requests, i.e. the number of StreamRecordQueueEntry objects in the OrderedStreamElementQueue
if (queue.size() < capacity) {
addEntry(streamElementQueueEntry);
LOG.debug("Put element into ordered stream element queue. New filling degree " +
"({}/{}).", queue.size(), capacity);
return true;
} else {
// If insertion keeps failing, AsyncWaitOperator#addAsyncBufferEntry retries indefinitely; in the extreme case this triggers Flink's own backpressure mechanism, and the user needs no special handling
LOG.debug("Failed to put element into ordered stream element queue because it " +
"was full ({}/{}).", queue.size(), capacity);
return false;
}
} finally {
lock.unlock();
}
}
private <T> void addEntry(StreamElementQueueEntry<T> streamElementQueueEntry) {
assert(lock.isHeldByCurrentThread());
// Append the StreamRecordQueueEntry to the tail of the queue
queue.addLast(streamElementQueueEntry);
// As soon as the Future inside the StreamRecordQueueEntry completes, the following callback runs
streamElementQueueEntry.onComplete(
(StreamElementQueueEntry<T> value) -> {
try {
onCompleteHandler(value);
} catch (InterruptedException e) {
// we got interrupted. This indicates a shutdown of the executor
LOG.debug("AsyncBufferEntry could not be properly completed because the " +
"executor thread has been interrupted.", e);
} catch (Throwable t) {
operatorActions.failOperator(new Exception("Could not complete the " +
"stream element queue entry: " + value + '.', t));
}
},
executor);
}
private void onCompleteHandler(StreamElementQueueEntry<?> streamElementQueueEntry) throws InterruptedException {
lock.lockInterruptibly();
try {
// If the queue is non-empty and the Future of the head entry has completed, signal the Condition to wake the emitter thread so it can take the head element
if (!queue.isEmpty() && queue.peek().isDone()) {
LOG.debug("Signal ordered stream element queue has completed head element.");
headIsCompleted.signalAll();
}
} finally {
lock.unlock();
}
}
@Override
public AsyncResult peekBlockingly() throws InterruptedException {
lock.lockInterruptibly();
try {
// When the emitter thread fetches an entry from the queue, it blocks as long as the queue is empty or the head entry's Future is incomplete
while (queue.isEmpty() || !queue.peek().isDone()) {
// Block on the Condition
headIsCompleted.await();
}
LOG.debug("Peeked head element from ordered stream element queue with filling degree " +
"({}/{}).", queue.size(), capacity);
return queue.peek();
} finally {
lock.unlock();
}
}
}
3.2.2 Consumer side
Emitter
@Internal
public class Emitter<OUT> implements Runnable {
@Override
public void run() {
try {
// Keep trying to read the head element; as OrderedStreamElementQueue#peekBlockingly shows, the Emitter thread blocks while the head entry's Future has not yet returned
while (running) {
LOG.debug("Wait for next completed async stream element result.");
AsyncResult streamElementEntry = streamElementQueue.peekBlockingly();
// Send the result to the downstream operator
output(streamElementEntry);
}
} catch (InterruptedException e) {
if (running) {
operatorActions.failOperator(e);
} else {
// Thread got interrupted which means that it should shut down
LOG.debug("Emitter thread got interrupted, shutting down.");
}
} catch (Throwable t) {
operatorActions.failOperator(new Exception("AsyncWaitOperator's emitter caught an " +
"unexpected throwable.", t));
}
}
private void output(AsyncResult asyncResult) throws InterruptedException {
if (asyncResult.isWatermark()) {
synchronized (checkpointLock) {
AsyncWatermarkResult asyncWatermarkResult = asyncResult.asWatermark();
LOG.debug("Output async watermark.");
// A watermark is forwarded downstream directly
output.emitWatermark(asyncWatermarkResult.getWatermark());
// Remove the head StreamRecordQueueEntry from the queue (note the difference between peek and poll)
streamElementQueue.poll();
// notify the main thread that there is again space left in the async collector
// buffer
checkpointLock.notifyAll();
}
} else {
AsyncCollectionResult<OUT> streamRecordResult = asyncResult.asResultCollection();
if (streamRecordResult.hasTimestamp()) {
timestampedCollector.setAbsoluteTimestamp(streamRecordResult.getTimestamp());
} else {
timestampedCollector.eraseTimestamp();
}
synchronized (checkpointLock) {
LOG.debug("Output async stream element collection result.");
try {
// Extract the joined (widened) result from the Future inside the StreamRecordQueueEntry
Collection<OUT> resultCollection = streamRecordResult.get();
// Send the results to the downstream operator
if (resultCollection != null) {
for (OUT result : resultCollection) {
timestampedCollector.collect(result);
}
}
} catch (Exception e) {
operatorActions.failOperator(
new Exception("An async function call terminated with an exception. " +
"Failing the AsyncWaitOperator.", e));
}
// Remove the head StreamRecordQueueEntry from the queue (note the difference between peek and poll)
streamElementQueue.poll();
// notify the main thread that there is again space left in the async collector
// buffer
checkpointLock.notifyAll();
}
}
}
}
3.3 Processing-time unordered mode
Unlike the ordered mode, the unordered mode wraps each StreamRecordQueueEntry in an extra Set layer, mainly to cope with watermarks; details are in the next section. The processing-time unordered mode has no watermarks, but since it shares one code path with the event-time unordered mode, it carries the extra Set layer too.
In this mode there are no watermark elements, so every record's StreamRecordQueueEntry is put into lastSet (in this mode lastSet and firstSet reference the same object). In the record's onCompleteHandler, the entry is taken out of lastSet and put into the completedQueue, from which the emitter thread sends it to the downstream operator; the result is a completely unordered processing mode. A simplified sketch follows.
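As a rough plain-JDK illustration of that callback path (the Set layer is deliberately elided and the names are mine, not Flink's): every entry is offered straight into a completed queue the moment its Future finishes, so the emission order is pure completion order.

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadLocalRandom;

public class UnorderedCompletionSketch {

    public static void main(String[] args) throws InterruptedException {
        // completedQueue equivalent: entries land here the moment their
        // Future completes, regardless of production order.
        LinkedBlockingQueue<String> completedQueue = new LinkedBlockingQueue<>();

        for (int i = 0; i < 5; i++) {
            final int record = i;
            CompletableFuture
                    .supplyAsync(() -> {
                        try {
                            Thread.sleep(ThreadLocalRandom.current().nextInt(100)); // simulated lookup latency
                        } catch (InterruptedException ignored) {
                        }
                        return "record-" + record + " widened";
                    })
                    // onCompleteHandler equivalent: offer straight into completedQueue
                    .thenAccept(completedQueue::offer);
        }

        // Emitter equivalent: drain in completion order (likely not 0..4).
        for (int i = 0; i < 5; i++) {
            System.out.println("emit downstream: " + completedQueue.take());
        }
    }
}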
The architecture diagram of the processing-time unordered mode in 雲邪's blog post 《Flink 原理與實現:Async I/O》 was drawn against Flink 1.3 and no longer applies to Flink 1.9, where this mode no longer needs the uncompletedQueue; the updated architecture is shown below.
Also, data structures from that post such as AsyncCollector no longer exist in Flink 1.9; this article analyzes Flink 1.9.
3.3.1 Producer side
UnorderedStreamElementQueue
@Internal
public class UnorderedStreamElementQueue implements StreamElementQueue {
private <T> void addEntry(StreamElementQueueEntry<T> streamElementQueueEntry) {
assert(lock.isHeldByCurrentThread());
if (streamElementQueueEntry.isWatermark()) {
lastSet = new HashSet<>(capacity);
if (firstSet.isEmpty()) {
firstSet.add(streamElementQueueEntry);
} else {
Set<StreamElementQueueEntry<?>> watermarkSet = new HashSet<>(1);
watermarkSet.add(streamElementQueueEntry);
uncompletedQueue.offer(watermarkSet);
}
uncompletedQueue.offer(lastSet);
} else {
// The processing-time unordered mode only ever takes this branch, and lastSet and firstSet reference the same object
lastSet.add(streamElementQueueEntry);
}
streamElementQueueEntry.onComplete(
(StreamElementQueueEntry<T> value) -> {
try {
onCompleteHandler(value);
} catch (InterruptedException e) {
// The accept executor thread got interrupted. This is probably cause by
// the shutdown of the executor.
LOG.debug("AsyncBufferEntry could not be properly completed because the " +
"executor thread has been interrupted.", e);
} catch (Throwable t) {
operatorActions.failOperator(new Exception("Could not complete the " +
"stream element queue entry: " + value + '.', t));
}
},
executor);
numberEntries++;
}
// Callback that runs when the Future inside a StreamRecordQueueEntry completes
public void onCompleteHandler(StreamElementQueueEntry<?> streamElementQueueEntry) throws InterruptedException {
lock.lockInterruptibly();
try {
// Insert the StreamRecordQueueEntry into the completedQueue
// Putting the entry into lastSet (identical to firstSet) and then taking it right back out is indeed redundant; it is done only because this code is shared with the event-time unordered mode
if (firstSet.remove(streamElementQueueEntry)) {
// Offer the entry to the completedQueue
completedQueue.offer(streamElementQueueEntry);
// This mode never enters the loop below
while (firstSet.isEmpty() && firstSet != lastSet) {
firstSet = uncompletedQueue.poll();
Iterator<StreamElementQueueEntry<?>> it = firstSet.iterator();
while (it.hasNext()) {
StreamElementQueueEntry<?> bufferEntry = it.next();
if (bufferEntry.isDone()) {
completedQueue.offer(bufferEntry);
it.remove();
}
}
}
LOG.debug("Signal unordered stream element queue has completed entries.");
hasCompletedEntries.signalAll();
}
} finally {
lock.unlock();
}
}
@Override
public AsyncResult peekBlockingly() throws InterruptedException {
lock.lockInterruptibly();
try {
// The emitter thread takes StreamRecordQueueEntry objects from the completedQueue; unlike the ordered mode, there is no need to check whether the head entry's Future has returned, because only entries whose Future has completed are ever inserted into the completedQueue
while (completedQueue.isEmpty()) {
hasCompletedEntries.await();
}
LOG.debug("Peeked head element from unordered stream element queue with filling degree " +
"({}/{}).", numberEntries, capacity);
return completedQueue.peek();
} finally {
lock.unlock();
}
}
}
3.3.2 Consumer side
The Emitter thread's consumption logic is the same as in the ordered mode.
3.4 Event-time unordered mode
In this mode the records within a stretch of the stream are unordered among themselves, but because watermarks are present, the data between watermark1 and watermark2 must remain exactly that batch of records, even though the records inside the batch may be reordered. That is, the records inside a Set may go downstream in any order, but the sequence watermark1 → set → watermark2 must not be broken.
If the order between a watermark and its record set were violated, then when watermark2 triggered a window computation, the window could contain more or fewer records than it should, breaking the correctness of the result. The sketch below illustrates the invariant.
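Here is a minimal plain-JDK sketch of that invariant, heavily simplified from Flink's firstSet/lastSet/uncompletedQueue machinery (the names and values are made up for illustration): the records of a set may be emitted in any completion order, but the trailing watermark is only emitted after the set has fully drained.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class WatermarkBarrierSketch {

    public static void main(String[] args) {
        // One "set" of records sitting between watermark1 and watermark2.
        Set<String> pendingSet = new HashSet<>(Arrays.asList("r1", "r2", "r3"));
        List<String> downstream = new ArrayList<>();

        downstream.add("watermark1");
        // The records of the set may complete (and be emitted) in any order...
        for (String record : new String[]{"r3", "r1", "r2"}) {
            pendingSet.remove(record);
            downstream.add(record + " widened");
        }
        // ...but watermark2 may only follow once the whole set has drained.
        if (pendingSet.isEmpty()) {
            downstream.add("watermark2");
        }
        System.out.println(downstream);
        // -> [watermark1, r3 widened, r1 widened, r2 widened, watermark2]
    }
}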
3.4.1 Producer side
AsyncWaitOperator
@Internal
public class AsyncWaitOperator<IN, OUT>
extends AbstractUdfStreamOperator<OUT, AsyncFunction<IN, OUT>>
implements OneInputStreamOperator<IN, OUT>, OperatorActions, BoundedOneInput {
@Override
public void processWatermark(Watermark mark) throws Exception {
WatermarkQueueEntry watermarkBufferEntry = new WatermarkQueueEntry(mark);
// Handle the watermark
addAsyncBufferEntry(watermarkBufferEntry);
}
@Override
public void processElement(StreamRecord<IN> element) throws Exception {
final StreamRecordQueueEntry<OUT> streamRecordBufferEntry = new StreamRecordQueueEntry<>(element);
// Handle the StreamRecordQueueEntry
addAsyncBufferEntry(streamRecordBufferEntry);
userFunction.asyncInvoke(element.getValue(), streamRecordBufferEntry);
}
private <T> void addAsyncBufferEntry(StreamElementQueueEntry<T> streamElementQueueEntry) throws InterruptedException {
assert(Thread.holdsLock(checkpointingLock));
pendingStreamElementQueueEntry = streamElementQueueEntry;
// Try to put the StreamRecordQueueEntry or WatermarkQueueEntry into the queue
while (!queue.tryPut(streamElementQueueEntry)) {
// we wait for the emitter to notify us if the queue has space left again
checkpointingLock.wait();
}
pendingStreamElementQueueEntry = null;
}
}
UnorderedStreamElementQueue
@Internal
public class UnorderedStreamElementQueue implements StreamElementQueue {
@Override
public <T> boolean tryPut(StreamElementQueueEntry<T> streamElementQueueEntry) throws InterruptedException {
lock.lockInterruptibly();
try {
if (numberEntries < capacity) {
addEntry(streamElementQueueEntry);
LOG.debug("Put element into unordered stream element queue. New filling degree " +
"({}/{}).", numberEntries, capacity);
return true;
} else {
LOG.debug("Failed to put element into unordered stream element queue because it " +
"was full ({}/{}).", numberEntries, capacity);
return false;
}
} finally {
lock.unlock();
}
}
private <T> void addEntry(StreamElementQueueEntry<T> streamElementQueueEntry) {
assert(lock.isHeldByCurrentThread());
if (streamElementQueueEntry.isWatermark()) {
// On a watermark, point lastSet at a fresh set, ready for the next batch of StreamRecordQueueEntry objects
// Note: firstSet may hold either a WatermarkQueueEntry or StreamRecordQueueEntry objects, but
// lastSet only ever holds StreamRecordQueueEntry objects
lastSet = new HashSet<>(capacity);
if (firstSet.isEmpty()) {
firstSet.add(streamElementQueueEntry);
} else {
Set<StreamElementQueueEntry<?>> watermarkSet = new HashSet<>(1);
watermarkSet.add(streamElementQueueEntry);
uncompletedQueue.offer(watermarkSet);
}
uncompletedQueue.offer(lastSet);
} else {
// Until a watermark arrives, keep adding StreamRecordQueueEntry objects to lastSet
lastSet.add(streamElementQueueEntry);
}
streamElementQueueEntry.onComplete(
(StreamElementQueueEntry<T> value) -> {
try {
onCompleteHandler(value);
} catch (InterruptedException e) {
// The accept executor thread got interrupted. This is probably cause by
// the shutdown of the executor.
LOG.debug("AsyncBufferEntry could not be properly completed because the " +
"executor thread has been interrupted.", e);
} catch (Throwable t) {
operatorActions.failOperator(new Exception("Could not complete the " +
"stream element queue entry: " + value + '.', t));
}
},
executor);
numberEntries++;
}
// Callback logic that runs once the Future inside a WatermarkQueueEntry or StreamRecordQueueEntry returns
public void onCompleteHandler(StreamElementQueueEntry<?> streamElementQueueEntry) throws InterruptedException {
lock.lockInterruptibly();
try {
// Each completed entry is removed from firstSet, and only from firstSet; this if-check is what enforces the order between watermarks and record sets
if (firstSet.remove(streamElementQueueEntry)) {
// Offer the removed entry to the completedQueue
completedQueue.offer(streamElementQueueEntry);
while (firstSet.isEmpty() && firstSet != lastSet) {
// Advance the firstSet pointer to the next set
firstSet = uncompletedQueue.poll();
Iterator<StreamElementQueueEntry<?>> it = firstSet.iterator();
// Scan the entries in the new firstSet; move every completed one into the completedQueue and out of firstSet
while (it.hasNext()) {
StreamElementQueueEntry<?> bufferEntry = it.next();
if (bufferEntry.isDone()) {
completedQueue.offer(bufferEntry);
it.remove();
}
}
}
LOG.debug("Signal unordered stream element queue has completed entries.");
hasCompletedEntries.signalAll();
}
} finally {
lock.unlock();
}
}
}
3.4.2 Consumer side
The Emitter thread's consumption logic is the same as in the ordered mode.
4. Summary
- Flink Async I/O uses queues to hold the records before widening (ordered mode) or after widening (processing-time unordered mode), and decouples producing from consuming through the queues plus the polling Emitter thread.
- The ordered mode inserts the still-pending StreamRecordQueueEntry objects into the queue in arrival order, and keeps the emission order equal to the production order by checking whether the head entry has completed.
- The processing-time unordered mode inserts the StreamRecordQueueEntry into the queue inside the completion callback, so every entry in the queue already carries its async result, i.e. the widened record.
- The event-time unordered mode stores the records before widening (before the async call returns) in the uncompletedQueue and the widened records in the completedQueue, and uses the firstSet design to enforce the order between watermarks and record sets.
References:
http://wuchong.me/blog/2017/05/17/flink-internals-async-io/
https://www.cnblogs.com/ljygz/p/11864176.html
https://www.jianshu.com/p/f9bde854627b
https://blog.csdn.net/weixin_44904816/article/details/104305824