Spark Growth Path (5): The Message Queue

Reference articles: The Spark distributed message dispatch flow
The listener pattern
volatile

I was bitten by this message queue once before (see my analysis of why a stage hung), so now that I am studying the source code, I am starting here to finally answer a question that has bothered me for a long time.

Inheritance hierarchy

[Figure: class inheritance diagram]

ListenerBus -> SparkListenerBus -> LiveListenerBus. The root base class is ListenerBus, and the design pattern at work is the listener (observer) pattern.
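To make the pattern concrete, here is a stripped-down, hypothetical sketch (my own names, not Spark's) of the shape these classes share: a bus keeps a list of listeners and fans every event out to all of them, leaving per-listener delivery abstract.

trait MiniListenerBus[L, E] {
  // registered listeners (not thread-safe; the real bus uses CopyOnWriteArrayList)
  private var listeners = List.empty[L]

  def addListener(listener: L): Unit = listeners ::= listener

  // fan the event out to every registered listener
  def postToAll(event: E): Unit = listeners.foreach(doPostEvent(_, event))

  // subclasses decide how a single listener receives a single event
  protected def doPostEvent(listener: L, event: E): Unit
}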

ListenerBus


ListenerBus is a package-private trait in the spark package that mixes in Logging (for convenient log output). Let's walk through its members.

listenersPlusTimers field

An object-private field. CopyOnWriteArrayList is a thread-safe list; you can think of it as an ArrayList plus thread safety. Each element is a pair: the first component is the listener itself, and the second is an Option holding that listener's timer, which may also be None.

private[this] val listenersPlusTimers = new CopyOnWriteArrayList[(L, Option[Timer])]

listeners method

Wraps the CopyOnWriteArrayList as a Scala collection, projects out the listeners, and converts the result back to a Java collection, a java.util.List.

private[spark] def listeners = listenersPlusTimers.asScala.map(_._1).asJava
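As a standalone illustration (the names here are mine, not Spark's), the asScala/asJava round trip behaves like this:

import java.util.concurrent.CopyOnWriteArrayList
import scala.collection.JavaConverters._

// wrap the Java list as a Scala view, project the first tuple element,
// then hand back a java.util.List
val pairs = new CopyOnWriteArrayList[(String, Option[Int])]
pairs.add(("a", None))
pairs.add(("b", Some(1)))
val names: java.util.List[String] = pairs.asScala.map(_._1).asJava
println(names) // [a, b]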

addListener method

This method is declared final, so subclasses cannot override it. It registers an observer: one of the two standard methods of the listener pattern.

 final def addListener(listener: L): Unit = {
    listenersPlusTimers.add((listener, getTimer(listener)))
  }

removeListener method

The second standard method of the listener pattern: deregistering a listener. Note that it matches with eq, i.e. reference equality, so the exact instance that was added must be passed in.

final def removeListener(listener: L): Unit = {
    listenersPlusTimers.asScala.find(_._1 eq listener).foreach { listenerAndTimer =>
      listenersPlusTimers.remove(listenerAndTimer)
    }
  }

getTimer method

Returns the timer for a listener. The default implementation returns None; subclasses can override it to attach per-listener metrics.

protected def getTimer(listener: L): Option[Timer] = None
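A minimal sketch of such an override, assuming we ignore the private[spark] visibility of the real traits and reuse the Dropwizard Timer type that ListenerBus imports (the class and registry here are hypothetical):

import com.codahale.metrics.{MetricRegistry, Timer}

class TimedBus extends SparkListenerBus {
  private val registry = new MetricRegistry
  // give every listener its own timer, keyed by its class name
  override protected def getTimer(listener: SparkListenerInterface): Option[Timer] =
    Some(registry.timer(listener.getClass.getSimpleName))
}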

postToAll method

Delivers an event to every registered listener. It iterates with a raw Java iterator to avoid wrapper overhead, times each delivery when a timer is present, and catches non-fatal exceptions so that one misbehaving listener cannot break delivery to the others.

def postToAll(event: E): Unit = {
    // JavaConverters can create a JIterableWrapper if we use asScala.
    // However, this method will be called frequently. To avoid the wrapper cost, here we use
    // Java Iterator directly.
    val iter = listenersPlusTimers.iterator
    while (iter.hasNext) {
      val listenerAndMaybeTimer = iter.next()
      val listener = listenerAndMaybeTimer._1
      val maybeTimer = listenerAndMaybeTimer._2
      val maybeTimerContext = if (maybeTimer.isDefined) {
        maybeTimer.get.time()
      } else {
        null
      }
      try {
        doPostEvent(listener, event)
      } catch {
        case NonFatal(e) =>
          logError(s"Listener ${Utils.getFormattedClassName(listener)} threw an exception", e)
      } finally {
        if (maybeTimerContext != null) {
          maybeTimerContext.stop()
        }
      }
    }
  }

findListenersByClass method

Uses a ClassTag to look up all registered listeners of a given runtime class.

private[spark] def findListenersByClass[T <: L : ClassTag](): Seq[T] = {
    val c = implicitly[ClassTag[T]].runtimeClass
    listeners.asScala.filter(_.getClass == c).map(_.asInstanceOf[T]).toSeq
  }
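As a standalone sketch of the ClassTag trick (all names here are illustrative): implicitly[ClassTag[T]].runtimeClass recovers the erased class at runtime, so a heterogeneous list can be filtered down to one concrete type.

import scala.reflect.ClassTag

trait Listener
class StageListener extends Listener
class JobListener extends Listener

def findByClass[T <: Listener : ClassTag](all: Seq[Listener]): Seq[T] = {
  val c = implicitly[ClassTag[T]].runtimeClass
  all.filter(_.getClass == c).map(_.asInstanceOf[T])
}

val all = Seq(new StageListener, new JobListener, new StageListener)
println(findByClass[StageListener](all).size) // 2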

doPostEvent method

Delivers an event to a single listener; the concrete behavior is left to subclasses.

protected def doPostEvent(listener: L, event: E): Unit

SparkListenerBus

As noted above, doPostEvent is left to subclasses. OK, let's see how SparkListenerBus implements it.

SparkListenerBus rather shamelessly consists of nothing but this one method. It pins the listener and event type parameters down to SparkListenerInterface and SparkListenerEvent, and uses pattern matching to dispatch each concrete event type to the corresponding listener callback.

protected override def doPostEvent(
      listener: SparkListenerInterface,
      event: SparkListenerEvent): Unit = {
    event match {
      case stageSubmitted: SparkListenerStageSubmitted =>
        listener.onStageSubmitted(stageSubmitted)
      case stageCompleted: SparkListenerStageCompleted =>
        listener.onStageCompleted(stageCompleted)
      case jobStart: SparkListenerJobStart =>
        listener.onJobStart(jobStart)
      case jobEnd: SparkListenerJobEnd =>
        listener.onJobEnd(jobEnd)
      case taskStart: SparkListenerTaskStart =>
        listener.onTaskStart(taskStart)
      case taskGettingResult: SparkListenerTaskGettingResult =>
        listener.onTaskGettingResult(taskGettingResult)
      case taskEnd: SparkListenerTaskEnd =>
        listener.onTaskEnd(taskEnd)
      case environmentUpdate: SparkListenerEnvironmentUpdate =>
        listener.onEnvironmentUpdate(environmentUpdate)
      case blockManagerAdded: SparkListenerBlockManagerAdded =>
        listener.onBlockManagerAdded(blockManagerAdded)
      case blockManagerRemoved: SparkListenerBlockManagerRemoved =>
        listener.onBlockManagerRemoved(blockManagerRemoved)
      case unpersistRDD: SparkListenerUnpersistRDD =>
        listener.onUnpersistRDD(unpersistRDD)
      case applicationStart: SparkListenerApplicationStart =>
        listener.onApplicationStart(applicationStart)
      case applicationEnd: SparkListenerApplicationEnd =>
        listener.onApplicationEnd(applicationEnd)
      case metricsUpdate: SparkListenerExecutorMetricsUpdate =>
        listener.onExecutorMetricsUpdate(metricsUpdate)
      case executorAdded: SparkListenerExecutorAdded =>
        listener.onExecutorAdded(executorAdded)
      case executorRemoved: SparkListenerExecutorRemoved =>
        listener.onExecutorRemoved(executorRemoved)
      case executorBlacklisted: SparkListenerExecutorBlacklisted =>
        listener.onExecutorBlacklisted(executorBlacklisted)
      case executorUnblacklisted: SparkListenerExecutorUnblacklisted =>
        listener.onExecutorUnblacklisted(executorUnblacklisted)
      case nodeBlacklisted: SparkListenerNodeBlacklisted =>
        listener.onNodeBlacklisted(nodeBlacklisted)
      case nodeUnblacklisted: SparkListenerNodeUnblacklisted =>
        listener.onNodeUnblacklisted(nodeUnblacklisted)
      case blockUpdated: SparkListenerBlockUpdated =>
        listener.onBlockUpdated(blockUpdated)
      case _ => listener.onOtherEvent(event)
    }
  }
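This dispatch is exactly what makes user-defined listeners work. For example, a minimal custom listener built on Spark's public SparkListener base class, which provides empty defaults for all of the callbacks above (the class name is mine):

import org.apache.spark.scheduler.{SparkListener, SparkListenerJobEnd, SparkListenerStageCompleted}

class MyListener extends SparkListener {
  override def onStageCompleted(stageCompleted: SparkListenerStageCompleted): Unit =
    println(s"stage ${stageCompleted.stageInfo.stageId} completed")

  override def onJobEnd(jobEnd: SparkListenerJobEnd): Unit =
    println(s"job ${jobEnd.jobId} ended with ${jobEnd.jobResult}")
}

// registered on the driver, e.g. via sc.addSparkListener(new MyListener)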

When I have time, I will also untangle the logic of the SparkListenerInterface and SparkListenerEvent subsystems.

LiveListenerBus

This class has a companion object, plus an auxiliary class, LiveListenerBusMetrics. Let's go through it piece by piece.

eventQueue field

The message queue itself, and the heart of this article. The comment explains why the queue is bounded: if events are added faster than they are drained, an unbounded queue could eventually cause an OOM. The capacity comes from spark.scheduler.listenerbus.eventqueue.capacity (spark.scheduler.listenerbus.eventqueue.size in older releases) and defaults to 10000.

// Cap the capacity of the event queue so we get an explicit error (rather than
  // an OOM exception) if it's perpetually being added to more quickly than it's being drained.
  private val eventQueue =
    new LinkedBlockingQueue[SparkListenerEvent](conf.get(LISTENER_BUS_EVENT_QUEUE_CAPACITY))
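If listeners fall behind, the capacity can be raised in the application config. A minimal sketch (the value 20000 is purely illustrative):

import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set("spark.scheduler.listenerbus.eventqueue.capacity", "20000")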

metrics field

private[spark] val metrics = new LiveListenerBusMetrics(conf, eventQueue)

A metrics source for the bus: it tracks counters such as numEventsPosted and numDroppedEvents, and the eventProcessingTime timer used below.

listenerThread field

private val listenerThread = new Thread(name) {
    setDaemon(true)
    override def run(): Unit = Utils.tryOrStopSparkContext(sparkContext) {
      LiveListenerBus.withinListenerThread.withValue(true) {
        val timer = metrics.eventProcessingTime
        while (true) {
          eventLock.acquire()
          self.synchronized {
            processingEvent = true
          }
          try {
            val event = eventQueue.poll
            if (event == null) {
              // Get out of the while loop and shutdown the daemon thread
              if (!stopped.get) {
                throw new IllegalStateException("Polling `null` from eventQueue means" +
                  " the listener bus has been stopped. So `stopped` must be true")
              }
              return
            }
            val timerContext = timer.time()
            try {
              postToAll(event)
            } finally {
              timerContext.stop()
            }
          } finally {
            self.synchronized {
              processingEvent = false
            }
          }
        }
      }
    }
  }

A thread object. It is first marked as a daemon thread. Its body is wrapped in Utils.tryOrStopSparkContext, which catches exceptions and stops the SparkContext when one escapes. LiveListenerBus.withinListenerThread.withValue(true) sets that flag to true while the enclosed block runs and restores the original value afterwards. The thread then enters an infinite loop: each iteration blocks on eventLock.acquire() until an event has been posted (or the bus is stopping), sets processingEvent to true while an event is being handled and back to false to mark the thread idle, polls a message from the queue, and delivers it with postToAll. Polling null means the bus has been stopped, so the thread exits. The Timer is there to measure processing time.

Overall, this is an asynchronous consumer thread with thread safety and locking built in. Exactly why each of these safeguards is needed only becomes clear once we look at who calls into the bus.
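Here is a simplified, self-contained sketch of the same pattern (my own names, not Spark's): the producer releases one semaphore permit per enqueued event, the consumer acquires a permit before polling, and stop releases one extra permit against an empty queue so that poll() returns null and the thread exits.

import java.util.concurrent.{LinkedBlockingQueue, Semaphore}

object BusSketch {
  private val queue = new LinkedBlockingQueue[String](10000)
  private val eventLock = new Semaphore(0)
  @volatile private var stopped = false

  private val consumer = new Thread("sketch-bus") {
    setDaemon(true)
    override def run(): Unit = {
      while (true) {
        eventLock.acquire()                     // wait for an event (or a stop signal)
        val event = queue.poll()
        if (event == null) { require(stopped); return } // only possible after stop()
        println(s"processing $event")
      }
    }
  }

  def start(): Unit = consumer.start()
  def post(event: String): Unit = if (queue.offer(event)) eventLock.release()
  def stop(): Unit = { stopped = true; eventLock.release() } // wake consumer on empty queue
}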

start method

The entry point; the caller is SparkContext:

[Figure: start being invoked from SparkContext]

All it really does is launch the listenerThread defined above.

def start(sc: SparkContext, metricsSystem: MetricsSystem): Unit = {
    if (started.compareAndSet(false, true)) {
      sparkContext = sc
      metricsSystem.registerSource(metrics)
      listenerThread.start()
    } else {
      throw new IllegalStateException(s"$name already started!")
    }
  }
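The compareAndSet(false, true) on the started flag is what guarantees the thread is launched at most once: exactly one caller wins the CAS, and every later call falls into the else branch and throws. A tiny illustration:

import java.util.concurrent.atomic.AtomicBoolean

val started = new AtomicBoolean(false)
println(started.compareAndSet(false, true)) // true: the first caller wins
println(started.compareAndSet(false, true)) // false: already started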

post method

Adds an event to the queue. The line val eventAdded = eventQueue.offer(event) attempts to enqueue the element: it returns true on success and false when the queue is full. So once the queue reaches capacity, new messages are simply dropped. Awkward.

def post(event: SparkListenerEvent): Unit = {
    if (stopped.get) {
      // Drop further events to make `listenerThread` exit ASAP
      logError(s"$name has already stopped! Dropping event $event")
      return
    }
    metrics.numEventsPosted.inc()
    val eventAdded = eventQueue.offer(event)
    if (eventAdded) {
      eventLock.release()
    } else {
      onDropEvent(event)
    }

    val droppedEvents = droppedEventsCounter.get
    if (droppedEvents > 0) {
      // Don't log too frequently
      if (System.currentTimeMillis() - lastReportTimestamp >= 60 * 1000) {
        // There may be multiple threads trying to decrease droppedEventsCounter.
        // Use "compareAndSet" to make sure only one thread can win.
        // And if another thread is increasing droppedEventsCounter, "compareAndSet" will fail and
        // then that thread will update it.
        if (droppedEventsCounter.compareAndSet(droppedEvents, 0)) {
          val prevLastReportTimestamp = lastReportTimestamp
          lastReportTimestamp = System.currentTimeMillis()
          logWarning(s"Dropped $droppedEvents SparkListenerEvents since " +
            new java.util.Date(prevLastReportTimestamp))
        }
      }
    }
  }
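The key point is that offer never blocks, unlike put. A quick illustration of why a full bus drops events instead of stalling the posting thread:

import java.util.concurrent.LinkedBlockingQueue

val q = new LinkedBlockingQueue[Int](2)
println(q.offer(1)) // true
println(q.offer(2)) // true
println(q.offer(3)) // false: at capacity, the element is rejected immediately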

When the enqueue fails, onDropEvent is invoked: it increments the relevant counters and logs an error, but only once.

def onDropEvent(event: SparkListenerEvent): Unit = {
    metrics.numDroppedEvents.inc()
    droppedEventsCounter.incrementAndGet()
    if (logDroppedEvent.compareAndSet(false, true)) {
      // Only log the following message once to avoid duplicated annoying logs.
      logError("Dropping SparkListenerEvent because no remaining room in event queue. " +
        "This likely means one of the SparkListeners is too slow and cannot keep up with " +
        "the rate at which tasks are being started by the scheduler.")
    }
  }

Because that queue-full error is logged only once for the lifetime of the bus, the warning path in post shown above takes over afterwards: at most once per minute it reports how many events were dropped since the previous report. The compareAndSet on droppedEventsCounter ensures that of the threads racing to reset the counter, exactly one wins; if another thread increments the counter concurrently, the CAS fails and the report is left for a later pass.

Now let's see who calls this post method to put events on the queue.

[Figure: callers of post]

DAGScheduler and SparkContext. I will cover DAGScheduler in detail in the next article, on task scheduling. In short, these two classes are responsible for posting events. But Spark is a distributed system with master and worker roles, so here is the execution architecture diagram:

[Figure: Spark execution architecture]

As the diagram shows, DAGScheduler lives inside SparkContext, on the driver (the master side in the diagram). The workers and the driver therefore exchange messages over the network, and once the driver receives and parses a message, it delegates to DAGScheduler to post the corresponding event.

Summary

That wraps up my look at the message queue. I keep discovering how much there still is to learn: this is only spark-core, and spark-streaming, which is what I actually work with, is still waiting for me. Never enough time~
