Akka Cluster: Principles and Applications

Akka Cluster Principles

Akka Cluster supports decentralized, P2P-based cluster services with no single point of failure (SPOF), implemented mainly through the Gossip protocol. For cluster membership state, Akka provides a failure-detection mechanism that automatically discovers member nodes which have failed and left the cluster, and propagates that state, in an event-driven fashion, to all other member nodes in the cluster.

  • State transitions and failure detection

Internally, Akka defines a finite set of states (6 states) for cluster members, together with a state transition matrix, as shown in the code below:

  private[cluster] val allowedTransitions: Map[MemberStatus, Set[MemberStatus]] =
    Map(
      Joining -> Set(Up, Down, Removed),
      Up -> Set(Leaving, Down, Removed),
      Leaving -> Set(Exiting, Down, Removed),
      Down -> Set(Removed),
      Exiting -> Set(Removed, Down),
      Removed -> Set.empty[MemberStatus])

Every member node in an Akka cluster may be in one of the states above, and certain events cause state transitions. Note that, apart from the Down and Removed states, a node in any other state may become Down, i.e. the node fails and can no longer provide service. Before becoming Down there is a virtual Unreachable state, because during Gossip convergence an Unreachable node can neither be reached nor routed through; this state is detected by Akka's Failure Detector. A node in the Down state that wants to rejoin the Akka cluster must be restarted and re-enter the Joining state before it can go through the subsequent state transitions. The member node states and their transitions are shown in the figure below:
[Figure: akka-node-state-transition]
Let us explain Akka's failure-detection mechanism. In Akka, each member node M is monitored by a group G of other cluster nodes (5 by default). Group G is not all of the other nodes in the cluster, only a subset of them. The nodes in G detect whether node M is Unreachable by sending heartbeats to confirm that M is still reachable; if it is not, the nodes in G propagate M's Unreachable state to the nodes outside G, until eventually every member node in the cluster knows that M has failed.
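The transition matrix above can be exercised with a small stand-in sketch. The case objects below are simplified placeholders for Akka's real `MemberStatus` type; they only mirror the matrix as printed:

```scala
// A stand-in sketch of the allowedTransitions matrix, using plain case
// objects instead of Akka's MemberStatus, to show how a transition check works.
sealed trait Status
case object Joining extends Status
case object Up extends Status
case object Leaving extends Status
case object Exiting extends Status
case object Down extends Status
case object Removed extends Status

val allowedTransitions: Map[Status, Set[Status]] = Map(
  Joining -> Set(Up, Down, Removed),
  Up      -> Set(Leaving, Down, Removed),
  Leaving -> Set(Exiting, Down, Removed),
  Down    -> Set(Removed),
  Exiting -> Set(Removed, Down),
  Removed -> Set.empty[Status])

// A transition is legal only if the target appears in the source's set
def canTransition(from: Status, to: Status): Boolean =
  allowedTransitions.getOrElse(from, Set.empty).contains(to)
```

Note that Removed maps to the empty set: once removed, a node never transitions again and must restart to rejoin via Joining.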

  • The Akka event set

A node state transition triggers an event, and we can handle each event type accordingly. To capture these events in detail, let us first look at the set of events Akka defines, as shown in the figure:
[Figure: akka-events]
Typically, when implementing an Actor in an Akka Cluster application, you can override the Actor's preStart method and subscribe to cluster events through Cluster, as in the following example:

val cluster = Cluster(context.system)

override def preStart(): Unit = {
  cluster.subscribe(self, initialStateMode = InitialStateAsEvents,
    classOf[MemberUp], classOf[MemberRemoved], classOf[UnreachableMember])
}

For example, on a MemberUp event we can obtain the ActorRef of the corresponding Actor and then exchange messages with it to cooperate on a given task.

  • Akka member roles (Node Role)

Akka allows each member node to set its own roles when it joins the cluster. Through roles, a system built on an Akka cluster can be divided into multiple subsystems with independent processing logic, each handling its own business concerns, while all of the resulting subsystems remain within a single Akka cluster. Each subsystem therefore also enjoys the features an Akka cluster provides, such as failure detection, state transitions, and state propagation.
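Role-based dispatch can be sketched as a pure function: given a member's role set (plain strings, mirroring `akka.cluster.Member.roles`), decide which logical subsystem it belongs to. The role names here anticipate the example later in this article:

```scala
// Hypothetical sketch: mapping a member's role set to one of the three
// subsystems used in the example below. Not Akka API, just plain Scala.
val knownRoles = Seq("collector", "interceptor", "processor")

// Returns the first known role the member carries, if any
def subsystemOf(memberRoles: Set[String]): Option[String] =
  knownRoles.find(memberRoles.contains)
```

A member may carry extra roles (e.g. a datacenter tag); only the first matching known role decides its subsystem in this sketch.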

Akka Cluster in Practice

We implemented a simple simulated real-time log-processing cluster system based on Akka. Input data can come from any source, such as files, the Kafka message broker, a database, or even a remote call: we collect the data, pass it through an interceptor layer, then parse it into a specific format, and finally write it to Kafka. The processing logic is shown in the figure below:
[Figure: akka-event-processing-cluster]
In the figure above, we split the real-time log-processing system into 3 subsystems, separated via Akka roles: collector, interceptor, and processor. The nodes of all 3 subsystems are members of the same Akka cluster. The data flow through the cluster is as follows: the collector receives data (or produces data by connecting directly to a particular data source); here we simulate sending Nginx log lines, forwarding them to an interceptor. The interceptor receives the log line sent by the collector, parses out the real IP address of the request, and intercepts requests whose IP is on the blacklist; if the IP is not blacklisted, it sends the line on to a processor. The processor processes the entire log line and finally saves it to Kafka.
We abstract the logic for subscribing to cluster events into an abstract class, ClusterRoledWorker, shown below:

package org.shirdrn.scala.akka.cluster

import akka.actor._
import akka.cluster.ClusterEvent.{InitialStateAsEvents, MemberEvent, MemberUp, UnreachableMember}
import akka.cluster.{Cluster, Member}

abstract class ClusterRoledWorker extends Actor with ActorLogging {

  // create a Cluster instance
  val cluster = Cluster(context.system)
  // caches the ActorRefs of downstream subsystem actors that have registered
  var workers = IndexedSeq.empty[ActorRef]

  override def preStart(): Unit = {
    // subscribe to cluster events
    cluster.subscribe(self, initialStateMode = InitialStateAsEvents,
      classOf[MemberUp], classOf[UnreachableMember], classOf[MemberEvent])
  }

  override def postStop(): Unit = cluster.unsubscribe(self)

  /**
   * Sends a registration message to an upstream node, as a downstream subsystem node
   */
  def register(member: Member, createPath: (Member) => ActorPath): Unit = { 
    val actorPath = createPath(member)
    log.info("Actor path: " + actorPath)
    val actorSelection = context.actorSelection(actorPath)
    actorSelection ! Registration
  }
}
In addition, we define some case classes as messages, for convenient exchange between the various Actors:

package org.shirdrn.scala.akka.cluster

object Registration extends Serializable

trait EventMessage extends Serializable
case class RawNginxRecord(sourceHost: String, line: String) extends EventMessage
case class NginxRecord(sourceHost: String, eventCode: String, line: String) extends EventMessage
case class FilteredRecord(sourceHost: String, eventCode: String, line: String, logDate: String, realIp: String) extends EventMessage
Akka Cluster uses a configuration file to specify Actor-related settings; our configuration file is application.conf, with the following contents:
akka {
  loglevel = INFO
  stdout-loglevel = INFO
  event-handlers = ["akka.event.Logging$DefaultLogger"]

  actor {
    provider = "akka.cluster.ClusterActorRefProvider"
  }

  remote {
    enabled-transports = ["akka.remote.netty.tcp"]
    log-remote-lifecycle-events = off
    netty.tcp {
      hostname = "127.0.0.1"
      port = 0
    }
  }
  cluster {
    seed-nodes = [
      "akka.tcp://[email protected]:2751",
      "akka.tcp://[email protected]:2752",
      "akka.tcp://[email protected]:2753"
    ]
    seed-node-timeout = 60s
    auto-down-unreachable-after = 10s
  }
}

In the configuration above, the Akka cluster we create is named event-cluster-system, with 3 initial seed nodes; these 3 nodes are in fact the collector-role nodes we implement to collect data.
Next, we describe in turn the processing logic of the cluster nodes for each of the 3 roles: collector, interceptor, and processor:

  • collector implementation

Our collector implementation class is EventCollector, an Actor extending the ClusterRoledWorker abstract class:

package org.shirdrn.scala.akka.cluster

import akka.actor._
import akka.cluster.ClusterEvent._
import com.typesafe.config.ConfigFactory

import scala.concurrent.ExecutionContext
import scala.concurrent.duration._
import scala.concurrent.forkjoin.ForkJoinPool

class EventCollector extends ClusterRoledWorker {

  @volatile var recordCounter : Int = 0

  def receive = {
    case MemberUp(member) =>
      log.info("Member is Up: {}", member.address)
    case UnreachableMember(member) =>
      log.info("Member detected as Unreachable: {}", member)
    case MemberRemoved(member, previousStatus) =>
      log.info("Member is Removed: {} after {}", member.address, previousStatus)
    case _: MemberEvent => // ignore

    case Registration => {
      // watch the interceptor that sent the registration message; if that Actor terminates, a Terminated message is received
      context watch sender
      workers = workers :+ sender
      log.info("Interceptor registered: " + sender)
      log.info("Registered interceptors: " + workers.size)
    }
    case Terminated(interceptingActorRef) =>
      // an interceptor terminated; remove its ActorRef from the cache
      workers = workers.filterNot(_ == interceptingActorRef)
    case RawNginxRecord(sourceHost, line) => {
      // build an NginxRecord message and send it to a downstream interceptor
      val eventCode = "eventcode=(\\d+)".r.findFirstIn(line).get
      log.info("Raw message: eventCode=" + eventCode + ", sourceHost=" + sourceHost + ", line=" + line)
      recordCounter += 1
      if(workers.size > 0) {
        // simulate round-robin dispatch: send the log-line message to one interceptor in the downstream group
        val interceptorIndex = (if(recordCounter < 0) 0 else recordCounter) % workers.size
        workers(interceptorIndex) ! NginxRecord(sourceHost, eventCode, line)
        log.info("Details: interceptorIndex=" + interceptorIndex + ", interceptors=" + workers.size)
      }
    }
  }

}

/**
 * An Actor that simulates sending log-record messages
 */
class EventClientActor extends Actor with ActorLogging {

  implicit val ec: ExecutionContext = ExecutionContext.fromExecutor(new ForkJoinPool())

  def receive = {
    case _=>
  }

  val events = Map(
    "2751" -> List(
      """10.10.2.72 [21/Aug/2015:18:29:19 +0800] "GET /t.gif?installid=0000lAOX&udid=25371384b2eb1a5dc5643e14626ecbd4&sessionid=25371384b2eb1a5dc5643e14626ecbd41440152875362&imsi=460002830862833&operator=1&network=1&timestamp=1440152954&action=14&eventcode=300039&page=200002& HTTP/1.0" "-" 204 0 "-" "Dalvik/1.6.0 (Linux; U; Android 4.4.4; R8207 Build/KTU84P)" "121.25.190.146"""",
      """10.10.2.8 [21/Aug/2015:18:29:19 +0800] "GET /t.gif?installid=0000VACO&udid=f6b0520cbc36fda6f63a72d91bf305c0&imsi=460012927613645&operator=2&network=1&timestamp=1440152956&action=1840&eventcode=100003&type=1&result=0& HTTP/1.0" "-" 204 0 "-" "Dalvik/1.6.0 (Linux; U; Android 4.4.2; GT-I9500 Build/KOT49H)" "61.175.219.69""""
    ),
    "2752" -> List(
      """10.10.2.72 [21/Aug/2015:18:29:19 +0800] "GET /t.gif?installid=0000gCo4&udid=636d127f4936109a22347b239a0ce73f&sessionid=636d127f4936109a22347b239a0ce73f1440150695096&imsi=460036010038180&operator=3&network=4&timestamp=1440152902&action=1566&eventcode=101010&playid=99d5a59f100cb778b64b5234a189e1f4&radioid=1100000048450&audioid=1000001535718&playtime=3& HTTP/1.0" "-" 204 0 "-" "Dalvik/1.6.0 (Linux; U; Android 4.4.4; R8205 Build/KTU84P)" "106.38.128.67"""",
      """10.10.2.72 [21/Aug/2015:18:29:19 +0800] "GET /t.gif?installid=0000kPSC&udid=2ee585cde388ac57c0e81f9a76f5b797&operator=0&network=1&timestamp=1440152968&action=6423&eventcode=100003&type=1&result=0& HTTP/1.0" "-" 204 0 "-" "Dalvik/v3.3.85 (Linux; U; Android L; P8 Build/KOT49H)" "202.103.133.112"""",
      """10.10.2.72 [21/Aug/2015:18:29:19 +0800] "GET /t.gif?installid=0000lABW&udid=face1161d739abacca913dcb82576e9d&sessionid=face1161d739abacca913dcb82576e9d1440151582673&operator=0&network=1&timestamp=1440152520&action=1911&eventcode=101010&playid=b07c241010f8691284c68186c42ab006&radioid=1100000000762&audioid=1000001751983&playtime=158& HTTP/1.0" "-" 204 0 "-" "Dalvik/1.6.0 (Linux; U; Android 4.1; H5 Build/JZO54K)" "221.232.36.250""""
    ),
    "2753" -> List(
      """10.10.2.8 [21/Aug/2015:18:29:19 +0800] "GET /t.gif?installid=0000krJw&udid=939488333889f18e2b406d2ece8f938a&sessionid=939488333889f18e2b406d2ece8f938a1440137301421&imsi=460028180045362&operator=1&network=1&timestamp=1440152947&action=1431&eventcode=300030&playid=e1fd5467085475dc4483d2795f112717&radioid=1100000001123&audioid=1000000094911&playtime=951992& HTTP/1.0" "-" 204 0 "-" "Dalvik/1.6.0 (Linux; U; Android 4.0.4; R813T Build/IMM76D)" "5.45.64.205"""",
      """10.10.2.72 [21/Aug/2015:18:29:19 +0800] "GET /t.gif?installid=0000kcpz&udid=cbc7bbb560914c374cb7a29eef8c2144&sessionid=cbc7bbb560914c374cb7a29eef8c21441440152816008&imsi=460008782944219&operator=1&network=1&timestamp=1440152873&action=360&eventcode=200003&page=200003&radioid=1100000046018& HTTP/1.0" "-" 204 0 "-" "Dalvik/v3.3.85 (Linux; U; Android 4.4.2; MX4S Build/KOT49H)" "119.128.106.232"""",
      """10.10.2.8 [21/Aug/2015:18:29:19 +0800] "GET /t.gif?installid=0000juRL&udid=3f9a5ffa69a5cd5f0754d2ba98c0aeb2&imsi=460023744091238&operator=1&network=1&timestamp=1440152957&action=78&eventcode=100003&type=1&result=0& HTTP/1.0" "-" 204 0 "-" "Dalvik/v3.3.85 (Linux; U; Android 4.4.3; S?MSUNG. Build/KOT49H)" "223.153.72.78""""
    )
  )

  val ports = Seq("2751","2752", "2753")
  val actors = scala.collection.mutable.HashMap[String, ActorRef]()

  ports.foreach { port =>
    // create a Config object
    val config = ConfigFactory.parseString("akka.remote.netty.tcp.port=" + port)
        .withFallback(ConfigFactory.parseString("akka.cluster.roles = [collector]"))
        .withFallback(ConfigFactory.load())
    // create an ActorSystem instance
    val system = ActorSystem("event-cluster-system", config)
    actors(port) = system.actorOf(Props[EventCollector], name = "collectingActor")
  }

  Thread.sleep(30000)

  context.system.scheduler.schedule(0 millis, 5000 millis) {
    // use Akka's Scheduler to simulate sending log-record messages periodically
    ports.foreach { port =>
      events(port).foreach { line =>
        println("RAW: port=" + port + ", line=" + line)
        actors(port) ! RawNginxRecord("host.me:" + port, line)
      }
    }
  }
}

object EventClient extends App {

  val system = ActorSystem("client")
  // create an EventClientActor instance
  val clientActorRef = system.actorOf(Props[EventClientActor], name = "clientActor")
  system.log.info("Client actor started: " + clientActorRef)
}

In the code above, EventClientActor does not belong to the event-cluster-system Akka cluster we created; it is a node outside the cluster that simulates sending messages to the collector-role nodes.
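The round-robin dispatch used by EventCollector above can be isolated as a pure function, which makes its negative-counter guard easy to verify on its own:

```scala
// The round-robin index selection from EventCollector, as a pure function:
// guard against a negative (overflowed) counter, then pick a worker index.
def pickWorker(counter: Int, workerCount: Int): Int = {
  require(workerCount > 0, "need at least one registered worker")
  (if (counter < 0) 0 else counter) % workerCount
}
```

The guard matters because `recordCounter` is a plain Int that will eventually overflow to a negative value on a long-running collector; without it, `%` on a negative operand would yield a negative index.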

  • interceptor implementation

Similar to the collector, the interceptor's Actor implementation class is EventInterceptor:

package org.shirdrn.scala.akka.cluster

import akka.actor._
import akka.cluster.ClusterEvent._
import akka.cluster.Member
import akka.cluster.MemberStatus
import com.typesafe.config.ConfigFactory
import net.sf.json.JSONObject
import org.shirdrn.scala.akka.cluster.utils.DatetimeUtils

class EventInterceptor extends ClusterRoledWorker {

  @volatile var interceptedRecords : Int = 0
  val IP_PATTERN = "[^\\s]+\\s+\\[([^\\]]+)\\].+\"(\\d+\\.\\d+\\.\\d+\\.\\d+)\"".r
  val blackIpList = Array(
    "5.9.116.101", "103.42.176.138", "123.182.148.65", "5.45.64.205",
    "27.159.226.192", "76.164.228.218", "77.79.178.186", "104.200.31.117",
    "104.200.31.32", "104.200.31.238", "123.182.129.108", "220.161.98.39",
    "59.58.152.90", "117.26.221.236", "59.58.150.110", "123.180.229.156",
    "59.60.123.239", "117.26.222.6", "117.26.220.88", "59.60.124.227",
    "142.54.161.50", "59.58.148.52", "59.58.150.85", "202.105.90.142"
  ).toSet

  log.info("Black IP count: " + blackIpList.size)
  blackIpList.foreach(log.info(_))

  def receive = {
    case MemberUp(member) =>
      log.info("Member is Up: {}", member.address)
      register(member, getCollectorPath)
    case state: CurrentClusterState =>
      // on receiving the current cluster state, call register for every member already in the Up state (the collector nodes)
      state.members.filter(_.status == MemberStatus.Up) foreach(register(_, getCollectorPath))
    case UnreachableMember(member) =>
      log.info("Member detected as Unreachable: {}", member)
    case MemberRemoved(member, previousStatus) =>
      log.info("Member is Removed: {} after {}", member.address, previousStatus)
    case _: MemberEvent => // ignore

    case Registration => {
      context watch sender
      workers = workers :+ sender
      log.info("Processor registered: " + sender)
      log.info("Registered processors: " + workers.size)
    }
    case Terminated(processingActorRef) =>
      workers = workers.filterNot(_ == processingActorRef)
    case NginxRecord(sourceHost, eventCode, line) => {
      val (isIpInBlackList, data) = checkRecord(eventCode, line)
      if(!isIpInBlackList) {
        interceptedRecords += 1
        if(workers.size > 0) {
          val processorIndex = (if (interceptedRecords < 0) 0 else interceptedRecords) % workers.size
          workers(processorIndex) ! FilteredRecord(sourceHost, eventCode, line, data.getString("eventdate"), data.getString("realip"))
          log.info("Details: processorIndex=" + processorIndex + ", processors=" + workers.size)
        }
        log.info("Intercepted data: data=" + data)
      } else {
        log.info("Discarded: " + line)
      }
    }
  }

  def getCollectorPath(member: Member): ActorPath = {
    RootActorPath(member.address) / "user" / "collectingActor"
  }

  /**
   * Check whether the IP address in the message sent by the collector is on the blacklist
   */
  private def checkRecord(eventCode: String, line: String): (Boolean, JSONObject) = {
    val data: JSONObject = new JSONObject()
    var isIpInBlackList = false
    IP_PATTERN.findFirstMatchIn(line).foreach { m =>
      val rawDt = m.group(1)
      val dt = DatetimeUtils.format(rawDt)
      val realIp = m.group(2)

      data.put("eventdate", dt)
      data.put("realip", realIp)
      data.put("eventcode", eventCode)
      isIpInBlackList = blackIpList.contains(realIp)
    }
    (isIpInBlackList, data)
  }
}

object EventInterceptor extends App {

  Seq("2851","2852").foreach { port =>
    val config = ConfigFactory.parseString("akka.remote.netty.tcp.port=" + port)
      .withFallback(ConfigFactory.parseString("akka.cluster.roles = [interceptor]"))
      .withFallback(ConfigFactory.load())
    val system = ActorSystem("event-cluster-system", config)
    val processingActor = system.actorOf(Props[EventInterceptor], name = "interceptingActor")
    system.log.info("Processing Actor: " + processingActor)
  }
}

In the code above, the IP address is parsed out of the Nginx log record and checked against the IP blacklist; if it is on the blacklist, the record is discarded.
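The IP_PATTERN regex from EventInterceptor can be checked standalone. The sample line below is shortened from the article's data for readability (the request path and user agent are placeholders):

```scala
// Standalone check of EventInterceptor's IP_PATTERN: group(1) captures the
// bracketed date, group(2) the last quoted dotted-quad (the real client IP).
val IP_PATTERN = "[^\\s]+\\s+\\[([^\\]]+)\\].+\"(\\d+\\.\\d+\\.\\d+\\.\\d+)\"".r

def extract(line: String): Option[(String, String)] =
  IP_PATTERN.findFirstMatchIn(line).map(m => (m.group(1), m.group(2)))

val sample = """10.10.2.8 [21/Aug/2015:18:29:19 +0800] "GET /t.gif?x=1& HTTP/1.0" "-" 204 0 "-" "UA" "61.175.219.69""""
```

Because the `.+` in the middle is greedy, the final quoted group matches the last dotted-quad on the line, which in this log format is the real client IP appended by the proxy, not the leading internal address.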

  • processor implementation

The implementation of EventProcessor is shown below:

package org.shirdrn.scala.akka.cluster

import java.util.Properties

import akka.actor._
import akka.cluster.ClusterEvent._
import akka.cluster.Member
import akka.cluster.MemberStatus
import com.typesafe.config.ConfigFactory
import kafka.producer.{KeyedMessage, Producer, ProducerConfig}
import net.sf.json.JSONObject

class EventProcessor extends ClusterRoledWorker {

  val topic = "app_events"
  val producer = KakfaUtils.createProcuder

  def receive = {
    case MemberUp(member) =>
      log.info("Member is Up: {}", member.address)
      // register this processor with the upstream interceptor
      register(member, getProcessorPath)
    case state: CurrentClusterState =>
      state.members.filter(_.status == MemberStatus.Up).foreach(register(_, getProcessorPath))
    case UnreachableMember(member) =>
      log.info("Member detected as Unreachable: {}", member)
    case MemberRemoved(member, previousStatus) =>
      log.info("Member is Removed: {} after {}", member.address, previousStatus)
    case _: MemberEvent => // ignore

    case FilteredRecord(sourceHost, eventCode, line, nginxDate, realIp) => {
      val data = process(eventCode, line, nginxDate, realIp)
      log.info("Processed: data=" + data)
      // save the parsed message to Kafka as a JSON string
      producer.send(new KeyedMessage[String, String](topic, sourceHost, data.toString))
    }
  }

  def getProcessorPath(member: Member): ActorPath = {
    RootActorPath(member.address) / "user" / "interceptingActor"
  }

  private def process(eventCode: String, line: String, eventDate: String, realIp: String): JSONObject = {
    val data: JSONObject = new JSONObject()
    "[\\?|&]{1}([^=]+)=([^&]+)&".r.findAllMatchIn(line) foreach { m =>
      val key = m.group(1)
      val value = m.group(2)
      data.put(key, value)
    }
    data.put("eventdate", eventDate)
    data.put("realip", realIp)
    data
  }
}

object KakfaUtils {
  // bin/kafka-topics.sh --create -zookeeper zk1:2181,zk2:2181,zk3:2181/data-dept/kafka --replication-factor 2 --partitions 2 --topic app_events
  val props = new Properties()
  val config = Map(
    "metadata.broker.list" -> "hadoop2:9092,hadoop3:9092",
    "serializer.class" -> "kafka.serializer.StringEncoder",
    "producer.type" -> "async"
  )
  config.foreach(entry => props.put(entry._1, entry._2))
  val producerConfig = new ProducerConfig(props)

  def createProcuder() : Producer[String, String] = {
    new Producer[String, String](producerConfig)
  }
}

object EventProcessor extends App {

  // start 5 EventProcessor instances
  Seq("2951","2952", "2953", "2954", "2955") foreach { port =>
    val config = ConfigFactory.parseString("akka.remote.netty.tcp.port=" + port)
      .withFallback(ConfigFactory.parseString("akka.cluster.roles = [processor]"))
      .withFallback(ConfigFactory.load())
    val system = ActorSystem("event-cluster-system", config)
    val processingActor = system.actorOf(Props[EventProcessor], name = "processingActor")
    system.log.info("Processing Actor: " + processingActor)
  }
}

The Actor implementation class for the processor role is EventProcessor; in its companion object we create 5 instances, each bound to a different port. The parsed Nginx log records are finally saved to Kafka, as in the samples below:

{"installid":"0000VACO","imsi":"460012927613645","network":"1","action":"1840","type":"1","eventdate":"2015-08-21 18:29:19","realip":"61.175.219.69"}
{"installid":"0000kcpz","sessionid":"cbc7bbb560914c374cb7a29eef8c21441440152816008","operator":"1","timestamp":"1440152873","eventcode":"200003","radioid":"1100000046018","eventdate":"2015-08-21 18:29:19","realip":"119.128.106.232"}
{"installid":"0000lAOX","sessionid":"25371384b2eb1a5dc5643e14626ecbd41440152875362","operator":"1","timestamp":"1440152954","eventcode":"300039","eventdate":"2015-08-21 18:29:19","realip":"121.25.190.146"}
