Twisted Source Code Analysis, Part 1

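Twisted is an event-driven networking framework written in Python. Twisted has been around for many years, and newer high-performance asynchronous I/O frameworks such as Tornado have appeared since, but it is still well worth studying. If you want a tutorial, Twisted has a very good one, Twisted Introduction; a translation is also available.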


Now to the main topic.

We start our analysis with a simple example:

from twisted.internet.protocol import ServerFactory, Protocol


class PoetryProtocol(Protocol):

    def connectionMade(self):
        self.transport.write(self.factory.poem)
        self.transport.loseConnection()


class PoetryFactory(ServerFactory):

    protocol = PoetryProtocol

    def __init__(self, poem):
        self.poem = poem


def main():
    options, poetry_file = parse_args()

    poem = open(poetry_file).read()

    factory = PoetryFactory(poem)

    from twisted.internet import reactor

    port = reactor.listenTCP(options.port or 0, factory,
                             interface=options.iface)

    print('Serving %s on %s.' % (poetry_file, port.getHost()))

    reactor.run()

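To keep the post short, only part of the code is shown here; see the full source for the rest.
This is a very simple server: whenever a client connects, it sends the full text of a poem and then closes the connection. Here we focus only on the reactor. The reactor is the event loop manager: it registers, runs, and removes event sources, and invokes callbacks when events occur. Note that the reactor loop runs in the thread that calls reactor.run(), normally the main thread. Once started, it keeps running until reactor.stop() is called. In Twisted the reactor is a singleton: it is created the first time you import the reactor module, and later imports elsewhere in your application return that same object.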

from twisted.internet import reactor

The import above is Twisted's default way of obtaining the reactor. Let us see how this code implements the singleton:

# /twisted/internet/reactor.py
from __future__ import division, absolute_import

import sys
del sys.modules['twisted.internet.reactor']
from twisted.internet import default
default.install()
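
The reactor.py snippet above relies on how import consults sys.modules. As a minimal sketch of that lookup, using only the standard library (the module name demo_cached_module is made up for this demonstration and has no file on disk):

```python
import sys
import types

# Hypothetical module name used only for this demonstration.
NAME = "demo_cached_module"

# Build a module object by hand and register it in the import cache.
fake = types.ModuleType(NAME)
fake.answer = 42
sys.modules[NAME] = fake

# No file called demo_cached_module.py exists, yet the import succeeds
# because the import system finds the name in sys.modules first.
import demo_cached_module

same_object = demo_cached_module is fake
```

Because the cache is an ordinary dictionary, code can put any object under a module name, which is the property Twisted's reactor installation depends on.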

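On the first import, the module first deletes its own entry, "twisted.internet.reactor", from the module dictionary. Python registers a module in sys.modules before running its body, so the entry is always there at this point, and removing it lets installReactor (shown later) register the reactor object without raising ReactorAlreadyInstalledError. It then installs the default reactor. sys.modules is a global dictionary that maps module names to module objects. When a module is imported, Python checks this dictionary first. If the module is already loaded, its name is simply bound in the importing module's namespace. Otherwise Python searches the directories in sys.path for the module file, loads it into memory, adds the name-to-module mapping to the dictionary, and then binds the name in the importing module's namespace. The code in default.py is: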

# /twisted/internet/default.py

from __future__ import division, absolute_import

__all__ = ["install"]

from twisted.python.runtime import platform


def _getInstallFunction(platform):
    try:
        if platform.isLinux():
            try:
                from twisted.internet.epollreactor import install
            except ImportError:
                from twisted.internet.pollreactor import install
        elif platform.getType() == 'posix' and not platform.isMacOSX():
            from twisted.internet.pollreactor import install
        else:
            from twisted.internet.selectreactor import install
    except ImportError:
        from twisted.internet.selectreactor import install
    return install


install = _getInstallFunction(platform)

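This picks a reactor according to the platform. On Linux it prefers epollreactor and falls back to pollreactor if that import fails. On other POSIX systems except macOS it uses pollreactor. Everywhere else, including Windows and macOS, and whenever the chosen import fails, it uses selectreactor. Here we study pollreactor.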
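
The same "prefer the best mechanism, fall back to a simpler one" idea can be sketched with the standard select module alone. This is not what Twisted does (it imports whole reactor modules and checks the platform); pick_poller_name is a hypothetical helper that only checks which primitives the running interpreter exposes:

```python
import select


def pick_poller_name():
    """Return the name of the best polling primitive available here."""
    # epoll exists only on Linux, poll on most POSIX systems.
    for name in ("epoll", "poll"):
        if hasattr(select, name):
            return name
    # select.select is available on every platform, including Windows.
    return "select"


chosen = pick_poller_name()
```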

# /twisted/internet/pollreactor.py

def install():
    """Install the poll() reactor."""
    p = PollReactor()
    from twisted.internet.main import installReactor
    installReactor(p)


# /twisted/internet/main.py

def installReactor(reactor):
    """
    Install reactor C{reactor}.
    @param reactor: An object that provides one or more IReactor* interfaces.
    """
    # this stuff should be common to all reactors.
    import twisted.internet
    import sys
    if 'twisted.internet.reactor' in sys.modules:
        raise error.ReactorAlreadyInstalledError("reactor already installed")
    twisted.internet.reactor = reactor
    sys.modules['twisted.internet.reactor'] = reactor

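Here the reactor instance is assigned to the twisted.internet.reactor attribute and also stored under the "twisted.internet.reactor" key of the module dictionary, so every later import of reactor gets this singleton.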
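
To see the trick in isolation, here is a minimal sketch that does not need Twisted. The package name demo_reactor_pkg, the FakeReactor class, and the ReactorAlreadyInstalled exception are all stand-ins invented for this example; only the two assignments mirror installReactor:

```python
import sys
import types


class ReactorAlreadyInstalled(Exception):
    """Hypothetical stand-in for Twisted's ReactorAlreadyInstalledError."""


class FakeReactor(object):
    """Hypothetical placeholder for a real reactor instance."""


def install_reactor(package, reactor):
    name = package.__name__ + ".reactor"
    if name in sys.modules:
        raise ReactorAlreadyInstalled("reactor already installed")
    # Same two steps as installReactor: set the package attribute and
    # register the instance (not a module) in the import cache.
    package.reactor = reactor
    sys.modules[name] = reactor


# A hand-made package object, so the example needs no files on disk.
demo_pkg = types.ModuleType("demo_reactor_pkg")
sys.modules["demo_reactor_pkg"] = demo_pkg
install_reactor(demo_pkg, FakeReactor())

# Every later import yields the one installed instance.
from demo_reactor_pkg import reactor as first
from demo_reactor_pkg import reactor as second

try:
    install_reactor(demo_pkg, FakeReactor())
    second_install_rejected = False
except ReactorAlreadyInstalled:
    second_install_rejected = True
```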

# /twisted/internet/pollreactor.py

@implementer(IReactorFDSet)
class PollReactor(posixbase.PosixReactorBase, posixbase._PollLikeMixin):

    _POLL_DISCONNECTED = (POLLHUP | POLLERR | POLLNVAL)
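    # POLLHUP: hang-up, the peer closed the connection
    # POLLNVAL: invalid request, the file descriptor is not open
    # POLLERR: an error condition on the descriptor
    _POLL_IN = POLLIN # data is available to read
    _POLL_OUT = POLLOUT # writing will not block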

    def __init__(self):
        """
        Initialize the polling object, the file descriptor tracking
        dictionaries, and the base class.
        """
        self._poller = poll() # the select.poll() polling object
        self._selectables = {}
        self._reads = {}
        self._writes = {}
        posixbase.PosixReactorBase.__init__(self)

    def _updateRegistration(self, fd):
        """
        Update the polling object's registration for this file descriptor.
        """
        try:
            self._poller.unregister(fd)
            # stop the polling object from tracking this file descriptor
        except KeyError:
            pass

        mask = 0
        if fd in self._reads:
            mask = mask | POLLIN
        if fd in self._writes:
            mask = mask | POLLOUT
        if mask != 0:
            self._poller.register(fd, mask) 
        else:
            if fd in self._selectables:
                del self._selectables[fd]

    def _dictRemove(self, selectable, mdict):
        try:
            # the easy way
            fd = selectable.fileno()
            # make sure the file descriptor is a real one
            mdict[fd]
        except:
            for fd, fdes in self._selectables.items():
                if selectable is fdes:
                    break
            else:
                return
        if fd in mdict:
            del mdict[fd]
            self._updateRegistration(fd)

    def addReader(self, reader):
        fd = reader.fileno()
        if fd not in self._reads:
            self._selectables[fd] = reader
            self._reads[fd] =  1
            self._updateRegistration(fd)

    def addWriter(self, writer):
        """Add a FileDescriptor for notification of data available to write.
        """
        fd = writer.fileno()
        if fd not in self._writes:
            self._selectables[fd] = writer
            self._writes[fd] =  1
            self._updateRegistration(fd)

    def removeReader(self, reader):
        return self._dictRemove(reader, self._reads)

    def removeWriter(self, writer):
        return self._dictRemove(writer, self._writes)

    def removeAll(self):
        return self._removeAll(
            [self._selectables[fd] for fd in self._reads],
            [self._selectables[fd] for fd in self._writes])

    # This is the key part
    def doPoll(self, timeout):
        """Poll the poller for new events."""
        if timeout is not None:
            timeout = int(timeout * 1000) # convert seconds to milliseconds

        try:
            l = self._poller.poll(timeout)
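            # Returns a possibly empty list of (fd, event) pairs. fd is
            # the descriptor of a socket with pending events, and event
            # is a bitmask made up of POLLIN, POLLOUT and the other
            # flags defined above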
        except SelectError as e:
            if e.args[0] == errno.EINTR:
                # the system call was interrupted
                return
            else:
                # re-raise the exception
                raise
        _drdw = self._doReadOrWrite
        for fd, event in l:
            try:
                selectable = self._selectables[fd]
            except KeyError:
                continue
            log.callWithLogger(selectable, _drdw, selectable, fd, event)

    doIteration = doPoll # called by mainLoop to drive the event loop

    def getReaders(self):
        return [self._selectables[fd] for fd in self._reads]


    def getWriters(self):
        return [self._selectables[fd] for fd in self._writes]
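
The registration logic of PollReactor can be tried out without Twisted. The sketch below assumes a POSIX system where select.poll exists; MiniPoller and its method names are my own simplification, keeping only the idea of _updateRegistration (unregister, recompute the mask, register again):

```python
import select
import socket


class MiniPoller(object):
    """Hypothetical, stripped-down version of PollReactor's bookkeeping."""

    def __init__(self):
        self._poller = select.poll()
        self._reads = set()
        self._writes = set()

    def _update_registration(self, fd):
        try:
            self._poller.unregister(fd)
        except KeyError:
            pass  # fd was not registered yet
        mask = 0
        if fd in self._reads:
            mask |= select.POLLIN
        if fd in self._writes:
            mask |= select.POLLOUT
        if mask:
            self._poller.register(fd, mask)

    def add_reader(self, fd):
        self._reads.add(fd)
        self._update_registration(fd)

    def add_writer(self, fd):
        self._writes.add(fd)
        self._update_registration(fd)

    def poll(self, timeout):
        # Like doPoll: seconds in, milliseconds for poll().
        return self._poller.poll(int(timeout * 1000))


def demo():
    left, right = socket.socketpair()
    try:
        poller = MiniPoller()
        poller.add_reader(left.fileno())
        before = poller.poll(0)      # nothing has been sent yet
        right.send(b"x")
        after = poller.poll(1.0)     # left is now readable
        poller.add_writer(left.fileno())
        both = poller.poll(1.0)      # readable and writable together
        return before, after, both
    finally:
        left.close()
        right.close()


before, after, both = demo()
```

The last poll shows that one descriptor can report several flags at once, so the event value has to be tested with bitwise AND rather than compared for equality.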

implementer declares that PollReactor implements the methods of the IReactorFDSet interface, which is defined in:
/twisted/internet/interfaces.py

_doReadOrWrite is implemented in _PollLikeMixin, a base class of PollReactor:

# twisted/internet/posixbase.py

class _PollLikeMixin(object):
    """
    Mixin for poll-like reactors.
    Subclasses must define the following attributes::
      - _POLL_DISCONNECTED - Bitmask for events indicating a connection was
        lost.
      - _POLL_IN - Bitmask for events indicating there is input to read.
      - _POLL_OUT - Bitmask for events indicating output can be written.
    Must be mixed in to a subclass of PosixReactorBase (for
    _disconnectSelectable).
    """

    def _doReadOrWrite(self, selectable, fd, event):
        """
        fd is available for read or write: do the work and raise errors if necessary.
        """
        why = None
        inRead = False
        if event & self._POLL_DISCONNECTED and not (event & self._POLL_IN):
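            # Handle disconnection, but only once all pending input has
            # been processed
            if fd in self._reads:
                # There will be no more read events: reading is finished,
                # and the other end has probably closed the connection
                inRead = True
                why = CONNECTION_DONE
            else:
                # If we were not reading from this descriptor, this can
                # only be an unclean close
                why = CONNECTION_LOST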
        else:
            try:
                if selectable.fileno() == -1:
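                    # the socket has already been closed
                    why = _NO_FILEDESC
                else:
                    if event & self._POLL_IN:
                        # handle the read event
                        why = selectable.doRead()
                        inRead = True
                    if not why and event & self._POLL_OUT:
                        # Handle the write event. On Linux POLLIN is 1
                        # and POLLOUT is 4, so a mask of 5 means the
                        # descriptor is both readable and writable. In
                        # my experiments a closing connection reported
                        # mask 5 (this is only my own reading). In that
                        # case doRead returns a reason, and the
                        # `not why` check skips the write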
                        why = selectable.doWrite()
                        inRead = False
            except:
                # Any exception from application code gets logged and will
                # cause us to disconnect the selectable.
                why = sys.exc_info()[1]
                log.err()
        if why:
            # handle the closed connection
            self._disconnectSelectable(selectable, why, inRead)

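_doReadOrWrite dispatches on the event reported for each socket: it calls the corresponding doRead or doWrite method, or closes the connection and reports the error.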
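
The branching on the event bitmask can be written as a small pure function. This is a simplified sketch of my own (classify and its string labels are not Twisted names), and it leaves out one detail of the real method: there, the write is skipped when doRead returns a failure reason. It assumes a POSIX system where the POLL* constants exist.

```python
import select

DISCONNECTED = select.POLLHUP | select.POLLERR | select.POLLNVAL


def classify(event, is_reader):
    """Map a poll() event bitmask to the actions the reactor would take."""
    if (event & DISCONNECTED) and not (event & select.POLLIN):
        # Disconnected with no input left to drain.
        return ["connection_done"] if is_reader else ["connection_lost"]
    actions = []
    if event & select.POLLIN:
        actions.append("read")
    if event & select.POLLOUT:
        actions.append("write")
    return actions


readable_and_writable = classify(select.POLLIN | select.POLLOUT, True)
hangup_while_reading = classify(select.POLLHUP, True)
hangup_not_reading = classify(select.POLLHUP, False)
hangup_with_pending_input = classify(select.POLLHUP | select.POLLIN, True)
```

The last case is why the disconnect test also checks POLLIN: pending input is read first, and the disconnect is handled on a later iteration.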

port = reactor.listenTCP(options.port or 0, factory,
                             interface=options.iface)

Here the reactor starts listening on a port. listenTCP is implemented in PosixReactorBase, a base class of the reactor:

# /twisted/internet/posixbase.py
@implementer(IReactorTCP, IReactorUDP, IReactorMulticast)
class PosixReactorBase(_SignalReactorMixin, _DisconnectSelectableMixin,ReactorBase):

    def listenTCP(self, port, factory, backlog=50, interface=''):
        p = tcp.Port(port, factory, backlog, interface, self)
        p.startListening()
        return p

# /twisted/internet/tcp.py
@implementer(interfaces.IListeningPort)
class Port(base.BasePort, _SocketCloser):
    def __init__(self, port, factory, backlog=50, interface='', reactor=None):
        """Initialize with a numeric port to listen on.
        """
        base.BasePort.__init__(self, reactor=reactor)
        self.port = port
        self.factory = factory
        self.backlog = backlog
        if abstract.isIPv6Address(interface):
            self.addressFamily = socket.AF_INET6
            self._addressType = address.IPv6Address
        self.interface = interface

    def startListening(self):
        """Create and bind the socket, then start listening."""
        # Check whether a previously created socket can be reused
        if self._preexistingSocket is None:
            try:
                skt = self.createInternetSocket()
                if self.addressFamily == socket.AF_INET6:
                    addr = _resolveIPv6(self.interface, self.port)
                else:
                    addr = (self.interface, self.port)
                skt.bind(addr)
            except socket.error as le:
                raise CannotListenError(self.interface, self.port, le)
            skt.listen(self.backlog)
        else:
            skt = self._preexistingSocket
            self._preexistingSocket = None
            self._shouldShutdown = False

        # Make sure that if we listened on port 0, we update that to
        # reflect what the OS actually assigned us.
        self._realPortNumber = skt.getsockname()[1]

        log.msg("%s starting on %s" % (
                self._getLogPrefix(self.factory), self._realPortNumber))

        # The order of the next 5 lines is kind of bizarre.  If no one
        # can explain it, perhaps we should re-arrange them.
        self.factory.doStart() # start the factory
        self.connected = True
        self.socket = skt
        self.fileno = self.socket.fileno
        self.numberAccepts = 100

        self.startReading()
        # adds this object to the set tracked by the reactor's polling object
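
The socket-level steps in startListening above can be reproduced with the standard library. The sketch below is an assumption-laden simplification: listen_on is a hypothetical helper, it handles IPv4 only, and it binds to 127.0.0.1 so that it is safe to run anywhere. It shows why _realPortNumber is read back with getsockname(): with port 0 the OS picks a free port.

```python
import socket


def listen_on(port=0, interface="127.0.0.1", backlog=50):
    """Create, bind, and listen on a non-blocking TCP socket."""
    skt = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        skt.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        skt.setblocking(False)
        skt.bind((interface, port))
        skt.listen(backlog)
    except socket.error:
        skt.close()
        raise
    # With port 0 the OS assigns a free port, so ask the socket for it.
    return skt, skt.getsockname()[1]


listener, real_port = listen_on()
listener.close()
```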

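Here listenTCP creates a socket listening on the given port and adds it to the set tracked by the reactor's polling object. As soon as a client connects to the server, the reactor notices and handles it. listenTCP returns a Port object, and when a client requests a connection the Port's doRead method is called: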

# twisted/internet/tcp.py

def doRead(self):
        try:
            if platformType == "posix":
                numAccepts = self.numberAccepts
            else:
                # on win32 call socket.accept only once per event loop iteration
                numAccepts = 1
            for i in range(numAccepts):
                if self.disconnecting:
                    return
                try:
                    skt, addr = self.socket.accept()
                    # the socket for the new client connection
                except socket.error as e:
                    if e.args[0] in (EWOULDBLOCK, EAGAIN):
                        # EWOULDBLOCK: the operation would block
                        # EAGAIN: try again
                        self.numberAccepts = i
                        break
                    elif e.args[0] == EPERM:
                        # EPERM: operation not permitted
                        # Netfilter on Linux may have rejected the
                        # connection, but we get told to try to accept()
                        # anyway.
                        continue
                    elif e.args[0] in (EMFILE, ENOBUFS, ENFILE, ENOMEM, ECONNABORTED):
                        # EMFILE: too many open files
                        # ENOBUFS: no buffer space available
                        # ENFILE: file table overflow

                        # Linux gives EMFILE when a process is not allowed
                        # to allocate any more file descriptors.  *BSD and
                        # Win32 give (WSA)ENOBUFS.  Linux can also give
                        # ENFILE if the system is out of inodes, or ENOMEM
                        # if there is insufficient memory to allocate a new
                        # dentry.  ECONNABORTED is documented as possible on
                        # both Linux and Windows, but it is not clear
                        # whether there are actually any circumstances under
                        # which it can happen (one might expect it to be
                        # possible if a client sends a FIN or RST after the
                        # server sends a SYN|ACK but before application code
                        # calls accept(2), however at least on Linux this
                        # _seems_ to be short-circuited by syncookies.

                        log.msg("Could not accept new connection (%s)" % (
                            errorcode[e.args[0]],))
                        break
                    raise

                fdesc._setCloseOnExec(skt.fileno())
                protocol = self.factory.buildProtocol(self._buildAddr(addr))
                if protocol is None:
                    skt.close()
                    continue
                s = self.sessionno
                self.sessionno = s+1
                transport = self.transport(skt, protocol, addr, self, s, self.reactor)
                protocol.makeConnection(transport)
            else:
                self.numberAccepts = self.numberAccepts+20
        except:
            log.deferr()

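In doRead, accept returns a socket for talking to the client. The factory then builds a Protocol object, a transport is created around the socket and the protocol, and the transport is added to the reactor's read set.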
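
The core of Port.doRead is "accept until the kernel says there is nothing left". A minimal sketch of that loop with plain sockets follows; accept_pending and demo are hypothetical names, the error handling is reduced to the EWOULDBLOCK/EAGAIN case, and the demo uses select.select only to wait until the listener is readable:

```python
import errno
import select
import socket


def accept_pending(listener, max_accepts=100):
    """Accept queued connections until accept() would block."""
    accepted = []
    for _ in range(max_accepts):
        try:
            conn, _addr = listener.accept()
        except socket.error as e:
            if e.errno in (errno.EWOULDBLOCK, errno.EAGAIN):
                break  # the accept queue is empty
            raise
        accepted.append(conn)
    return accepted


def demo(client_count=2):
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    clients, accepted = [], []
    try:
        listener.bind(("127.0.0.1", 0))
        listener.listen(50)
        listener.setblocking(False)
        for _ in range(client_count):
            clients.append(socket.create_connection(listener.getsockname()))
        while len(accepted) < client_count:
            ready, _, _ = select.select([listener], [], [], 2.0)
            if not ready:
                break  # give up instead of waiting forever
            accepted.extend(accept_pending(listener))
        return len(accepted)
    finally:
        for s in clients + accepted + [listener]:
            s.close()


accepted_count = demo()
```

Accepting several connections per readiness event is what the numberAccepts counter tunes in the real code.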

protocol = self.factory.buildProtocol(self._buildAddr(addr))
transport = self.transport(skt, protocol, addr, self, s, self.reactor)
protocol.makeConnection(transport)

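The factory object (factories will be covered in a later post) calls buildProtocol to create an instance of our custom Protocol class. After the transport is created, the protocol's makeConnection method is called. That method is implemented in the parent class BaseProtocol: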

# /twisted/internet/protocol.py
class BaseProtocol:
    connected = 0
    transport = None

    def makeConnection(self, transport):
        self.connected = 1
        self.transport = transport
        self.connectionMade()

    def connectionMade(self):
        """Called when a connection is made.
        This may be considered the initializer of the protocol, because
        it is called when the connection is completed.  For clients,
        this is called once the connection to the server has been
        established; for servers, this is called after an accept() call
        stops blocking and a socket has been received.  If you need to
        send any greeting or initial message, do it here.
        """

Here makeConnection calls the connectionMade method that we overrode, and from this point the server and the client can exchange data.
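
The hand-off from makeConnection to connectionMade is easy to reproduce without a network. In the sketch below everything is a stand-in written for illustration: BaseProtocolSketch copies only the three lines of makeConnection shown above, and FakeTransport records calls instead of writing to a socket.

```python
class BaseProtocolSketch(object):
    """Hypothetical copy of the relevant part of BaseProtocol."""
    connected = 0
    transport = None

    def makeConnection(self, transport):
        self.connected = 1
        self.transport = transport
        self.connectionMade()

    def connectionMade(self):
        """Hook for subclasses; does nothing by default."""


class FakeTransport(object):
    """Hypothetical transport that records what the protocol does."""

    def __init__(self):
        self.written = []
        self.closed = False

    def write(self, data):
        self.written.append(data)

    def loseConnection(self):
        self.closed = True


class PoetrySketch(BaseProtocolSketch):
    poem = b"roses are red"

    def connectionMade(self):
        # By the time this hook runs, self.transport is already set.
        self.transport.write(self.poem)
        self.transport.loseConnection()


transport = FakeTransport()
protocol = PoetrySketch()
protocol.makeConnection(transport)
```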

When data arrives from the client, the transport's doRead method is called to read it. Connection is a parent class of the transport instance's class, and it implements doRead:

# /twisted/internet/tcp.py
@implementer(interfaces.ITCPTransport, interfaces.ISystemHandle)
class Connection(_TLSConnectionMixin, abstract.FileDescriptor, _SocketCloser,
                 _AbortingMixin):
    def doRead(self):
        try:
            data = self.socket.recv(self.bufferSize)
        except socket.error as se:
            if se.args[0] == EWOULDBLOCK:
                # nothing to read right now, just return
                return
            else:
                # drop the connection
                return main.CONNECTION_LOST

        return self._dataReceived(data)

    def _dataReceived(self, data):
        if not data:
            return main.CONNECTION_DONE
        rval = self.protocol.dataReceived(data)
        if rval is not None:
            offender = self.protocol.dataReceived
            warningFormat = (
                'Returning a value other than None from %(fqpn)s is '
                'deprecated since %(version)s.')
            warningString = deprecate.getDeprecationWarningString(
                offender, versions.Version('Twisted', 11, 0, 0),
                format=warningFormat)
            deprecate.warnAboutFunction(offender, warningString)
        return rval

_dataReceived calls the protocol's dataReceived method, which a protocol overrides to process the data.
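
The three outcomes of a non-blocking recv (would block, data, end of stream) can be observed with a socket pair. In this sketch read_once is a hypothetical helper; unlike Connection.doRead, which returns None or a failure object, it returns a string label so the outcomes are easy to tell apart:

```python
import errno
import socket


def read_once(sock, on_data, buffer_size=65536):
    """Try one recv() on a non-blocking socket and report what happened."""
    try:
        data = sock.recv(buffer_size)
    except socket.error as e:
        if e.errno in (errno.EWOULDBLOCK, errno.EAGAIN):
            return "would_block"   # nothing to read yet
        return "lost"              # any other error drops the connection
    if not data:
        return "done"              # empty read: the peer closed cleanly
    on_data(data)
    return "data"


def demo():
    left, right = socket.socketpair()
    left.setblocking(False)
    received = []
    try:
        outcomes = [read_once(left, received.append)]
        right.send(b"hi")
        outcomes.append(read_once(left, received.append))
        right.close()
        outcomes.append(read_once(left, received.append))
        return outcomes, received
    finally:
        left.close()
        right.close()


outcomes, received = demo()
```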
reactor.run() is implemented by _SignalReactorMixin, a base class of the reactor:

class _SignalReactorMixin(object):
    def startRunning(self, installSignalHandlers=True):
        self._installSignalHandlers = installSignalHandlers
        ReactorBase.startRunning(self)

    def run(self, installSignalHandlers=True):
        self.startRunning(installSignalHandlers=installSignalHandlers)
        self.mainLoop()

    def mainLoop(self):
        while self._started:
            try:
                while self._started:
                    # Advance simulation time in delayed event
                    # processors.
                    self.runUntilCurrent()
                    t2 = self.timeout()
                    t = self.running and t2
                    self.doIteration(t)
            except:
                log.msg("Unexpected error in main loop.")
                log.err()
            else:
                log.msg('Main loop terminated.')

PosixReactorBase, the base class of PollReactor, inherits from both _SignalReactorMixin and ReactorBase (among others). Both implement startRunning, so by the MRO the _SignalReactorMixin version is called first, which is why it ends by explicitly calling ReactorBase.startRunning.
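
The method resolution order can be checked with three small classes. The class names below are placeholders of mine that only imitate the shape of the inheritance; each startRunning records that it ran:

```python
class ReactorBaseSketch(object):
    def startRunning(self):
        self.calls.append("ReactorBase")


class SignalMixinSketch(object):
    def startRunning(self, installSignalHandlers=True):
        self.calls.append("SignalMixin")
        # Explicitly chain to the other base, as _SignalReactorMixin does.
        ReactorBaseSketch.startRunning(self)


class PosixSketch(SignalMixinSketch, ReactorBaseSketch):
    def __init__(self):
        self.calls = []


reactor_sketch = PosixSketch()
reactor_sketch.startRunning()
mro_names = [cls.__name__ for cls in PosixSketch.__mro__]
```

Without the explicit call inside the mixin, the ReactorBase version would never run, because attribute lookup stops at the first class in the MRO that defines the method.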

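Here mainLoop starts the main loop. mainLoop calls the doIteration method described above to monitor a set of descriptors, and as soon as one is ready for reading or writing, the corresponding event handler is called.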
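
The shape of mainLoop (run the due timed calls, compute how long to wait, then block in one iteration) can be sketched in a few lines. MiniLoop is a toy of my own, not Twisted's implementation: its do_iteration just sleeps where a real reactor would call poll(), and the method names are invented.

```python
import heapq
import time


class MiniLoop(object):
    """Hypothetical toy event loop with timed calls only."""

    def __init__(self):
        self._calls = []      # heap of (due_time, sequence, func, args)
        self._sequence = 0
        self._started = False

    def call_later(self, delay, func, *args):
        self._sequence += 1
        due = time.monotonic() + delay
        heapq.heappush(self._calls, (due, self._sequence, func, args))

    def stop(self):
        self._started = False

    def run_until_current(self):
        # Run every call whose due time has passed.
        now = time.monotonic()
        while self._calls and self._calls[0][0] <= now:
            _due, _seq, func, args = heapq.heappop(self._calls)
            func(*args)

    def timeout(self):
        # Seconds until the next timed call, or None if there is none.
        if not self._calls:
            return None
        return max(0.0, self._calls[0][0] - time.monotonic())

    def do_iteration(self, timeout):
        # A real reactor would block in poll(timeout) here.
        time.sleep(0.01 if timeout is None else timeout)

    def run(self):
        self._started = True
        while self._started:
            self.run_until_current()
            if self._started:
                self.do_iteration(self.timeout())


events = []
loop = MiniLoop()
loop.call_later(0.02, events.append, "second")
loop.call_later(0.01, events.append, "first")
loop.call_later(0.03, loop.stop)
loop.run()
```

The timeout passed to the iteration step is what keeps the loop from busy-waiting while still waking up in time for the next timed call.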

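That concludes this brief analysis of a simple example, from creating the event loop to establishing a connection with a client. I have left some details unexplained (I am writing this blog while reading the source myself), so read the source closely if you are interested. This post inevitably contains omissions and mistakes, and corrections from readers are welcome. I am preparing for the postgraduate entrance exam and do not have much time, but I will write at least one article like this each month.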


