RabbitMQ reliability guarantees (original text with some annotations)

Reliability Guide

This page explains how to use the various features of AMQP and RabbitMQ to achieve reliable delivery - to ensure that messages are always delivered, even when failures occur in any part of your system.

What Can Fail?

Network problems are probably the most common class of failure. Not only can networks fail, firewalls can interrupt idle connections, and network failures are not always detected immediately.

In addition to connectivity failures, the broker and client applications can experience hardware failure (or software can crash) at any time. Additionally, even if client applications keep running, logic errors can cause channel or connection errors which force the client to establish a new channel or connection and recover from the problem.

Connection Failures

In the event of a connection failure, the client will need to establish a new connection to the broker. Any channels opened on the previous connection will have been automatically closed and these will need re-opening too.

In general when connections fail, the client will be informed by the connection throwing an exception (or similar language construct). The official Java and .NET clients additionally provide callback methods to let you hear about connection failures in other contexts - Java provides the ShutdownListener callback on both Connection and Channel classes, and the .NET client provides IConnection.ConnectionShutdown and IModel.ModelShutdown events for the same purpose.
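As a rough illustration of re-establishing a connection after a failure, the sketch below retries a connect call and doubles the wait between attempts. The exponential backoff is an added technique, not something the guide prescribes, and `connect_with_retry`, `fake_connect` and the delay values are hypothetical names and numbers chosen for the example. No real client library is involved.

```python
import time


def connect_with_retry(connect, attempts=5, first_delay=0.5, sleep=time.sleep):
    """Call connect() until it succeeds, doubling the wait after each failure."""
    delay = first_delay
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the failure
            sleep(delay)
            delay *= 2


# Demonstration with a fake broker that refuses the first two attempts.
calls = {"count": 0}
waits = []  # records the delays instead of really sleeping


def fake_connect():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("broker unreachable")
    return "connection-%d" % calls["count"]


connection = connect_with_retry(fake_connect, sleep=waits.append)
```

With a real client, the caller would also re-open its channels on the new connection, since the old ones are closed along with the failed connection.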

Acknowledgements and Confirms

When a connection fails, messages may be in transit between client and server - they may be in the middle of being parsed or generated, in OS buffers, or on the wire. Messages in transit will be lost - they will need to be retransmitted. Acknowledgements let the server and clients know when to do this.

Acknowledgements can be used in both directions - to allow a consumer to indicate to the server that it has received / processed a message and to allow the server to indicate the same thing to the producer. RabbitMQ refers to the latter case as a "confirm".

Of course, TCP ensures that packets have been received, and will retransmit until they are - but that's just the network layer. Acknowledgements and confirms indicate that messages have been received and acted upon. An acknowledgement signals both the receipt of a message, and a transfer of ownership where the receiver assumes full responsibility for it.

Acknowledgements therefore have semantics - a consuming application should not acknowledge messages until it has done whatever it needs to do with them - recorded them in a database, forwarded them on, printed them onto paper or anything else. Once it does so, the broker is free to forget about the message.

Similarly, the broker will confirm messages once it has taken responsibility for them (see here for what that means).

Use of acknowledgements guarantees at-least-once delivery. Without acknowledgements, message loss is possible during publish and consume operations and only at-most-once delivery is guaranteed.
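The at-least-once behaviour can be illustrated without a broker. The sketch below is a toy in-memory queue: `ToyQueue` and its methods are hypothetical names, not a RabbitMQ API. It requeues any message that was delivered but never acknowledged when the consumer's connection drops. The consumer acknowledges only after its work is done, so a dropped connection leads to a repeat delivery rather than a lost message.

```python
from collections import deque


class ToyQueue:
    """In-memory stand-in for a broker queue (hypothetical, not a RabbitMQ API)."""

    def __init__(self):
        self._ready = deque()   # (body, redelivered) pairs waiting for delivery
        self._unacked = {}      # delivery tag -> body, delivered but not yet acked
        self._next_tag = 1

    def publish(self, body):
        self._ready.append((body, False))

    def deliver(self):
        """Hand the next message to a consumer: (tag, body, redelivered) or None."""
        if not self._ready:
            return None
        body, redelivered = self._ready.popleft()
        tag = self._next_tag
        self._next_tag += 1
        self._unacked[tag] = body
        return tag, body, redelivered

    def ack(self, tag):
        """The consumer takes responsibility; the queue may forget the message."""
        del self._unacked[tag]

    def connection_lost(self):
        """Requeue everything that was delivered but never acknowledged."""
        for tag in sorted(self._unacked, reverse=True):
            self._ready.appendleft((self._unacked.pop(tag), True))


queue = ToyQueue()
queue.publish("order-1")
queue.publish("order-2")
processed = []

# First delivery: the work completes, then the consumer acks.
tag, body, _ = queue.deliver()
processed.append(body)
queue.ack(tag)

# Second delivery: the connection drops before the ack is sent.
tag, body, _ = queue.deliver()
processed.append(body)
queue.connection_lost()

# The unacknowledged message comes back, so it is processed twice, not lost.
tag, body, redelivered = queue.deliver()
processed.append(body)
queue.ack(tag)
```

Acknowledging before the work is done would reverse the trade-off: a crash between the ack and the processing would lose the message.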

Heartbeats

In some types of network failure, packet loss can mean that disrupted TCP connections take some time to be detected by the operating system. AMQP offers a heartbeat feature to ensure that the application layer promptly finds out about disrupted connections (and also completely unresponsive peers). Heartbeats also defend against certain network equipment which may terminate "idle" TCP connections. In RabbitMQ versions 3.0 and higher, the broker will attempt to negotiate heartbeats by default (although the client can still veto them). Using earlier versions the client must be configured to request heartbeats.
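The idea behind heartbeat-based detection can be sketched in a few lines. This is a simplified model, not client library code: `HeartbeatMonitor` is a hypothetical helper, timestamps are passed in explicitly rather than read from a clock, and the rule of giving up after two silent intervals is stated here as an assumption of the sketch.

```python
class HeartbeatMonitor:
    """Tracks when traffic was last seen from a peer (hypothetical helper)."""

    def __init__(self, interval_seconds, now):
        self.interval = interval_seconds
        self.last_seen = now

    def traffic_received(self, now):
        # Assumption: any inbound frame counts as a sign of life,
        # not only a dedicated heartbeat frame.
        self.last_seen = now

    def peer_is_dead(self, now):
        # Assumption: give up after two full intervals of silence.
        return now - self.last_seen >= 2 * self.interval


monitor = HeartbeatMonitor(interval_seconds=30, now=0)
monitor.traffic_received(now=25)
alive_at_50 = not monitor.peer_is_dead(now=50)   # 25 s of silence
dead_at_90 = monitor.peer_is_dead(now=90)        # 65 s of silence
```

A shorter interval detects a dead peer sooner, at the cost of more idle traffic and a higher chance of false alarms on a slow link.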

At the Broker

In order to avoid losing messages in the broker we need to cope with broker restarts, broker hardware failure and in extremis even broker crashes.

To ensure that messages and broker definitions survive restarts, we need to ensure that they are on disk. The AMQP standard has a concept of durability for exchanges, queues and of persistent messages, requiring that a durable object or persistent message will survive a restart. More details about specific flags pertaining to durability and persistence can be found in the AMQP Concepts Guide.

Clustering and High Availability

If we need to ensure that our broker survives hardware failure, we can use RabbitMQ's clustering. In a RabbitMQ cluster, all definitions (of exchanges, bindings, users, etc) are mirrored across the entire cluster. Queues behave differently, by default residing only on a single node, but optionally being mirrored across several or all nodes. Queues remain visible and reachable from all nodes regardless of where they are located.

Mirrored queues replicate their contents across all configured cluster nodes, tolerating node failures seamlessly and without message loss (although see this note on unsynchronised slaves). However, consuming applications need to be aware that when queues fail their consumers will be cancelled and they will need to reconsume - see the documentation for more details.

At the Producer

When using confirms, producers recovering from a channel or connection failure should retransmit any messages for which an acknowledgement has not been received from the broker. There is a possibility of message duplication here, because the broker might have sent a confirmation that never reached the producer (due to network failures, etc). Therefore consumer applications will need to perform deduplication or handle incoming messages in an idempotent manner.
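One way a producer might keep track of what to retransmit is to remember each message until its confirm arrives. The sketch below is broker-free: `ConfirmTracker` and its method names are hypothetical. The `multiple` argument models a single confirm that covers every message up to a given sequence number.

```python
class ConfirmTracker:
    """Remembers published messages until they are confirmed (hypothetical helper)."""

    def __init__(self):
        self._next_seq = 1
        self._outstanding = {}  # sequence number -> message body

    def published(self, body):
        """Record a message as sent and return its sequence number."""
        seq = self._next_seq
        self._next_seq += 1
        self._outstanding[seq] = body
        return seq

    def confirmed(self, seq, multiple=False):
        """Forget one message, or every message up to and including seq."""
        if multiple:
            for s in [s for s in self._outstanding if s <= seq]:
                del self._outstanding[s]
        else:
            self._outstanding.pop(seq, None)

    def to_retransmit(self):
        """Messages to publish again after a channel or connection failure."""
        return [self._outstanding[s] for s in sorted(self._outstanding)]


tracker = ConfirmTracker()
for body in ["a", "b", "c", "d"]:
    tracker.published(body)

tracker.confirmed(2, multiple=True)   # confirms "a" and "b" in one go

# Suppose the connection fails here: "c" and "d" were never confirmed.
pending = tracker.to_retransmit()
```

The broker may already have accepted "c" or "d" before the failure, so retransmitting them can produce duplicates. The sketch trades possible duplication for no loss, which is why the consumer side has to tolerate repeats.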

Ensuring Messages are Routed

In some circumstances it can be important for producers to ensure that their messages are being routed to queues (although not always - in the case of a pub-sub system producers will just publish and if no consumers are interested it is correct for messages to be dropped).

To ensure messages are routed to a single known queue, the producer can just declare a destination queue and publish directly to it. If messages may be routed in more complex ways but the producer still needs to know if they reached at least one queue, it can set the mandatory flag on a basic.publish, ensuring that a basic.return (containing a reply code and some textual explanation) will be sent back to the client if no queues were appropriately bound.
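The effect of the mandatory flag can be modelled with a toy exchange. `ToyDirectExchange` below is a hypothetical in-memory stand-in, not a RabbitMQ API: it routes by exact routing key and, when nothing is bound and mandatory is set, records a return carrying a reply code and text for the publisher.

```python
class ToyDirectExchange:
    """Toy direct exchange: routing key -> bound queues (hypothetical, not a RabbitMQ API)."""

    def __init__(self):
        self._bindings = {}   # routing key -> list of queues (plain Python lists)
        self.returned = []    # (reply_code, reply_text, body) sent back to the publisher

    def bind(self, routing_key, queue):
        self._bindings.setdefault(routing_key, []).append(queue)

    def publish(self, routing_key, body, mandatory=False):
        queues = self._bindings.get(routing_key, [])
        for queue in queues:
            queue.append(body)
        if not queues and mandatory:
            # 312 / NO_ROUTE is the AMQP 0-9-1 reply code for an unroutable message.
            self.returned.append((312, "NO_ROUTE", body))


orders = []
exchange = ToyDirectExchange()
exchange.bind("orders", orders)

exchange.publish("orders", "m1", mandatory=True)   # routed to the bound queue
exchange.publish("typo", "m2", mandatory=True)     # unroutable: returned to the publisher
exchange.publish("typo", "m3")                     # unroutable and not mandatory: dropped
```

Only "m2" is reported back. Without the flag, the publisher has no way to tell that "m3" went nowhere.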

Producers should also be aware that when publishing to a clustered node, if one or more destination queues that are bound to the exchange have mirrors in the cluster, it's possible to incur delays in the face of network failures between nodes, due to flow control between replicas and the master queue process. See here for more details.

At the Consumer

In the event of network failure (or a node crashing), messages can be duplicated, and consumers must be prepared to handle them. If possible, the simplest way to handle this is to ensure that your consumers handle messages in an idempotent way rather than explicitly deal with deduplication.

If a message is delivered to a consumer and then requeued (because it was not acknowledged before the consumer connection dropped, for example) then RabbitMQ will set the redelivered flag on it when it is delivered again (whether to the same consumer or a different one). This is a hint that a consumer may have seen this message before (although that's not guaranteed, the message may have made it out of the broker but not into a consumer before the connection dropped). Conversely if the redelivered flag is not set then it is guaranteed that the message has not been seen before. Therefore if a consumer finds it more expensive to deduplicate messages or process them in an idempotent manner, it can do this only for messages with the redelivered flag set.
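The optimisation described above, paying for deduplication only when the redelivered flag is set, might look like the sketch below. Everything in it is hypothetical: the `Consumer` class, the in-memory set standing in for a durable store, and the assumption that each message carries a unique `message_id` assigned by the publisher.

```python
class Consumer:
    """Runs the costly duplicate check only for redelivered messages (hypothetical helper)."""

    def __init__(self):
        self.handled_ids = set()   # stands in for a durable store of processed IDs
        self.results = []
        self.dedup_lookups = 0     # counts how often the costly check was needed

    def on_message(self, message_id, body, redelivered):
        if redelivered:
            # Only a redelivered message can have been seen before.
            self.dedup_lookups += 1
            if message_id in self.handled_ids:
                return "skipped"
        self.results.append(body)
        self.handled_ids.add(message_id)
        return "processed"


consumer = Consumer()
outcomes = [
    consumer.on_message("id-1", "a", redelivered=False),
    consumer.on_message("id-2", "b", redelivered=False),
    consumer.on_message("id-2", "b", redelivered=True),   # seen before: skipped
    consumer.on_message("id-3", "c", redelivered=True),   # never reached us: processed
]
```

This covers only duplicates caused by requeueing. A copy retransmitted by a producer arrives as a fresh message without the flag, so a system that also retransmits on the publishing side would need the check on every message.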

Consumer Cancel Notification

Under some circumstances the server needs to be able to cancel a consumer - since the queue it was consuming from has been deleted, or has failed over. In this case the consumer should consume again but be aware that it may see messages again which it has already seen.

Note that consumer cancel notification is a RabbitMQ extension to AMQP, and as such may not be supported by all clients.

Messages That Cannot Be Processed

If a consumer determines that it cannot handle a message then it can reject it using basic.reject (or basic.nack), either asking the server to requeue it, or not (in which case the server might be configured to dead-letter it instead).
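A consumer's choice between requeueing and not requeueing can be sketched as follows. The split into transient and permanent failures is an assumption made for this example, not something the guide specifies. `settle`, both exception classes and the two plain lists standing in for the queue and a dead-letter destination are hypothetical.

```python
class TransientError(Exception):
    """Hypothetical: the failure may go away, e.g. a downstream service is busy."""


class PermanentError(Exception):
    """Hypothetical: the message can never be handled, e.g. a malformed payload."""


def settle(body, handler, queue, dead_letters):
    """Process one message and return how it was settled."""
    try:
        handler(body)
    except TransientError:
        queue.append(body)          # models a reject with requeue requested
        return "requeued"
    except PermanentError:
        # Models a reject without requeue, assuming dead-lettering is configured.
        dead_letters.append(body)
        return "dead-lettered"
    return "acked"


def handler(body):
    if body == "garbage":
        raise PermanentError(body)
    if body == "retry-me":
        raise TransientError(body)


queue, dead_letters = [], []
outcomes = [settle(b, handler, queue, dead_letters) for b in ["ok", "retry-me", "garbage"]]
```

Requeueing a message that can never succeed would make it circulate indefinitely, which is the case rejecting without requeue is meant to avoid.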

Distributed RabbitMQ

Rabbit provides two plugins to assist with distributing nodes over unreliable networks: federation and the shovel. Both are implemented as AMQP clients, so if you configure them to use confirms and acknowledgements, they will retransmit when necessary. Both will use confirms and acknowledgements by default.

When connecting clusters with federation or the shovel, it is desirable to ensure that the federation links and shovels tolerate node failures. Federation will automatically distribute links across the downstream cluster and fail them over on failure of a downstream node. In order to connect to a new upstream when an upstream node fails you can specify multiple redundant URIs for an upstream, or connect via a TCP load balancer.

When using the shovel, it is possible to specify redundant brokers in a source or destination clause; however it is not currently possible to make the shovel itself redundant. We hope to improve this situation in the future; in the mean time a new node can be brought up manually to run a shovel if the node it was originally running on fails.

