Nifi - 處理器內幾個註解解析

When creating a Processor, the developer is able to provide hints to the framework about how to utilize the Processor most effectively. This is done by applying annotations to the Processor’s class. The annotations that can be applied to a Processor exist in three sub-packages of org.apache.nifi.annotations. Those in the documentation sub-package are used to provide documentation to the user. Those in the lifecycle sub-package instruct the framework which methods should be called on the Processor in order to respond to the appropriate life-cycle events. Those in the behavior package help the framework understand how to interact with the Processor in terms of scheduling and general behavior.
在創建Processor時,框架需要開發者告知如何才能高效地使用該處理器。這部分的工作是通過在Processor內部使用註解實現的,能夠運用於Processor的註解位於org.apache.nifi.annotations中的一個子包內。其他的註解是用於實現Processor類的生命週期類事件。behavior包內的這些東西是爲了幫助框架去理解如何在調度和一般行爲方面與處理器交互。
The following annotations from the org.apache.nifi.annotations.behavior package can be used to modify how the framework will handle your Processor:
以下的標註目的是幫助框架了解如何去使用你的Processor
EventDriven: Instructs the framework that the Processor can be scheduled using the Event-Driven scheduling strategy. This strategy is still experimental at this point, but can result in reduced resource utilization on dataflows that do not handle extremely high data rates.
@EventDriven:提示框架這個處理器能夠使用事件驅動的調度策略,這個調度策略暫時還是在試驗階段,但可以減少低的數據傳輸速率的數據流的資源需要。

SideEffectFree: Indicates that the Processor does not have any side effects external to NiFi. As a result, the framework is free to invoke the Processor many times with the same input without causing any unexpected results to occur. This implies idempotent behavior. This can be used by the framework to improve efficiency by performing actions such as transferring a ProcessSession from one Processor to another, such that if a problem occurs many Processors’ actions can be rolled back and performed again.
@SideEffectFree:提示框架這個處理器對Nifi沒有任何的副作用,框架可以多次無壓力地調用與啓動Processor,只要輸入的數據相同,得到的結果就會一定是相同的,而不會導致任何不可預料的影響。這意味着所謂的冪等效應。可以由框架用於執行操作如從一個處理器到另一個傳輸process session以提高效率,這樣如果出現問題,很多處理器的行動可以被回滾並再次執行。

SupportsBatching: This annotation indicates that it is okay for the framework to batch together multiple ProcessSession commits into a single commit. If this annotation is present, the user will be able to choose whether they prefer high throughput or lower latency in the Processor’s Scheduling tab. This annotation should be applied to most Processors, but it comes with a caveat: if the Processor calls ProcessSession.commit, there is no guarantee that the data has been safely stored in NiFi’s Content, FlowFile, and Provenance Repositories. As a result, it is not appropriate for those Processors that receive data from an external source, commit the session, and then delete the remote data or confirm a transaction with a remote resource.
@SupportsBatching:該標註代表框架可以批量處理多個ProcessorSession。如果此註釋存在,用戶將能夠選擇他們高吞吐量或更低的延遲。該標註可以被應用於大多數處理器。
TriggerSerially: When this annotation is present, the framework will not allow the user to schedule more than one concurrent thread to execute the onTrigger method at a time. Instead, the number of thread (“Concurrent Tasks”) will always be set to 1. This does not, however, mean that the Processor does not have to be thread-safe, as the thread that is executing onTrigger may change between invocations.

TriggerWhenAnyDestinationAvailable: By default, NiFi will not schedule a Processor to run if any of its outbound queues is full. This allows back-pressure to be applied all the way a chain of Processors. However, some Processors may need to run even if one of the outbound queues is full. This annotations indicates that the Processor should run if any Relationship is “available.” A Relationship is said to be “available” if none of the connections that use that Relationship is full. For example, the DistributeLoad Processor makes use of this annotation. If the “round robin” scheduling strategy is used, the Processor will not run if any outbound queue is full. However, if the “next available” scheduling strategy is used, the Processor will run if any Relationship at all is available and will route FlowFiles only to those relationships that are available.

TriggerWhenEmpty: The default behavior is to trigger a Processor to run only if its input queue has at least one FlowFile or if the Processor has no input queues (which is typical of a “source” Processor). Applying this annotation will cause the framework to ignore the size of the input queues and trigger the Processor regardless of whether or not there is any data on an input queue. This is useful, for example, if the Processor needs to be triggered to run periodically to time out a network connection.

InputRequirement: By default, all Processors will allow users to create incoming connections for the Processor, but if the user does not create an incoming connection, the Processor is still valid and can be scheduled to run. For Processors that are expected to be used as a “Source Processor,” though, this can be confusing to the user, and the user may attempt to send FlowFiles to that Processor, only for the FlowFiles to queue up without being processed. Conversely, if the Processor expects incoming FlowFiles but does not have an input queue, the Processor will be scheduled to run but will perform no work, as it will receive no FlowFile, and this leads to confusion as well. As a result, we can use the @InputRequirement annotation and provide it a value of INPUT_REQUIRED, INPUT_ALLOWED, or INPUT_FORBIDDEN. This provides information to the framework about when the Processor should be made invalid, or whether or not the user should even be able to draw a Connection to the Processor. For instance, if a Processor is annotated with InputRequirement(Requirement.INPUT_FORBIDDEN), then the user will not even be able to create a Connection with that Processor as the destination.

發表評論
所有評論
還沒有人評論,想成為第一個評論的人麼? 請在上方評論欄輸入並且點擊發布.
相關文章