Learning NiFi: Implementing Your Own Processor

A good memory is no match for written notes. Jot things down when you can; coming back to them later often brings a fresh perspective.

This is a brief walkthrough of how, after studying NiFi and reading its source code, I implemented some very basic custom processors for my own business logic.

It is aimed at beginners; if you already know the NiFi basics, feel free to skip it!

Contents

I. Extracting values from a JSON file

1. JSON input: a GenerateFlowFile processor, configured as follows

2. Drag in the custom processor

3. Use PutFile

II. Merging text content

1. The input document

2. Drag in the custom processor

3. Use PutFile

4. The overall flow

III. Adding a header to file content

1. The input document

2. Drag in the custom processor

3. Use PutFile

4. The overall flow


I. Extracting values from a JSON file

Suppose we need to extract, from a JSON document, the value for a particular key. How do we do that on the NiFi canvas? It takes three steps:

 

1. JSON input: a GenerateFlowFile processor, configured as follows:

Custom text holds the target JSON string, shown below: {"error_code":0,"reason":"ok","result":{"items":[{"regStatus":"存續","estiblishTime":1495555200000,"regCapital":"","pencertileScore":4902,"type":1,"legalPersonName":"溫旭穎","toco":2,"legalPersonId":2051255554,"name":"陝西蜂窩科技股份有限公司","logo":"","alias":"蜂窩科技","id":3053414776,"category":"723","personType":1,"base":"han"},{"regStatus":"註銷","estiblishTime":1473264000000,"regCapital":"","pencertileScore":3860,"type":1,"legalPersonName":"常青","toco":8,"legalPersonId":1911055314,"name":"陝西蜂窩科技股份有限公司","logo":"","alias":"蜂窩科技","id":2958332903,"category":"721","personType":1,"base":"xj"}],"total":18}}

2. Drag in the custom processor

Configure the Json Path property as needed so the processor can reach the data, e.g. $.result.items[*]

The processor's core code:

@Tags({"first-example:fetch value from json string"})
@SideEffectFree
@CapabilityDescription("fetch value from json string.")
public class FirstProcessor extends AbstractProcessor {
   
   private List<PropertyDescriptor> properties;
   
   private Set<Relationship> relationships;
   
   private final String arrayFlag="true";
   
   /**
    * JSON path property.
    */
   public static final PropertyDescriptor JSON_PATH = new PropertyDescriptor.Builder()
         .name("Json Path")
         .required(true)
         .description("json path value,such as:$.test")
         .addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
         .build();
   
   /**
    * Whether the input JSON is an array.
    */
   public static final PropertyDescriptor ARRAY_FLAG = new PropertyDescriptor.Builder()
         .name("Array Flag")
         .required(true)
         .description("mark if the input json is array or not")
         .addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
         .allowableValues("true", "false")
         .defaultValue("false")
         .build();
   
   /**
    * Success relationship.
    */
   public static final Relationship SUCCESS = new Relationship.Builder()
         .name("SUCCESS")
         .description("Success relationship")
         .build();
   
   
   @Override
   public Set<Relationship> getRelationships(){
      return relationships;
   }
   
   @Override
   public List<PropertyDescriptor> getSupportedPropertyDescriptors(){
      return properties;
   }
   
   /**
    * Initialize the property descriptors and relationships.
    * @param context the initialization context
    */
   @Override
   public void init(final ProcessorInitializationContext context){
      List<PropertyDescriptor> properties = new ArrayList<>();
      properties.add(JSON_PATH);
      properties.add(ARRAY_FLAG);
      this.properties = Collections.unmodifiableList(properties);
      
      Set<Relationship> relationships = new HashSet<>();
      relationships.add(SUCCESS);
      this.relationships = Collections.unmodifiableSet(relationships);
   }
   
   @Override
   public void onTrigger(final ProcessContext context, final ProcessSession session) throws ProcessException {
      final AtomicReference<String> value = new AtomicReference<>();
      FlowFile flowfile = session.get();
      //session.get() may return null when no flow file is queued
      if (flowfile == null) {
         return;
      }
      session.read(flowfile, new InputStreamCallback() {
         @Override
         public void process(InputStream in) throws IOException {
            try{
               String json =IOUtils.toString(in, StandardCharsets.UTF_8.name());
               String flag = context.getProperty(ARRAY_FLAG).getValue();
               if (flag.equalsIgnoreCase(arrayFlag)){
                  List<Object> dataList = JsonPath.read(json, context.getProperty(JSON_PATH).getValue());
                  if (ObjectUtils.allNotNull(dataList)){
                     StringBuilder all = new StringBuilder("[");
                     int total = 0;
                     for (Object object : dataList) {
                        LinkedHashMap<String,Object> dataMap = (LinkedHashMap<String, Object>) object;
                        Set<String> keys = dataMap.keySet();
                        int count = 0;
                        StringBuilder builder = new StringBuilder("{");
                        for (String key :keys ) {
                           if (count==keys.size()-1){
                              builder.append("\""+key+"\":\""+dataMap.get(key)+"\"");
                           }else{
                              builder.append("\""+key+"\":\""+dataMap.get(key)+"\",");
                           }
                           count++;
                        }
                        if (total==dataList.size()-1){
                           builder.append("}");
                        }else {
                           builder.append("},");
                        }
                        total++;
                        all.append(builder.toString());
                     }
                     all.append("]");
                     value.set(all.toString());
                  }
               }else {
                  String result = JsonPath.read(json, context.getProperty(JSON_PATH).getValue());
                  value.set(result);
               }
            }catch(Exception ex){
               getLogger().error("failed to read json string.", ex);
            }
         }
      });
      
      //Write the results to an attribute
      String results = value.get();
      if(results != null && !results.isEmpty()){
         String flag = context.getProperty(ARRAY_FLAG).getValue();
         if (flag.equalsIgnoreCase(arrayFlag)){
            Map<String,String> data = new HashMap<>(16);
            data.put(NiFiConstant.MATCH_ATTR, results);
            flowfile = session.putAllAttributes(flowfile, data);
         }else {
            flowfile = session.putAttribute(flowfile, NiFiConstant.MATCH_ATTR, results);
         }
      }
      
      //To write the results back out to the flow file
      flowfile = session.write(flowfile, new OutputStreamCallback() {
         @Override
         public void process(OutputStream out) throws IOException {
            String output = value.get();
            if (output != null) {
               out.write(output.getBytes(StandardCharsets.UTF_8));
            }
         }
      });
      
      session.transfer(flowfile, SUCCESS);
   }

}

You must tell the processor whether the input is a JSON array, because JsonPath parses a JSON object and a JSON array into different receiving types (here, a String versus a List of maps).
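Stripped of the NiFi plumbing, the array branch of onTrigger simply rebuilds a JSON-like string from the LinkedHashMap entries that JsonPath returns. A minimal standalone sketch of that loop (the class and method names here are mine, not part of the processor):

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RebuildSketch {

    /**
     * Rebuild a JSON-like array string from a list of maps, the way
     * FirstProcessor's array branch does.
     */
    static String rebuild(List<Map<String, Object>> dataList) {
        StringBuilder all = new StringBuilder("[");
        int total = 0;
        for (Map<String, Object> dataMap : dataList) {
            StringBuilder builder = new StringBuilder("{");
            int count = 0;
            for (String key : dataMap.keySet()) {
                builder.append("\"").append(key).append("\":\"").append(dataMap.get(key)).append("\"");
                if (count != dataMap.size() - 1) {
                    builder.append(",");
                }
                count++;
            }
            // close this object; add a comma unless it is the last element
            builder.append(total == dataList.size() - 1 ? "}" : "},");
            total++;
            all.append(builder);
        }
        return all.append("]").toString();
    }

    public static void main(String[] args) {
        Map<String, Object> item = new LinkedHashMap<>();
        item.put("name", "蜂窩科技");
        item.put("type", 1);
        List<Map<String, Object>> items = new ArrayList<>();
        items.add(item);
        System.out.println(rebuild(items)); // prints [{"name":"蜂窩科技","type":"1"}]
    }
}
```

Note that every value comes out quoted as a string, a simplification the processor also makes; a real JSON library (Jackson, Gson) would preserve numbers and nesting and handle escaping correctly.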

3. Use PutFile

Specify the directory the file is written to after processing:

4. The overall flow (GenerateFlowFile → FirstProcessor → PutFile):

Wire it up as shown and you will see the output file appear in the target directory.

 

II. Merging text content

Suppose we need to append one piece of text to another. How do we do that on the NiFi canvas? Again, three steps:

1. The input document

A GenerateFlowFile processor is needed, configured as follows:

Custom text holds the text to operate on, e.g. 你是哪個? ("Who are you?")

2. Drag in the custom processor

Configure the input value property as required; its text is appended to the incoming flow file content:

The implementation:

@Tags({"second-example:Combine two sentences!"})
@SeeAlso({})
@SideEffectFree
@CapabilityDescription("merge two content to one together")
@ReadsAttributes({@ReadsAttribute(attribute="", description="")})
@WritesAttributes({@WritesAttribute(attribute="", description="")})
public class SecondProcessor extends AbstractProcessor {
   
   /**
    * Property descriptors.
    */
   private List<PropertyDescriptor> descriptors;
   /**
    * Relationships.
    */
   private Set<Relationship> relationships;
   /**
    * Output file settings.
    */
   private static final String FILE_NAME = "out-";
   private static final String FILE_SUFFIX = ".txt";
   
   public static final PropertyDescriptor INPUT_VALUE = new PropertyDescriptor.Builder()
         .name("INPUT_VALUE")
         .displayName("INPUT VALUE")
         .description("input value for operating")
         .required(true)
         //non-empty validation
         .addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
         .build();
   
   public static final Relationship RELATIONSHIP_SUCCESS = new Relationship.Builder()
         .name("success")
         .description("example relationship success")
         .build();
   
   public static final Relationship RELATIONSHIP_FAILURE = new Relationship.Builder()
         .name("failure")
         .description("example relationship failure")
         .build();
   
   public static final PropertyDescriptor CHARSET = new PropertyDescriptor.Builder()
         .name("character-set")
         .displayName("Character Set")
         .required(true)
         .defaultValue("UTF-8")
         .addValidator(StandardValidators.CHARACTER_SET_VALIDATOR)
         .build();
   
   @Override
   protected void init(final ProcessorInitializationContext context) {
      final List<PropertyDescriptor> descriptors = new ArrayList<PropertyDescriptor>();
      descriptors.add(INPUT_VALUE);
      descriptors.add(CHARSET);
      this.descriptors = Collections.unmodifiableList(descriptors);
      
      final Set<Relationship> relationships = new HashSet<Relationship>();
      relationships.add(RELATIONSHIP_SUCCESS);
      relationships.add(RELATIONSHIP_FAILURE);
      this.relationships = Collections.unmodifiableSet(relationships);
   }
   
   @Override
   public Set<Relationship> getRelationships() {
      return this.relationships;
   }
   
   @Override
   public final List<PropertyDescriptor> getSupportedPropertyDescriptors() {
      return descriptors;
   }
   
   @OnScheduled
   public void onScheduled(final ProcessContext context) {
      getLogger().info("Processor-Name: " + context.getName());
      Map<PropertyDescriptor, String> dataMap = context.getProperties();
      for (Map.Entry<PropertyDescriptor, String> entry : dataMap.entrySet()) {
         getLogger().info("key="+entry.getKey().toString()+",value="+entry.getValue());
      }
   }
   
   @Override
   public void onTrigger(final ProcessContext context, final ProcessSession session) throws ProcessException {
      FlowFile flowFile = session.get();
      if ( flowFile == null ) {
         return;
      }
      final AtomicReference<String> value = new AtomicReference<>();
      session.read(flowFile, new InputStreamCallback() {
         @Override
         public void process(InputStream inputStream) throws IOException {
            try{
               String inputVal = IOUtils.toString(inputStream, StandardCharsets.UTF_8.name());
               //read the configured character set
               final Charset charset = Charset.forName(context.getProperty(CHARSET).getValue());
               getLogger().info("resolved charset: " + charset.name());
               //re-encode the property value from the configured charset to UTF-8
               String current = new String(context.getProperty(INPUT_VALUE).getValue().getBytes(charset), StandardCharsets.UTF_8.name());
               String result = "Result: " + inputVal + current;
               getLogger().info("combined result: " + result);
               value.set(result);
            }catch(Exception ex){
               getLogger().error("failed to read input string!", ex);
            }
         }
      });
      
      String results = value.get();
      if(results != null && !results.isEmpty()){
         flowFile = session.putAttribute(flowFile, NiFiConstant.MATCH_ATTR, results);
      }
      
      //write the result back to the flow file content
      flowFile = session.write(flowFile, new OutputStreamCallback() {
         @Override
         public void process(OutputStream outputStream) throws IOException {
            getLogger().info("writing out: " + value.get());
            byte[] content = value.get().getBytes(StandardCharsets.UTF_8);
            //the flow file's output stream
            outputStream.write(content);
            
            //also write a local copy; try-with-resources ensures the stream is closed
            try (OutputStream local = new FileOutputStream(new File(FILE_NAME + uuid() + FILE_SUFFIX))) {
               local.write(content);
            }
         }
      });
      session.transfer(flowFile, RELATIONSHIP_SUCCESS);
   }
   
   /**
    * Generate a 32-character GUID from the processor identifier.
    * @return the identifier with dashes removed, upper-cased
    */
   public String uuid() {
      return getIdentifier().replace("-", "").toUpperCase();
   }
}
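One detail of onTrigger worth calling out is the `new String(value.getBytes(charset), UTF-8)` round-trip: it is only lossless when the configured character set can actually represent the text. A minimal standalone illustration (the class and method names are mine):

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetSketch {

    /** Re-encode text the way SecondProcessor does with the configured charset. */
    static String reencode(String text, Charset charset) {
        return new String(text.getBytes(charset), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // With UTF-8 the round-trip is the identity...
        System.out.println(reencode("你是哪個?", StandardCharsets.UTF_8)); // prints 你是哪個?
        // ...but a charset that cannot represent the text loses it:
        // ISO-8859-1 maps each Chinese character to '?' on encode.
        System.out.println(reencode("你是哪個?", StandardCharsets.ISO_8859_1).equals("你是哪個?")); // prints false
    }
}
```

In practice this means the Character Set property must match the actual encoding of the configured input value; with UTF-8 everywhere the re-encoding is a no-op.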

3. Use PutFile

Specify the output directory for the processed file:

4. The overall flow (GenerateFlowFile → SecondProcessor → PutFile):

This completes concatenating the two pieces of text and writing out the result.
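As an aside, the uuid() helper in the processor derives the local output file name from the processor's identifier, which NiFi assigns as a UUID. The same 32-character form can be produced from any java.util.UUID; this standalone sketch (names are mine) shows the transformation:

```java
import java.util.UUID;

public class UuidSketch {

    /** Strip the dashes from a UUID and upper-case it, as uuid() does. */
    static String compact(UUID id) {
        return id.toString().replace("-", "").toUpperCase();
    }

    public static void main(String[] args) {
        // Build a file name the way SecondProcessor names its local output file.
        String name = "out-" + compact(UUID.randomUUID()) + ".txt";
        System.out.println(name);
    }
}
```

Because getIdentifier() is constant for a given processor instance, the original code overwrites the same local file on every trigger; using UUID.randomUUID() per invocation would instead keep one file per run.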

III. Adding a header to file content

Suppose we need to prefix the contents of a file with another piece of text. How do we do that on the NiFi canvas? Three steps again:

1. The input document

A GenerateFlowFile processor is needed, configured as follows:

Note the file size and character set settings. Custom text holds the text to operate on, e.g. 你是誰? ("Who are you?")

2. Drag in the custom processor

Configure the absolute path of the file whose content is to be prefixed:

An absolute file path is required; the code is as follows:

@Tags({"third-example:deal with content!"})
@SeeAlso({})
@SideEffectFree
@CapabilityDescription("add prefix to given content.")
@ReadsAttributes({@ReadsAttribute(attribute="", description="")})
@WritesAttributes({@WritesAttribute(attribute="", description="")})
public class ThirdProcessor extends AbstractProcessor {
   
   /**
    * Property descriptors.
    */
   private List<PropertyDescriptor> descriptors;
   /**
    * Relationships.
    */
   private Set<Relationship> relationships;
   /**
    * Output file settings.
    */
   private static final String FILE_NAME = "combine-";
   private static final String FILE_SUFFIX = ".txt";
   
   public static final PropertyDescriptor ABSOLUTE_PATH = new PropertyDescriptor.Builder()
         .name("ABSOLUTE_PATH")
         .displayName("ABSOLUTE PATH")
         .description("input file path for operating")
         .required(true)
         //non-empty validation
         .addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
         .build();
   
   public static final Relationship SHIP_SUCCESS = new Relationship.Builder()
         .name("success")
         .description("example relationship success")
         .build();
   
   public static final Relationship SHIP_FAILURE = new Relationship.Builder()
         .name("failure")
         .description("example relationship failure")
         .build();
   
   public static final PropertyDescriptor CHARSET = new PropertyDescriptor.Builder()
         .name("character-set")
         .displayName("Character Set")
         .required(true)
         .defaultValue("UTF-8")
         .addValidator(StandardValidators.CHARACTER_SET_VALIDATOR)
         .build();
   
   @Override
   protected void init(final ProcessorInitializationContext context) {
      final List<PropertyDescriptor> descriptors = new ArrayList<PropertyDescriptor>();
      descriptors.add(ABSOLUTE_PATH);
      descriptors.add(CHARSET);
      this.descriptors = Collections.unmodifiableList(descriptors);
      
      final Set<Relationship> ships = new HashSet<Relationship>();
      ships.add(SHIP_SUCCESS);
      ships.add(SHIP_FAILURE);
      this.relationships = Collections.unmodifiableSet(ships);
   }
   
   @Override
   public Set<Relationship> getRelationships() {
      return this.relationships;
   }
   
   @Override
   public final List<PropertyDescriptor> getSupportedPropertyDescriptors() {
      return descriptors;
   }
   
   @OnScheduled
   public void onScheduled(final ProcessContext context) {
      getLogger().info("Processor-Name: " + context.getName());
      Map<PropertyDescriptor, String> dataMap = context.getProperties();
      for (Map.Entry<PropertyDescriptor, String> entry : dataMap.entrySet()) {
         getLogger().info("key="+entry.getKey().toString()+",value="+entry.getValue());
      }
   }
   
   @Override
   public void onTrigger(final ProcessContext context, final ProcessSession session) throws ProcessException {
      FlowFile flowFile = session.get();
      if ( flowFile == null ) {
         return;
      }
      final AtomicReference<String> value = new AtomicReference<>();
      //read the configured character set
      final Charset charset = Charset.forName(context.getProperty(CHARSET).getValue());
      session.read(flowFile, new InputStreamCallback() {
         @Override
         public void process(InputStream inputStream) throws IOException {
            try{
               String headerDesc = IOUtils.toString(inputStream, StandardCharsets.UTF_8.name());
               String filePath = context.getProperty(ABSOLUTE_PATH).getValue();
               StringBuilder builder = new StringBuilder();
               //read the target file line by line; try-with-resources closes the reader
               try (BufferedReader reader = new BufferedReader(
                     new InputStreamReader(new FileInputStream(filePath), charset))) {
                  String line;
                  while (null != (line = reader.readLine())) {
                     getLogger().info("file line: " + line);
                     //prefix each line with the header text, re-encoded to UTF-8
                     builder.append(headerDesc + new String(line.getBytes(charset), StandardCharsets.UTF_8.name()) + "\n\t");
                  }
               }
               getLogger().info("combined result: " + builder.toString());
               value.set(builder.toString());
            }catch(Exception ex){
               getLogger().error("failed to read input string!", ex);
            }
         }
      });
      String results = value.get();
      if(results != null && !results.isEmpty()){
         flowFile = session.putAttribute(flowFile, NiFiConstant.MATCH_ATTR, results);
      }
      
      //write the result back to the flow file content
      flowFile = session.write(flowFile, new OutputStreamCallback() {
         @Override
         public void process(OutputStream outputStream) throws IOException {
            getLogger().info("writing out: " + value.get());
            byte[] content = value.get().getBytes(StandardCharsets.UTF_8);
            //the flow file's output stream
            outputStream.write(content);
            
            //also write a local copy; try-with-resources ensures the stream is closed
            try (OutputStream local = new FileOutputStream(new File(FILE_NAME + uuid() + FILE_SUFFIX))) {
               local.write(content);
            }
         }
      });
      session.transfer(flowFile, SHIP_SUCCESS);
   }
   
   /**
    * Generate a 32-character GUID from the processor identifier.
    * @return the identifier with dashes removed, upper-cased
    */
   public String uuid() {
      return getIdentifier().replace("-", "").toUpperCase();
   }
}
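Minus the NiFi and file-system plumbing, the heart of ThirdProcessor is a loop that prefixes each line with the header text and joins the results with "\n\t". A standalone sketch of that loop (the class and method names are mine):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class HeaderSketch {

    /** Prefix each line of the content with the header, as ThirdProcessor does. */
    static String addHeader(String header, String content) {
        StringBuilder builder = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new StringReader(content))) {
            String line;
            while ((line = reader.readLine()) != null) {
                builder.append(header).append(line).append("\n\t");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e); // cannot happen for an in-memory reader
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        System.out.println(addHeader("你是誰? ", "line one\nline two"));
    }
}
```

Note the trailing "\n\t" appended after the last line as well, which matches the processor's output.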

3. Use PutFile

Specify the output directory for the processed file

4. The overall flow (GenerateFlowFile → ThirdProcessor → PutFile)

Run the processors in order; if nothing errors out, you will see the results.

This article is only a brief look at how to implement your own business logic on top of the NiFi framework. In the next post I'll try something more complex and see how it goes.

If anything is unclear, search WeChat for the official account codingba and I'll answer your questions one by one.
