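A good memory is no match for a worn-out pen: if you can write something down, write it down. Rereading your notes later often gives you a fresh perspective.
This post briefly describes how, after studying NiFi and reading its source code, I used NiFi's extension mechanism to build a few very basic custom processors for my own business logic.
It is aimed at beginners; if you already know the NiFi basics, feel free to skip it.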
I. Extracting a value from the text of a JSON document
Suppose we need to extract the value of a particular key from a JSON string. How do we do that on top of NiFi? It takes three steps:
1. On the input side of the JSON document, we need a GenerateFlowFile processor, configured as follows:
Custom Text holds the target JSON string. The JSON is as follows: {"error_code":0,"reason":"ok","result":{"items":[{"regStatus":"存續","estiblishTime":1495555200000,"regCapital":"","pencertileScore":4902,"type":1,"legalPersonName":"溫旭穎","toco":2,"legalPersonId":2051255554,"name":"陝西蜂窩科技股份有限公司","logo":"","alias":"蜂窩科技","id":3053414776,"category":"723","personType":1,"base":"han"},{"regStatus":"註銷","estiblishTime":1473264000000,"regCapital":"","pencertileScore":3860,"type":1,"legalPersonName":"常青","toco":8,"legalPersonId":1911055314,"name":"陝西蜂窩科技股份有限公司","logo":"","alias":"蜂窩科技","id":2958332903,"category":"721","personType":1,"base":"xj"}],"total":18}}
2. Drag in the custom processor
Set the Json Path property to point at the data you want to extract, for example: $.result.items[*]
The main code of the custom processor is:
    import java.nio.charset.StandardCharsets;
    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;
    import java.util.concurrent.atomic.AtomicReference;

    import com.jayway.jsonpath.JsonPath;
    import org.apache.commons.io.IOUtils;
    import org.apache.nifi.annotation.behavior.SideEffectFree;
    import org.apache.nifi.annotation.documentation.CapabilityDescription;
    import org.apache.nifi.annotation.documentation.Tags;
    import org.apache.nifi.components.PropertyDescriptor;
    import org.apache.nifi.flowfile.FlowFile;
    import org.apache.nifi.processor.AbstractProcessor;
    import org.apache.nifi.processor.ProcessContext;
    import org.apache.nifi.processor.ProcessSession;
    import org.apache.nifi.processor.ProcessorInitializationContext;
    import org.apache.nifi.processor.Relationship;
    import org.apache.nifi.processor.exception.ProcessException;
    import org.apache.nifi.processor.util.StandardValidators;
    // NiFiConstant is the author's own constants class; its package is not shown in this post.

    @Tags({"first-example:fetch value from json string"})
    @SideEffectFree
    @CapabilityDescription("fetch value from json string.")
    public class FirstProcessor extends AbstractProcessor {

        private static final String ARRAY_FLAG_TRUE = "true";

        private List<PropertyDescriptor> properties;
        private Set<Relationship> relationships;

        /** JSON path to evaluate. */
        public static final PropertyDescriptor JSON_PATH = new PropertyDescriptor.Builder()
                .name("Json Path")
                .required(true)
                .description("json path value,such as:$.test")
                .addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
                .build();

        /** Whether the JSON path result is an array. */
        public static final PropertyDescriptor ARRAY_FLAG = new PropertyDescriptor.Builder()
                .name("Array Flag")
                .required(true)
                .description("mark if the input json is array or not")
                .addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
                .allowableValues("true", "false")
                .defaultValue("false")
                .build();

        /** Success relationship. */
        public static final Relationship SUCCESS = new Relationship.Builder()
                .name("SUCCESS")
                .description("Success relationship")
                .build();

        @Override
        public Set<Relationship> getRelationships() {
            return relationships;
        }

        @Override
        public List<PropertyDescriptor> getSupportedPropertyDescriptors() {
            return properties;
        }

        /** Registers the supported properties and relationships. */
        @Override
        public void init(final ProcessorInitializationContext context) {
            List<PropertyDescriptor> properties = new ArrayList<>();
            properties.add(JSON_PATH);
            properties.add(ARRAY_FLAG);
            this.properties = Collections.unmodifiableList(properties);

            Set<Relationship> relationships = new HashSet<>();
            relationships.add(SUCCESS);
            this.relationships = Collections.unmodifiableSet(relationships);
        }

        @Override
        public void onTrigger(final ProcessContext context, final ProcessSession session) throws ProcessException {
            FlowFile flowFile = session.get();
            if (flowFile == null) {
                return; // nothing queued
            }
            final String jsonPath = context.getProperty(JSON_PATH).getValue();
            final boolean isArray = ARRAY_FLAG_TRUE.equalsIgnoreCase(context.getProperty(ARRAY_FLAG).getValue());
            final AtomicReference<String> value = new AtomicReference<>();

            session.read(flowFile, in -> {
                try {
                    String json = IOUtils.toString(in, StandardCharsets.UTF_8);
                    if (isArray) {
                        List<Object> dataList = JsonPath.read(json, jsonPath);
                        if (dataList != null) {
                            value.set(toJsonArray(dataList));
                        }
                    } else {
                        // The result may be a string, number or boolean, so do not cast it to String.
                        Object result = JsonPath.read(json, jsonPath);
                        value.set(String.valueOf(result));
                    }
                } catch (Exception ex) {
                    getLogger().error("failed to read json string.", ex);
                }
            });

            final String results = value.get();
            if (results == null) {
                // Extraction failed: roll back instead of hitting a NullPointerException below.
                throw new ProcessException("no value extracted for Json Path " + jsonPath);
            }
            // Write the result to an attribute
            if (!results.isEmpty()) {
                flowFile = session.putAttribute(flowFile, NiFiConstant.MATCH_ATTR, results);
            }
            // Write the result back out to the FlowFile content
            flowFile = session.write(flowFile, out -> out.write(results.getBytes(StandardCharsets.UTF_8)));
            session.transfer(flowFile, SUCCESS);
        }

        /** Renders each map as a flat JSON object; every value is written as a quoted string, without escaping. */
        @SuppressWarnings("unchecked")
        private static String toJsonArray(List<Object> dataList) {
            StringBuilder all = new StringBuilder("[");
            for (int i = 0; i < dataList.size(); i++) {
                Map<String, Object> dataMap = (Map<String, Object>) dataList.get(i);
                StringBuilder builder = new StringBuilder("{");
                int count = 0;
                for (Map.Entry<String, Object> entry : dataMap.entrySet()) {
                    if (count++ > 0) {
                        builder.append(',');
                    }
                    builder.append('"').append(entry.getKey()).append("\":\"").append(entry.getValue()).append('"');
                }
                builder.append(i == dataList.size() - 1 ? "}" : "},");
                all.append(builder);
            }
            return all.append("]").toString();
        }
    }
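You must indicate whether the result is a JSON array, because a JSON object and a JSON array are parsed differently and are received into different object types.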
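To make the object-versus-array distinction concrete, here is a minimal sketch that uses only the JDK. Instead of relying on a flag, it inspects the runtime type of an already-evaluated path result: a List is rendered as a JSON array, anything else as its plain string value. The class name JsonRender and its methods are hypothetical and are not part of the processor or of NiFi. Unlike the hand-built string in the processor, it escapes quotes inside values; nested maps and lists inside an item are not handled.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class JsonRender {

    /** Escapes the characters that would break a JSON string literal. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    /** Numbers, booleans and null are written bare; everything else as a quoted string. */
    static String scalar(Object v) {
        if (v == null || v instanceof Number || v instanceof Boolean) {
            return String.valueOf(v);
        }
        return "\"" + escape(v.toString()) + "\"";
    }

    /** A List becomes a JSON array of flat objects; any other result is returned as plain text. */
    static String render(Object result) {
        if (!(result instanceof List)) {
            return String.valueOf(result);
        }
        StringBuilder all = new StringBuilder("[");
        boolean firstItem = true;
        for (Object item : (List<?>) result) {
            if (!firstItem) {
                all.append(',');
            }
            firstItem = false;
            if (item instanceof Map) {
                all.append('{');
                boolean firstKey = true;
                for (Map.Entry<?, ?> e : ((Map<?, ?>) item).entrySet()) {
                    if (!firstKey) {
                        all.append(',');
                    }
                    firstKey = false;
                    all.append(scalar(String.valueOf(e.getKey())))
                       .append(':')
                       .append(scalar(e.getValue()));
                }
                all.append('}');
            } else {
                all.append(scalar(item));
            }
        }
        return all.append(']').toString();
    }

    public static void main(String[] args) {
        Map<String, Object> item = new LinkedHashMap<>();
        item.put("alias", "蜂窩科技");
        item.put("type", 1);
        System.out.println(render(List.of(item))); // array result
        System.out.println(render("ok"));          // single value result
    }
}
```

Whether to keep an explicit flag or infer the shape from the result is a design choice; the flag makes the expected shape visible in the processor configuration.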
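3. Use PutFile
Specify the output directory for the file once processing is complete:
4. The overall flow is shown below:
Deploy the code, wire up the flow as shown in the diagram, and the corresponding file will appear in the output directory.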
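II. Concatenating text content
Suppose we need to append one piece of text to another. How do we do that on top of NiFi? It takes three steps:
1. The input side
A GenerateFlowFile processor is needed, configured as follows:
Custom Text holds the text to operate on, as shown here: 你是哪個 ? ("Which one are you?")
2. Drag in the custom processor
Set the INPUT VALUE property as required; it is the text that gets appended to the text from step 1:
The implementation is as follows: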
    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.OutputStream;
    import java.nio.charset.Charset;
    import java.nio.charset.StandardCharsets;
    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;
    import java.util.concurrent.atomic.AtomicReference;

    import org.apache.commons.io.IOUtils;
    import org.apache.nifi.annotation.documentation.CapabilityDescription;
    import org.apache.nifi.annotation.documentation.Tags;
    import org.apache.nifi.annotation.lifecycle.OnScheduled;
    import org.apache.nifi.components.PropertyDescriptor;
    import org.apache.nifi.flowfile.FlowFile;
    import org.apache.nifi.processor.AbstractProcessor;
    import org.apache.nifi.processor.ProcessContext;
    import org.apache.nifi.processor.ProcessSession;
    import org.apache.nifi.processor.ProcessorInitializationContext;
    import org.apache.nifi.processor.Relationship;
    import org.apache.nifi.processor.exception.ProcessException;
    import org.apache.nifi.processor.util.StandardValidators;
    // NiFiConstant is the author's own constants class; its package is not shown in this post.

    // Not marked @SideEffectFree: the processor also writes a local copy of the result.
    @Tags({"second-example:Combine two sentences!"})
    @CapabilityDescription("merge two content to one together")
    public class SecondProcessor extends AbstractProcessor {

        /** Supported property descriptors. */
        private List<PropertyDescriptor> descriptors;

        /** Supported relationships. */
        private Set<Relationship> relationships;

        /** Name parts of the local output file. */
        private static final String FILE_NAME = "out-";
        private static final String FILE_SUFFIX = ".txt";

        public static final PropertyDescriptor INPUT_VALUE = new PropertyDescriptor.Builder()
                .name("INPUT_VALUE")
                .displayName("INPUT VALUE")
                .description("input value for operating")
                .required(true)
                // must not be empty
                .addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
                .build();

        public static final Relationship RELATIONSHIP_SUCCESS = new Relationship.Builder()
                .name("success")
                .description("example relationship success")
                .build();

        public static final Relationship RELATIONSHIP_FAILURE = new Relationship.Builder()
                .name("failure")
                .description("example relationship failure")
                .build();

        public static final PropertyDescriptor CHARSET = new PropertyDescriptor.Builder()
                .name("character-set")
                .displayName("Character Set")
                .required(true)
                .defaultValue("UTF-8")
                .addValidator(StandardValidators.CHARACTER_SET_VALIDATOR)
                .build();

        @Override
        protected void init(final ProcessorInitializationContext context) {
            final List<PropertyDescriptor> descriptors = new ArrayList<>();
            descriptors.add(INPUT_VALUE);
            descriptors.add(CHARSET);
            this.descriptors = Collections.unmodifiableList(descriptors);

            final Set<Relationship> relationships = new HashSet<>();
            relationships.add(RELATIONSHIP_SUCCESS);
            relationships.add(RELATIONSHIP_FAILURE);
            this.relationships = Collections.unmodifiableSet(relationships);
        }

        @Override
        public Set<Relationship> getRelationships() {
            return this.relationships;
        }

        @Override
        public final List<PropertyDescriptor> getSupportedPropertyDescriptors() {
            return descriptors;
        }

        /** Logs the processor name and its configured properties when it is scheduled. */
        @OnScheduled
        public void onScheduled(final ProcessContext context) {
            getLogger().info("Processor-Name" + context.getName());
            for (Map.Entry<PropertyDescriptor, String> entry : context.getProperties().entrySet()) {
                getLogger().info("key=" + entry.getKey() + ",value=" + entry.getValue());
            }
        }

        @Override
        public void onTrigger(final ProcessContext context, final ProcessSession session) throws ProcessException {
            FlowFile flowFile = session.get();
            if (flowFile == null) {
                return;
            }
            // Charset used to decode the incoming FlowFile content
            final Charset charset = Charset.forName(context.getProperty(CHARSET).getValue());
            final String current = context.getProperty(INPUT_VALUE).getValue();
            final AtomicReference<String> value = new AtomicReference<>();

            session.read(flowFile, in -> {
                try {
                    String inputVal = IOUtils.toString(in, charset);
                    getLogger().info("character set in use: " + charset.name());
                    // The property value is already a Java String, so it needs no re-encoding.
                    String result = "Result: " + inputVal + current;
                    getLogger().info("processing result: " + result);
                    value.set(result);
                } catch (Exception ex) {
                    getLogger().error("failed to read input string!", ex);
                }
            });

            final String results = value.get();
            if (results == null) {
                // Reading failed: route to failure instead of writing a null result.
                session.transfer(flowFile, RELATIONSHIP_FAILURE);
                return;
            }
            flowFile = session.putAttribute(flowFile, NiFiConstant.MATCH_ATTR, results);

            // Write the result out as UTF-8.
            flowFile = session.write(flowFile, out -> {
                getLogger().info("message written: " + results);
                byte[] content = results.getBytes(StandardCharsets.UTF_8);
                // FlowFile content
                out.write(content);
                // Local copy; the path is relative to NiFi's working directory.
                try (OutputStream local = new FileOutputStream(new File(FILE_NAME + uuid() + FILE_SUFFIX))) {
                    local.write(content);
                }
            });
            session.transfer(flowFile, RELATIONSHIP_SUCCESS);
        }

        /**
         * Returns this processor's identifier without hyphens, in upper case.
         * The value is the same for every FlowFile the processor handles.
         */
        public String uuid() {
            return getIdentifier().replace("-", "").toUpperCase();
        }
    }
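The uuid() helper above is built from getIdentifier(), which is the processor's own identifier, so every FlowFile handled by the same processor instance maps to the same local file name. If a distinct name per file is wanted, one option, sketched here under that assumption, is java.util.UUID from the JDK. The class FileNames and the method unique are hypothetical names used only for illustration.

```java
import java.util.UUID;

public class FileNames {

    /** Builds prefix + 32 upper-case hex characters + suffix, different on every call. */
    static String unique(String prefix, String suffix) {
        String id = UUID.randomUUID().toString().replace("-", "").toUpperCase();
        return prefix + id + suffix;
    }

    public static void main(String[] args) {
        // Two calls give two different names with the same prefix and suffix.
        System.out.println(unique("out-", ".txt"));
        System.out.println(unique("out-", ".txt"));
    }
}
```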
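3. Use PutFile
Specify the output directory for the file once processing is complete:
4. The overall flow is shown below:
With that, the two pieces of text are concatenated and written out.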
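The core of the concatenation can be tried outside NiFi. The sketch below, using only the JDK, decodes the incoming bytes with a given charset, appends the configured text, and encodes the result with the same charset, so no round trip through a second encoding is needed. ConcatSketch, concat and the "Result: " prefix are assumptions made for illustration, not part of the processor.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ConcatSketch {

    /** Decodes content, appends suffix after a fixed prefix, and encodes the result again. */
    static byte[] concat(byte[] content, String suffix, Charset charset) {
        String text = new String(content, charset);   // bytes -> text, using the stated charset
        String result = "Result: " + text + suffix;    // plain string concatenation
        return result.getBytes(charset);               // text -> bytes, same charset
    }

    public static void main(String[] args) {
        Charset charset = StandardCharsets.UTF_8;
        byte[] flowFileContent = "你是哪個 ?".getBytes(charset);
        byte[] out = concat(flowFileContent, " hello", charset);
        System.out.println(new String(out, charset));
    }
}
```

A Java String has no encoding of its own; the charset only matters at the two points where bytes are turned into text and back.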
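III. Adding header information to the contents of a file
Suppose we need to combine the contents of a file with another piece of text. How do we do that on top of NiFi? It takes three steps:
1. The input side
A GenerateFlowFile processor is needed, configured as follows:
Pay attention to the File Size and Character Set settings. Custom Text holds the text to operate on, as shown here: 你是誰? ("Who are you?")
2. Drag in the custom processor
Set the absolute path of the file as required, so that the text from step 1 can be combined with the file's contents.
An absolute file path is required. The code is as follows: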
    import java.io.BufferedReader;
    import java.io.File;
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.InputStreamReader;
    import java.io.OutputStream;
    import java.nio.charset.Charset;
    import java.nio.charset.StandardCharsets;
    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;
    import java.util.concurrent.atomic.AtomicReference;

    import org.apache.commons.io.IOUtils;
    import org.apache.nifi.annotation.documentation.CapabilityDescription;
    import org.apache.nifi.annotation.documentation.Tags;
    import org.apache.nifi.annotation.lifecycle.OnScheduled;
    import org.apache.nifi.components.PropertyDescriptor;
    import org.apache.nifi.flowfile.FlowFile;
    import org.apache.nifi.processor.AbstractProcessor;
    import org.apache.nifi.processor.ProcessContext;
    import org.apache.nifi.processor.ProcessSession;
    import org.apache.nifi.processor.ProcessorInitializationContext;
    import org.apache.nifi.processor.Relationship;
    import org.apache.nifi.processor.exception.ProcessException;
    import org.apache.nifi.processor.util.StandardValidators;
    // NiFiConstant is the author's own constants class; its package is not shown in this post.

    // Not marked @SideEffectFree: the processor reads a local file and writes a local copy of the result.
    @Tags({"third-example:deal with content!"})
    @CapabilityDescription("add prefix to given content.")
    public class ThirdProcessor extends AbstractProcessor {

        /** Supported property descriptors. */
        private List<PropertyDescriptor> descriptors;

        /** Supported relationships. */
        private Set<Relationship> relationships;

        /** Name parts of the local output file. */
        private static final String FILE_NAME = "combine-";
        private static final String FILE_SUFFIX = ".txt";

        public static final PropertyDescriptor ABSOLUTE_PATH = new PropertyDescriptor.Builder()
                .name("ABSOLUTE_PATH")
                .displayName("ABSOLUTE PATH")
                .description("input file path for operating")
                .required(true)
                // must not be empty
                .addValidator(StandardValidators.NON_EMPTY_VALIDATOR)
                .build();

        public static final Relationship SHIP_SUCCESS = new Relationship.Builder()
                .name("success")
                .description("example relationship success")
                .build();

        public static final Relationship SHIP_FAILURE = new Relationship.Builder()
                .name("failure")
                .description("example relationship failure")
                .build();

        public static final PropertyDescriptor CHARSET = new PropertyDescriptor.Builder()
                .name("character-set")
                .displayName("Character Set")
                .required(true)
                .defaultValue("UTF-8")
                .addValidator(StandardValidators.CHARACTER_SET_VALIDATOR)
                .build();

        @Override
        protected void init(final ProcessorInitializationContext context) {
            final List<PropertyDescriptor> descriptors = new ArrayList<>();
            descriptors.add(ABSOLUTE_PATH);
            descriptors.add(CHARSET);
            this.descriptors = Collections.unmodifiableList(descriptors);

            final Set<Relationship> ships = new HashSet<>();
            ships.add(SHIP_SUCCESS);
            ships.add(SHIP_FAILURE);
            this.relationships = Collections.unmodifiableSet(ships);
        }

        @Override
        public Set<Relationship> getRelationships() {
            return this.relationships;
        }

        @Override
        public final List<PropertyDescriptor> getSupportedPropertyDescriptors() {
            return descriptors;
        }

        /** Logs the processor name and its configured properties when it is scheduled. */
        @OnScheduled
        public void onScheduled(final ProcessContext context) {
            getLogger().info("Processor-Name" + context.getName());
            for (Map.Entry<PropertyDescriptor, String> entry : context.getProperties().entrySet()) {
                getLogger().info("key=" + entry.getKey() + ",value=" + entry.getValue());
            }
        }

        @Override
        public void onTrigger(final ProcessContext context, final ProcessSession session) throws ProcessException {
            FlowFile flowFile = session.get();
            if (flowFile == null) {
                return;
            }
            final AtomicReference<String> value = new AtomicReference<>();
            // Charset used to decode the local file
            final Charset charset = Charset.forName(context.getProperty(CHARSET).getValue());

            session.read(flowFile, in -> {
                try {
                    // The FlowFile content is the header text.
                    String headerDesc = IOUtils.toString(in, StandardCharsets.UTF_8);
                    String filePath = context.getProperty(ABSOLUTE_PATH).getValue();
                    StringBuilder builder = new StringBuilder();
                    try (BufferedReader reader = new BufferedReader(
                            new InputStreamReader(new FileInputStream(filePath), charset))) {
                        String line;
                        while ((line = reader.readLine()) != null) {
                            getLogger().info("file line: " + line);
                            // The reader has already decoded the line, so no re-encoding is needed.
                            builder.append(headerDesc).append(line).append("\n\t");
                        }
                    }
                    getLogger().info("processing result: " + builder);
                    value.set(builder.toString());
                } catch (Exception ex) {
                    getLogger().error("failed to read input string!", ex);
                }
            });

            final String results = value.get();
            if (results == null) {
                // Reading failed: route to failure instead of writing a null result.
                session.transfer(flowFile, SHIP_FAILURE);
                return;
            }
            if (!results.isEmpty()) {
                flowFile = session.putAttribute(flowFile, NiFiConstant.MATCH_ATTR, results);
            }

            // Write the result out as UTF-8.
            flowFile = session.write(flowFile, out -> {
                getLogger().info("message written: " + results);
                byte[] content = results.getBytes(StandardCharsets.UTF_8);
                // FlowFile content
                out.write(content);
                // Local copy; the path is relative to NiFi's working directory.
                try (OutputStream local = new FileOutputStream(new File(FILE_NAME + uuid() + FILE_SUFFIX))) {
                    local.write(content);
                }
            });
            session.transfer(flowFile, SHIP_SUCCESS);
        }

        /**
         * Returns this processor's identifier without hyphens, in upper case.
         * The value is the same for every FlowFile the processor handles.
         */
        public String uuid() {
            return getIdentifier().replace("-", "").toUpperCase();
        }
    }
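3. Use PutFile
Specify the output directory for the file once processing is complete:
4. The overall flow is shown below:
Run the steps one by one and, provided nothing fails, you will see the result.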
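The file handling in this section can also be tried on its own. The following sketch, using only the JDK, reads a file line by line with an explicit charset, puts the header text in front of each line, and closes the reader through try-with-resources. HeaderSketch and prefixLines are hypothetical names, and ending each line with a plain "\n" is an assumption of this sketch rather than the processor's exact output.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class HeaderSketch {

    /** Returns the file's lines, each one prefixed with header and ended with a newline. */
    static String prefixLines(Path file, String header, Charset charset) throws IOException {
        StringBuilder out = new StringBuilder();
        // try-with-resources closes the reader even if reading fails part way through
        try (BufferedReader reader = Files.newBufferedReader(file, charset)) {
            String line;
            while ((line = reader.readLine()) != null) {
                out.append(header).append(line).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("header-sketch", ".txt");
        try {
            Files.write(tmp, "line one\nline two\n".getBytes(StandardCharsets.UTF_8));
            System.out.print(prefixLines(tmp, "Who are you? ", StandardCharsets.UTF_8));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```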
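This post is only a brief description of how to implement your own business logic on top of the NiFi framework. In the next one I will look at more complex usage.
If anything is unclear, search for the WeChat official account codingba and I will answer your questions one by one.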