When we use YAML to store a Map<String,String> and serialize it with the dump method, any value in the map that contains a non-printable character, such as the Unicode characters \u0002 or \b, comes out looking like this:
!!binary "5oKj6ICF77yM55S377yMMjTlsoHjgILlm6DovabnpbjkvKTlhaXpmaLvvIzmn6XkvZPlj5HnjrDlt6bkvqfpoqfpqqjpoqflvJPpqqjmipjvvIznnLblkajmt6TmlpHvvIzlvKDlj6Plj5fpmZDvvIzkuIrllIfpurvmnKjjgILooYxDVOS4iee7tOmcnue7hCgw77yONzUgbW3oloTlsYLmiavmj4/vvIxNUFLlhqDnirbpnaLpnJ7nu4Qp77yM5Y+R546w5bem5L6n5LiK6aKM56qm5YmN44CB5aSW5L6n5aOB44CBSGFsAmxlcuawlOaIv+S4iuWjgeWSjOWkluS+p+WjgeOAgeectuS4i+elnue7j+euoeWkluS+p+WjgeOAgeW3puS+p+ectuWkluWjgeOAgemip+W8k+Wkmgrlj5HpqqjotKjkuI3ov57nu63lj4rpqqjnoo7niYfvvIzlt6bkvqfkuIrpooznqqbjgIHlt6bnnLblpJbkvqflo4Hnoo7pqqjniYflkJHnqqblhoXjgIHnnLblhoXlh7npmbfvvIzlt6bot5/lpJbnm7TogoznlaXlj5fljovjgILpvLvpqqjlh7npmbfjgILlj4zkvqfnnLznkIPjgIHop4bnpZ7nu4/mnKrop4HmmI7mmL7lvILluLjjgILlvbHlg4/ljbDosaHvvJrlt6bkvqfpnaLpg6jlpJrlj5HpqqjmipjjgILlkI7ooYzmiYvmnK/mlbTlpI3jgII="
On inspection this turned out to be a Base64-encoded String (encoded, not encrypted). That was rather annoying, so I had to read the code:
DumperOptions options = new DumperOptions();
options.setDefaultFlowStyle(FlowStyle.FLOW);
Constructor constructor = new Constructor(Map.class);
Yaml yaml = new Yaml(constructor, new Representer(), options);
yaml.setBeanAccess(BeanAccess.FIELD);
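That is the initialization code, and there is nothing wrong with it. The problem lies in Representer, which extends SafeRepresenter; the string handling there looks like this: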
protected class RepresentString implements Represent {
    public Node representData(Object data) {
        Tag tag = Tag.STR;
        Character style = null;
        String value = data.toString();
        if (StreamReader.NON_PRINTABLE.matcher(value).find()) {
            tag = Tag.BINARY;
            char[] binary;
            try {
                binary = Base64Coder.encode(value.getBytes("UTF-8"));
            } catch (UnsupportedEncodingException e) {
                throw new YAMLException(e);
            }
            value = String.valueOf(binary);
            style = '|';
        }
        // if no other scalar style is explicitly set, use literal style for
        // multiline scalars
        if (defaultScalarStyle == null && MULTILINE_PATTERN.matcher(value).find()) {
            style = '|';
        }
        return representScalar(tag, value, style);
    }
}
where NON_PRINTABLE is defined in StreamReader as:
public final static Pattern NON_PRINTABLE = Pattern
.compile("[^\t\n\r\u0020-\u007E\u0085\u00A0-\uD7FF\uE000-\uFFFD]");
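To see which characters in a given value would trigger the binary path, the same pattern can be run outside of SnakeYAML. The following is only a sketch using the JDK: the class name NonPrintableFinder and the method findOffenders are hypothetical helpers made up for illustration, and the pattern is copied from the definition above rather than read from the library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical diagnostic helper, not part of SnakeYAML.
final class NonPrintableFinder {
    // Copied from the StreamReader.NON_PRINTABLE definition quoted above.
    static final Pattern NON_PRINTABLE =
            Pattern.compile("[^\t\n\r\u0020-\u007E\u0085\u00A0-\uD7FF\uE000-\uFFFD]");

    // Returns one "index:U+XXXX" entry for every character the pattern rejects.
    static List<String> findOffenders(String value) {
        List<String> offenders = new ArrayList<>();
        Matcher m = NON_PRINTABLE.matcher(value);
        while (m.find()) {
            int codePoint = value.codePointAt(m.start());
            offenders.add(m.start() + ":" + String.format("U+%04X", codePoint));
        }
        return offenders;
    }

    public static void main(String[] args) {
        // A value with an embedded control character (code 2), built without escapes.
        String value = "Hal" + (char) 2 + "ler";
        System.out.println(findOffenders(value));
    }
}
```

An empty list means the value should pass the check and be dumped as an ordinary string.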
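Now it is clear: during dump, every string is checked for non-printable characters, and if any are found the value is emitted as Base64 instead. Knowing the cause, there are several ways to handle it. One is to strip the non-printable characters before dumping. Another is to Base64-encode the value yourself first, and then, when deserializing back into a Java object, handle it specially and decode the Base64 again.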
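Both options can be sketched with the JDK alone. This is a minimal illustration under stated assumptions, not SnakeYAML's own API: the class YamlStringSanitizer and its methods are hypothetical names, the pattern is copied from the definition quoted earlier, and java.util.Base64 stands in for the library's Base64Coder.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.regex.Pattern;

// Hypothetical helper, not part of SnakeYAML.
final class YamlStringSanitizer {
    // Copied from the StreamReader.NON_PRINTABLE definition quoted earlier.
    static final Pattern NON_PRINTABLE =
            Pattern.compile("[^\t\n\r\u0020-\u007E\u0085\u00A0-\uD7FF\uE000-\uFFFD]");

    // Option 1: drop every character the pattern rejects before calling dump.
    static String stripNonPrintable(String value) {
        return NON_PRINTABLE.matcher(value).replaceAll("");
    }

    // Option 2: turn a Base64 payload back into the original UTF-8 text.
    // The MIME decoder is used so that line breaks inside the payload are ignored.
    static String decodeBinary(String base64) {
        byte[] raw = Base64.getMimeDecoder().decode(base64);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String dirty = "Hal" + (char) 2 + "ler";
        System.out.println(stripNonPrintable(dirty));

        String encoded = Base64.getEncoder()
                .encodeToString(dirty.getBytes(StandardCharsets.UTF_8));
        System.out.println(decodeBinary(encoded).equals(dirty));
    }
}
```

Stripping loses information but keeps the YAML human-readable. Decoding preserves the original value exactly, at the cost of an extra step on the reading side.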