Integrating the sparkThrift service with CDH 6.2. Reference: https://blog.csdn.net/qq_34864753/article/details/102729859
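For security reasons, the company's network team bought a Huawei firewall. Once it was installed, the running sparkThrift service would disconnect after 2 hours and 10 minutes, and each time it looked like a normal shutdown rather than a crash.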
sparkThrift log:
2020-01-17 13:25:25 INFO HiveSessionImpl:318 - Operation log session directory is created: /var/log/hive/operation_logs/5f686f2b-c5fe-4e4b-813c-5e6596ef0f68
2020-01-17 13:33:56 ERROR YarnClientSchedulerBackend:70 - YARN application has exited unexpectedly with state SUCCEEDED! Check the YARN application logs for more details.
2020-01-17 13:33:56 INFO HiveServer2:112 - Shutting down HiveServer2
2020-01-17 13:33:56 INFO ThriftCLIService:188 - Thrift server has stopped
2020-01-17 13:33:56 INFO AbstractService:125 - Service:ThriftBinaryCLIService is stopped.
2020-01-17 13:33:56 INFO AbstractService:125 - Service:OperationManager is stopped.
2020-01-17 13:33:56 INFO AbstractService:125 - Service:SessionManager is stopped.
2020-01-17 13:33:56 INFO AbstractConnector:318 - Stopped Spark@201aa8c1{HTTP/1.1,[http/1.1]}{0.0.0.0:4041}
2020-01-17 13:33:56 INFO SparkUI:54 - Stopped Spark web UI at http://zmbd-vpc-wk01:4041
2020-01-17 13:33:56 INFO YarnSchedulerBackend$YarnDriverEndpoint:54 - Disabling executor 2.
2020-01-17 13:33:56 INFO YarnSchedulerBackend$YarnDriverEndpoint:54 - Disabling executor 3.
2020-01-17 13:33:57 INFO YarnSchedulerBackend$YarnDriverEndpoint:54 - Disabling executor 14.
2020-01-17 13:33:57 INFO YarnSchedulerBackend$YarnDriverEndpoint:54 - Disabling executor 11.
2020-01-17 13:33:57 INFO YarnSchedulerBackend$YarnDriverEndpoint:54 - Disabling executor 15.
2020-01-17 13:33:57 INFO YarnSchedulerBackend$YarnDriverEndpoint:54 - Disabling executor 10.
2020-01-17 13:33:57 INFO YarnSchedulerBackend$YarnDriverEndpoint:54 - Disabling executor 5.
2020-01-17 13:33:57 INFO YarnSchedulerBackend$YarnDriverEndpoint:54 - Disabling executor 13.
2020-01-17 13:33:57 INFO YarnSchedulerBackend$YarnDriverEndpoint:54 - Disabling executor 4.
2020-01-17 13:33:57 INFO YarnSchedulerBackend$YarnDriverEndpoint:54 - Disabling executor 6.
2020-01-17 13:33:57 INFO YarnSchedulerBackend$YarnDriverEndpoint:54 - Disabling executor 12.
2020-01-17 13:33:57 INFO YarnSchedulerBackend$YarnDriverEndpoint:54 - Disabling executor 1.
2020-01-17 13:33:57 INFO YarnSchedulerBackend$YarnDriverEndpoint:54 - Disabling executor 7.
2020-01-17 13:33:57 INFO YarnSchedulerBackend$YarnDriverEndpoint:54 - Disabling executor 8.
2020-01-17 13:33:57 INFO YarnSchedulerBackend$YarnDriverEndpoint:54 - Disabling executor 9.
2020-01-17 13:34:06 INFO AbstractService:125 - Service:CLIService is stopped.
2020-01-17 13:34:06 INFO AbstractService:125 - Service:HiveServer2 is stopped.
2020-01-17 13:34:30 INFO ThriftCLIService:107 - Session disconnected without closing properly, close it now
2020-01-17 13:34:30 ERROR HiveSessionImpl:691 - Failed to cleanup session log dir: SessionHandle [5675bfe5-18e4-45d5-8ce3-e83de8c09530]
java.io.FileNotFoundException: File does not exist: /var/log/hive/operation_logs/5675bfe5-18e4-45d5-8ce3-e83de8c09530
Judging from the log, the service exited normally. The error in the YARN logs showed that the connection to the sparkThrift driver had been lost.
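To confirm that the shutdowns recur at a fixed interval, the exit events can be pulled out of the driver log automatically. The following is a minimal sketch, assuming the log line layout shown above (timestamp, level, message); the function names find_yarn_exits and gaps_between are hypothetical helpers, not part of Spark or CDH.

```python
import re
from datetime import datetime

# Assumed layout: "YYYY-MM-DD HH:MM:SS LEVEL message", as in the log above.
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (\w+) (.*)$")
EXIT_RE = re.compile(r"YARN application has exited unexpectedly with state (\w+)")


def find_yarn_exits(lines):
    """Return a list of (timestamp, final_state) for each YARN exit line."""
    exits = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # stack traces and other lines without a timestamp
        ts, _level, msg = m.groups()
        hit = EXIT_RE.search(msg)
        if hit:
            when = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
            exits.append((when, hit.group(1)))
    return exits


def gaps_between(exits):
    """Return the time differences between consecutive exit events."""
    times = [when for when, _state in exits]
    return [later - earlier for earlier, later in zip(times, times[1:])]
```

Feeding the lines of several days of driver logs to find_yarn_exits and then passing the result to gaps_between gives the spacing between shutdowns. A nearly constant gap points to a timer somewhere (such as a session aging time) rather than to a workload-dependent failure.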
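Searching Baidu and Google turned up no cause. Since a service that had been running fine suddenly started failing, the next step was to find out whether the network or security team had changed anything. The company's network team had just installed the firewall, so I could only ask them to help investigate. The problem was finally traced to the firewall policy: connections were aged out after a period of time. After the policy was updated, the service ran normally again.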
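The fix in this case was made on the firewall side. Purely as an illustration of the aging mechanism, and not something that was done here, a stateful firewall typically drops a connection entry when no packets have passed for its aging time, and TCP keepalive probes are one generic way for an application to keep an idle connection visible. The sketch below shows this on a plain Python socket; enable_keepalive is a hypothetical helper, the timing values are assumptions, and Spark's own JVM connections are not configured this way.

```python
import socket


def enable_keepalive(sock, idle_s=600, interval_s=60, probes=5):
    """Turn on TCP keepalive so an idle connection still sends periodic probes.

    idle_s, interval_s and probes are illustrative values; idle_s would need
    to be shorter than the firewall's session aging time to have any effect.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The fine-grained options below are platform-specific (present on Linux),
    # so each one is applied only if this Python build exposes it.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    return sock
```

When the application cannot be changed, adjusting the firewall's aging policy for the affected traffic, as was done here, is the more direct fix.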
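A lesson learned: when troubleshooting a service, the problem may not be on your side at all, so don't get stuck on it for too long. If you have spent two or three hours without finding the cause, ask the teams connected to the service whether they have changed anything. Both network and security changes can affect service stability, and the more experienced people are involved in the investigation, the easier the problem is to solve.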