Connection reset by peer

While deploying the project, service A failed to start with the following error:

14-Aug-2019 12:52:49.860 SEVERE [main] org.springframework.web.context.ContextLoader.initWebApplicationContext Context initialization failed
        org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'adminService': Unsatisfied dependency expressed through field 'adminDao'; nested exception is org.springframework.beans.factory.UnsatisfiedDependencyException: Error creating bean with name 'adminDao': Unsatisfied dependency expressed through field 'jdbcTemplate'; nested exception is org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'jdbcTemplate' defined in URL .
                at org.springframework.beans.factory.support.ConstructorResolver.autowireConstructor(ConstructorResolver.java:275)
                ... 80 more
        Caused by: com.zaxxer.hikari.pool.HikariPool$PoolInitializationException: Failed to initialize pool: IO Error: Connection reset by peer, Authentication lapse 98879 ms.
                at com.zaxxer.hikari.pool.HikariPool.throwPoolInitializationException(HikariPool.java:576)
                ... 82 more
        Caused by: java.sql.SQLRecoverableException: IO Error: Connection reset by peer, Authentication lapse 98879 ms.
                at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:794)
                at oracle.jdbc.driver.PhysicalConnection.connect(PhysicalConnection.java:688)
                ... 89 more
        Caused by: java.io.IOException: Connection reset by peer, Authentication lapse 98879 ms.
                at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:790)
                ... 97 more
        Caused by: java.io.IOException: Connection reset by peer
                at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
                at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
                ... 97 more

The error is intermittent. In this case most startup attempts failed, and only very rarely (once) did the project start successfully.

Service A needs to connect to the database on server B, but the database username and password in use were confirmed to be correct.

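Looking further, "connection reset by peer" means that while data was being written to the peer, the peer signalled that it had already closed the connection. This error is typically reported when writing to a socket that the other side has already closed.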
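To make the meaning of a reset concrete, here is a minimal sketch that reproduces one on the loopback interface. The class and method names (ResetDemo, peerResetIsReported) are hypothetical and not part of the original project. The sketch assumes that closing a socket with SO_LINGER set to 0 aborts the connection with a TCP RST, and it observes the reset on a read because that is easier to reproduce reliably; in the stack trace above the same condition surfaced on a write.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketException;

class ResetDemo {
    /** Returns true if the client sees a SocketException after the peer aborts the connection. */
    static boolean peerResetIsReported() throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, server.getLocalPort())) {
            client.setSoTimeout(5000); // do not hang forever if no reset arrives
            Socket accepted = server.accept();
            // Assumption: SO_LINGER with a timeout of 0 makes close() abort the
            // connection (TCP RST) instead of doing the normal FIN handshake.
            accepted.setSoLinger(true, 0);
            accepted.close();
            try {
                client.getInputStream().read();
                return false; // the peer closed cleanly, no reset was observed
            } catch (SocketException e) {
                System.out.println("client saw: " + e.getMessage());
                return true;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("reset reported: " + peerResetIsReported());
    }
}
```

The point of the sketch is only that the exception is raised on the side that keeps using the socket, not on the side that closed it, which is why service A reports the error even though the close happened on machine B.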

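The first suspicion was therefore that the database server had accepted too many connections and, once the maximum was exceeded, actively closed some of them, causing the client to report connection reset by peer. However, the database's maximum connection count is 1000 and only a little over 140 connections were in use, so the database was not the cause.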

Link to the solution

Based on the symptom of "happening intermittently", this appears to be a known issue around "/dev/random" and "/dev/urandom".

Tried the steps suggested below, which worked around it:

1. Open the $JAVA_HOME/jre/lib/security/java.security file in a text editor.

2. Change the line:
securerandom.source=file:/dev/random
to read:
securerandom.source=file:/dev/urandom

3. Save your change and exit the text editor.
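After editing the file, it can help to confirm which seed source the JVM actually sees. The following is a minimal sketch, not part of the original fix; the class and helper names (SecureRandomSourceCheck, isBlockingSource, effectiveSource) are hypothetical. It reads the securerandom.source security property and also checks the java.security.egd system property, which, according to the comments in the java.security file, overrides securerandom.source when it is set.

```java
import java.security.Security;

class SecureRandomSourceCheck {
    /** Simplified check: true only for the exact blocking device path. */
    static boolean isBlockingSource(String source) {
        return "file:/dev/random".equals(source);
    }

    /** The system property, when present, takes precedence over the security property. */
    static String effectiveSource(String egdOverride, String configured) {
        return (egdOverride != null && !egdOverride.isEmpty()) ? egdOverride : configured;
    }

    public static void main(String[] args) {
        // Value from the java.security file that was edited in the steps above.
        String configured = Security.getProperty("securerandom.source");
        // Optional per-process override, e.g. passed as -Djava.security.egd=...
        String override = System.getProperty("java.security.egd");
        String effective = effectiveSource(override, configured);

        System.out.println("securerandom.source = " + configured);
        System.out.println("java.security.egd   = " + override);
        System.out.println("effective source    = " + effective);
        System.out.println("may block on entropy: " + isBlockingSource(effective));
    }
}
```

Run it with the same JAVA_HOME that service A uses; if it still reports file:/dev/random, the edit was probably made to a different JRE than the one running the service.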

Official Oracle link

The library used for random number generation in Sun's JVM relies on /dev/random by default for UNIX platforms.
This can potentially block the WebLogic SIP Server process because on some operating systems /dev/random waits for a certain amount of "noise" to be generated on the host machine before returning a result.
Although /dev/random is more secure, BEA recommends using /dev/urandom if the default JVM configuration delays WebLogic SIP Server startup.

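Back to the current case: when service A's JVM uses /dev/random, it blocks while waiting for enough noise to be generated. During that wait, machine B closes the socket, either because of a timeout or for some other reason, and when service A next writes to that socket it gets the connection reset by peer error. The "Authentication lapse 98879 ms" in the stack trace is consistent with this explanation: roughly 99 seconds had passed during the logon before the write failed.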
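One way to check whether entropy gathering is really slow on the machine running service A is to time it directly. This is a minimal diagnostic sketch with hypothetical names (EntropyProbe, timeMillis); it assumes that SecureRandom.generateSeed() draws from the configured seed source and may therefore block on /dev/random, while nextBytes() on the same instance usually does not. How each call behaves depends on the JDK version, provider and kernel, so treat the numbers as a hint rather than proof.

```java
import java.security.SecureRandom;

class EntropyProbe {
    /** Runs the task once and returns the elapsed wall-clock time in milliseconds. */
    static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        SecureRandom random = new SecureRandom();

        // Assumption: generateSeed() goes to the seed source and can block
        // when that source is /dev/random and the host has little entropy.
        long seedMs = timeMillis(() -> random.generateSeed(16));

        // Ordinary random bytes, for comparison.
        long bytesMs = timeMillis(() -> random.nextBytes(new byte[16]));

        System.out.println("generateSeed(16): " + seedMs + " ms");
        System.out.println("nextBytes(16):    " + bytesMs + " ms");
    }
}
```

If generateSeed takes seconds rather than milliseconds with the original configuration and returns quickly after switching to /dev/urandom, that supports the explanation above.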
