Troubleshooting a FastDFS Exception Under Heavy Request Load
Background
For my recent graduation project I used FastDFS to store image files. Under ordinary use everything worked fine, but when a large number of image requests hit FastDFS at once, a ClientAbortException was thrown:
org.apache.catalina.connector.ClientAbortException: java.io.IOException: An established connection was aborted by the software in your host machine.
at org.apache.catalina.connector.OutputBuffer.realWriteBytes(OutputBuffer.java:356)
at org.apache.catalina.connector.OutputBuffer.flushByteBuffer(OutputBuffer.java:808)
at org.apache.catalina.connector.OutputBuffer.append(OutputBuffer.java:713)
at org.apache.catalina.connector.OutputBuffer.writeBytes(OutputBuffer.java:391)
at org.apache.catalina.connector.OutputBuffer.write(OutputBuffer.java:369)
at org.apache.catalina.connector.CoyoteOutputStream.write(CoyoteOutputStream.java:96)
at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1793)
at org.apache.commons.io.IOUtils.copyLarge(IOUtils.java:1769)
at org.apache.commons.io.IOUtils.copy(IOUtils.java:1744)
at cn.dmall.common.util.FastDFSClient.downloadFile(FastDFSClient.java:146)
at cn.dmall.manager.controller.ImageController.imageShow(ImageController.java:68)
at sun.reflect.GeneratedMethodAccessor104.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:205)
at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:133)
at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:97)
at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:827)
at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:738)
at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:85)
at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:967)
at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:901)
at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:970)
at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:861)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:635)
at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:846)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:742)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:230)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:165)
at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:192)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:165)
at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(CharacterEncodingFilter.java:197)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:107)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:192)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:165)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:198)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96)
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:478)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:140)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:80)
at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:624)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:87)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:341)
at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:783)
at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66)
at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:798)
at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1441)
at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Thread.java:748)
Cause
To state the cause up front: the FastDFS server side is configured with a maximum of 256 connections, and once the number of connections exceeds that limit, this exception is thrown.
Solutions
Option 1: the simplest, most brute-force fix is to increase the server-side max_connections parameter, as sketched below.
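For reference, a minimal sketch of the setting (the parameter exists in both tracker.conf and storage.conf; 1024 is just an example value, and the daemons need a restart after changing it):

# tracker.conf / storage.conf
# the default is 256; raise it to allow more concurrent client connections
max_connections = 1024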
Note: if the trackerServer is one you created yourself (as opposed to one the FastDFS client creates internally; see the Troubleshooting section below for details), it is not released automatically after use. So if you open a new trackerServer for every incoming request and never release any of them, the 256th request will hit the maximum connection count and trigger the exception. This is well worth fixing: my approach was to pool the trackerServer connections, building a small connection pool so that established trackerServer connections are reused.
Option 2: a more convenient approach is nginx + FastDFS, with caching configured in nginx; just keep the note above in mind as well.
Troubleshooting
Reproducing and analyzing the problem
I sent a large volume of requests until the error occurred.
On Windows 10, I listed the connections to FastDFS with the command

netstat -an | find "your-tracker-ip"

(piping the output through find /c "ESTABLISHED" gives just a count) and found 255 connections to the tracker, all in the ESTABLISHED state. My guess was that the number of connections to the tracker had reached the maximum configured on the server.
Digging into the source
To verify this guess, I read the source of the official FastDFS Java client and found that when you call trackerClient.getConnection() to obtain a TrackerServer, the client sets ReuseAddress and SoTimeout on the socket it creates (note that SoTimeout is the read timeout, not the connect timeout). The relevant source:
public TrackerServer getConnection(int serverIndex) throws IOException {
    Socket sock = new Socket();
    sock.setReuseAddress(true);
    sock.setSoTimeout(ClientGlobal.g_network_timeout);
    sock.connect(this.tracker_servers[serverIndex], ClientGlobal.g_connect_timeout);
    return new TrackerServer(sock, this.tracker_servers[serverIndex]);
}
I then traced the source for the full flow of accessing a file and found that a trackerServer obtained this way, through our own trackerClient.getConnection() call, is never closed by the client library.
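To make the leak concrete, here is an illustrative sketch of the problematic pattern (the method and its names are hypothetical, not the project's actual code): every call opens a fresh tracker connection and never closes it, so under load connections pile up until the server's max_connections is reached.

import org.csource.fastdfs.StorageClient1;
import org.csource.fastdfs.TrackerClient;
import org.csource.fastdfs.TrackerServer;

public static byte[] leakyDownload(String fileId) throws Exception {
    TrackerClient trackerClient = new TrackerClient();
    // getConnection() opens a brand-new socket on every call
    TrackerServer trackerServer = trackerClient.getConnection();
    StorageClient1 storageClient = new StorageClient1(trackerServer, null);
    return storageClient.download_file1(fileId);
    // trackerServer.close() is never called, so the connection leaks
}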
Fixing it with a connection pool
Since the exception was caused by too many trackerServer connections, I decided to pool them so established connections could be reused. The FastDfsConnectionPool implementation is shown below:
package cn.dmall.common.util;

import org.csource.fastdfs.ClientGlobal;
import org.csource.fastdfs.TrackerClient;
import org.csource.fastdfs.TrackerServer;

import java.io.IOException;
import java.net.Socket;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class FastDfsConnectionPool {

    private static final String CONFIG_FILENAME = "fdfs_client.conf";

    // Idle connections waiting to be reused.
    private static BlockingQueue<TrackerServer> trackerServerPool =
            new LinkedBlockingQueue<>();

    // Cap on connections we will open; keep it below the server-side max_connections.
    private static final int maxTrackerConn = Constant.getInt("fastdfs.tracker.maxConn", 255);

    // Number of connections currently alive (idle in the pool + borrowed).
    private static AtomicInteger currentTrackerConn = new AtomicInteger(0);

    private static TrackerClient trackerClient = null;

    static {
        try {
            ClientGlobal.init(CONFIG_FILENAME);
            trackerClient = new TrackerClient();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static TrackerServer borrowTrackerServer() throws InterruptedException, IOException {
        TrackerServer trackerServer = trackerServerPool.poll(2, TimeUnit.SECONDS);
        if (trackerServer == null) {
            trackerServer = getNewTrackerConn();
        } else if (!isUsable(trackerServer)) {
            // The idle connection has gone stale: discard it and get a fresh one.
            closeQuietly(trackerServer);
            trackerServer = getNewTrackerConn();
        }
        return trackerServer;
    }

    /**
     * Get a tracker connection.
     * If the current connection count is below the maximum, create a new
     * connection; otherwise wait (with a timeout) for another caller to
     * return one to the pool.
     */
    private static TrackerServer getNewTrackerConn() throws IOException, InterruptedException {
        TrackerServer trackerServer = null;
        // Reserve a slot first so concurrent callers cannot exceed the cap.
        if (currentTrackerConn.incrementAndGet() <= maxTrackerConn) {
            try {
                trackerServer = trackerClient.getConnection();
            } finally {
                if (trackerServer == null) {
                    // Creation failed: give the slot back.
                    currentTrackerConn.decrementAndGet();
                }
            }
        } else {
            currentTrackerConn.decrementAndGet();
            // Already at the cap: wait for someone to return a connection.
            trackerServer = trackerServerPool.poll(2, TimeUnit.SECONDS);
        }
        if (trackerServer == null) {
            throw new IllegalStateException("borrowTrackerServer failed!");
        }
        return trackerServer;
    }

    private static boolean isUsable(TrackerServer conn) throws IOException {
        if (conn == null) {
            return false;
        }
        Socket socket = conn.getSocket();
        return socket != null
                && socket.isBound()
                && !socket.isClosed()
                && socket.isConnected()
                && !socket.isInputShutdown()
                && !socket.isOutputShutdown();
    }

    public static void releaseTrackerServer(TrackerServer conn) throws IOException {
        if (conn == null) {
            return;
        }
        if (!isUsable(conn)) {
            // Broken connection: close it and free its slot instead of pooling it.
            closeQuietly(conn);
            return;
        }
        trackerServerPool.add(conn);
    }

    private static void closeQuietly(TrackerServer conn) {
        currentTrackerConn.decrementAndGet();
        try {
            conn.close();
        } catch (IOException ignored) {
        }
    }

    public static void close() throws IOException {
        // Drain the pool, closing every idle connection.
        TrackerServer trackerServer;
        while ((trackerServer = trackerServerPool.poll()) != null) {
            trackerServer.close();
        }
        currentTrackerConn.set(0);
    }
}
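With the pool in place, callers borrow a connection, use it, and always release it. A sketch of how a download helper might use the pool (the method name and the StorageClient1 usage here are illustrative, not the project's actual FastDFSClient code):

import org.csource.fastdfs.StorageClient1;
import org.csource.fastdfs.TrackerServer;

public static byte[] downloadFile(String fileId) throws Exception {
    TrackerServer trackerServer = FastDfsConnectionPool.borrowTrackerServer();
    try {
        StorageClient1 storageClient = new StorageClient1(trackerServer, null);
        return storageClient.download_file1(fileId);
    } finally {
        // Always return the connection so other requests can reuse it.
        FastDfsConnectionPool.releaseTrackerServer(trackerServer);
    }
}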
A remaining problem
After pooling, the trackerServer connections could be reused; in practice their count dropped from 255 to 6. But then a new problem appeared: the storageServer connections ran into the same issue, with the request volume exceeding max_connections and the same exception being thrown.
This prompted some reflection: the storageServer case is different from the trackerServer case. With the trackerServer there were many idle connections sitting unused, and the pool solved that by reusing them. The storageServer side has no idle connections (a storageServer connection is closed as soon as it is done), so the question becomes "how many connections can the storageServer machine actually sustain?", and that is not something pooling or reuse can solve.
Finally: I think the images themselves can be cached, so that FastDFS does not need to be queried at all. nginx + FastDFS provides exactly this capability, so a dedicated image server can be set up; a sketch follows.
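A minimal sketch of nginx caching in front of the image endpoint (the paths, cache zone name, and upstream address are illustrative assumptions, not the project's actual configuration):

# nginx.conf (http block) -- cache image responses on disk
proxy_cache_path /data/nginx/cache levels=1:2 keys_zone=img_cache:10m
                 max_size=1g inactive=7d use_temp_path=off;

server {
    listen 80;

    location /images/ {
        proxy_pass http://127.0.0.1:8080;    # the app that reads from FastDFS
        proxy_cache img_cache;
        proxy_cache_valid 200 7d;            # keep successful responses for a week
        proxy_cache_use_stale error timeout updating;
        add_header X-Cache-Status $upstream_cache_status;
    }
}

With this in place, repeated requests for the same image are served from nginx's cache and never reach FastDFS at all.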