The third step is what finally, permanently fixed the problem, so feel free to jump straight to step three for the solution. Steps one and two record my earlier attempts; they missed the root cause, but they are still worth keeping, because Linux really does cap the number of open files, and those limits did need raising.
Exception log:
- ............
- Oct 17, 2011 5:22:41 PM org.apache.tomcat.util.net.JIoEndpoint$Acceptor run
- SEVERE: Socket accept failed
- java.net.SocketException: Too many open files
- at java.net.PlainSocketImpl.socketAccept(Native Method)
- at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:375)
- at java.net.ServerSocket.implAccept(ServerSocket.java:470)
- at java.net.ServerSocket.accept(ServerSocket.java:438)
- at org.apache.tomcat.util.net.DefaultServerSocketFactory.acceptSocket(DefaultServerSocketFactory.java:59)
- at org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:210)
- at java.lang.Thread.run(Thread.java:636)
- Oct 17, 2011 5:22:43 PM org.apache.tomcat.util.net.JIoEndpoint$Acceptor run
- SEVERE: Socket accept failed
- java.net.SocketException: Too many open files
- at java.net.PlainSocketImpl.socketAccept(Native Method)
- at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:375)
- at java.net.ServerSocket.implAccept(ServerSocket.java:470)
- at java.net.ServerSocket.accept(ServerSocket.java:438)
- at org.apache.tomcat.util.net.DefaultServerSocketFactory.acceptSocket(DefaultServerSocketFactory.java:59)
- at org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:210)
- at java.lang.Thread.run(Thread.java:636)
- Oct 17, 2011 5:22:44 PM org.apache.tomcat.util.net.JIoEndpoint$Acceptor run
- SEVERE: Socket accept failed
- java.net.SocketException: Too many open files
- at java.net.PlainSocketImpl.socketAccept(Native Method)
- at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:375)
- at java.net.ServerSocket.implAccept(ServerSocket.java:470)
- at java.net.ServerSocket.accept(ServerSocket.java:438)
- at org.apache.tomcat.util.net.DefaultServerSocketFactory.acceptSocket(DefaultServerSocketFactory.java:59)
- at org.apache.tomcat.util.net.JIoEndpoint$Acceptor.run(JIoEndpoint.java:210)
- at java.lang.Thread.run(Thread.java:636)
- ............
Cause:
Linux limits how many files a process may hold open; once the program blew past that limit, Tomcat could no longer accept sockets. The real culprit was a static method that opened a file and never closed it. There are two remedies: raise the Linux open-file limit (which cannot fix the root cause), or fix the bug in the code.
First fix
Fix:
Method 1: raise the system's open-file limit (cannot fix the root cause)
1. By default Linux allows each process at most 1024 simultaneously open files. Check the limits with ulimit -a and look for the line "open files (-n) 1024":
- [root@**** bin]# ulimit -a
- core file size (blocks, -c) 0
- data seg size (kbytes, -d) unlimited
- scheduling priority (-e) 0
- file size (blocks, -f) unlimited
- pending signals (-i) 16384
- max locked memory (kbytes, -l) 32
- max memory size (kbytes, -m) unlimited
- open files (-n) 1024
- pipe size (512 bytes, -p) 8
- POSIX message queues (bytes, -q) 819200
- real-time priority (-r) 0
- stack size (kbytes, -s) 10240
- cpu time (seconds, -t) unlimited
- max user processes (-u) 16384
- virtual memory (kbytes, -v) unlimited
- file locks (-x) unlimited
- [root@**** bin]#
2. Raising the per-process maximum usually relieves the symptom: ulimit -n 4096
- [root@**** bin]# ulimit -n 4096
- [root@**** bin]# ulimit -a
- core file size (blocks, -c) 0
- data seg size (kbytes, -d) unlimited
- scheduling priority (-e) 0
- file size (blocks, -f) unlimited
- pending signals (-i) 16384
- max locked memory (kbytes, -l) 32
- max memory size (kbytes, -m) unlimited
- open files (-n) 4096
- pipe size (512 bytes, -p) 8
- POSIX message queues (bytes, -q) 819200
- real-time priority (-r) 0
- stack size (kbytes, -s) 10240
- cpu time (seconds, -t) unlimited
- max user processes (-u) 16384
- virtual memory (kbytes, -v) unlimited
- file locks (-x) unlimited
- [root@**** bin]#
The maximum open-file count has now been raised (note that ulimit set this way only affects the current shell session).
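To watch how close a JVM is getting to its per-process limit from inside the process, HotSpot JVMs on Unix-like systems expose the counters through com.sun.management.UnixOperatingSystemMXBean. A small sketch (the class name FdStats is mine; on non-Unix JVMs the bean is unavailable and -1 is returned):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Prints this JVM's open file-descriptor count and its limit (Unix JVMs only).
public class FdStats {
    public static long openFds() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os).getOpenFileDescriptorCount();
        }
        return -1; // not a Unix JVM
    }

    public static long maxFds() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os).getMaxFileDescriptorCount();
        }
        return -1;
    }

    public static void main(String[] args) {
        // On a leaking server this first number climbs toward the second over time.
        System.out.println("open fds: " + openFds() + " / max: " + maxFds());
    }
}
```

Logging these two numbers periodically makes a descriptor leak visible long before the "Too many open files" errors start.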
Method 2: fix the bug in the program:
A static method opened a file on every request and never closed it. Closing the input stream fixes the leak:
- public static List<GpsPoint> getArrayList() throws IOException {
-     List<GpsPoint> pointList = null;
-     // Load the data file from the classpath
-     InputStream in = ParseGpsFile.class.getClassLoader().getResourceAsStream("GPS1.TXT");
-     if (null == in) {
-         System.out.println("Failed to open GPS1.TXT");
-         return pointList;
-     }
-     pointList = new ArrayList<GpsPoint>();
-     try {
-         BufferedReader br = new BufferedReader(new InputStreamReader(in));
-         String longtude = "";
-         String latude = "";
-         String elevation = "";
-         while ((longtude = br.readLine()) != null) {
-             // Next line: latitude
-             latude = br.readLine();
-             if (null == latude) {
-                 break;
-             }
-             // Next line: elevation
-             elevation = br.readLine();
-             if (null == elevation) { // the original re-checked latude here, a copy-paste bug
-                 break;
-             }
-             // Add one point
-             pointList.add(gps2point(longtude, latude, elevation));
-         }
-     } catch (Exception e) {
-         e.printStackTrace();
-     } finally {
-         // Close the stream on every path; this was the leak
-         in.close();
-     }
-     return pointList;
- }
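On Java 7 and later the same loop is safer with try-with-resources, which closes the stream on every exit path (normal return, break, or exception). A self-contained sketch: GpsPoint and gps2point from the original are omitted, so the triples are kept as plain strings, and the data can come from any Reader:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Leak-free rewrite of the triple-reading loop using try-with-resources.
public class GpsReader {
    // Each point is {longitude, latitude, elevation}; an incomplete trailing
    // triple is dropped, matching the original loop's behavior.
    public static List<String[]> readPoints(Reader source) throws IOException {
        List<String[]> points = new ArrayList<String[]>();
        try (BufferedReader br = new BufferedReader(source)) {
            String lon;
            while ((lon = br.readLine()) != null) {
                String lat = br.readLine();
                if (lat == null) break;
                String ele = br.readLine();
                if (ele == null) break;
                points.add(new String[] { lon, lat, ele });
            }
        } // br (and the underlying source) are closed here, no matter what
        return points;
    }

    public static void main(String[] args) throws IOException {
        List<String[]> pts = readPoints(new StringReader("116.3\n39.9\n44\n117.1\n36.6"));
        System.out.println(pts.size()); // prints 1: the trailing incomplete pair is dropped
    }
}
```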
Problem solved, or so I thought.
-----
Second fix:
Real-world testing showed the problem was still there. I dug up more methods, and after a round of testing the following appears to have resolved it.
The server runs Red Hat (server edition, not the desktop edition); the steps below are all required:
1. Add to /etc/pam.d/login:
- session required /lib/security/pam_limits.so
(Read the comments in that file before editing.)
The file before the change:
- [root@**** ~]# vi /etc/pam.d/login
- #%PAM-1.0
- auth [user_unknown=ignore success=ok ignore=ignore default=bad] pam_securetty.so
- auth include system-auth
- account required pam_nologin.so
- account include system-auth
- password include system-auth
- # pam_selinux.so close should be the first session rule
- session required pam_selinux.so close
- session optional pam_keyinit.so force revoke
- session required pam_loginuid.so
- session include system-auth
- session optional pam_console.so
- # pam_selinux.so open should only be followed by sessions to be executed in the user context
- session required pam_selinux.so open
- ~
- "/etc/pam.d/login" 15L, 644C
After the change:
- [root@**** ~]# cat /etc/pam.d/login
- #%PAM-1.0
- auth [user_unknown=ignore success=ok ignore=ignore default=bad] pam_securetty.so
- auth include system-auth
- account required pam_nologin.so
- account include system-auth
- password include system-auth
- # pam_selinux.so close should be the first session rule
- session required pam_selinux.so close
- session optional pam_keyinit.so force revoke
- session required pam_loginuid.so
- session include system-auth
- session optional pam_console.so
- # pam_selinux.so open should only be followed by sessions to be executed in the user context
- session required pam_selinux.so open
- # kevin.xie added, fixed 'too many open file' bug, limit open max files 1024, 2011-10-24
- session required /lib/security/pam_limits.so
- [root@**** ~]#
2. Add to /etc/security/limits.conf:
- root - nofile 1006154
Here root is a username; to apply the limit to all users, replace it with *. Pick a value that matches your hardware; do not set it recklessly high.
Before the change:
- [root@**** ~]# cat /etc/security/limits.conf
- # /etc/security/limits.conf
- #
- #Each line describes a limit for a user in the form:
- #
- #<domain> <type> <item> <value>
- #
- #Where:
- #<domain> can be:
- # - an user name
- # - a group name, with @group syntax
- # - the wildcard *, for default entry
- # - the wildcard %, can be also used with %group syntax,
- # for maxlogin limit
- #
- #<type> can have the two values:
- # - "soft" for enforcing the soft limits
- # - "hard" for enforcing hard limits
- #
- #<item> can be one of the following:
- # - core - limits the core file size (KB)
- # - data - max data size (KB)
- # - fsize - maximum filesize (KB)
- # - memlock - max locked-in-memory address space (KB)
- # - nofile - max number of open files
- # - rss - max resident set size (KB)
- # - stack - max stack size (KB)
- # - cpu - max CPU time (MIN)
- # - nproc - max number of processes
- # - as - address space limit
- # - maxlogins - max number of logins for this user
- # - maxsyslogins - max number of logins on the system
- # - priority - the priority to run user process with
- # - locks - max number of file locks the user can hold
- # - sigpending - max number of pending signals
- # - msgqueue - max memory used by POSIX message queues (bytes)
- # - nice - max nice priority allowed to raise to
- # - rtprio - max realtime priority
- #
- #<domain> <type> <item> <value>
- #
- #* soft core 0
- #* hard rss 10000
- #@student hard nproc 20
- #@faculty soft nproc 20
- #@faculty hard nproc 50
- #ftp hard nproc 0
- #@student - maxlogins 4
- # End of file
- [root@**** ~]#
After the change:
- [root@**** ~]# cat /etc/security/limits.conf
- # /etc/security/limits.conf
- #
- #Each line describes a limit for a user in the form:
- #
- #<domain> <type> <item> <value>
- #
- #Where:
- #<domain> can be:
- # - an user name
- # - a group name, with @group syntax
- # - the wildcard *, for default entry
- # - the wildcard %, can be also used with %group syntax,
- # for maxlogin limit
- #
- #<type> can have the two values:
- # - "soft" for enforcing the soft limits
- # - "hard" for enforcing hard limits
- #
- #<item> can be one of the following:
- # - core - limits the core file size (KB)
- # - data - max data size (KB)
- # - fsize - maximum filesize (KB)
- # - memlock - max locked-in-memory address space (KB)
- # - nofile - max number of open files
- # - rss - max resident set size (KB)
- # - stack - max stack size (KB)
- # - cpu - max CPU time (MIN)
- # - nproc - max number of processes
- # - as - address space limit
- # - maxlogins - max number of logins for this user
- # - maxsyslogins - max number of logins on the system
- # - priority - the priority to run user process with
- # - locks - max number of file locks the user can hold
- # - sigpending - max number of pending signals
- # - msgqueue - max memory used by POSIX message queues (bytes)
- # - nice - max nice priority allowed to raise to
- # - rtprio - max realtime priority
- #
- #<domain> <type> <item> <value>
- #
- #* soft core 0
- #* hard rss 10000
- #@student hard nproc 20
- #@faculty soft nproc 20
- #@faculty hard nproc 50
- #ftp hard nproc 0
- #@student - maxlogins 4
- # kevin.xie added, fixed 'too many open file' bug, limit open max files 1024, 2011-10-24
- * - nofile 102400
- # End of file
- [root@**** ~]#
3. Add to /etc/rc.local:
- echo 8061540 > /proc/sys/fs/file-max
Before the change:
- [root@**** ~]# cat /proc/sys/fs/file-max
- 4096
- [root@**** ~]#
After the change:
- [root@**** ~]# cat /proc/sys/fs/file-max
- 4096000
- [root@**** ~]#
With those three steps done, it should work.
**************************************
Supplementary notes:
/proc/sys/fs/file-max
This file sets the maximum number of file handles the kernel will allocate. If users get errors saying they cannot open more files because the maximum has been reached, this value may need to be raised. It can be changed at any time by writing a new number into the file.
Default: 4096
/proc/sys/fs/file-nr
This file is related to file-max and holds three values:
the number of allocated file handles
the number of file handles in use
the maximum number of file handles
It is read-only and purely informational.
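The three file-nr counters can be read programmatically as well. A minimal Java sketch (Linux only; the FileNr class name is mine):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Reads the three counters from /proc/sys/fs/file-nr: allocated handles,
// allocated-but-unused handles, and the system-wide maximum (== file-max).
public class FileNr {
    public static long[] read() throws IOException {
        String line = Files.readAllLines(Paths.get("/proc/sys/fs/file-nr")).get(0);
        String[] parts = line.trim().split("\\s+");
        return new long[] {
            Long.parseLong(parts[0]),
            Long.parseLong(parts[1]),
            Long.parseLong(parts[2])
        };
    }

    public static void main(String[] args) throws IOException {
        long[] nr = read();
        System.out.println("allocated=" + nr[0] + " unused=" + nr[1] + " max=" + nr[2]);
    }
}
```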
About the "open files" limit
Linux limits the system resources available to each user; any multi-user system has to, for otherwise an ordinary user with a runaway loop could bring the system to "denial of service" in short order.
Today Tomcat's log reported "too many open files", and my first thought was the "open files" limit controlled by ulimit. Then the trouble started: I added ulimit -n 4096 to /etc/profile, and after saving, every ordinary-user login produced the error "ulimit: open files: cannot modify limit: Operation not permitted", while ordinary users' open files limit stayed at the default 1024.
So I went searching for information about ulimit; the web has it in abundance. Two things I had not known before stood out:
the first is the kernel parameter fs.file-max, mapped to /proc/sys/fs/file-max;
the second is the configuration file /etc/security/limits.conf.
Much of what I found claimed that setting /proc/sys/fs/file-max to 4096 has the same effect as ulimit -n 4096. I doubted that, for two reasons. First, ulimit is a command ordinary users can run, whereas only root may set fs.file-max. Second, fs.file-max is obviously a global setting while ulimit is a local, per-process one; they clearly cannot be the same thing.
Still doubtful, I searched for a long time more without luck (in fact my keywords were not precise enough), until I finally found this in the kernel documentation, /usr/src/linux/Documentation/sysctl/fs.txt:
file-max & file-nr:
The kernel allocates file handles dynamically, but as yet it doesn't free them again. The value in file-max denotes the maximum number of file-handles that the Linux kernel will allocate. When you get lots of error messages about running out of file handles, you might want to increase this limit.
The three values in file-nr denote the number of allocated file handles, the number of unused file handles and the maximum number of file handles. When the allocated file handles come close to the maximum, but the number of unused file handles is significantly greater than 0, you've encountered a peak in your usage of file handles and you don't need to increase the maximum.
The gist of the two paragraphs:
The kernel allocates file handles dynamically but does not free them again. file-max is the maximum number of handles the kernel will allocate; when you see lots of errors about running out of handles, you may need to raise it past the old limit.
The three values in file-nr mean: the number of handles the system has allocated (in use), the number of allocated-but-unused handles, and the maximum number of handles. When the allocated count approaches the maximum while the unused count is well above zero, you have merely hit a peak in handle usage and do not need to raise file-max.
That settles it: file-max is the system-wide pool of available handles. From what I read afterwards, and from checking several systems, its default value scales with the amount of RAM: add physical memory and reboot, and the value grows, roughly 100,000 handles per 1 GB of memory.
Reading those paragraphs again, notice that they only mention increasing file-max, never decreasing it. Those who set file-max while fiddling with ulimit presumably all set it to 4096 or 2048, yet nothing seemed to break for them. In fact I set file-max to 256, ran a script that opened 500 files in vi, and got no errors at all; file-nr showed the allocated count far beyond the supposed maximum. So I guessed that lowering file-max is a no-op and can safely be ignored; occasions to lower it are rare anyway. That guess turned out to be a fatal mistake: I had tested as root. When I repeated the test as an ordinary user, the expected error appeared: "Too many open files in system". So lowering file-max does affect the system, and my earlier conclusion that "lowering file-max is meaningless" was wrong.
Next came /etc/security/limits.conf, which is simple enough to understand at a glance. Following the format described in its comments, I added two lines:
* soft nofile 4096
* hard nofile 4096
Alarmingly, some people online claim this change requires a reboot! Hard to believe that the famous UNIX family would need a restart for such a tiny change. And indeed, the next time I logged in as an ordinary user the "ulimit: open files: cannot modify limit: Operation not permitted" message was gone, and ulimit -n showed 4096.
Adjusting handle limits with lsof (reposted)
On Linux, ulimit -n shows the maximum number of file handles a single process may hold open (socket connections count as well). The system default is 1024.
For ordinary applications (Apache, system daemons) 1024 is plenty. But for single processes that handle a large number of requests, such as squid, mysql or java, it quickly becomes tight; once a process exceeds the system-defined value, the "too many files open" error appears. To find out how many handles each running process currently holds, this one-liner helps:
lsof -n | awk '{print $2}' | sort | uniq -c | sort -nr | more
Run it as root during peak hours and the output might look like:
# lsof -n | awk '{print $2}' | sort | uniq -c | sort -nr | more
131 24204
57 24244
57 24231
56 24264
The first column is the number of open handles, the second is the process ID. Given a PID, ps shows the process details:
ps -aef | grep 24204
mysql 24204 24162 99 16:15 ? 00:24:25 /usr/sbin/mysqld
So the mysql process holds the most handles here, yet only 131, far below the default limit of 1024.
But if concurrency is very high, above all on a squid server, a process may well exceed 1024. Then the limits must be tuned to match the workload. Linux distinguishes a hard limit and a soft limit, both settable through ulimit; as root:
ulimit -HSn 4096
Here H sets the hard limit, S the soft limit, and n selects the maximum number of open file handles per process. Personally I would not go above 4096: the more handles are open, the slower the response is bound to be. The setting is lost when the system reboots; to make it permanent, append the command to .bash_profile or to /etc/profile.
Still unresolved:
Why does Red Hat 9 desktop edition keep the 1024 TCP-connection limit even after all the changes above?
Is the desktop edition simply capped at 1024 TCP connections, or are the procedure and the relevant files just different there?
Everything above came from the web plus my own testing. Since those tests the problem has not reappeared, but whether it is truly fixed will take longer testing to tell.
Third fix: if this one had failed too, I would have been out of ideas (after seven days of stress testing, the problem never came back)
The cause: the original MINA2 code closed each IoSession but never closed the IoConnector instance behind it, and it was precisely those unclosed IoConnector instances, one per connection, that produced "Too many open files".
The original code:
- /**
-  * <pre><b>Description:</b> returns an asynchronous session instance.
-  *
-  * @author Kevin.xie
-  * <b>Created:</b> 2011-09-15 10:06:27
-  *
-  * @return
-  * </pre>
-  */
- public static IoSession getSession1() {
-     // Create the client connector
-     IoConnector connector = new NioSocketConnector();
-     // Set the event handler
-     connector.setHandler(new WebClientHandler());
-     // Install the codec filter (line-oriented decoding)
-     connector.getFilterChain()
-             .addLast("codec", new ProtocolCodecFilter(new ObdDemuxingProtocolCodecFactory(false)));
-     // Open the connection
-     ConnectFuture future = connector.connect(new InetSocketAddress(ServerConfigBoundle.getServerIp(),
-             ServerConfigBoundle.getServerPort()));
-     // Wait for the connection to complete
-     future.awaitUninterruptibly();
-     // Obtain the session
-     IoSession session = future.getSession();
-     return session;
- }
- /**
-  * <pre><b>Description:</b> the Connector and the IoSession must both be closed.
-  *
-  * @author Kevin.xie
-  * <b>Created:</b> 2011-10-20 10:20:54
-  *
-  * @param session the session to close
-  * </pre>
-  */
- public static void closeSession(IoSession session) {
-     if (session != null && !session.isClosing()) {
-         // Close it if it is not already closing
-         session.close(true);
-         session = null;
-     }
- }
The modified code:
- /**
-  * <pre><b>Description:</b> returns both the IoConnector and the async session,
-  * so the caller can close both. Special reminder: the NioSocketConnector must
-  * be closed too, via dispose(). This is crucial; forgetting it was the root
-  * cause of the "too many open files" problem.
-  *
-  * @author Kevin.xie
-  * <b>Created:</b> 2011-09-15 10:06:27
-  *
-  * @return
-  * </pre>
-  */
- public static Map<String, Object> getConnectorAndSession() {
-     // Create the client connector
-     IoConnector connector = new NioSocketConnector();
-     // Set the event handler
-     connector.setHandler(new WebClientHandler());
-     // Install the codec filter (line-oriented decoding)
-     connector.getFilterChain()
-             .addLast("codec", new ProtocolCodecFilter(new ObdDemuxingProtocolCodecFactory(false)));
-     // Open the connection
-     ConnectFuture future = connector.connect(new InetSocketAddress(ServerConfigBoundle.getServerIp(),
-             ServerConfigBoundle.getServerPort()));
-     // Wait for the connection to complete
-     future.awaitUninterruptibly();
-     // Obtain the session
-     IoSession session = future.getSession();
-     Map<String, Object> map = new HashMap<String, Object>();
-     map.put(CONNECTOR_KEY, connector);
-     map.put(SESSION_KEY, session);
-     return map;
- }
- /**
-  * <pre><b>Description:</b> the Connector and the IoSession must both be closed.
-  * Special reminder: the NioSocketConnector is closed with dispose(); forgetting
-  * it was the root cause of the "too many open files" problem.
-  *
-  * @author Kevin.xie
-  * <b>Created:</b> 2011-10-20 10:20:54
-  *
-  * @param connector the IoConnector to close; leaking it causes "too many open files"
-  * @param session the session to close
-  * </pre>
-  */
- public static void closeConnectorAndSession(IoConnector connector, IoSession session) {
-     if (session != null && !session.isClosing()) {
-         // Close it if it is not already closing
-         session.close(true);
-         session = null;
-     }
-     if (connector != null && !(connector.isDisposing() || connector.isDisposed())) {
-         // Dispose it if it is not already disposed
-         connector.dispose();
-         connector = null;
-     }
- }
Always release the resources when done:
- Map<String, Object> resultMap = SocketUtils.getConnectorAndSession();
- IoSession session = (IoSession) resultMap.get(SocketUtils.SESSION_KEY);
- IoConnector connector = (IoConnector) resultMap.get(SocketUtils.CONNECTOR_KEY);
- ............
- ............
- // Close the connection explicitly
- SocketUtils.closeConnectorAndSession(connector, session);
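Forgetting one half of a resource pair is easy; wrapping both halves in a single AutoCloseable makes it hard. A dependency-free sketch of the idea: Connector and Session below are stand-ins for MINA's NioSocketConnector and IoSession, not the real API:

```java
// Pair a connector and its session so callers cannot release one without the other.
public class PairClose {
    public interface Connector { void dispose(); }
    public interface Session { void close(); }

    public static final class Connection implements AutoCloseable {
        public final Connector connector;
        public final Session session;
        public Connection(Connector c, Session s) { connector = c; session = s; }
        @Override public void close() {
            try {
                session.close();     // close the session first...
            } finally {
                connector.dispose(); // ...and dispose the connector even if close() threw
            }
        }
    }

    public static void main(String[] args) {
        final boolean[] released = new boolean[2];
        try (Connection c = new Connection(
                new Connector() { public void dispose() { released[0] = true; } },
                new Session()   { public void close()   { released[1] = true; } })) {
            // ... use c.session here ...
        }
        System.out.println(released[0] && released[1]); // prints true: both halves released
    }
}
```

With the real MINA types the same wrapper would call session.close(true) and connector.dispose(); try-with-resources then guarantees both run on every exit path.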
At the same time I added one more setting to /etc/security/limits.conf (it probably makes little difference either way):
- # Added during the second fix
- # kevin.xie added, fixed 'too many open file' bug, limit open max files 1024, 2011-10-24
- * - nofile 102400
- # Added now, during the third fix (probably unnecessary; I never verified it and left it in)
- # kevin.xie added, fixed 'too many open file' bug', 2012-01-04
- * soft nofile 65536
- * hard nofile 65536