Summary notes; to be tidied up when time permits.
1. Three attributes
Throughput
Latencies
Footprint: memory used (good space efficiency?)
Usually, you sacrifice one in favor of the other two
Not all three are always important
But you do need to know what's important for your application
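The more memory the heap gets the better; let each collection reclaim as much garbage as possible; optimize for two of the three attributes above
2. Test methodology
Measure all three attributes
Test under realistic conditions with a realistic load
The following must be kept consistent between runs: application, third-party libraries, framework, JDK, hardware, OS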
3. GC Choice
General rule of thumb
Latency more critical than throughput → use CMS
Exception: < 1GB heap → ParallelOldGC might be able to meet desired latencies
Latency not as important as throughput → use ParallelOldGC
Our approach presented here
Start with and measure ParallelOldGC
Move to CMS as necessary
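4. General approach
a. Measure
b. Tune further
c. Repeat a and b
5. Footprint
Available memory
How much can a single JVM be given?
With multiple JVMs or other processes (e.g., a database), how much can each JVM be given?
Remember to leave some for the OS
Generally, the larger the young and old gens the better: less frequent GCs, less pressure, and more objects have died by the time a collection runs
Minor GC time depends mainly on the amount of live objects, not on the young gen size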
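6. Initial heap size
Start with ParallelOldGC
Pick a heap size at which the application runs comfortably (easier said than done)
Use the default settings
Use -XX:+PrintCommandLineFlags to see the initial and max heap size
Check the GC log to determine heap usage
On OutOfMemoryError, increase the heap
The goal is to collect baseline data for further tuning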
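As a complement to -XX:+PrintCommandLineFlags, the baseline heap settings can also be read from inside the running JVM. The following is a minimal sketch using the standard java.lang.management API; the class name HeapSettings is made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;
import java.util.List;

class HeapSettings {
    /** Returns the heap snapshot (init, used, committed, max) of the running JVM. */
    static MemoryUsage heapUsage() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getHeapMemoryUsage();
    }

    /** Returns the JVM flags passed on the command line, e.g. -Xmx2g. */
    static List<String> jvmFlags() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        MemoryUsage heap = heapUsage();
        long mb = 1024L * 1024L;
        // getInit() and getMax() return -1 when the value is undefined.
        System.out.println("initial heap (MB): " + heap.getInit() / mb);
        System.out.println("max heap (MB):     " + heap.getMax() / mb);
        System.out.println("used heap (MB):    " + heap.getUsed() / mb);
        System.out.println("JVM flags:         " + jvmFlags());
    }
}
```

This only reports what the JVM was started with; the GC log is still needed to see how the heap is actually used over time.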
Footprint
7. Calculate Live Data Size(LDS)
Collect the data once the application has reached steady state
Trigger a full GC using one of:
jconsole/VisualVM: click Perform GC
jmap -histo:live <pid>
The GC log then shows:
live data size
max perm gen size
worst-case latency
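The live data size can also be approximated programmatically. The sketch below requests a full GC and then sums the post-collection usage of the heap memory pools. It assumes that System.gc() actually triggers a full collection, which the JVM is free to ignore (for example with -XX:+DisableExplicitGC), so the GC log remains the more reliable source. The class name LiveDataSize is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

class LiveDataSize {
    /**
     * Sums, in bytes, the usage of all heap pools as recorded after
     * their most recent collection. Pools that do not report
     * collection usage are skipped.
     */
    static long heapBytesAfterLastGc() {
        long total = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue;
            }
            MemoryUsage afterGc = pool.getCollectionUsage();
            if (afterGc != null) {
                total += afterGc.getUsed();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.gc(); // a request only; the JVM may ignore it
        long mb = 1024L * 1024L;
        System.out.println("approx. LDS (MB): " + heapBytesAfterLastGc() / mb);
    }
}
```

For a meaningful number this would have to run inside the application itself (or be adapted to a remote JMX connection) while it is in steady state.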
Determine a reasonable heap size; general rules:
Set -Xms and -Xmx to 3x to 4x LDS
Set both -XX:PermSize and -XX:MaxPermSize to around 1.2x to 1.5x the max perm gen size
Young gen should be around 1x to 1.5x LDS
Old gen should be around 2x to 3x LDS
e.g., young gen should be around 1/4 to 1/3 of the heap size (LDS of 512m: -Xmn768m -Xms2g -Xmx2g)
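The sizing rules above are plain arithmetic, so they can be sketched as a small helper. Picking the upper end of each range (4x LDS for the heap, 1.5x for the young gen and the perm gen) is an assumption made here to reproduce the 512m example; the class and method names are hypothetical.

```java
class HeapSizing {
    /**
     * Suggests starting flags from the live data size and the peak
     * perm gen usage, both in MB, using the upper end of each range.
     */
    static String startingFlags(long ldsMb, long maxPermMb) {
        long heapMb = 4 * ldsMb;               // -Xms/-Xmx: 3x to 4x LDS
        long youngMb = ldsMb * 3 / 2;          // -Xmn: 1x to 1.5x LDS
        long permMb = (maxPermMb * 3 + 1) / 2; // 1.2x to 1.5x, rounded up
        return "-Xmn" + youngMb + "m"
                + " -Xms" + heapMb + "m"
                + " -Xmx" + heapMb + "m"
                + " -XX:PermSize=" + permMb + "m"
                + " -XX:MaxPermSize=" + permMb + "m";
    }

    public static void main(String[] args) {
        // LDS of 512 MB and a peak perm gen usage of 64 MB.
        System.out.println(startingFlags(512, 64));
    }
}
```

With these inputs the old gen ends up at 2048 - 768 = 1280 MB, or 2.5x LDS, which sits inside the 2x to 3x range.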
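But heap size is not the whole story
Check the footprint of the whole process: top, prstat
Other memory consumers: native libraries, I/O buffers, Java thread stacks
Problems:
The application may not run properly within the memory it is given
Tune at the application level
If the heap size is below 1.5x LDS, GCs may become frequent enough to hurt the application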
Latencies
8. Latencies
How large are the pauses?
Average GC pause time target
Max GC pause time target
How frequently violations can be tolerated
How frequent are the pauses?
GC pause frequency target (will usually be the same as application pause frequency target)
Likely less important than pause time target
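The latency targets above (average pause, max pause, tolerated violations) can be checked against measured pauses with a few lines of code. This is a sketch only: the pause values are made-up sample data, not measurements, and PauseStats is a hypothetical name. In practice the durations would be extracted from the GC log.

```java
import java.util.Arrays;

class PauseStats {
    /** Average pause in milliseconds, or 0 for an empty sample. */
    static double averageMs(double[] pausesMs) {
        return Arrays.stream(pausesMs).average().orElse(0.0);
    }

    /** Longest pause in milliseconds, or 0 for an empty sample. */
    static double maxMs(double[] pausesMs) {
        return Arrays.stream(pausesMs).max().orElse(0.0);
    }

    /** Number of pauses that exceed the max pause time target. */
    static long violations(double[] pausesMs, double targetMs) {
        return Arrays.stream(pausesMs).filter(p -> p > targetMs).count();
    }

    public static void main(String[] args) {
        double[] pausesMs = {20, 35, 50, 120, 25}; // illustrative values
        System.out.println("average (ms): " + averageMs(pausesMs));
        System.out.println("max (ms):     " + maxMs(pausesMs));
        System.out.println("violations of a 100 ms target: " + violations(pausesMs, 100));
    }
}
```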
9. Refine Young Gen Size
Monitor young GC times
The most frequent source of GC-induced latencies
Look at both duration and frequency
If young GCs too long → decrease young gen size
It might decrease young GC times (but not always)
It will make young GCs more frequent
If young GCs too frequent → increase young gen size
It might increase young GC times (but not always)
When changing the young gen size
Try to keep old gen size constant
e.g., increase young gen size by 1g
-Xms2g -Xmx2g -Xmn1g → -Xms3g -Xmx3g -Xmn2g
Old gen size should not be much smaller than 1.5x LDS
A very small young gen can be counter-productive
Very frequent young GCs
Generally, the young gen should not be much smaller than around 10% of the heap
There is no good remedy for GCs that take too long, other than changing the application or spreading it across multiple JVMs
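The rule of keeping the old gen size constant while changing the young gen size comes down to adjusting -Xms and -Xmx by the same amount as -Xmn. A minimal sketch, with hypothetical names and sizes in MB:

```java
class YoungGenResize {
    /**
     * Returns new heap flags for a changed young gen size while keeping
     * the old gen (heap minus young gen) at its current size.
     */
    static String resize(long heapMb, long youngMb, long newYoungMb) {
        long oldMb = heapMb - youngMb;       // old gen stays fixed
        long newHeapMb = oldMb + newYoungMb; // heap grows or shrinks with the young gen
        return "-Xms" + newHeapMb + "m -Xmx" + newHeapMb + "m -Xmn" + newYoungMb + "m";
    }

    public static void main(String[] args) {
        // -Xms2g -Xmx2g -Xmn1g with the young gen increased by 1g.
        System.out.println(resize(2048, 1024, 2048));
    }
}
```

This reproduces the example from the notes: a 2g heap with a 1g young gen becomes a 3g heap with a 2g young gen, and the old gen stays at 1g.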
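10. Next step: decide on the GC type
Once enough data has been collected:
If young GC and old GC times and frequencies all meet the targets, stay with ParallelOldGC; tuning is done.
If young GC times are fine but full GCs are too long or too frequent, move to CMS.
If young GC times are too long, tune at the application level.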
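The three cases above form a small decision table, which can be written down as code. This is only a sketch of the rule of thumb in these notes; the enum and method names are hypothetical, and the two booleans stand for "meets the latency targets" as judged from the measurements.

```java
class CollectorChoice {
    enum Decision { KEEP_PARALLEL_OLD, MOVE_TO_CMS, TUNE_APPLICATION }

    /**
     * youngGcOk: young GC times meet the targets.
     * fullGcOk:  full GC times and frequency meet the targets.
     */
    static Decision decide(boolean youngGcOk, boolean fullGcOk) {
        if (!youngGcOk) {
            // Slow young GCs are not fixed by switching collectors.
            return Decision.TUNE_APPLICATION;
        }
        return fullGcOk ? Decision.KEEP_PARALLEL_OLD : Decision.MOVE_TO_CMS;
    }

    public static void main(String[] args) {
        System.out.println(decide(true, true));
        System.out.println(decide(true, false));
        System.out.println(decide(false, true));
    }
}
```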
11. ParallelOldGC tuning
Throughput
UseAdaptiveSizePolicy
For higher throughput, increase the young and old gen sizes
UseNUMA
Non-Uniform Memory Access
Applicable to most SPARC, Opteron, more recently Intel platforms
-XX:+UseNUMA
Splits the young generation into partitions
Each partition “belongs” to a CPU
Allocates new objects into the partition that belongs to the allocating CPU
Big win for some applications
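12. CMS tuning
GC pause times
GC frequency
Throughput
CMS does not compact; by default, compaction happens only during a full GC
Moving to CMS requires 20-30% more memory (because of fragmentation and longer collection cycles)
Longer young GC times (allocation in the old gen is slower)
Better worst-case latencies
Lower throughput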
Tenuring Threshold
Higher tenuring threshold → promotes fewer objects
Possibly (but not necessarily) longer young GC times
Increases the number of objects reclaimed in the young gen
Better overall efficiency
Lower tenuring threshold → promotes more objects
Possibly (but not necessarily) shorter young GC times
More load / pressure on the old gen
More frequent old GCs
Could make fragmentation more severe
Essential in CMS to minimize this as much as possible
Survivor Size Tuning
Survivors should not overflow
Tune TargetSurvivorRatio
Check via the GC log or the tenuring distribution
If survivors overflow
Increase survivor size using -XX:SurvivorRatio=<ratio>
survivor size = (100 / (<ratio> + 2))% of young gen size
larger <ratio> → smaller survivors
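Or decrease MTT (MaxTenuringThreshold)
Desired survivor size (DSS) = survivor size * TargetSurvivorRatio (a percentage)
If the surviving objects that need to be copied to the survivor space exceed DSS, the tenuring threshold (TT) is lowered (survivor overflow)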
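The survivor formulas above can be worked through with concrete numbers. The sketch below uses SurvivorRatio=8 and TargetSurvivorRatio=50, which are the usual HotSpot defaults, with a 1000 MB young gen chosen only to keep the arithmetic round. The class and method names are hypothetical.

```java
class SurvivorSizing {
    /** Size of one survivor space: young gen / (SurvivorRatio + 2). */
    static long survivorMb(long youngMb, int survivorRatio) {
        return youngMb / (survivorRatio + 2);
    }

    /** Desired survivor size: survivor size * TargetSurvivorRatio percent. */
    static long desiredSurvivorMb(long survivorMb, int targetSurvivorRatio) {
        return survivorMb * targetSurvivorRatio / 100;
    }

    public static void main(String[] args) {
        long survivor = survivorMb(1000, 8);           // 1000 / 10 = 100 MB
        long desired = desiredSurvivorMb(survivor, 50); // 50% of 100 = 50 MB
        System.out.println("survivor size (MB): " + survivor);
        System.out.println("desired survivor size (MB): " + desired);
        // A smaller ratio gives larger survivors: 1000 / (6 + 2) = 125 MB.
        System.out.println("with SurvivorRatio=6 (MB): " + survivorMb(1000, 6));
    }
}
```

In this example, more than 50 MB of surviving objects per young GC would exceed DSS and cause the tenuring threshold to be lowered.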
When to start a CMS cycle:
CMSInitiatingOccupancyFraction: set it so that the initiating occupancy is at least 1.5x LDS
Cycle starts too early → unnecessary overhead
Cycle starts too late → will not finish on time
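Also depends on the old gen growth rate: the faster it grows, the earlier the cycle should start
If cycles are too frequent: increase the old gen size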
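One way to read the "1.5x LDS at least" rule is as a lower bound on the occupancy percentage: the threshold, applied to the old gen size, should not fall below 1.5x LDS. The sketch below computes that bound under this interpretation, which is an assumption rather than something the notes spell out. The names are hypothetical and sizes are in MB.

```java
class CmsInitiatingOccupancy {
    /**
     * Smallest occupancy percentage at which the old gen holds
     * at least 1.5x LDS, rounded up to a whole percent.
     */
    static int minFractionPercent(long ldsMb, long oldGenMb) {
        return (int) Math.ceil(150.0 * ldsMb / oldGenMb);
    }

    public static void main(String[] args) {
        int pct = minFractionPercent(512, 1280); // 768 / 1280 = 60%
        if (pct > 100) {
            System.out.println("old gen is smaller than 1.5x LDS; increase it first");
        } else {
            System.out.println("-XX:CMSInitiatingOccupancyFraction=" + pct + " or higher");
        }
    }
}
```

A result above 100 means the old gen itself is smaller than 1.5x LDS. How far above the bound to go depends on the old gen growth rate: a fast-growing old gen needs more headroom so the cycle finishes on time.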
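Pause time tuning:
Initial mark: hard to tune
Remark
Depends on how much the object graph changes during the concurrent cycle
CMSScavengeBeforeRemark: triggers a young GC first
ParallelRefProcEnabled: useful when there are many Reference / finalizable objects to process
If full GCs occur
Did a concurrent mode failure occur?
Is the perm gen full?
Explicit System.gc() calls? Use -XX:+ExplicitGCInvokesConcurrent
If young GCs are already slow with ParallelOldGC, CMS is very unlikely to do better
13. Some unusual configurations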
Use this material as a guide, not as hard rules
Don't be afraid to experiment
We have seen the following in the field
Young gen size = 80% of heap size
Maximize throughput by minimizing young GC frequency
Old gen size = 1.2x LDS
Old gen with extremely low growth rate
Initiating occupancy threshold = 95%
Ditto
14. Before you start
You'll need to have some basic knowledge of
The application's behavior
The application's important requirements
The context in which it will be run