Optimization Strategies: Power Optimization

Power_Opt_Design

1 Available Logic Optimizations

opt_design [-retarget] [-propconst] [-sweep] [-bram_power_opt] [-remap]
           [-resynth_area] [-resynth_seq_area] [-directive <arg>] [-muxf_remap]
           [-hier_fanout_limit <arg>] [-bufg_opt] [-control_set_merge] [-quiet] [-verbose]

1.1 Retargeting (Default)

Retargeting replaces one cell type with another to ease optimization. For example, a MUXF7 replaced by a LUT3 can be combined with other LUTs. In addition, simple cells such as inverters are absorbed into downstream logic.

1.2 Constant Propagation (Default)

Constant Propagation propagates constant values through logic, which results in:

• Eliminated logic:

For example, an AND with a constant 0 input always outputs 0, so the gate can be removed.

• Reduced logic:

For example, a 3-input AND with a constant 1 input is reduced to a 2-input AND.

• Redundant logic:

For example, a 2-input OR with a logic 0 input is reduced to a wire.

1.3 Sweep (Default)

Sweep removes cells that have no loads.

1.4 Block RAM Power Optimization (Default)

Block RAM Power Optimization enables power optimization on block RAM cells, including:

• Changing the WRITE_MODE on unread ports of true dual-port RAMs to NO_CHANGE.

• Applying intelligent clock gating to block RAM outputs.
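As a rough sketch of how the default optimizations above map onto the command line: this assumes, as the Vivado documentation describes, that naming any optimization option makes opt_design run only the named optimizations. The checkpoint file name is hypothetical.

```tcl
# Hypothetical checkpoint name; substitute your own post-synthesis design.
open_checkpoint post_synth.dcp

# Option A: no options, so the default optimizations listed above all run
# (retarget, constant propagation, sweep, block RAM power optimization).
opt_design

# Option B (use instead of Option A): name the logic optimizations explicitly
# and leave out -bram_power_opt, so block RAM cells are not power-optimized.
# opt_design -retarget -propconst -sweep
```

Option B is one way to check whether the block RAM changes (WRITE_MODE, clock gating) are responsible for a behavioral or timing difference.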

1.5 Remap

Remap combines multiple LUTs into a single LUT to reduce the depth of the logic.

1.6 Resynth Area

Resynth Area performs re-synthesis in area mode to reduce the number of LUTs.

1.7 Mux Optimization

Mux Optimization remaps MUXF7, MUXF8, and MUXF9 primitives to LUT3 to improve routability.
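The non-default optimizations in sections 1.5 to 1.7 have to be requested explicitly. The following is a sketch only; it assumes that explicitly named options replace the default set, so the defaults are repeated alongside the extra option. The report file name is hypothetical.

```tcl
# Keep the default optimizations and add LUT remapping to reduce logic depth.
opt_design -retarget -propconst -sweep -bram_power_opt -remap

# Variants (run one of these instead of the line above):
#   area-driven re-synthesis to reduce LUT count
# opt_design -retarget -propconst -sweep -bram_power_opt -resynth_area
#   remap MUXF7/MUXF8/MUXF9 to LUT3 for routability
# opt_design -retarget -propconst -sweep -bram_power_opt -muxf_remap

# Write a utilization report so LUT counts can be compared between variants.
report_utilization -file util_post_opt.rpt
```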

1.8 Control Set Merging

Control Set Merging reduces the drivers of logically equivalent control signals to a single driver. This is the reverse of fanout replication, and it results in nets that are better suited for module-based replication.
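One way to see the effect of control set merging is to report the control sets before and after the pass. This is a sketch with hypothetical report file names; note that naming only -control_set_merge is assumed to skip the default optimizations.

```tcl
# Snapshot of control sets before merging (file name is illustrative).
report_control_sets -file control_sets_before.rpt

# Merge logically equivalent control-signal drivers into a single driver.
opt_design -control_set_merge

# Snapshot after merging, for comparison with the first report.
report_control_sets -file control_sets_after.rpt
```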

1.9 Global Clock Buffer Insertion

Logic optimization conservatively inserts global clock buffers on clock nets and high-fanout non-clock nets such as device-wide resets.

For 7 series designs, clock buffers are inserted as long as 12 total global clock buffers are not exceeded.

For UltraScale designs, there is no limit for clock buffers inserted on clock nets.

For non-clock nets:

• Global clock buffers are only inserted as long as 24 total clock buffers are not exceeded, not including BUFG_GT buffers.

• The fanout must be above 25,000.

For fabric-driven clock nets, the fanout must be 30 or greater.
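To check which nets would qualify under the thresholds above, and to keep a particular net off the global clock network, something like the following can be used. This is a sketch under assumptions: the net name sys_rst is hypothetical, and excluding a net through the CLOCK_BUFFER_TYPE property with value NONE is the mechanism I understand the Vivado documentation to describe, so verify it for your tool version.

```tcl
# List non-clock nets whose fanout exceeds the 25,000 threshold quoted above.
report_high_fanout_nets -fanout_greater_than 25000 -max_nets 20

# Hypothetical net name: ask the tool not to insert a clock buffer on this net.
set_property CLOCK_BUFFER_TYPE NONE [get_nets sys_rst]

# Run only the global clock buffer insertion pass.
opt_design -bufg_opt
```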

1.10 Module-Based Fanout Optimization

Net drivers with fanout greater than the specified limit, provided as an argument with this option, will be replicated according to the logical hierarchy.

For each hierarchical instance driven by the high-fanout net, if the fanout within the hierarchy is greater than the specified limit, then the net within the hierarchy is driven by a replica of the driver of the high-fanout net.
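A minimal invocation of module-based fanout optimization looks like the sketch below. The limit of 512 is only an illustrative value, not a recommendation from the text above.

```tcl
# Replicate drivers per hierarchical instance wherever the fanout inside
# that instance exceeds 512 loads (illustrative limit; tune per design).
opt_design -hier_fanout_limit 512
```

Because replication follows the logical hierarchy, each replica drives loads that belong to the same module.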

2 Using Directives

2.1 Explore

Runs multiple passes of optimization.

2.2 ExploreArea

Runs multiple passes of optimization with emphasis on reducing combinational logic.

2.3 AddRemap

Runs the default logic optimization flow and includes LUT remapping to reduce logic levels.

2.4 ExploreSequentialArea

Runs multiple passes of optimization with emphasis on reducing registers and related combinational logic.

2.5 RuntimeOptimized

Runs minimal passes of optimization, trading design performance for faster run time.

2.6 NoBramPowerOpt

Runs all the default opt_design optimizations except Block RAM Power Optimization.

2.7 ExploreWithRemap

Same as the Explore directive but includes the Remap optimization.
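A directive is selected with the -directive option. The sketch below shows both a non-project and a project-mode form; the run name impl_1 is an assumption (it is the usual default but may differ in your project). As far as I know, only one directive can be given at a time and it cannot be combined with the individual optimization options.

```tcl
# Non-project flow: run opt_design with a single directive.
opt_design -directive ExploreWithRemap

# Project flow: set the directive on the implementation run instead.
# "impl_1" is an assumed run name.
set_property STEPS.OPT_DESIGN.ARGS.DIRECTIVE ExploreWithRemap [get_runs impl_1]
```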
