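Previously I studied Floodlight's link discovery module:
Today I am moving on to Floodlight's topology management module. Whenever the network changes (for example, a switch is added or a switch port changes), this module automatically recomputes the network topology and builds the corresponding topology structure. The data it works from comes from the link discovery module, which collects it using the LLDP and BDDP protocols.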
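First, here is the Mininet topology used for this experiment:
Topology notes: one controller, four OpenFlow (OF) switches, and one non-OF switch, configured as follows:
The DPIDs of S1/S2/S4/S5 are, respectively: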
00:00:00:00:00:00:00:01,00:00:00:00:00:00:00:02,00:00:00:00:00:00:00:04,00:00:00:00:00:00:00:05
Let's go straight to the code and explain concepts as they come up. First, the startUp function of TopologyManager:
@Override
public void startUp(FloodlightModuleContext context) {
    clearCurrentTopology();
    // Initialize role to floodlight provider role.
    this.role = floodlightProviderService.getRole();
    ScheduledExecutorService ses = threadPoolService.getScheduledExecutor();
    newInstanceTask = new SingletonTask(ses, new UpdateTopologyWorker());
    if (role != HARole.STANDBY) {
        newInstanceTask.reschedule(TOPOLOGY_COMPUTE_INTERVAL_MS, TimeUnit.MILLISECONDS);
    }
    linkDiscoveryService.addListener(this);
    floodlightProviderService.addOFMessageListener(OFType.PACKET_IN, this);
    floodlightProviderService.addHAListener(this.haListener);
    addRestletRoutable();
}
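As you can see, once the module starts it creates a task that runs UpdateTopologyWorker and, unless the controller is in the STANDBY role, schedules it. Let's follow that task: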
protected class UpdateTopologyWorker implements Runnable {
    @Override
    public void run() {
        try {
            if (ldUpdates.peek() != null) { /* must check here, otherwise will run every interval */
                updateTopology("link-discovery-updates", false);
            }
            handleMiscellaneousPeriodicEvents();
        }
        catch (Exception e) {
            log.error("Error in topology instance task thread", e);
        } finally {
            if (floodlightProviderService.getRole() != HARole.STANDBY) {
                newInstanceTask.reschedule(TOPOLOGY_COMPUTE_INTERVAL_MS, TimeUnit.MILLISECONDS);
            }
        }
    }
}
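The worker above follows a common pattern: check a queue, do the expensive work only when something is pending, and re-arm itself in a finally block. Below is a minimal sketch of that pattern using only the JDK. It is not Floodlight's code: PeriodicWorkerSketch, updates, recomputations, and INTERVAL_MS are hypothetical names I made up, and a plain ScheduledExecutorService stands in for Floodlight's SingletonTask.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PeriodicWorkerSketch implements Runnable {
    // Hypothetical stand-in for TOPOLOGY_COMPUTE_INTERVAL_MS.
    public static final long INTERVAL_MS = 100;

    // Hypothetical stand-in for the ldUpdates queue.
    public final BlockingQueue<String> updates = new LinkedBlockingQueue<>();
    public final AtomicInteger recomputations = new AtomicInteger();
    private final ScheduledExecutorService ses;

    public PeriodicWorkerSketch(ScheduledExecutorService ses) {
        this.ses = ses;
    }

    /** One pass: recompute only if at least one update is pending. */
    public boolean runOnce() {
        if (updates.peek() == null) {
            return false; // nothing changed, skip the expensive recomputation
        }
        List<String> batch = new ArrayList<>();
        updates.drainTo(batch);           // consume every pending event in one go
        recomputations.incrementAndGet(); // placeholder for the real topology computation
        return true;
    }

    @Override
    public void run() {
        try {
            runOnce();
        } catch (Exception e) {
            System.err.println("Error in worker: " + e);
        } finally {
            try {
                // Re-arm the task so that it runs again after the interval.
                ses.schedule(this, INTERVAL_MS, TimeUnit.MILLISECONDS);
            } catch (RejectedExecutionException ignored) {
                // The executor was shut down; stop rescheduling.
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor();
        PeriodicWorkerSketch worker = new PeriodicWorkerSketch(ses);
        worker.updates.add("LINK_UPDATED");
        ses.schedule(worker, INTERVAL_MS, TimeUnit.MILLISECONDS);
        Thread.sleep(5 * INTERVAL_MS);
        ses.shutdownNow();
        System.out.println("recomputations = " + worker.recomputations.get());
    }
}
```

The peek check is what keeps an idle network cheap: the task still wakes up every interval, but it only recomputes when an event has actually arrived.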
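The task checks the ldUpdates queue for topology change events. If any are pending, it calls updateTopology to recompute the topology. Following the code further, we find that when the topology changes Floodlight creates a new instance of the TopologyInstance class and calls its compute method to do the actual computation: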
protected boolean createNewInstance(String reason, boolean forced) {
    Set<NodePortTuple> blockedPorts = new HashSet<NodePortTuple>();
    if (!linksUpdated && !forced) {
        return false;
    }
    Map<NodePortTuple, Set<Link>> openflowLinks;
    openflowLinks =
            new HashMap<NodePortTuple, Set<Link>>();
    Set<NodePortTuple> nptList = switchPortLinks.keySet();
    if (nptList != null) {
        for (NodePortTuple npt : nptList) {
            Set<Link> linkSet = switchPortLinks.get(npt);
            if (linkSet == null) continue;
            openflowLinks.put(npt, new HashSet<Link>(linkSet));
        }
    }
    // Identify all broadcast domain ports.
    // Mark any port that has inconsistent set of links
    // as broadcast domain ports as well.
    Set<NodePortTuple> broadcastDomainPorts =
            identifyBroadcastDomainPorts();
    // Remove all links incident on broadcast domain ports.
    for (NodePortTuple npt : broadcastDomainPorts) {
        if (switchPortLinks.get(npt) == null) continue;
        for (Link link : switchPortLinks.get(npt)) {
            removeLinkFromStructure(openflowLinks, link);
        }
    }
    // Remove all tunnel links.
    for (NodePortTuple npt : tunnelPorts) {
        if (switchPortLinks.get(npt) == null) continue;
        for (Link link : switchPortLinks.get(npt)) {
            removeLinkFromStructure(openflowLinks, link);
        }
    }
    // switchPorts contains only ports that are part of links. Calculation of broadcast ports needs set of all ports.
    Map<DatapathId, Set<OFPort>> allPorts = new HashMap<DatapathId, Set<OFPort>>();
    for (DatapathId sw : switchPorts.keySet()) {
        allPorts.put(sw, this.getPorts(sw));
    }
    TopologyInstance nt = new TopologyInstance(switchPorts,
            blockedPorts,
            openflowLinks,
            broadcastDomainPorts,
            tunnelPorts,
            switchPortLinks,
            allPorts,
            interClusterLinks);
    nt.compute();
    currentInstance = nt;
    return true;
}
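The filtering part of createNewInstance is easier to see in isolation: copy the port-to-links map, then strip every link that touches an excluded port. Here is a small sketch of that idea. Everything in it is an assumption for illustration: ports are plain strings such as "s2:2" (the port numbers are made up), the Link class is a simplified stand-in for Floodlight's Link, and removeLink only approximates what removeLinkFromStructure might do (remove the link from both endpoints and drop entries that become empty).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

public class LinkFilterSketch {
    /** Simplified directed link between two switch ports, e.g. "s1:1" -> "s2:1". */
    public static final class Link {
        public final String src, dst;
        public Link(String src, String dst) { this.src = src; this.dst = dst; }
        @Override public boolean equals(Object o) {
            return o instanceof Link && ((Link) o).src.equals(src) && ((Link) o).dst.equals(dst);
        }
        @Override public int hashCode() { return Objects.hash(src, dst); }
        @Override public String toString() { return src + "->" + dst; }
    }

    /** Remove a link from the entries of both of its endpoints; drop entries left empty. */
    public static void removeLink(Map<String, Set<Link>> links, Link link) {
        for (String port : new String[] { link.src, link.dst }) {
            Set<Link> set = links.get(port);
            if (set == null) continue;
            set.remove(link);
            if (set.isEmpty()) links.remove(port);
        }
    }

    /** Copy the port-to-links map, then remove every link incident on an excluded port. */
    public static Map<String, Set<Link>> openflowLinks(Map<String, Set<Link>> switchPortLinks,
                                                       Set<String> excludedPorts) {
        Map<String, Set<Link>> result = new HashMap<>();
        for (Map.Entry<String, Set<Link>> e : switchPortLinks.entrySet()) {
            result.put(e.getKey(), new HashSet<>(e.getValue())); // copy, so the input stays intact
        }
        for (String port : excludedPorts) {
            Set<Link> incident = switchPortLinks.get(port);
            if (incident == null) continue;
            for (Link link : incident) removeLink(result, link);
        }
        return result;
    }

    /** Register a directed link under both of its endpoint ports. */
    public static void addLink(Map<String, Set<Link>> m, String src, String dst) {
        Link l = new Link(src, dst);
        m.computeIfAbsent(src, k -> new HashSet<>()).add(l);
        m.computeIfAbsent(dst, k -> new HashSet<>()).add(l);
    }

    public static void main(String[] args) {
        Map<String, Set<Link>> all = new HashMap<>();
        addLink(all, "s1:1", "s2:1");
        addLink(all, "s2:1", "s1:1");
        addLink(all, "s2:2", "s4:1"); // seen across the non-OF switch s3
        addLink(all, "s4:1", "s2:2");
        Set<String> broadcastDomainPorts = new HashSet<>(Arrays.asList("s2:2", "s4:1"));
        System.out.println(openflowLinks(all, broadcastDomainPorts).keySet());
    }
}
```

In this sketch only the s1/s2 ports survive the filter, while the original map is left untouched, which mirrors how createNewInstance keeps switchPortLinks and hands a filtered copy to TopologyInstance.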
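The code first calls identifyBroadcastDomainPorts(), whose purpose is to single out the port links that are not pure OpenFlow links. Looking back at the link discovery module, Floodlight marks ports whose links pass through non-OF devices as broadcast domain ports. The code then removes the links incident on broadcast domain ports and tunnel ports from openflowLinks, and finally creates a TopologyInstance and calls compute on it. Now comes the key part: how the topology is actually computed: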
protected void compute() {
    /*
     * Step 1: Compute clusters ignoring ports with > 2 links and
     * blocked links.
     */
    identifyClusters();
    /*
     * Step 2: Associate non-blocked links within clusters to the cluster
     * in which they reside. The remaining links are inter-cluster links.
     */
    identifyIntraClusterLinks();
    /*
     * Step 3: Compute the archipelagos. (Def: group of connected clusters)
     * Each archipelago will have its own broadcast tree, chosen by running
     * dijkstra's algorithm from the archipelago ID switch (lowest switch
     * DPID). We need a broadcast tree per archipelago since each
     * archipelago is by definition isolated from all other archipelagos.
     */
    identifyArchipelagos();
    /*
     * Step 4: Use Yens algorithm to permute through each node combination
     * within each archipelago and compute multiple paths. The shortest
     * path located (i.e. first run of dijkstra's algorithm) will be used
     * as the broadcast tree for the archipelago.
     */
    computeOrderedPaths();
    /*
     * Step 5: Determine the broadcast ports for each archipelago. These are
     * the ports that reside on the broadcast tree computed and saved when
     * performing path-finding. These are saved into multiple data structures
     * to aid in quick lookup per archipelago, per-switch, and topology-global.
     */
    computeBroadcastPortsPerArchipelago();
    /*
     * Step 6: Optionally, print topology to log for added verbosity or when debugging.
     */
    printTopology();
}
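As the code shows, the topology computation has six steps:
Step 1: compute the clusters.
Step 2: assign links to the cluster they belong to; the links left over are the inter-cluster links.
Step 3: compute the archipelagos.
Step 4: compute the K shortest paths between each pair of nodes.
Step 5: determine the broadcast ports (the broadcast tree) of each archipelago.
Step 6: print the topology information.
The important steps are explained in more detail below: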
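1. Step 1: compute the clusters
A cluster is a concept introduced by Floodlight, somewhat like the clusters we meet elsewhere. As I understand it, a cluster is a group of interconnected OF switches. In the topology above, s1 and s2 form one cluster and s4 and s5 form another. Because the non-OF switch s3 sits between s2 and s4, s2 and s4 are not in the same cluster.
So how does Floodlight compute the clusters? It uses Tarjan's algorithm, which finds the strongly connected components of a directed graph. Each OF switch is modeled as a node and each link as an edge. Note that links in Floodlight are directed: a connection between two OF switches is represented by two Link objects, one in each direction.
Here is the rough idea of Tarjan's algorithm: start a depth-first traversal from some node. If the traversal reaches a node that has already been visited and is still on the current traversal stack, the graph contains a cycle and therefore a strongly connected component; that component is marked, and the traversal continues. For an introduction to the algorithm itself, see: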
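To make the traversal described above concrete, here is a minimal sketch of the textbook form of Tarjan's algorithm. It is not Floodlight's implementation: TarjanSketch and clusters are hypothetical names, switches are plain strings, and the graph is simply an adjacency map with one entry per directed link.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class TarjanSketch {
    private final Map<String, List<String>> graph;
    private final Map<String, Integer> index = new HashMap<>();   // DFS discovery order
    private final Map<String, Integer> lowlink = new HashMap<>(); // lowest index reachable
    private final Deque<String> stack = new ArrayDeque<>();
    private final Set<String> onStack = new HashSet<>();
    private final List<Set<String>> components = new ArrayList<>();
    private int counter = 0;

    private TarjanSketch(Map<String, List<String>> graph) {
        this.graph = graph;
    }

    /** Returns the strongly connected components of a directed graph. */
    public static List<Set<String>> clusters(Map<String, List<String>> graph) {
        TarjanSketch t = new TarjanSketch(graph);
        for (String node : graph.keySet()) {
            if (!t.index.containsKey(node)) t.visit(node);
        }
        return t.components;
    }

    private void visit(String v) {
        index.put(v, counter);
        lowlink.put(v, counter);
        counter++;
        stack.push(v);
        onStack.add(v);
        for (String w : graph.getOrDefault(v, Collections.emptyList())) {
            if (!index.containsKey(w)) {
                visit(w); // tree edge: recurse, then inherit the child's lowlink
                lowlink.put(v, Math.min(lowlink.get(v), lowlink.get(w)));
            } else if (onStack.contains(w)) {
                // back edge to a node on the current stack: a cycle was found
                lowlink.put(v, Math.min(lowlink.get(v), index.get(w)));
            }
        }
        if (lowlink.get(v).equals(index.get(v))) {
            // v is the root of a component: pop everything above it, including v
            Set<String> component = new TreeSet<>();
            String w;
            do {
                w = stack.pop();
                onStack.remove(w);
                component.add(w);
            } while (!w.equals(v));
            components.add(component);
        }
    }

    public static void main(String[] args) {
        // The OF links of the topology above: s1<->s2 and s4<->s5, each as two directed links.
        Map<String, List<String>> g = new LinkedHashMap<>();
        g.put("s1", Arrays.asList("s2"));
        g.put("s2", Arrays.asList("s1"));
        g.put("s4", Arrays.asList("s5"));
        g.put("s5", Arrays.asList("s4"));
        System.out.println(clusters(g)); // prints [[s1, s2], [s4, s5]]
    }
}
```

Because each switch-to-switch connection appears as two directed links, any two directly connected OF switches always end up in the same component, which matches the two clusters described for this topology.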
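2. Steps 2 and 3 sort the links into intra-cluster and inter-cluster links and identify the archipelagos. According to the comment in compute(), an archipelago is a group of connected clusters, and each archipelago is isolated from all the others.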
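3. Step 4: compute the K shortest paths between each pair of nodes
This step uses Yen's algorithm to compute the K shortest paths between nodes. According to the code, K defaults to 3.
Dijkstra's algorithm is the well-known way to find a single shortest path, and Yen's algorithm builds on it. Roughly: first use Dijkstra to find the shortest path from one node to another. Then, for each node along that path, remove the path's next edge and compute a detour from that node to the destination, which yields a set of candidate paths. The best candidate becomes the next-shortest path, and the process repeats. For an implementation of the algorithm, see: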
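To make the idea above more tangible, here is a compact sketch of Yen's algorithm. It is not Floodlight's implementation, and it rests on a few assumptions: every link costs one hop, so a breadth-first search stands in for Dijkstra; switches are plain strings; and YenSketch, shortestPath, and kShortestPaths are hypothetical names. The diamond-shaped graph in main is made up purely for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class YenSketch {
    /** Fewest-hop path (BFS) that avoids the given nodes and directed edges, or null. */
    public static List<String> shortestPath(Map<String, List<String>> g, String src, String dst,
                                            Set<String> removedNodes, Set<String> removedEdges) {
        Map<String, String> parent = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        parent.put(src, null);
        queue.add(src);
        while (!queue.isEmpty()) {
            String u = queue.poll();
            if (u.equals(dst)) {
                LinkedList<String> path = new LinkedList<>();
                for (String n = dst; n != null; n = parent.get(n)) path.addFirst(n);
                return path;
            }
            for (String v : g.getOrDefault(u, Collections.emptyList())) {
                if (parent.containsKey(v) || removedNodes.contains(v)
                        || removedEdges.contains(u + ">" + v)) continue;
                parent.put(v, u);
                queue.add(v);
            }
        }
        return null;
    }

    /** Up to k loop-free paths from src to dst, ordered by hop count. */
    public static List<List<String>> kShortestPaths(Map<String, List<String>> g,
                                                    String src, String dst, int k) {
        List<List<String>> found = new ArrayList<>();
        List<String> first = shortestPath(g, src, dst,
                Collections.<String>emptySet(), Collections.<String>emptySet());
        if (first == null) return found;
        found.add(first);
        List<List<String>> candidates = new ArrayList<>(); // kept across iterations
        while (found.size() < k) {
            List<String> last = found.get(found.size() - 1);
            for (int i = 0; i < last.size() - 1; i++) {
                String spurNode = last.get(i);
                List<String> rootPath = last.subList(0, i + 1);
                // Block the next edge of every known path that shares this root.
                Set<String> removedEdges = new HashSet<>();
                for (List<String> p : found) {
                    if (p.size() > i + 1 && p.subList(0, i + 1).equals(rootPath)) {
                        removedEdges.add(p.get(i) + ">" + p.get(i + 1));
                    }
                }
                // Block the root nodes (except the spur node) to keep paths loop-free.
                Set<String> removedNodes = new HashSet<>(rootPath.subList(0, i));
                List<String> spurPath = shortestPath(g, spurNode, dst, removedNodes, removedEdges);
                if (spurPath == null) continue;
                List<String> total = new ArrayList<>(rootPath.subList(0, i));
                total.addAll(spurPath);
                if (!found.contains(total) && !candidates.contains(total)) candidates.add(total);
            }
            if (candidates.isEmpty()) break;
            candidates.sort((a, b) -> Integer.compare(a.size(), b.size()));
            found.add(candidates.remove(0));
        }
        return found;
    }

    public static void main(String[] args) {
        Map<String, List<String>> g = new HashMap<>();
        g.put("a", Arrays.asList("b", "c"));
        g.put("b", Arrays.asList("a", "d", "c"));
        g.put("c", Arrays.asList("a", "d", "b"));
        g.put("d", Arrays.asList("b", "c"));
        // K = 3, the default mentioned above.
        System.out.println(kShortestPaths(g, "a", "d", 3)); // prints [[a, b, d], [a, c, d], [a, b, c, d]]
    }
}
```

The first path found is the plain shortest path; every later path is the best remaining candidate produced by forcing a detour at some node of an earlier path.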
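Once these steps are complete, the topology management module has the network topology information, which can then be used for UI display, route selection, packet broadcasting, and so on.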