Hadoop 2.0 online upgrade without stopping the Hadoop cluster

Original

技多不压身

2020-06-19 11:16

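Introduction

HDFS rolling upgrade allows individual HDFS daemons to be upgraded one at a time. For example, the DataNodes can be upgraded independently of the NameNodes, and vice versa.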

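Upgrade

In Hadoop 2.0, HDFS supports highly available (HA) NameNode services and wire compatibility. Together these two features make it possible to upgrade an HDFS cluster without shutting down the HDFS service. Only a cluster configured with HA can be upgraded without downtime.
If the new version introduces a feature that cannot be used with the old version, follow these steps:

1. Disable the new feature.
2. Upgrade the cluster.
3. Enable the new feature.

Note: rolling upgrade is supported only from Hadoop 2.4.0 onward.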

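An HA cluster has two or more NameNodes (NNs) plus multiple DataNodes (DNs), JournalNodes (JNs) and ZooKeeper nodes (ZKNs). JNs are relatively stable and in most cases do not need to be upgraded when HDFS is upgraded, so the rolling upgrade described here covers only NNs and DNs.

Upgrading a non-federated cluster

Suppose there are two NameNodes, NN1 and NN2, in the active and standby states respectively. The upgrade steps are as follows:

Prepare the rolling upgrade

Run
hdfs dfsadmin -rollingUpgrade prepare
to create an fsimage for rollback.
The corresponding NameNode code (an excerpt from FSNamesystem) is as follows: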

        RollingUpgradeInfo startRollingUpgrade() throws IOException {
          checkSuperuserPrivilege(); // only the HDFS superuser may start a rolling upgrade
          checkOperation(OperationCategory.WRITE);
          writeLock();
          try {
            checkOperation(OperationCategory.WRITE);
            long startTime = now();
            if (!haEnabled) { // for non-HA, we require NN to be in safemode
              startRollingUpgradeInternalForNonHA(startTime);
            } else { // for HA, NN cannot be in safemode
              checkNameNodeSafeMode("Failed to start rolling upgrade");
              startRollingUpgradeInternal(startTime);
            }
            getEditLog().logStartRollingUpgrade(rollingUpgradeInfo.getStartTime());
            if (haEnabled) {
              // roll the edit log to make sure the standby NameNode can tail
              getFSImage().rollEditLog();
            }
          } finally {
            writeUnlock();
          }

          getEditLog().logSync(); // sync the edit log so buffered edits are flushed to the JournalNodes
          if (auditLog.isInfoEnabled() && isExternalInvocation()) {
            logAuditEvent(true, "startRollingUpgrade", null, null, null);
          }
          return rollingUpgradeInfo;
        }

Run

hdfs dfsadmin -rollingUpgrade query

to check the status of the rollback image. Re-run the command until the message "Proceed with rolling upgrade" appears, which means the cluster is ready.

  RollingUpgradeInfo queryRollingUpgrade() throws IOException {
    checkSuperuserPrivilege();
    checkOperation(OperationCategory.READ);
    readLock();
    try {
      if (rollingUpgradeInfo != null) {
        boolean hasRollbackImage = this.getFSImage().hasRollbackFSImage(); // true once a rollback image exists
        rollingUpgradeInfo.setCreatedRollbackImages(hasRollbackImage);
      }
      return rollingUpgradeInfo;
    } finally {
      readUnlock();
    }
  }
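
The prepare and query steps above can be combined into a small polling loop. The following is only a sketch: HDFS_CMD and POLL_SECONDS are illustrative variables introduced here (not Hadoop settings), and the loop assumes the query output contains the "Proceed with rolling upgrade" text quoted above.

```shell
# Sketch: prepare a rolling upgrade, then wait for the rollback image.
# HDFS_CMD and POLL_SECONDS are hypothetical helper variables for this example.
HDFS_CMD="${HDFS_CMD:-hdfs}"
POLL_SECONDS="${POLL_SECONDS:-10}"

prepare_rolling_upgrade() {
  # Ask the NameNode to create the rollback fsimage.
  "$HDFS_CMD" dfsadmin -rollingUpgrade prepare || return 1
  # Re-run the query until the NameNode reports that it is ready.
  until "$HDFS_CMD" dfsadmin -rollingUpgrade query | grep -q "Proceed with rolling upgrade"; do
    echo "rollback image not ready yet, retrying in ${POLL_SECONDS}s"
    sleep "$POLL_SECONDS"
  done
  echo "rollback image created, safe to continue"
}
```

Calling prepare_rolling_upgrade as the HDFS superuser would block until the rollback image exists; a production script would probably also want a timeout.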

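Upgrade the active and standby NNs

1. Stop the NameNode service on NN2 and upgrade NN2 (for a tarball install, upgrading means switching directories: re-point the hadoop symlink at the new version's directory).
2. Start NN2 as standby with the hdfs namenode -rollingUpgrade started option.

Note: judging from the code (CDH 5.3.3), hdfs namenode -rollingUpgrade started behaves the same as plain hdfs namenode except for the log-handling part. When starting the NameNode this way, it is advisable to run it in the background with nohup and &.

3. Fail over so that NN2 becomes active and NN1 becomes standby (hdfs haadmin -failover nn1 nn2).
4. Stop the NameNode service on NN1: hadoop-daemon.sh stop namenode
5. Upgrade the Hadoop tarball on NN1.
6. Start NN1 as standby with the hdfs namenode -rollingUpgrade started option.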
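
The order of operations matters here, so it can help to print the plan before touching anything. The sketch below only prints commands and does not execute them; the function name, the bracketed host labels, the namenode.out log file and the /opt paths are all hypothetical, and it assumes a tarball install where a symlink points at the active release.

```shell
# Sketch: print the NameNode upgrade order for an HA pair. Nothing is executed.
# Arguments (all illustrative): active NN id, standby NN id,
# new release directory, symlink that points at the release in use.
nn_upgrade_plan() {
  active="$1"; standby="$2"; new_dir="$3"; link="$4"
  # The standby is upgraded first, then roles are swapped, then the old active.
  for host in "$standby" "$active"; do
    echo "[$host] hadoop-daemon.sh stop namenode"
    echo "[$host] ln -sfn $new_dir $link"
    echo "[$host] nohup hdfs namenode -rollingUpgrade started > namenode.out 2>&1 &"
    if [ "$host" = "$standby" ]; then
      echo "[any] hdfs haadmin -failover $active $standby"
    fi
  done
}

# Hypothetical paths; substitute the real release directory and symlink.
nn_upgrade_plan nn1 nn2 /opt/hadoop-new /opt/hadoop
```

Printing rather than executing keeps the sketch safe to run anywhere; each printed line still has to be run on the host named in brackets.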

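Upgrade the DataNodes

Choose a small subset of DataNodes (for example, all DataNodes in the same rack).

1. Run hdfs dfsadmin -shutdownDatanode <DATANODE_HOST:IPC_PORT> upgrade to shut down one of the chosen DataNodes.
2. Run hdfs dfsadmin -getDatanodeInfo <DATANODE_HOST:IPC_PORT> to check the status and wait for the DataNode to shut down.
3. Upgrade and restart the DataNode.
4. Repeat steps 1 to 3 on every DataNode until all of them are upgraded.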
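
Steps 1 and 2 for a single DataNode can be sketched as a helper that requests the shutdown and then waits. This is a sketch under one stated assumption: that hdfs dfsadmin -getDatanodeInfo exits with a non-zero status once the DataNode no longer answers. HDFS_CMD and POLL_SECONDS are illustrative variables, not Hadoop settings.

```shell
# Sketch: shut down one DataNode for upgrade and wait until it is really down.
# HDFS_CMD and POLL_SECONDS are hypothetical helper variables for this example.
HDFS_CMD="${HDFS_CMD:-hdfs}"
POLL_SECONDS="${POLL_SECONDS:-5}"

shutdown_datanode_for_upgrade() {
  dn="$1"   # DATANODE_HOST:IPC_PORT
  "$HDFS_CMD" dfsadmin -shutdownDatanode "$dn" upgrade || return 1
  # Assumption: getDatanodeInfo keeps succeeding while the DataNode is still up.
  while "$HDFS_CMD" dfsadmin -getDatanodeInfo "$dn" >/dev/null 2>&1; do
    sleep "$POLL_SECONDS"
  done
  echo "$dn is down, upgrade and restart it now"
}
```

The helper would be called once per DataNode in the chosen subset, with the upgrade and restart (step 3) done on that host before moving on.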

Finalize the rolling upgrade

On the master (NameNode) host, run
hdfs dfsadmin -rollingUpgrade finalize

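Notes

It is best to keep JNs, NNs and DNs separate, so that they do not run on the same machines.
Back up all edit logs and fsimages under the NameNode metadata directory (namenode.dir; the property is dfs.namenode.name.dir in stock Hadoop 2) in case something goes wrong.
For rollback or downgrade, refer to the following procedure: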

https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html#namenode_-rollingUpgrade
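
For the backup note above, a minimal sketch of archiving the NameNode metadata directory with tar follows. The function name and the archive naming scheme are made up for this example, and both directory arguments are placeholders. Copying the directory while the NameNode is writing to it may capture an in-progress edit segment, so treat this as a convenience copy rather than a guaranteed consistent snapshot.

```shell
# Sketch: archive the NameNode metadata directory (fsimage and edit log files).
# Both arguments are placeholders: the directory to back up and where to put the archive.
backup_namenode_dir() {
  name_dir="$1"; dest_dir="$2"
  stamp=$(date +%Y%m%d-%H%M%S)
  archive="$dest_dir/namenode-meta-$stamp.tar.gz"
  # -C keeps paths inside the archive relative to the parent directory.
  tar -czf "$archive" -C "$(dirname "$name_dir")" "$(basename "$name_dir")" || return 1
  echo "$archive"
}
```

The function prints the path of the archive it created, which makes it easy to copy the result to another machine afterwards.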

Downgrade without Downtime

In an HA cluster, when a rolling upgrade from an old software release to a new software release is in progress, it is possible to downgrade, in a rolling fashion, the upgraded machines back to the old software release. Same as before, suppose NN1 and NN2 are respectively in active and standby states. Below are the steps for rolling downgrade:

Downgrade DNs

1. Choose a small subset of datanodes (e.g. all datanodes under a particular rack).
2. Run “hdfs dfsadmin -shutdownDatanode <DATANODE_HOST:IPC_PORT> upgrade” to shut down one of the chosen datanodes.
3. Run “hdfs dfsadmin -getDatanodeInfo <DATANODE_HOST:IPC_PORT>” to check and wait for the datanode to shut down.
4. Downgrade and restart the datanode.
5. Perform the above steps for all the chosen datanodes in the subset in parallel.
6. Repeat the above steps until all upgraded datanodes in the cluster are downgraded.

Downgrade Active and Standby NNs

1. Shut down and downgrade NN2.
2. Start NN2 as standby normally. (Note that it is incorrect to use the “-rollingUpgrade downgrade” option here.)
3. Fail over from NN1 to NN2 so that NN2 becomes active and NN1 becomes standby.
4. Shut down and downgrade NN1.
5. Start NN1 as standby normally. (Note that it is incorrect to use the “-rollingUpgrade downgrade” option here.)

Finalize Rolling Downgrade

Run “hdfs dfsadmin -rollingUpgrade finalize” to finalize the rolling downgrade.

Note that the datanodes must be downgraded before the namenodes, since protocols may be changed in a backward compatible manner but not a forward compatible one, i.e. old datanodes can talk to the new namenodes but not vice versa.

Downgrade with Downtime

An administrator may choose to first shut down the cluster and then downgrade it. The steps are as follows:

1. Shut down all NNs and DNs.
2. Restore the pre-upgrade release on all machines.
3. Start NNs with the “-rollingUpgrade downgrade” option.
4. Start DNs normally.

Original link: https://blog.csdn.net/leone911/article/details/51395874
