Responsibilities of a Hadoop Administrator (Translated)

I recently read an English document about Hadoop, actually an excerpt from a book, and found it quite good: it lays out, in essence, the responsibilities of a Hadoop administrator. If your daily work touches Hadoop, it is worth reading through and asking yourself whether you already have all of the knowledge and skills it describes. My translation follows.
Translation:
Responsibilities of a Hadoop Administrator

With the growing interest in deriving insight from big data, organizations are now planning and building their big data teams aggressively. To start working on their data, they need a good, solid infrastructure.
Once that is in place, they need controls and system policies for maintaining, managing, and troubleshooting the cluster.

There is an ever-increasing demand for Hadoop administrators in the market, because their work (setting up and maintaining clusters) is what makes data analysis truly possible.

A Hadoop administrator needs strong system-operations skills in networking, operating systems, and storage, along with solid knowledge of computer hardware and how it behaves in a complex network.

Apache Hadoop runs mainly on Linux, so good Linux skills such as monitoring, troubleshooting, configuration, and security management are a must.

Setting up the nodes of a cluster involves a lot of repetitive work, so a Hadoop administrator should use quick and efficient ways to bring these servers up, for example with configuration management tools such as Puppet, Chef, and CFEngine.
Apart from these tools, the administrator should also have good capacity-planning skills to design and plan the cluster.
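
To illustrate the kind of repetition these tools remove, here is a minimal Python sketch that pushes identical setup steps to a list of servers over SSH. The hostnames and commands are made up; a real deployment would use Puppet, Chef, or CFEngine rather than a hand-rolled loop:

    #!/usr/bin/env python
    # Minimal sketch: run the same bootstrap steps on every new node over SSH.
    # Hostnames and commands are hypothetical.
    import subprocess

    NODES = ["hadoop-node01", "hadoop-node02", "hadoop-node03"]  # hypothetical hosts
    STEPS = [
        "sudo yum -y install java-1.8.0-openjdk",                    # install the JDK
        "sudo useradd -m hadoop || true",                            # create the hadoop user
        "sudo mkdir -p /data/hdfs && sudo chown hadoop /data/hdfs",  # data directory
    ]

    for node in NODES:
        for step in STEPS:
            # ssh returns non-zero on failure; check=True stops at the first error
            subprocess.run(["ssh", node, step], check=True)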

Several nodes in a cluster need data to be duplicated; for example, the fsimage file of the namenode daemon can be configured to be written to two different disks on the same node, or to a disk on a different node.
A Hadoop administrator therefore needs to understand NFS mount points and how to set them up within a cluster, and may also be asked to configure RAID for the disks on specific nodes.
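
As a concrete example, the fsimage location is controlled by the dfs.namenode.name.dir property in hdfs-site.xml, which takes a comma-separated list of directories; pointing it at a local disk plus an NFS mount yields two copies. The directory paths below are hypothetical; a short Python sketch that emits such a snippet:

    # Sketch: generate an hdfs-site.xml property that makes the namenode write
    # its fsimage to two directories, one on a local disk and one on an NFS
    # mount. The paths are hypothetical; dfs.namenode.name.dir is the real
    # Hadoop property and accepts a comma-separated list of directories.
    import xml.etree.ElementTree as ET

    conf = ET.Element("configuration")
    prop = ET.SubElement(conf, "property")
    ET.SubElement(prop, "name").text = "dfs.namenode.name.dir"
    ET.SubElement(prop, "value").text = (
        "/data/1/dfs/nn,"   # local disk (hypothetical path)
        "/mnt/nfs/dfs/nn"   # NFS mount backed by another machine (hypothetical)
    )

    ET.dump(conf)  # prints the <configuration> snippet for hdfs-site.xml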

Because all Hadoop services and daemons are built on Java, basic knowledge of the JVM (Java Virtual Machine) and the ability to read Java exceptions are very useful.
They help the administrator identify problems quickly.
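
As a quick illustration, a few lines of Python can pull Java exception traces out of a daemon log for a first look; the log path here is hypothetical and varies by installation:

    # Sketch: extract Java exception lines from a Hadoop daemon log.
    # The log path is hypothetical; adjust it for your installation.
    import re

    PATTERN = re.compile(r"(Exception|Error)")  # matches e.g. java.io.IOException

    with open("/var/log/hadoop/hadoop-hdfs-namenode.log") as log:
        for line in log:
            # Stack-trace continuation lines start with "\tat " or "Caused by"
            if PATTERN.search(line) or line.startswith(("\tat ", "Caused by")):
                print(line.rstrip())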

A Hadoop administrator should have the skills to benchmark the cluster and test its performance under high-traffic scenarios.
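
For instance, Hadoop ships with benchmarks such as TestDFSIO that can be driven from a script; the jar location below is an assumption and differs across Hadoop versions and distributions:

    # Sketch: drive the stock TestDFSIO benchmark from Python.
    # The jar path is an assumption; it varies between versions/distributions.
    import subprocess

    JAR = "/usr/lib/hadoop-mapreduce/hadoop-mapreduce-client-jobclient-tests.jar"

    subprocess.run([
        "hadoop", "jar", JAR, "TestDFSIO",
        "-write",             # write phase; run again with -read for reads
        "-nrFiles", "10",     # ten concurrent map tasks, one file each
        "-fileSize", "1000",  # 1000 MB per file
    ], check=True)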

Clusters run all the time and regularly process large amounts of data, so they are prone to failures. To monitor the health of the cluster, the administrator should deploy monitoring tools such as Nagios and Ganglia,
and should configure alerts and monitors for the critical nodes so that problems can be foreseen before they occur.
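
One lightweight probe of this kind polls the namenode's JMX JSON servlet and reports in the style Nagios expects. This is only a sketch: the host, port, and the exact bean and attribute names are assumptions and vary by Hadoop version:

    # Sketch: a tiny health check against the namenode's JMX JSON servlet.
    # Host, port, and bean/attribute names are assumptions.
    import json
    import urllib.request

    URL = ("http://namenode.example.com:50070/jmx"
           "?qry=Hadoop:service=NameNode,name=FSNamesystemState")

    with urllib.request.urlopen(URL) as resp:
        state = json.load(resp)["beans"][0]

    live = state.get("NumLiveDataNodes", 0)
    dead = state.get("NumDeadDataNodes", 0)
    print("live datanodes: %d, dead datanodes: %d" % (live, dead))
    if dead > 0:
        raise SystemExit(2)  # Nagios convention: exit code 2 = CRITICAL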

Good knowledge of a scripting language such as Python, Ruby, or shell greatly helps a Hadoop administrator.
Administrators are often asked to set up some kind of scheduled file staging from an external source into HDFS; scripting skills let them handle such requests by writing scripts and automating them.
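
A minimal sketch of such a staging job, the kind one would run from cron, might look like this; all paths are hypothetical, while "hdfs dfs -mkdir" and "hdfs dfs -put" are the standard commands:

    # Sketch: stage yesterday's files from a local landing directory into HDFS.
    # All paths are hypothetical.
    import datetime
    import glob
    import subprocess

    day = (datetime.date.today() - datetime.timedelta(days=1)).isoformat()
    src_files = glob.glob("/data/staging/%s/*.log" % day)  # hypothetical source
    dest = "/user/etl/raw/%s/" % day                       # hypothetical HDFS dir

    subprocess.run(["hdfs", "dfs", "-mkdir", "-p", dest], check=True)
    for f in src_files:
        subprocess.run(["hdfs", "dfs", "-put", f, dest], check=True)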

Above all, a Hadoop administrator should have a very good understanding of the Apache Hadoop architecture and its inner workings.

The following are some of the key Hadoop operations a Hadoop administrator must master:
Planning the cluster: estimating the amount of data the cluster will need to handle and deciding the number of nodes accordingly.
Installing and upgrading Apache Hadoop on the cluster.
Configuring and tuning Hadoop using the various configuration files Hadoop provides.
Understanding all the Hadoop daemons, along with their roles and responsibilities in the cluster.
Knowing how to read and interpret Hadoop logs.
Adding and removing nodes in the cluster (a sketch of this follows the list).
Rebalancing nodes in the cluster.
Enabling security with an authentication and authorization system such as Kerberos.
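
As an example of two of these operations, here is a sketch of decommissioning a datanode and then rebalancing. The exclude-file path and hostname are assumptions (the real path is whatever dfs.hosts.exclude points to in your configuration); "hdfs dfsadmin -refreshNodes" and "hdfs balancer" are the standard commands:

    # Sketch: decommission a datanode, then rebalance the cluster.
    # The exclude-file path and hostname are hypothetical.
    import subprocess

    EXCLUDE_FILE = "/etc/hadoop/conf/dfs.exclude"  # hypothetical path

    # 1. Add the node to the exclude file read by the namenode.
    with open(EXCLUDE_FILE, "a") as f:
        f.write("hadoop-node07\n")                 # hypothetical hostname

    # 2. Tell the namenode to re-read its include/exclude lists; the node then
    #    starts decommissioning (its blocks are re-replicated elsewhere).
    subprocess.run(["hdfs", "dfsadmin", "-refreshNodes"], check=True)

    # 3. Spread blocks evenly; -threshold is the allowed per-node deviation
    #    from average utilization, in percent.
    subprocess.run(["hdfs", "balancer", "-threshold", "10"], check=True)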

Almost all organizations follow some policy for backing up their data, and performing these backups is the Hadoop administrator's responsibility.
An administrator should therefore be well versed in server backup and recovery operations.
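
One simple form of backup is bulk-copying data to a second cluster with DistCp, Hadoop's distributed copy tool. A sketch, with cluster addresses and paths made up:

    # Sketch: copy a day's data to a second cluster with DistCp as one simple
    # form of backup. Cluster addresses and paths are hypothetical.
    import subprocess

    SRC = "hdfs://prod-nn:8020/user/etl/raw/2014-01-01"
    DST = "hdfs://backup-nn:8020/backups/raw/2014-01-01"

    subprocess.run(["hadoop", "distcp", SRC, DST], check=True)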

