1. Introduction
A while back, while upgrading Hive (from 1.1.0 to 2.3.6), I ran into an authorization compatibility problem.
(For the upgrade itself, see my earlier post on the pitfalls of upgrading Hive 1.1.0 to 2.3.6.)
Hive 1.1.0 uses AuthorizerV1, while Hive 2.3.6 defaults to AuthorizerV2, and the two differ substantially.
AuthorizerV2's privilege checks are much stricter. If you previously relied on V1 and want to move to V2, some code changes are required; the patch I submitted, HIVE-22830, may serve as a reference.
2. Authorization
The SQL syntax for granting and querying privileges is identical in v1 and v2, so systems that manage Hive grants need no code changes.
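For reference, the shared GRANT / REVOKE / SHOW GRANT statement forms look like the output of this small formatter (the table and user names here are made up for illustration; only the statement shapes come from Hive's SQL syntax):

```java
public class HiveGrantSql {
    // Format a GRANT statement; Hive accepts the same syntax under v1 and v2.
    static String grant(String priv, String table, String user) {
        return "GRANT " + priv + " ON TABLE " + table + " TO USER " + user;
    }

    static String revoke(String priv, String table, String user) {
        return "REVOKE " + priv + " ON TABLE " + table + " FROM USER " + user;
    }

    static String showGrant(String user, String table) {
        return "SHOW GRANT USER " + user + " ON TABLE " + table;
    }

    public static void main(String[] args) {
        System.out.println(grant("SELECT", "db1.t1", "alice"));
        System.out.println(revoke("SELECT", "db1.t1", "alice"));
        System.out.println(showGrant("alice", "db1.t1"));
    }
}
```

Because both authorizers accept these statements, switching from v1 to v2 leaves existing grant-management tooling untouched.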
v1 authorization
(org.apache.hadoop.hive.ql.security.authorization.DefaultHiveAuthorizationProvider): official documentation
The privilege types are ALL | ALTER | UPDATE | CREATE | DROP | INDEX | LOCK | SELECT | SHOW_DATABASE.
The privileges each Hive operation requires are listed in the table below:
v2權限認證
(org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory
):官方文檔
總共有INSERT | SELECT | UPDATE | DELETE | ALL
這幾種類型的權限。
相關Hive操作所需要的權限如下表:
Y means the privilege is required.
Y+G means the privilege is required and must itself be grantable to others ("WITH GRANT OPTION").
Some differences worth noting:
1. v2 stores an ALL grant in the metastore database (e.g. MySQL) as the four separate privileges INSERT | SELECT | UPDATE | DELETE, whereas v1 stores ALL directly.
2. In v1, a database-level grant gives the user/role the corresponding privilege on every table in that database; in v2, each table must be granted individually.
3. In v1, dropping a database/table only requires the corresponding DROP privilege; in v2, you must be the owner of the database/table, and the owner can be either a user or a role.
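The first difference can be sketched with a toy model (plain Java, not Hive's actual classes): a v2 GRANT ALL is persisted as four concrete privilege rows, while v1 stores the single ALL row.

```java
import java.util.EnumSet;
import java.util.Set;

public class AllExpansion {
    enum Priv { INSERT, SELECT, UPDATE, DELETE, ALL }

    // v2: ALL is expanded into the four concrete privileges before it is stored.
    static Set<Priv> persistV2(Priv granted) {
        if (granted == Priv.ALL) {
            return EnumSet.of(Priv.INSERT, Priv.SELECT, Priv.UPDATE, Priv.DELETE);
        }
        return EnumSet.of(granted);
    }

    // v1: the grant is stored exactly as issued.
    static Set<Priv> persistV1(Priv granted) {
        return EnumSet.of(granted);
    }

    public static void main(String[] args) {
        System.out.println("v2 ALL -> " + persistV2(Priv.ALL)); // four rows
        System.out.println("v1 ALL -> " + persistV1(Priv.ALL)); // one row
    }
}
```

This is why, after a v1-to-v2 migration, an old ALL row in the metastore no longer matches what v2 expects to find.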
3. Source Code Analysis (Hive 3.1)
My earlier post on the full Hive SQL execution flow introduced the authorization entry point in Driver.compile:
Next, let's focus on the v2 implementation.
How does Hive decide whether to authorize with v1 or v2?
if (ss.isAuthorizationModeV2())
-> public boolean isAuthorizationModeV2() {
     return getAuthorizationMode() == AuthorizationMode.V2;
   }
--> public AuthorizationMode getAuthorizationMode() {
      setupAuth();
      // after the setupAuth() initialization, see which authorizer field is non-null;
      // v1 is checked first
      if (authorizer != null) {
        return AuthorizationMode.V1;
      } else if (authorizerV2 != null) {
        return AuthorizationMode.V2;
      }
    }
setupAuth():
/**
 * Setup authentication and authorization plugins for this session.
 */
private void setupAuth() {
  if (authenticator != null) {
    // auth has been initialized; return immediately
    return;
  }
  try {
    // the hive.security.authenticator.manager property decides the authenticator
    authenticator = HiveUtils.getAuthenticator(sessionConf,
        HiveConf.ConfVars.HIVE_AUTHENTICATOR_MANAGER);
    authenticator.setSessionState(this);
    // hive.security.authorization.manager names the authorization implementation class
    // and therefore decides between v1 and v2
    // Hive 1 default (v1): org.apache.hadoop.hive.ql.security.authorization.DefaultHiveAuthorizationProvider
    // Hive 2+ default (v2): org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory
    String clsStr = HiveConf.getVar(sessionConf, HiveConf.ConfVars.HIVE_AUTHORIZATION_MANAGER);
    // try to instantiate the authorization class via reflection;
    // yields a HiveAuthorizationProvider (v1), or null
    authorizer = HiveUtils.getAuthorizeProviderManager(sessionConf,
        clsStr, authenticator, true);
    if (authorizer == null) {
      // if it was null, the new (V2) authorization plugin must be specified in
      // config
      // much like the v1 path: instantiate the authorization class via reflection
      HiveAuthorizerFactory authorizerFactory = HiveUtils.getAuthorizerFactory(sessionConf,
          HiveConf.ConfVars.HIVE_AUTHORIZATION_MANAGER);
      HiveAuthzSessionContext.Builder authzContextBuilder = new HiveAuthzSessionContext.Builder();
      authzContextBuilder.setClientType(isHiveServerQuery() ? CLIENT_TYPE.HIVESERVER2
          : CLIENT_TYPE.HIVECLI);
      authzContextBuilder.setSessionString(getSessionId());
      // build the HiveAuthorizer (v2)
      authorizerV2 = authorizerFactory.createHiveAuthorizer(new HiveMetastoreClientFactoryImpl(),
          sessionConf, authenticator, authzContextBuilder.build());
      setAuthorizerV2Config();
    }
    // create the create table grants with new config
    createTableGrants = CreateTableAutomaticGrant.create(sessionConf);
  } catch (HiveException e) {
    LOG.error("Error setting up authorization: " + e.getMessage(), e);
    throw new RuntimeException(e);
  }
  if (LOG.isDebugEnabled()) {
    Object authorizationClass = getActiveAuthorizer();
    LOG.debug("Session is using authorization class " + authorizationClass.getClass());
  }
  return;
}
Now let's follow the v2 implementation:
ss.getAuthorizerV2().checkPrivileges(hiveOpType, inputsHObjs, outputHObjs, authzContextBuilder.build());
-> public void checkPrivileges(HiveOperationType hiveOpType, List<HivePrivilegeObject> inputHObjs,
List<HivePrivilegeObject> outputHObjs, HiveAuthzContext context)
throws HiveAuthzPluginException, HiveAccessControlException {
authValidator.checkPrivileges(hiveOpType, inputHObjs, outputHObjs, context);
}
--> public void checkPrivileges(HiveOperationType hiveOpType, List<HivePrivilegeObject> inputHObjs,
      List<HivePrivilegeObject> outputHObjs, HiveAuthzContext context)
      throws HiveAuthzPluginException, HiveAccessControlException {
    if (LOG.isDebugEnabled()) {
      String msg = "Checking privileges for operation " + hiveOpType + " by user "
          + authenticator.getUserName() + " on " + " input objects " + inputHObjs
          + " and output objects " + outputHObjs + ". Context Info: " + context;
      LOG.debug(msg);
    }
    String userName = authenticator.getUserName();
    // get a metastore API client
    IMetaStoreClient metastoreClient = metastoreClientFactory.getHiveMetastoreClient();
    // check privileges on input and output objects
    List<String> deniedMessages = new ArrayList<String>();
    // authorize inputs and outputs separately; failures are recorded in deniedMessages
    checkPrivileges(hiveOpType, inputHObjs, metastoreClient, userName, IOType.INPUT, deniedMessages);
    checkPrivileges(hiveOpType, outputHObjs, metastoreClient, userName, IOType.OUTPUT, deniedMessages);
    // if deniedMessages is non-empty, privileges are insufficient: throw
    SQLAuthorizationUtils.assertNoDeniedPermissions(new HivePrincipal(userName,
        HivePrincipalType.USER), hiveOpType, deniedMessages);
  }
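The accumulate-then-assert pattern above is worth noting: every object is checked, failure messages are collected instead of failing fast, and a single exception is thrown at the end carrying the full list. A standalone sketch of the same pattern (the class and check below are illustrative, not Hive's):

```java
import java.util.ArrayList;
import java.util.List;

public class DeniedMessagesSketch {
    // Pretend check: objects under "secret" require a privilege our user lacks.
    static void checkOne(String obj, List<String> denied) {
        if (obj.startsWith("secret")) {
            denied.add("Permission denied: SELECT on " + obj);
        }
    }

    static void checkAll(List<String> inputs, List<String> outputs) {
        List<String> denied = new ArrayList<>();
        for (String in : inputs) checkOne(in, denied);
        for (String out : outputs) checkOne(out, denied);
        // Single failure point, mirroring assertNoDeniedPermissions.
        if (!denied.isEmpty()) {
            throw new RuntimeException(String.join("; ", denied));
        }
    }

    public static void main(String[] args) {
        checkAll(List.of("db1.t1"), List.of("db1.t2")); // passes silently
        try {
            checkAll(List.of("secret.t1"), List.of("secret.t2"));
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Collecting all denials first means a user sees every missing privilege in one error message rather than fixing them one query-retry at a time.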
checkPrivileges:
getPrivilegesFromMetaStore:
...
thrifPrivs = metastoreClient.get_privilege_set(objectRef, userName, null);
-> public PrincipalPrivilegeSet get_privilege_set(HiveObjectRef hiveObject,
String userName, List<String> groupNames) throws MetaException,
TException {
if (!hiveObject.isSetCatName()) {
hiveObject.setCatName(getDefaultCatalog(conf));
}
    // call into the metastore
return client.get_privilege_set(hiveObject, userName, groupNames);
}
...
get_privilege_set:
Taking table privileges as an example, get_table_privilege_set:
getTablePrivilege:
listAllMTableGrants:
package.jdo:
That covers the core of the authorization code. If you are not familiar with JDO, the official JDOQL documentation is worth a look.
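For a taste of what such a metastore query looks like: listAllMTableGrants hands the JDO PersistenceManager a filter string roughly like the one built below, with the parameters declared and bound separately via declareParameters/execute. The field names here are illustrative of the MTablePrivilege model, not copied verbatim from Hive:

```java
public class JdoqlFilterSketch {
    // Build a JDOQL-style filter over table-privilege rows. In real JDO code,
    // t1..t4 would be declared as query parameters and bound at execute time.
    static String tableGrantFilter() {
        return "principalName == t1 && principalType == t2"
             + " && table.tableName == t3 && table.database.name == t4";
    }

    public static void main(String[] args) {
        System.out.println(tableGrantFilter());
    }
}
```

JDOQL filters navigate object fields (table.database.name) instead of joining tables by hand, which is why package.jdo's mapping of classes to metastore tables matters here.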
Back in Driver.doAuthorization: if v2 is not in use, v1 authorization runs by default. Its underlying implementation is the same as v2's: query the metastore for the stored privileges, then compare them against what the operation requires.
    // perform v1 authorization
HiveAuthorizationProvider authorizer = ss.getAuthorizer();
if (op.equals(HiveOperation.CREATEDATABASE)) {
      // authorize CREATE DATABASE
authorizer.authorize(
op.getInputRequiredPrivileges(), op.getOutputRequiredPrivileges());
} else if (op.equals(HiveOperation.CREATETABLE_AS_SELECT)
|| op.equals(HiveOperation.CREATETABLE)) {
      // authorize CREATE TABLE
authorizer.authorize(
db.getDatabase(SessionState.get().getCurrentDatabase()), null,
HiveOperation.CREATETABLE_AS_SELECT.getOutputRequiredPrivileges());
} else {
if (op.equals(HiveOperation.IMPORT)) {
ImportSemanticAnalyzer isa = (ImportSemanticAnalyzer) sem;
if (!isa.existsTable()) {
          // authorize the import (IMPORT)
authorizer.authorize(
db.getDatabase(SessionState.get().getCurrentDatabase()), null,
HiveOperation.CREATETABLE_AS_SELECT.getOutputRequiredPrivileges());
}
}
}
if (outputs != null && outputs.size() > 0) {
for (WriteEntity write : outputs) {
if (write.isDummy() || write.isPathType()) {
continue;
}
if (write.getType() == Entity.Type.DATABASE) {
if (!op.equals(HiveOperation.IMPORT)){
// We skip DB check for import here because we already handle it above
// as a CTAS check.
            // authorize database-level writes
authorizer.authorize(write.getDatabase(),
null, op.getOutputRequiredPrivileges());
}
continue;
}
if (write.getType() == WriteEntity.Type.PARTITION) {
Partition part = db.getPartition(write.getTable(), write
.getPartition().getSpec(), false);
if (part != null) {
            // authorize partition-level writes
authorizer.authorize(write.getPartition(), null,
op.getOutputRequiredPrivileges());
continue;
}
}
        // authorize table-level writes
if (write.getTable() != null) {
authorizer.authorize(write.getTable(), null,
op.getOutputRequiredPrivileges());
}
}
}
    // Next comes authorization of reads; I won't annotate every branch.
    // As with v2, the underlying implementation queries the metastore for the stored privileges and compares them.
if (inputs != null && inputs.size() > 0) {
Map<Table, List<String>> tab2Cols = new HashMap<Table, List<String>>();
Map<Partition, List<String>> part2Cols = new HashMap<Partition, List<String>>();
//determine if partition level privileges should be checked for input tables
Map<String, Boolean> tableUsePartLevelAuth = new HashMap<String, Boolean>();
for (ReadEntity read : inputs) {
if (read.isDummy() || read.isPathType() || read.getType() == Entity.Type.DATABASE) {
continue;
}
Table tbl = read.getTable();
if ((read.getPartition() != null) || (tbl != null && tbl.isPartitioned())) {
String tblName = tbl.getTableName();
if (tableUsePartLevelAuth.get(tblName) == null) {
boolean usePartLevelPriv = (tbl.getParameters().get(
"PARTITION_LEVEL_PRIVILEGE") != null && ("TRUE"
.equalsIgnoreCase(tbl.getParameters().get(
"PARTITION_LEVEL_PRIVILEGE"))));
if (usePartLevelPriv) {
tableUsePartLevelAuth.put(tblName, Boolean.TRUE);
} else {
tableUsePartLevelAuth.put(tblName, Boolean.FALSE);
}
}
}
}
// column authorization is checked through table scan operators.
getTablePartitionUsedColumns(op, sem, tab2Cols, part2Cols, tableUsePartLevelAuth);
// cache the results for table authorization
Set<String> tableAuthChecked = new HashSet<String>();
for (ReadEntity read : inputs) {
// if read is not direct, we do not need to check its autho.
if (read.isDummy() || read.isPathType() || !read.isDirect()) {
continue;
}
if (read.getType() == Entity.Type.DATABASE) {
authorizer.authorize(read.getDatabase(), op.getInputRequiredPrivileges(), null);
continue;
}
Table tbl = read.getTable();
if (tbl.isView() && sem instanceof SemanticAnalyzer) {
tab2Cols.put(tbl,
sem.getColumnAccessInfo().getTableToColumnAccessMap().get(tbl.getCompleteName()));
}
if (read.getPartition() != null) {
Partition partition = read.getPartition();
tbl = partition.getTable();
// use partition level authorization
if (Boolean.TRUE.equals(tableUsePartLevelAuth.get(tbl.getTableName()))) {
List<String> cols = part2Cols.get(partition);
if (cols != null && cols.size() > 0) {
authorizer.authorize(partition.getTable(),
partition, cols, op.getInputRequiredPrivileges(),
null);
} else {
authorizer.authorize(partition,
op.getInputRequiredPrivileges(), null);
}
continue;
}
}
// if we reach here, it means it needs to do a table authorization
// check, and the table authorization may already happened because of other
// partitions
if (tbl != null && !tableAuthChecked.contains(tbl.getTableName()) &&
!(Boolean.TRUE.equals(tableUsePartLevelAuth.get(tbl.getTableName())))) {
List<String> cols = tab2Cols.get(tbl);
if (cols != null && cols.size() > 0) {
authorizer.authorize(tbl, null, cols,
op.getInputRequiredPrivileges(), null);
} else {
authorizer.authorize(tbl, op.getInputRequiredPrivileges(),
null);
}
tableAuthChecked.add(tbl.getTableName());
}
}
}
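The PARTITION_LEVEL_PRIVILEGE toggle in the read-authorization loop above boils down to a single table-parameter lookup. A standalone sketch of that decision (map-based, no Hive types):

```java
import java.util.HashMap;
import java.util.Map;

public class PartLevelAuthSketch {
    // Mirrors the check in Driver.doAuthorization: partition-level
    // authorization applies only when the table parameter
    // PARTITION_LEVEL_PRIVILEGE is set to "TRUE" (case-insensitive).
    static boolean usePartLevelAuth(Map<String, String> tableParams) {
        String v = tableParams.get("PARTITION_LEVEL_PRIVILEGE");
        return "TRUE".equalsIgnoreCase(v);
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        System.out.println(usePartLevelAuth(params)); // false: parameter absent
        params.put("PARTITION_LEVEL_PRIVILEGE", "true");
        System.out.println(usePartLevelAuth(params)); // true: case-insensitive match
    }
}
```

When the flag is off, all partitions of a table fall back to one table-level check, which is also why the code caches results in tableAuthChecked to avoid re-authorizing the same table per partition.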