0039 - How to Connect to Hive and Impala with the Python Impyla Client

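1. Purpose of This Document


The previous chapter covered installing Anaconda on a CDH cluster and setting up a private Python package source. This chapter explains how to use the Python Impyla client to connect to HiveServer2 and the Impala Daemon on a CDH cluster and run SQL against them.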

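  • Overview

1. Installing the dependencies

2. Writing the code

3. Testing the code

  • Test environment

1. CM and CDH version 5.11.2

2. RedHat 7.2

  • Prerequisites

1. The CDH cluster is running normally

2. Anaconda is installed and its environment variables are configured

3. pip can install Python packages

4. Python 2.6+ or 3.3+

5. A non-secure cluster environment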
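
The Python version prerequisite can be checked from the interpreter itself. The following is a minimal sketch; the helper name python_version_supported is made up for illustration and is not part of any library.

```python
import sys


def python_version_supported(version_info):
    """Return True if the version satisfies 'Python 2.6+ or 3.3+'."""
    major, minor = version_info[0], version_info[1]
    if major == 2:
        return minor >= 6
    # Anything from 3.3 upward is accepted; 3.0 to 3.2 are rejected.
    return (major, minor) >= (3, 3)


# Check the interpreter that is running this script.
print(python_version_supported(sys.version_info))
```

Run it with the same python binary that pip installs into, since Anaconda and the system Python may differ.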

2. Installing Impyla's Dependencies


Impyla depends on the following Python packages:

  • six
  • bitarray
  • thrift (on Python 2.x) or thriftpy (on Python 3.x)
  • thrift_sasl
  • sasl
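
Once the packages are installed, one way to see which of them are still missing is to try importing each one. This is only a sketch: the helper missing_modules is hypothetical, and the import names are assumed to match the package names (bitarray is the name pip reports in the install log later in this article; on Python 3.x, thriftpy would replace thrift).

```python
import importlib

# Import names assumed for the dependencies (Python 2.x variant).
REQUIRED_MODULES = ["six", "bitarray", "thrift", "thrift_sasl", "sasl"]


def missing_modules(names):
    """Return the module names from the list that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


# An empty list means every dependency can be imported.
print(missing_modules(REQUIRED_MODULES))
```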

1. First, install the Python packages that Impyla depends on

[root@ip-172-31-22-86 ~]# pip install bitarray
[root@ip-172-31-22-86 ~]# pip install thrift==0.9.3
[root@ip-172-31-22-86 ~]# pip install six
[root@ip-172-31-22-86 ~]# pip install thrift_sasl
[root@ip-172-31-22-86 ~]# pip install sasl

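Note: thrift must be version 0.9.3. A plain pip install thrift pulls 0.10.0 by default; if that version is already installed, remove it with pip uninstall thrift and then install 0.9.3.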

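2. Install the Impyla package

pip installs impyla 0.14.0 by default; uninstall it and install version 0.13.8 instead. Note that the output captured below still reports 0.14.0, so it appears to come from the default install rather than the pinned one.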

[root@ip-172-31-22-86 ec2-user]# pip install impyla==0.13.8
Collecting impyla
  Downloading impyla-0.14.0.tar.gz (151kB)
    100% |████████████████████████████████| 153kB 1.0MB/s 
Requirement already satisfied: six in /opt/cloudera/parcels/Anaconda-4.2.0/lib/python2.7/site-packages (from impyla)
Requirement already satisfied: bitarray in /opt/cloudera/parcels/Anaconda-4.2.0/lib/python2.7/site-packages (from impyla)
Requirement already satisfied: thrift in /opt/cloudera/parcels/Anaconda-4.2.0/lib/python2.7/site-packages (from impyla)
Building wheels for collected packages: impyla
  Running setup.py bdist_wheel for impyla ... done
  Stored in directory: /root/.cache/pip/wheels/96/fa/d8/40e676f3cead7ec45f20ac43eb373edc471348ac5cb485d6f5
Successfully built impyla
Installing collected packages: impyla
Successfully installed impyla-0.14.0
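
Because both thrift and impyla need specific versions, it can help to verify the pins after installing. The sketch below is one possible check, not part of Impyla: installed_version and find_mismatches are hypothetical helpers, and the lookup uses importlib.metadata (Python 3.8+) with a pkg_resources fallback for older interpreters.

```python
# Version pins taken from this article.
PINS = {"thrift": "0.9.3", "impyla": "0.13.8"}


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        from importlib import metadata  # available on Python 3.8+
    except ImportError:
        metadata = None
    if metadata is not None:
        try:
            return metadata.version(package)
        except metadata.PackageNotFoundError:
            return None
    import pkg_resources  # fallback for older interpreters (ships with setuptools)
    try:
        return pkg_resources.get_distribution(package).version
    except pkg_resources.DistributionNotFound:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Map each package whose version differs from its pin to (found, wanted)."""
    mismatches = {}
    for package, wanted in pins.items():
        found = lookup(package)
        if found != wanted:
            mismatches[package] = (found, wanted)
    return mismatches


# An empty dict means both pins are satisfied.
print(find_mismatches(PINS))
```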

3. Writing the Python Code


Connecting to Hive from Python (HiveTest.py)

from impala.dbapi import connect

# Connect to HiveServer2 (port 10000) using the PLAIN SASL mechanism
conn = connect(host='ip-172-31-21-45.ap-southeast-1.compute.internal', port=10000, database='default', auth_mechanism='PLAIN')
print(conn)

cursor = conn.cursor()

# List the databases visible through HiveServer2
cursor.execute('show databases')
print(cursor.description)  # prints the result set's schema
results = cursor.fetchall()
print(results)

# Read the first ten rows of the test table
cursor.execute('SELECT * FROM test limit 10')
print(cursor.description)  # prints the result set's schema
results = cursor.fetchall()
print(results)
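
The script prints the schema and the rows separately. If named columns are more convenient, the two can be combined. The following sketch assumes only the DB API 2.0 convention that the first element of each cursor.description entry is the column name; rows_to_dicts is a hypothetical helper, and the sample values are copied from the Hive test output in section 4.

```python
def rows_to_dicts(description, rows):
    """Pair each row with the column names from a DB API cursor.description."""
    columns = [column[0] for column in description]
    return [dict(zip(columns, row)) for row in rows]


# Sample values copied from the Hive test output in this article.
description = [("test.s1", "STRING", None, None, None, None, None),
               ("test.s2", "STRING", None, None, None, None, None)]
rows = [("name1", "age1"), ("name2", "age2")]

print(rows_to_dicts(description, rows))
```

In a real script the two arguments would be cursor.description and cursor.fetchall().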

Connecting to Impala from Python (ImpalaTest.py)

from impala.dbapi import connect

# Connect to an Impala Daemon on port 21050; no auth_mechanism is passed, so impyla's default applies
conn = connect(host='ip-172-31-26-80.ap-southeast-1.compute.internal', port=21050)
print(conn)

cursor = conn.cursor()

# List the databases visible through Impala
cursor.execute('show databases')
print(cursor.description)  # prints the result set's schema
results = cursor.fetchall()
print(results)

# Read the first ten rows of the test table
cursor.execute('SELECT * FROM test limit 10')
print(cursor.description)  # prints the result set's schema
results = cursor.fetchall()
print(results)
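
The Hive and Impala scripts differ only in the arguments passed to connect, and neither closes its cursor or connection. A small wrapper can cover both cases and clean up afterwards. This is a sketch under assumptions, not the article's method: run_query is a hypothetical helper, and the connect function is passed in so the wrapper itself does not depend on Impyla.

```python
from contextlib import closing


def run_query(connect_fn, sql, **connect_kwargs):
    """Open a connection, run one statement, and return (description, rows).

    The cursor and the connection are closed even if the query fails.
    """
    with closing(connect_fn(**connect_kwargs)) as conn:
        with closing(conn.cursor()) as cursor:
            cursor.execute(sql)
            return cursor.description, cursor.fetchall()


# Hypothetical usage with the two services from this article:
#   from impala.dbapi import connect
#   run_query(connect, 'show databases', host='<hiveserver2-host>',
#             port=10000, database='default', auth_mechanism='PLAIN')
#   run_query(connect, 'show databases', host='<impalad-host>', port=21050)
```

Passing the connect function as an argument also makes the wrapper easy to exercise with a stand-in connection when no cluster is available.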

4. Testing the Code


Run the Python scripts from the shell to test them.

1. Test the Hive connection

[root@ip-172-31-22-86 ec2-user]# python HiveTest.py
<impala.hiveserver2.HiveServer2Connection object at 0x7f66eee00250>
[('database_name', 'STRING', None, None, None, None, None)]
[('default',)]
[('test.s1', 'STRING', None, None, None, None, None), ('test.s2', 'STRING', None, None, None, None, None)]
[('name1', 'age1'), ('name2', 'age2'), ('name3', 'age3'), ('name4', 'age4'), ('name5', 'age5'), ('name6', 'age6'), ('name7', 'age7'), ('name8', 'age8'), ('name9', 'age9'), ('name10', 'age10')]
[root@ip-172-31-22-86 ec2-user]#

2. Test the Impala connection

[root@ip-172-31-22-86 ec2-user]# python ImpalaTest.py
<impala.hiveserver2.HiveServer2Connection object at 0x7f7e1f2cfad0>
[('name', 'STRING', None, None, None, None, None), ('comment', 'STRING', None, None, None, None, None)]
[('_impala_builtins', 'System database for Impala builtin functions'), ('default', 'Default Hive database')]
[('s1', 'STRING', None, None, None, None, None), ('s2', 'STRING', None, None, None, None, None)]
[('name1', 'age1'), ('name2', 'age2'), ('name3', 'age3'), ('name4', 'age4'), ('name5', 'age5'), ('name6', 'age6'), ('name7', 'age7'), ('name8', 'age8'), ('name9', 'age9'), ('name10', 'age10')]
[root@ip-172-31-22-86 ec2-user]#

5. Common Problems


1. Error 1

building 'sasl.saslwrapper' extension
    creating build/temp.linux-x86_64-2.7
    creating build/temp.linux-x86_64-2.7/sasl
    gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -Isasl -I/opt/cloudera/parcels/Anaconda/include/python2.7 -c sasl/saslwrapper.cpp -o build/temp.linux-x86_64-2.7/sasl/saslwrapper.o
    unable to execute 'gcc': No such file or directory
    error: command 'gcc' failed with exit status 1

    ----------------------------------------
Command "/opt/cloudera/parcels/Anaconda/bin/python -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-kD6tvP/sasl/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-WJFNeG-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-kD6tvP/sasl/

Solution:

[root@ip-172-31-22-86 ec2-user]# yum -y install gcc 
[root@ip-172-31-22-86 ec2-user]# yum install gcc-c++ 
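
Error 1 appears because the sasl package compiles a C++ extension and no compiler is on the PATH. A quick pre-check before running pip can catch this. The sketch below uses shutil.which, which needs Python 3.3+; missing_tools is a hypothetical helper, and it assumes the gcc-c++ package provides the g++ binary.

```python
import shutil


def missing_tools(tools, which=shutil.which):
    """Return the tools from the list that are not found on PATH."""
    return [tool for tool in tools if which(tool) is None]


# gcc is provided by the gcc package; g++ is assumed to come from gcc-c++.
print(missing_tools(["gcc", "g++"]))
```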

2. Error 2

gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -Isasl -I/opt/cloudera/parcels/Anaconda/include/python2.7 -c sasl/saslwrapper.cpp -o build/temp.linux-x86_64-2.7/sasl/saslwrapper.o
cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++ [enabled by default]
In file included from sasl/saslwrapper.cpp:254:0:
sasl/saslwrapper.h:22:23: fatal error: sasl/sasl.h: No such file or directory
#include <sasl/sasl.h>
                   ^
compilation terminated.
error: command 'gcc' failed with exit status 1

Solution:

[root@ip-172-31-22-86 ec2-user]# yum -y install python-devel.x86_64 cyrus-sasl-devel.x86_64
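
Error 2 means the compiler ran but could not find the Cyrus SASL headers. To confirm that the development packages are in place, one option is to look for the header files directly. This is a sketch only: missing_headers is a hypothetical helper, and both paths are assumed default locations on RHEL 7 that may differ on other systems.

```python
import os

# Assumed default header locations on RHEL 7; adjust them for your system.
HEADERS = {
    "cyrus-sasl-devel": "/usr/include/sasl/sasl.h",
    "python-devel": "/usr/include/python2.7/Python.h",
}


def missing_headers(headers, exists=os.path.exists):
    """Return the package names whose expected header file is absent."""
    return sorted(pkg for pkg, path in headers.items() if not exists(path))


# An empty list means both headers were found at the assumed paths.
print(missing_headers(HEADERS))
```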

Original article. Reprints are welcome; please credit the WeChat public account Hadoop實操.
