python phoenix API


2019-06-20 03:43

1. Phoenix

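Phoenix is an open source SQL layer on top of HBase (a SQL query engine rather than a search engine in the usual sense). Many versions ago it existed as a standalone product; today it is usually deployed together with HBase. With Phoenix, the standard JDBC API can be used in place of the HBase client API to create tables and to insert and query data in HBase.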
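From Python, the same create / insert / query cycle goes through a DB-API cursor instead of JDBC. The following is a minimal sketch, not a definitive recipe: the `users` table, its columns and the `demo` function are hypothetical, and the cursor is expected to come from `phoenixdb.connect(...).cursor()` against a reachable Phoenix Query Server.

```python
# Phoenix SQL statements; the table and column names are hypothetical.
CREATE_SQL = ("CREATE TABLE IF NOT EXISTS users "
              "(id BIGINT NOT NULL PRIMARY KEY, username VARCHAR)")
UPSERT_SQL = "UPSERT INTO users VALUES (?, ?)"  # Phoenix uses UPSERT, not INSERT
SELECT_SQL = "SELECT id, username FROM users WHERE id = ?"


def demo(cursor):
    """Create a table, write two rows and read one back via a DB-API cursor."""
    cursor.execute(CREATE_SQL)
    cursor.executemany(UPSERT_SQL, [(1, "admin"), (2, "guest")])
    cursor.execute(SELECT_SQL, (1,))
    return cursor.fetchall()


# Usage against a real server (host is an assumption):
#   import phoenixdb
#   conn = phoenixdb.connect("http://localhost:8765/", autocommit=True)
#   print(demo(conn.cursor()))
#   conn.close()
```

Passing the cursor in keeps the SQL separate from connection handling, so the same function works with a per-call or a long-lived connection.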

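Phoenix sits as middleware between the application layer and HBase. The following features give it a distinct advantage for simple queries over large data volumes:

Secondary index support (global index + local index)
Compiles SQL into native HBase scans that can run in parallel
Performs computation at the data layer; server-side coprocessors execute the aggregations
Pushes WHERE filter conditions down to server-side scan filters
Uses statistics to optimize and choose the query plan (version 5.x is expected to support CBO, cost-based optimization)
Skip scan improves scan speed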
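As an illustration of the secondary index feature, the sketch below builds Phoenix `CREATE INDEX` statements for the global and local variants, plus an `EXPLAIN` prefix that can be used to inspect the chosen plan. The helper names (`index_ddl`, `explain`) and the table and column names are hypothetical; the resulting strings would be passed to a phoenixdb cursor's `execute`.

```python
def index_ddl(index_name, table, columns, local=False):
    """Build a Phoenix CREATE INDEX statement (global by default)."""
    kind = "LOCAL INDEX" if local else "INDEX"
    col_list = ", ".join(columns)
    return "CREATE {} IF NOT EXISTS {} ON {} ({})".format(
        kind, index_name, table, col_list)


def explain(query):
    """Prefix a query with EXPLAIN so Phoenix returns its plan instead of rows."""
    return "EXPLAIN " + query


global_idx = index_ddl("idx_username", "users", ["username"])
local_idx = index_ddl("idx_username_local", "users", ["username"], local=True)
plan_query = explain("SELECT id FROM users WHERE username = 'admin'")
```

Running the `EXPLAIN` statement before and after creating an index is one way to check whether a query is actually served from it.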

In one sentence: use SQL statements to work with NoSQL data.

2. Calling the API

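Two Python packages are available for Phoenix, phoenixdb and JayDeBeApi; phoenixdb is used here. It does not appear to offer a connection pool, so a connection is either opened and closed for each call or kept open as a long-lived connection.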
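The open-and-close-per-call style can be wrapped in a context manager so the cursor and connection are always released. This is only a sketch under assumptions: `phoenix_cursor` and its `connect` parameter are hypothetical names, and the parameter exists purely so that a different connect function can be substituted (for example in tests).

```python
from contextlib import contextmanager


@contextmanager
def phoenix_cursor(url, connect=None):
    """Open a connection for one unit of work and always close it afterwards."""
    if connect is None:
        import phoenixdb  # imported lazily; needs the phoenixdb package
        connect = phoenixdb.connect
    conn = connect(url, autocommit=True)
    try:
        cursor = conn.cursor()
        try:
            yield cursor
        finally:
            cursor.close()
    finally:
        conn.close()


# Usage (host is an assumption):
#   with phoenix_cursor("http://localhost:8765/") as cur:
#       cur.execute("SELECT 1")
```

For the long-lived style, the connection would instead be created once and reused, at the cost of having to handle a connection that has gone stale.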

Install the dependency: pip3 install phoenixdb

configuration.properties

[Hbase]
hbase_host:172.8.10.xx
hbase_db:xx
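A file in this shape can be read with the standard library's `ConfigParser`, which accepts `:` as a key/value delimiter and matches option names case-insensitively. The sketch below parses the same content from a string so it runs without the file; the `SAMPLE` constant is only for illustration, and the elided host value is kept exactly as shown above.

```python
from configparser import ConfigParser

# Same content as configuration.properties, inlined for the example.
SAMPLE = """\
[Hbase]
hbase_host:172.8.10.xx
hbase_db:xx
"""

config = ConfigParser()
config.read_string(SAMPLE)  # for the real file: config.read("configuration.properties")

# Option names are lower-cased on lookup, so 'Hbase_host' finds 'hbase_host'.
host = config.get("Hbase", "Hbase_host")
db = config.get("Hbase", "hbase_db")

# 8765 is the port used for the Phoenix Query Server in the client code.
url = "http://{}:8765/".format(host)
```

Section names, unlike option names, are case-sensitive, so the section must be requested as `Hbase`.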

PhoenixClient

#!/usr/bin/python3
# -*- coding: UTF-8 -*-

from configparser import ConfigParser

import pandas as pd
import phoenixdb as pdb

## READ CONFIGURATION FILE
# configuration.properties is the INI-style file with the [Hbase] section shown above
config = ConfigParser()
config.read("configuration.properties")

# =============================================================================
# initialization
# =============================================================================

Hbase_host = config.get('Hbase', 'Hbase_host')
Hbase_db = config.get('Hbase', 'Hbase_db')


class PhoenixClient(object):

    def __init__(self):
        self.url = 'http://{}:8765/'.format(Hbase_host)
        self.Hbase_db = Hbase_db

    def pullData(self,query_str,ID=False):
        # Run a query and return the rows as a DataFrame.
        # The ID (row key) column is dropped unless ID=True.
        # Connect outside the try block so a failed connect is not masked by cleanup.
        conn = pdb.connect(self.url, autocommit=True)
        try:
            cursor = conn.cursor()
            try:
                cursor.execute(query_str)
                cols = [x.name for x in cursor.description]
                data = cursor.fetchall()
                if ID:
                    return pd.DataFrame(data,columns=cols)
                else:
                    return pd.DataFrame(data,columns=cols).drop(columns="ID")
            finally:
                cursor.close()
        finally:
            conn.close()


    def pushData(self,table,df,rowKeyName,prefix):
        # Upsert every row of df into "<Hbase_db>:<table>".
        # rowKeyName is a list of column names whose values form part of the row key.
        conn = pdb.connect(self.url, autocommit=True)
        try:
            cursor = conn.cursor()
            try:
                ## add ID column as ROWKEY: <prefix>_<key values joined by '_'>_<row index>
                df = df.assign(ID = df[rowKeyName].apply(lambda x:'_'.join(x.map(str)),axis=1))
                df = df.assign(ID = prefix+'_'+df['ID']+'_'+df.index.map(str))
                cols = df.columns.tolist()
                hbTable = '{}:{}'.format(self.Hbase_db,table)
                # double-quote the column names, as the table name is, so their case is preserved
                col_list = ', '.join('"{}"'.format(c) for c in cols)
                query_str = 'UPSERT INTO "{}"({}) VALUES ({})'.format(hbTable,col_list,','.join(['?']*len(cols)))
                cursor.executemany(query_str, df.to_numpy().tolist())
            finally:
                cursor.close()
        finally:
            conn.close()


    def executer(self,query_str):
        # Run a statement whose result rows are not needed (for example DDL or DELETE).
        conn = pdb.connect(self.url, autocommit=True)
        try:
            cursor = conn.cursor()
            try:
                cursor.execute(query_str)
            finally:
                cursor.close()
        finally:
            conn.close()
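To make the row key and statement that pushData produces easier to see, here is a pandas-free sketch of the same two steps. The helper names `build_row_key` and `build_upsert` are hypothetical and not part of the class above; they only mirror its string formatting.

```python
def build_row_key(prefix, key_values, row_index):
    """Mirror pushData: <prefix>_<key values joined by '_'>_<row index>."""
    joined = "_".join(str(v) for v in key_values)
    return "{}_{}_{}".format(prefix, joined, row_index)


def build_upsert(db, table, columns):
    """Build the parameterised UPSERT that pushData hands to executemany."""
    col_list = ", ".join('"{}"'.format(c) for c in columns)
    placeholders = ",".join(["?"] * len(columns))
    return 'UPSERT INTO "{}:{}"({}) VALUES ({})'.format(
        db, table, col_list, placeholders)


# Example values are made up for illustration.
row_key = build_row_key("p", ["alice", 30], 0)
statement = build_upsert("xx", "t", ["name", "age", "ID"])
```

Appending the row index keeps keys unique even when two rows share the same key column values, which matters because an UPSERT with an existing row key overwrites that row.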
