CentOS 7: importing MySQL data into Hive with Sqoop and building a cube in Kylin (workflow and error fixes)

The environment is already fully configured. Setup guide: https://blog.csdn.net/qq_38929220/article/details/95481002

Starting Kylin failed with this error:
[root@hadoop ~]# /hadoop/kylin/bin/kylin.sh start
Retrieving hadoop conf dir...
KYLIN_HOME is set to /hadoop/kylin
Retrieving hive dependency...
Something wrong with Hive CLI or Beeline, please execute Hive CLI or Beeline CLI in terminal to find the root cause.

Edit find-hive-dependency.sh: delete the line hive_env=`hive ${hive_conf_properties} -e set 2>&1 | grep 'env:CLASSPATH'` and add the following lines in its place. After that, Kylin starts successfully.

hive -e set >/tmp/hive_env.txt 2>&1
hive_env=`grep 'env:CLASSPATH' /tmp/hive_env.txt`
hive_env=`echo ${hive_env#*env:CLASSPATH}`
hive_env="env:CLASSPATH"${hive_env}
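To see what those replacement lines do, here is a minimal sketch that runs the same three steps on a sample file instead of real `hive -e set` output. The function name extract_hive_classpath, the sample log line, and the paths in it are made up for illustration.

```shell
#!/usr/bin/env bash
# extract_hive_classpath FILE
# Applies the same three steps as the patch to a file holding `hive -e set` output.
extract_hive_classpath() {
  local hive_env
  hive_env=$(grep 'env:CLASSPATH' "$1")
  # Drop everything up to and including the marker (e.g. a log prefix on the same line).
  hive_env=$(echo ${hive_env#*env:CLASSPATH})
  # Put the marker back at the front.
  hive_env="env:CLASSPATH"${hive_env}
  echo "$hive_env"
}

# Hypothetical sample: real output is much longer and the paths will differ.
sample_file=$(mktemp)
cat > "$sample_file" <<'EOF'
SLF4J: Class path contains multiple SLF4J bindings.
WARN some log prefix env:CLASSPATH=/hadoop/hive/conf:/hadoop/hive/lib/hive-exec.jar
env:HOME=/root
EOF

extract_hive_classpath "$sample_file"
rm -f "$sample_file"
```

Writing the output to a file first and then stripping whatever precedes env:CLASSPATH leaves a clean value even when log text ends up on the same line.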

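Kylin was shut down unexpectedly while a cube was being created, and the jobhistoryserver was restarted. Kylin then reported:
Couldn't find hive executable jar. Please check if hive executable jar exists in HIVE_LIB folder.
Trying to start Hive showed:
Hive query error: Cannot create directory /tmp/hive-root/... Name node is in safe mode.
Leave safe mode:
hdfs dfsadmin -safemode leave
After that, Kylin starts normally.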
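Before forcing the NameNode out of safe mode, it can help to check the current state with hdfs dfsadmin -safemode get. A minimal sketch, assuming the command reports "Safe mode is ON" or "Safe mode is OFF"; the helper name safemode_is_on is made up, and the sample text stands in for the real command so the snippet runs without a cluster.

```shell
#!/usr/bin/env bash
# safemode_is_on STATUS_TEXT
# Returns 0 when the text reports safe mode as ON, 1 otherwise.
safemode_is_on() {
  case "$1" in
    *"Safe mode is ON"*) return 0 ;;
    *) return 1 ;;
  esac
}

# On a real cluster (assumption: hdfs is on PATH) the text would come from:
#   status=$(hdfs dfsadmin -safemode get)
status="Safe mode is ON"   # sample text standing in for the real command output

if safemode_is_on "$status"; then
  echo "NameNode is in safe mode; run: hdfs dfsadmin -safemode leave"
else
  echo "NameNode is not in safe mode"
fi
```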

Sqoop commands

List all databases in MySQL

sqoop list-databases --connect jdbc:mysql://hostname:3306?useSSL=false --username fmmanager -P

List the tables in a database

sqoop list-tables --connect jdbc:mysql://hostname:3306/database?useSSL=false --username fmmanager -P

Import all tables of the database:

sqoop import-all-tables "-Dorg.apache.sqoop.splitter.allow_text_splitter=true" --connect jdbc:mysql://hostname:3306/database?useSSL=false --username fmmanager -P

Import all tables of the database into Hive, except one:

sqoop import-all-tables "-Dorg.apache.sqoop.splitter.allow_text_splitter=true" --connect jdbc:mysql://hostname:3306/database?useSSL=false --username fmmanager -P --exclude-tables table --fields-terminated-by '\t' --hive-import -m 1

Sqoop fails because java-json.jar is missing. Download the jar (http://www.java2s.com/Code/Jar/j/Downloadjavajsonjar.htm) and add java-json.jar to the …/sqoop/lib directory.

Import failed: java.io.IOException: Generating splits for a textual index column allowed only in case of "-Dorg.apache.sqoop.splitter.allow_text_splitter=true" property passed as a parameter
Fix: pass "-Dorg.apache.sqoop.splitter.allow_text_splitter=true" immediately after the tool name, as in the import-all-tables commands above.

ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: Hive exited with status 1
Fix: find the lib folder under HIVE_HOME and copy libthrift-0.9.3.jar from it into the lib folder under SQOOP_HOME.

Error: Could not find or load main class org.apache.sqoop.sqoop
Take sqoop-1.4.6.jar and put it in Hadoop's lib directory.
If that still does not help, open the sqoop script in SQOOP_HOME/bin (vi sqoop) and change the last line. Comment out
#exec ${HADOOP_COMMON_HOME}/bin/hadoop org.apache.sqoop.Sqoop "$@"
and replace it with
exec ${HADOOP_COMMON_HOME}/bin/hadoop jar $SQOOP_HOME/lib/sqoop-1.4.6.jar org.apache.sqoop.Sqoop "$@"

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/avro/Logical
Replace the original jars below with the avro-1.8.1.jar that ships with Sqoop:
${HADOOP_HOME}/lib/avro-1.7.7.jar
$HBASE_HOME/lib/avro-1.7.7.jar

ERROR bonecp.BoneCP: Unable to start/stop JMX java.security.AccessControlException: access denied ("javax.management.MBeanTrustPermission" "register")
Fix: edit %JAVA_HOME%\jre\lib\security\java.policy and add the following inside the grant block:
permission javax.management.MBeanTrustPermission "register";

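Loading data from HDFS into Hive

The first statement loads from the local filesystem (local inpath), the second from HDFS:

load data local inpath '/usr/root/' into table test. ;
load data inpath 'hdfs://localhost/user/root/' into table test.;

When Sqoop imports a table, its output shows the table schema, which can be copied directly to create the table: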

CREATE TABLE test.energymanage_electricity_circuit(
 id INT,
 number INT,
 name STRING,
 is_spare BOOLEAN, 
 quantity_unit STRING, 
 quantity DOUBLE, 
 branch_number STRING, 
 magnification STRING, 
 meter_address INT, 
 zone_id INT, 
 room_id INT, 
 control_cabinet STRING, 
 cabinet_number STRING, 
 endlevel_id INT, 
 firstlevel_id INT, 
 address INT, 
 ip STRING, 
 port INT, 
 Instrument_number STRING, 
 busy_end STRING, 
 busy_start STRING, 
 ref_power_density DOUBLE, 
 status BOOLEAN, 
 facility_id INT)
row format delimited fields terminated by ',';

Error when importing into Hive
http://blog.itpub.net/25854343/viewspace-2565234/
With Hive + Sqoop, mismatched jackson versions cause java.lang.NoSuchMethodError: com.fasterxml.jackson.databind.ObjectMapper.
Back up the $SQOOP_HOME/lib/jackson*.jar files (rename them to .bak), copy $HIVE_HOME/lib/jackson*.jar into $SQOOP_HOME/lib, and rerun the Sqoop job.
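The backup-and-copy step can be scripted. A minimal sketch, assuming the two lib directories are passed as arguments; the function name swap_jackson_jars and the jar version numbers in the dry run are made up for illustration, and the dry run uses throwaway directories rather than a real installation.

```shell
#!/usr/bin/env bash
# swap_jackson_jars SQOOP_LIB HIVE_LIB
# Renames SQOOP_LIB/jackson*.jar to *.jar.bak, then copies HIVE_LIB/jackson*.jar over.
swap_jackson_jars() {
  local sqoop_lib=$1 hive_lib=$2 jar
  for jar in "$sqoop_lib"/jackson*.jar; do
    [ -e "$jar" ] || continue   # the glob matched nothing
    mv "$jar" "$jar.bak"
  done
  for jar in "$hive_lib"/jackson*.jar; do
    [ -e "$jar" ] || continue
    cp "$jar" "$sqoop_lib"/
  done
}

# Dry run on temporary directories with made-up jar names.
demo=$(mktemp -d)
mkdir -p "$demo/sqoop/lib" "$demo/hive/lib"
touch "$demo/sqoop/lib/jackson-databind-2.3.1.jar" "$demo/hive/lib/jackson-databind-2.6.5.jar"
swap_jackson_jars "$demo/sqoop/lib" "$demo/hive/lib"
ls "$demo/sqoop/lib"
rm -rf "$demo"
```

On a real machine the call would be swap_jackson_jars "$SQOOP_HOME/lib" "$HIVE_HOME/lib". Keeping the .bak files makes it easy to roll back.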

Hive commands

Drop a database

drop database if exists db_name;

Force-drop a database (including its tables)

drop database if exists db_name cascade;

Drop a table

drop table if exists employee;

Empty a table

truncate table employee;
or
insert overwrite table employee select * from employee where 1=0;

Drop partitions

alter table employee_table drop partition (stat_year_month>='2018-01');

Delete rows by condition (insert overwrite keeps only the rows that match the where clause)

insert overwrite table employee_table select * from employee_table where id>'180203a15f';

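HDFS commands

Create a directory

hdfs dfs -mkdir /home

Upload a file or directory to HDFS

hdfs dfs -put hello /
hdfs dfs -put hellodir/ /

List a directory

hdfs dfs -ls /

Create an empty file

hdfs dfs -touchz /file

Delete a file

hdfs dfs -rm /file

Delete a directory

hdfs dfs -rm -r /home

Rename

hdfs dfs -mv /file1 /file2

View a file

hdfs dfs -cat /file

Merge everything under the specified directory into a single file and download it to the local filesystem

hdfs dfs -getmerge /filedir /root

Show file and directory sizes with du

hdfs dfs -du /

Copy a directory to the local filesystem

hdfs dfs -copyToLocal /home localdir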

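Using Kylin

https://cenrise.com/2017/04/16/hadoop/Kylin%E5%85%A5%E9%97%A8%E6%A6%82%E5%BF%B5/
Dimensions are the angles from which data is observed. E-commerce sales data, for example, can be viewed along the time dimension, or refined further and viewed along both time and region. A dimension is usually a set of discrete values, such as each individual date on the time dimension or each individual item on the product dimension. Records with the same dimension values can therefore be grouped together and then aggregated with functions such as sum, average, or distinct count.
Measures are the statistics being aggregated, that is, the results of the aggregation. They are usually continuous values, such as the sales amount in Figure 1-2 or the total number of items sold. By comparing and evaluating measures, analysts can assess the data: how much this year's sales grew compared with last year, whether the growth rate met expectations, whether the growth ratios across product categories are reasonable, and so on.

Given a data model, all of its dimensions can be combined. For N dimensions there are 2^N possible combinations. For each combination of dimensions, the measures are aggregated and the result is stored as a materialized view called a Cuboid. The Cuboids of all dimension combinations, taken as a whole, are called a Cube. Put simply, a Cube is a collection of materialized views aggregated by dimension.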
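The 2^N count is easy to check by listing the combinations. A minimal sketch that enumerates every dimension combination (one per Cuboid) with a bitmask; the function name list_cuboids and the dimension names time, region, and product are illustrative only, and this is not how Kylin itself generates Cuboids.

```shell
#!/usr/bin/env bash
# list_cuboids DIM...
# Prints one line per dimension combination: 2^N lines for N dimensions.
list_cuboids() {
  local dims=("$@")
  local n=${#dims[@]}
  local mask i combo
  for ((mask = 0; mask < (1 << n); mask++)); do
    combo=""
    for ((i = 0; i < n; i++)); do
      # Bit i of mask decides whether dimension i is part of this combination.
      if (( (mask >> i) & 1 )); then
        combo="${combo:+$combo,}${dims[i]}"
      fi
    done
    echo "(${combo})"
  done
}

list_cuboids time region product
```

With three dimensions this prints 8 lines, from the empty combination () up to (time,region,product).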

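Apache Kylin works by precomputing a Cube over the data model and using the results to speed up queries. The process is:
1) Specify the data model and define the dimensions and measures.
2) Precompute the Cube: compute all Cuboids and store them as materialized views.
3) At query time, read the Cuboids, compute on them, and produce the query result.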

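Log in to Kylin

Click '+' to create a new project

In that project, import the tables (the ones written to Hive earlier)

Click Model and create a new model

Select the corresponding tables

Click "Next" to reach the "Configuration Overwrites" page, click "+Property", and add the property "kylin.engine.spark.rdd-partition-cut-mb" with the value "500", for the following reason:

The sample cube has two memory-hungry measures: "COUNT DISTINCT" and "TOPN(100)". When the source data is small, their size estimates are inaccurate: the estimated size is much larger than the real size, so more RDD partitions get split and the build slows down. 500 is a more reasonable value for this cube. Click "Next" and then "Save" to save the cube.

For cubes without "COUNT DISTINCT" or "TOPN", keep the default configuration.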
