Impala: /tmp/impala-scratch creation error on large queries

When running a DISTINCT query over a large data set with Impala, the following error appeared:

5ab149d_24414dab2c19caca:e54b206c5ab149f_91001337-9d70-4c93-84ce-e7916c1ae804 failed with errno=2 description=Error(2): No such file or directory
Backend 4:Create file /tmp/impala-scratch/24414dab2c19caca:e54b206c5ab149d_24414dab2c19caca:e54b206c5ab149f_91001337-9d70-4c93-84ce-e7916c1ae804 failed with errno=2 description=Error(2): No such file or directory


The documentation shows that Impala writes intermediate data to disk when processing large data volumes:

By default, intermediate files used during large sort, join, aggregation, or analytic function operations are stored in the directory /tmp/impala-scratch. These files are removed when the operation finishes. (Multiple concurrent queries can perform operations that use the "spill to disk" technique, without any name conflicts for these temporary files.) You can specify a different location by starting the impalad daemon with the --scratch_dirs="path_to_directory" configuration option or the equivalent configuration option in the Cloudera Manager user interface. You can specify a single directory, or a comma-separated list of directories. The scratch directories must be on the local filesystem, not in HDFS. You might specify different directory paths for different hosts, depending on the capacity and speed of the available storage devices. Impala will not start if it cannot create or read and write files in the "scratch" directory. If there is less than 1 GB free on the filesystem where that directory resides, Impala still runs, but writes a warning message to its log.
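As the quoted passage notes, the scratch location can be changed with the --scratch_dirs option. One reason to do so is that /tmp may be cleaned periodically on some systems, which could remove the directory again. The fragment below is only a sketch of how such a setting might look. The two /data paths are hypothetical, and placing the flag in IMPALA_SERVER_ARGS assumes a package-based install that reads its startup flags from a defaults file. Under Cloudera Manager the equivalent setting is made in the UI instead.

```shell
# Hypothetical local directories on separate disks (not from the original post).
# They must be on the local filesystem, not in HDFS.
SCRATCH_DIRS="/data1/impala-scratch,/data2/impala-scratch"

# Assumption: impalad startup flags are collected in IMPALA_SERVER_ARGS.
# --scratch_dirs accepts a single directory or a comma-separated list.
IMPALA_SERVER_ARGS="--scratch_dirs=${SCRATCH_DIRS}"

echo "impalad would be started with: ${IMPALA_SERVER_ARGS}"
```

impalad has to be restarted for a changed --scratch_dirs value to take effect, because the option is read at daemon startup.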


The fix is to create the /tmp/impala-scratch directory on each impalad node and grant read/write permissions on it:

mkdir -p /tmp/impala-scratch
chmod 777 /tmp/impala-scratch
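Since the two commands above have to be repeated on every impalad node, they can be wrapped in a small script that is safe to run more than once. This is a minimal sketch, not part of the original fix. The function name ensure_scratch_dir and the SCRATCH_DIR variable are made up for illustration, and the write check is an extra step added here.

```shell
# ensure_scratch_dir DIR
# Creates DIR if it is missing, opens its permissions as in the commands above,
# and confirms that a file can actually be created inside it.
ensure_scratch_dir() {
    dir="$1"
    # -p: no error if the directory already exists on this node
    mkdir -p "$dir" || return 1
    chmod 777 "$dir" || return 1
    # Write check: create and remove a throwaway file
    probe="$dir/.write_test.$$"
    touch "$probe" || return 1
    rm -f "$probe"
}

# Default matches Impala's default scratch location; override with SCRATCH_DIR.
ensure_scratch_dir "${SCRATCH_DIR:-/tmp/impala-scratch}" \
    && echo "scratch directory ready: ${SCRATCH_DIR:-/tmp/impala-scratch}"
```

The script has to be run on each impalad host, for example over ssh. Mode 777 mirrors the original commands. Restricting the directory to the user that runs impalad would be tighter, but that user name depends on the installation.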

