I have already covered setting up Hadoop in pseudo-distributed and fully distributed mode; see "Hadoop Cluster Setup in the Big Data Era".
I originally planned to jump straight into operating HDFS from Java, but it is worth a quick review of the HDFS command line first. In my experience, working with HDFS feels very similar to working with files on a Linux system, except that every command is prefixed with `hdfs dfs -` or `hadoop fs -`, for example:
# create a directory
hdfs dfs -mkdir <path>
or
hadoop fs -mkdir /test
This naturally raises the question: what is the difference between hadoop fs and hdfs dfs?
See: Hadoop: the difference between the hadoop fs, hadoop dfs, and hdfs dfs commands.
Back to the main topic. The following shows how to operate the HDFS file system from Java.
pom.xml — keep the Hadoop dependency version the same as the Hadoop version installed on the server whenever possible:
<properties>
    <hadoop-version>2.6.5</hadoop-version>
</properties>
<dependencies>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>${hadoop-version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>${hadoop-version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-hdfs</artifactId>
        <version>${hadoop-version}</version>
    </dependency>
</dependencies>
HdfsUtil.java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class HdfsUtil {

    private Configuration configuration;
    private FileSystem fileSystem;

    /**
     * Initializes a util instance bound to the given HDFS url and user.
     * @param url  e.g. hdfs://host:9000
     * @param user the HDFS user to act as
     * @return an initialized HdfsUtil
     * @throws IOException
     * @throws InterruptedException
     */
    public static HdfsUtil getUtil(String url, String user) throws IOException, InterruptedException {
        HdfsUtil util = new HdfsUtil();
        util.configuration = new Configuration();
        util.fileSystem = FileSystem.get(URI.create(url), util.configuration, user);
        return util;
    }
    /**
     * Creates a directory (including any missing parents).
     * @param filePath the directory to create
     * @return true if the directory was created
     */
    public boolean createPath(String filePath) {
        boolean b = false;
        Path path = new Path(filePath);
        try {
            b = this.fileSystem.mkdirs(path);
        } catch (IOException e) {
            e.printStackTrace();
        }
        // note: the FileSystem is deliberately not closed here, so the same
        // HdfsUtil instance can be reused; close it when you are finished
        return b;
    }
    /**
     * Checks whether the given path exists, and whether it is a file or a directory.
     * Author: lwl [email protected] 2018/12/25 15:08
     * @return int 0: does not exist, 1: file, 2: directory
     */
    public int checkFile(String filePath) {
        Path path = new Path(filePath);
        int result = 0;
        try {
            if (this.fileSystem.exists(path)) {
                result = this.fileSystem.isDirectory(path) ? 2 : 1;
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        return result;
    }
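For comparison, the same three-way check (0/1/2 return convention) can be written against the local file system with the JDK's java.nio.file API. This is only an illustration of the semantics of checkFile above, not part of the HDFS util; the class name LocalCheckFile is made up for the example:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class LocalCheckFile {

    // 0: does not exist, 1: regular file, 2: directory -- same codes as checkFile above
    static int check(Path p) {
        if (!Files.exists(p)) {
            return 0;
        }
        return Files.isDirectory(p) ? 2 : 1;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("demo");
        Path file = Files.createTempFile("demo", ".txt");
        System.out.println(check(dir));                   // 2
        System.out.println(check(file));                  // 1
        System.out.println(check(Paths.get("/no/such"))); // 0
    }
}
```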
    /**
     * Uploads a local file to HDFS.
     * @param sourcePath local source path
     * @param savePath   destination path on HDFS
     */
    public void uploadFile(String sourcePath, String savePath) {
        Path source = new Path(sourcePath);
        Path dest = new Path(savePath);
        try {
            this.fileSystem.copyFromLocalFile(source, dest);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
    /**
     * Uploads from an input stream to HDFS.
     * @param input    source stream (not closed by this method)
     * @param savePath destination path on HDFS
     */
    public void uploadFile(InputStream input, String savePath) throws IOException {
        // create() creates (and overwrites) the target file itself,
        // so no separate createNewFile() call is needed
        Path inFile = new Path(savePath);
        FSDataOutputStream output = this.fileSystem.create(inFile);
        IOUtils.copyBytes(input, output, 1024 * 1024 * 64, false);
        output.close();
    }
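IOUtils.copyBytes above simply pumps bytes through a fixed-size buffer (64 MB in this example) and, with the last argument false, leaves the streams open for the caller. A plain-JDK sketch of that loop, to make the buffer semantics explicit (CopyBytesDemo is a made-up name for this illustration):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class CopyBytesDemo {

    // minimal equivalent of IOUtils.copyBytes(in, out, bufferSize, false):
    // reads chunks until EOF, writes each chunk, leaves both streams open
    static void copyBytes(InputStream in, OutputStream out, int bufferSize) throws IOException {
        byte[] buf = new byte[bufferSize];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello hdfs".getBytes();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        copyBytes(new ByteArrayInputStream(data), out, 4096);
        System.out.println(out.toString()); // prints "hello hdfs"
    }
}
```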
    /**
     * Downloads a file from HDFS into the given output stream.
     * @param sourcePath source path on HDFS
     * @param out        destination stream (not closed by this method)
     * @throws IOException
     */
    public void downloadFile(String sourcePath, OutputStream out) throws IOException {
        // open the existing HDFS file for reading
        Path inFile = new Path(sourcePath);
        FSDataInputStream input = this.fileSystem.open(inFile);
        IOUtils.copyBytes(input, out, 1024 * 1024 * 64, false);
        input.close();
    }
    public static void main(String[] args) throws IOException, InterruptedException {
        // note: the port (9000) must match the fs.defaultFS setting in core-site.xml
        String url = "hdfs://my-cdh-master:9000";
        HdfsUtil util = HdfsUtil.getUtil(url, "root");
        util.uploadFile("H:\\VMachines\\my_cdh_slave1\\vmware.log", "/test/vmware.log");
        util.fileSystem.close(); // release the connection when finished
    }
}
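As the comment in main notes, the port in the URL must match the fs.defaultFS entry in the cluster's core-site.xml. A minimal fragment of that file (hostname and port taken from the example above; your values will differ) looks like:

```xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://my-cdh-master:9000</value>
  </property>
</configuration>
```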
For more HDFS operations, consult the API documentation as your needs require.
And that wraps up operating HDFS from Java~~~~