Learning Big Data: Shell Tools (cut, sed, awk, sort)

cut

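cut does what its name suggests: it cuts data out of a file. The cut command extracts bytes, characters, or fields from each line of a file and writes them to standard output.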

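Basic usage
cut [options] filename
Note: the default delimiter is the tab character.
Options

Option	Function
-f N	Extract column (field) N
-d DELIM	Split columns on the given delimiter
-c	Select specific character positions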
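
The worked examples in this section always pass -d and never use -c, so here is a minimal sketch of the other two behaviours from the table: the default tab delimiter and character selection. The sample line and the variable names (line, second_field, first_chars) are made up for illustration.

```shell
# Build a one-line, tab-separated sample (hypothetical data, not from the article).
line=$(printf 'alpha\tbeta\tgamma')

# No -d given: cut splits on tabs, so field 2 is "beta".
second_field=$(printf '%s\n' "$line" | cut -f 2)
echo "$second_field"

# -c selects by character position: characters 1 through 5 are "alpha".
first_chars=$(printf '%s\n' "$line" | cut -c 1-5)
echo "$first_chars"
```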

Worked examples
(0) Prepare the data

[root@hadoop100 learnshellutil]# vim testcut.txt
[root@hadoop100 learnshellutil]# cat testcut.txt 
1 2 3 4
1 2 3 4
1 2 3 4
11 22 33 44

(1) Cut the first column of testcut.txt

[root@hadoop100 learnshellutil]# cut -d " " -f 1 testcut.txt 
1
1
1
11

(2) Cut the second and third columns of testcut.txt

[root@hadoop100 learnshellutil]# cut -d " " -f 2,3 testcut.txt 
2 3
2 3
2 3
22 33

(3) Cut the value 22 out of testcut.txt

[root@hadoop100 learnshellutil]# cat testcut.txt | grep 22 | cut -d " " -f 2
22

(4) Take the value of the system PATH variable and keep every path after the first ":" (that is, from the second field onward):

[root@hadoop100 learnshellutil]# echo $PATH
/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin

[root@hadoop100 learnshellutil]# echo $PATH | cut -d : -f 2-
/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin

(5) Cut the IP address out of the ifconfig output

[root@hadoop100 learnshell]# ifconfig | grep Bcast | cut -d : -f 2 | cut -d " " -f 1
192.168.1.100
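
One caveat when splitting on spaces, as the examples above do: cut treats every single delimiter character as a field boundary and does not merge runs of them. The sketch below uses a made-up row with two spaces between its columns. Squeezing the spaces with tr -s first is one possible workaround, not something the original examples need.

```shell
# Hypothetical line with two spaces between the columns.
row='11  22'

# Each single space is a delimiter, so field 2 is the empty string between the two spaces.
with_cut=$(printf '%s\n' "$row" | cut -d ' ' -f 2)
echo "cut field 2: [$with_cut]"

# Squeezing repeated spaces with tr first makes field 2 the value "22".
squeezed=$(printf '%s\n' "$row" | tr -s ' ' | cut -d ' ' -f 2)
echo "after tr -s: [$squeezed]"
```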

sed

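sed is a stream editor that processes one line at a time. It copies the current line into a temporary buffer called the "pattern space", applies the sed commands to the buffer, and then sends the buffer's contents to the screen. It then moves on to the next line and repeats until the end of the file. The file itself is not changed unless you redirect the output and save it.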
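
A small sketch of that last point, run in a temporary directory. The file names in.txt and out.txt are hypothetical. sed writes the edited text to standard output, which is redirected into a second file, while the input file keeps its original content.

```shell
# Work in a temporary directory so nothing real is touched.
workdir=$(mktemp -d)
printf '1 2 3 4\n11 22 33 44\n' > "$workdir/in.txt"

# sed prints the edited text to standard output; in.txt itself stays as it was.
sed 's/11/55/' "$workdir/in.txt" > "$workdir/out.txt"

original=$(cat "$workdir/in.txt")
edited=$(cat "$workdir/out.txt")
echo "$original"
echo "$edited"
rm -r "$workdir"
```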

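Basic usage
sed [options] 'command' filename
Options

Option	Function
-e	Give the sed editing command directly on the command line
-i	Edit the file in place

Commands

Command	Description
a	Append: the text after a is added as a new line below the addressed line
d	Delete
s	Find and replace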

Give the address (the range of lines) first and the command second. A '/regular expression/' can be used as the address.
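
A minimal sketch of both kinds of address, using the same four rows as testsed.txt but fed through a pipe so that no file is needed. The variable names are illustrative only.

```shell
sample=$(printf '1 2 3 4\n1 2 3 4\n1 2 3 4\n11 22 33 44')

# Line-number range: delete lines 2 through 3.
by_lines=$(printf '%s\n' "$sample" | sed '2,3d')
echo "$by_lines"

# Regex address: run the substitution only on lines that match /^11/.
by_regex=$(printf '%s\n' "$sample" | sed '/^11/s/22/XX/')
echo "$by_regex"
```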

Worked examples
(0) Prepare the data

[root@hadoop100 learnshellutil]# cat testsed.txt 
1 2 3 4
1 2 3 4
1 2 3 4
11 22 33 44

(1) Insert the text "one two" below the second line of testsed.txt and print the result.

[root@hadoop100 learnshellutil]# sed '2a one two' testsed.txt 
1 2 3 4
1 2 3 4
one two
1 2 3 4
11 22 33 44
[root@hadoop100 learnshellutil]# cat testsed.txt 
1 2 3 4
1 2 3 4
1 2 3 4
11 22 33 44

Note: the file itself is unchanged; it is modified only when the -i option is used.
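
For comparison, a hedged sketch of -i on a throwaway file (demo.txt is a hypothetical name). It uses the suffix form -i.bak, which GNU sed accepts and which keeps a backup copy of the previous content. The behaviour of a bare -i differs between sed implementations, so the suffix form is used here as the safer assumption.

```shell
workdir=$(mktemp -d)
printf '1 2 3 4\n11 22 33 44\n' > "$workdir/demo.txt"

# -i.bak edits demo.txt in place and keeps the previous content in demo.txt.bak.
sed -i.bak 's/11/55/' "$workdir/demo.txt"

after_edit=$(cat "$workdir/demo.txt")
backup=$(cat "$workdir/demo.txt.bak")
echo "$after_edit"
echo "$backup"
rm -r "$workdir"
```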

(2) Delete every line of testsed.txt that contains 11

[root@hadoop100 learnshellutil]# sed '/11/d' testsed.txt 
1 2 3 4
1 2 3 4
1 2 3 4

(3) Replace 11 with 55 in testsed.txt

[root@hadoop100 learnshellutil]# sed 's/11/55/g' testsed.txt 
1 2 3 4
1 2 3 4
1 2 3 4
55 22 33 44

Note: 'g' stands for global. Every match on a line is replaced, not just the first one.
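
The sample file has only one "11" per line, so the effect of g is not visible there. A minimal sketch with a made-up input line that contains the pattern three times:

```shell
# Without g, only the first match on each line is replaced.
first_only=$(echo '11 11 11' | sed 's/11/55/')
echo "$first_only"

# With g, every match on the line is replaced.
all_matches=$(echo '11 11 11' | sed 's/11/55/g')
echo "$all_matches"
```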
(4) Delete the second line of testsed.txt and replace every 1 with 5

[root@hadoop100 learnshellutil]# sed -e '2d' -e 's/1/5/g' testsed.txt 
5 2 3 4
5 2 3 4
55 22 33 44

awk

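awk is a powerful text-analysis tool. It reads a file line by line, splits each line into fields using whitespace (spaces and tabs) as the default separator, and then analyzes and processes the resulting fields.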

Basic usage
awk [options] 'pattern1{action1} pattern2{action2}...' filename
pattern: what awk looks for in the data, that is, the matching pattern
action: the commands to run when a match is found
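
A minimal sketch of a program with two pattern{action} pairs, run on made-up records (a name and a number per line). Each input line is tested against every pattern in turn, and a pattern can be a regular expression or a comparison.

```shell
# Hypothetical two-column input: a name and a number.
records=$(printf 'root 0\nbin 1\nwhy 501')

# First rule fires on lines starting with "root"; second rule fires when column 2 exceeds 100.
report=$(printf '%s\n' "$records" | awk '/^root/{print "admin:" $1} $2 > 100{print "regular:" $1}')
echo "$report"
```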
Options

Option	Function
-F	Specify the input field separator
-v	Assign a value to a user-defined variable

Worked examples
(0) Prepare the data

[root@hadoop100 learnshellutil]# cp /etc/passwd ./testawk.txt

(1) Find every line of testawk.txt that starts with root and print column 7 of that line.

[root@hadoop100 learnshellutil]# awk -F : '/^root/{print $7}' testawk.txt 
/bin/bash

(2) Find every line of testawk.txt that starts with root and print columns 1 and 7 of that line, separated by ",".

[root@hadoop100 learnshellutil]# awk -F : '/^root/{print $1","$7}' testawk.txt 
root,/bin/bash

Note: the action runs only on lines that match the pattern.
(3) Print only the first and seventh columns of /etc/passwd, separated by a comma, with the header line "user,shell" before all rows and "XXX" after the last row.

[root@hadoop100 learnshellutil]# awk -F : 'BEGIN{print "user,shell"} {print $1","$7} END{print "XXX"}' testawk.txt 
user,shell
root,/bin/bash
bin,/sbin/nologin
...
sshd,/sbin/nologin
tcpdump,/sbin/nologin
why,/bin/bash
XXX

Note: BEGIN runs before any input line is read; END runs after all input lines have been processed.
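
A common use of this ordering is to accumulate a value in the main rule and report it in END. The sketch below borrows the three rows of testsort.txt used later in this article and sums the third column. The variable names rows, summary and sum are illustrative.

```shell
# The three rows of testsort.txt, fed through a pipe instead of a file.
rows=$(printf '1:2:3\n6:58:99\n45:24:72')

# BEGIN prints a header once, the main rule accumulates column 3, END prints the total.
summary=$(printf '%s\n' "$rows" | awk -F: 'BEGIN{print "third column"} {sum += $3} END{print "total=" sum}')
echo "$summary"
```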
(4) Add 1 to every user id in the passwd file and print the result

[root@hadoop100 learnshellutil]# awk -v i=1 -F: '{print$3+i}' testawk.txt 
1
2
...
498
75
73
501

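awk built-in variables

Variable	Description
FILENAME	Name of the current input file
NR	Number of records (lines) read so far
NF	Number of fields in the current record (the column count after splitting)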
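
A minimal sketch of NR and NF on made-up input with a varying number of columns. It also uses $NF, which is not in the table above: because NF holds the field count, $NF refers to the last field of the current record.

```shell
# Two hypothetical lines: one with three fields, one with two.
fields=$(printf 'a b c\nd e\n' | awk '{print NR ": " NF " fields, last=" $NF}')
echo "$fields"
```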

Worked examples
(1) Print the file name, the line number of each line, and the number of columns in each line of the passwd copy

[root@hadoop100 learnshellutil]# awk -F: '{print "filename:" FILENAME ",linenumber:" NR ",columns:" NF}' testawk.txt 
filename:testawk.txt,linenumber:1,columns:7
filename:testawk.txt,linenumber:2,columns:7
filename:testawk.txt,linenumber:3,columns:7
filename:testawk.txt,linenumber:4,columns:7

(2) Cut out the IP address

[root@hadoop100 learnshellutil]# ifconfig eth0 | grep "inet addr" | awk -F: '{print $2}' | awk -F " " '{print $1}'
192.168.1.100

(3) Find the line numbers of the empty lines in testsed.txt

[root@hadoop100 learnshellutil]# cat testsed.txt 
1 2 3 4
1 2 3 4
1 2 3 4
11 22 33 44


[root@hadoop100 learnshellutil]# awk '/^$/{print NR}' testsed.txt 
5
6

sort

sort is a very useful Linux command: it sorts the lines of a file and writes the sorted result to standard output.

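Basic syntax
sort [options] [files]

Option	Description
-n	Sort by numeric value
-r	Sort in reverse order
-t	Set the field delimiter used for sorting
-k	Specify the column (key) to sort by

Argument: the list of files to sort
Worked examples
(0) Prepare the data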

[root@hadoop100 learnshellutil]# vim testsort.txt
[root@hadoop100 learnshellutil]# cat testsort.txt 
1:2:3
6:58:99
45:24:72

(1) Split on ":" and sort by the third column in descending order.

[root@hadoop100 learnshellutil]# sort -t: -nrk 3 testsort.txt 
6:58:99
45:24:72
1:2:3
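
Two details behind that command, sketched on small inputs. The three numbers are made up; the colon-separated rows are those of testsort.txt. Without -n, sort compares lines as text (LC_ALL=C is set here only to make the text ordering predictable). A key written as -k 3 runs from column 3 to the end of the line, while -k3,3 restricts it to column 3 alone.

```shell
numbers=$(printf '10\n9\n2')

# Without -n the lines are compared as text, so "10" sorts before "2".
as_text=$(printf '%s\n' "$numbers" | LC_ALL=C sort)
# With -n they are compared as numbers.
as_numbers=$(printf '%s\n' "$numbers" | sort -n)
echo "$as_text"
echo "$as_numbers"

# -k3,3 limits the key to column 3; the n and r modifiers apply to that key only.
by_third=$(printf '1:2:3\n6:58:99\n45:24:72\n' | sort -t: -k3,3nr)
echo "$by_third"
```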

Adapted from the 尚硅谷 (Atguigu) course material; the command examples were run by hand by the blog author.
