Shell scripting: removing duplicate lines with awk arrays

Original

2020-02-22 14:24

There are many ways to remove duplicate lines; three are covered here.

Test file:

[root@172-0-10-222 myscripts]# cat testfile
andy 123456
hanna 123456
hello world
welcome fuck
andy 123456
hello world
andy andy

In this file, the lines andy 123456 and hello world each appear twice.

(1) Removing duplicate lines with sort and uniq

[root@172-0-10-222 myscripts]# cat testfile | sort | uniq
andy 123456
andy andy
hanna 123456
hello world
welcome fuck

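Here the lines are first sorted with the default ordering, and then the duplicates are removed. The sort is needed because uniq only collapses adjacent identical lines; as a side effect, the original line order is lost.

(2) Removing duplicate lines with an awk array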
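As a rough illustration of why the sort step matters, the sketch below runs uniq with and without sorting on a three-line sample taken from the test file. The variable names (sample, uniq_only, sort_uniq, sort_u) are made up for this example. It also shows sort -u, which should give the same result as sort | uniq in a single command.

```shell
# Sample input: the two identical lines are NOT adjacent.
sample=$(printf '%s\n' 'andy 123456' 'hello world' 'andy 123456')

# uniq alone only collapses adjacent duplicates, so nothing is removed here.
uniq_only=$(printf '%s\n' "$sample" | uniq)

# Sorting first makes the duplicates adjacent, so uniq can drop one of them.
sort_uniq=$(printf '%s\n' "$sample" | sort | uniq)

# sort -u sorts and removes duplicates in one step.
sort_u=$(printf '%s\n' "$sample" | sort -u)

echo "uniq only:"; printf '%s\n' "$uniq_only"
echo "sort | uniq:"; printf '%s\n' "$sort_uniq"
echo "sort -u:"; printf '%s\n' "$sort_u"
```

If the input order has to be preserved, sorting is not an option, which is where the awk approach comes in.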

[root@172-0-10-222 myscripts]# cat testfile | awk '!arr[$0]++{print $0}'
andy 123456
hanna 123456
hello world
welcome fuck
andy andy

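This can also be shortened to cat testfile | awk '!arr[$0]++', since printing the line is awk's default action when a pattern is true.

Analysis: each line is used as an array subscript. The first time a subscript x appears, arr[x] is unset and evaluates to 0; on the second, third, ..., nth occurrence, arr[x] is nonzero. So !arr[$0]++ is true only for the first occurrence of a line, and that line is printed. The post-increment is applied after the value is tested, which records that the line has been seen.

(3) Removing duplicate lines with an awk array, verbose version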
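To make the evaluation order easier to follow, here is a minimal sketch that spells out the one-liner as an explicit if statement. The function name dedup_keep_first and the array name seen are chosen for this example only; the behaviour is intended to be the same as awk '!arr[$0]++'.

```shell
# Print only the first occurrence of each line, keeping input order.
# Reads the files given as arguments, or standard input if there are none.
dedup_keep_first() {
    awk '{
        # An element that has never been set compares equal to 0.
        if (seen[$0] == 0) {
            print $0
        }
        # Count this occurrence so later copies of the line are skipped.
        seen[$0]++
    }' "$@"
}

printf '%s\n' 'andy 123456' 'hanna 123456' 'andy 123456' 'hello world' \
    | dedup_keep_first
```

Because the array also ends up holding the number of occurrences of each line, the same structure could be reused to report counts in an END block.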

[root@172-0-10-222 myscripts]# cat testfile | awk '{arr[$0]=1}END{for(i in arr){print i}}'
welcome fuck
hanna 123456
andy 123456
hello world
andy andy

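Analysis: each line is used as an array subscript and assigned a value. A repeated line assigns to the same subscript again, so the array ends up with exactly one element per distinct line. The END block then prints the subscripts that remain. Note that the traversal order of for (i in arr) is unspecified in awk, which is why the output above does not follow the input order.

Example: removing lines with duplicate phone numbers

Phone number file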
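If the END-block style is preferred but the input order should be kept, one possible variation is to record the first-seen order in a second array indexed by a counter. This is a sketch, not part of the original method; dedup_end_ordered, seen, order and n are names invented for the example.

```shell
# Collect distinct lines while reading, then print them in first-seen order.
dedup_end_ordered() {
    awk '
        # Only act on a line the first time it is seen.
        !($0 in seen) {
            seen[$0] = 1
            order[++n] = $0   # remember its position among distinct lines
        }
        # Walk the numeric index instead of using for (i in arr),
        # whose traversal order is unspecified.
        END {
            for (i = 1; i <= n; i++) print order[i]
        }
    ' "$@"
}

printf '%s\n' 'hello world' 'andy 123456' 'hello world' 'andy andy' \
    | dedup_end_ordered
```

This uses roughly twice the memory of the one-liner in method (2), so it is mainly useful when other processing has to happen in the END block anyway.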

[root@172-0-10-222 myscripts]# cat testfile
andy 15871731153
hanna 15387876543
hello 15578765389
welcome 15871731153
andy 13987273647
hello 15871731153
andy 15871731153

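Remove the lines whose phone number (the second field) has already appeared, keeping the first line for each number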

[root@172-0-10-222 myscripts]# cat testfile | awk '!arr[$2]++'
andy 15871731153
hanna 15387876543
hello 15578765389
andy 13987273647
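The same idea works for any column. As a small sketch, the helper below takes the field number as its first argument and passes it to awk with -v; dedup_by_field, col and seen are hypothetical names used only here.

```shell
# Keep the first line for each distinct value of the given field.
# Usage: dedup_by_field FIELD_NUMBER [file...]
dedup_by_field() {
    col=$1
    shift
    # $col inside awk refers to the field whose number is stored in col.
    awk -v col="$col" '!seen[$col]++' "$@"
}

# Deduplicate the phone number file from the article by its second field.
printf '%s\n' \
    'andy 15871731153' \
    'hanna 15387876543' \
    'hello 15578765389' \
    'welcome 15871731153' \
    'andy 13987273647' \
    'hello 15871731153' \
    'andy 15871731153' \
    | dedup_by_field 2
```

Calling it with 1 instead of 2 would keep the first line for each name rather than for each number.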

Andy_Hanna
