Finding a hidden English sentence in a jumble of random letters (Python)

Original

T3rry7f

2018-09-05 03:00

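I came across an interesting challenge on a CTF site. The gist: a meaningful English sentence has been inserted into a file of randomly generated letters, and the file is large enough that finding the sentence by eye is practically impossible.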

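The approach:

1. Find a list of common English words online and use it as a dictionary. It does not need to be large; the 2,000-3,000 most common words are enough.

2. Record every position at which each dictionary word occurs in the random text, then sort the positions and remove duplicates.

3. Assume the sentence has at least 12 words and that no word is longer than 10 letters. With those constraints, the candidate sentences can be picked out of the position list.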
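Steps 2 and 3 can be illustrated on a toy string. The following is a minimal sketch, not the script used for the challenge: the helper names `hit_positions` and `dense_runs`, the sample text, and the thresholds are all made up for illustration, and it assumes the hidden sentence is written without spaces.

```python
def hit_positions(text, words):
    """Return the sorted, de-duplicated start positions of every word in text."""
    positions = set()
    for word in words:
        start = text.find(word)
        while start != -1:
            positions.add(start)
            start = text.find(word, start + 1)
    return sorted(positions)


def dense_runs(positions, min_hits, max_gap):
    """Return (first, last) position pairs for runs of at least min_hits hits
    whose neighbouring hits are fewer than max_gap characters apart."""
    runs = []
    run_start = 0  # index into positions where the current run began
    for i in range(1, len(positions) + 1):
        # A run ends at the end of the list or when the gap gets too large.
        if i == len(positions) or positions[i] - positions[i - 1] >= max_gap:
            if i - run_start >= min_hits:
                runs.append((positions[run_start], positions[i - 1]))
            run_start = i
    return runs


# Toy data (hypothetical): a short sentence buried in noise.
text = "xqzvkjwthecatsatonthematqzxvkjw"
words = ["the", "cat", "sat", "on", "mat"]
positions = hit_positions(text, words)
print(positions)                                      # [7, 10, 13, 16, 18, 21]
print(dense_runs(positions, min_hits=4, max_gap=5))   # [(7, 21)]
```

In random text, dictionary hits are scattered, so a long run of closely spaced hits is a strong hint that real words are sitting next to each other.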

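I threw together a quick program and ran it. It produced output in roughly 10 seconds and left about 40 candidates, from which the answer was obvious at a glance.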

import os
import time

WORD_COUNT = 12    # number of closely spaced hits to follow before accepting a candidate
WORD_MAX_LEN = 9   # consecutive hits must start fewer than this many characters apart


def find_index(index, index_list, found_count, pos_list):
    """Follow a chain of closely spaced hits; record the position where a full chain ends."""
    if found_count < WORD_COUNT:
        if index_list[index + 1] - index_list[index] < WORD_MAX_LEN:
            find_index(index + 1, index_list, found_count + 1, pos_list)
    else:
        pos_list.append(index_list[index])


start_time = time.perf_counter()
path = os.path.split(os.path.realpath(__file__))[0]

with open(path + "/find.txt") as f:   # the text to search
    text_all = f.read()
with open(path + "/dic.txt") as f:    # the dictionary, one word per line
    dic = [line.strip() for line in f if line.strip()]   # skip blank lines

# Record every position at which any dictionary word occurs.
positions = set()
for word in dic:
    start = 0
    while True:
        index = text_all.find(word, start)
        if index == -1:
            break
        positions.add(index)
        start = index + 1

# Sort and de-duplicate in Python instead of piping through `sort -n | uniq`.
index_list = sorted(positions)
with open(path + "/output.txt", "w") as f:   # output file
    f.writelines("%d\n" % p for p in index_list)

pos_list = []
for i in range(len(index_list) - WORD_COUNT):
    find_index(i, index_list, 0, pos_list)   # recursive search

print("Use time: %s" % (time.perf_counter() - start_time), "find result", len(pos_list))
for i, pos in enumerate(pos_list):   # print the text around each candidate
    print(i, ":", text_all[max(pos - 21, 0):pos + 30])

Screenshot of the results:

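The results could also be ranked by how densely the hits cluster in each consecutive stretch, but since there were so few of them I did not bother.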
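For reference, one way such a ranking might look is sketched below. This is only an interpretation of "density": it counts how many hit positions fall inside a fixed-size window starting at each candidate offset. The function name `rank_by_density`, the window size, and the sample numbers are assumptions for illustration, not part of the original script.

```python
from bisect import bisect_left


def rank_by_density(positions, candidates, window=50):
    """Sort candidate offsets by how many hit positions fall in the
    `window` characters starting at each one, densest first.

    `positions` must be sorted in ascending order.
    """
    scored = []
    for start in candidates:
        lo = bisect_left(positions, start)            # first hit at or after start
        hi = bisect_left(positions, start + window)   # first hit past the window
        scored.append((hi - lo, start))
    # Highest hit count first; ties keep the earlier offset first.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored


# Hypothetical hit positions: a dense cluster near 100, sparse hits elsewhere.
positions = [3, 40, 100, 103, 107, 110, 114, 119, 200, 260]
candidates = [3, 100, 200]
print(rank_by_density(positions, candidates, window=30))
# [(6, 100), (1, 3), (1, 200)]
```

With a ranking like this, the window containing the real sentence would be expected to float to the top, since accidental dictionary hits in random text rarely cluster.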

There were only 48 results, so the sentence was quick to find.
