1. 程式人生 > 其它 >維吉尼亞密碼的破解演算法及python程式碼實現

維吉尼亞密碼的破解演算法及python程式碼實現

技術標籤:python密碼學

1. 密文描述

密文1

密文:
krkpekmcwxtvknugcmkxfwmgmjvpttuflihcumgxafsdajfupgzzmjlkyykxdvccyqiwdncebwhyjmgkazybtdfsitncwdnolqiacmchnhwcgxfzlwtxzlvgqecllhimbnudynagrttgiiycmvyyimjzqaxvkcgkgrawxupmjwqemiptzrtmqdciakjudnnuadfrimbbuvyaeqwshtpuyqhxvyaeffldmtvrjkpllsxtrlnvkiajfukycvgjgibubldppkfpmkkuplafslaqycaigushmqxcityrwukqdftkgrlstncudnnuzteqjrxyafshaqljsljfunhwiqtehncpkgxspkfvbstarlsgkxfibffldmerptrqlygxpfrwxtvbdgqkztmtfsqegumcfararhwerchvygczyzjaacgntgvfktmjvlpmkflpecjqtfdcclbncqwhycccbgeanyciclxncrwxofqieqmcshhdccughsxxvzdnhwtycmcbcrttvmurqlphxnwddkopqtehzapgpfrlkkkcpgadmgxdlrchvygczkerwxyfpawefsawukmefgkmpwqicnhwlnihvycsxckf

金鑰: crypt (長度為5)

明文:
  I am alive here, my beloved, for the reason to adore you. Oh!How anxious I have been for you and how sorry I am about all you must have suffered in having no news from us. May heaven grant that this letter reaches you. Do not write to me, this would compromise all of us and above all,do not return under any circumstances. It is known that it was you who helped us to get away from here and all would be lost if you should show yourself.We are guarded day and night. I do not care you are not here. Do not be troubled on my account. Nothing will happen to me. The national assemble will show leniency. Farewell the most loved of men. Be quiet if you can take care of yourself.For myself I cannot write any more, but nothing in the world could stop me to adore you up to the death.

密文2

密文:
cbkznkiyjsrofgnqadnzuqigscvxizgsjwucusrdkxuahgzrhywtvdjeiuwsrrtnpszbvpzncngztbvsrnzuqigscvfjwqgjwcytwdazuqigscvfjwqgjwjhkfdylmcbmhonbmbvdnvbmwbnacjaphhonbmbvdnvbmwbnaublsbdnjjneoroyfmxfhixpzpcozzuqigscvxcvhdmfgxmgovzsqmvzyvwyzmsczoajsejifoakdcrehwhgdehvmtnmvvmesvzifutzfjzoalwqztunwvdvmfhesvzifutzfjzoalwqztunpsnoyfleoxdetbwfsoyfjmfhjuxuagnarsfqydoyfjzsrzeujmfhjuubihrjdfinwsnepcawdnkbobvnmzucmghijjmbscjejnapddehlmqddmfxncqbfpxwfejifpqzhikiyaiozimubwuzufazsdjwdiudzmztivcmgp

金鑰:uiozvrb(長度為7)

明文:
  It was the best of times.It was the worst of times. It was the age of wisdom. It was the age offoolishness. It was the epoch of belief. It was the epoch of incredulity. It was the season of light. It was the season of darkness. It was the spring of hope. It was the winter of despair.We had everything before us. We had nothing before us. We were all going direct to heaven. We were all going direct the other way. In short, the period was so, far like the present period, that some of its noisiest authorities insisted on its being received, for good or for evil, in the superlative degree of comparison only.

2. 破解原理

2.1 重合指數法確定金鑰長度

  重合指數是指字串中兩個隨機元素相同的概率,記為 IC \text{IC} IC。設某種語言由 n n n個字母組成,每個字母 i i i發生的概率為 P i ( 1 ≤ i ≤ n ) P_i\left( 1\le i\le n \right) Pi(1in),則:
I C = ∑ i = 1 n P i 2 , ( 1 ) IC=\sum_{i=1}^n{P_{i}^{2}}\ \text{,} \left(1 \right) IC=i=1nPi2(1)
  現實世界中密文有限,故從密文計算的重合指數總是不同於理論值,所以一般用 IC \text{IC} IC的無偏估計值:
I C = ∑ i = 1 n x i ( x i − 1 ) L ( L − 1 ) . ( 2 ) IC=\sum_{i=1}^n{\frac{x_i\left( x_i-1 \right)}{L\left( L-1 \right)}}\ . \left(2 \right) IC=i=1nL(L1)xi(xi1).(2)
  其中, x i x_i xi是密文中i出現的次數,L表示密文長度,n表示字母數,此處n=26。
  對於英文字串,通過詞頻統計可以得到26個英文字母出現的期望概率。通過大量的英文樣本,利用公式(1)可得一般英文文字重合指數值:
I C = ∑ i = 0 25 P i 2 = 0.065 , ( 3 ) IC=\sum_{i=0}^{25}{P_{i}^{2}}=0.065\ \text{,} \left(3 \right) IC=i=025Pi2=0.065(3)
  對於一個完全隨機的字串,其重合指數值:
I C = 26 ( 1 26 ) 2 = 0.038 . ( 4 ) IC=26\left( \frac{1}{26} \right) ^2=0.038\ . \left(4 \right) IC=26(261)2=0.038.(4)
  因為0.065與0.038間隔的充分遠,所以能夠通過這種方式確定金鑰長度。具體計算步驟如下:
  Step1:初始化d = 1。
  Step2:將密文分成d個子串,分別對每一個子串利用公式(2)計算其重合指數,及它們的平均值。若d為金鑰長度,那麼該平均值的IC值將接近於0.065。
  Step3:d = d+1,若d < 10重複步驟二。
  Step4:比較每一個d下的重合指數平均值,取最接近0.065的d值即為所求。
  以密文1為例,利用上述演算法,得到部分結果見表1。

表1 密文1金鑰長度與重合指數對應表

  從計算結果可以確定密文1的金鑰字長度為5,此時每一個子串的平均重合指數為0.06254。
  對於密文2採用同樣的計算方法得到金鑰字長度為7,此時平均重合指數為0.07392。對於其餘的情況,平均重合指數均在0.035-0.046之間波動。

2.2 互重合指數確定子串間相對偏移

  從以上結論可知金鑰字長度。下面本文將採用互重合指數法來確定相對位移,其精確定義如下。
  已經確定金鑰字長度為m,密文子串 Y i Y_i Yi中的各個密文字母都是由同一個單表代換得到。設金鑰為 K = K 1 K 2 … K m K=K_1K_2…K_m K=K1K2Km,可估算 M I c ( Y i , Y j ) MI_c\left( Y_i,Y_j \right) MIc(Yi,Yj)的值。顯然, Y i Y_i Yi中的一個隨機元素與 Y j Y_j Yj中的一個隨機元素同時為第h個英文字母的概率 P h − K i P h − K j , 0 ≤ h ≤ 25 P_{h-K_i}P_{h-K_j},0\le h\le 25 PhKiPhKj,0h25,這裡下標運算為模26運算。因此有

M I c ( Y i , Y j ) ≈ ∑ h = 0 25 P h − K i P h − K j = ∑ h = 0 25 P h P h + K i − K j ( 5 ) MI_c\left( Y_i,Y_j \right) \approx \sum_{h=0}^{25}{P_{h-K_i}P_{h-K_j}}=\sum_{h=0}^{25}{P_hP_{h+K_i-K_j}} \left(5 \right) MIc(Yi,Yj)h=025PhKiPhKj=h=025PhPh+KiKj(5)
  該估計值僅僅依賴於 s i j = ( K i − K j ) mod    26 s_{ij}=\left( K_i-K_j \right) \text{mod\,\,}26 sij=(KiKj)mod26,稱其為 Y i Y_i Yi Y j Y_j Yj的相對位移。易得,當相對位移s為0時,互重合指數估計接近0.065;當s不為0時,這些估計值在0.031-0.045之間波動。因此可以確定子串之間的相對位移,以密文1為例得到結果見表2。

表2 密文1互重合指數表

  第1組和第2組之間偏移為11時,互重合指數為0.06141;
  第1組和第3組之間偏移為4時,互重合指數為0.06148;
  第1組和第4組之間偏移為13時,互重合指數為0.05907;
  第1組和第5組之間偏移為9時,互重合指數為0.06198。
  於是金鑰字為 K = ( K 1 , K 1 − 11 , K 1 − 4 , K 1 − 13 , K 1 − 9 ) K=\left( K_1,K_1-11,K_1-4,K_1-13,K_1-9 \right) K=(K1,K111,K14,K113,K19)

2.3 金鑰字的確定

  對 K 1 K_1 K1遍歷26個英文字母后,可以得到26種可能的金鑰。對每一個金鑰均對密文進行解密。通過分析“明文”的前20個字母,可以確定:
  密文1金鑰為crypt, k=(2,-9,-2,-11,-7) ;
  密文2金鑰為uiozvrb, k=(20,8,14,-1,-5,17,1) 。

2.4 密文破解

  利用上述金鑰對密文進行破解,引入wordninja庫對英文明文進行分詞操作。最後,為明文進行標點調整。

3. 破解程式碼

Python程式碼(程式碼說明見程式碼註釋)
執行環境:python 3.7 + numpy、wordninja第三方庫

'''維吉尼亞破解'''
import numpy as np
import wordninja

def alpha(cipher): #預處理,去掉空格以及回車 
    c = ''
    for i in range(len(cipher)):
        if(cipher[i].isalpha()):
            c += cipher[i]
    return c

def count_IC(cipher): #給定字串計算其重合指數 
    count = [0 for i in range(26)]
    L = len(cipher)
    IC = 0.0
    for i in range(len(cipher)):
        if(cipher[i].isupper()):
            count[ord(cipher[i])-ord('A')] += 1
        elif(cipher[i].islower()):
            count[ord(cipher[i])-ord('a')] += 1
    for i in range(26):
        IC += (count[i]*(count[i]-1))/(L*(L-1))
    return IC

def count_key_len(cipher,key_len): #對字串按輸入個數進行分組,計算每一組的IC值返回平均值
    N = ['' for i in range(key_len)]
    IC = [0 for i in range(key_len)]
    for i in range(len(cipher)):
        m = i % key_len
        N[m] += cipher[i]
    for i in range(key_len):
        IC[i] = count_IC(N[i])
    #print(IC)
    print("長度為%d時,平均重合指數為%.5f" % (key_len,np.mean(IC)))
    return np.mean(IC)    

def length(cipher): #遍歷確定最有可能的金鑰長度返回金鑰長度 
    key_len = 0
    mins = 100
    aver = 0.0
    for i in range(1,10):
        k = count_key_len(cipher,i)
        if(abs(k-0.065)<mins):
            mins = abs(k-0.065)
            key_len = i
            aver = k
    print("金鑰長度為%d,此時重合指數每組的平均值為%.5f" % (key_len,aver))
    return key_len
    
def count_MIC(c1,c2,n): #n=k1-k2為偏移量,計算c1,c2互重合指數MIC
    count_1 = [0 for i in range(26)]
    count_2 = [0 for i in range(26)]
    L_1 = len(c1)
    L_2 = len(c2)
    MIC = 0
    for i in range(L_1):
        if(c1[i].isupper()):
            count_1[ord(c1[i])-ord('A')] += 1
        elif(c1[i].islower()):
            count_1[ord(c1[i])-ord('a')] += 1         
    for i in range(L_2):
        if(c2[i].isupper()):   
            count_2[(ord(c2[i])-ord('A')+n+26)% 26] += 1
        elif(c2[i].islower()):
            count_2[(ord(c2[i])-ord('a')+n+26)% 26] += 1   
    for i in range(26):
        MIC += count_1[i]*count_2[i]/(L_1*L_2)
    return MIC

def count_n(c1,c2): #確定兩個子串最優的相對偏移量n=k1-k2
    n = 0
    mins = 100
    k = [0.0 for i in range(26)]
    for i in range(26):
        k[i] = count_MIC(c1,c2,i)
        #print(i,k[i])  
        if(abs(k[i]-0.065)<mins):
            mins = abs(k[i]-0.065)
            n = i      
    return n
        
def  group_k(cipher,key_len):#完成分組操作並計算每一組與第一組的最優相對偏移量並返回
    N = ['' for i in range(key_len)]
    MIC = [0 for i in range(key_len)]
    s = [0 for i in range(key_len)]
    for i in range(len(cipher)): #對密文進行分組 
        m = i % key_len
        N[m] += cipher[i] 
    for i in range(1,key_len):  #計算與第一組之間的相對偏移量 
        s[i] = count_n(N[0],N[i])  # s[i] = k1-k(i+1) 
        MIC[i] = count_MIC(N[0],N[i],s[i])  # MIC[i] = MIC(1,i+1)
        print("第1組和第%d組之間偏移為%d時,互重合指數為%.5f" % (i+1,s[i],MIC[i]))
    return s    

def miyao(key_len,s,k):  #k為第一個子串的移位,輸出金鑰並返回金鑰所有字母的下標
    mi = ['' for i in range(key_len)] 
    for i in range(key_len):
        s[i] = -s[i]+k   #k2=k1-n
        mi[i] = chr((s[i]+26) % 26 + ord('a'))
    print("第一個偏移量為%d,金鑰為%s時" % (k,mi))
    return s

def the_end(cipher,key_len,s):#輸入密文金鑰返回明文結果 
    plain ='' 
    i = 0 
    while( i < len(cipher)):
        for j in range(key_len):
            if(cipher[i].isupper()):
                plain += chr((ord(cipher[i])-ord('A')-s[j]+26) % 26 + ord('A'))
            else:
                plain += chr((ord(cipher[i])-ord('a')-s[j]+26) % 26 + ord('a'))
            i+=1
            if(i == len(cipher)):
                break 
   # print(plain) 
    return plain
             
if __name__ == "__main__":
    fp = open("密文2.txt","r")
    cipher = '' 
    for i in fp.readlines():
        cipher = cipher + i    
    fp.close()
    cipher = alpha(cipher)
    key_len = length(cipher)
    s = group_k(cipher,key_len)
    m = s.copy()
    for k in range(26):
        s = m.copy() 
        s = miyao(key_len,s,k)
        plain = the_end(cipher,key_len,s)
        print(plain[0:20]) #輸出部分明文確定偏移量k1
    print("參考輸出,請輸入第一個子串的偏移量:",end='')
    k = int(input())
    m = miyao(key_len,m,k)
    plain = the_end(cipher,key_len,m)

    '''對英文文字進行分詞'''
    word = wordninja.split(plain)
    plain = ''
    for i in range(len(word)):
        plain += word[i]
        plain += ' '
    print("明文為\n"+plain)

部分輸出截圖:

參考文獻

python實現維吉尼亞祕鑰破解

一種新式Vigenere密碼的破譯和研究

Python英文文字分詞(無空格)模組wordninja的使用例項