Some results on new statistical randomness tests based on length of runs

Các dãy và các số ngẫu nhiên đóng

một vai trò rất quan trọng trong mật mã. Trong

các nguyên thuỷ mật mã đối xứng, khoá bí mật

chính là thành phần quan trọng nhất nhằm đảm

bảo tính an toàn của chúng. Trong khi đó, các

giao thức mật mã hay lược đồ chữ ký số cũng phụ

thuộc nhiều vào các giá trị ngẫu nhiên. Ngoài ra,

một trong các tiêu chí để đánh giá tính an toàn

cho các nguyên thuỷ mật mã như mã khối, hàm

băm là đánh giá tính ngẫu nhiên đầu ra. Do đó,

việc đánh giá tính ngẫu nhiên theo các kiểm tra

thống kê thực sự rất quan trọng đối với việc đánh

giá tính an toàn của các thuật toán mật mã.

Trong bài báo này, chúng tôi trình bày một số kết

quả nghiên cứu về các tiêu chuẩn kiểm tra loạt

dựa trên độ dài đã được đề xuất bởi A.

Doğanaksoy cùng đồng sự năm 2015. Đầu tiên,

chúng tôi chỉ ra rằng một số giá trị xác suất cho

các loạt độ dài 1 và 2 là chưa chính xác và đề xuất

chỉnh sửa. Sau đó, chúng tôi đã đưa ra và chứng

This manuscript is received on December 1, 2018. It is

commented on December 6, 2018 and is accepted on

December 12, 2018 by the first reviewer. It is commented on

December 16, 2018 and is accepted on December 22, 2018

by the second reviewer.

minh cho trường hợp tổng quát các loạt có độ dài

k bất kỳ. Cuối cùng, chúng tôi đã xây dựng một

công cụ kiểm tra tính ngẫu nhiên dựa trên độ dài

các loạt và áp dụng đánh giá cho các nguồn ngẫu

nhiên thực sự.

Some results on new statistical randomness tests based on length of runs trang 1

Trang 1

Some results on new statistical randomness tests based on length of runs trang 2

Trang 2

Some results on new statistical randomness tests based on length of runs trang 3

Trang 3

Some results on new statistical randomness tests based on length of runs trang 4

Trang 4

Some results on new statistical randomness tests based on length of runs trang 5

Trang 5

Some results on new statistical randomness tests based on length of runs trang 6

Trang 6

Some results on new statistical randomness tests based on length of runs trang 7

Trang 7

Some results on new statistical randomness tests based on length of runs trang 8

Trang 8

Some results on new statistical randomness tests based on length of runs trang 9

Trang 9

pdf 9 trang duykhanh 4200
Bạn đang xem tài liệu "Some results on new statistical randomness tests based on length of runs", để tải tài liệu gốc về máy hãy click vào nút Download ở trên

Tóm tắt nội dung tài liệu: Some results on new statistical randomness tests based on length of runs

Some results on new statistical randomness tests based on length of runs
y  y 1 1k 1 
 1 2 r l l ... l 
 1 2 k ... .
 l l l 
 n l2 l ... kl k 1 r l l ... l 1 2 k 
 1 2k 1 2 k 
 It implies the probabilities 
 n k 1 r kl1 k 1 l 2 ... lk .
 N l
 k k 
 Pr rk l k . 
 2n
 We use the Algorithm 1 in order to evaluate 
 the probabilities Pr rk l k . 
 Số 2.CS (08) 2018 13 
Journal of Science and Technology on Information Security 
 Then, we get Table 1 for probability 
 Algorithm 1: Evaluate Pr rk l k for 
 subintervals. 
 l 0,1, , n / k 
 k 
 Table 1. Interval and probability values for runs of length 
 l0, , l  0, l  0, r  1, N l  0, 
 k2 1 k k one test for 128-bit blocks 
while l n/ k do 
 k Interval Probability 
  Box 1 0-27 0.2191945278 
 Box 2 28-31 0.2304573984 
 while l n / 2 do 
 2 Box 3 32-34 0.1843489091 
 while l n do Box 4 35-38 0.1945435197 
 1 Box 5 39-128 0.1714556450 
 while r n do Total 1 
 1
 N l N l Remark 1: In the Table 1, we have use the 
 k k k k n 1
 2 intervals given in [5], however the calculated 
 n/ k 1 
 n n n kr k 1 l ... l 1 
 1k 1 probabilities of Box 4 and Box 5 are not match 
   
 r l l ... l 1 
 lk 1 0 l 1 0 r 1 1 2 k with the probabilities given in [5]. After 
 r r l r l ... l retesting, we find that the authors in [5] give 
 1 1k 1 
 ... correct intervals but wrong probabilities. The 
 l l l 
 1 2 k correct probabilities are as in Table 1. 
 r r 1 Moreover, the probabilities given in [5] are 
 end while belong to the intervals 35-40, and 41-128, 
 l1 l 1 1 that can not belong to the intervals 35-38 
 end while and 39-128. 
 l l 1 Similarly, we can calculate probabilitiy 
 2 2 intervals for sequences with different lengths. 
 end while 
 The subinterval probabilities for runs of length 
 1 can be seen in Table 2. 
 lk l k 1 
 Table 2. Interval and probability values for runs of length 
end while one test for 64, 128, 256, and 512-bit blocks 
 N
return k n = 64 n = 128 
 Interva
 Computational Complexity: Let the Probability Interval Probability 
 l 
complexity of computing N l be  k then 
 k k Box 0.19008234 0.21919452
 0-12 0-27 
the complexity of probability searching 1 44 78 
 Box 0.23887746 0.23045739
algorithm for runs of length k is about 13-15 28-31 
 2 37 84 
  nk 1 k
 . Box 0.17456037 0.18434890
 16-17 32-34 
After evaluating all probability values, we 3 41 91 
 Box 0.21147040 0.19454351
divide these into 5 subintervals as in [5]. 18-20 35-38 
 4 14 97 
 Case k 1 . Box 0.18500941 0.17145564
 21-64 39-128 
 Choose n 128 , we have calculated all 5 64 50 
probability values and divide into subintervals Total 1 1 
as follow n = 256 n = 512 
 Interva
 27 31 Probability Interval Probability 
 Box1 Pr r l , Box 2 Pr r l , l 
 1 1 1  1 1 1 
 l 0 l 28 Box 0.18725584 0.19356638
 1 1 0-56 0-117 
 34 38 1 09 36 
 Box3 Pr r l , Box 4 Pr r l , 
 1 1 1  1 1 1 Box 0.18928091 0.21863011
 l1 31 l 1 35 57-61 118-125 
 128 2 85 42 
 Box5 Pr r l . Box 0.21985945 0.21707667
  1 1 1 62-66 126-132 
 l1 39 3 18 90 
 Box 0.21877592 0.19951554
 67-72 133-140 
 4 27 92 
 Box 0.18482786 0.17121127
 73-256 141-512 
 5 61 40 
 Total 1 1 
14 Số 2.CS (08) 2018 
 Nghiên cứu Khoa học và Công nghệ trong lĩnh vực An toàn thông tin 
 Box 0.24528775 0.26659050
 Remark 2: In [5], the authors give the 9-10 17-19 
intervals and the corresponding probabilities for 4 67 46 
 Box 0.18330972 0.18246401
runs of length 1. However, we found that some 11-32 20-64 
 5 55 11 
value in [5] are incorrect! For n 64 , if we use 0.99999999 0.99999999
 Total 
the intervals in [5], then the correct probabilities 99 99 
should be: 
 Table 2.1. Interval and probability values for runs of n = 256 n = 512 
 length 1 test for 64-bit blocks Interval Probability Interval Probability 
 Box 0.19257931 0.18893841
 n = 64 0-27 0-57 
 1 49 82 
 Interval Probability 
 Box 0.19405196 0.17879497
 28-30 58-61 
 Box 1 0-13 0.2613425337 2 89 30 
 Box 2 14-16 0.2561417553 Box 0.22292350 0.21049627
 31-33 62-65 
 Box 3 17-18 0.1659176815 3 33 69 
 Box 4 19-21 0.1812433426 Box 0.18785330 0.22561551
 34-36 66-70 
 Box 5 22-64 0.1353546869 4 79 70 
 Total 1 Box 0.20259190 0.19615481
 37-128 71-256 
 These probabilities are not match with the 5 51 50 
 1.00000000 1.00000000
 Total 
probabilities given in [5]. Moreover, these 01 01 
intervals are not equivalent. Therefore, we have 
re-divide into new intervals and recalculate In case k 3 , we have calculated all 
probabilities in new intervals. Interestingly, probability values and divide into subintervals 
these probability values approximate the values as follow: 
given in [5] but belong to other intervals. Table 4. Interval and probability values for runs of length 
 Similar to the case n 128 , intervals and the three test for 64, 128, 256, and 512-bit blocks 
probability values given in [5] is not match. n = 64 n = 128 
 Interval Probability Interval Probability 
Specifically, if we take the given intervals in Box 0.20782508 0.16320900
 0-2 0-5 
[5], we have recalculated the probabilities 1 99 84 
exactly as shown in Table 1 and Table 2. If we Box 0.20431981 0.27450079
 3 6-7 
use the intervals as follows, the probability 2 09 90 
 Box 0.21673204 0.15485472
values coincide with the values given in [5]. 4 8 
 3 08 29 
 Table 2.2. Interval and probability values for runs of Box 0.28324547 0.24505901
 5-6 9-10 
 length 1 test for 128-bit blocks 4 60 18 
 Box 0.08787758 0.16237645
 n = 128 7-21 11-42 
 Interval Probability 5 25 79 
 1.00000000 1.00000000
 Box 1 0-26 0.1731718548 Total 
 Box 2 27-30 0.2142651725 01 00 
 Box 3 31-33 0.1869770204 
 Box 4 34-37 0.2133929800 n = 256 n = 512 
 Interv Interva
 Box 5 38-128 0.2121929722 Probability Probability 
 Total 0.9999999999 al l 
 Box 0.248734758 0.189852605
 0-13 0-27 
 In case k 2 , we have calculated all 1 4 4 
probability values and divide into subintervals Box 0.207164715 0.201497107
 14-15 28-30 
as follow: 2 8 3 
 Box 0.213743768 0.231181647
 16-17 31-33 
 Table 3. Interval and probability values for runs of length 3 1 3 
 two test for 64, 128, 256, and 512-bit blocks Box 0.222144506 0.190006612
 18-20 34-36 
 n = 64 n = 128 4 9 6 
 Interval Probability Interval Probability Box 0.108212250 0.187462027
Box 0.16134454 0.16707580 20-85 37-170 
 0-5 0-12 5 8 4 
 1 44 01 1.000000000 1.000000000
Box 0.26096400 0.17407560 Total 
 6-7 13-14 0 0 
 2 39 80 
Box 0.14909396 0.20979407 
 8 15-16 
 3 94 61 
 Số 2.CS (08) 2018 15 
Journal of Science and Technology on Information Security 
 Remark 3: In the case of n 512 we used Also we use a variation of S , denoted by 
Magma software to divide the intervals and S of length n 1 by adding 1’s at the 
calculate the probability values because it takes beginning the sequence S . The variation of 
quite a long time to run in C ++ language. The derivative is an important part of new defned 
calculation time on Magma for this case is run tests, since the number of runs of different 
about 5000 seconds. length is determined by this sequence. 
 S s,,, s s
 C. Tests Descriptions Let 0 1n 1 be a binary sequence 
 and derivative of S is S s,,, s s . 
 After calculating the probabilities, we begin 0 1n 1
to build a new test based on the number of runs Then, S s0,,, s 1  sn is defined as 
of length k. Specifically, to test a sequence of follows 
N n m bits, where n is the block size we s if i 1,2, , n ,
 i 1
 s 
choose. Or we consider m outputs of a i 1 if i 0.
cryptographic primitive (a block cipher or a 
hash function) that have output block size is n- It is easy to prove that the total runs number 
bit. First, we'll count the number of runs of of a sequence is the weight of the derivative of 
length k of each sequence in m blocks, and that sequence, and the runs number with the 
increases the count value of the corresponding length k in a given sequence S is the number of 
sub-interval to 1. After calculating, we record samples 10  01 overlaps in S . 
the counting values of each sub-interval, k 1 
denoted by FFF,,, respectively. We use the Algorithm 2 presents the pseudocode of the 
 1 2 5 test based on the number of runs of length k: 
approach as in [8], using χ2 test to evaluate the 
 Algorithm 2: Test based on the number of 
randomness of the sequence. 
 S,,,,,,, S S L l1 l 2  l m
Consider: runs of length k 1 2 m k k k k  
 2
 5 S s ,,, s  s 
 F m Pr i i,0 i ,1 i , n
 2 i i
  , i
 i 1 m Pri i0, lk  0 
 Lastly p-value is calculated according to the while j n k do 
given values: 
 5 1 2 temp s 2k s 2 k 1 s 2 k 2  s 2 0
 i, j i , j 1 i , j 2 i , j k
 p value igamc , . 
 k
 2 2 if temp 2 1 then 
 By comparing the produced p-value with the i i
 lk l k 1 
level of significance , we can conclude about 
 end if 
the randomness of the input sequence. 
 i i 1 
Note that for 2 test, we require m Pr 5 
 i end while 
therefore for Pr 0.2 we need m 25 . Thus, 2 
 i Applying χ test for Lk 
the new tests can be applied for short sequences return p-value. 
of length N n m bit for m 25 . 
In addition, counting the total runs numbers and IV. SOME EXPERIMENTAL RESULTS 
runs numbers with length k of a sequence by We have developed a randomness test 
definition is difficult. Therefore, we use the program using tests based on runs of lengths 1, 
concept of “derivative” of a sequence. 2 and 3. The program interface is shown in 
 Definition 1 (Remark 11, [5]) (derivative of Figure 1. Specifically, we have tested 4 files 
 true random: sample1.rng, sample2.rng, 
a sequence) Let S s0,,, s 1 sn 1 be a binary 
sequence of length n, the derivative of S, sample3.rng, sample4.rng (these true random 
 files are actually 32 KB in size, that is, bits of 
denoted by S s,,, s s is defined as 
 0 1n 1 N 32 1024 8 length, downloaded at 
follows  with 
 s s if i 0,1, , n 2,
 i i 1 the following cases n = 64, 128, 256 and 512. 
 s 
 i 1 if i n 1.
 The results of all 4 files have passed 3 new runs 
16 Số 2.CS (08) 2018 
 Nghiên cứu Khoa học và Công nghệ trong lĩnh vực An toàn thông tin 
 Runs of 
tests with n = 64, 128, 256 and 512. 0.8992
 length 0.231489 0.571770 0.779767 
Specifically, for the case n = 64, select the 93 
 2 test 
significance level 0.01 , the file to be 
 Runs of 
 0.5063
checked is sample1.rng, we get the result as length 0.814081 0.011770 0.591287 
 32 
shown in Figure 1: 3 test 
 Case n = 512 
 sample sample2.r sample3.r sample4.r
 1.rng ng ng ng 
 Runs of 
 0.1205
 length 0.292832 0.480338 0.861397 
 85 
 1 test 
 Runs of 
 0.2522
 length 0.821268 0.105730 0.726579 
 12 
 2 test 
 Runs of 
 0.8111
 length 0.471682 0.110607 0.834620 
 72 
 3 test 
 Fig 1. The program interface of 3 new runs tests for V. CONCLUSION 
 n = 64, 0.01 for file sample1.rng 
 In this paper, we present some results on new 
 Similarly, we perform tests for the cases randomness tests based on length of run 
n 128,256,512 and for files sample2.rng, proposed by A. Doğanaksoy et al. [5]. First, we 
sample3.rng, sample4.rng. The results are have given and demonstrated in detail the 
summarized in the following Table 5: probability calculation formula for runs of 
 Table 5. Results of 3 new runs tests for length k, with 1 k n . Second, we show that 
 true random files some probability values for runs lengths 1 and 2 
 are inaccurate and suggest corrections. Third, 
 Case n = 64 
 sample sample2.r sample3.r sample4.r we have built a randomness testing algorithm 
 1.rng ng ng ng based on the length of runs. Finally, we 
Runs of 
 0.2654 programmed to build an accurate and efficient 
 length 0.177239 0.249560 0.857602 
 71 tool to test randomness based on the length of 
 1 test runs and apply evaluations to true random 
Runs of 
 0.5320
 length 0.054239 0.509319 0.219101 sources. 
 56 
 2 test Further research directions: Note that the 
Runs of 
 0.5785 criteria presented in this paper can only be used 
 length 0.832500 0.445590 0.941098 
 69 to evaluate sequences of lengths greater than 
 3 test 512 bits, so it is not applicable to assess 
 Case n = 128 
 sample sample2.r sample3.r sample4.r randomness output for cryptographic primitives 
 1.rng ng ng ng such as block ciphers or hash functions. To be 
Runs of able to evaluate for sequences of length less 
 0.0897
 length 0.601941 0.251491 0.941470 
 78 than or equal to 512 bits, we need to recalculate 
 1 test the probability distribution for blocks of smaller 
Runs of 
 0.1695
 length 0.435659 0.645554 0.416198 lengths and divide the probability interval 
 05 
 2 test accordingly. This is an open problem that needs 
Runs of further research in the future. In addition, the 
 0.2641
 length 0.893517 0.393173 0.978088 
 85 evaluation of probability values for series with a 
 3 test length greater than or equal to 4 and the 
 Case n = 256 correlation between these tests also need further 
 sample sample2.r sample3.r sample4.r
 1.rng ng ng ng consideration and research. 
Runs of 
 0.6409
 length 0.308548 0.272620 0.422990 
 39 
 1 test 
 Số 2.CS (08) 2018 17 
Journal of Science and Technology on Information Security 
 REFERENCES ABOUT THE AUTHOR 
 B.S. Linh Hoang Dinh 
[1]. M. D. MacLaren, “The art of computer Workplace: Institute of 
 programming. Volume 2: Seminumerical Cryptography Science and 
 algorithms (Donald E. Knuth)”, SIAM Review Technology. 
 12, pp. 306-308, 1970. 
 Email: linhhd@bcy.gov.vn 
[2]. G. Marsaglia, “The marsaglia random number 
 cdrom including the diehard battery of tests of The education process: has received 
 randomness, 1995”. URL  stat. fsu. a mathematical bachelor degree in 
 edu/pub/diehard, 2008. Ha Noi University of Science, in 
 2014. 
[3]. W. Caelli, “Crypt x package documentation”. 
 Information Security Research Centre and Research today: symmetric cryptography algorithm. 
 School of Mathematics, Queensland University 
 of Technology, 1992. 
[4]. A. Rukhin, J. Soto, J. Nechvatal, E. Barker, S. 
 Leigh, M. Levenson, D. Banks, A. Heckert, J. 
 Dray, S. Vo, “Statistical test suite for random 
 and pseudorandom number generators for 
 cryptographic applications, NIST special 
 publication”, 2010. 
[5]. A. Doğanaksoy, F. Sulak, M. Uğuz, O. Şeker, Z. 
 Akcengiz, “New statistical randomness tests 
 based on length of runs”. Mathematical 
 Problems in Engineering, 2015. 
[6]. S. W. Golomb, “Shift register sequences”. 
 Aegean Park Press, 1982. 
[7]. P. L'Ecuyer, R. Simard, “TestU01: AC library 
 for empirical testing of random number 
 generators”. ACM Transactions on 
 Mathematical Software (TOMS) 33, 22, 2007. 
[8]. F. Sulak, A. Doğanaksoy, B. Ege, O. Koçak, 
 “Evaluation of randomness test results for short 
 sequences”, in International Conference on 
 Sequences and Their Applications. Springer, pp. 
 309-319, 2010. 
18 Số 2.CS (08) 2018 

File đính kèm:

  • pdfsome_results_on_new_statistical_randomness_tests_based_on_le.pdf