Monday, June 10, 2013

Week 5: Finishing up Online Sources

This week, I finished up our work on gathering entropy from online sources. I was able to avoid using twill to download random bytes from randomserver.dyndns.org by appending the form entry data directly to the URL as a query string. This way, I could use Python's standard URL library, and the script to download 4096 random bytes became trivial. After that, I brought all of the online sources together. I standardized the output files of each script and created a tmp folder to hold them. Then, I wrote run_all.py, which uses the Python os module to run each script from the command line. Once all the temporary files have been stored in tmp, run_all.py determines the timestamp, appends it to the file names, and moves the files to ./daily_results in the randomness GitHub repository. Amir has set up a cron job to execute run_all.py a couple of times each day on his BICUSPID account, so the daily_results directory will keep growing as we collect random bytes every day.
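To make the workflow above concrete, here is a minimal sketch in Python 3, not the actual run_all.py. The form field names, the example URL, and the helper names (build_query_url, run_scripts, archive_outputs) are all assumptions for illustration, and it uses subprocess rather than the os module to launch the scripts.

```python
import shutil
import subprocess
import sys
import time
from pathlib import Path
from urllib.parse import urlencode


def build_query_url(base_url, form_fields):
    """Append form fields to a URL as a query string (GET-style submission)."""
    return base_url + "?" + urlencode(form_fields)


def run_scripts(scripts):
    """Run each collection script in turn with the current interpreter."""
    for script in scripts:
        # check=False: one failing source should not stop the others.
        subprocess.run([sys.executable, str(script)], check=False)


def archive_outputs(tmp_dir, results_dir, stamp=None):
    """Move every file in tmp_dir to results_dir, adding a timestamp to its name."""
    tmp_dir, results_dir = Path(tmp_dir), Path(results_dir)
    results_dir.mkdir(parents=True, exist_ok=True)
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    moved = []
    for src in sorted(tmp_dir.iterdir()):
        if src.is_file():
            # e.g. source_a.bin -> source_a_20130610-120000.bin
            dest = results_dir / f"{src.stem}_{stamp}{src.suffix}"
            shutil.move(str(src), str(dest))
            moved.append(dest)
    return moved
```

A cron entry would then only need to call a small main function that runs run_scripts on the list of source scripts and follows it with archive_outputs("tmp", "daily_results").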

Throughout the week, I have been reading two books to gain a general understanding of the Linux kernel and Python programming, which will help once I start writing scripts to isolate individual entropy sources and study how the Linux RNG collects from them.

Understanding the Linux Kernel (Bovet & Cesati): One important thing I learned from this book is that the open-source nature of the kernel has its advantages and disadvantages. On one hand, there is a strong base of talented Linux developers who understand how to make the operating system compact yet powerful. On the other hand, as I saw when trying to understand the RNG by reading drivers/char/random.c, the source code can get extremely messy. Another part of the kernel overview gave me an idea for a new source of entropy for the RNG. Linux dynamically links modules that contain code for file system management, device drivers, and other features, so that main memory is not burdened with kernel code that may go unused most of the time. If the time between module linking and unlinking events could be measured, it could provide a strong source of entropy, especially since the delta would be relatively large compared to other sources such as keystrokes or mouse clicks.
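As a rough way to explore the module-timing idea from user space, the sketch below polls /proc/modules and records the time between detected load and unload events. This is only an approximation under stated assumptions: the function names are hypothetical, the polling interval limits the timing resolution, and a real entropy source would have to take its timestamps inside the kernel.

```python
import time


def parse_module_names(proc_modules_text):
    """Return the set of module names from /proc/modules-style text.

    Each line starts with the module name, followed by other fields.
    """
    return {line.split()[0]
            for line in proc_modules_text.splitlines() if line.strip()}


def module_events(before, after):
    """Return (loaded, unloaded) module names between two snapshots."""
    return after - before, before - after


def inter_event_deltas(timestamps_ns):
    """Return the differences between consecutive event timestamps."""
    return [b - a for a, b in zip(timestamps_ns, timestamps_ns[1:])]


def watch_modules(path="/proc/modules", interval=0.5, max_events=5):
    """Poll the module list and return deltas (ns) between change events."""
    with open(path) as f:
        previous = parse_module_names(f.read())
    stamps = []
    while len(stamps) < max_events:
        time.sleep(interval)
        with open(path) as f:
            current = parse_module_names(f.read())
        loaded, unloaded = module_events(previous, current)
        if loaded or unloaded:
            # Timestamp is only as precise as the polling interval allows.
            stamps.append(time.monotonic_ns())
        previous = current
    return inter_event_deltas(stamps)
```

Calling watch_modules() on a Linux machine blocks until it has seen max_events changes, so on a system where modules are rarely loaded or unloaded it may run for a long time.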


Programming Python (Lutz): I decided to take a thorough, in-depth approach to learning Python, since I found it to be a very powerful scripting language while writing my scripts for online sources. I have been getting accustomed to the syntax by writing simple Python programs and experimenting with the interpreter's behavior. In the book, I have covered keeping records, using dictionaries of dictionaries as a database, pickle files, and shelves. I left off at the advantages of object-oriented programming in Python (structure, encapsulation, customization, etc.) and the syntax for defining classes.
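For reference, the record-keeping ideas mentioned above can be sketched in a few lines of Python 3. The record contents and the helper names here are made up for illustration and are not taken from the book.

```python
import pickle
import shelve

# A dictionary of dictionaries used as a tiny in-memory database.
db = {
    "alice": {"name": "Alice Example", "age": 30, "job": "dev"},
    "carol": {"name": "Carol Example", "age": 45, "job": "mgr"},
}


def save_pickle(database, path):
    """Serialize the whole database to a single pickle file."""
    with open(path, "wb") as f:
        pickle.dump(database, f)


def load_pickle(path):
    """Load the whole database back from a pickle file."""
    with open(path, "rb") as f:
        return pickle.load(f)


def save_shelf(database, path):
    """Store each record under its own key in a shelf."""
    with shelve.open(path) as shelf:
        for key, record in database.items():
            shelf[key] = record


def load_record(path, key):
    """Fetch a single record from a shelf without loading the rest."""
    with shelve.open(path) as shelf:
        return shelf[key]
```

The practical difference is that a pickle file is read and written as a whole, while a shelf allows access to one record at a time by key.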
