Monday, August 5, 2013

Week 13: More Android

This week, I continued working on collecting entropy from the mobile Android platform. The first thing I worked on was setting up a TabView to categorize the different sources of entropy within the user interface. I added the already finished BatteryActivity.java Activity to one tab and then moved on to Android system information. I found the class ActivityManager.RunningTaskInfo, which provides information for each task currently running in the environment. This information includes a unique identifier, the base Activity associated with the task, the number of running Activities associated with the task, and the Activity component that is at the top of the history stack. While I am already able to access this data through the Task Activity I wrote, I am still working on creating a Table layout that will display it appropriately in the UI. I plan on making a similar tab for the RunningServiceInfo class and another tab for MemoryInfo.

With the framework for extracting system data set up, I turned my attention to CyanogenMod to see if I could retrieve more fine-grained readings for things like battery voltage and temperature. After a couple of days of tinkering, I was able to root the Nexus S test device with CyanogenMod 10.1. Unfortunately, once this new OS had been loaded, I could not get Eclipse to recognize the device, so I could not debug my app on it. I will try to find a solution to this problem, although I believe it will not contribute much to the app, since it would only improve the data for the battery and not the OS information. My first priority will be finishing up the UI and working on sending the data over a network.

The last thing I worked on was trying to get the location of the device, which I thought might be a useful source of entropy. What I thought would be a relatively simple task turned into a considerable challenge. First of all, the LocationManager class gathers location data from several different sources, such as the GPS, the network, or a Wi-Fi connection. I used the following code:

locationManager = (LocationManager) getSystemService(Context.LOCATION_SERVICE);
Criteria criteria = new Criteria();
provider = locationManager.getBestProvider(criteria, true);
if (provider != null) {
    Log.v(TAG, "Requesting Location updates...");
    locationManager.requestLocationUpdates(LocationManager.GPS_PROVIDER, 0, 0, this.locationListener);
    locationManager.requestLocationUpdates(LocationManager.NETWORK_PROVIDER, 0, 0, this.locationListener);
}



to obtain as many location updates as possible from both the GPS and network providers. I quickly found out, however, that GPS does not work indoors, and since the phone is not connected to a network, I cannot get a location from the network provider either. Even the native Google Maps app shuts down with an error saying that it requires a network provider to work. If Amir thinks it is a good idea to incorporate location data, I will talk to Denis about obtaining a monthly plan or exploring alternative options.

Monday, July 29, 2013

Week 12: Android Battery Data

At the beginning of this week, I continued working on the Android development training. I learned about switching between Activities by using Intents, the lifecycle of an Activity, navigating between different XML pages, and dynamically generating UI elements. I began working on collecting sources of entropy with Adam. We plan on eventually combining the data collection code into a unified app, but for right now, I am working on getting battery specifications and status as well as process statistics from the OS such as how many processes are currently running, how much memory each process is using, the network bandwidth of each process, and other system information.

I started out researching a way to get battery information from the Android environment. The android.os API provides the BatteryManager class, which contains several string constants such as EXTRA_TEMPERATURE, EXTRA_VOLTAGE, EXTRA_LEVEL, and EXTRA_STATUS. When I tried to display those constants in TextViews directly, however, they all came up as either 0s or nulls. I did a little more research and discovered that these constants are only keys: the actual values arrive as extras on the ACTION_BATTERY_CHANGED Intent, which has to be received through a BroadcastReceiver. The receiver is registered using this function call:

this.registerReceiver(this.mBatInfoReceiver,
                new IntentFilter(Intent.ACTION_BATTERY_CHANGED));


This registers mBatInfoReceiver, which is declared as follows:

private BroadcastReceiver mBatInfoReceiver = new BroadcastReceiver() {
    @Override
    public void onReceive(Context arg0, Intent intent) {
        temp = intent.getIntExtra(BatteryManager.EXTRA_TEMPERATURE, 0);
        volt = intent.getIntExtra(BatteryManager.EXTRA_VOLTAGE, 0);
        tech = intent.getStringExtra(BatteryManager.EXTRA_TECHNOLOGY);
        stat = intent.getIntExtra(BatteryManager.EXTRA_STATUS, 0);
        level = intent.getIntExtra(BatteryManager.EXTRA_LEVEL, 0);
        scale = intent.getIntExtra(BatteryManager.EXTRA_SCALE, 0);
    }
};


As a result, each of the variables assigned in onReceive() is updated whenever the Android system broadcasts the ACTION_BATTERY_CHANGED intent. All I needed to do once these variables were updated was format some of them (temperature is given in tenths of a degree Celsius and voltage is given in millivolts). This code worked quite nicely, giving me all of the battery information, which can be refreshed in real time at the press of a button. The next step for this unified app will be for me to start working on system process data collection.
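
The formatting step described above is small enough to sketch in a few lines. The snippet below is only an illustration in Python rather than the app's Java code; the function name format_battery and the sample readings are hypothetical, and computing a percentage from level and scale is my assumption about how those two values are meant to be combined.

```python
def format_battery(temp_tenths_c, volt_mv, level, scale):
    """Convert raw battery readings into human-readable units."""
    temp_c = temp_tenths_c / 10.0   # tenths of a degree Celsius -> degrees Celsius
    volts = volt_mv / 1000.0        # millivolts -> volts
    # Assumption: level is only meaningful relative to scale, so report a
    # percentage and guard against a scale of 0 (the default used when missing).
    percent = 100.0 * level / scale if scale > 0 else None
    return {"temp_c": temp_c, "volts": volts, "percent": percent}


# Hypothetical raw readings, not values measured on the test device.
reading = format_battery(305, 4123, 87, 100)
print(reading)
```

For entropy collection, the raw integers are probably more useful than the formatted values, since rounding for display can only discard low-order bits.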

Monday, July 22, 2013

Week 11: Getting Started with Android

The first thing I worked on this week was collecting TCP network packet arrival times. Fortunately, the tcpdump command line tool has an extensive list of features that makes it very easy to obtain information on network traffic. It dumps the originating IP, the target IP, the packet size, and other bookkeeping information. With the -tt option, each packet's timestamp is printed as seconds since the Unix epoch with microsecond granularity. Therefore, the command:

sudo tcpdump -nntt | awk '{print $1}' > packets.txt

logs the timestamps of packet arrivals in packets.txt continuously until it is interrupted by a SIGINT or SIGTERM signal. Alternatively, the -c 100 option can be added to the tcpdump invocation to capture the first 100 packets and then terminate.
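
Since the interesting quantity for entropy is the gap between arrivals rather than the absolute time, a natural follow-up is to turn packets.txt into a list of deltas. The following is a minimal sketch, assuming one timestamp per line as produced by the command above; the function name interarrival_deltas and the sample values are made up for illustration.

```python
from decimal import Decimal, InvalidOperation


def interarrival_deltas(lines):
    """Return the gaps, in seconds, between consecutive packet timestamps."""
    stamps = []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # ignore blank lines
        try:
            stamps.append(Decimal(text))
        except InvalidOperation:
            continue  # ignore anything that is not a bare timestamp
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


# Hypothetical sample; with real data, pass the open file instead:
#     with open("packets.txt") as fh:
#         deltas = interarrival_deltas(fh)
sample = ["1374500000.000100", "1374500000.000350", "", "1374500000.002350"]
deltas = interarrival_deltas(sample)
print(deltas)
```

Decimal is used instead of float on purpose: a double-precision float holding an epoch timestamp of this magnitude resolves only to a few tenths of a microsecond, so the low-order digits that matter most here would be rounded.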


For the rest of the week, I worked on getting up to speed on Android development to work with Adam on identifying sources of entropy and collecting random data from the mobile platform. There is a hefty amount of work to set up the development environment. Once I was finished downloading the Android SDK, the ADT plugin for Eclipse, the latest SDK tools, and the AVD (an emulator for debugging apps), I was ready to start learning about project organization, user interface principles, event listeners, Android Activities and Intents, and a whole list of other basic Android development concepts. Hopefully with a week of preparation, I will at least be able to do some basic tasks on the platform. Fortunately, the Android developer’s site is very well documented, making research for specialized projects easy.

Monday, July 15, 2013

Week 10: Wrapping up the Mouse Logger

I started off this week by trying to investigate device event files in a new Linux virtual machine I obtained. My startup engineering class provided me with an Amazon EC2 remote instance of Ubuntu 12.04 LTS, which was intended to be used as a homogeneous development environment. Unfortunately, I ran into the same problem as with the virtual machine on the MacBook Pro: mouse event information was not being routed properly to the files because of the virtualized interface.

I then went on to research available libraries that abstract away the lower-level details of getting mouse event data. I had already used the Python binding evdev last week, but I was unable to find any way to register mouse clicks with that extension. When I did a little more digging, I found that the preferred library for interfacing with any external device is Xlib. However, I quickly discovered that the library is tedious, outdated, and very poorly documented. For example, just reading the current position of the mouse pointer on the screen requires:

from Xlib.display import Display
pointer = Display().screen().root.query_pointer()
x, y = pointer.root_x, pointer.root_y

Fortunately, I was able to find a Python library called PyUserInput that abstracts away the complexity of Xlib. I created a class ClickRecorder that inherits from PyMouseEvent. This way, I could override the click() and move() methods so that they call writeInfoLine(), a function I wrote that appends a line containing the timestamp, the delta since the last move or click, and the x and y coordinates to the log file. The function ended up looking like this:

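def writeInfoLine(x, y, isClick):
    """Append one mouse event (timestamp, delta, coordinates) to the log file f."""
    global MIN_DELTA, lastClick, lastX, lastY, lastMove
    currTime = time.time()
    line = str(currTime) + " (delta "
    if isClick:
        # Clicks are always logged; the delta is the time since the previous click.
        delta = currTime - lastClick
        line += str(delta) + "):\t"
        line += "[" + str(x) + "," + str(y) + "] click\n"
        f.write(line)
        lastClick = currTime
    else:
        # Moves are throttled: skip samples that arrive too soon after the
        # last logged move or that do not change the position.
        delta = currTime - lastMove
        if delta < MIN_DELTA:
            return
        if x == lastX and y == lastY:
            return

        line += str(delta) + "):\t"
        line += "[" + str(x) + "," + str(y) + "]\n"
        f.write(line)
        lastX = x
        lastY = y
        lastMove = currTime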


I set MIN_DELTA to 0.25 so that mouse movements within 250 milliseconds of the last recorded movement are not logged. To finish up my program, I wrapped the import statement for PyUserInput in a try clause that runs a setup.sh script and installs the module if an ImportError is caught. With this program completed, we finally have a fully functional, robust application for logging mouse events, and the log file formatting will make it easy to extract certain parameters for entropy.
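
To show how the log format lends itself to extraction, here is a minimal parsing sketch. It assumes the exact line layout written by writeInfoLine() above; the names MouseEvent, LINE_RE, and parse_line are hypothetical and are not part of the logger itself.

```python
import re
from collections import namedtuple

MouseEvent = namedtuple("MouseEvent", "timestamp delta x y is_click")

# Matches lines such as "1373900000.5 (delta 0.75):\t[10,20] click"
LINE_RE = re.compile(r"^(\S+) \(delta (\S+)\):\t\[(-?\d+),(-?\d+)\]( click)?$")


def parse_line(line):
    """Parse one log line into a MouseEvent, or return None if it is malformed."""
    match = LINE_RE.match(line.rstrip("\n"))
    if match is None:
        return None
    ts, delta, x, y, click = match.groups()
    return MouseEvent(float(ts), float(delta), int(x), int(y), click is not None)


# Hypothetical sample lines in the logger's format.
click_event = parse_line("1373900000.5 (delta 0.75):\t[10,20] click\n")
move_event = parse_line("1373900001.0 (delta 0.5):\t[11,20]\n")
print(click_event)
print(move_event)
```

Returning None for malformed lines keeps a partially written final line (for example, after the logger is killed) from aborting the whole analysis.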

Monday, July 8, 2013

Week 9: Java and Fireworks

For this shortened Fourth of July week, I turned to Java to develop a program capable of logging mouse events. One advantage of using Java is that it is intrinsically platform-independent since each Java application is compiled into bytecode that is then executed in the Java Runtime Environment. I used the java.awt.MouseInfo class to sample the x and y coordinates of the mouse ten times per second. If the position of the mouse has not changed since the last sample, it is not logged again. The code ended up looking like this:

while (true) {
    // Sample the pointer position, clamping negative coordinates to 0.
    Point mousePt = MouseInfo.getPointerInfo().getLocation();
    mouseX = Math.max(0, mousePt.x);
    mouseY = Math.max(0, mousePt.y);
    // Log only when the position has changed since the last sample.
    if (mouseX != prevX || mouseY != prevY) {
        unixTime = System.currentTimeMillis();
        delta = unixTime - prevTime;
        String content = unixTime + " " + delta +
            " [" + mouseX + "," + mouseY + "]\n";
        writeToFile(filename, content);
    }

    // Sleep 100 ms between samples (about ten samples per second).
    try { Thread.sleep(100); }
    catch (InterruptedException exception) {
        exception.printStackTrace();
        throw exception;
    }
    prevX = mouseX;
    prevY = mouseY;
    prevTime = unixTime;
}


The log file I create also includes the timestamp with millisecond granularity as well as the delta since the last different pair of x and y coordinates. This program should work for all three of the platforms we have targeted for this project.

Monday, July 1, 2013

Week 8: Mouse Event Logging

The first thing I did this week was research Linux keyloggers. While there are many online, there are few open source solutions that offer the flexibility required for our project. In fact, I only found one source that provides a timestamp for each individual key pressed. I changed the script to store the data in a local log file that can be parsed later on.

For the rest of the week, I did a lot of research on Linux device event files in order to work on capturing mouse events. Unfortunately, the protocol for this OS service varies wildly depending on the Linux distro in question and I was unable to find a truly reliable source for working with the Ubuntu VM. I also refrained from investigating rootkits because I did not want to put my personal machine at risk. So far, my understanding is that a new input_event struct is written to the event file every time the given hardware device has a new event to report. The struct has the following format:

struct input_event {
    struct timeval time;
    __u16 type;
    __u16 code;
    __s32 value;
};

However, when I use programs that try to exploit this data formatting, the only field that ever changes is the timeval struct, while the type, code, and value remain constant. The best option I could find for logging mouse events was the Python bindings for evdev. Using this package to read /dev/input/event3, the mouse event file for Ubuntu 12.04 LTS, I am able to record mouse clicks with extremely precise timestamps. However, the deltas between events are not recorded, and neither are the coordinates of the mouse when it is clicked.
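
For reference, this is roughly how one record with the struct layout above could be decoded by hand. It is a sketch under assumptions, not the code I ran: it assumes struct timeval is two native C longs (true on typical Linux builds, but not guaranteed everywhere), and the names EVENT_FORMAT and decode_event are my own. The sample record is synthetic rather than read from a device.

```python
import struct

# Assumed native layout: timeval (two C longs), __u16 type, __u16 code, __s32 value.
EVENT_FORMAT = "llHHi"
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)


def decode_event(raw):
    """Unpack one input_event record into a dictionary."""
    sec, usec, ev_type, code, value = struct.unpack(EVENT_FORMAT, raw)
    return {"time": sec + usec / 1e6, "type": ev_type, "code": code, "value": value}


# Synthetic record: type 1 is EV_KEY, code 272 (0x110) is BTN_LEFT, value 1 is a press.
raw = struct.pack(EVENT_FORMAT, 1372700000, 250000, 1, 272, 1)
event = decode_event(raw)
print(EVENT_SIZE, event)

# Reading a real device would look like this (requires read permission):
#     with open("/dev/input/event3", "rb") as dev:
#         event = decode_event(dev.read(EVENT_SIZE))
```

Reading exactly EVENT_SIZE bytes per record matters; a misaligned read would make the type, code, and value fields look like garbage or appear constant.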

Monday, June 24, 2013

Week 7: Keylogging

Now that work on gathering random bytes from online sources has finished, I have begun working on tracking keystrokes and attempting to build a record of timestamps and cycles after boot-up for each key pressed. This has been more challenging than expected because each major operating system platform handles the raw input of the keyboard differently. The first platform I worked on was Mac OS X. I thought the UNIX-based OS would use device event files to record keystrokes, but Apple changed the architecture so that the same information is in a much more obscure location within Application Services. Amir gave me a MacBook Air to work on, but we did not have the passwords for any of the accounts. Instead of having to load a new OS on the machine, Adam let me borrow the MacBook Pro he had been working with. Eventually, the only solution I could come up with was to use logKextClient, a program that is able to capture all keyboard input. While it records the time range covered by the collected data, it does not assign a timestamp to each individual key pressed, let alone include the number of CPU cycles that have passed.

Trying to find a keylogger solution for Windows and Linux has proven similarly difficult. I wanted to test a simple script that reads the Linux device event file, but that script failed when Ubuntu was loaded as a virtual machine on the MBP. In fact, the device event file for the keyboard was empty. This is probably because keyboard input is routed through the VM interface and the data never gets written to the event file. I will need to read more about device event files before I can work on the Linux platform. I found PyKeylogger, a Python extension that is supposed to work for both Windows and Linux. I downloaded it on my Windows 8 machine, but the installation process always fails. I will have to take a deeper look into my options for Windows as well.


Meanwhile, I have started my Stanford online courses in cryptography and startup engineering. Both courses will give me more knowledge and experience to help me with my project on cryptographic randomness. 

Monday, June 17, 2013

Week 6: More Studying

At the beginning of this week, I made some finishing touches to our online entropy collection project. I made my script for the NIST Beacon source more dynamic. Since the source provides new random bytes every minute, I set up the script to identify the last timestamp it left off at and start collecting data from a minute after it. Later on, Amir told me that after running the cron job to collect data from online sources, there was an issue with the file naming. I had to rethink the algorithm I had made for run_all.py. By switching the steps where I moved the files from the tmp directory to daily_results and renamed the files, I was able to fix this problem.
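
The resume logic can be illustrated with a short sketch. This is not the actual Beacon script: it assumes record timestamps are integer Unix epoch seconds and that the script remembers the last one it collected, and the names BEACON_PERIOD and pending_timestamps are hypothetical.

```python
BEACON_PERIOD = 60  # seconds between records, matching the one-minute cadence


def pending_timestamps(last_collected, now):
    """List record timestamps after last_collected, up to and including now."""
    first = last_collected + BEACON_PERIOD
    return list(range(first, now + 1, BEACON_PERIOD))


# Hypothetical example: the last run stopped at 1371400000 and it is now 185 s later.
todo = pending_timestamps(1371400000, 1371400185)
print(todo)
```

Starting from the stored timestamp rather than from the current time means that a missed or delayed run does not leave a gap in the collected data.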

I have been continuing my study of Python through the Lutz manual. I have found it to be an extremely powerful language because of its cross-platform system programming capabilities. The os and glob modules make automation of any command line algorithms trivial. Having Python as part of my skill set will make it much easier for me to write scripts moving forward.

After finding out about it through a friend, I enrolled for the Stanford online cryptography course, which begins on June 17th. I feel that this course will expand my knowledge of cryptography beyond the basic concepts I learned in EECS 482. The course syllabus includes DES and AES block ciphers, collision resistant hashing, key derivation functions, Diffie-Hellman, RSA, and Merkle puzzles as well as several other topics.


On Friday, our team met with Professor Fu to discuss future work. We talked with Ari Juels about using subtle frequency variations in RFIDs to obtain randomness, as well as thermal chamber testing. The first step is to order the RFIDs and readers in order to begin testing. Those should take around a week to arrive at the lab. Until then, we will continue to focus on entropy available in desktop computers.

Monday, June 10, 2013

Week 5: Finishing up Online Sources

This week, I finished up our work on gathering entropy from online sources. I was able to avoid using twill for downloading random bytes from randomserver.dyndns.org by simply appending the form entry data to the URL. This way, I could use the Python URL library, and the script to download 4096 random bytes became trivial. After that, I brought all of the online sources together. I standardized the output files of each script that was written and created a tmp folder for the output files to be stored in. Then, I wrote run_all.py, which uses the Python os module to run each script from the command line. Once all the temporary files have been stored in tmp, the run_all script determines the timestamp, appends it to the file names, and moves the files to ./daily_results in the randomness GitHub repository. Amir has set up a cron job to execute run_all.py a couple of times each day on his BICUSPID account, so the daily_results directory will continue to grow as we collect random bytes every day.
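
The archival step at the end of run_all.py can be sketched as follows. This is an illustration under assumptions, not the script itself: the function name archive_outputs, the timestamp format, and the sample file names are all hypothetical, and the demonstration runs in throwaway directories instead of the real repository.

```python
import os
import shutil
import tempfile
import time


def archive_outputs(tmp_dir, results_dir, stamp=None):
    """Append a timestamp to each file in tmp_dir and move it into results_dir."""
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    os.makedirs(results_dir, exist_ok=True)
    moved = []
    for name in sorted(os.listdir(tmp_dir)):
        src = os.path.join(tmp_dir, name)
        if not os.path.isfile(src):
            continue  # leave subdirectories alone
        base, ext = os.path.splitext(name)
        dest = os.path.join(results_dir, "{}_{}{}".format(base, stamp, ext))
        shutil.move(src, dest)  # the rename and the move happen in a single step
        moved.append(os.path.basename(dest))
    return moved


# Demonstration with made-up output files in a temporary working directory.
work = tempfile.mkdtemp()
tmp_dir = os.path.join(work, "tmp")
results_dir = os.path.join(work, "daily_results")
os.makedirs(tmp_dir)
for name in ("nist_beacon.bin", "randomserver.bin"):
    with open(os.path.join(tmp_dir, name), "wb") as fh:
        fh.write(b"\x00" * 16)

moved = archive_outputs(tmp_dir, results_dir, stamp="20130610-120000")
leftover = os.listdir(tmp_dir)
archived = sorted(os.listdir(results_dir))
shutil.rmtree(work)  # clean up the demonstration directories
print(moved)
```

Computing the timestamp once per run, instead of once per file, keeps all outputs of the same run grouped under an identical suffix.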

Throughout the week, I have been reading two different books to gain a general understanding of the Linux kernel and Python programming, which will help once I start writing scripts to isolate entropy sources and their collection for the Linux RNG.

Understanding the Linux Kernel (Bovet & Cesati): One important thing I learned from this book is that the open-source nature of the kernel has its advantages and disadvantages. On one hand, there is a strong base of Linux developers who are very talented programmers that understand how to make the operating system compact yet powerful. On the other hand, as I witnessed in trying to understand the RNG by looking at /drivers/char/random.c, the source code can get extremely messy. Another part of the overview of the kernel gave me an idea for a new source of entropy for the RNG. Linux dynamically links modules that contain code for file system management, device drivers, and other features so that main memory is not burdened with kernel code that may even go unused most of the time. If the time between module linking and unlinking events could be measured, this would provide a strong source of entropy, especially since the delta would be relatively large compared to other sources such as keystrokes or mouse clicks.   


Programming Python (Lutz): I decided to take a thorough, in-depth approach to learning Python since I found it to be a very powerful scripting language when writing my scripts for online sources. I have been getting accustomed to the syntax and writing simple Python programs, experimenting with the interpreter's behavior. In the book, I have covered keeping records, using dictionaries of dictionaries as a database, pickle files, and shelves. I left off on the advantages of object-oriented programming within Python (structure, encapsulation, customization, etc.) and the syntax for defining classes.