Python Forensics – Dshell and Scapy

DShell

Dshell is a Python-based network forensic analysis toolkit developed by the US Army Research Laboratory and released as open source in 2014. Its main goal is to make forensic investigations of network traffic easier.

The toolkit consists of a large number of decoders, some of which are listed in the following table.

Sr.No.  Decoder Name   Description
1       dns            Extracts DNS-related queries
2       reservedips    Identifies DNS resolutions that fall into reserved IP space
3       large-flows    Lists the netflows
4       rip-http       Extracts files from HTTP traffic
5       protocols      Identifies non-standard protocols

The US Army Research Laboratory maintains the repository on GitHub at the following link − https://github.com/USArmyResearchLab/Dshell

The clone contains a script, install-ubuntu.py, used to install the toolkit. A successful installation automatically builds the executables and dependencies that will be used later.

The dependencies are as follows −

    dependencies = {
        "Crypto": "crypto",
        "dpkt": "dpkt",
        "IPy": "ipy",
        "pcap": "pypcap"
    }

The toolkit is run against pcap (packet capture) files, which are usually recorded during an incident or an alert. These pcap files are created either by libpcap on Linux or by WinPcap on Windows.

Scapy

Scapy is a Python-based tool used to analyze and manipulate network traffic. Following is the link for the Scapy toolkit − http://www.secdev.org/projects/scapy/

Scapy is a packet manipulation tool. It can decode packets of a wide range of protocols and capture them. It differs from Dshell by giving the investigator a detailed description of the network traffic, and these descriptions can be recorded in real time. Scapy can also plot results with third-party tools and perform OS fingerprinting.

Consider the following example.

    import GeoIP                      # GeoIP toolkit
    from scapy.all import IP          # Scapy's IP layer

    # Load the GeoIP database into memory
    geoIp = GeoIP.new(GeoIP.GEOIP_MEMORY_CACHE)

    def locatePackage(pkg):
        src = pkg.getlayer(IP).src                      # source IP address
        dst = pkg.getlayer(IP).dst                      # destination IP address
        srcCountry = geoIp.country_code_by_addr(src)    # country code of the source
        dstCountry = geoIp.country_code_by_addr(dst)    # country code of the destination
        print("%s(%s) >> %s(%s)\n" % (src, srcCountry, dst, dstCountry))

For each packet passed to it, this script prints the source and destination addresses together with the country codes of the two hosts that are communicating, in the form src(country) >> dst(country).
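Both toolkits ultimately read pcap files. As a rough illustration of what such a file contains, the sketch below builds a tiny classic-format pcap in memory and walks its records using only the standard library. The helper names (build_pcap, iter_pcap_records) and the sample payloads are hypothetical, and the sketch assumes the little-endian, microsecond-timestamp variant of the format (magic number 0xa1b2c3d4). Real captures are better read with Dshell, Scapy, or dpkt.

```python
import io
import struct

PCAP_MAGIC = 0xA1B2C3D4                  # classic pcap, microsecond timestamps
# Global header: magic, version major, version minor, thiszone, sigfigs, snaplen, linktype
GLOBAL_HDR = struct.Struct("<IHHiIII")
# Per-packet header: ts_sec, ts_usec, captured length, original length
RECORD_HDR = struct.Struct("<IIII")


def build_pcap(packets, linktype=1):
    """Return the bytes of a minimal pcap file (hypothetical helper for the demo)."""
    out = io.BytesIO()
    out.write(GLOBAL_HDR.pack(PCAP_MAGIC, 2, 4, 0, 0, 65535, linktype))
    for ts_sec, payload in packets:
        out.write(RECORD_HDR.pack(ts_sec, 0, len(payload), len(payload)))
        out.write(payload)
    return out.getvalue()


def iter_pcap_records(data):
    """Yield (ts_sec, payload) for each record of a little-endian classic pcap."""
    stream = io.BytesIO(data)
    magic = GLOBAL_HDR.unpack(stream.read(GLOBAL_HDR.size))[0]
    if magic != PCAP_MAGIC:
        raise ValueError("not a little-endian classic pcap file")
    while True:
        raw = stream.read(RECORD_HDR.size)
        if len(raw) < RECORD_HDR.size:
            break                        # end of file
        ts_sec, _ts_usec, incl_len, _orig_len = RECORD_HDR.unpack(raw)
        yield ts_sec, stream.read(incl_len)


# Made-up sample data: two "packets" with arbitrary payload bytes
sample = build_pcap([(1400000000, b"\x01\x02\x03"), (1400000001, b"hello")])
records = list(iter_pcap_records(sample))
for ts, payload in records:
    print(ts, len(payload))
```

The 24-byte global header is followed by one 16-byte record header per packet, which is why a decoder can step through a capture without understanding the protocols inside it.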
Mobile Forensics
Python Forensics – Mobile Forensics

Forensic investigation and analysis of standard computer hardware such as hard disks has developed into a stable discipline, and it is now followed by techniques for analyzing non-standard hardware and transient evidence. Although smartphones are increasingly used in digital investigations, they are still considered non-standard.

Forensic Analysis

Forensic investigations search a smartphone for data such as received calls or dialed numbers. The data can also include text messages, photos, or any other incriminating evidence. Most smartphones have a screen-locking feature that uses a PIN or an alphanumeric password.

Here, we take an example to show how Python can help crack the screen-lock password in order to retrieve data from a smartphone.

Manual Examination

Android supports a password lock with either a numeric PIN or an alphanumeric password. Both kinds of passphrase must be between 4 and 16 digits or characters long. The password of a smartphone is stored in the Android system in a special file called password.key in /data/system.

Android stores a salted SHA-1 hash sum and a salted MD5 hash sum of the password. The hashes are computed by the following code.

    public byte[] passwordToHash(String password) {
        if (password == null) {
            return null;
        }

        String algo = null;
        byte[] hashed = null;

        try {
            byte[] saltedPassword = (password + getSalt()).getBytes();
            byte[] sha1 = MessageDigest.getInstance(algo = "SHA-1").digest(saltedPassword);
            byte[] md5 = MessageDigest.getInstance(algo = "MD5").digest(saltedPassword);
            hashed = (toHex(sha1) + toHex(md5)).getBytes();
        } catch (NoSuchAlgorithmException e) {
            Log.w(TAG, "Failed to encode string because of missing algorithm: " + algo);
        }

        return hashed;
    }

A plain dictionary attack against password.key is not feasible, because the password is hashed together with a salt. The salt is the hexadecimal string representation of a random 64-bit integer, so it has to be recovered first. The salt can be accessed by using a rooted smartphone or a JTAG adapter.

Rooted Smartphone

On a rooted smartphone, the file /data/system/password.key can be dumped directly. The salt is stored in the SQLite database settings.db under the key lockscreen.password_salt, where its value can be read in clear text.

JTAG Adapter

Special hardware known as a JTAG (Joint Test Action Group) adapter can be used to access the salt. A Riff-Box or a JIG adapter can be used in the same way.

Using the information obtained from the Riff-Box, we can find the position of the stored data, i.e., the salt. The rules are as follows −

    Search for the associated string "lockscreen.password_salt".
    The byte gives the actual width of the salt, that is, its length.
    This length is what is then searched for to get the stored password/PIN of the smartphone.

This set of rules helps in getting the appropriate salt data.
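Once the salt is known, a short numeric PIN can be tested exhaustively. The following is a minimal sketch that mirrors the passwordToHash logic above in Python. The function names and the sample salt are hypothetical, and two details are assumptions rather than facts taken from this page: that the salt is rendered as lowercase hexadecimal of the unsigned 64-bit value, and that toHex() produces uppercase digits. Adjust both if a real device differs.

```python
import hashlib


def android_password_hash(password, salt):
    """Return hex(SHA-1) + hex(MD5) of the password concatenated with the salt string."""
    # Assumption: the signed 64-bit salt is rendered as unsigned lowercase hex
    salt_hex = format(salt & 0xFFFFFFFFFFFFFFFF, "x")
    salted = (password + salt_hex).encode("utf-8")
    sha1_hex = hashlib.sha1(salted).hexdigest()
    md5_hex = hashlib.md5(salted).hexdigest()
    # Assumption: toHex() emits uppercase hexadecimal digits
    return (sha1_hex + md5_hex).upper()


def recover_pin(target_hash, salt, min_digits=4, max_digits=4):
    """Try every numeric PIN in the given length range; return the match or None."""
    for length in range(min_digits, max_digits + 1):
        for number in range(10 ** length):
            candidate = str(number).zfill(length)
            if android_password_hash(candidate, salt) == target_hash:
                return candidate
    return None


# Made-up demonstration values, not taken from a real device
demo_salt = -1234567890123456789
demo_hash = android_password_hash("4821", demo_salt)
print(recover_pin(demo_hash, demo_salt))
```

A four-digit PIN has only 10,000 candidates, so the search finishes almost instantly. Each extra digit multiplies the work by ten, and alphanumeric passwords grow far faster.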
Virtualization
Python Forensics – Virtualization

Virtualization is the process of emulating IT systems such as servers, workstations, networks, and storage. It is the creation of a virtual rather than an actual version of an operating system, a server, a storage device, or network resources.

The main component that emulates the virtual hardware is called a hypervisor.

[Figure: the two main types of system virtualization]

Virtualization is used in computational forensics in a number of ways. It lets the analyst start each investigation from a workstation in a validated state. Data recovery is possible by attaching the dd image of a drive as a secondary drive on a virtual machine. The same machine can then run recovery software to gather the evidence.

The following example shows how to create a virtual machine with the help of Python.

Step 1 − Let the virtual machine be named 'dummy1'. The virtual machine is given 512 MB of memory, expressed in bytes.

    vm_memory = 512 * 1024 * 1024

Step 2 − The virtual machine must be attached to the default cluster, which is looked up by name.

    vm_cluster = api.clusters.get(name = "Default")

Step 3 − The virtual machine must boot from the virtual hard disk drive.

    vm_os = params.OperatingSystem(boot = [params.Boot(dev = "hd")])

All the options are combined into a virtual machine parameter object, which is then passed to the add method of the vms collection to create the virtual machine.

Example

Following is the complete Python script for adding a virtual machine.

    from ovirtsdk.api import API          # oVirt API library
    from ovirtsdk.xml import params

    try:
        # API credentials are required to manage virtual machines
        api = API(url = "https://HOST",
                  username = "Radhika",
                  password = "a@123",
                  ca_file = "ca.crt")

        vm_name = "dummy1"
        vm_memory = 512 * 1024 * 1024     # memory in bytes
        vm_cluster = api.clusters.get(name = "Default")
        vm_template = api.templates.get(name = "Blank")

        # Boot from the virtual hard disk
        vm_os = params.OperatingSystem(boot = [params.Boot(dev = "hd")])

        vm_params = params.VM(name = vm_name,
                              memory = vm_memory,
                              cluster = vm_cluster,
                              template = vm_template,
                              os = vm_os)

        try:
            api.vms.add(vm = vm_params)
            print('Virtual machine "%s" added.' % vm_name)    # printed on success
        except Exception as ex:
            print('Adding virtual machine "%s" failed: %s' % (vm_name, ex))

        api.disconnect()

    except Exception as ex:
        print("Unexpected error: %s" % ex)

Output

If the virtual machine is created successfully, the script prints −

    Virtual machine "dummy1" added.
Hash Function
Python Forensics – Hash Function

A hash function maps an arbitrarily large amount of data to a fixed-length value. The same input always produces the same output, which is called the hash sum. The hash sum therefore acts as a compact fingerprint of the data it was computed from.

A cryptographic hash function is practically impossible to invert. An attacker cannot compute the input from the hash and is left with guessing inputs, for example by brute force, which is infeasible for sufficiently long and random inputs. This kind of algorithm is also called a one-way cryptographic algorithm.

An ideal cryptographic hash function has four main properties −

    It must be easy to compute the hash value for any given input.
    It must be infeasible to generate the original input from its hash.
    It must be infeasible to modify the input without changing the hash.
    It must be infeasible to find two different inputs with the same hash.

Example

Consider the following example, which stores a salted password hash in hexadecimal format and checks a second entry against it.

    import uuid
    import hashlib

    def hash_password(password):
        # uuid4 generates a random salt, stored as a hexadecimal string
        salt = uuid.uuid4().hex
        return hashlib.sha256(salt.encode() + password.encode()).hexdigest() + ":" + salt

    def check_password(hashed_password, user_password):
        # Split the stored string into digest and salt, then re-hash the candidate
        password, salt = hashed_password.split(":")
        return password == hashlib.sha256(salt.encode() + user_password.encode()).hexdigest()

    new_pass = input("Please enter required password ")
    hashed_password = hash_password(new_pass)
    print("The string to store in the db is: " + hashed_password)
    old_pass = input("Re-enter new password ")

    if check_password(hashed_password, old_pass):
        print("Yuppie!! You entered the right password")
    else:
        print("Oops! I am sorry but the password does not match")

Flowchart

[Flowchart: logic of the password-matching program]

Output

The program prints the string to store (the SHA-256 digest and the salt, separated by a colon) and then reports whether the second entry matches.

When the password is entered identically both times, the second entry hashes to the same digest as the first. This confirms that the password was typed correctly, and only the salted hash, not the password itself, needs to be stored.
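The properties listed above can be observed directly with hashlib. The sketch below is only an illustration: the helper names and the sample strings are made up, and observing a few inputs demonstrates the properties but does not prove them.

```python
import hashlib


def sha256_hex(data):
    """Return the SHA-256 digest of a bytes object as a hex string."""
    return hashlib.sha256(data).hexdigest()


def differing_bits(hex_a, hex_b):
    """Count the bit positions in which two equal-length hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


first = sha256_hex(b"forensic evidence")
again = sha256_hex(b"forensic evidence")
changed = sha256_hex(b"forensic evidencf")   # only the last character differs

print("same input, same hash:", first == again)
print("digest length in hex characters:", len(first))
print("bits changed after a one-character edit:", differing_bits(first, changed))
```

The digest length stays fixed however long the input is, and a one-character edit typically flips about half of the 256 output bits. This is why a hash sum is useful for detecting tampering with evidence.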
Implementation of Cloud
Python Forensics – Implementation of Cloud

Cloud computing can be defined as a collection of hosted services provided to users over the Internet. It enables organizations to consume computing resources such as Virtual Machines (VMs), storage, or applications as a utility.

One of the most important advantages of building applications in Python is that they can be deployed on virtually any platform, including the cloud. Python can be executed on cloud servers and can also be launched on handy devices such as a desktop, tablet, or smartphone.

One interesting use is generating rainbow tables in the cloud. Doing so involves integrating single-process and multiprocessing versions of the application, which requires some consideration.

Pi Cloud

Pi Cloud is a cloud computing platform that integrates the Python programming language with the computing power of Amazon Web Services.

Let's take a look at an example of implementing Pi Cloud with rainbow tables.

Rainbow Tables

A rainbow table is a precomputed listing of plain-text passwords together with their hashes for a given hash algorithm.

    Rainbow tables follow a standard pattern, which creates a list of hashed passwords.

    A text file is used to hold the generated passwords, that is, the characters or plain text of the passwords to be hashed.

    The file is used by Pi Cloud, which calls the main function.

    The output of hashed passwords is stored in the text file as well.

This algorithm can also be used to save passwords in a database and to keep a backup in the cloud system.

The following program creates a list of hashed passwords in a text file.

Example

    import hashlib
    import random
    import string

    import enchant                    # dictionary checks for the generated words
    import cloud                      # Pi Cloud client library

    def randomword(length):
        return "".join(random.choice(string.ascii_lowercase) for i in range(length))

    def makekey(engdict, length):
        # Use the random word if it is an English word, otherwise the first
        # dictionary suggestion, otherwise the random word itself
        word = randomword(length)
        if not engdict.check(word):
            englist = engdict.suggest(word)
            if len(englist) > 0:
                word = englist[0]
        return word + str(random.randint(0, 99))

    def mainroutine():
        engdict = enchant.Dict("en_US")
        # Append the generated passwords and their hashes to password.txt
        with open("password.txt", "a+") as fileb:
            while True:               # runs until interrupted
                for randomkey in (makekey(engdict, 6), makekey(engdict, 5)):
                    whasher = hashlib.new("md5")
                    whasher.update(randomkey.encode())
                    line = randomkey + " + " + whasher.hexdigest() + "\n"
                    print(line)
                    fileb.write(line)

                jid = cloud.call(randomword, 6)   # randomword(6) evaluated on Pi Cloud
                cloud.result(jid)
                print("Value added to cloud")
                print("Password added")

    print("Author- Radhika Subramanian")
    mainroutine()

Output

Each pass of the loop prints the generated passwords with their MD5 hashes, followed by the confirmation messages. The same password and hash pairs are appended to the text file password.txt.
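To show how such a precomputed list is used once it exists, here is a minimal sketch that needs neither Pi Cloud nor enchant. Note the simplification: this is a plain hash-to-plaintext lookup table, not a true rainbow table, which would use hash chains and reduction functions to save space. The function name, the three-letter alphabet, and the target word are assumptions chosen to keep the demonstration small.

```python
import hashlib
import itertools
import string


def build_lookup_table(alphabet, length):
    """Map the MD5 hex digest of every word of the given length to the word itself."""
    table = {}
    for chars in itertools.product(alphabet, repeat=length):
        word = "".join(chars)
        table[hashlib.md5(word.encode()).hexdigest()] = word
    return table


# All three-letter lowercase words: 26 ** 3 entries
table = build_lookup_table(string.ascii_lowercase, 3)

# An unsalted hash found during an investigation can now be reversed by lookup
target = hashlib.md5(b"key").hexdigest()
print(table.get(target))
```

The lookup only works for unsalted hashes whose plain text falls inside the precomputed space. A per-password salt, as used in the Hash Function chapter, defeats a table like this.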
Indexing
Python Forensics – Indexing

Indexing gives the investigator a complete look at a file so that potential evidence can be gathered from it. The evidence could be contained within a file, a disk image, a memory snapshot, or a network trace.

Indexing reduces the time needed for time-consuming tasks such as keyword searching. A forensic investigation also involves an interactive searching phase, in which the index is used to locate keywords rapidly.

Indexing also helps in listing the keywords in a sorted list.

Example

The following example shows how you can use indexing in Python.

    aList = [123, "sample", "zara", "indexing"]

    print("Index for sample :", aList.index("sample"))
    print("Index for indexing :", aList.index("indexing"))

    str1 = "This is sample message for forensic investigation indexing"
    str2 = "sample"

    print("Index of the character keyword found is")
    print(str1.index(str2))

The above script will produce the following output.

    Index for sample : 1
    Index for indexing : 3
    Index of the character keyword found is
    8
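The index() calls above scan the data on every search. For repeated keyword searches, the idea described in this chapter is to build the index once and then look keywords up. The following is a minimal sketch of such a keyword index. The function name and the sample text are hypothetical, and a real forensic indexer would work on files or disk images rather than a short string.

```python
import re
from collections import defaultdict


def build_keyword_index(text):
    """Map each lowercase keyword to the list of character offsets where it starts."""
    index = defaultdict(list)
    for match in re.finditer(r"[A-Za-z0-9_]+", text):
        index[match.group().lower()].append(match.start())
    return index


evidence = ("This is sample message for forensic investigation indexing. "
            "Another sample follows.")
index = build_keyword_index(evidence)

print(sorted(index))              # keywords in sorted order
print(index.get("sample"))        # every offset at which "sample" occurs
print(index.get("missing", []))   # keywords that never occur yield an empty list
```

Building the index costs one pass over the data. After that, each keyword lookup is a dictionary access, which is what makes the interactive search phase fast.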
Home
Python Forensics Tutorial

Python has built-in capabilities to support digital investigation and protect the integrity of evidence during an investigation. In this tutorial, we explain the fundamental concepts of applying Python in computational (digital) forensics, including extracting evidence, collecting basic data, and hashing passwords as required.

Audience

This tutorial is meant for readers who want to improve their understanding of digital or computational forensics through the use of Python. It will help you understand how to integrate Python into computational forensics.

Prerequisites

Before starting with this tutorial, it is important that you understand the basic concepts of computational forensics. Prior exposure to Python will also help.
Overview of Python
Python Forensics – Overview of Python

Code written in Python looks quite similar to code written in other conventional programming languages such as C or Pascal. It is also said that the syntax of Python borrows heavily from C, including many Python keywords that resemble those of the C language.

Python includes conditional and looping statements, which can be used to extract data accurately for forensics. For flow control, it provides if/else, while, and a high-level for statement that loops over any "iterable" object.

    if a < b:
        max = b
    else:
        max = a

The major area where Python differs from many other programming languages is its use of dynamic typing. Variable names refer to objects, and variables need not be declared.

Data Types

Python includes a set of built-in data types such as strings, Booleans, and numbers. Some types are immutable, which means their values cannot be changed after they are created.

Python also has compound built-in data types: tuples, which are immutable arrays; lists; and dictionaries, which are hash tables. All of them are used in digital forensics to store values while gathering evidence.

Third-party Modules and Packages

Python organizes programs into modules (related code grouped together in a single source file) and packages (groups of modules). Many of these are provided by third parties.

Python also includes an extensive standard library, which is one of the main reasons for its popularity in computational forensics.

Life Cycle of Python Code

    When you execute Python code, the interpreter first checks the code for syntax errors. If it discovers any, they are displayed immediately as error messages.

    If there are no syntax errors, the code is compiled to bytecode and sent to the PVM (Python Virtual Machine).

    The PVM executes the bytecode. If a runtime error occurs during execution, it is reported immediately as an error message. Logical errors are not detected and simply lead to wrong results.

    If the bytecode runs without errors, the code is processed and you get its output.

[Figure: Python source code is compiled to bytecode, which the PVM processes to produce the output]
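The two stages of this life cycle can be observed from within Python by separating compilation from execution. The following is a small sketch that uses the built-in compile() and exec() functions. The helper name run_source and the sample snippets are made up for illustration.

```python
def run_source(source):
    """Compile and then execute a snippet, reporting which stage failed, if any."""
    try:
        # Stage 1: syntax check and compilation to a bytecode object
        bytecode = compile(source, "<sample>", "exec")
    except SyntaxError:
        return "syntax error"
    try:
        # Stage 2: the bytecode is executed in a fresh namespace
        exec(bytecode, {})
    except Exception as err:
        return "runtime error: " + type(err).__name__
    return "ok"


print(run_source("x = 1 + 1"))        # compiles and runs
print(run_source("x = (1 + "))        # rejected before any code runs
print(run_source("x = 1 / 0"))        # compiles, then fails during execution
print(run_source("total = 2 - 2"))    # a logic slip (minus for plus) goes unnoticed
```

The last call is the important one for an investigator. Neither stage flags a logical error, so results produced by a forensic script still need to be validated independently.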
Installation of Python
Python Forensics – Installation of Python

As we need Python for all the activities of computational forensics, let us move step by step and understand how to install it.

Step 1 − Go to https://www.python.org/downloads/ and download the Python installation files for the operating system you have on your system.

Step 2 − After downloading the package/installer, click on the exe file to start the installation process. The installer shows a confirmation screen once the installation is complete.

Step 3 − The next step is to set the environment variables of Python in your system.

Step 4 − Once the environment variables are set, type the command "python" on the command prompt to verify whether the installation was successful.

If the installation was successful, the interpreter starts and the console shows the Python version banner followed by the >>> prompt.
Basic Forensic Application
Python Forensics – Basic Forensic Application

To create an application that follows forensic guidelines, it is important to understand and follow its naming conventions and patterns.

Naming Conventions

The rules and conventions to be followed during the development of Python forensics applications are described in the following table.

Item                  Convention                                                               Example
Constants             Uppercase with underscore separation                                     HIGH_TEMPERATURE
Local variable name   Lowercase with bumpy caps (underscores are optional)                     currentTemperature
Global variable name  Prefix gl, lowercase with bumpy caps (underscores are optional)          gl_maximumRecordedTemperature
Functions name        Uppercase with bumpy caps (underscores optional), with active voice      ConvertFarenheitToCentigrade(…)
Object name           Prefix ob_, lowercase with bumpy caps                                    ob_myTempRecorder
Module                An underscore followed by lowercase with bumpy caps                      _tempRecorder
Class names           Prefix class_, then bumpy caps, and keep brief                           class_TempSystem

Let us take a scenario to understand the importance of naming conventions in computational forensics. Suppose we have a hashing algorithm of the kind normally used for protecting the integrity of data. A one-way hashing algorithm takes a stream of binary data as input; this could be a password, a file, or any other digital data. The algorithm then produces a message digest (md) for the data received as input. It is practically impossible to create a new binary input that will generate a given message digest. If even a single bit of the binary input changes, the algorithm generates a message digest that differs from the previous one.

Example

Take a look at the following sample program.

    import sys, hashlib                        # necessary libraries

    print("Please enter your full name")

    line = sys.stdin.readline()
    line = line.rstrip()

    md5_object = hashlib.md5()
    md5_object.update(line.encode())

    # Prints the digest produced by the hashing algorithm, i.e. md5
    print(md5_object.hexdigest())

The above program prints the MD5 digest of the input as a 32-character hexadecimal string.

In this program, the Python script accepts the input (your full name) and hashes it with the md5 algorithm. The digest does not encrypt the data, but it lets any later change to the data be detected. As per forensic guidelines, the names of evidence items or any other proofs can be secured in this pattern.
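The same pattern extends from a typed name to an evidence file. The sketch below hashes a file in chunks and tries to follow the naming conventions from the table above (uppercase constant, bumpy-caps function, ob_ prefix for the hasher object). The function name, the chunk size, and the throwaway demo file are hypothetical. SHA-256 is used as the default instead of MD5, which is an assumption about what a current tool would prefer rather than something this tutorial prescribes.

```python
import hashlib
import os
import tempfile

CHUNK_SIZE = 64 * 1024                       # constant: uppercase with underscores


def HashEvidenceFile(filePath, algorithmName="sha256"):
    """Return the hex digest of a file, reading it in fixed-size chunks."""
    ob_hasher = hashlib.new(algorithmName)   # object: ob_ prefix
    with open(filePath, "rb") as evidenceFile:
        # iter() with a sentinel keeps reading until read() returns b""
        for dataChunk in iter(lambda: evidenceFile.read(CHUNK_SIZE), b""):
            ob_hasher.update(dataChunk)
    return ob_hasher.hexdigest()


# Demonstration on a throwaway file with made-up content
with tempfile.NamedTemporaryFile(delete=False) as tempFile:
    tempFile.write(b"Jane Doe")
    tempPath = tempFile.name

originalDigest = HashEvidenceFile(tempPath)

with open(tempPath, "ab") as tempFile:       # simulate tampering: append one byte
    tempFile.write(b"!")

modifiedDigest = HashEvidenceFile(tempPath)
os.remove(tempPath)

print("digest before:", originalDigest)
print("digest after :", modifiedDigest)
print("evidence unchanged:", originalDigest == modifiedDigest)
```

Reading in chunks keeps memory use constant, which matters when the evidence is a multi-gigabyte disk image rather than a short string.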