Python Digital Forensics – Introduction

This chapter introduces digital forensics and gives a brief historical review of the field. It also covers where digital forensics is applied in real life and what its limitations are.

What is Digital Forensics?

Digital forensics may be defined as the branch of forensic science that analyzes, examines, identifies and recovers digital evidence residing on electronic devices. It is commonly used in criminal law and private investigations. For example, you can rely on digital forensics to extract evidence when somebody steals data from an electronic device.

Brief Historical Review of Digital Forensics

The history of computer crime and of digital forensics is outlined below −

1970s-1980s: First Computer Crime

Before this period, computer crime was not recognized as a separate category of offence; when such an incident occurred, it was dealt with under the laws that already existed. In 1978, computer crime was first recognized in law by the Florida Computer Crimes Act, which included legislation against unauthorized modification or deletion of data on a computer system. Over time, as technology advanced, the range of computer crimes being committed also increased, and further laws were passed to deal with crimes related to copyright, privacy and child pornography.

1980s-1990s: Development Decade

This was the development decade for digital forensics, largely because of an early investigation (1986) in which Cliff Stoll tracked the hacker Markus Hess. During this period, two kinds of digital forensics discipline developed: the first grew out of ad-hoc tools and techniques built by practitioners who took it up as a hobby, while the second was developed by the scientific community. In 1992, the term “Computer Forensics” was used in academic literature.

2000s-2010s: Decade of Standardization

Once digital forensics had developed to a certain level, specific standards were needed that could be followed while performing investigations. Accordingly, various scientific agencies and bodies published guidelines for digital forensics. In 2002, the Scientific Working Group on Digital Evidence (SWGDE) published a paper named “Best practices for Computer Forensics”. Another milestone was the European-led international treaty “The Convention on Cybercrime”, which was signed by 43 nations and ratified by 16. Even with such standards in place, researchers have identified issues that still need to be resolved.

Process of Digital Forensics

Since the first recognized computer crime in 1978, digital criminal activity has increased enormously, and this growth calls for a structured way of dealing with it. A formalized process was introduced in 1984, and a great number of new and improved computer forensics investigation processes have been developed since then. A computer forensics investigation involves three major phases, as explained below −

Phase 1: Acquisition or Imaging of Exhibits

The first phase of digital forensics involves saving the state of the digital system so that it can be analyzed later. It is very similar to taking photographs, blood samples and so on from a crime scene. For example, it involves capturing an image of the allocated and unallocated areas of a hard disk, or of RAM.

Phase 2: Analysis

The input to this phase is the data acquired in the acquisition phase. Here, the data is examined to identify evidence. This phase yields three kinds of evidence −

Inculpatory evidence − Evidence that supports a given theory.
Exculpatory evidence − Evidence that contradicts a given theory.
Evidence of tampering − Evidence showing that the system was tampered with to avoid identification.

The analysis phase also includes examining files and directory contents in order to recover deleted files.

Phase 3: Presentation or Reporting

As the name suggests, this phase presents the conclusions of the investigation together with the corresponding evidence.

Applications of Digital Forensics

Digital forensics deals with gathering, analyzing and preserving the evidence contained in any digital device. How it is used depends on the application. As mentioned earlier, it is used mainly in the following two applications −

Criminal Law

In criminal law, evidence is collected to support or oppose a hypothesis in court. Forensic procedures are very similar to those used in criminal investigations, but with different legal requirements and limitations.

Private Investigation

Private investigation is mainly a corporate use of digital forensics. It is applied when a company suspects that employees may be performing an illegal activity on their computers, or one that is against company policy. Digital forensics is one of the best routes for a company or a person to take when investigating someone for digital misconduct.

Branches of Digital Forensics

Digital crime is not restricted to computers alone; hackers and criminals also use small digital devices such as tablets and smartphones on a very large scale. Some devices have volatile memory, while others have non-volatile memory. Depending on the type of device, digital forensics has the following branches −

Computer Forensics

This branch of digital forensics deals with computers, embedded systems and static memory such as USB drives. A wide range of information, from logs to the actual files on a drive, can be investigated in computer forensics.

Mobile Forensics

This branch deals with the investigation of data from mobile devices.

This branch differs from computer forensics in that mobile devices have an inbuilt communication system, which can provide useful information related to location.

Network Forensics

This branch deals with the monitoring and analysis of computer network traffic, both local and WAN (wide area network), for the purposes of information gathering, evidence collection or intrusion detection.

Database Forensics

This branch of digital forensics deals with the forensic study of databases and their metadata.

Skills Required for Digital Forensics Investigation

Digital forensics examiners help to track hackers, recover stolen data, follow computer attacks back to their source, and aid in other types of investigations involving computers. Some of the key skills required to become a digital forensics examiner are discussed below −

Outstanding
Python Digital Forensics Tutorial

Digital forensics is the branch of forensic science that analyzes, examines, identifies and recovers digital evidence from electronic devices. It is commonly used in criminal law and private investigation. This tutorial will make you comfortable with performing digital forensics in Python on digital devices running Windows. You will learn the relevant concepts and the coding needed to carry out digital forensics in Python.

Audience

This tutorial will be useful for graduates, postgraduates and research students who either have an interest in this subject or have it as part of their curriculum. Any reader who is keen to learn about digital forensics using the Python programming language can also pick up this tutorial.

Prerequisites

This tutorial assumes that the reader has a basic knowledge of operating systems and computer networks, and of Python programming. If you are new to any of these subjects, we strongly suggest that you go through tutorials on them before you start with this one.
Investigating Embedded Metadata

In this chapter, we will learn in detail how to investigate embedded metadata using Python.

Introduction

Embedded metadata is information about data that is stored in the same file as the object it describes. In other words, it is information about a digital asset stored in the digital file itself. It travels with the file. In digital forensics, we cannot always extract every piece of information about a particular file; embedded metadata, however, can provide information critical to the investigation. For example, a text file’s metadata may contain information about the author, the document's length, the date it was written and even a short summary of the document. A digital image may include metadata such as the length of the image, the shutter speed and so on.

Artifacts Containing Metadata Attributes and their Extraction

In this section, we will look at various artifacts that contain metadata attributes and at how to extract them using Python.

Audio and Video

These are two very common artifacts that carry embedded metadata, which can be extracted for the purposes of an investigation.

You can use the following Python script to extract common attributes or metadata from an audio (MP3) file and a video (MP4) file.

Note that this script needs a third-party Python library named mutagen, which allows us to extract metadata from audio and video files. It can be installed with the following command −

    pip install mutagen

The libraries we need to import for this Python script are as follows −

    from __future__ import print_function
    import argparse
    import json
    import mutagen

The command-line handler will take one argument, which represents the path to the MP3 or MP4 file.
Then, we will use the mutagen.File() function to open a handle to the file. In the complete script, this block goes at the bottom of the file, after the two handler functions it calls −

    if __name__ == "__main__":
        parser = argparse.ArgumentParser("Python Metadata Extractor")
        parser.add_argument("AV_FILE", help="File to extract metadata from")
        args = parser.parse_args()
        av_file = mutagen.File(args.AV_FILE)
        # Choose the handler from the file extension
        file_ext = args.AV_FILE.rsplit(".", 1)[-1]
        if file_ext.lower() == "mp3":
            handle_id3(av_file)
        elif file_ext.lower() == "mp4":
            handle_mp4(av_file)

Now, we need two handlers, one to extract the data from an MP3 file and one to extract the data from an MP4 file. We can define them as follows −

    def handle_id3(id3_file):
        # Map ID3 frame IDs to readable names
        id3_frames = {"TIT2": "Title", "TPE1": "Artist", "TALB": "Album",
                      "TXXX": "Custom", "TCON": "Content Type",
                      "TDRL": "Date released", "COMM": "Comments",
                      "TDRC": "Recording Date"}
        print("{:15} | {:15} | {:38} | {}".format(
            "Frame", "Description", "Text", "Value"))
        print("-" * 85)
        for frames in id3_file.tags.values():
            frame_name = id3_frames.get(frames.FrameID, frames.FrameID)
            desc = getattr(frames, "desc", "N/A")
            text = getattr(frames, "text", ["N/A"])[0]
            value = getattr(frames, "value", "N/A")
            # Date frames hold timestamp objects, so convert them for printing
            if "date" in frame_name.lower():
                text = str(text)
            print("{:15} | {:15} | {:38} | {}".format(
                frame_name, desc, text, value))

    def handle_mp4(mp4_file):
        cp_sym = u"\u00A9"  # copyright sign that prefixes many QuickTime tags
        qt_tag = {
            cp_sym + "nam": "Title", cp_sym + "art": "Artist",
            cp_sym + "alb": "Album", cp_sym + "gen": "Genre",
            "cpil": "Compilation", cp_sym + "day": "Creation Date",
            "cnID": "Apple Store Content ID", "atID": "Album Title ID",
            "plID": "Playlist ID", "geID": "Genre ID", "pcst": "Podcast",
            "purl": "Podcast URL", "egid": "Episode Global ID",
            "cmID": "Camera ID", "sfID": "Apple Store Country",
            "desc": "Description", "ldes": "Long Description"}
        # apple_genres.json is expected in the working directory
        genre_ids = json.load(open("apple_genres.json"))

Now, still inside handle_mp4(), we need to iterate through the tags of the MP4 file as follows −

        print("{:22} | {}".format("Name", "Value"))
        print("-" * 40)
        for name, value in mp4_file.tags.items():
            tag_name = qt_tag.get(name, name)
            if isinstance(value, list):
                value = "; ".join([str(x) for x in value])
            if name == "geID":
                value = "{}: {}".format(
                    value, genre_ids[str(value)].replace("|", " - "))
            print("{:22} | {}".format(tag_name, value))

The above script prints the embedded metadata of MP3 as well as MP4 files.

Images

Images may contain different kinds of metadata depending on their file format. Many images, particularly photos taken with smartphones and GPS-enabled cameras, embed GPS information. We can extract this GPS information by using third-party Python libraries. The following Python script can be used to do so −

First, install Pillow, the maintained fork of the Python Imaging Library (PIL), as follows −

    pip install pillow

This will help us to extract metadata from images.

We can also write the GPS details embedded in images to a KML file, but for this we need to install a third-party Python library named simplekml as follows −

    pip install simplekml

In this script, we first need to import the following libraries −

    from __future__ import print_function
    import argparse
    from PIL import Image
    from PIL.ExifTags import TAGS
    import simplekml
    import sys

Now, the command-line handler will accept one positional argument, which represents the file path of the photo.

    parser = argparse.ArgumentParser("Metadata from images")
    parser.add_argument("PICTURE_FILE", help="Path to picture")
    args = parser.parse_args()

Now, we need to specify the URL templates that will be populated with the coordinate information. The templates are gmaps and open_maps. We also need a function to convert the degrees, minutes, seconds (DMS) tuple provided by the PIL library into decimal degrees. It can be done as follows −

    gmaps = "https://www.google.com/maps?q={},{}"
    open_maps = "http://www.openstreetmap.org/?mlat={}&mlon={}"

    def process_coords(coord):
        # Assumes each value is a (numerator, denominator) tuple, as older
        # PIL versions return; newer Pillow releases may return rational
        # objects instead.
        coord_deg = 0
        for count, values in enumerate(coord):
            # degrees / 1, minutes / 60, seconds / 3600
            coord_deg += (float(values[0]) / values[1]) / 60**count
        return coord_deg

Now, we will use the Image.open() function to open the file as a PIL object.

    img_file = Image.open(args.PICTURE_FILE)
    exif_data = img_file._getexif()
    if exif_data is None:
        print("No EXIF data found")
        sys.exit()
    for name, value in exif_data.items():
        gps_tag = TAGS.get(name, name)
        # Compare by value; "is not" would test object identity
        if gps_tag != "GPSInfo":
            continue

After finding the GPSInfo tag, and still inside the loop, we will store the GPS reference and process the coordinates with the process_coords() function.

        lat_ref = value[1] == u"N"
        lat = process_coords(value[2])
        if not lat_ref:
            lat = lat * -1
        lon_ref = value[3] == u"E"
        lon = process_coords(value[4])
        if not lon_ref:
            lon = lon * -1

Now, create a kml object from the simplekml library as follows −

        kml = simplekml.Kml()
        kml.newpoint(name=args.PICTURE_FILE, coords=[(lon, lat)])
        kml.save(args.PICTURE_FILE + ".kml")

We can now print the coordinates from the processed information as follows −

        print("GPS Coordinates: {}, {}".format(lat, lon))
        print("Google Maps URL: {}".format(gmaps.format(lat, lon)))
        print("OpenStreetMap URL: {}".format(open_maps.format(lat, lon)))
        print("KML File {} created".format(args.PICTURE_FILE + ".kml"))

PDF Documents

PDF documents have a
Python Digital Network Forensics-I

This chapter explains the fundamentals of performing network forensics using Python.

Understanding Network Forensics

Network forensics is a branch of digital forensics that deals with the monitoring and analysis of computer network traffic, both local and WAN (wide area network), for the purposes of information gathering, evidence collection or intrusion detection. Network forensics plays a critical role in investigating digital crimes such as theft of intellectual property or leakage of information. A picture of network communications helps an investigator to answer some crucial questions, such as −

What websites have been accessed?
What kind of content has been uploaded on our network?
What kind of content has been downloaded from our network?
What servers are being accessed?
Is somebody sending sensitive information outside the company firewalls?

Internet Evidence Finder (IEF)

IEF is a digital forensic tool used to find, analyze and present digital evidence found on different digital media such as computers, smartphones and tablets. It is very popular and is used by thousands of forensics professionals.

Use of IEF

Some of the uses of IEF are as follows −

Thanks to its powerful search capabilities, it is used to search multiple files or data media simultaneously.
It is used to recover deleted data from unallocated space or RAM through new carving techniques.
Investigators who want to rebuild web pages in their original format, as they were on the date they were opened, can use IEF.
It is also used to search logical or physical disk volumes.

Dumping Reports from IEF to CSV using Python

IEF stores its data in a SQLite database. The following Python script dynamically identifies the result tables within the IEF database and dumps each of them to its own CSV file.
This process is done in the steps shown below −

First, generate the IEF result database, which will be a SQLite database file ending with the .db extension.
Then, query that database to identify all the tables.
Lastly, write each result table to an individual CSV file.

Python Code

Let us see how to use Python code for this purpose −

For the Python script, import the necessary libraries as follows −

    from __future__ import print_function
    import argparse
    import csv
    import os
    import sqlite3
    import sys

Now, we need to provide the path to the IEF database file and the output directory. In the complete script, this block and the existence check that follows go at the bottom of the file, after main() −

    if __name__ == "__main__":
        parser = argparse.ArgumentParser("IEF to CSV")
        parser.add_argument("IEF_DATABASE", help="Input IEF database")
        parser.add_argument("OUTPUT_DIR", help="Output DIR")
        args = parser.parse_args()

Now, we will create the output directory if needed and confirm the existence of the IEF database as follows −

        if not os.path.exists(args.OUTPUT_DIR):
            os.makedirs(args.OUTPUT_DIR)
        if os.path.exists(args.IEF_DATABASE) and \
                os.path.isfile(args.IEF_DATABASE):
            main(args.IEF_DATABASE, args.OUTPUT_DIR)
        else:
            print("[-] Supplied input file {} does not exist or is not a "
                  "file".format(args.IEF_DATABASE))
            sys.exit(1)

Now, as we did in earlier scripts, make the connection to the SQLite database as follows, so that queries can be executed through a cursor −

    def main(database, out_directory):
        print("[+] Connecting to SQLite database")
        conn = sqlite3.connect(database)
        c = conn.cursor()

The following lines of code will fetch the names of the tables from the database −

        print("List of all tables to extract")
        c.execute("select * from sqlite_master where type = 'table'")
        # Column 2 of sqlite_master is the table name
        tables = [x[2] for x in c.fetchall()
                  if not x[2].startswith("_") and not x[2].endswith("_DATA")]

Now, we will select all the data from each table and, by calling the fetchall() method on the cursor object, store the list of tuples containing the table’s data in its entirety in a variable −

        print("Dumping {} tables to CSV files in {}".format(
            len(tables), out_directory))
        for table in tables:
            c.execute("pragma table_info('{}')".format(table))
            table_columns = [x[1] for x in c.fetchall()]
            c.execute("select * from '{}'".format(table))
            table_data = c.fetchall()

Now, still inside the loop, we will use csv.writer() to write the content to a CSV file −

            csv_name = table + ".csv"
            csv_path = os.path.join(out_directory, csv_name)
            print("[+] Writing {} table to {} CSV file".format(table, csv_name))
            with open(csv_path, "w", newline="") as csvfile:
                csv_writer = csv.writer(csvfile)
                csv_writer.writerow(table_columns)
                csv_writer.writerows(table_data)

The above script fetches all the data from the tables of the IEF database and writes the contents of each table to a CSV file in the output directory of our choice.

Working with Cached Data

From the IEF result database, we can fetch more information than IEF itself necessarily supports. For example, we can fetch cached data, a byproduct of normal use, left behind by email service providers such as Yahoo and Google.

The following Python script uses the IEF database to access the cached data from Yahoo Mail as accessed in Google Chrome. Note that the steps are more or less the same as in the previous Python script.
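
The table-discovery and dump steps that both scripts share can be tried without an IEF case file. The following is a minimal sketch, assuming a throwaway in-memory SQLite database as a stand-in for an IEF result database; the helper names list_result_tables() and dump_table(), the _internal table and the sample row are hypothetical and not part of IEF.

```python
import csv
import os
import sqlite3
import tempfile


def list_result_tables(conn):
    """Return table names, skipping those the dump script also skips."""
    cur = conn.cursor()
    cur.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    return [row[0] for row in cur.fetchall()
            if not row[0].startswith("_") and not row[0].endswith("_DATA")]


def dump_table(conn, table, out_dir):
    """Write one table, header row first, to <out_dir>/<table>.csv."""
    cur = conn.cursor()
    cur.execute("PRAGMA table_info('{}')".format(table))
    columns = [row[1] for row in cur.fetchall()]  # column 1 is the name
    cur.execute('SELECT * FROM "{}"'.format(table))
    csv_path = os.path.join(out_dir, table + ".csv")
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(columns)
        writer.writerows(cur.fetchall())
    return csv_path


# Hypothetical stand-in for an IEF result database (sample data only)
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "Chrome Cache Records" (URL TEXT, Size INTEGER)')
conn.execute("CREATE TABLE _internal (x INTEGER)")
conn.execute(
    "INSERT INTO \"Chrome Cache Records\" VALUES ('https://example.com/a', 10)")

out_dir = tempfile.mkdtemp()
tables = list_result_tables(conn)
paths = [dump_table(conn, name, out_dir) for name in tables]
print(tables, paths)
```

Splitting discovery from dumping keeps the filter on table names in one place, so the same helper can be reused when only a single table, such as the Chrome cache table, is of interest.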

First, import the necessary libraries for Python as follows −

    from __future__ import print_function
    import argparse
    import csv
    import os
    import sqlite3
    import sys
    import json

Now, provide the path to the IEF database file along with the path of the output CSV file; the command-line handler accepts two positional arguments, much as in the last script −

    if __name__ == "__main__":
        parser = argparse.ArgumentParser("IEF to CSV")
        parser.add_argument("IEF_DATABASE", help="Input IEF database")
        parser.add_argument("OUTPUT_CSV", help="Output CSV file")
        args = parser.parse_args()

Now, create the output directory if needed and confirm the existence of the IEF database as follows −

        directory = os.path.dirname(args.OUTPUT_CSV)
        # dirname is empty when only a file name was given
        if directory and not os.path.exists(directory):
            os.makedirs(directory)
        if os.path.exists(args.IEF_DATABASE) and \
                os.path.isfile(args.IEF_DATABASE):
            main(args.IEF_DATABASE, args.OUTPUT_CSV)
        else:
            print("Supplied input file {} does not exist or is not a "
                  "file".format(args.IEF_DATABASE))
            sys.exit(1)

Now, make the connection to the SQLite database as follows, so that queries can be executed through a cursor −

    def main(database, out_csv):
        print("[+] Connecting to SQLite database")
        conn = sqlite3.connect(database)
        c = conn.cursor()

You can use the following lines of code to fetch the instances of the Yahoo Mail contact cache record −

        print("Querying IEF database for Yahoo Contact Fragments from "
              "the Chrome Cache Records Table")
        try:
            c.execute(
                "select * from 'Chrome Cache Records' where URL like "
                "'https://data.mail.yahoo.com"
                "/classicab/v2/contacts/?format=json%'")
        except sqlite3.OperationalError:
            print("Received an error querying the database -- database may be "
                  "corrupt or not have a Chrome Cache Records table")
            sys.exit(2)

Now, the list of tuples returned from the above query is saved into a variable as follows −

        contact_cache = c.fetchall()
        contact_data = process_contacts(contact_cache)
        write_csv(contact_data, out_csv)

Note that here we will use two methods namely process_contacts() for setting up the result
Investigation Of Log Based Artifacts

So far, we have seen how to obtain artifacts in Windows using Python. In this chapter, let us learn about the investigation of log-based artifacts using Python.

Introduction

Log-based artifacts are a treasure trove of information that can be very useful to a digital forensic expert. Although various monitoring tools collect this information, the main difficulty in extracting something useful from logs is the sheer volume of data that has to be parsed.

Various Log-based Artifacts and Investigating in Python

In this section, let us discuss various log-based artifacts and their investigation in Python −

Timestamps

A timestamp conveys the date and time of the activity recorded in the log. It is one of the important elements of any log file. Note that these date and time values can come in various formats.

The Python script shown below takes a raw date-time value as input and provides a formatted timestamp as its output.

For this script, we need to follow these steps −

First, set up the arguments that take the raw date value along with the source of the date and the data type.
Then, provide a class that offers a common interface for data across different date formats.

Python Code

Let us see how to use Python code for this purpose −

First, import the following Python modules −

    from __future__ import print_function
    from argparse import ArgumentParser, ArgumentDefaultsHelpFormatter
    from datetime import datetime as dt
    from datetime import timedelta
    import sys

Now, as usual, we need to provide the arguments for the command-line handler.
Here it will accept three arguments: the first is the date value to be processed, the second is the source of that date value and the third is its type. In the finished script this block belongs at the bottom of the file, after the ParseDate class, because it calls ParseDate.get_supported_formats() while building the parser −

    if __name__ == "__main__":
        parser = ArgumentParser("Timestamp Log-based artifact")
        parser.add_argument("date_value", help="Raw date value to parse")
        parser.add_argument(
            "source", help="Source format of date",
            choices=ParseDate.get_supported_formats())
        parser.add_argument(
            "type", help="Data type of input value",
            choices=("number", "hex"))
        args = parser.parse_args()
        date_parser = ParseDate(args.date_value, args.source, args.type)
        date_parser.run()
        print(date_parser.timestamp)

Now, we need to define a class which will accept the arguments for the date value, the date source and the value type −

    class ParseDate(object):
        def __init__(self, date_value, source, data_type):
            self.date_value = date_value
            self.source = source
            self.data_type = data_type
            self.timestamp = None

Now we will define a method that acts as a controller, just like the main() function in earlier scripts −

        def run(self):
            if self.source == "unix-epoch":
                self.parse_unix_epoch()
            elif self.source == "unix-epoch-ms":
                self.parse_unix_epoch(True)
            elif self.source == "windows-filetime":
                self.parse_windows_filetime()

        @classmethod
        def get_supported_formats(cls):
            return ["unix-epoch", "unix-epoch-ms", "windows-filetime"]

Now, we need to define two methods which will process Unix epoch time and FILETIME respectively −

        def parse_unix_epoch(self, milliseconds=False):
            if self.data_type == "hex":
                conv_value = int(self.date_value, 16)  # parse as base 16
                if milliseconds:
                    conv_value = conv_value / 1000.0
            elif self.data_type == "number":
                conv_value = float(self.date_value)
                if milliseconds:
                    conv_value = conv_value / 1000.0
            else:
                print("Unsupported data type '{}' provided".format(
                    self.data_type))
                sys.exit(1)
            ts = dt.fromtimestamp(conv_value)
            self.timestamp = ts.strftime("%Y-%m-%d %H:%M:%S.%f")

        def parse_windows_filetime(self):
            # FILETIME counts 100-nanosecond intervals since 1601-01-01
            if self.data_type == "hex":
                microseconds = int(self.date_value, 16) / 10.0
            elif self.data_type == "number":
                microseconds = float(self.date_value) / 10
            else:
                print("Unsupported data type '{}' provided".format(
                    self.data_type))
                sys.exit(1)
            ts = dt(1601, 1, 1) + timedelta(microseconds=microseconds)
            self.timestamp = ts.strftime("%Y-%m-%d %H:%M:%S.%f")

After running the above script with a timestamp as input, we get the converted value in an easy-to-read format.

Web Server Logs

From the point of view of a digital forensic expert, web server logs are another important artifact, because they provide useful user statistics along with information about users and their geographical locations. The following Python script processes web server logs and creates a spreadsheet from them for easy analysis.

First of all, we need to import the following Python modules −

    from __future__ import print_function
    from argparse import ArgumentParser, FileType
    import re
    import shlex
    import logging
    import sys
    import csv

    logger = logging.getLogger(__file__)

Now, we need to define the patterns that will be parsed from the logs −

    iis_log_format = [
        ("date", re.compile(r"\d{4}-\d{2}-\d{2}")),
        ("time", re.compile(r"\d\d:\d\d:\d\d")),
        ("s-ip", re.compile(
            r"((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.|$)){4}")),
        ("cs-method", re.compile(
            r"(GET)|(POST)|(PUT)|(DELETE)|(OPTIONS)|(HEAD)|(CONNECT)")),
        ("cs-uri-stem", re.compile(r"([A-Za-z0-9/\.-]*)")),
        ("cs-uri-query", re.compile(r"([A-Za-z0-9/\.-]*)")),
        ("s-port", re.compile(r"\d*")),
        ("cs-username", re.compile(r"([A-Za-z0-9/\.-]*)")),
        ("c-ip", re.compile(
            r"((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.|$)){4}")),
        ("cs(User-Agent)", re.compile(r".*")),
        ("sc-status", re.compile(r"\d*")),
        ("sc-substatus", re.compile(r"\d*")),
        ("sc-win32-status", re.compile(r"\d*")),
        ("time-taken", re.compile(r"\d*"))]

Now, provide the arguments for the command-line handler. Here it will accept two positional arguments: the first is the IIS log to be processed and the second is the desired CSV file path.

    if __name__ == "__main__":
        parser = ArgumentParser("Parsing Server Based Logs")
        parser.add_argument("iis_log", help="Path to IIS Log",
                            type=FileType("r"))
        parser.add_argument("csv_report", help="Path to CSV report")
        # "--log" gives the option the args.log name used below
        parser.add_argument("-l", "--log", help="Path to processing log",
                            default=__name__ + ".log")
        args = parser.parse_args()
        logger.setLevel(logging.DEBUG)
        msg_fmt = logging.Formatter(
            "%(asctime)-15s %(funcName)-10s "
            "%(levelname)-8s %(message)s")
        strhndl = logging.StreamHandler(sys.stdout)
        strhndl.setFormatter(fmt=msg_fmt)
        fhndl = logging.FileHandler(args.log, mode="a")
        fhndl.setFormatter(fmt=msg_fmt)
        logger.addHandler(strhndl)
        logger.addHandler(fhndl)
        logger.info("Starting IIS Parsing ")
        logger.debug("Supplied arguments: {}".format(", ".join(sys.argv[1:])))
        logger.debug("System " + sys.platform)
        logger.debug("Version " + sys.version)
        main(args.iis_log, args.csv_report, logger)
        logger.info("IIS Parsing Complete")

Now we need to define the main() function that will process the log in bulk −

    def main(iis_log, report_file, logger):
        parsed_logs = []
        for raw_line in iis_log:
            line = raw_line.strip()
            log_entry = {}
            # Skip comment lines and blank lines
            if line.startswith("#") or len(line) == 0:
                continue
            # Quoted fields may contain spaces, so split them shell-style
            if '"' in line:
                line_iter = shlex.split(line)
            else:
                line_iter = line.split(" ")
            for count, split_entry in enumerate(line_iter):
                col_name, col_pattern = iis_log_format[count]
                if col_pattern.match(split_entry):
                    log_entry[col_name] = split_entry
                else:
                    logger.error("Unknown column pattern discovered. "
                                 "Line preserved in full below")
                    logger.error("Unparsed Line: {}".format(line))
            parsed_logs.append(log_entry)
        logger.info("Parsed {} lines".format(len(parsed_logs)))
        cols = [x[0] for x in iis_log_format]
        logger.info("Creating report file: {}".format(report_file))
        write_csv(report_file, cols, parsed_logs)
        logger.info("Report created")

Lastly, we need to define a function that will write the output to a spreadsheet −

    def write_csv(outfile, fieldnames, data):
        with open(outfile, "w", newline="") as open_outfile:
            csvfile = csv.DictWriter(open_outfile, fieldnames)
            csvfile.writeheader()
            csvfile.writerows(data)

After running the above script, we will get the web server logs in a spreadsheet.

Scanning Important Files using YARA

YARA (Yet Another Recursive Acronym) is a pattern-matching utility designed for malware identification and incident response. In the following Python script, we will use YARA to scan files. The Python bindings for YARA can be installed with the following command −

    pip install yara-python

We can
Python Digital Forensics – Getting Started

In the previous chapter, we learnt the basics of digital forensics, its advantages and its limitations. This chapter will make you comfortable with Python, the essential tool that we use throughout this digital forensics investigation.

Why Python for Digital Forensics?

Python is a popular programming language and is used as a tool for cyber security, penetration testing and digital forensic investigations. When you choose Python as your tool for digital forensics, many tasks can be completed with the language and its libraries alone, without separate forensic software.

Some of the features of the Python programming language that make it a good fit for digital forensics projects are given below −

Simplicity of Syntax − Python’s syntax is simple compared to other languages, which makes it easier to learn and to put to use for digital forensics.
Comprehensive inbuilt modules − Python’s comprehensive inbuilt modules are an excellent aid for performing a complete digital forensic investigation.
Help and Support − Being an open source programming language, Python enjoys excellent support from its community of developers and users.

Features of Python

Python, being a high-level, interpreted, interactive and object-oriented scripting language, provides the following features −

Easy to Learn − Python is a developer-friendly and easy-to-learn language, because it has few keywords and a simple structure.
Expressive and Easy to read − Python is expressive in nature; hence its code is more understandable and readable.
Cross-platform Compatible − Python is a cross-platform language, which means it can run on various platforms such as UNIX, Windows and Macintosh.
Interactive Mode Programming − We can test and debug code interactively, because Python supports an interactive mode for programming.
Provides Various Modules and Functions − Python has a large standard library, which gives our scripts a rich set of modules and functions to use.
Supports Dynamic Type Checking − Python supports dynamic type checking and provides very high-level dynamic data types.
GUI Programming − Python supports GUI programming for developing graphical user interfaces.
Integration with other programming languages − Python can be easily integrated with other programming languages such as C, C++ and Java.

Installing Python

Python distributions are available for various platforms such as Windows, UNIX, Linux and Mac. We only need to download the binary for our platform. If no binary is available for a platform, we need a C compiler so that the source code can be compiled manually.

This section covers the installation of Python on various platforms −

Python Installation on Unix and Linux

You can follow the steps shown below to install Python on a Unix/Linux machine.

Step 1 − Open a web browser and go to www.python.org/downloads/
Step 2 − Download the zipped source code available for Unix/Linux.
Step 3 − Extract the downloaded zipped files.
Step 4 − If you wish to customize some options, you can edit the Modules/Setup file.
Step 5 − Use the following commands to complete the installation −

    ./configure
    make
    make install

Once you have successfully completed the steps given above, Python will be installed at its standard location /usr/local/bin and its libraries at /usr/local/lib/pythonXX, where XX is the version of Python.

Python Installation on Windows

We can follow these simple steps to install Python on a Windows machine.

Step 1 − Open a web browser and go to www.python.org/downloads/
Step 2 − Download the Windows installer python-XYZ.msi file, where XYZ is the version we need to install.
Step 3 − Save the installer file to your local machine.
Step 4 − Run the downloaded file, which will bring up the Python installation wizard.

Python Installation on Macintosh

For installing Python 3 on Mac OS X, we can use a package installer named Homebrew. You can use the following command to install Homebrew, in case you do not have it on your system −

$ ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

If you need to update the package manager, it can be done with the following command −

$ brew update

Now, use the following command to install Python 3 on your system −

$ brew install python3

Setting the PATH

We need to set the path for the Python installation, and the way to do this differs between platforms such as UNIX, Windows, or Mac.

Path Setting at Unix/Linux

You can use the following options to set the path on Unix/Linux −

If using csh shell − Type setenv PATH "$PATH:/usr/local/bin/python" and then press Enter.
If using bash shell (Linux) − Type export PATH="$PATH:/usr/local/bin/python" and then press Enter.
If using sh or ksh shell − Type PATH="$PATH:/usr/local/bin/python" and then press Enter.

Path Setting at Windows

Type path %path%;C:\Python at the command prompt and then press Enter.

Running Python

You can choose any of the following three methods to start the Python interpreter −

Method 1: Using Interactive Interpreter

A system that provides a command-line interpreter or shell, such as Unix or DOS, can easily be used for starting Python. You can follow the steps given below to start coding in the interactive interpreter −

Step 1 − Enter python at the command line.
Step 2 − Start coding right away in the interactive interpreter using the commands shown below −

$python # Unix/Linux
or
python% # Unix/Linux
or
C:> python # Windows/DOS

Method 2: Using Script from the Command-line

We can also execute a Python script at the command line by invoking the interpreter on our application.
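As a minimal sketch of a script to run this way, the following file could be saved as script.py. It uses only standard-library modules: it reports which interpreter is running and prints a SHA-256 digest for each file path passed as an argument. The function name sha256_of_file and the choice of hashing a file are illustrative assumptions, not part of the original tutorial.

```python
import hashlib
import platform
import sys


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of the file at `path` (hypothetical helper)."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so that large files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def main(argv):
    # Report which interpreter is running the script.
    print("Python", platform.python_version(), "on", platform.system())
    # Treat every remaining argument as a file path to hash.
    for path in argv[1:]:
        try:
            print(sha256_of_file(path), path)
        except OSError as error:
            print(f"cannot read {path}: {error}", file=sys.stderr)


if __name__ == "__main__":
    main(sys.argv)
```

Run without arguments, the script only prints the interpreter version and platform; each file path given after the script name adds one line with that file's digest.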
You can use the commands shown below −

$python script.py # Unix/Linux
or
python% script.py # Unix/Linux
or
C:> python script.py # Windows/DOS

Method 3: Integrated Development Environment

If a system has a GUI application that supports Python, then Python can be run from that GUI environment. Some of the IDEs for various platforms are given below −

Unix IDE − UNIX has the IDLE IDE for Python.
Windows IDE − Windows has PythonWin, the first Windows interface for Python, along with a GUI.