Splunk – Environment

In this tutorial, we will install the enterprise version. This version is available for a free 60-day evaluation with all features enabled. You can download the setup from the link below, which is available for both Windows and Linux platforms.

https://www.splunk.com/en_us/download/splunk-enterprise.html

Linux Version

The Linux version is downloaded from the download link given above. We choose the .deb package type as the installation will be done on an Ubuntu platform. We shall learn this with a step-by-step approach (a sample command sequence is shown at the end of this chapter) −

Step 1 − Download the .deb package as shown in the screenshot below.

Step 2 − Go to the download directory and install Splunk using the downloaded package.

Step 3 − Next, you can start Splunk by using the following command with the accept-license argument. It will ask for an administrator user name and password, which you should provide and remember.

Step 4 − The Splunk server starts and mentions the URL where the Splunk interface can be accessed.

Step 5 − Now, you can access the Splunk URL and enter the admin user ID and password created in Step 3.

Windows Version

The Windows version is available as an MSI installer as shown in the below image −

Double-clicking the MSI installer installs the Windows version in a straightforward process. The two important steps where we must make the right choice for a successful installation are Step 1 and Step 2 below.

Step 1 − As we are installing it on a local system, choose the local system option as given below −

Step 2 − Enter the password for the administrator and remember it, as it will be used in future configurations.

Step 3 − In the final step, we see that Splunk is successfully installed and it can be launched from the web browser.

Step 4 − Next, open the browser, enter the given URL, http://localhost:8000, and log in to Splunk using the admin user ID and password.
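For reference, a typical command sequence for the Linux steps above is sketched below. The package file name and version are placeholders, and the default install path /opt/splunk is assumed; substitute the names from your own download.

Step 2 − sudo dpkg -i splunk-9.x.x-linux-amd64.deb
Step 3 − sudo /opt/splunk/bin/splunk start --accept-license

After Step 3 completes, Splunk prints the web interface URL, which by default uses port 8000, i.e. http://localhost:8000.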

Splunk – Pivot & Datasets

Splunk can ingest different types of data sources and build tables which are similar to relational tables. These are called table datasets or just tables. They provide easy ways to analyse and filter the data, perform lookups, etc. These table datasets are also used in creating the pivot analysis which we learn in this chapter.

Creating a Dataset

We use a Splunk add-on named Splunk Datasets Add-on to create and manage the datasets. It can be downloaded from the Splunk website, https://splunkbase.splunk.com/app/3245/#/details. It has to be installed by following the instructions given in the details tab of this link. On successful installation, we see a button named Create New Table Dataset.

Selecting a Dataset

Next, we click on the Create New Table Dataset button and it gives us the option to choose from the below three options.

Indexes and Source Types − Choose from an existing index or source type which is already added to Splunk through the Add Data app.

Existing Datasets − You might have already created a dataset previously which you want to modify by creating a new dataset from it.

Search − Write a search query and the result can be used to create a new dataset.

In our example, we choose an index to be the source of our dataset as shown in the image below −

Choosing Dataset Fields

On clicking OK in the above screen, we are presented with an option to choose the various fields we want to finally get into the table dataset. The _time field is selected by default and this field cannot be dropped. We choose the fields: bytes, categoryId, clientIP and files.

On clicking Done in the above screen, we get the final dataset table with all the selected fields, as seen below. Here the dataset has become similar to a relational table. We save the dataset with the Save As option available in the top right corner.

Creating Pivot

We use the above dataset to create a pivot report. The pivot report reflects the aggregation of values of one column with respect to the values in another column. In other words, one column's values are made into columns and another column's values are made into rows.

Choose Dataset Action

To achieve this, we first select the dataset using the Datasets tab and then choose the option Visualize with Pivot from the Actions column for that dataset.

Choose the Pivot Fields

Next, we choose the appropriate fields for creating the pivot table. We choose categoryId in the Split Columns option as this is the field whose values should appear as different columns in the report. Then we choose file in the Split Rows option as this is the field whose values should be presented in rows. The result shows the count of each categoryId value for each value in the file field.

Next, we can save the pivot table as a report or as a panel in an existing dashboard for future reference.
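For comparison, the same row/column aggregation can also be written directly as an SPL search using the chart command. This is only an illustrative sketch; the index name web_application and the field names file and categoryId are assumptions based on the example above.

index=web_application
| chart count over file by categoryId

This produces one row per file value and one column per categoryId value, with event counts in the cells, which is the same shape as the pivot report described above.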

Splunk – Source Types

All the incoming data to Splunk is first examined by its built-in data processing unit and classified into certain data types and categories. For example, if it is a log from an Apache web server, Splunk is able to recognize that and create appropriate fields out of the data read. This feature in Splunk is called source type detection and it uses built-in source types, known as “pretrained” source types, to achieve this. This makes analysis easier as the user does not have to manually classify the data or assign data types to the fields of the incoming data.

Supported Source Types

The supported source types in Splunk can be seen by uploading a file through the Add Data feature and then selecting the dropdown for Source Type. In the below image, we have uploaded a CSV file and then checked all the available options.

Source Type Sub-Category

Even within those categories, we can further click to see all the sub-categories that are supported. So when you choose the database category, you can find the different types of databases and their supported files which Splunk can recognize.

Pre-Trained Source Types

The below table lists some of the important pre-trained source types Splunk recognizes −

Source Type Name − Nature
access_combined − NCSA combined format HTTP web server logs (can be generated by Apache or other web servers)
access_combined_wcookie − NCSA combined format HTTP web server logs (can be generated by Apache or other web servers), with a cookie field added at the end
apache_error − Standard Apache web server error log
linux_messages_syslog − Standard Linux syslog (/var/log/messages on most platforms)
log4j − Log4j standard output produced by any J2EE server using log4j
mysqld_error − Standard MySQL error log
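As an illustration of how a pretrained source type is applied outside the Add Data wizard, a monitored input can be assigned one explicitly in inputs.conf. The file path below is an example for this sketch, not a value from this tutorial.

[monitor:///var/log/apache2/access.log]
sourcetype = access_combined

Once indexed this way, the events can be retrieved simply by searching on the field, for example sourcetype=access_combined.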

QlikView – Inline Data

Data can be entered into a QlikView document by directly typing or pasting it. This feature is a quick method to get data from the clipboard into QlikView. The script editor provides this feature under the Insert tab.

Script Editor

To open the inline data load option, we open the script editor and go to Insert → Load Statement → Load Inline.

Inserting Data

On opening the above screen, we get a spreadsheet-like document where we can type the values. We can also paste values already available in the clipboard. Please note that the column headers are created automatically. Click Finish.

Load Script

The command which loads the data is created in the background and can be seen in the script editor.

Table Box Data

On creating a Table Box sheet object, we see the data that is read through the inline data load option.
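The generated load statement typically has the form sketched below. The column names and the sample values here are placeholders for illustration; the actual names depend on the headers created in the wizard.

LOAD * INLINE [
    F1, F2
    1, apple
    2, orange
];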

Splunk – Search Macros

Search macros are reusable blocks of Search Processing Language (SPL) that you can insert into other searches. They are used when you want to apply the same search logic to different parts or values in the data set dynamically. They can take arguments dynamically, and the search result is updated as per the new values.

Macro Creation

To create a search macro, we go to Settings → Advanced search → Search macros → Add new. This brings up the below screen where we start creating the macro.

Macro Scenario

We want to show various stats about the file size from the web_applications log. The stats are the max, min and avg values of the file size, using the bytes field in the log. The result should display these stats for each file listed in the log. So here the type of stat is dynamic in nature: the name of the stats function will be passed as an argument to the macro.

Defining the Macro

Next, we define the macro by setting various properties as shown in the below screen. The name of the macro contains (1), indicating that there is one argument to be passed to the macro when it is used in a search string. fun is the argument which will be passed on to the macro during execution in the search query.

Using the Macro

To use the macro, we make it a part of the search string. On passing different values for the argument, we see different results as expected.

Consider finding the average size in bytes of the files. We pass avg as the argument and get the result as shown below. The macro is enclosed in backtick (`) characters as part of the search query.

Similarly, if we want the maximum file size for each of the files present in the log, then we use max as the argument. The result is as shown below.
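A minimal sketch of such a macro and its use could look like the following. The macro name filesize_stats(1), the index web_application and the field names bytes and file are assumptions chosen to match the scenario above, not the exact values from the screenshots.

Macro name: filesize_stats(1)
Argument: fun
Definition: stats $fun$(bytes) by file

Search usage, passing avg (or min, max) as the argument:
index=web_application | `filesize_stats(avg)`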

Circular Permutation

Circular permutation is the total number of ways in which n distinct objects can be arranged around a fixed circle. It is of two types.

Case 1 − Clockwise and anticlockwise orders are different.

Case 2 − Clockwise and anticlockwise orders are the same.

Case 1 − Formula

${P_n = (n-1)!}$

Where −

${P_n}$ = circular permutation

${n}$ = number of objects

Case 2 − Formula

${P_n = \frac{(n-1)!}{2!}}$

Where −

${P_n}$ = circular permutation

${n}$ = number of objects

Example

Problem Statement −

Calculate the circular permutation of 4 persons sitting around a round table considering (i) clockwise and anticlockwise orders as different and (ii) clockwise and anticlockwise orders as the same.

Solution −

In Case 1, n = 4. Using the formula ${P_n = (n-1)!}$ and applying the values −

${P_4 = (4-1)! = 3! = 6}$

In Case 2, n = 4. Using the formula ${P_n = \frac{(n-1)!}{2!}}$ and applying the values −

${P_4 = \frac{(4-1)!}{2!} = \frac{3!}{2!} = \frac{6}{2} = 3}$

Splunk – Data Ingestion

Data ingestion in Splunk happens through the Add Data feature which is part of the Search and Reporting app. After logging in, the Splunk interface home screen shows the Add Data icon as shown below.

On clicking this button, we are presented with the screen to select the source and format of the data we plan to push to Splunk for analysis.

Gathering the Data

We can get the data for analysis from the official website of Splunk. Save this file and unzip it in your local drive. On opening the folder, you can find three files which have different formats. They are the log data generated by some web apps. We can also gather another set of data provided by Splunk, which is available from the official Splunk webpage. We will use data from both these sets for understanding the working of various features of Splunk.

Uploading Data

Next, we choose the file secure.log from the folder mailsv which we have kept in our local system as mentioned in the previous paragraph. After selecting the file, we move to the next step using the green coloured Next button in the top right corner.

Selecting Source Type

Splunk has an in-built feature to detect the type of data being ingested. It also gives the user an option to choose a data type different from the one chosen by Splunk. On clicking the source type dropdown, we can see the various data types that Splunk can ingest and enable for searching. In the current example given below, we choose the default source type.

Input Settings

In this step of data ingestion, we configure the host name from which the data is being ingested. Following are the options to choose from for the host name −

Constant value − The complete host name where the source data resides.

regex on path − Used when you want to extract the host name with a regular expression. Enter the regex for the host you want to extract in the Regular expression field.

segment in path − Used when you want to extract the host name from a segment in your data source's path; enter the segment number in the Segment number field. For example, if the path to the source is /var/log/ and you want the third segment (the host server name) to be the host value, enter "3".

Next, we choose the index type to be created on the input data for searching. We choose the default index strategy. The summary index only creates a summary of the data through aggregation and creates an index on it, while the history index is for storing the search history. It is clearly depicted in the image below −

Review Settings

After clicking on the Next button, we see a summary of the settings we have chosen. We review it and choose Next to finish the uploading of data. On finishing the load, the below screen appears which shows the successful data ingestion and further possible actions we can take on the data.
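As a side note, the same kind of one-time file upload can also be done from the command line with the add oneshot command. The file path, source type, and index name below are placeholders for illustration and should be replaced with your own values.

/opt/splunk/bin/splunk add oneshot /tmp/mailsv/secure.log -sourcetype linux_secure -index main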

Black-Scholes model

The Black-Scholes model is a mathematical model for the price variation over time of financial instruments such as stocks, and it can be used to compute the price of a European call option. This model assumes that the price of heavily traded assets follows a geometric Brownian motion with constant drift and volatility. In the case of a stock option, the Black-Scholes model incorporates the constant price variation of the underlying stock, the time value of money, the strike price of the option and its time to expiry.

The Black-Scholes model was developed in 1973 by Fischer Black, Myron Scholes and Robert Merton and is still widely used in European financial markets. It provides one of the best ways to determine fair prices of options.

Inputs

The Black-Scholes model requires five inputs.

Strike price of an option
Current stock price
Time to expiry
Risk-free rate
Volatility

Assumptions

The Black-Scholes model assumes the following points.

Stock prices follow a lognormal distribution.
Asset prices cannot be negative.
No transaction cost or tax.
Risk-free interest rate is constant for all maturities.
Short selling of securities with use of proceeds is permitted.
No riskless arbitrage opportunity is present.

Formula

${C = SN(d_1) - Ke^{-rT}N(d_2)}$

${P = Ke^{-rT}N(-d_2) - SN(-d_1)}$

where

${d_1 = \frac{1}{\sigma\sqrt{T}}\left[\ln\left(\frac{S}{K}\right) + \left(r + \frac{\sigma^2}{2}\right)T\right]}$

${d_2 = d_1 - \sigma\sqrt{T}}$

Where −

${C}$ = value of the call option.
${P}$ = value of the put option.
${S}$ = stock price.
${K}$ = strike price.
${r}$ = risk-free interest rate.
${T}$ = time to maturity.
${\sigma}$ = annualized volatility.
${N(\cdot)}$ = cumulative distribution function of the standard normal distribution.

Limitations

The Black-Scholes model has the following limitations.

It is only applicable to European options, as American options can be exercised before their expiry.
A constant dividend and a constant risk-free rate may not be realistic.
Volatility may fluctuate with the level of supply and demand of the option, so assuming it to be constant may not hold.
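As a quick consistency check on the call and put formulas above, subtracting one from the other and using the identity ${N(x) + N(-x) = 1}$ recovers the familiar put-call parity relation:

${C - P = S\left[N(d_1) + N(-d_1)\right] - Ke^{-rT}\left[N(d_2) + N(-d_2)\right] = S - Ke^{-rT}}$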

Splunk – Field Searching

When Splunk reads the uploaded machine data, it interprets the data and divides it into many fields, each of which represents a single logical fact about the data record. For example, a single record of information may contain the server name, the timestamp of the event, the type of event being logged, whether a login attempt or an HTTP response, and so on. Even in the case of unstructured data, Splunk tries to divide the fields into key-value pairs or separate them based on the data types they have, numeric, string, etc.

Continuing with the data uploaded in the previous chapter, we can see the fields from the secure.log file by clicking on the show fields link, which opens up the following screen. We can notice the fields Splunk has generated from this log file.

Choosing the Fields

We can choose which fields to display by selecting or unselecting the fields from the list of all fields. Clicking on all fields opens a window showing the list of all the fields. Some of the fields have check marks against them showing that they are already selected. We can use the check boxes to choose our fields for display. Besides the name of the field, it displays the number of distinct values the field has, its data type and what percentage of events this field is present in.

Field Summary

Very detailed stats for every selected field become available by clicking on the name of the field. It shows all the distinct values for the field, their count and their percentages.

Using Fields in Search

The field names can also be inserted into the search box along with specific values for the search. In the below example, we aim to find all the records for the date 15th Oct for the host named mailsecure_log. We get the result for this specific date.
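Such a field-based search is typed into the search bar as field=value pairs. The sketch below follows the example above: the host value is taken from this tutorial's data, while date_month and date_mday are default datetime fields that Splunk extracts when it parses timestamps, so the exact fields available may vary with your data.

host="mailsecure_log" date_month=october date_mday=15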

Continuous Series Arithmetic Mode

When data is given based on ranges along with their frequencies, it is known as a continuous series. Following is an example of a continuous series −

Items       0-5   5-10   10-20   20-30   30-40
Frequency     2      5       1       3      12

Formula

${M_o = L + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times i}$

Where −

${M_o}$ = mode
${L}$ = lower limit of the modal class
${f_1}$ = frequency of the modal class
${f_0}$ = frequency of the class preceding the modal class
${f_2}$ = frequency of the class succeeding the modal class
${i}$ = class interval.

In case there are two values of the variable which have the same highest frequency, the series is bi-modal and the mode is said to be ill-defined. In such situations the mode is calculated by the following formula −

Mode = 3 Median − 2 Mean

The arithmetic mode can be used to describe qualitative phenomena, e.g. consumer preferences, brand preference, etc. It is preferred as a measure of central tendency when the distribution is not normal, because it is not affected by extreme values.

Example

Problem Statement −

Calculate the arithmetic mode from the following data −

Wages (in Rs.)   No. of workers
0-5              3
5-10             7
10-15            15
15-20            30
20-25            20
25-30            10
30-35            5

Solution −

The modal class is 15-20, since it has the highest frequency (30). Using the formula

${M_o = L + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times i}$

${L}$ = 15, ${f_1}$ = 30, ${f_0}$ = 15, ${f_2}$ = 20, ${i}$ = 5

Substituting the values, we get

${M_o = 15 + \frac{30 - 15}{2 \times 30 - 15 - 20} \times 5 = 15 + \frac{15}{25} \times 5 = 15 + 3 = 18}$

Thus the arithmetic mode is 18.