Individual Series Arithmetic Mean

When data is given on an individual basis, it forms an individual series. The following is an example of an individual series:

Items: 5, 10, 20, 30, 40, 50, 60, 70

For an individual series, the Arithmetic Mean can be calculated using the following formula.

Formula

$\bar{x} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{N}$

Alternatively, we can write the same formula as follows:

$\bar{x} = \frac{\sum x}{N}$

Where:

$x_1, x_2, x_3, \dots, x_n$ = individual observations of the variable.
$\sum x$ = sum of all observations of the variable.
$N$ = number of observations.

Example

Problem Statement: Calculate the Arithmetic Mean for the following individual data:

Items: 14, 36, 45, 70, 105

Solution: Based on the above formula, the Arithmetic Mean $\bar{x}$ will be:

$\bar{x} = \frac{14 + 36 + 45 + 70 + 105}{5} = \frac{270}{5} = 54$

The Arithmetic Mean of the given numbers is 54.
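
The worked example can be checked with a minimal Python sketch. The function name `arithmetic_mean` is an illustrative choice and does not come from the text; it simply divides the sum of the observations by their count.

```python
def arithmetic_mean(items):
    """Return the sum of the observations divided by their count N."""
    if not items:
        raise ValueError("at least one observation is required")
    return sum(items) / len(items)


# Data from the worked example.
items = [14, 36, 45, 70, 105]
mean = arithmetic_mean(items)
print(mean)  # 54.0
```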

Adjusted R-Squared

R-squared ($R^2$) measures the proportion of the variation in the dependent variable (Y) that is explained by the independent variables (X) in a linear regression model. Adjusted R-squared adjusts this statistic for the number of independent variables in the model.

$R^2$ shows how well terms (data points) fit a curve or line. Adjusted $R^2$ also indicates how well terms fit a curve or line, but adjusts for the number of terms in the model. If you add more and more useless variables to a model, adjusted R-squared will decrease. If you add more useful variables, adjusted R-squared will increase.

$R_{adj}^2$ will always be less than or equal to $R^2$. You only need the adjustment when working with samples; it isn't necessary when you have data from an entire population.

Formula

$R_{adj}^2 = 1 - \left[\frac{(1-R^2)(n-1)}{n-k-1}\right]$

Where:

$n$ = the number of points in your data sample.
$k$ = the number of independent regressors, i.e. the number of variables in your model, excluding the constant.

Example

Problem Statement: A fund that offers higher risk-adjusted returns has a sample R-squared value close to 0.5, with a sample size of 50 and 5 predictors. Find the adjusted R-squared value.

Solution: Sample size = 50, number of predictors = 5, sample R-squared = 0.5. Since 0.5 is already the value of $R^2$, it is substituted directly (without squaring) into the equation:

$R_{adj}^2 = 1 - \left[\frac{(1-0.5)(50-1)}{50-5-1}\right] = 1 - 0.5 \times \frac{49}{44} = 1 - 0.5568 = 0.4432$
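
The formula can be sketched in Python as follows. The function name `adjusted_r_squared` and its parameter names are illustrative assumptions; the argument `r_squared` is the value of $R^2$ itself, so the sample value 0.5 is passed in without squaring.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R-squared for n data points and k regressors (constant excluded)."""
    if n - k - 1 <= 0:
        raise ValueError("n must be greater than k + 1")
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)


# Values from the example: R-squared = 0.5, n = 50, k = 5.
adj = adjusted_r_squared(0.5, n=50, k=5)
print(round(adj, 4))  # 0.4432
```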

Chebyshev's Theorem

The fraction of any set of numbers lying within k standard deviations of the mean of those numbers is at least

$1 - \frac{1}{k^2}$

Where:

$k = \frac{\text{the within number}}{\text{the standard deviation}}$

and $k$ must be greater than 1.

Example

Problem Statement: Use Chebyshev's theorem to find what percent of the values will fall between 123 and 179 for a data set with a mean of 151 and a standard deviation of 14.

Solution: We subtract 151 - 123 and get 28, which tells us that 123 is 28 units below the mean. We subtract 179 - 151 and also get 28, which tells us that 179 is 28 units above the mean. Those two together tell us that the values between 123 and 179 are all within 28 units of the mean. Therefore the "within number" is 28.

We find the number of standard deviations, k, that the "within number" 28 amounts to by dividing it by the standard deviation:

$k = \frac{\text{the within number}}{\text{the standard deviation}} = \frac{28}{14} = 2$

So the values between 123 and 179 are all within 28 units of the mean, which is the same as within k = 2 standard deviations of the mean. Since k > 1, we can use Chebyshev's formula to find the fraction of the data that lie within k = 2 standard deviations of the mean. Substituting k = 2, we have:

$1 - \frac{1}{k^2} = 1 - \frac{1}{2^2} = 1 - \frac{1}{4} = \frac{3}{4}$

So at least $\frac{3}{4}$ of the data lie between 123 and 179. Since $\frac{3}{4} = 75\%$, at least 75% of the data values are between 123 and 179.
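
The steps of the solution can be sketched in Python. The function name `chebyshev_lower_bound` is an illustrative assumption; for an interval that is not symmetric about the mean, this sketch uses the smaller of the two distances as the "within number", which is a conservative choice not discussed in the text.

```python
def chebyshev_lower_bound(lower, upper, mean, std_dev):
    """Minimum fraction of the data guaranteed to lie between lower and upper."""
    # The symmetric band around the mean that fits inside [lower, upper].
    within = min(mean - lower, upper - mean)
    k = within / std_dev
    if k <= 1:
        raise ValueError("Chebyshev's theorem only applies for k > 1")
    return 1 - 1 / k**2


# Values from the example: mean 151, standard deviation 14.
bound = chebyshev_lower_bound(123, 179, mean=151, std_dev=14)
print(bound)  # 0.75
```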

Arithmetic Mode

Arithmetic Mode refers to the most frequently occurring value in the data set. In other words, the modal value has the highest frequency associated with it. It is denoted by the symbol $M_o$ or Mode.

We're going to discuss methods to compute the Arithmetic Mode for three types of series: Individual Data Series, Discrete Data Series, and Continuous Data Series.

Individual Data Series

When data is given on an individual basis. The following is an example of an individual series:

Items: 5, 10, 20, 30, 40, 50, 60, 70

Discrete Data Series

When data is given along with their frequencies. The following is an example of a discrete series:

Items      5   10   20   30   40   50   60   70
Frequency  2    5    1    3   12    0    5    7

Continuous Data Series

When data is given based on ranges along with their frequencies. The following is an example of a continuous series:

Items      0-5   5-10   10-20   20-30   30-40
Frequency    2      5       1       3      12
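
As a minimal sketch of "the value with the highest frequency", the following Python finds the mode of an individual series and of a discrete series. The function names are illustrative assumptions, the individual-series sample is hypothetical, and both functions return a list because several values can share the highest frequency.

```python
from collections import Counter


def modes_of_individual_series(items):
    """Return every value that occurs most often in a list of observations."""
    counts = Counter(items)
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]


def modes_of_discrete_series(items, frequencies):
    """Return every item whose stated frequency is the highest."""
    highest = max(frequencies)
    return [item for item, f in zip(items, frequencies) if f == highest]


# Discrete series from the text.
items = [5, 10, 20, 30, 40, 50, 60, 70]
frequencies = [2, 5, 1, 3, 12, 0, 5, 7]
discrete_modes = modes_of_discrete_series(items, frequencies)
print(discrete_modes)  # [40]

# Hypothetical individual series, chosen so that one value repeats most often.
individual_modes = modes_of_individual_series([5, 10, 10, 20, 20, 20, 30])
print(individual_modes)  # [20]
```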

Chi-squared Distribution

The chi-squared distribution (chi-square or $\chi^2$ distribution) with k degrees of freedom is the distribution of a sum of the squares of k independent standard normal random variables. It is one of the most widely used probability distributions in statistics. It is a special case of the gamma distribution.

The chi-squared distribution is widely used by statisticians for the following purposes:

To estimate a confidence interval for the population standard deviation of a normal distribution using a sample standard deviation.
To check the independence of two criteria of classification of multiple qualitative variables.
To check the relationships between categorical variables.
To study the sample variance where the underlying distribution is normal.
To test deviations of differences between expected and observed frequencies.
To conduct the chi-square test (a goodness of fit test).

Probability density function

The probability density function of the chi-squared distribution is given as:

Formula

$f(x; k) = \begin{cases} \frac{x^{\frac{k}{2} - 1} e^{-\frac{x}{2}}}{2^{\frac{k}{2}} \Gamma(\frac{k}{2})}, & \text{if } x > 0 \\[7pt] 0, & \text{if } x \le 0 \end{cases}$

Where:

$\Gamma(\frac{k}{2})$ = Gamma function, which has closed-form values for integer parameter k.
$x$ = random variable.
$k$ = integer parameter (the degrees of freedom).

Cumulative distribution function

The cumulative distribution function of the chi-squared distribution is given as:

Formula

$F(x; k) = \frac{\gamma(\frac{k}{2}, \frac{x}{2})}{\Gamma(\frac{k}{2})} = P(\frac{k}{2}, \frac{x}{2})$

Where:

$\gamma(s,t)$ = lower incomplete gamma function.
$P(s,t)$ = regularized gamma function.
$x$ = random variable.
$k$ = integer parameter (the degrees of freedom).
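
The density formula can be evaluated directly with Python's standard library, as in the rough sketch below. The names `chi2_pdf` and `chi2_cdf_numeric` are illustrative assumptions. The CDF here is not the incomplete gamma function from the formula; it is a crude midpoint-rule integral of the density, included only as a sanity check and reasonable only for k of 2 or more, where the density stays bounded near zero.

```python
import math


def chi2_pdf(x, k):
    """Chi-squared density with k degrees of freedom, evaluated at x."""
    if k <= 0:
        raise ValueError("k must be positive")
    if x <= 0:
        return 0.0
    half_k = k / 2
    return x ** (half_k - 1) * math.exp(-x / 2) / (2 ** half_k * math.gamma(half_k))


def chi2_cdf_numeric(x, k, steps=10_000):
    """Approximate F(x; k) by integrating the density with the midpoint rule."""
    if x <= 0:
        return 0.0
    width = x / steps
    return sum(chi2_pdf((i + 0.5) * width, k) for i in range(steps)) * width


# For k = 2 the density reduces to exp(-x/2) / 2 and the CDF to 1 - exp(-x/2).
pdf_value = chi2_pdf(2, 2)
cdf_value = chi2_cdf_numeric(2, 2)
print(round(pdf_value, 4))  # 0.1839
print(round(cdf_value, 4))  # 0.6321
```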

F Test Table

The F-test is named after the prominent statistician R.A. Fisher. It is used to test whether two independent estimates of population variance differ significantly, or whether the two samples may be regarded as drawn from normal populations having the same variance. To carry out the test, we calculate the F-statistic, which is defined as:

Formula

$F = \frac{\text{Larger estimate of population variance}}{\text{Smaller estimate of population variance}} = \frac{S_1^2}{S_2^2}$, where $S_1^2 > S_2^2$

Procedure

The testing procedure is as follows:

1. Set up the null hypothesis that the two population variances are equal, i.e. $H_0: \sigma_1^2 = \sigma_2^2$

2. Calculate the variances of the random samples using the formulas: $S_1^2 = \frac{\sum (X_1 - \bar{X}_1)^2}{n_1 - 1}$ and $S_2^2 = \frac{\sum (X_2 - \bar{X}_2)^2}{n_2 - 1}$

3. Compute the variance ratio F as: $F = \frac{S_1^2}{S_2^2}$, where $S_1^2 > S_2^2$

4. Compute the degrees of freedom. The degrees of freedom of the larger estimate of the population variance are denoted by $v_1$ and those of the smaller estimate by $v_2$. That is, $v_1$ = degrees of freedom for the sample having the larger variance = $n_1 - 1$, and $v_2$ = degrees of freedom for the sample having the smaller variance = $n_2 - 1$.

5. From the F-table given at the end of the book, find the value of $F$ for $v_1$ and $v_2$ at the 5% level of significance.

6. Compare the calculated value of $F$ with the table value $F_{0.05}$ for $v_1$ and $v_2$ degrees of freedom. If the calculated value of $F$ exceeds the table value, we reject the null hypothesis and conclude that the difference between the two variances is significant. On the other hand, if the calculated value of $F$ is less than the table value, the null hypothesis is accepted and we conclude that the difference between the two variances is not significant.

The following example illustrates the application of the F-test.
Example

Problem Statement: In a sample of 8 observations, the sum of squared deviations of items from the mean was 94.5. In another sample of 10 observations, the value was found to be 101.7. Test whether the difference is significant at the 5% level. (You are given that at the 5% level of significance, the critical value of $F$ for $v_1$ = 7 and $v_2$ = 9, $F_{0.05}$, is 3.29.)

Solution: Let us take the hypothesis that the difference in the variances of the two samples is not significant, i.e. $H_0: \sigma_1^2 = \sigma_2^2$

We are given the following:

$n_1 = 8$, $\sum (X_1 - \bar{X}_1)^2 = 94.5$, $n_2 = 10$, $\sum (X_2 - \bar{X}_2)^2 = 101.7$

$S_1^2 = \frac{\sum (X_1 - \bar{X}_1)^2}{n_1 - 1} = \frac{94.5}{8 - 1} = \frac{94.5}{7} = 13.5$

$S_2^2 = \frac{\sum (X_2 - \bar{X}_2)^2}{n_2 - 1} = \frac{101.7}{10 - 1} = \frac{101.7}{9} = 11.3$

Applying the F-test:

$F = \frac{S_1^2}{S_2^2} = \frac{13.5}{11.3} = 1.195$

For $v_1$ = 8 - 1 = 7 and $v_2$ = 10 - 1 = 9, $F_{0.05}$ = 3.29. The calculated value of $F$ is less than the table value. Hence, we accept the null hypothesis and conclude that the difference in the variances of the two samples is not significant at the 5% level.
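
The arithmetic of this example can be reproduced with a short Python sketch. The function name `variance_ratio` is an illustrative assumption, and the critical value 3.29 is taken from the problem statement rather than computed; the sketch only forms the two variance estimates, puts the larger one in the numerator, and compares the ratio with the given table value.

```python
def variance_ratio(ss1, n1, ss2, n2):
    """Return (F, v1, v2) with the larger variance estimate in the numerator.

    ss1 and ss2 are the sums of squared deviations from each sample mean.
    """
    var1 = ss1 / (n1 - 1)
    var2 = ss2 / (n2 - 1)
    if var1 >= var2:
        return var1 / var2, n1 - 1, n2 - 1
    return var2 / var1, n2 - 1, n1 - 1


# Values from the example.
f_stat, v1, v2 = variance_ratio(94.5, 8, 101.7, 10)
critical_value = 3.29  # F at the 5% level for (7, 9), as given in the problem
reject_null = f_stat > critical_value
print(round(f_stat, 3), v1, v2, reject_null)  # 1.195 7 9 False
```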

Splunk – Event Types

In Splunk search, we can define our own event types from a dataset based on certain criteria. For example, we can search for only the events which have an HTTP status code of 200. This search can then be saved as an event type with a user-defined name such as status200, and that name can be used as part of future searches.

In short, an event type represents a search that returns a specific type of event or a useful collection of events. Every event that can be returned by the search gets an association with that event type.

Creating Event Type

There are two ways to create an event type after we have decided on the search criteria. One is to run a search and then save it as an Event Type. The other is to add a new Event Type from the Settings tab. We will see both ways in this section.

Using a Search

Consider a search for events that have a successful HTTP status value of 200 and that occurred on a Wednesday. After running the search query, we can choose the Save As option to save the query as an Event Type.

The next screen prompts us to give a name for the Event Type, choose a Tag (which is optional), and then choose a colour with which the events will be highlighted. The priority option decides which event type will be displayed first in case two or more event types match the same event.

Finally, we can see that the Event Type has been created by going to the Settings → Event Types option.

Using New Event Type

The other option is to use the Settings → Event Types option, where we can add a new Event Type. On clicking the New Event Type button, we get a screen in which to add the same query as in the previous section.

Viewing the Event Type

To view the event type we just created, we can write a search query for it in the search box and see the resulting events along with the colour we have chosen for the event type.

Using the Event Type

We can use the Event Type along with other queries. Here we specify some partial criteria from the Event Type, and the result is a mix of coloured and non-coloured events.

Dot Plot

A dot chart or dot plot is a statistical chart consisting of data points plotted on a fairly simple scale, typically using filled-in circles.

Example

Problem Statement: A survey asking "How long does it take you to eat breakfast?" has these results:

Minutes  0  1  2  3  4  5  6  7  8  9  10  11  12
People   6  2  3  5  2  5  0  0  2  3   7   4   1

Draw the dot plot for Minutes to Eat Breakfast.

Solution: 6 people take 0 minutes to eat breakfast (they most likely had no breakfast), 2 people say they spend only 1 minute eating, and so on.

[Figure: dot plot of Minutes to Eat Breakfast]
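
One rough way to draw such a plot without a charting library is to print it as text, one column per value and one "o" per person. The function name `dot_plot_rows` and the fixed three-character column width are illustrative choices, not part of the original example.

```python
def dot_plot_rows(values, counts):
    """Build the text rows of a dot plot, tallest stack first, axis labels last."""
    rows = []
    for level in range(max(counts), 0, -1):
        cells = []
        for count in counts:
            mark = "o" if count >= level else " "
            cells.append(f"{mark:>3}")
        rows.append("".join(cells))
    # Axis row: the values themselves, aligned under their stacks of dots.
    rows.append("".join(f"{value:>3}" for value in values))
    return rows


# Survey data from the example.
minutes = list(range(13))
people = [6, 2, 3, 5, 2, 5, 0, 0, 2, 3, 7, 4, 1]
rows = dot_plot_rows(minutes, people)
print("\n".join(rows))
```

Each stack of dots sits above its number of minutes, so the tallest column (7 people) appears over 10 and the empty columns appear over 6 and 7.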

Binomial Distribution

The binomial distribution is a discrete probability distribution. It was discovered by the Swiss mathematician James Bernoulli. It is used in situations where an experiment results in one of two possibilities, success or failure, and it expresses the probability of one set of two alternatives: success (p) and failure (q). The binomial distribution is defined by the following probability function:

Formula

$P(X = x) = {}^{n}C_x \, q^{n-x} \, p^x$

Where:

$p$ = probability of success.
$q$ = probability of failure = $1 - p$.
$n$ = number of trials.
$P(X = x)$ = probability of x successes in n trials.

Example

Problem Statement: Eight coins are tossed at the same time. Find the probability of getting at least 6 heads.

Solution: Let $p$ = probability of getting a head and $q$ = probability of getting a tail. Here, $p = \frac{1}{2}$, $q = \frac{1}{2}$, $n = 8$, and

$P(X = x) = {}^{n}C_x \, q^{n-x} \, p^x$

$P(\text{at least 6 heads}) = P(6H) + P(7H) + P(8H)$

$= {}^{8}C_6 (\frac{1}{2})^2 (\frac{1}{2})^6 + {}^{8}C_7 (\frac{1}{2})^1 (\frac{1}{2})^7 + {}^{8}C_8 (\frac{1}{2})^8$

$= 28 \times \frac{1}{256} + 8 \times \frac{1}{256} + 1 \times \frac{1}{256}$

$= \frac{37}{256}$
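
The coin example can be verified with a short Python sketch. The function name `binomial_pmf` is an illustrative assumption; `Fraction` is used here only so that the result prints as an exact fraction instead of a rounded decimal, and `math.comb` requires Python 3.8 or later.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n trials with success probability p."""
    q = 1 - p
    return comb(n, x) * q ** (n - x) * p ** x


# Eight fair coins: probability of 6, 7, or 8 heads.
p = Fraction(1, 2)
at_least_6 = sum(binomial_pmf(x, 8, p) for x in range(6, 9))
print(at_least_6)  # 37/256
```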

Continuous Series Arithmetic Mean

When data is given based on ranges along with their frequencies, it forms a continuous series. The following is an example of a continuous series:

Items      0-5   5-10   10-20   20-30   30-40
Frequency    2      5       1       3      12

In the case of a continuous series, a mid point is computed for each range as $\frac{\text{lower limit} + \text{upper limit}}{2}$ and the Arithmetic Mean is computed using the following formula.

Formula

$\bar{x} = \frac{f_1 m_1 + f_2 m_2 + f_3 m_3 + \dots + f_n m_n}{N}$

Where:

$N$ = number of observations (the sum of the frequencies).
$f_1, f_2, f_3, \dots, f_n$ = different values of frequency f.
$m_1, m_2, m_3, \dots, m_n$ = different values of mid points for the ranges.

Example

Problem Statement: Let's calculate the Arithmetic Mean for the following continuous data:

Items      0-10   10-20   20-30   30-40
Frequency     2       5       1       3

Solution: Based on the given data, we have:

Items    Mid-point (m)   Frequency (f)   fm
0-10          5               2           10
10-20        15               5           75
20-30        25               1           25
30-40        35               3          105
                           $N = 11$    $\sum fm = 215$

Based on the above formula, the Arithmetic Mean $\bar{x}$ will be:

$\bar{x} = \frac{215}{11} \approx 19.55$

The Arithmetic Mean of the given numbers is approximately 19.55.
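
The mid-point method can be sketched in Python as follows. The function name `grouped_mean` and the representation of each range as a `(lower, upper)` tuple are illustrative assumptions. Note that 215 / 11 = 19.5454..., which is 19.55 when rounded to two decimal places.

```python
def grouped_mean(classes, frequencies):
    """Arithmetic mean of a continuous series using class mid points."""
    midpoints = [(lower + upper) / 2 for lower, upper in classes]
    n = sum(frequencies)  # N, the total number of observations
    weighted_sum = sum(f * m for f, m in zip(frequencies, midpoints))
    return weighted_sum / n


# Data from the worked example.
classes = [(0, 10), (10, 20), (20, 30), (30, 40)]
frequencies = [2, 5, 1, 3]
mean = grouped_mean(classes, frequencies)
print(round(mean, 2))  # 19.55
```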