Quartile Deviation

The quartile deviation depends on the lower quartile ${Q_1}$ and the upper quartile ${Q_3}$. The difference ${Q_3 - Q_1}$ is called the interquartile range. Half of that difference is called the semi-interquartile range, or the quartile deviation.

Formula

${Q.D. = \frac{Q_3 - Q_1}{2}}$

Coefficient of Quartile Deviation

A relative measure of dispersion based on the quartile deviation is known as the coefficient of quartile deviation. It is defined as

${\text{Coefficient of Quartile Deviation} = \frac{Q_3 - Q_1}{Q_3 + Q_1}}$

Example

Problem Statement: Calculate the quartile deviation and the coefficient of quartile deviation from the data given below:

| Maximum Load (short tons) | Number of Cables |
| --- | --- |
| 9.3-9.7 | 2 |
| 9.8-10.2 | 5 |
| 10.3-10.7 | 12 |
| 10.8-11.2 | 17 |
| 11.3-11.7 | 14 |
| 11.8-12.2 | 6 |
| 12.3-12.7 | 3 |
| 12.8-13.2 | 1 |

Solution:

| Maximum Load (short tons) | Number of Cables (f) | Class Boundaries | Cumulative Frequencies |
| --- | --- | --- | --- |
| 9.3-9.7 | 2 | 9.25-9.75 | 2 |
| 9.8-10.2 | 5 | 9.75-10.25 | 2 + 5 = 7 |
| 10.3-10.7 | 12 | 10.25-10.75 | 7 + 12 = 19 |
| 10.8-11.2 | 17 | 10.75-11.25 | 19 + 17 = 36 |
| 11.3-11.7 | 14 | 11.25-11.75 | 36 + 14 = 50 |
| 11.8-12.2 | 6 | 11.75-12.25 | 50 + 6 = 56 |
| 12.3-12.7 | 3 | 12.25-12.75 | 56 + 3 = 59 |
| 12.8-13.2 | 1 | 12.75-13.25 | 59 + 1 = 60 |

In the formulas below, ${l}$ is the lower boundary of the quartile class, ${h}$ is its width, ${f}$ is its frequency, and ${c}$ is the cumulative frequency of the class preceding it.

${Q_1}$

Value of the ${\left(\frac{n}{4}\right)^{th}}$ item = value of the ${\left(\frac{60}{4}\right)^{th}}$ item = ${15^{th}}$ item. Thus ${Q_1}$ lies in the class 10.25-10.75, where ${l = 10.25}$, ${h = 0.5}$, ${f = 12}$, ${\frac{n}{4} = 15}$ and ${c = 7}$.

${Q_1 = l + \frac{h}{f}\left(\frac{n}{4} - c\right) \\[7pt]
\, = 10.25 + \frac{0.5}{12}(15 - 7) \\[7pt]
\, = 10.25 + 0.33 \\[7pt]
\, = 10.58 }$

${Q_3}$

Value of the ${\left(\frac{3n}{4}\right)^{th}}$ item = value of the ${\left(\frac{3 \times 60}{4}\right)^{th}}$ item = ${45^{th}}$ item. Thus ${Q_3}$ lies in the class 11.25-11.75, where ${l = 11.25}$, ${h = 0.5}$, ${f = 14}$, ${\frac{3n}{4} = 45}$ and ${c = 36}$.

${Q_3 = l + \frac{h}{f}\left(\frac{3n}{4} - c\right) \\[7pt]
\, = 11.25 + \frac{0.5}{14}(45 - 36) \\[7pt]
\, = 11.25 + 0.32 \\[7pt]
\, = 11.57 }$

Quartile Deviation

${Q.D. = \frac{Q_3 - Q_1}{2} \\[7pt]
\, = \frac{11.57 - 10.58}{2} \\[7pt]
\, = \frac{0.99}{2} \\[7pt]
\, = 0.495 }$

Coefficient of Quartile Deviation

${\text{Coefficient of Quartile Deviation} = \frac{Q_3 - Q_1}{Q_3 + Q_1} \\[7pt]
\, = \frac{11.57 - 10.58}{11.57 + 10.58} \\[7pt]
\, = \frac{0.99}{22.15} \\[7pt]
\, = 0.045 }$
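The grouped-data calculation can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `grouped_quantile` and its arguments are hypothetical, and the class frequencies are the ones from the cumulative frequency table (2, 5, 12, 17, 14, 6, 3, 1; total 60).

```python
def grouped_quantile(boundaries, freqs, position):
    """Interpolate the value at a cumulative position inside grouped data.

    boundaries: (lower, upper) class boundaries in ascending order.
    freqs: class frequencies in the same order.
    position: cumulative count to locate, e.g. n/4 for Q1 or 3n/4 for Q3.
    """
    cumulative = 0  # c: cumulative frequency of the classes already passed
    for (lower, upper), f in zip(boundaries, freqs):
        if f > 0 and cumulative + f >= position:
            h = upper - lower  # class width
            return lower + (h / f) * (position - cumulative)
        cumulative += f
    raise ValueError("position exceeds the total frequency")


boundaries = [(9.25, 9.75), (9.75, 10.25), (10.25, 10.75), (10.75, 11.25),
              (11.25, 11.75), (11.75, 12.25), (12.25, 12.75), (12.75, 13.25)]
freqs = [2, 5, 12, 17, 14, 6, 3, 1]
n = sum(freqs)

q1 = grouped_quantile(boundaries, freqs, n / 4)
q3 = grouped_quantile(boundaries, freqs, 3 * n / 4)
quartile_deviation = (q3 - q1) / 2
coefficient = (q3 - q1) / (q3 + q1)

print(round(q1, 2), round(q3, 2))
print(round(quartile_deviation, 3), round(coefficient, 3))
```

Because the code keeps full precision for the quartiles, the quartile deviation comes out at about 0.494 rather than the 0.495 obtained from the rounded values 10.58 and 11.57; the coefficient still rounds to 0.045.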

Discrete Series Arithmetic Mode

A discrete series is data given together with the frequency of each value. The following is an example of a discrete series:

| Items | 5 | 10 | 20 | 30 | 40 | 50 | 60 | 70 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Frequency | 2 | 5 | 1 | 3 | 12 | 0 | 5 | 7 |

In a discrete series, the Arithmetic Mode can be determined by inspection: it is the value of the variable with the highest frequency. However, when there is very little difference between the maximum frequency and the frequency preceding or following it, the grouping table method is used instead.

Example

Problem Statement: Calculate the Arithmetic Mode for the following discrete data:

| Items | 14 | 36 | 45 | 70 | 105 | 145 |
| --- | --- | --- | --- | --- | --- | --- |
| Frequency | 2 | 5 | 1 | 3 | 12 | 0 |

Solution: The Arithmetic Mode of the given numbers is 105, because the highest frequency, 12, is associated with 105.
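Finding the mode by inspection amounts to picking the item with the largest frequency. A minimal Python sketch using the example data (the variable names are illustrative only, and ties are not handled):

```python
items = [14, 36, 45, 70, 105, 145]
frequencies = [2, 5, 1, 3, 12, 0]

# Pair each item with its frequency and keep the pair with the largest frequency.
mode_item, mode_frequency = max(zip(items, frequencies), key=lambda pair: pair[1])

print(mode_item, mode_frequency)
```

If two items shared the highest frequency, `max` would simply return the first one, which is the situation where the grouping table method would be needed.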

Permutation with Replacement

A permutation is one of the possible ways in which a set of things can be ordered or arranged. In a permutation with replacement, an item can be selected again after it has been chosen, so the same item may appear more than once in the ordered list. The number of permutations with replacement is given by the following formula:

Formula

${^nP_r = n^r }$

Where:

${n}$ = number of items which can be selected.
${r}$ = number of items which are selected.
${^nP_r}$ = number of ordered lists of items, or permutations.

Example

Problem Statement: Electronic devices usually require a personal code to operate. A particular device uses a 4-digit code. Calculate how many codes are possible.

Solution: Each code is a permutation with replacement of r = 4 digits from the set of 10 digits {0,1,2,3,4,5,6,7,8,9}.

${^{10}P_4 = (10)^4 \\[7pt]
\, = 10000 }$
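The count ${n^r}$ can be checked by brute force. The sketch below uses Python's standard `itertools.product`, which enumerates ordered selections with repetition; the variable names are illustrative.

```python
from itertools import product

digits = "0123456789"
r = 4

# Every 4-digit code is an ordered selection of 4 digits with repetition allowed.
count = sum(1 for _ in product(digits, repeat=r))

print(count, len(digits) ** r)
```

Enumerating is only practical for small n and r; for larger values the formula ${n^r}$ is the sensible route.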

Gamma Distribution

The gamma distribution is a two-parameter family of continuous probability distributions. Three parameterizations are in common use:

- A shape parameter ${k}$ and a scale parameter ${\theta}$.
- A shape parameter ${\alpha = k}$ and an inverse scale parameter ${\beta = \frac{1}{\theta}}$, called the rate parameter.
- A shape parameter ${k}$ and a mean parameter ${\mu = \frac{k}{\beta}}$.

Each parameter is a positive real number.

The gamma distribution is the maximum entropy probability distribution for a random variable ${X}$ for which both of the following quantities are fixed:

Formula

${E[X] = k\theta = \frac{\alpha}{\beta} > 0 \\[7pt]
E[\ln(X)] = \psi(k) + \ln(\theta) = \psi(\alpha) - \ln(\beta) }$

Where:

${X}$ = random variable.
${\psi}$ = digamma function.

Characterization using shape ${\alpha}$ and rate ${\beta}$

Probability density function

The probability density function of the gamma distribution is given as:

Formula

${ f(x; \alpha, \beta) = \frac{\beta^\alpha x^{\alpha - 1} e^{-\beta x}}{\Gamma(\alpha)} \quad \text{where } x > 0 \text{ and } \alpha, \beta > 0 }$

Where:

${\alpha}$ = shape parameter.
${\beta}$ = rate parameter.
${x}$ = random variable.
${\Gamma(\alpha)}$ = gamma function evaluated at ${\alpha}$.

Cumulative distribution function

The cumulative distribution function of the gamma distribution is given as:

Formula

${ F(x; \alpha, \beta) = \int_0^x f(u; \alpha, \beta)\, du = \frac{\gamma(\alpha, \beta x)}{\Gamma(\alpha)} }$

Where:

${\alpha}$ = shape parameter.
${\beta}$ = rate parameter.
${x}$ = random variable.
${\gamma(\alpha, \beta x)}$ = lower incomplete gamma function.

Characterization using shape ${k}$ and scale ${\theta}$

Probability density function

The probability density function of the gamma distribution is given as:

Formula

${ f(x; k, \theta) = \frac{x^{k - 1} e^{-\frac{x}{\theta}}}{\theta^k \Gamma(k)} \quad \text{where } x > 0 \text{ and } k, \theta > 0 }$

Where:

${k}$ = shape parameter.
${\theta}$ = scale parameter.
${x}$ = random variable.
${\Gamma(k)}$ = gamma function evaluated at ${k}$.

Cumulative distribution function

The cumulative distribution function of the gamma distribution is given as:

Formula

${ F(x; k, \theta) = \int_0^x f(u; k, \theta)\, du = \frac{\gamma(k, \frac{x}{\theta})}{\Gamma(k)} }$

Where:

${k}$ = shape parameter.
${\theta}$ = scale parameter.
${x}$ = random variable.
${\gamma(k, \frac{x}{\theta})}$ = lower incomplete gamma function.
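To see that the two characterizations describe the same distribution, the densities can be evaluated side by side. The sketch below uses only Python's standard `math` module; the function names and the parameter values k = 2, θ = 3 are arbitrary choices for illustration. The cumulative distribution function is approximated by a simple midpoint-rule integration of the density, since the standard library has no incomplete gamma function.

```python
import math


def gamma_pdf_rate(x, alpha, beta):
    """Density with shape alpha and rate beta, for x > 0."""
    return beta**alpha * x**(alpha - 1) * math.exp(-beta * x) / math.gamma(alpha)


def gamma_pdf_scale(x, k, theta):
    """Density with shape k and scale theta, for x > 0."""
    return x**(k - 1) * math.exp(-x / theta) / (theta**k * math.gamma(k))


def gamma_cdf_numeric(x, k, theta, steps=10_000):
    """Approximate F(x) by midpoint-rule integration of the density on (0, x)."""
    width = x / steps
    total = sum(gamma_pdf_scale((i + 0.5) * width, k, theta) for i in range(steps))
    return total * width


k, theta = 2.0, 3.0   # illustrative values, not taken from the text
beta = 1 / theta      # rate is the inverse of scale
x = 4.0

p_scale = gamma_pdf_scale(x, k, theta)
p_rate = gamma_pdf_rate(x, k, beta)
cdf_value = gamma_cdf_numeric(x, k, theta)

print(p_scale, p_rate, cdf_value)
```

For k = 2 the cumulative distribution function has the closed form ${1 - e^{-x/\theta}(1 + x/\theta)}$, which gives a convenient check on the numerical integral.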

Data Patterns

Data patterns are easiest to see when the data are drawn graphically. They are commonly described in terms of features such as center, spread, shape, and unusual properties. Other descriptive labels include symmetric, bell-shaped, and skewed.

Center

Graphically, the center of a distribution is located at the median of the distribution. In such a chart, about half of the observations lie on either side of the center. The height of each column indicates the frequency of observations.

Spread

The spread of a distribution refers to the variation of the data. If the set of observations covers a wide range, the spread is larger. If the observations are clustered around a single value, the spread is smaller.

Shape

The shape of a distribution can be described using the following characteristics.

Symmetry: In a symmetric distribution, the graph can be divided at the center in such a way that each half is a mirror image of the other.

Number of peaks: Distributions can have one or multiple peaks. A distribution with one clear peak is known as unimodal, and a distribution with two clear peaks is called bimodal. A symmetric distribution with a single peak at the center is referred to as bell-shaped.

Skewness: Some distributions have more observations on one side of the graph than on the other. Distributions with fewer observations towards higher values are said to be skewed right, and distributions with fewer observations towards lower values are said to be skewed left.

Uniform: When the set of observations has no peak and the data are spread equally across the range of the distribution, the distribution is called a uniform distribution.

Unusual Features

Common unusual features of data patterns are gaps and outliers.

Gaps: Gaps are areas of a distribution that contain no observations.

[Figure: a distribution with a gap, with no observations in the middle of the distribution.]

Outliers: Distributions may be characterized by extreme values that differ greatly from the other observations. These extreme values are referred to as outliers.

[Figure: a distribution with an outlier.]
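As a rough illustration of center, spread, and gaps, the following Python sketch prints a text histogram of a small made-up data set and reports its median, its range, and the values with no observations. The data are hypothetical and chosen only so that a gap is visible.

```python
from collections import Counter
from statistics import median

observations = [1, 2, 2, 3, 3, 3, 4, 4, 9, 9, 10]  # hypothetical data

counts = Counter(observations)
low, high = min(observations), max(observations)

# One row per integer value; the bar length is the frequency of that value.
for value in range(low, high + 1):
    print(f"{value:>2} | {'#' * counts[value]}")

center = median(observations)   # center of the distribution
spread = high - low             # range, a simple measure of spread
gaps = [v for v in range(low, high + 1) if counts[v] == 0]

print(center, spread, gaps)
```

The empty rows between 4 and 9 are the gap; the long tail towards higher values is what a right-skewed pattern looks like in this kind of chart.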

Permutation

A permutation is an arrangement of all or part of a set of objects, with regard to the order of the arrangement. For example, suppose we have a set of three letters: A, B, and C. We might ask how many ways we can arrange 2 letters from that set. There are 6: AB, BA, AC, CA, BC, and CB. The number of permutations is given by the following formula:

Formula

${^nP_r = \frac{n!}{(n-r)!} }$

Where:

${n}$ = size of the set from which elements are permuted.
${r}$ = size of each permutation.
${n, r}$ are non-negative integers.

Example

Problem Statement: A computer scientist is trying to discover the keyword for a financial account. If the keyword consists only of 10 lower case characters (e.g., 10 characters from among the set: a, b, c… w, x, y, z) and no character can be repeated, how many different unique arrangements of characters exist?

Solution:

Step 1: Determine whether the question pertains to permutations or combinations. Since changing the order of the characters in a potential keyword (e.g., ajk vs. kja) would create a new possibility, this is a permutations problem.

Step 2: Determine n and r.

n = 26, since the computer scientist is choosing from 26 possibilities (e.g., a, b, c… x, y, z).
r = 10, since the computer scientist is choosing 10 characters.

Step 3: Apply the formula.

${^{26}P_{10} = \frac{26!}{(26-10)!} \\[7pt]
\, = \frac{26!}{16!} \\[7pt]
\, = \frac{26(25)(24)…(11)(10)(9)…(1)}{(16)(15)…(1)} \\[7pt]
\, = 26(25)(24)…(17) \\[7pt]
\, = 19275223968000 }$
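The result can be reproduced with Python's standard library. `math.perm` (available from Python 3.8) computes ${^nP_r}$ directly, and `itertools.permutations` lists the arrangements for the small A, B, C case; the variable names are illustrative.

```python
import math
from itertools import permutations

n, r = 26, 10

# Direct use of the formula n! / (n - r)!
by_formula = math.factorial(n) // math.factorial(n - r)
# Library shortcut for the same quantity
by_library = math.perm(n, r)

# The small example: ordered arrangements of 2 letters out of A, B, C
arrangements = ["".join(p) for p in permutations("ABC", 2)]

print(by_formula, by_library)
print(arrangements)
```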

Combination

A combination is a selection of all or part of a set of objects, without regard to the order in which the objects are selected. For example, suppose we have a set of three letters: A, B, and C. We might ask how many ways we can select 2 letters from that set. There are 3: AB, AC, and BC. The number of combinations is given by the following formula:

Formula

${C(n,r) = \frac{n!}{r!(n-r)!}}$

Where:

${n}$ = the number of objects to choose from.
${r}$ = the number of objects selected.

Example

Problem Statement: How many different groups of 10 students can a teacher select from her classroom of 15 students?

Solution:

Step 1: Determine whether the question pertains to permutations or combinations. Since changing the order of the selected students would not create a new group, this is a combinations problem.

Step 2: Determine n and r.

n = 15, since the teacher is choosing from 15 students.
r = 10, since the teacher is selecting 10 students.

Step 3: Apply the formula.

${^{15}C_{10} = \frac{15!}{(15-10)!10!} \\[7pt]
\, = \frac{15!}{5!10!} \\[7pt]
\, = \frac{15(14)(13)(12)(11)(10!)}{5!10!} \\[7pt]
\, = \frac{15(14)(13)(12)(11)}{5!} \\[7pt]
\, = \frac{15(14)(13)(12)(11)}{5(4)(3)(2)(1)} \\[7pt]
\, = \frac{(14)(13)(3)(11)}{(2)(1)} \\[7pt]
\, = (7)(13)(3)(11) \\[7pt]
\, = 3003}$
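A quick check in Python, using the standard `math.comb` (Python 3.8+) and `itertools.combinations`; the variable names are illustrative.

```python
import math
from itertools import combinations

n, r = 15, 10

# Direct use of the formula n! / (r! (n - r)!)
by_formula = math.factorial(n) // (math.factorial(r) * math.factorial(n - r))
# Library shortcut for the same quantity
by_library = math.comb(n, r)

# The small example: unordered selections of 2 letters out of A, B, C
selections = ["".join(c) for c in combinations("ABC", 2)]

print(by_formula, by_library)
print(selections)
```

Choosing 10 students out of 15 is the same as choosing the 5 who are left out, so `math.comb(15, 5)` gives the same count.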

Discrete Series Arithmetic Mean

A discrete series is data given together with the frequency of each value. The following is an example of a discrete series:

| Items | 5 | 10 | 20 | 30 | 40 | 50 | 60 | 70 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Frequency | 2 | 5 | 1 | 3 | 12 | 0 | 5 | 7 |

For a discrete series, the Arithmetic Mean can be calculated using the following formula.

Formula

${\bar{x} = \frac{f_1x_1 + f_2x_2 + f_3x_3 + \dots + f_nx_n}{N}}$

Alternatively, we can write the same formula as follows:

${\bar{x} = \frac{\sum fx}{\sum f}}$

Where:

${N}$ = number of observations, equal to ${\sum f}$.
${f_1, f_2, f_3, …, f_n}$ = different values of frequency f.
${x_1, x_2, x_3, …, x_n}$ = different values of variable x.

Example

Problem Statement: Calculate the Arithmetic Mean for the following discrete data:

| Items | 14 | 36 | 45 | 70 |
| --- | --- | --- | --- | --- |
| Frequency | 2 | 5 | 1 | 3 |

Solution: Based on the given data, we have:

| Items | Frequency ${f}$ | ${fx}$ |
| --- | --- | --- |
| 14 | 2 | 28 |
| 36 | 5 | 180 |
| 45 | 1 | 45 |
| 70 | 3 | 210 |
| | ${N=11}$ | ${\sum fx=463}$ |

Based on the formula above, the Arithmetic Mean ${\bar{x}}$ is:

${\bar{x} = \frac{463}{11} \\[7pt]
\, = 42.09}$

The Arithmetic Mean of the given numbers is 42.09.
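The frequency-weighted mean translates directly into code. A minimal Python sketch with the example data (variable names are illustrative):

```python
items = [14, 36, 45, 70]
frequencies = [2, 5, 1, 3]

# Sum of f * x over all items, and the total number of observations N
total_fx = sum(f * x for f, x in zip(frequencies, items))
n_obs = sum(frequencies)

mean = total_fx / n_obs

print(total_fx, n_obs, round(mean, 2))
```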

Interval Estimation

Interval estimation is the use of sample data to calculate an interval of possible (or probable) values of an unknown population parameter, in contrast to point estimation, which gives a single number.

Formula

${\mu = \bar x \pm Z_{\frac{\alpha}{2}}\frac{\sigma}{\sqrt n}}$

Where:

${\bar x}$ = sample mean
${Z_{\frac{\alpha}{2}}}$ = the confidence coefficient (critical value)
${\alpha}$ = significance level, equal to 1 minus the confidence level
${\sigma}$ = population standard deviation
${n}$ = sample size

Example

Problem Statement: Suppose a student measuring the boiling temperature of a certain liquid observes the readings (in degrees Celsius) 102.5, 101.7, 103.1, 100.9, 100.5, and 102.2 on 6 different samples of the liquid. He calculates the sample mean to be 101.82. If he knows that the standard deviation for this procedure is 1.2 degrees, what is the interval estimate for the population mean at a 95% confidence level?

Solution: The student calculated the sample mean of the boiling temperatures to be 101.82. The standard error of the mean is ${\frac{\sigma}{\sqrt n} = \frac{1.2}{\sqrt 6} = 0.49}$. The critical value for a 95% confidence interval is 1.96, where ${\frac{\alpha}{2} = \frac{1-0.95}{2} = 0.025}$. A 95% confidence interval for the unknown mean is:

${ = ((101.82 - (1.96 \times 0.49)), (101.82 + (1.96 \times 0.49))) \\[7pt]
\, = (101.82 - 0.96, 101.82 + 0.96) \\[7pt]
\, = (100.86, 102.78) }$

As the level of confidence decreases, the size of the corresponding interval decreases. Suppose the student was interested in a 90% confidence interval for the boiling temperature. In this case, the confidence level is 0.90 and ${\frac{\alpha}{2} = \frac{1-0.90}{2} = 0.05}$. The critical value for this level is 1.645, so the 90% confidence interval is:

${ = ((101.82 - (1.645 \times 0.49)), (101.82 + (1.645 \times 0.49))) \\[7pt]
\, = (101.82 - 0.81, 101.82 + 0.81) \\[7pt]
\, = (101.01, 102.63)}$

An increase in sample size decreases the length of the confidence interval without reducing the level of confidence. This is because the standard error ${\frac{\sigma}{\sqrt n}}$ decreases as n increases.

Margin of Error

The margin of error ${m}$ of an interval estimate is the value added to or subtracted from the sample mean, which determines the length of the interval:

${m = Z_{\frac{\alpha}{2}}\frac{\sigma}{\sqrt n}}$

Suppose that in the example above, the student wishes to have a margin of error equal to 0.5 with 95% confidence. Solving the expression for ${m}$ for n gives ${n = \left(Z_{\frac{\alpha}{2}}\frac{\sigma}{m}\right)^2}$, and substituting the appropriate values gives:

${ n = \left(1.96 \times \frac{1.2}{0.5}\right)^2 \\[7pt]
\, = \left(\frac{2.35}{0.5}\right)^2 \\[7pt]
\, = (4.7)^2 = 22.09 }$

Since n must be a whole number, the student will have to take 23 measurements to obtain a 95% interval estimate for the mean boiling point with a total length of no more than 1 degree.
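The interval and sample-size calculations can be sketched with Python's standard library. `statistics.NormalDist().inv_cdf` supplies the critical value instead of a lookup table; the function names `z_interval` and `required_sample_size` are hypothetical, and the sketch assumes, as the example does, that the population standard deviation is known.

```python
import math
from statistics import NormalDist, mean

readings = [102.5, 101.7, 103.1, 100.9, 100.5, 102.2]
sigma = 1.2  # known standard deviation of the procedure


def z_interval(sample, sigma, confidence):
    """Confidence interval for the mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)          # critical value Z_(alpha/2)
    standard_error = sigma / math.sqrt(len(sample))  # sigma / sqrt(n)
    margin = z * standard_error
    centre = mean(sample)
    return centre - margin, centre + margin


def required_sample_size(sigma, margin, confidence):
    """Smallest whole n whose margin of error does not exceed `margin`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin) ** 2)


low95, high95 = z_interval(readings, sigma, 0.95)
low90, high90 = z_interval(readings, sigma, 0.90)
n_needed = required_sample_size(sigma, 0.5, 0.95)

print(round(low95, 2), round(high95, 2))
print(round(low90, 2), round(high90, 2))
print(n_needed)
```

Because the code does not round the mean, the standard error, or the critical value along the way, its endpoints can differ from the hand calculation in the second decimal place.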

Outlier Function

An outlier in a data set is a value that lies more than 1.5 times the interquartile range (IQR) below the lower quartile or above the upper quartile. Specifically, if a number is less than ${Q_1 - 1.5 \times IQR}$ or greater than ${Q_3 + 1.5 \times IQR}$, then it is an outlier. The rule is given by the following formula:

Formula

${\text{Outliers are values} < Q_1 - 1.5 \times IQR \text{ or } > Q_3 + 1.5 \times IQR }$

Where:

${Q_1}$ = First Quartile
${Q_3}$ = Third Quartile
${IQR}$ = Interquartile Range, ${Q_3 - Q_1}$

Example

Problem Statement: Consider a data set that represents the periodic task counts of 8 different students. The task counts are 11, 13, 15, 3, 16, 25, 12 and 14. Find the outliers in the students' periodic task counts.

Solution:

Given data set: 11 13 15 3 16 25 12 14

Arranged in ascending order: 3 11 12 13 14 15 16 25

First Quartile Value (${Q_1}$)

${ Q_1 = \frac{(11 + 12)}{2} \\[7pt]
\, = 11.5 }$

Third Quartile Value (${Q_3}$)

${ Q_3 = \frac{(15 + 16)}{2} \\[7pt]
\, = 15.5 }$

Interquartile Range (${IQR}$)

${ IQR = Q_3 - Q_1 = 15.5 - 11.5 = 4 }$

Lower Outlier Range (L)

${ Q_1 - 1.5 \times IQR \\[7pt]
\, = 11.5 - (1.5 \times 4) \\[7pt]
\, = 11.5 - 6 \\[7pt]
\, = 5.5 }$

Upper Outlier Range (U)

${ Q_3 + 1.5 \times IQR \\[7pt]
\, = 15.5 + (1.5 \times 4) \\[7pt]
\, = 15.5 + 6 \\[7pt]
\, = 21.5 }$

All values in the data set lie between 5.5 and 21.5 except 3 and 25: 3 is less than 5.5, and 25 is greater than 21.5. Therefore 3 and 25 are the outliers.
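The same rule can be written as a short Python sketch. The function name `outlier_fences` is hypothetical, and the quartiles are taken as the medians of the lower and upper halves of the sorted data, which matches the worked example; other quartile conventions exist and can give slightly different fences. The sketch assumes at least two values.

```python
from statistics import median


def outlier_fences(data, k=1.5):
    """Return (lower, upper) limits; values outside them count as outliers."""
    ordered = sorted(data)
    half = len(ordered) // 2
    q1 = median(ordered[:half])    # median of the lower half
    q3 = median(ordered[-half:])   # median of the upper half (middle value skipped if n is odd)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


task_counts = [11, 13, 15, 3, 16, 25, 12, 14]
lower, upper = outlier_fences(task_counts)
outliers = [x for x in task_counts if x < lower or x > upper]

print(lower, upper, outliers)
```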