Kurtosis

The degree of tailedness of a distribution is measured by kurtosis. It tells us the extent to which the distribution is more or less outlier-prone (heavier- or lighter-tailed) than the normal distribution.

[Figure: three types of curves shown as density plots (left panel) and normal quantile-quantile plots (right panel); courtesy of Investopedia.]

It is difficult to discern different types of kurtosis from the density plots (left panel) because the tails are close to zero for all distributions. But differences in the tails are easy to see in the normal quantile-quantile plots (right panel).

The normal curve is called a mesokurtic curve. If the curve of a distribution is more outlier-prone (heavier-tailed) than a normal or mesokurtic curve, it is referred to as a leptokurtic curve. If a curve is less outlier-prone (lighter-tailed) than a normal curve, it is called a platykurtic curve. Kurtosis is measured by moments and is given by the following formula.

Formula

$\beta_2 = \frac{\mu_4}{\mu_2^2}$

Where:

$\mu_4 = \frac{\sum (x - \bar x)^4}{N}$ is the fourth moment about the mean and $\mu_2 = \frac{\sum (x - \bar x)^2}{N}$ is the second moment about the mean (the variance).

The greater the value of $\beta_2$, the heavier-tailed (more leptokurtic) the curve. A normal curve has $\beta_2 = 3$, a leptokurtic curve has $\beta_2$ greater than 3 and a platykurtic curve has $\beta_2$ less than 3.

Example

Problem Statement:

The data on daily wages of 45 workers of a factory are given. Compute $\beta_1$ and $\beta_2$ using moments about the mean. Comment on the results.

| Wages (Rs.) | Number of Workers |
|---|---|
| 100-120 | 1 |
| 120-140 | 2 |
| 140-160 | 6 |
| 160-180 | 20 |
| 180-200 | 11 |
| 200-220 | 3 |
| 220-240 | 2 |

Solution:

| Wages (Rs.) | Number of Workers (f) | Mid-point m | $d = \frac{m - 170}{20}$ | $fd$ | $fd^2$ | $fd^3$ | $fd^4$ |
|---|---|---|---|---|---|---|---|
| 100-120 | 1 | 110 | -3 | -3 | 9 | -27 | 81 |
| 120-140 | 2 | 130 | -2 | -4 | 8 | -16 | 32 |
| 140-160 | 6 | 150 | -1 | -6 | 6 | -6 | 6 |
| 160-180 | 20 | 170 | 0 | 0 | 0 | 0 | 0 |
| 180-200 | 11 | 190 | 1 | 11 | 11 | 11 | 11 |
| 200-220 | 3 | 210 | 2 | 6 | 12 | 24 | 48 |
| 220-240 | 2 | 230 | 3 | 6 | 18 | 54 | 162 |
| Total | $N = 45$ | | | $\sum fd = 10$ | $\sum fd^2 = 64$ | $\sum fd^3 = 40$ | $\sum fd^4 = 340$ |

Since the deviations have been taken from an assumed mean, we first calculate moments about the arbitrary origin and then moments about the mean. Here $i = 20$ is the class width.

Moments about the arbitrary origin 170

$\mu'_1 = \frac{\sum fd}{N} \times i = \frac{10}{45} \times 20 = 4.444$

$\mu'_2 = \frac{\sum fd^2}{N} \times i^2 = \frac{64}{45} \times 20^2 = 568.889$

$\mu'_3 = \frac{\sum fd^3}{N} \times i^3 = \frac{40}{45} \times 20^3 = 7111.111$

$\mu'_4 = \frac{\sum fd^4}{N} \times i^4 = \frac{340}{45} \times 20^4 = 1208888.889$

Moments about the mean (intermediate values are carried unrounded)

$\mu_2 = \mu'_2 - (\mu'_1)^2 = 568.889 - 19.753 = 549.136$

$\mu_3 = \mu'_3 - 3(\mu'_1)(\mu'_2) + 2(\mu'_1)^3 = 7111.111 - 7585.185 + 175.583 = -298.491$

$\mu_4 = \mu'_4 - 4(\mu'_1)(\mu'_3) + 6(\mu'_1)^2(\mu'_2) - 3(\mu'_1)^4 = 1208888.889 - 126419.753 + 67423.868 - 1170.553 = 1148722.451$

From the values of the moments about the mean, we can now calculate $\beta_1$ and $\beta_2$:

$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{(-298.491)^2}{(549.136)^3} = 0.00054$

$\beta_2 = \frac{\mu_4}{(\mu_2)^2} = \frac{1148722.451}{(549.136)^2} = 3.81$

From the above calculations, it can be concluded that $\beta_1$, which measures skewness, is almost zero, indicating that the distribution is almost symmetrical. $\beta_2$, which measures kurtosis, has a value greater than 3, implying that the distribution is leptokurtic.
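The moment calculation in this example can be checked numerically. The following is a minimal sketch, not part of the original solution: it computes the moments directly about the mean from the class mid-points and frequencies rather than going through an assumed origin, and the helper name `central_moment` is an illustrative assumption.

```python
# Class mid-points and frequencies taken from the wage table above.
midpoints = [110, 130, 150, 170, 190, 210, 230]
freqs = [1, 2, 6, 20, 11, 3, 2]


def central_moment(xs, fs, k):
    """Return the k-th moment about the mean for grouped data."""
    n = sum(fs)
    mean = sum(f * x for x, f in zip(xs, fs)) / n
    return sum(f * (x - mean) ** k for x, f in zip(xs, fs)) / n


mu2 = central_moment(midpoints, freqs, 2)
mu3 = central_moment(midpoints, freqs, 3)
mu4 = central_moment(midpoints, freqs, 4)

beta1 = mu3 ** 2 / mu2 ** 3   # squared-skewness measure
beta2 = mu4 / mu2 ** 2        # kurtosis measure

print(round(beta1, 5), round(beta2, 2))
```

Summing the $fd^4$ column row by row gives 340, and with that total both routes agree on $\beta_2 \approx 3.81$, so the distribution is leptokurtic.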

Pooled Variance (r)

Pooled variance is the weighted average used to estimate the variance of two independent variables, where the mean can vary between samples but the true variance remains the same.

Example

Problem Statement:

Compute the Pooled Variance of the numbers 1, 2, 3, 4 and 5.

Solution:

Step 1

Determine the average (mean) of the given data set by adding all the numbers and dividing by the total count of numbers in the data set.

$\text{Mean} = \frac{1 + 2 + 3 + 4 + 5}{5} = \frac{15}{5} = 3$

Step 2

Then subtract the mean from each number in the data set.

$\Rightarrow (1 - 3), (2 - 3), (3 - 3), (4 - 3), (5 - 3) \Rightarrow -2, -1, 0, 1, 2$

Step 3

Square each deviation to remove the negative signs.

$\Rightarrow (-2)^2, (-1)^2, (0)^2, (1)^2, (2)^2 \Rightarrow 4, 1, 0, 1, 4$

Step 4

Now find the standard deviation using the equation below.

$S = \sqrt{\frac{\sum (X - M)^2}{n - 1}}$

Standard Deviation = $\frac{\sqrt{10}}{\sqrt{4}} = 1.58114$

The variance is the square of the standard deviation: $\text{Var} = S^2 = \frac{10}{4} = 2.5$.

Step 5

$\text{Pooled Variance } (r) = \frac{(\text{total count of numbers} - 1) \times \text{Var}}{(\text{total count of numbers} - 1)}$

$(r) = (5 - 1) \times \frac{2.5}{(5 - 1)}$

$= \frac{(4 \times 2.5)}{4} = 2.5$

Hence, Pooled Variance (r) = 2.5.
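With a single data set the weights cancel, so the pooled variance is just the sample variance. The sketch below shows the weighted form for several groups, assuming the usual weights of (group size - 1). The function name `pooled_variance` and the second group of numbers are hypothetical and used only for illustration.

```python
from statistics import variance


def pooled_variance(groups):
    """Weighted average of sample variances, each weighted by (n_i - 1)."""
    numerator = sum((len(g) - 1) * variance(g) for g in groups)
    denominator = sum(len(g) - 1 for g in groups)
    return numerator / denominator


# The single data set from the example above.
single = pooled_variance([[1, 2, 3, 4, 5]])

# Two groups: the example data plus a hypothetical second sample.
two = pooled_variance([[1, 2, 3, 4, 5], [2, 4, 6, 8]])

print(single, two)
```

For the single group the function returns 2.5, matching Step 5. With two groups, the result lies between the two sample variances.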

Laplace Distribution

The Laplace distribution represents the distribution of differences between two independent variables having identical exponential distributions. It is also called the double exponential distribution.

Probability density function

The probability density function of the Laplace distribution is given as:

Formula

$L(x | \mu, b) = \frac{1}{2b} e^{- \frac{| x - \mu |}{b}}$

$= \frac{1}{2b} \begin{cases} e^{- \frac{\mu - x}{b}}, & \text{if } x \lt \mu \\[7pt] e^{- \frac{x - \mu}{b}}, & \text{if } x \ge \mu \end{cases}$

Where:

$\mu$ = location parameter.
$b$ = scale parameter, with $b \gt 0$.
$x$ = random variable.

Cumulative distribution function

The cumulative distribution function of the Laplace distribution is given as:

Formula

$D(x) = \int_{- \infty}^x L(u | \mu, b) \, du$

$= \begin{cases} \frac{1}{2} e^{\frac{x - \mu}{b}}, & \text{if } x \lt \mu \\[7pt] 1 - \frac{1}{2} e^{- \frac{x - \mu}{b}}, & \text{if } x \ge \mu \end{cases}$

$= \frac{1}{2} + \frac{1}{2} \operatorname{sgn}(x - \mu) \left(1 - e^{- \frac{| x - \mu |}{b}}\right)$

with $\mu$, $b$ and $x$ as defined above.
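The density and the two equivalent forms of the cumulative distribution function can be compared numerically. This is a minimal sketch using only the standard library. The function names are illustrative assumptions, not part of any library.

```python
import math


def laplace_pdf(x, mu, b):
    """Density (1 / 2b) * exp(-|x - mu| / b)."""
    return math.exp(-abs(x - mu) / b) / (2 * b)


def laplace_cdf(x, mu, b):
    """Piecewise form of the cumulative distribution function."""
    if x < mu:
        return 0.5 * math.exp((x - mu) / b)
    return 1 - 0.5 * math.exp(-(x - mu) / b)


def laplace_cdf_sgn(x, mu, b):
    """Single-expression form using the sign of (x - mu)."""
    sgn = (x > mu) - (x < mu)  # evaluates to -1, 0 or 1
    return 0.5 + 0.5 * sgn * (1 - math.exp(-abs(x - mu) / b))


for x in (-3.0, 0.0, 1.5, 4.0):
    print(x, laplace_pdf(x, 1.0, 2.0), laplace_cdf(x, 1.0, 2.0), laplace_cdf_sgn(x, 1.0, 2.0))
```

Both forms of the cumulative distribution function should agree at every point, and each equals 0.5 at $x = \mu$.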

One Proportion Z Test

The test statistic is a z-score (z) defined by the following equation.

$z = \frac{(p - P)}{\sigma}$

where P is the hypothesized value of the population proportion in the null hypothesis, p is the sample proportion, and $\sigma$ is the standard deviation of the sampling distribution.

The test statistic is defined and given by the following formula:

Formula

$z = \frac{\hat p - p_o}{\sqrt{\frac{p_o(1 - p_o)}{n}}}$

Where:

$z$ = Test statistic
$n$ = Sample size
$p_o$ = Null hypothesized value
$\hat p$ = Observed proportion

Example

Problem Statement:

A survey claims that 9 out of 10 doctors recommend aspirin for their patients with headaches. To test this claim, a random sample of 100 doctors is obtained. Of these 100 doctors, 82 indicate that they recommend aspirin. Is this claim accurate? Use alpha = 0.05.

Solution:

Define the null and alternative hypotheses:

$H_0: p = .90$

$H_1: p \ne .90$

Here alpha = 0.05. Using an alpha of 0.05 with a two-tailed test, we have 0.025 in each tail. Looking up 1 - 0.025 in our z-table, we find a critical value of 1.96. Thus, our decision rule for this two-tailed test is: if z is less than -1.96 or greater than 1.96, reject the null hypothesis.

Calculate the test statistic:

$\hat p = .82$

$p_o = .90$

$n = 100$

$z = \frac{.82 - .90}{\sqrt{\frac{.90 (1 - .90)}{100}}} = \frac{-.08}{0.03} = -2.667$

Since z = -2.667 is less than -1.96, we reject the null hypothesis. In conclusion, the claim that 9 out of 10 doctors recommend aspirin for their patients is not accurate, z = -2.667, p < 0.05.
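The same test can be sketched in a few lines using `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The function name `one_proportion_z` is an illustrative assumption. The two-sided p-value is computed from the standard normal distribution rather than read from a z-table.

```python
import math
from statistics import NormalDist


def one_proportion_z(p_hat, p0, n):
    """Return the z statistic and two-sided p-value for a one proportion z test."""
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value


alpha = 0.05
z, p_value = one_proportion_z(p_hat=0.82, p0=0.90, n=100)

# Two-tailed critical value (about 1.96 for alpha = 0.05).
critical = NormalDist().inv_cdf(1 - alpha / 2)
reject = abs(z) > critical

print(round(z, 3), round(critical, 2), reject)
```

For the aspirin example the statistic falls beyond the lower critical value, so the null hypothesis is rejected.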

Inverse Gamma Distribution

The inverse gamma distribution is the distribution of the reciprocal of a gamma-distributed random variable, with positive parameters $\alpha$ and $\beta$ and location parameter $\mu$. $\alpha$ controls the height: the higher the $\alpha$, the taller the probability density function (PDF). $\beta$ controls the spread. The density is defined by the following formula, which takes $\mu = 0$.

Formula

$f(x) = \frac{x^{-(\alpha+1)} e^{\frac{-1}{\beta x}}}{\Gamma(\alpha) \beta^\alpha}, \text{ where } x \gt 0$

Where:

$\alpha$ = positive shape parameter.
$\beta$ = positive parameter controlling the spread (in this parameterization, the scale parameter of the underlying gamma distribution).
$x$ = random variable.

[Figure: probability density function for different parameter combinations.]
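The density formula can be evaluated directly with `math.gamma` from the standard library. The sketch below implements the parameterization exactly as written in the formula above. The function name is an illustrative assumption, and other references may use $1/\beta$ where this page uses $\beta$.

```python
import math


def inverse_gamma_pdf(x, alpha, beta):
    """Density x^-(alpha+1) * exp(-1 / (beta * x)) / (Gamma(alpha) * beta^alpha)."""
    if x <= 0:
        return 0.0  # the density is defined only for x > 0
    numerator = x ** (-(alpha + 1)) * math.exp(-1 / (beta * x))
    return numerator / (math.gamma(alpha) * beta ** alpha)


# Evaluate the density at a few points for two parameter combinations.
for alpha, beta in ((1, 1), (2, 0.5)):
    print(alpha, beta, [round(inverse_gamma_pdf(x, alpha, beta), 4) for x in (0.5, 1, 2, 4)])
```

For $\alpha = 1$ and $\beta = 1$ the formula reduces to $x^{-2} e^{-1/x}$, which gives a quick sanity check at $x = 1$.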

Relative Standard Deviation

In probability theory and statistics, the coefficient of variation (CV), also known as relative standard deviation (RSD), is a standardized measure of dispersion of a probability distribution or frequency distribution.

The relative standard deviation, RSD, is defined and given by the following formula:

Formula

$100 \times \frac{s}{|\bar x|}$

Where:

$s$ = the sample standard deviation
$\bar x$ = sample mean

Example

Problem Statement:

Find the RSD for the following set of numbers: 49, 51.3, 52.7, 55.8, given that the standard deviation is 2.8437065.

Solution:

Step 1: Standard deviation of the sample: 2.8437065 (or 2.84 rounded to 2 decimal places).

Step 2: Multiply Step 1 by 100. Set this number aside for a moment.

$2.84 \times 100 = 284$

Step 3: Find the sample mean, $\bar x$. The sample mean is:

$\frac{(49 + 51.3 + 52.7 + 55.8)}{4} = \frac{208.8}{4} = 52.2$

Step 4: Divide Step 2 by the absolute value of Step 3.

$\frac{284}{|52.2|} = 5.44$

The RSD is 5.44%, so the result can be reported as $52.2 \pm 5.4\%$. Note that the RSD is expressed as a percentage.
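The four steps collapse into one expression when the standard deviation is computed from the data instead of being supplied. This is a minimal sketch using the standard library's `statistics` module. The function name is an illustrative assumption.

```python
from statistics import mean, stdev


def relative_standard_deviation(values):
    """Return 100 * s / |mean|, with s the sample standard deviation."""
    return 100 * stdev(values) / abs(mean(values))


data = [49, 51.3, 52.7, 55.8]
s = stdev(data)
rsd = relative_standard_deviation(data)

print(round(s, 7), round(rsd, 2))
```

Because the code carries the unrounded standard deviation (about 2.8437) rather than 2.84, it gives roughly 5.45% instead of the 5.44% obtained in the hand calculation.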

Scatterplots

A scatterplot is a graphical way to display the relationship between two quantitative sample variables. It consists of an X axis, a Y axis and a series of dots, where each dot represents one observation from a data set. The position of the dot corresponds to its X and Y values.

Patterns of Data in Scatterplots

Scatterplots are used to analyze patterns, which generally vary in linearity, slope, and strength.

Linearity: the data pattern is either linear (straight) or nonlinear (curved).

Slope: the direction of change in variable Y as variable X increases. If Y increases as X increases, the slope is positive; otherwise the slope is negative.

Strength: the degree of spread of the scatter in the plot. If the dots are widely dispersed, the relationship is considered weak. If the dots are densely concentrated around a line, the relationship is said to be strong.

Process Sigma

Process sigma can be determined using the following four steps: measure opportunities, measure defects, calculate yield, and look up the process sigma.

Formulae Used

$\text{DPMO} = \frac{\text{Total defects}}{\text{Total opportunities}} \times 1000000$

$\text{Defect (\%)} = \frac{\text{Total defects}}{\text{Total opportunities}} \times 100$

$\text{Yield (\%)} = 100 - \text{Defect (\%)}$

$\text{Process Sigma} = 0.8406 + \sqrt{29.37 - 2.221 \times \ln(\text{DPMO})}$

Where:

Opportunities = lowest defect noticeable by the customer.
DPMO = Defects per Million Opportunities.

Example

Problem Statement:

An equipment manufacturer produces 10000 hard plates, of which 5 are defective. Find the process sigma.

Solution:

Given: Opportunities = 10000 and Defects = 5. Substitute the given values into the formulae.

Step 1: Compute DPMO

$\text{DPMO} = \frac{\text{Total defects}}{\text{Total opportunities}} \times 1000000 = \frac{5}{10000} \times 1000000 = 500$

Step 2: Compute Defect (%)

$\text{Defect (\%)} = \frac{\text{Total defects}}{\text{Total opportunities}} \times 100 = \frac{5}{10000} \times 100 = 0.05$

Step 3: Compute Yield (%)

$\text{Yield (\%)} = 100 - \text{Defect (\%)} = 100 - 0.05 = 99.95$

Step 4: Compute Process Sigma

$\text{Process Sigma} = 0.8406 + \sqrt{29.37 - 2.221 \times \ln(\text{DPMO})}$

$= 0.8406 + \sqrt{29.37 - 2.221 \times \ln(500)}$

$= 0.8406 + \sqrt{29.37 - 13.803}$

$= 0.8406 + 3.9455 = 4.79$
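The four steps can be sketched as a short script. It reads the logarithm as the natural logarithm and takes the square root over the whole expression 29.37 - 2.221 × ln(DPMO), which is the reading that reproduces the worked answer of about 4.79. The function names are illustrative assumptions.

```python
import math


def dpmo(defects, opportunities):
    """Defects per million opportunities."""
    return defects * 1_000_000 / opportunities


def process_sigma(dpmo_value):
    """Approximate process sigma from DPMO (natural logarithm assumed)."""
    return 0.8406 + math.sqrt(29.37 - 2.221 * math.log(dpmo_value))


defects, opportunities = 5, 10000

dpmo_value = dpmo(defects, opportunities)
defect_pct = defects * 100 / opportunities
yield_pct = 100 - defect_pct
sigma = process_sigma(dpmo_value)

print(dpmo_value, defect_pct, yield_pct, round(sigma, 2))
```

Note that the approximation is only defined while 29.37 - 2.221 × ln(DPMO) stays non-negative, so very large DPMO values would raise a math domain error in this sketch.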

Harmonic Resonance Frequency

Harmonic resonance frequency represents a signal or wave whose frequency is an integer multiple of the frequency of a reference signal or wave.

Formula

$f = \frac{1}{2 \pi \sqrt{LC}}$

Where:

$f$ = Harmonic resonance frequency.
$L$ = inductance of the load.
$C$ = capacitance of the load.

Example

Calculate the harmonic resonance frequency of a power system with capacitance 5F, inductance 6H and frequency 200Hz.

Solution:

Here the capacitance, C, is 5F and the inductance, L, is 6H. The formula depends only on L and C, so the 200Hz system frequency is not used. Using the harmonic resonance frequency formula, let's compute the resonance frequency as:

$f = \frac{1}{2 \pi \sqrt{LC}}$

$\implies f = \frac{1}{2 \pi \sqrt{6 \times 5}}$

$= \frac{1}{2 \times 3.14 \times \sqrt{30}}$

$= \frac{1}{6.28 \times 5.4772}$

$= \frac{1}{34.3968}$

$= 0.0291$

Thus the harmonic resonance frequency is 0.0291 Hz.
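The substitution can be checked with a one-line function. This is a minimal sketch, with the function name chosen only for illustration. It uses the full value of π from the `math` module rather than 3.14.

```python
import math


def resonance_frequency(inductance, capacitance):
    """Return 1 / (2 * pi * sqrt(L * C)), in hertz for L in henries and C in farads."""
    return 1 / (2 * math.pi * math.sqrt(inductance * capacitance))


f = resonance_frequency(inductance=6, capacitance=5)
print(round(f, 4))
```

Using the unrounded π changes the denominator slightly (about 34.414 instead of 34.3968) but the result still rounds to 0.0291.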

Grand Mean

When sample sizes are equal (for example, five values in each sample, or n values in each sample), the grand mean is the same as the mean of the sample means.

Formula

$X_{GM} = \frac{\sum x}{N}$

Where:

$N$ = Total number of sets.
$\sum x$ = sum of the means of all sets.

Example

Problem Statement:

Determine the mean of each group or set's samples. Use the following data as a sample to determine the mean and grand mean.

| Group | Values |
|---|---|
| Jackson | 1, 6, 7, 10, 4 |
| Thomas | 5, 2, 8, 14, 6 |
| Garrard | 8, 2, 9, 12, 7 |

Solution:

Step 1: Compute all means

$M_1 = \frac{1+6+7+10+4}{5} = \frac{28}{5} = 5.6$

$M_2 = \frac{5+2+8+14+6}{5} = \frac{35}{5} = 7$

$M_3 = \frac{8+2+9+12+7}{5} = \frac{38}{5} = 7.6$

Step 2: Divide the total of the means by the number of groups to determine the grand mean. In the sample, there are three groups.

$X_{GM} = \frac{5.6+7+7.6}{3} = \frac{20.2}{3} = 6.73$
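The two steps can be sketched as follows. The code also computes the mean of all fifteen values pooled together, to illustrate that the two agree when every group has the same size. The variable names are illustrative assumptions.

```python
from statistics import mean

groups = {
    "Jackson": [1, 6, 7, 10, 4],
    "Thomas": [5, 2, 8, 14, 6],
    "Garrard": [8, 2, 9, 12, 7],
}

# Step 1: mean of each group.
group_means = {name: mean(values) for name, values in groups.items()}

# Step 2: mean of the group means.
grand_mean = mean(group_means.values())

# Cross-check: mean of all observations pooled into one list.
pooled_mean = mean(v for values in groups.values() for v in values)

print(group_means, round(grand_mean, 2), round(pooled_mean, 2))
```

If the groups had different sizes, the mean of the group means and the pooled mean would generally differ, which is why the equal-size condition is stated at the start.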