Statistics – Individual Series Arithmetic Mode

An individual series is data given on an individual basis, with each item listed separately. Following is an example of an individual series:

Items: 5 10 20 30 40 50 60 70

For individual items, the number of times each value occurs is counted, and the value that is repeated the maximum number of times is the modal value.

Example

Problem Statement: Calculate the Arithmetic Mode for the following individual data:

Items: 14 36 45 36 105 36

Solution: The Arithmetic Mode of the given numbers is 36, as it is repeated the maximum number of times (3).
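The counting procedure for an individual series can be sketched in Python. This is a minimal illustration, not part of the original tutorial; the function name `arithmetic_mode` is an assumption made for this example, and it returns every value that ties for the highest count.

```python
from collections import Counter


def arithmetic_mode(items):
    """Return (list of modal values, how often they occur) for an individual series."""
    counts = Counter(items)                 # value -> number of occurrences
    highest = max(counts.values())          # the maximum repetition count
    modes = [value for value, n in counts.items() if n == highest]
    return modes, highest


# Data from the worked example.
modes, times = arithmetic_mode([14, 36, 45, 36, 105, 36])
print(modes, times)  # [36] 3
```

Returning a list keeps the sketch honest for series in which two or more values share the highest count.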
Statistics – Negative Binomial Distribution

The negative binomial distribution is the probability distribution of the number of trials (successes and failures together) in a sequence of independent trials needed until a specified number of successes occurs. The key points to note about a negative binomial experiment are:

- The experiment consists of x repeated trials.
- Each trial has two possible outcomes, one a success and the other a failure.
- The probability of success is the same on every trial.
- The outcome of one trial is independent of the outcome of any other trial.
- The experiment is carried out until r successes are observed, where r is specified beforehand.

The negative binomial probability can be computed using the following formula.

Formula

${ f(x; r, P) = ^{x-1}C_{r-1} \times P^r \times (1-P)^{x-r} }$

Where:

${x}$ = Total number of trials.
${r}$ = Number of successes.
${P}$ = Probability of success on each trial.
${1-P}$ = Probability of failure on each trial.
${f(x; r, P)}$ = Negative binomial probability, that is, the probability that an x-trial negative binomial experiment results in the rth success on the xth trial, when the probability of success on each trial is P.
${^{n}C_{r}}$ = Combination of n items taken r at a time.

Example

Robert is a football player. His success rate of goal hitting is 70%. What is the probability that Robert hits his third goal on his fifth attempt?

Solution: Here the probability of success, P, is 0.70. The number of trials, x, is 5 and the number of successes, r, is 3. Using the negative binomial distribution formula, let's compute the probability of hitting the third goal on the fifth attempt.

${ f(x; r, P) = ^{x-1}C_{r-1} \times P^r \times (1-P)^{x-r} \\[7pt] \implies f(5; 3, 0.7) = ^4C_2 \times 0.7^3 \times 0.3^2 \\[7pt] \, = 6 \times 0.343 \times 0.09 \\[7pt] \, = 0.18522 }$

Thus the probability of hitting the third goal on the fifth attempt is ${0.18522}$.
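The formula can be checked numerically with a short Python sketch. The function name `negative_binomial_pmf` is an assumption made for this illustration; it uses only `math.comb` from the standard library and follows the x, r, P notation of the formula above.

```python
from math import comb


def negative_binomial_pmf(x, r, p):
    """Probability that the r-th success occurs on the x-th trial."""
    if x < r:
        return 0.0  # r successes cannot fit into fewer than r trials
    # (x-1 choose r-1) * P^r * (1-P)^(x-r)
    return comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)


# Robert's third goal on his fifth attempt, with P = 0.7.
prob = negative_binomial_pmf(5, 3, 0.7)
print(round(prob, 5))  # 0.18522
```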
Histograms
Statistics – Histograms

A histogram is a graphical representation of the distribution of numerical data. It is an estimate of the probability distribution of a continuous (quantitative) variable.

Problem Statement: Every month you measure how much weight your dog has gained and get these results:

0.5 0.5 0.3 -0.2 1.6 0 0.1 0.1 0.6 0.4

Draw the histogram showing how much the dog is growing.

Solution: The monthly gains vary from -0.2 (the dog lost weight that month) to 1.6. Putting them in order from lowest to highest weight gain:

-0.2 0 0.1 0.1 0.3 0.4 0.5 0.5 0.6 1.6

We decide to put the results into groups of width 0.5: the -0.5 to just below 0 range, the 0 to just below 0.5 range, and so on. Counting the values in each group gives 1, 5, 3, 0 and 1 for the five groups from -0.5 up to just below 2.

[Figure: histogram of monthly weight gain in groups of width 0.5]

There are no values from 1 to just below 1.5, but we still show the space.
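The grouping step can be reproduced with a small Python sketch that assigns each value to a bin of width 0.5 and prints a text histogram. The variable names are assumptions made for this illustration, and the text bars stand in for the plotted figure.

```python
import math
from collections import Counter

# Monthly weight gains from the problem statement.
gains = [0.5, 0.5, 0.3, -0.2, 1.6, 0, 0.1, 0.1, 0.6, 0.4]
width = 0.5

# Bin index k covers the range [k * width, (k + 1) * width).
bin_counts = Counter(math.floor(g / width) for g in gains)

# Walk every bin between the lowest and highest, so empty bins still appear.
for k in range(min(bin_counts), max(bin_counts) + 1):
    lower, upper = k * width, (k + 1) * width
    count = bin_counts[k]  # Counter yields 0 for a bin with no values
    print(f"{lower:>4.1f} to {upper:<4.1f} | {'#' * count} ({count})")
```

Iterating over the full range of bin indices, rather than only the occupied ones, is what keeps the empty 1.0 to 1.5 bin visible.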
Statistics – Individual Series Arithmetic Median

An individual series is data given on an individual basis, with each item listed separately. Following is an example of an individual series:

Items: 5 10 20 30 40 50 60 70

For a group with an even number of observations, the Arithmetic Median is found by taking the Arithmetic Mean of the two middle values after arranging the numbers in ascending order.

Formula

Median = Value of ${(\frac{N+1}{2})^{th}}$ item

Where:

${N}$ = Number of observations

Example

Problem Statement: Let's calculate the Arithmetic Median for the following individual data:

Items: 14 36 45 70 105 145

Solution: Based on the above formula, the Arithmetic Median M will be:

${M = \text{Value of } (\frac{N+1}{2})^{th} \text{ item} \\[7pt] \, = \text{Value of } (\frac{6+1}{2})^{th} \text{ item} \\[7pt] \, = \text{Value of } 3.5^{th} \text{ item} \\[7pt] \, = \frac{3^{rd} \text{ item} + 4^{th} \text{ item}}{2} \\[7pt] \, = \frac{45 + 70}{2} \\[7pt] \, = 57.5}$

The Arithmetic Median of the given numbers is 57.5.

For a group with an odd number of observations, the Arithmetic Median is the middle number after arranging the numbers in ascending order.

Example

Let's calculate the Arithmetic Median for the following individual data:

Items: 14 36 45 70 105

There are 5 numbers, an odd count, so the middle (3rd) number is the Arithmetic Median.

∴ The Arithmetic Median of the given numbers is 45.
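Both the even and the odd case can be sketched in a few lines of Python. The function name `arithmetic_median` is an assumption made for this illustration; the standard library's `statistics.median` computes the same quantity.

```python
def arithmetic_median(items):
    """Median of an individual series: middle value, or mean of the two middle values."""
    ordered = sorted(items)          # arrange in ascending order first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]          # odd count: the single middle item
    # even count: the (N+1)/2-th position falls between two items, so average them
    return (ordered[mid - 1] + ordered[mid]) / 2


print(arithmetic_median([14, 36, 45, 70, 105, 145]))  # 57.5
print(arithmetic_median([14, 36, 45, 70, 105]))       # 45
```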
Inverse Gamma Distribution
Statistics – Inverse Gamma Distribution

The inverse gamma distribution is the distribution of the reciprocal of a gamma-distributed random variable. It has positive parameters ${\alpha}$ and ${\beta}$, and a location parameter ${\mu}$ that does not appear in the formula given here. ${\alpha}$ controls the height: the higher the ${\alpha}$, the taller the probability density function (PDF). ${\beta}$ controls the speed. The density is defined by the following formula.

Formula

${ f(x) = \frac{x^{-(\alpha+1)} e^{\frac{-1}{\beta x}}}{\Gamma(\alpha) \beta^\alpha} \\[7pt] \, where \ x \gt 0 }$

Where:

${\alpha}$ = positive shape parameter.
${\beta}$ = positive parameter (in this form, the scale parameter of the underlying gamma distribution).
${x}$ = random variable.

[Figure: probability density function for different parameter combinations]
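The density formula can be evaluated directly with Python's standard `math` module. This is a minimal sketch; the function name `inverse_gamma_pdf` and the parameter values in the usage lines are assumptions chosen for illustration and do not come from the original page.

```python
from math import exp, gamma


def inverse_gamma_pdf(x, alpha, beta):
    """Evaluate f(x) = x^-(alpha+1) * exp(-1/(beta*x)) / (Gamma(alpha) * beta^alpha)."""
    if x <= 0:
        return 0.0  # the density is defined only for x > 0
    numerator = x ** (-(alpha + 1)) * exp(-1.0 / (beta * x))
    denominator = gamma(alpha) * beta**alpha
    return numerator / denominator


# Illustrative (hypothetical) parameter combinations, evaluated at x = 1.
for alpha, beta in [(1, 1), (2, 1), (3, 0.5)]:
    print(alpha, beta, inverse_gamma_pdf(1.0, alpha, beta))
```

With ${\alpha = 1}$ and ${\beta = 1}$ the value at ${x = 1}$ reduces to ${e^{-1}}$, which makes a convenient sanity check.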
Relative Standard Deviation
Statistics – Relative Standard Deviation

In probability theory and statistics, the coefficient of variation (CV), also known as relative standard deviation (RSD), is a standardized measure of dispersion of a probability distribution or frequency distribution. The relative standard deviation is given by the following formula.

Formula

${RSD = 100 \times \frac{s}{\bar x}}$

Where:

${s}$ = the sample standard deviation
${\bar x}$ = the sample mean

Example

Problem Statement: Find the RSD for the following set of numbers: 49, 51.3, 52.7, 55.8, given that the sample standard deviation is 2.8437065.

Solution:

Step 1 – Standard deviation of the sample: 2.8437065 (or 2.84 rounded to 2 decimal places).

Step 2 – Multiply Step 1 by 100. Set this number aside for a moment.

${2.84 \times 100 = 284}$

Step 3 – Find the sample mean, ${\bar x}$. The sample mean is:

${\frac{(49 + 51.3 + 52.7 + 55.8)}{4} = \frac{208.8}{4} = 52.2}$

Step 4 – Divide Step 2 by the absolute value of Step 3.

${\frac{284}{|52.2|} = 5.44}$

The RSD is 5.44%, so the result can be written as ${52.2 \pm 5.4}$%. Note that the RSD is expressed as a percentage.
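The four steps can be reproduced with Python's standard `statistics` module. This is a minimal sketch with variable names chosen for illustration. Because it does not round the standard deviation to 2.84 first, it gives roughly 5.45 rather than the 5.44 obtained in the worked example.

```python
from statistics import mean, stdev

values = [49, 51.3, 52.7, 55.8]

s = stdev(values)            # sample standard deviation (n - 1 in the denominator)
x_bar = mean(values)         # sample mean
rsd = 100 * s / abs(x_bar)   # relative standard deviation, in percent

print(f"s = {s:.7f}, mean = {x_bar:.1f}, RSD = {rsd:.2f}%")
```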
Scatterplots
Statistics – Scatterplots

A scatterplot is a graphical way to display the relationship between two quantitative sample variables. It consists of an X axis, a Y axis and a series of dots, where each dot represents one observation from a data set. The position of the dot corresponds to its X and Y values.

Patterns of Data in Scatterplots

Scatterplots are used to analyze patterns, which generally vary in linearity, slope, and strength.

Linearity – The data pattern is either linear (straight) or nonlinear (curved).

Slope – The direction of change in variable Y as variable X increases. If Y increases as X increases, the slope is positive; if Y decreases as X increases, the slope is negative.

Strength – The degree of spread of the scatter in the plot. If the dots are widely dispersed, the relationship is considered weak. If the dots are densely concentrated around a line, the relationship is said to be strong.
Frequency Distribution
Statistics – Frequency Distribution

A frequency distribution is a table that displays the frequency of various outcomes in a sample. Each entry in the table contains the frequency, or count, of the occurrences of values within a particular group or interval. In this way, the table summarizes the distribution of values in the sample.

Example

Problem Statement: A survey was taken on Maple Avenue. In each of 20 homes, people were asked how many cars were registered to their households. The results were recorded as follows:

1 2 1 0 3 4 0 1 1 1 2 2 3 2 3 2 1 4 0 0

Construct a frequency distribution table for this data.

Solution: Follow these steps to present the data in a frequency distribution table.

1. Divide the results (x) into intervals, and then count the number of results in each interval. In this case, the intervals are the number of households with no car (0), one car (1), two cars (2) and so forth.
2. Make a table with separate columns for the interval numbers (the number of cars per household), the tallied results, and the frequency of results in each interval. Label these columns Number of cars, Tally and Frequency.
3. Read the list of data from left to right and place a tally mark in the appropriate row. For example, the first result is a 1, so place a tally mark in the row beside where 1 appears in the interval column (Number of cars). The next result is a 2, so place a tally mark in the row beside the 2, and so on. When you reach your fifth tally mark, draw a tally line through the preceding four marks to make your final frequency calculations easier to read.
4. Add up the number of tally marks in each row and record them in the final column, entitled Frequency.
Your frequency distribution table for this exercise should look like this:

Frequency table for the number of cars registered in each household

Number of cars (x)    Tally                                                               Frequency (f)
0                     ${\lvert\lvert\lvert\lvert}$                                        4
1                     ${\require{cancel} \cancel{\lvert\lvert\lvert\lvert} \lvert}$       6
2                     ${\cancel{\lvert\lvert\lvert\lvert}}$                               5
3                     ${\lvert\lvert\lvert}$                                              3
4                     ${\lvert\lvert}$                                                    2

The frequencies add up to the 20 households surveyed. By looking at this frequency distribution table quickly, we can see that, out of 20 households, 4 had no cars and 6 had 1 car.
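The tallying steps can be sketched with `collections.Counter`, counting directly from the 20 recorded values. The variable names are assumptions made for this illustration, and the row of `|` characters is a plain-text stand-in for tally marks.

```python
from collections import Counter

# Survey results: number of cars registered in each of the 20 households.
cars = [1, 2, 1, 0, 3, 4, 0, 1, 1, 1, 2, 2, 3, 2, 3, 2, 1, 4, 0, 0]

frequency = Counter(cars)  # number of cars -> number of households

print("Number of cars (x)  Tally   Frequency (f)")
for x in sorted(frequency):
    f = frequency[x]
    print(f"{x:<19} {'|' * f:<7} {f}")

print("Total households:", sum(frequency.values()))  # Total households: 20
```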
Quartile Deviation
Statistics – Quartile Deviation

The quartile deviation depends on the lower quartile ${Q_1}$ and the upper quartile ${Q_3}$. The difference ${Q_3 - Q_1}$ is called the interquartile range. The difference ${Q_3 - Q_1}$ divided by 2 is called the semi-interquartile range or the quartile deviation.

Formula

${Q.D. = \frac{Q_3 - Q_1}{2}}$

Coefficient of Quartile Deviation

A relative measure of dispersion based on the quartile deviation is known as the coefficient of quartile deviation. It is defined as

${\text{Coefficient of Quartile Deviation} = \frac{Q_3 - Q_1}{Q_3 + Q_1}}$

Example

Problem Statement: Calculate the quartile deviation and coefficient of quartile deviation from the data given below:

Maximum Load (short-tons)    Number of Cables
9.3-9.7                      2
9.8-10.2                     5
10.3-10.7                    12
10.8-11.2                    17
11.3-11.7                    14
11.8-12.2                    6
12.3-12.7                    3
12.8-13.2                    1

Solution:

Maximum Load (short-tons)    Number of Cables (f)    Class Boundaries    Cumulative Frequencies
9.3-9.7                      2                       9.25-9.75           2
9.8-10.2                     5                       9.75-10.25          2 + 5 = 7
10.3-10.7                    12                      10.25-10.75         7 + 12 = 19
10.8-11.2                    17                      10.75-11.25         19 + 17 = 36
11.3-11.7                    14                      11.25-11.75         36 + 14 = 50
11.8-12.2                    6                       11.75-12.25         50 + 6 = 56
12.3-12.7                    3                       12.25-12.75         56 + 3 = 59
12.8-13.2                    1                       12.75-13.25         59 + 1 = 60

In the formulas below, l is the lower boundary of the quartile class, h is the class width, f is the frequency of the quartile class, and c is the cumulative frequency of the class preceding it.

${Q_1}$

${Q_1}$ = Value of ${(\frac{n}{4})^{th}}$ item = Value of ${(\frac{60}{4})^{th}}$ item = ${15^{th}}$ item. Thus ${Q_1}$ lies in class 10.25-10.75.

${Q_1 = l + \frac{h}{f}(\frac{n}{4} - c) \\[7pt] \, \text{where } l=10.25, h=0.5, f=12, \frac{n}{4}=15 \text{ and } c=7 \\[7pt] \, = 10.25+\frac{0.5}{12} (15-7) \\[7pt] \, = 10.25+0.33 \\[7pt] \, = 10.58 }$

${Q_3}$

${Q_3}$ = Value of ${(\frac{3n}{4})^{th}}$ item = Value of ${(\frac{3 \times 60}{4})^{th}}$ item = ${45^{th}}$ item. Thus ${Q_3}$ lies in class 11.25-11.75.

${Q_3 = l + \frac{h}{f}(\frac{3n}{4} - c) \\[7pt] \, \text{where } l=11.25, h=0.5, f=14, \frac{3n}{4}=45 \text{ and } c=36 \\[7pt] \, = 11.25+\frac{0.5}{14} (45-36) \\[7pt] \, = 11.25+0.32 \\[7pt] \, = 11.57 }$

Quartile Deviation

${Q.D. = \frac{Q_3 - Q_1}{2} \\[7pt] \, = \frac{11.57 - 10.58}{2} \\[7pt] \, = \frac{0.99}{2} \\[7pt] \, = 0.495 }$

Coefficient of Quartile Deviation

${\text{Coefficient of Quartile Deviation} = \frac{Q_3 - Q_1}{Q_3 + Q_1} \\[7pt] \, = \frac{11.57 - 10.58}{11.57 + 10.58} \\[7pt] \, = \frac{0.99}{22.15} \\[7pt] \, = 0.045 }$
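The grouped-data calculation can be sketched in Python as a check on the arithmetic. The helper name `grouped_quantile` and the data layout are assumptions made for this illustration. Since the sketch does not round ${Q_1}$ and ${Q_3}$ to two decimals before subtracting, its quartile deviation comes out near 0.494 rather than 0.495.

```python
# Each class is (lower class boundary, frequency); every class has width h = 0.5.
classes = [
    (9.25, 2), (9.75, 5), (10.25, 12), (10.75, 17),
    (11.25, 14), (11.75, 6), (12.25, 3), (12.75, 1),
]
h = 0.5


def grouped_quantile(classes, h, position):
    """Interpolate the item at `position` using l + (h / f) * (position - c)."""
    c = 0  # cumulative frequency of the classes before the current one
    for l, f in classes:
        if c + f >= position:
            return l + (h / f) * (position - c)
        c += f
    raise ValueError("position exceeds the total frequency")


n = sum(f for _, f in classes)              # 60 cables in total
q1 = grouped_quantile(classes, h, n / 4)    # 15th item
q3 = grouped_quantile(classes, h, 3 * n / 4)  # 45th item

qd = (q3 - q1) / 2
coefficient = (q3 - q1) / (q3 + q1)
print(f"Q1 = {q1:.2f}, Q3 = {q3:.2f}, Q.D. = {qd:.3f}, coefficient = {coefficient:.3f}")
```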
Statistics – Discrete Series Arithmetic Mode

A discrete series is data given along with the frequency of each item. Following is an example of a discrete series:

Items        5    10    20    30    40    50    60    70
Frequency    2    5     1     3     12    0     5     7

In a discrete series, the Arithmetic Mode can be determined by inspection, by finding the item that has the highest frequency associated with it. However, when there is very little difference between the maximum frequency and the frequency preceding or succeeding it, the grouping table method is used.

Example

Problem Statement: Calculate the Arithmetic Mode for the following discrete data:

Items        14    36    45    70    105    145
Frequency    2     5     1     3     12     0

Solution: The Arithmetic Mode of the given numbers is 105, as the highest frequency, 12, is associated with 105.
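Finding the mode by inspection amounts to picking the item paired with the largest frequency, as this short Python sketch shows. The variable names are assumptions made for this illustration, and the sketch covers only the inspection method, not the grouping table method.

```python
# Discrete series from the worked example: each item with its frequency.
items = [14, 36, 45, 70, 105, 145]
frequencies = [2, 5, 1, 3, 12, 0]

highest = max(frequencies)
# Collect every item whose frequency equals the highest one (ties are possible).
modes = [item for item, f in zip(items, frequencies) if f == highest]

print(modes, highest)  # [105] 12
```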