Venn Diagram

A Venn diagram is a way to visually represent the relationships between groups of entities or objects. Venn diagrams are made up of circles, where each circle represents a whole set. A Venn diagram can contain any number of circles, but two or three are generally preferred; beyond that the diagram becomes too complex to read.

Steps to draw a Venn Diagram

Consider the following sets of people:

Cricket Players: $ C = \{ Ram, Shyam, Mohan, Rohan, Ramesh, Suresh \} $
Hockey Players: $ H = \{ Ramesh, Naresh, Mahesh, Leela, Sunita \} $

Step 1: Draw a rectangle and label it as players.
Step 2: Draw two circles and label them as Cricket and Hockey. Make sure that the circles overlap each other.
Step 3: Write the names inside the relevant circle. Names common to both sets should fall within the overlapping region.

Union

Union ($ \cup $) represents the set of items that are present in at least one of the categories, with each item listed only once.

Example

Problem Statement: Draw a Venn diagram of $ C \cup H $.

Solution:

Step 1: Determine the players who play either cricket or hockey. Draw them as follows:

$ C \cup H = \{ Ram, Shyam, Mohan, Rohan, Ramesh, Suresh, Naresh, Mahesh, Leela, Sunita \} $

Intersection

Intersection ($ \cap $) represents the set of items that are present in both categories.

Example

Problem Statement: Draw a Venn diagram of $ C \cap H $.

Solution:

Step 1: Determine the players who play both cricket and hockey. Draw them as follows:

$ C \cap H = \{ Ramesh \} $

Difference

Difference ($ - $) represents the set of items that are present in the first category but not in the second.

Example

Problem Statement: Draw a Venn diagram of $ C - H $.

Solution:

Step 1: Determine the players who play cricket only. Draw them as follows:

$ C - H = \{ Ram, Shyam, Mohan, Rohan, Suresh \} $
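The three set operations above map directly onto Python's built-in `set` type. The following is a minimal sketch using the two player sets from the example; the variable names are illustrative and not part of the original text.

```python
# Player sets taken from the example above.
cricket = {"Ram", "Shyam", "Mohan", "Rohan", "Ramesh", "Suresh"}
hockey = {"Ramesh", "Naresh", "Mahesh", "Leela", "Sunita"}

union = cricket | hockey          # C union H: plays cricket or hockey (or both)
intersection = cricket & hockey   # C intersect H: plays both games
difference = cricket - hockey     # C minus H: plays cricket but not hockey

print(sorted(union))
print(sorted(intersection))
print(sorted(difference))
```

Because a set stores each element once, Ramesh appears a single time in the union even though he belongs to both groups.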

Sum of Squares

In statistical data analysis, the total sum of squares (TSS or SST) is a quantity that appears as part of a standard way of presenting the results of such analyses. It is defined as the sum, over all observations, of the squared differences of each observation from the overall mean.

Formula

$ \text{Sum of Squares} = \sum (x_i - \bar{x})^2 $

Where:
$x_i$ = the i-th observation.
$\bar{x}$ = the mean of the observations.

Example

Problem Statement: Calculate the sum of squares for 9 children whose heights are 100, 100, 102, 98, 77, 99, 70, 105, 98 and whose mean is 94.3.

Solution: Given mean = 94.3. To find the sum of squares:

Calculation of Sum of Squares

| Column A: Value or Score $x_i$ | Column B: Deviation Score $(x_i - \bar{x})$ | Column C: (Deviation Score)$^2$, $(x_i - \bar{x})^2$ |
| --- | --- | --- |
| 100 | 100 - 94.3 = 5.7 | $(5.7)^2$ = 32.49 |
| 100 | 100 - 94.3 = 5.7 | $(5.7)^2$ = 32.49 |
| 102 | 102 - 94.3 = 7.7 | $(7.7)^2$ = 59.29 |
| 98 | 98 - 94.3 = 3.7 | $(3.7)^2$ = 13.69 |
| 77 | 77 - 94.3 = -17.3 | $(-17.3)^2$ = 299.29 |
| 99 | 99 - 94.3 = 4.7 | $(4.7)^2$ = 22.09 |
| 70 | 70 - 94.3 = -24.3 | $(-24.3)^2$ = 590.49 |
| 105 | 105 - 94.3 = 10.7 | $(10.7)^2$ = 114.49 |
| 98 | 98 - 94.3 = 3.7 | $(3.7)^2$ = 13.69 |
| $\sum x_i = 849$ | First Moment: $\sum (x_i - \bar{x}) = 0.3$ | Sum of Squares: $\sum (x_i - \bar{x})^2 = 1178.01$ |

The first moment is not exactly zero only because the mean has been rounded to 94.3.
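The column-by-column calculation can be checked with a few lines of Python. This is a minimal sketch; the function name `sum_of_squares` and its optional `mean` argument are illustrative choices, not something defined in the text.

```python
def sum_of_squares(values, mean=None):
    """Return the sum of squared deviations of `values` from `mean`.

    If `mean` is omitted, the arithmetic mean of `values` is used.
    """
    if mean is None:
        mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)


heights = [100, 100, 102, 98, 77, 99, 70, 105, 98]

# Using the rounded mean quoted in the problem statement.
print(round(sum_of_squares(heights, mean=94.3), 2))
# Using the unrounded mean, 849 / 9.
print(round(sum_of_squares(heights), 2))
```

Passing the rounded mean reproduces the hand calculation; omitting it shows how little the rounding of the mean changes the result.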

Standard Deviation

Standard deviation is the square root of the average of the squared deviations of the items from their mean. Symbolically it is represented by $\sigma$. We're going to discuss methods to compute the standard deviation for three types of series:

Individual Data Series
Discrete Data Series
Continuous Data Series

Individual Data Series

When data is given on an individual basis. Following is an example of an individual series:

| Items | 5 | 10 | 20 | 30 | 40 | 50 | 60 | 70 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |

Discrete Data Series

When data is given along with their frequencies. Following is an example of a discrete series:

| Items | 5 | 10 | 20 | 30 | 40 | 50 | 60 | 70 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Frequency | 2 | 5 | 1 | 3 | 12 | 0 | 5 | 7 |

Continuous Data Series

When data is given based on ranges along with their frequencies. Following is an example of a continuous series:

| Items | 0-5 | 5-10 | 10-20 | 20-30 | 30-40 |
| --- | --- | --- | --- | --- | --- |
| Frequency | 2 | 5 | 1 | 3 | 12 |
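As a rough illustration of the definition, the sketch below computes the standard deviation for the individual and discrete series shown above. It assumes the population form (dividing by the number of items) and, for the discrete series, assumes each item is simply weighted by its frequency; neither formula is spelled out in this section, and the function names are hypothetical.

```python
import math


def std_individual(items):
    """Square root of the average squared deviation from the mean."""
    mean = sum(items) / len(items)
    return math.sqrt(sum((x - mean) ** 2 for x in items) / len(items))


def std_discrete(items, frequencies):
    """Same definition, with each item counted `frequency` times (assumed)."""
    n = sum(frequencies)
    mean = sum(f * x for x, f in zip(items, frequencies)) / n
    squared = sum(f * (x - mean) ** 2 for x, f in zip(items, frequencies))
    return math.sqrt(squared / n)


items = [5, 10, 20, 30, 40, 50, 60, 70]
frequencies = [2, 5, 1, 3, 12, 0, 5, 7]

print(std_individual(items))
print(std_discrete(items, frequencies))
```

A continuous series could be handled the same way by replacing each range with its midpoint, which is again an assumption rather than something stated here.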

Weak Law of Large Numbers

The weak law of large numbers is a result in probability theory, also known as Bernoulli's theorem. Let $X_1, X_2, \dots, X_n$ be a sequence of independent and identically distributed random variables, each having mean $\mu$ and standard deviation $\sigma$. The law states that the sample mean converges in probability to $\mu$ as the number of samples grows.

Formula

$$ \lim_{n \to \infty} P \{ \lvert \bar{X}_n - \mu \rvert \gt \varepsilon \} = 0 \quad \text{for every } \varepsilon \gt 0 $$

Where:
$n$ = number of samples.
$\bar{X}_n$ = sample mean of the first $n$ values.
$\mu$ = mean of the distribution from which each value is drawn.

Example

Problem Statement: A six-sided die is rolled a large number of times. Find the value that the sample mean of the rolls approaches.

Solution: Each face is equally likely, so the mean of a single roll is:

$ \mu = \frac{1+2+3+4+5+6}{6} = \frac{21}{6} = 3.5 $

By the weak law of large numbers, the sample mean of a large number of rolls is close to 3.5 with high probability.
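The die example can be explored with a small simulation. This is only a sketch: the function name, the fixed seed, and the chosen numbers of rolls are assumptions made for illustration.

```python
import random


def sample_mean_of_rolls(n_rolls, seed=0):
    """Roll a fair six-sided die `n_rolls` times and return the sample mean."""
    rng = random.Random(seed)  # fixed seed so that a run is reproducible
    total = sum(rng.randint(1, 6) for _ in range(n_rolls))
    return total / n_rolls


# The sample mean tends to settle near 3.5 as the number of rolls grows.
for n in (10, 1_000, 100_000):
    print(n, sample_mean_of_rolls(n))
```

With only ten rolls the sample mean can be far from 3.5, while with a hundred thousand rolls it is typically within a few hundredths of it.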

Stratified sampling

Stratified sampling is used in situations where the population can be easily partitioned into groups, or strata, that are distinctly different from one another, while the elements within each group are homogeneous with respect to some characteristics. For example, the students of a college can be divided into strata on the basis of gender, courses offered, age and so forth. The population is first divided into strata, and then a simple random sample is taken from each stratum. Stratified sampling is of two types: proportionate stratified sampling and disproportionate stratified sampling.

Proportionate Stratified Sampling: The number of units selected from each stratum is proportionate to the share of that stratum in the population. For example, a college has a total of 2500 students, of whom 1500 are enrolled in undergraduate courses and 1000 in postgraduate courses. If a sample of 100 is to be chosen using proportionate stratified sampling, the sample would contain 60 undergraduate students and 40 postgraduate students. Thus the two strata are represented in the sample in the same proportion as in the population. This method is most suitable when the purpose of sampling is to estimate the population value of some characteristic and there is no difference in within-stratum variances.

Disproportionate Stratified Sampling: When the purpose of the study is to compare the differences among strata, it becomes necessary to draw equal numbers of units from all strata irrespective of their share in the population. Sometimes some strata are more variable with respect to some characteristic than others; in such a case a larger number of units may be drawn from the more variable strata. In both situations the sample drawn is a disproportionate stratified sample.
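The proportionate case reduces to a single ratio per stratum. The sketch below reproduces the college example; the function name and the dictionary keys are hypothetical labels chosen for illustration.

```python
def proportionate_allocation(stratum_sizes, sample_size):
    """Split `sample_size` across strata in proportion to stratum size."""
    population = sum(stratum_sizes.values())
    return {
        name: sample_size * size / population
        for name, size in stratum_sizes.items()
    }


# 2500 students in total, as in the college example.
college = {"undergraduate": 1500, "postgraduate": 1000}
print(proportionate_allocation(college, 100))
```

In general the results will not be whole numbers, so a real sampling plan would still need a rounding rule, which this sketch leaves out.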
Differences in stratum size and stratum variability can be accounted for optimally by using the following formula to determine the sample size from each stratum:

Formula

$ n_i = \frac{n \cdot N_i \sigma_i}{N_1 \sigma_1 + N_2 \sigma_2 + \dots + N_k \sigma_k} \quad \text{for } i = 1, 2, \dots, k $

Where:
$n_i$ = the sample size drawn from stratum i.
$n$ = the total sample size.
$N_i$ = the size of stratum i.
$\sigma_i$ = the standard deviation of stratum i.

In addition, there might be a situation where the cost of collecting a sample is higher in one stratum than in another. The optimal disproportionate sampling should then be done in such a manner that

$ \frac{n_1}{N_1 \sigma_1 / \sqrt{c_1}} = \frac{n_2}{N_2 \sigma_2 / \sqrt{c_2}} = \dots = \frac{n_k}{N_k \sigma_k / \sqrt{c_k}} $

where $c_1, c_2, \dots, c_k$ refer to the cost of sampling in the k strata. The sample size from each stratum can be determined using the following formula:

$ n_i = \frac{n \cdot \frac{N_i \sigma_i}{\sqrt{c_i}}}{\frac{N_1 \sigma_1}{\sqrt{c_1}} + \frac{N_2 \sigma_2}{\sqrt{c_2}} + \dots + \frac{N_k \sigma_k}{\sqrt{c_k}}} \quad \text{for } i = 1, 2, \dots, k $

Example

Problem Statement: An organisation has 5000 employees who have been stratified into three levels.

Stratum A: 50 executives with standard deviation = 9
Stratum B: 1250 non-manual workers with standard deviation = 4
Stratum C: 3700 manual workers with standard deviation = 1

How will a sample of 300 employees be drawn on a disproportionate basis with optimum allocation?

Solution: Using the formula of disproportionate sampling for optimum allocation:

$ n_i = \frac{n \cdot N_i \sigma_i}{N_1 \sigma_1 + N_2 \sigma_2 + N_3 \sigma_3} $

For Stratum A: $ n_1 = \frac{300(50)(9)}{(50)(9)+(1250)(4)+(3700)(1)} = \frac{135000}{9150} = 14.75 $, or say 15.

For Stratum B: $ n_2 = \frac{300(1250)(4)}{(50)(9)+(1250)(4)+(3700)(1)} = \frac{1500000}{9150} = 163.93 $, or say 164.

For Stratum C: $ n_3 = \frac{300(3700)(1)}{(50)(9)+(1250)(4)+(3700)(1)} = \frac{1110000}{9150} = 121.31 $, or say 121.

The three allocations add up to the required sample of 300.
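The optimum allocation in the example can be reproduced programmatically. The sketch below covers only the equal-cost formula (allocation proportional to stratum size times stratum standard deviation); the function name and the layout of the `strata` dictionary are assumptions made for illustration.

```python
def optimum_allocation(strata, sample_size):
    """Allocate `sample_size` in proportion to size x standard deviation.

    `strata` maps a stratum name to a (size, standard_deviation) pair.
    """
    weights = {name: size * sd for name, (size, sd) in strata.items()}
    total = sum(weights.values())
    return {name: sample_size * w / total for name, w in weights.items()}


# Strata from the worked example: (number of employees, standard deviation).
employees = {"A": (50, 9), "B": (1250, 4), "C": (3700, 1)}

for name, n_i in optimum_allocation(employees, 300).items():
    print(name, round(n_i, 2), "->", round(n_i))
```

Extending this to unequal sampling costs would mean dividing each weight by the square root of that stratum's cost, following the cost-adjusted formula given above.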

Logistic Regression

Logistic regression is a statistical method for analyzing a dataset in which there are one or more independent variables that determine an outcome. The outcome is measured with a dichotomous variable (one with only two possible outcomes).

Formula

$ \pi(x) = \frac{e^{\alpha + \beta x}}{1 + e^{\alpha + \beta x}} $

Where:
Response: presence/absence of a characteristic.
Predictor: numeric variable observed for each case.
$\beta = 0 \Rightarrow$ P(Presence) is the same at each level of x.
$\beta \gt 0 \Rightarrow$ P(Presence) increases as x increases.
$\beta \lt 0 \Rightarrow$ P(Presence) decreases as x increases.

Example

Problem Statement: Evaluate the logistic regression model for the following problem: Rizatriptan for Migraine.

Response: Complete pain relief at 2 hours (Yes/No).
Predictor: Dose (mg): Placebo (0), 2.5, 5, 10.

| Dose | #Patients | #Relieved | %Relieved |
| --- | --- | --- | --- |
| 0 | 67 | 2 | 3.0 |
| 2.5 | 75 | 7 | 9.3 |
| 5 | 130 | 29 | 22.3 |
| 10 | 145 | 40 | 27.6 |

Solution: With $\alpha = -2.490$ and $\beta = 0.165$, we have the following:

$ \pi(0) = \frac{e^{\alpha + \beta \times 0}}{1 + e^{\alpha + \beta \times 0}} = \frac{e^{-2.490}}{1 + e^{-2.490}} \approx 0.08 $

$ \pi(2.5) = \frac{e^{\alpha + \beta \times 2.5}}{1 + e^{\alpha + \beta \times 2.5}} = \frac{e^{-2.490 + 0.165 \times 2.5}}{1 + e^{-2.490 + 0.165 \times 2.5}} \approx 0.11 $

$ \pi(5) = \frac{e^{\alpha + \beta \times 5}}{1 + e^{\alpha + \beta \times 5}} = \frac{e^{-2.490 + 0.165 \times 5}}{1 + e^{-2.490 + 0.165 \times 5}} \approx 0.16 $

$ \pi(10) = \frac{e^{\alpha + \beta \times 10}}{1 + e^{\alpha + \beta \times 10}} = \frac{e^{-2.490 + 0.165 \times 10}}{1 + e^{-2.490 + 0.165 \times 10}} \approx 0.30 $

| Dose ($x$) | $\pi(x)$ |
| --- | --- |
| 0 | 0.08 |
| 2.5 | 0.11 |
| 5 | 0.16 |
| 10 | 0.30 |

For comparison, the observed proportions relieved at these doses are 0.030, 0.093, 0.223 and 0.276.
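Evaluating the logistic formula at each dose is easy to script. The sketch below uses the coefficients quoted in the solution; the function name `logistic` is an illustrative choice, and the code only evaluates the curve, it does not fit the model.

```python
import math


def logistic(x, alpha, beta):
    """Return pi(x) = e^(alpha + beta*x) / (1 + e^(alpha + beta*x))."""
    z = alpha + beta * x
    return math.exp(z) / (1 + math.exp(z))


alpha, beta = -2.490, 0.165  # coefficients quoted in the solution

for dose in (0, 2.5, 5, 10):
    print(dose, round(logistic(dose, alpha, beta), 3))
```

Because beta is positive here, the printed probabilities rise with the dose, in line with the interpretation of the sign of beta given above.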

McNemar Test

The McNemar test is used for two related samples in situations where the attitudes of people are recorded both before and after a treatment, in order to test the significance of any change in opinion. It is particularly useful when the data concern two related samples. Generally it is used in situations where the attitudes of people are recorded before administering the treatment and are then compared with the responses obtained after administering the treatment. Using the McNemar test we can therefore judge whether there is any change in the attitudes or opinions of people after administering the treatment, with the use of a table as shown below:

| Before Treatment | After Treatment: Do not favour | After Treatment: Favour |
| --- | --- | --- |
| Favour | A | B |
| Do not favour | C | D |

As can be seen, C and B do not change their opinion and show 'Do Not Favour' and 'Favour' respectively even after the treatment has been administered. However, A, which was favourable before treatment, shows a 'Do Not Favour' response after treatment, and vice versa for D. It can hence be said that $A+D$ shows the change in people's responses.

The null hypothesis for the McNemar test is that $\frac{(A+D)}{2}$ cases change in one direction and the same number of cases change in the other direction. The McNemar test statistic uses a transformed chi-square ($\chi^2$) test as follows:

$ \chi^2 = \frac{(|A-D|-1)^2}{(A+D)} $

(Degrees of freedom = 1.)

Acceptance Criteria: If the calculated value is less than the table value, accept the null hypothesis.

Rejection Criteria: If the calculated value is more than the table value, the null hypothesis is rejected.
Illustration

In a before and after experiment the responses obtained from 300 respondents were classified as follows:

| Before Treatment | After Treatment: Do not favour | After Treatment: Favour |
| --- | --- | --- |
| Favour | 60 = A | 90 = B |
| Do not favour | 120 = C | 30 = D |

Test at the 5% significance level, using the McNemar test, whether there is any significant difference in the opinion of people after the treatment.

Solution:

$H_0$: There is no difference in the opinion of people even after the experiment.

The test statistic is calculated using the formula:

$ \chi^2 = \frac{(|A-D|-1)^2}{(A+D)} = \frac{(|60-30|-1)^2}{(60+30)} = \frac{841}{90} = 9.34 $

The table value at the 5% significance level for 1 D.F. is 3.84. Since the calculated value (9.34) is greater than the table value (3.84), the null hypothesis is rejected, i.e. the opinion of people has changed after the treatment.
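The illustration can be checked with a short function. This is a minimal sketch of the statistic exactly as given in the text; the function name and the constant holding the table value are hypothetical, and the critical value 3.84 is simply the figure quoted above rather than something the code derives.

```python
def mcnemar_statistic(a, d):
    """Return (|A - D| - 1)^2 / (A + D) for the two cells that changed."""
    return (abs(a - d) - 1) ** 2 / (a + d)


CRITICAL_5PCT_1DF = 3.84  # chi-square table value quoted in the text

statistic = mcnemar_statistic(60, 30)
reject_null = statistic > CRITICAL_5PCT_1DF
print(round(statistic, 2), reject_null)
```

Only the cells A and D enter the calculation, since B and C count respondents whose opinion did not change.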

Data collection – Observation

Observation is a popular method of data collection in the behavioral sciences. The power of observation has been summed up by W.L. Prosser as follows: "there is still no man that would not accept dog tracks in the mud against the sworn testimony of a hundred eye witnesses that no dog had passed by."

Observation refers to the monitoring and recording of behavioral and non-behavioral activities and conditions in a systematic manner to obtain information about the phenomena of interest. Behavioral observation includes: non-verbal analysis, such as body movement and eye movement; linguistic analysis, which includes observing sounds like ohs! and ahs!; extra-linguistic analysis, which observes the pitch, timbre, rate of speaking etc.; and spatial analysis of how people relate to each other. Non-behavioral observation is an analysis of records (e.g. newspaper archives), physical condition analysis (such as checking the quality of grains in gunny bags) and process analysis, which includes observing any process. Observation can be classified into various categories.

Types of Observation

Structured vs. Unstructured Observation: In structured observation the problem has been clearly defined, hence the behavior to be observed and the method by which it will be measured are specified beforehand in detail. This reduces the chances of the observer introducing bias into the research, e.g. plant safety compliance can be observed in a structured manner. Unstructured observation is used in situations where the problem has not been clearly defined, hence what is to be observed cannot be specified in advance. The researcher therefore monitors all relevant phenomena, and a great deal of flexibility is allowed in terms of what is noted and recorded, e.g. studying students' behavior in a class would require monitoring their total behavior in the class environment. The data collected through unstructured observation should be analyzed carefully so that no bias is introduced.

Disguised vs. Undisguised Observation: This classification is made on the basis of whether or not the subjects know that they are being observed. In disguised observation, the subjects are unaware of the fact that they are being observed. Their behavior is observed using hidden cameras, one-way mirrors, or other devices. Since the subjects are unaware that they are being observed, they behave in a natural way. The drawback is that it may take long hours of observation before the subjects display the phenomena of interest. Disguised observation may be direct, when the behavior is observed by the researcher personally, or indirect, when it is the effect or the result of the behavior that is observed. In undisguised observation, the subjects are aware that they are being observed. In this type of observation, there is the fear that the subject might show atypical activity. The entry of the observer may upset the subject, but how long this disruption will last cannot be said conclusively. Studies have shown that such disruptions are short-lived and the subjects soon resume normal behavior.

Participant vs. Non-Participant Observation: If the observer participates in the situation while observing, it is termed participant observation, e.g. a researcher studying the lifestyle of slum dwellers by participant observation will himself stay in the slums. His role as an observer may be concealed or revealed. By becoming a part of the setting he is able to observe in an insightful manner. A problem that arises with this method is that the observer may become sympathetic to the subjects and have difficulty viewing his research objectively. In non-participant observation, the observer remains outside the setting and does not involve himself or participate in the situation.

Natural vs. Contrived Observation: In natural observation the behavior is observed as it takes place in the actual setting, e.g. consumer preferences observed directly at Pizza Hut where consumers are ordering pizza. The advantage of this method is that true results are obtained, but it is an expensive and time-consuming method. In contrived observation, the phenomenon is observed in an artificial or simulated setting, e.g. instead of being observed in a restaurant, the consumers are made to order in a setting that looks like a restaurant but is not an actual one. This type of observation has the advantage of being over in a short time, and the recording of behavior is easily done. However, since the consumers are conscious of their setting, they may not show their actual behavior.

Classification on the Basis of Mode of Administration: This includes:

Personal Observation: The observer monitors and records the behavior as it occurs. The recording is done on an observation schedule. Personal observation not only records what has been specified but also identifies and records unexpected behaviors that defy pre-established response categories.

Mechanical Observation: Mechanical devices, instead of humans, are used to record the behavior. The devices record the behavior as it occurs, and the data is sorted and analyzed later on. Apart from cameras, other devices include the galvanometer, which measures the emotional arousal induced by exposure to a specific stimulus; the audiometer and the people meter, which record which channel on TV is being viewed, with the latter also recording who is viewing the channel; and the oculometer, which records eye movement.

Audit: This is the process of obtaining information by physical examination of data. The audit, which is a count of physical objects, is generally done by the researcher himself. An audit can be a store audit or a pantry audit. Store audits are performed by distributors or manufacturers in order to analyse market share, purchase patterns etc., e.g. the researcher may check the store records or do an analysis of inventory on hand to record the data. The pantry audit involves the researcher developing an inventory of the brands, quantities and package sizes of products in a consumer's home, generally in the course of a personal interview. Such an audit is used to supplement or test the truthfulness of information provided in the direct questionnaire.

Content

Geometric Probability Distribution

The geometric distribution is a special case of the negative binomial distribution. It deals with the number of trials required for a single success. Thus, the geometric distribution is a negative binomial distribution where the number of successes (r) is equal to 1.

Formula

$ P(X=x) = p \times q^{x-1} $

Where:
$p$ = probability of success for a single trial.
$q$ = probability of failure for a single trial (1-p).
$x$ = the number of the trial on which the first success occurs, i.e. $x-1$ failures followed by a success.
$P(X=x)$ = probability that the first success occurs on trial x.

Example

Problem Statement: In an amusement fair, a competitor is entitled to a prize if he throws a ring onto a peg from a certain distance. It is observed that only 30% of the competitors are able to do this. If someone is given 5 chances, what is the probability of his winning the prize when he has already missed 4 chances?

Solution: If someone has already missed four chances and has to win on the fifth chance, then this is a probability experiment of getting the first success on the fifth trial. The problem statement therefore suggests that the probability distribution is geometric. The probability of success is given by the geometric distribution formula:

$ P(X=x) = p \times q^{x-1} $

Where:
$p = 30\% = 0.3$
$x = 5$ = the trial on which the first success occurs.

Therefore, the required probability is:

$ P(X=5) = 0.3 \times (1-0.3)^{5-1} = 0.3 \times (0.7)^4 \approx 0.072 \approx 7.2\% $
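The formula translates into a one-line function. The sketch below evaluates it for the ring-toss example; the function name `geometric_pmf` and the input check are illustrative additions rather than part of the original text.

```python
def geometric_pmf(x, p):
    """Probability that the first success occurs on trial x (x = 1, 2, ...)."""
    if x < 1:
        raise ValueError("x must be a positive trial number")
    return p * (1 - p) ** (x - 1)


# First success on the fifth throw, with a 30% chance of success per throw.
print(round(geometric_pmf(5, 0.3), 4))
```

Summing `geometric_pmf(x, p)` over all trial numbers x gives 1, which is a quick sanity check that the formula defines a probability distribution.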