Big Data & Analytics Archives - Page 15 of 75 - Donotsad where can learn any thing work project and make money

Aug 10

Tableau – Environment Setup

In this chapter, you will learn how to set up the Tableau environment.

Download Tableau Desktop

The free Personal Edition of Tableau Desktop can be downloaded from the Tableau Desktop download page. You need to register with your details before you can download it. After downloading, installation is straightforward: you accept the license agreement and provide the target folder for installation. The following steps describe the entire setup process.

Start the Installation Wizard

Double-click TableauDesktop-64bit-9-2-2.exe. A screen asks for permission to run the installation program. Click “Run”.

Accept the License Agreement

Read the license agreement and, if you agree, choose the “I have read and accept the terms of this license agreement” option. Then click “Install”.

Start Trial

When the installation completes, you are prompted to start the trial now or later. You may choose to start it now. If you have purchased Tableau, you may enter the license key instead.

Provide Your Details

Provide your name and organization details. Then click “Next”.

Registration Complete

The registration completion screen appears. Click “Continue”.

Verify the Installation

You can verify the installation by going to the Windows Start menu and clicking the Tableau icon. Tableau Desktop opens, and you are now ready to learn Tableau.

Aug 10

Tableau – Navigation

In this chapter, you will get acquainted with the navigational features available in the Tableau interface. When you run Tableau Desktop, the menu at the top shows all the commands you can navigate to. Let's open a blank workbook and go through the important features under each menu.

Menu Commands

On closing the getting started window, you get the main interface with all the available menu commands. They represent the entire set of features available in Tableau. The details of each menu follow.

File Menu

This menu is used to create a new Tableau workbook and to open existing workbooks from both the local system and Tableau Server. The important features in this menu are:

Workbook Locale sets the language to be used in the report.
Paste Sheets pastes a sheet copied from another workbook into the current workbook.
Export Packaged Workbook creates a packaged workbook that can be shared with other users.

Data Menu

This menu is used to create a new data source to fetch the data for analysis and visualization. It also allows you to replace or upgrade the existing data source. The important features in this menu are:

New Data Source lets you view all the available connection types and choose one.
Refresh All Extracts refreshes the data from the source.
Edit Relationships defines the fields that link more than one data source.

Worksheet Menu

This menu is used to create a new worksheet and to control display features such as showing the title and captions. The important features in this menu are:

Show Summary lets you view a summary of the data used in the worksheet, such as the count.
Tooltip shows the tooltip when hovering over data fields.
Run Update updates the worksheet data or the filters used.

Dashboard Menu

This menu is used to create a new dashboard and to control display features such as showing the title and exporting the image. The important features in this menu are:

Format sets the layout in terms of colors and sections of the dashboard.
Actions links the dashboard sheets to external URLs or other sheets.
Export Image exports an image of the dashboard.

Story Menu

This menu is used to create a new story, which contains many sheets or dashboards with related data. The important features in this menu are:

Format sets the layout in terms of colors and sections of the story.
Run Update updates the story with the latest data from the source.
Export Image exports an image of the story.

Analysis Menu

This menu is used for analyzing the data present in the sheet. Tableau provides many out-of-the-box features, such as calculating percentages and performing a forecast. The important features in this menu are:

Forecast shows a forecast based on the available data.
Trend Lines shows the trend line for a series of data.
Create Calculated Field creates additional fields based on a calculation over the existing fields.

Map Menu

This menu is used for building map views in Tableau. You can assign geographic roles to fields in your data. The important features in this menu are:

Map Layers hides and shows map layers, such as street names and country borders, and adds data layers.
Geocoding creates new geographic roles and assigns them to the geographic fields in your data.

Format Menu

This menu is used for applying formatting options to enhance the look and feel of the dashboards created. It provides features such as borders, colors, and text alignment. The important features in this menu are:

Borders applies borders to the fields displayed in the report.
Title & Caption assigns a title and caption to the reports.
Cell Size customizes the size of the cells displaying the data.
Workbook Theme applies a theme to the entire workbook.

Server Menu

This menu is used to log in to Tableau Server, if you have access, and to publish your results for use by others. It is also used to access the workbooks published by others. The important features in this menu are:

Publish Workbook publishes the workbook to the server to be used by others.
Publish Data Source publishes the source data used in the workbook.
Create User Filters creates filters on the worksheet that are applied for different users when they access the report.

Aug 10

Kurtosis

The degree of tailedness of a distribution is measured by kurtosis. It tells us the extent to which the distribution is more or less outlier-prone (heavier- or lighter-tailed) than the normal distribution.

[Figure: three different types of kurtosis curves, courtesy of Investopedia. Left panel: density plots. Right panel: normal quantile-quantile plots.]

It is difficult to discern different types of kurtosis from the density plots (left panel) because the tails are close to zero for all distributions. But differences in the tails are easy to see in the normal quantile-quantile plots (right panel).

The normal curve is called a mesokurtic curve. If the curve of a distribution is more outlier-prone (heavier-tailed) than a normal or mesokurtic curve, it is referred to as a leptokurtic curve. If a curve is less outlier-prone (lighter-tailed) than a normal curve, it is called a platykurtic curve. Kurtosis is measured by moments and is given by the following formula.

Formula

$\beta_2 = \frac{\mu_4}{\mu_2^2}$

Where:

$\mu_4 = \frac{\sum (x - \bar x)^4}{N}$

$\mu_2 = \frac{\sum (x - \bar x)^2}{N}$

The greater the value of $\beta_2$, the more leptokurtic (heavier-tailed) the curve. A normal curve has a value of 3, a leptokurtic curve has $\beta_2$ greater than 3, and a platykurtic curve has $\beta_2$ less than 3.

Example

Problem Statement: The data on daily wages of 45 workers of a factory are given. Compute $\beta_1$ and $\beta_2$ using moments about the mean. Comment on the results.

Wages (Rs.)    Number of Workers
100-120        1
120-140        2
140-160        6
160-180        20
180-200        11
200-220        3
220-240        2

Solution:

Wages (Rs.)    f     Mid-point m    d = (m - 170)/20    fd    fd^2    fd^3    fd^4
100-120        1     110            -3                  -3    9       -27     81
120-140        2     130            -2                  -4    8       -16     32
140-160        6     150            -1                  -6    6       -6      6
160-180        20    170            0                   0     0       0       0
180-200        11    190            1                   11    11      11      11
200-220        3     210            2                   6     12      24      48
220-240        2     230            3                   6     18      54      162
Total          $N = 45$                                 $\sum fd = 10$    $\sum fd^2 = 64$    $\sum fd^3 = 40$    $\sum fd^4 = 340$

Since the deviations have been taken from an assumed mean, we first calculate the moments about the arbitrary origin and then the moments about the mean. Here $i = 20$ is the class width.

Moments about the arbitrary origin 170

$\mu'_1 = \frac{\sum fd}{N} \times i = \frac{10}{45} \times 20 = 4.4444$

$\mu'_2 = \frac{\sum fd^2}{N} \times i^2 = \frac{64}{45} \times 20^2 = 568.8889$

$\mu'_3 = \frac{\sum fd^3}{N} \times i^3 = \frac{40}{45} \times 20^3 = 7111.1111$

$\mu'_4 = \frac{\sum fd^4}{N} \times i^4 = \frac{340}{45} \times 20^4 = 1208888.8889$

Moments about the mean

$\mu_2 = \mu'_2 - (\mu'_1)^2 = 568.8889 - (4.4444)^2 \approx 568.8889 - 19.7531 = 549.1358$

$\mu_3 = \mu'_3 - 3(\mu'_1)(\mu'_2) + 2(\mu'_1)^3$
$= 7111.1111 - 3(4.4444)(568.8889) + 2(4.4444)^3$
$\approx 7111.1111 - 7585.1852 + 175.5830 = -298.4911$

$\mu_4 = \mu'_4 - 4(\mu'_1)(\mu'_3) + 6(\mu'_1)^2(\mu'_2) - 3(\mu'_1)^4$
$= 1208888.8889 - 4(4.4444)(7111.1111) + 6(4.4444)^2(568.8889) - 3(4.4444)^4$
$\approx 1208888.8889 - 126419.7531 + 67423.8683 - 1170.5533$
$= 1148722.4508$

From the moments about the mean, we can now calculate $\beta_1$ and $\beta_2$:

$\beta_1 = \frac{\mu_3^2}{\mu_2^3} = \frac{(-298.4911)^2}{(549.1358)^3} \approx 0.00054$

$\beta_2 = \frac{\mu_4}{\mu_2^2} = \frac{1148722.4508}{(549.1358)^2} \approx 3.81$

From the above calculations, it can be concluded that $\beta_1$, which measures skewness, is almost zero, indicating that the distribution is almost symmetrical. $\beta_2$, which measures kurtosis, has a value greater than 3, implying that the distribution is leptokurtic.
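
The moment calculation for the wage example can also be checked numerically. The following is a minimal Python sketch, assuming the class mid-points 110 to 230 and the frequencies from the example; the function names are illustrative and not part of the original tutorial. It takes the central moments directly about the actual mean instead of going through the assumed mean of 170, which should give the same $\beta_1$ and $\beta_2$ up to rounding.

```python
def grouped_central_moments(midpoints, freqs):
    """Return the central moments (mu2, mu3, mu4) of grouped data."""
    n = sum(freqs)
    mean = sum(f * m for m, f in zip(midpoints, freqs)) / n

    def central(k):
        # k-th moment about the mean, weighted by class frequency
        return sum(f * (m - mean) ** k for m, f in zip(midpoints, freqs)) / n

    return central(2), central(3), central(4)


def beta_coefficients(midpoints, freqs):
    """Return (beta_1, beta_2): the moment measures of skewness and kurtosis."""
    mu2, mu3, mu4 = grouped_central_moments(midpoints, freqs)
    beta1 = mu3 ** 2 / mu2 ** 3
    beta2 = mu4 / mu2 ** 2
    return beta1, beta2


# Mid-points and worker counts taken from the wage table in the example.
midpoints = [110, 130, 150, 170, 190, 210, 230]
workers = [1, 2, 6, 20, 11, 3, 2]

beta1, beta2 = beta_coefficients(midpoints, workers)
print(f"beta_1 = {beta1:.5f}, beta_2 = {beta2:.2f}")
```

A value of `beta2` above 3 corresponds to a leptokurtic distribution, and a value of `beta1` near zero to an almost symmetrical one.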

Aug 10

Pooled Variance (r)

Pooled variance is a weighted average used to estimate the variance of two independent samples when the mean may differ between samples but the true variance is assumed to be the same.

Example

Problem Statement: Compute the pooled variance of the numbers 1, 2, 3, 4 and 5.

Solution:

Step 1

Determine the mean of the given data set by adding all the numbers and dividing by the total count of numbers in the data set.

$Mean = \frac{1 + 2 + 3 + 4 + 5}{5} = \frac{15}{5} = 3$

Step 2

Subtract the mean from each number in the data set.

$\Rightarrow (1 - 3), (2 - 3), (3 - 3), (4 - 3), (5 - 3) \Rightarrow -2, -1, 0, 1, 2$

Step 3

Square each deviation to remove the negative signs.

$\Rightarrow (-2)^2, (-1)^2, (0)^2, (1)^2, (2)^2 \Rightarrow 4, 1, 0, 1, 4$

Step 4

Find the standard deviation using the equation below.

$S = \sqrt{\frac{\sum (X - M)^2}{n - 1}}$

Standard Deviation $= \frac{\sqrt{10}}{\sqrt{4}} \approx 1.5811$, so the variance is $Var = \frac{10}{4} = 2.5$.

Step 5

$Pooled\ Variance\ (r) = \frac{(n - 1) \times Var}{(n - 1)}$, where $n$ is the total count of numbers.

$(r) = \frac{(5 - 1) \times 2.5}{(5 - 1)} = \frac{4 \times 2.5}{4} = 2.5$

Hence, Pooled Variance (r) = 2.5.
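
With a single data set the pooled variance reduces to the ordinary sample variance, so the idea is easier to see with more than one sample. The sketch below assumes the standard weighted form, in which each sample variance is weighted by its degrees of freedom $n_i - 1$; the function name and the second sample `[2, 4, 6, 8]` are hypothetical and only there for illustration.

```python
from statistics import variance


def pooled_variance(*samples):
    """Weighted average of sample variances, each weighted by (n_i - 1)."""
    numerator = sum((len(s) - 1) * variance(s) for s in samples)
    denominator = sum(len(s) - 1 for s in samples)
    return numerator / denominator


# One sample: same numbers as the worked example.
print(pooled_variance([1, 2, 3, 4, 5]))

# Two samples: the second one is made up for illustration.
print(pooled_variance([1, 2, 3, 4, 5], [2, 4, 6, 8]))
```

Each sample needs at least two values, because `statistics.variance` cannot compute a sample variance from a single point.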

Aug 10

Individual Series Arithmetic Median

An individual series is one in which the data is given item by item. Following is an example of an individual series:

Items: 5, 10, 20, 30, 40, 50, 60, 70

For a group with an even number of observations, the Arithmetic Median is found by taking the arithmetic mean of the two middle values after arranging the numbers in ascending order.

Formula

Median = Value of the $\left(\frac{N+1}{2}\right)^{th}$ item.

Where:

$N$ = Number of observations

Example

Problem Statement: Let's calculate the Arithmetic Median for the following individual data:

Items: 14, 36, 45, 70, 105, 145

Solution: Based on the above formula, the Arithmetic Median M will be:

$M$ = Value of the $\left(\frac{N+1}{2}\right)^{th}$ item
$=$ Value of the $\left(\frac{6+1}{2}\right)^{th}$ item
$=$ Value of the $3.5^{th}$ item
$= \frac{3^{rd}\ item + 4^{th}\ item}{2}$
$= \frac{45 + 70}{2} = 57.5$

The Arithmetic Median of the given numbers is 57.5.

For a group with an odd number of observations, the Arithmetic Median is the middle number after arranging the numbers in ascending order.

Example

Let's calculate the Arithmetic Median for the following individual data:

Items: 14, 36, 45, 70, 105

There are 5 numbers, an odd count, so the middle number is the Arithmetic Median.

∴ The Arithmetic Median of the given numbers is 45.
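
The two cases (even and odd number of observations) can be sketched in a few lines of Python. This is a minimal illustration with a hypothetical function name; the standard library's `statistics.median` behaves the same way for numeric data.

```python
def arithmetic_median(items):
    """Median of an individual series: middle value, or mean of the two middle values."""
    if not items:
        raise ValueError("median of an empty series is undefined")
    ordered = sorted(items)          # arrange in ascending order first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]          # odd count: the single middle item
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: mean of the two middle items


print(arithmetic_median([14, 36, 45, 70, 105, 145]))  # even number of items
print(arithmetic_median([14, 36, 45, 70, 105]))       # odd number of items
```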

Aug 10

Inverse Gamma Distribution

The Inverse Gamma Distribution is the distribution of the reciprocal of a gamma-distributed random variable. It has positive parameters $\alpha$ and $\beta$ and, in its general form, a location parameter $\mu$; the formula below takes $\mu = 0$. $\alpha$ controls the height: the higher the $\alpha$, the taller the probability density function (PDF). $\beta$ controls the speed, that is, how stretched or compressed the density is along the $x$ axis. It is defined by the following formula.

Formula

$f(x) = \frac{x^{-(\alpha+1)} e^{\frac{-1}{\beta x}}}{\Gamma(\alpha) \beta^{\alpha}}$, where $x \gt 0$

Where:

$\alpha$ = positive shape parameter.
$\beta$ = positive scale parameter.
$x$ = random variable.

[Figure: probability density function of the inverse gamma distribution for different parameter combinations.]
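
To get a feel for how $\alpha$ and $\beta$ shape the curve, the density can be evaluated numerically. The sketch below implements the formula exactly as written above (with $\beta$ appearing as $e^{-1/(\beta x)}$ and $\beta^{\alpha}$ in the denominator); the function name is illustrative, and the computation is done in log space only to avoid overflow for extreme arguments.

```python
import math


def inverse_gamma_pdf(x, alpha, beta):
    """Density x^-(alpha+1) * exp(-1/(beta*x)) / (Gamma(alpha) * beta^alpha) for x > 0."""
    if alpha <= 0 or beta <= 0:
        raise ValueError("alpha and beta must be positive")
    if x <= 0:
        return 0.0  # the density is zero outside x > 0
    log_pdf = (
        -(alpha + 1) * math.log(x)
        - 1.0 / (beta * x)
        - math.lgamma(alpha)      # log Gamma(alpha)
        - alpha * math.log(beta)
    )
    return math.exp(log_pdf)


# Evaluate the density at a few points for two parameter combinations.
for alpha, beta in [(1.0, 1.0), (3.0, 1.0)]:
    values = [round(inverse_gamma_pdf(x, alpha, beta), 4) for x in (0.25, 0.5, 1.0, 2.0)]
    print(alpha, beta, values)
```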

Aug 10

Relative Standard Deviation

In probability theory and statistics, the coefficient of variation (CV), also known as relative standard deviation (RSD), is a standardized measure of dispersion of a probability distribution or frequency distribution. The relative standard deviation is given by the following formula.

Formula

$RSD = 100 \times \frac{s}{\bar x}$

Where:

$s$ = the sample standard deviation
$\bar x$ = the sample mean

Example

Problem Statement: Find the RSD for the following set of numbers: 49, 51.3, 52.7, 55.8, given that the standard deviation is 2.8437065.

Solution:

Step 1: The standard deviation of the sample is 2.8437065 (2.84 rounded to 2 decimal places).

Step 2: Multiply Step 1 by 100. Set this number aside for a moment.

$2.84 \times 100 = 284$

Step 3: Find the sample mean, $\bar x$.

$\bar x = \frac{49 + 51.3 + 52.7 + 55.8}{4} = \frac{208.8}{4} = 52.2$

Step 4: Divide Step 2 by the absolute value of Step 3.

$\frac{284}{|52.2|} = 5.44$

The RSD is about 5.4%, so the result can be reported as $52.2 \pm 5.4$%. Note that the RSD is expressed as a percentage.
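
The same steps can be sketched in Python using the standard library's sample standard deviation. The function name is illustrative. Because the code keeps the unrounded standard deviation instead of 2.84, it gives roughly 5.45 rather than 5.44.

```python
from statistics import mean, stdev


def relative_standard_deviation(values):
    """RSD in percent: 100 * sample standard deviation / |sample mean|."""
    m = mean(values)
    if m == 0:
        raise ValueError("RSD is undefined when the mean is zero")
    return 100 * stdev(values) / abs(m)


data = [49, 51.3, 52.7, 55.8]
print(f"mean = {mean(data):.1f}, RSD = {relative_standard_deviation(data):.2f}%")
```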

Aug 10

Scatterplots

A scatterplot is a graphical way to display the relationship between two quantitative sample variables. It consists of an X axis, a Y axis and a series of dots, where each dot represents one observation from a data set. The position of the dot corresponds to its X and Y values.

Patterns of Data in Scatterplots

Scatterplots are used to analyze patterns, which generally vary in linearity, slope, and strength.

Linearity: the data pattern is either linear (straight) or nonlinear (curved).

Slope: the direction of change in variable Y as variable X increases. If Y increases as X increases, the slope is positive; otherwise the slope is negative.

Strength: the degree of spread of the dots in the plot. If the dots are widely dispersed, the relationship is considered weak. If the dots are densely concentrated around a line, the relationship is said to be strong.

Aug 10

Process Sigma

Process sigma can be determined using the following four steps: measure opportunities, measure defects, calculate yield, and look up process sigma.

Formulae Used

$DPMO = \frac{Total\ defects}{Total\ opportunities} \times 1000000$

$Defect\ (\%) = \frac{Total\ defects}{Total\ opportunities} \times 100$

$Yield\ (\%) = 100 - Defect\ (\%)$

$Process\ Sigma = 0.8406 + \sqrt{29.37 - 2.221 \times \ln(DPMO)}$

Where:

$Opportunities$ = the lowest-level defect noticeable by a customer.
$DPMO$ = Defects per Million Opportunities.

Example

Problem Statement: An equipment company produces 10000 hard plates, of which 5 are defective. Find the process sigma.

Solution:

Given: Opportunities = 10000 and Defects = 5. Substitute the given values in the formulae.

Step 1: Compute DPMO

$DPMO = \frac{Total\ defects}{Total\ opportunities} \times 1000000$
$= \frac{5}{10000} \times 1000000$
$= 500$

Step 2: Compute Defect (%)

$Defect\ (\%) = \frac{Total\ defects}{Total\ opportunities} \times 100$
$= \frac{5}{10000} \times 100$
$= 0.05$

Step 3: Compute Yield (%)

$Yield\ (\%) = 100 - Defect\ (\%)$
$= 100 - 0.05$
$= 99.95$

Step 4: Compute Process Sigma

$Process\ Sigma = 0.8406 + \sqrt{29.37 - 2.221 \times \ln(DPMO)}$
$= 0.8406 + \sqrt{29.37 - 2.221 \times \ln(500)}$
$= 0.8406 + \sqrt{29.37 - 13.80}$
$= 0.8406 + 3.95$
$= 4.79$
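
The four computations can be sketched as a small Python function. This sketch assumes that the square root covers the whole expression $29.37 - 2.221 \times \ln(DPMO)$ and that the logarithm is the natural logarithm, which is the reading that reproduces the worked result of 4.79 for DPMO = 500. The function name is illustrative.

```python
import math


def process_sigma(defects, opportunities):
    """Return (DPMO, defect %, yield %, process sigma) for the given counts."""
    if defects <= 0 or opportunities <= 0:
        raise ValueError("defects and opportunities must be positive")
    dpmo = defects / opportunities * 1_000_000
    defect_pct = defects / opportunities * 100
    yield_pct = 100 - defect_pct
    # Approximation formula; assumes natural log and the root over the whole term.
    sigma = 0.8406 + math.sqrt(29.37 - 2.221 * math.log(dpmo))
    return dpmo, defect_pct, yield_pct, sigma


dpmo, defect_pct, yield_pct, sigma = process_sigma(defects=5, opportunities=10000)
print(f"DPMO = {dpmo:.0f}, defect = {defect_pct:.2f}%, "
      f"yield = {yield_pct:.2f}%, sigma = {sigma:.2f}")
```

The approximation only makes sense while the term under the square root is non-negative, so very high defect rates would raise a `ValueError` from `math.sqrt`.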

Aug 10

Harmonic Resonance Frequency

Harmonic Resonance Frequency represents a signal or wave whose frequency is an integral multiple of the frequency of a reference signal or wave.

Formula

$f = \frac{1}{2 \pi \sqrt{LC}}$

Where:

$f$ = harmonic resonance frequency.
$L$ = inductance of the load.
$C$ = capacitance of the load.

Example

Calculate the harmonic resonance frequency of a power system with capacitance 5 F, inductance 6 H and frequency 200 Hz.

Solution:

Here the capacitance C is 5 F, the inductance L is 6 H, and the system frequency is 200 Hz. Only L and C enter the formula. Using the harmonic resonance frequency formula, the resonance frequency is:

$f = \frac{1}{2 \pi \sqrt{LC}}$
$\implies f = \frac{1}{2 \pi \sqrt{6 \times 5}}$
$= \frac{1}{2 \times 3.14 \times \sqrt{30}}$
$= \frac{1}{6.28 \times 5.4772}$
$= \frac{1}{34.3968}$
$= 0.0291$

Thus the harmonic resonance frequency is $0.0291$ Hz.
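
The formula translates directly into code. The following Python sketch uses an illustrative function name and full-precision $\pi$ rather than 3.14, so its result differs from the hand calculation in the fourth decimal place.

```python
import math


def resonance_frequency(inductance_h, capacitance_f):
    """Resonance frequency in Hz for inductance in henries and capacitance in farads."""
    if inductance_h <= 0 or capacitance_f <= 0:
        raise ValueError("inductance and capacitance must be positive")
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))


# Values from the worked example: L = 6 H, C = 5 F.
f = resonance_frequency(inductance_h=6, capacitance_f=5)
print(f"f = {f:.4f} Hz")
```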