Tableau – Operators

An operator is a symbol that tells Tableau to perform a specific mathematical or logical operation. Tableau provides a number of operators for building calculated fields and formulas. The sections below describe the available operators and their order of evaluation (precedence).

Types of Operator

- General Operators
- Arithmetic Operators
- Comparison Operators
- Logical Operators

General Operators

The following table shows the general operators supported by Tableau. These operators act on numeric, character, and date data types.

| Operator | Description | Example |
|---|---|---|
| + (addition) | Adds two numbers. Concatenates two strings. Adds days to dates. | 7 + 3; Profit + Sales; 'abc' + 'def' = 'abcdef'; #April 15, 2004# + 15 = #April 30, 2004# |
| - (subtraction) | Subtracts two numbers. Subtracts days from dates. | -(7+3) = -10; #April 16, 2004# - 15 = #April 1, 2004# |

Arithmetic Operators

The following table shows the arithmetic operators supported by Tableau. These operators act only on numeric data types.

| Operator | Description | Example |
|---|---|---|
| * (multiplication) | Numeric multiplication | 23*2 = 46 |
| / (division) | Numeric division | 45/2 = 22.5 |
| % (modulo) | Remainder of numeric division | 13 % 2 = 1 |
| ^ (power) | Raises a number to the given power | 2^3 = 8 |

Comparison Operators

The following table lists the comparison operators supported by Tableau. Each operator compares two numbers, dates, or strings and returns a Boolean (TRUE or FALSE). Booleans themselves, however, cannot be compared using these operators.

| Operator | Description | Example |
|---|---|---|
| == or = (equal to) | Returns TRUE if the two numbers, strings, or dates are equal; otherwise returns FALSE. | 'Hello' = 'Hello'; 5 = 15 / 3 |
| != or <> (not equal to) | Returns TRUE if the two numbers, strings, or dates are unequal; otherwise returns FALSE. | 'Good' <> 'Bad'; 18 != 37 / 2 |
| > (greater than) | Returns TRUE if the first argument is greater than the second; otherwise returns FALSE. | [Profit] > 20000; [Category] > 'Q'; [Ship date] > #April 1, 2004# |
| < (less than) | Returns TRUE if the first argument is smaller than the second; otherwise returns FALSE. | [Profit] < 20000; [Category] < 'Q'; [Ship date] < #April 1, 2004# |

Logical Operators

The following table shows the logical operators supported by Tableau. These operators are used in expressions whose result is a Boolean (TRUE or FALSE).

| Operator | Description | Example |
|---|---|---|
| AND | The result is TRUE if the expressions or Boolean values on both sides of the AND operator evaluate to TRUE. Otherwise the result is FALSE. | [Ship Date] > #April 1, 2012# AND [Profit] > 10000 |
| OR | The result is TRUE if either or both of the expressions or Boolean values on the two sides of the OR operator evaluate to TRUE. Otherwise the result is FALSE. | [Ship Date] > #April 1, 2012# OR [Profit] > 10000 |
| NOT | Negates the Boolean value of the expression that follows it. | NOT [Ship Date] > #April 1, 2012# |

Operator Precedence

The following table describes the order in which operators are evaluated. The top row has the highest precedence, and operators on the same row have the same precedence. If two operators have the same precedence, they are evaluated from left to right in the formula. Parentheses can be used to override this order; inner parentheses are evaluated before outer ones.

| Precedence | Operator |
|---|---|
| 1 | - (negate) |
| 2 | ^ (power) |
| 3 | *, /, % |
| 4 | +, - |
| 5 | ==, >, <, >=, <=, != |
| 6 | NOT |
| 7 | AND |
| 8 | OR |
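The individual examples in the operator tables can be sanity-checked outside Tableau. The following is a minimal sketch in Python, not Tableau's calculation language: it assumes Python's `+`, `/`, `%` and `**` as stand-ins for Tableau's `+`, `/`, `%` and `^`, and uses `datetime.timedelta` to model adding days to a date. The variable names are illustrative only.

```python
from datetime import date, timedelta

# Python stand-ins for the Tableau table examples (an illustration only;
# this is not Tableau syntax).
concatenated = "abc" + "def"                               # string concatenation
shifted_forward = date(2004, 4, 15) + timedelta(days=15)   # date + days
shifted_back = date(2004, 4, 16) - timedelta(days=15)      # date - days
quotient = 45 / 2                                          # numeric division
remainder = 13 % 2                                         # modulo
power = 2 ** 3                                             # Tableau writes 2^3

# Comparison and logical operators yield Booleans, as in Tableau.
equal = (5 == 15 / 3)
not_equal = (18 != 37 / 2)
both = (quotient > 20) and (remainder == 1)

print(concatenated, shifted_forward, shifted_back, quotient, remainder, power)
print(equal, not_equal, both)
```

One caveat: Python's precedence rules are not identical to the table above (for example, Python evaluates `**` before unary minus), so this sketch mirrors the individual examples only, not Tableau's precedence order.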
Multinomial Distribution
A multinomial experiment is a statistical experiment consisting of n repeated, independent trials. Each trial has a discrete number of possible outcomes, and the probability of any particular outcome is the same on every trial.

Formula

$P_r = \frac{n!}{(n_1!)(n_2!)\ldots(n_x!)} {P_1}^{n_1}{P_2}^{n_2}\ldots{P_x}^{n_x}$

Where:

$n$ = number of trials
$n_1$ = number of times outcome 1 occurs
$n_2$ = number of times outcome 2 occurs
$n_x$ = number of times outcome x occurs
$P_1$ = probability that outcome 1 occurs on a single trial
$P_2$ = probability that outcome 2 occurs on a single trial
$P_x$ = probability that outcome x occurs on a single trial

Example

Problem Statement: Three card players play a series of matches. The probability that player A will win any game is 20%, the probability that player B will win is 30%, and the probability that player C will win is 50%. If they play 6 games, what is the probability that player A will win 1 game, player B will win 2 games, and player C will win 3?

Solution:

Given:

$n$ = 6 (6 games total)
$n_1$ = 1 (Player A wins)
$n_2$ = 2 (Player B wins)
$n_3$ = 3 (Player C wins)
$P_1$ = 0.20 (probability that Player A wins)
$P_2$ = 0.30 (probability that Player B wins)
$P_3$ = 0.50 (probability that Player C wins)

Putting the values into the formula, we get:

$P_r = \frac{n!}{(n_1!)(n_2!)\ldots(n_x!)} {P_1}^{n_1}{P_2}^{n_2}\ldots{P_x}^{n_x}$

$P_r(A=1, B=2, C=3) = \frac{6!}{1!\,2!\,3!}(0.2^1)(0.3^2)(0.5^3)$

$= 60 \times 0.00225 = 0.135$
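The card-player example can be reproduced numerically. The following is a minimal Python sketch of the multinomial formula; the function name `multinomial_probability` is a hypothetical helper introduced for illustration, not part of any library.

```python
from math import factorial, prod

def multinomial_probability(counts, probs):
    """Probability of observing the given outcome counts in sum(counts) trials."""
    n = sum(counts)
    # Multinomial coefficient n! / (n_1! n_2! ... n_x!)
    coefficient = factorial(n)
    for c in counts:
        coefficient //= factorial(c)
    # Product of P_i ** n_i over all outcomes
    return coefficient * prod(p ** c for p, c in zip(probs, counts))

# Player A wins 1 game, B wins 2, C wins 3, with win probabilities 0.2, 0.3, 0.5
result = multinomial_probability([1, 2, 3], [0.2, 0.3, 0.5])
print(round(result, 3))  # 0.135
```

The counts and probabilities are passed as parallel lists so the same helper works for any number of outcomes.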
Poisson Distribution
The Poisson distribution is a discrete probability distribution that is widely used in statistical work. It was developed by the French mathematician Siméon Denis Poisson in 1837 and is named after him. The Poisson distribution is used in situations where the probability of an event occurring is small, i.e., the event happens only rarely. For example, the probability of defective items in a manufacturing company is small, the probability of an earthquake occurring in a given year is small, the probability of an accident on a road is small, and so on. All of these are examples of events whose probability of occurrence is small.

The Poisson distribution is defined by the following probability function:

Formula

$P(X=x) = e^{-m} \cdot \frac{m^x}{x!}$

Where:

$m$ = mean number of successes (for n trials with success probability p, $m = np$)
$P(X=x)$ = probability of x successes

Example

Problem Statement: A manufacturer of pins knows that on average 5% of his product is defective. He sells pins in packets of 100 and guarantees that not more than 4 pins in a packet will be defective. What is the probability that a packet will meet the guaranteed quality? [Given: $e^{-m} = 0.0067$]

Solution:

Let p = probability of a defective pin = 5% = $\frac{5}{100}$. We are given:

$n = 100, \; p = \frac{5}{100}$

$\Rightarrow m = np = 100 \times \frac{5}{100} = 5$

The Poisson distribution is given as:

$P(X=x) = e^{-m} \cdot \frac{m^x}{x!}$

Required probability = P [packet will meet the guarantee] = P [packet contains up to 4 defectives] = P(0) + P(1) + P(2) + P(3) + P(4)

$= e^{-5} \cdot \frac{5^0}{0!} + e^{-5} \cdot \frac{5^1}{1!} + e^{-5} \cdot \frac{5^2}{2!} + e^{-5} \cdot \frac{5^3}{3!} + e^{-5} \cdot \frac{5^4}{4!}$

$= e^{-5}\left[1 + \frac{5}{1} + \frac{25}{2} + \frac{125}{6} + \frac{625}{24}\right]$

$= 0.0067 \times 65.375 = 0.438$
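The sum of Poisson terms in the pin example can be checked with a few lines of code. This is a minimal Python sketch; `poisson_pmf` and `poisson_cdf` are hypothetical helper names chosen for illustration.

```python
from math import exp, factorial

def poisson_pmf(x, m):
    """P(X = x) for a Poisson distribution with mean m."""
    return exp(-m) * m ** x / factorial(x)

def poisson_cdf(k, m):
    """P(X <= k): sum of the probability function from 0 to k."""
    return sum(poisson_pmf(x, m) for x in range(k + 1))

m = 100 * 0.05                    # mean number of defectives per packet, m = np
probability = poisson_cdf(4, m)   # packet contains at most 4 defective pins
print(round(probability, 4))
```

Because the code uses the unrounded value of $e^{-5}$ rather than the given 0.0067, it returns roughly 0.4405, slightly above the hand-computed 0.438.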
Power Calculator
Whenever a hypothesis test is conducted, we need to make sure that the test is of high quality. One way to assess the power, or sensitivity, of a test is to compute the probability that it correctly rejects the null hypothesis when the alternative hypothesis is true. In other words, the power of a test is the probability of accepting the alternative hypothesis when it is true, where the alternative hypothesis states that an effect is present.

$Power = P(\text{reject } H_0 \mid H_1 \text{ is true})$

The power of a test is also assessed through the probability of a Type I error ($\alpha$) and of a Type II error ($\beta$), where a Type I error is the incorrect rejection of a valid null hypothesis and a Type II error is the incorrect retention of an invalid null hypothesis. Since power equals $1 - \beta$, the smaller the chance of a Type II error, the greater the power of the test.

Example

A survey has been conducted on students to check their IQ level. Suppose a random sample of 16 students is tested. The surveyor tests the null hypothesis that the mean IQ of students is 100 against the alternative hypothesis that it is greater than 100, using a 0.05 level of significance and a standard deviation of 16. What is the power of the hypothesis test if the true population mean were 116?

Solution:

The test statistic under the null hypothesis follows a Student t-distribution, which we approximate here by a normal distribution. As the probability of committing a Type I error ($\alpha$) is 0.05, we reject the null hypothesis $H_0$ when the test statistic $T \ge 1.645$. First, compute the critical value of the sample mean from the test statistic:

$T = \frac{\bar X - \mu}{\sigma / \sqrt{n}}$

$\implies \bar X = \mu + T\left(\frac{\sigma}{\sqrt{n}}\right)$

$= 100 + 1.645\left(\frac{16}{\sqrt{16}}\right)$

$= 106.58$

Next, compute the power of the test. With the true mean $\mu = 116$, the critical sample mean corresponds to $T = \frac{106.58 - 116}{16/\sqrt{16}} \approx -2.36$, so:

$Power = P(\bar X \ge 106.58 \text{ where } \mu = 116)$

$= P(T \ge -2.36)$

$= 1 - P(T < -2.36)$

$= 1 - 0.0091$

$= 0.9909$

So we have a 99.09% chance of rejecting the null hypothesis $H_0: \mu = 100$ in favor of the alternative hypothesis $H_1: \mu > 100$ when the unknown population mean is $\mu = 116$.
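The power calculation above can be reproduced with Python's standard library. This is a minimal sketch under the same assumptions as the worked solution (a one-sided test and a normal approximation); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 100      # mean under the null hypothesis
mu_true = 116   # assumed true population mean
sigma = 16      # standard deviation
n = 16          # sample size
alpha = 0.05    # significance level (one-sided test)

standard_error = sigma / sqrt(n)
z_critical = NormalDist().inv_cdf(1 - alpha)   # about 1.645

# Smallest sample mean that leads to rejecting H0
x_critical = mu_0 + z_critical * standard_error

# Power: probability that the sample mean exceeds x_critical when mu = mu_true
power = 1 - NormalDist(mu_true, standard_error).cdf(x_critical)
print(round(x_critical, 2), round(power, 4))
```

The result is close to 0.99 and differs from the hand-computed 0.9909 only in the fourth decimal place, because the code does not round the intermediate statistic to -2.36.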
Kolmogorov Smirnov Test
This test is used in situations where a comparison has to be made between an observed sample distribution and a theoretical distribution.

K-S One Sample Test

This test is used as a test of goodness of fit and is ideal when the size of the sample is small. It compares the cumulative distribution function for a variable with a specified distribution. The null hypothesis assumes no difference between the observed and theoretical distributions, and the value of the test statistic 'D' is calculated as:

Formula

$D = \text{Maximum } |F_O(X) - F_T(X)|$

Where:

$F_O(X)$ = observed cumulative frequency distribution of a random sample of n observations, with $F_O(X) = \frac{k}{n}$ = (No. of observations ≤ X)/(Total no. of observations).
$F_T(X)$ = the theoretical cumulative frequency distribution.

The critical value of $D$ is found from the K-S table values for the one sample test.

Acceptance Criteria: If the calculated value is less than the critical value, accept the null hypothesis.

Rejection Criteria: If the calculated value is greater than the table value, reject the null hypothesis.

Example

Problem Statement: In a study covering various streams of a college, 60 students, with an equal number of students drawn from each stream, were interviewed and their intention to join the Drama Club of the college was noted.

| | B.Sc. | B.A. | B.Com. | M.A. | M.Com. |
|---|---|---|---|---|---|
| No. in each class | 5 | 9 | 11 | 16 | 19 |

It was expected that 12 students from each class would join the Drama Club. Use the K-S test to find whether there is any difference among student classes with regard to their intention of joining the Drama Club.

Solution:

$H_0$: There is no difference among students of different streams with respect to their intention of joining the Drama Club.

We develop the cumulative frequencies for the observed and theoretical distributions.

| Stream | Observed (O) | Theoretical (T) | $F_O(X)$ | $F_T(X)$ | $\lvert F_O(X) - F_T(X) \rvert$ |
|---|---|---|---|---|---|
| B.Sc. | 5 | 12 | 5/60 | 12/60 | 7/60 |
| B.A. | 9 | 12 | 14/60 | 24/60 | 10/60 |
| B.Com. | 11 | 12 | 25/60 | 36/60 | 11/60 |
| M.A. | 16 | 12 | 41/60 | 48/60 | 7/60 |
| M.Com. | 19 | 12 | 60/60 | 60/60 | 0/60 |
| Total | n = 60 | | | | |

The test statistic $D$ is calculated as:

$D = \text{Maximum } |F_O(X) - F_T(X)| = \frac{11}{60} = 0.183$

The table value of D at the 5% significance level is given by:

$D_{0.05} = \frac{1.36}{\sqrt{n}} = \frac{1.36}{\sqrt{60}} = 0.176$

Since the calculated value is greater than the critical value, we reject the null hypothesis and conclude that there is a difference among students of different streams in their intention of joining the Club.

K-S Two Sample Test

When there are two independent samples instead of one, the K-S two sample test can be used to test the agreement between the two cumulative distributions. The null hypothesis states that there is no difference between the two distributions. The D-statistic is calculated in the same manner as in the K-S One Sample Test.

Formula

$D = \text{Maximum } |F_{n_1}(X) - F_{n_2}(X)|$

Where:

$n_1$ = number of observations in the first sample.
$n_2$ = number of observations in the second sample.

A large maximum deviation $D$ between the cumulative distributions indicates a difference between the two sample distributions.

For samples where $n_1 = n_2$ and both are ≤ 40, the critical value of D is taken from the K-S table for the two sample case. When $n_1$ and/or $n_2$ > 40, the K-S table for large samples of the two sample test should be used. The null hypothesis is accepted if the calculated value is less than the table value, and rejected otherwise.

Thus the use of any of these nonparametric tests helps researchers test the significance of their results when the characteristics of the target population are unknown or no assumptions have been made about them.
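The one-sample calculation for the Drama Club example can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the variable names are assumptions, and the critical value uses the same large-sample approximation $1.36/\sqrt{n}$ as the text.

```python
from itertools import accumulate
from math import sqrt

observed = [5, 9, 11, 16, 19]   # students interested, per stream
expected = [12] * 5             # 12 expected from each stream
n = sum(observed)               # 60 students in total

# Cumulative relative frequencies F_O(X) and F_T(X)
f_observed = [c / n for c in accumulate(observed)]
f_expected = [c / n for c in accumulate(expected)]

# Test statistic: largest absolute gap between the two cumulative curves
d_statistic = max(abs(o - t) for o, t in zip(f_observed, f_expected))

# Critical value at the 5% significance level, as used in the text
d_critical = 1.36 / sqrt(n)

print(round(d_statistic, 3), round(d_critical, 3), d_statistic > d_critical)
```

The final comparison is the rejection criterion: the null hypothesis is rejected when the statistic exceeds the critical value.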
Pie Chart
A pie chart (or pie graph) is a circular statistical chart divided into slices to illustrate numerical proportions. In a pie chart, the central angle, area, and arc length of each slice are proportional to the quantity or percentage it represents. The percentages should total 100 and the arc measures should total 360°.

The following illustration of a pie graph depicts the cost of construction of a house. From this graph, one can compare the sums spent on cement, steel, and so on. One can also compute the actual sum spent on each individual expense. Consider an example where we want to know how much more the labor cost is compared to the cost of steel.

$\text{Amount spent on labor} = \frac{90}{360} \times 600000 = \$150000$

$\text{Amount spent on steel} = \frac{54}{360} \times 600000 = \$90000$

$\text{Excess} = 150000 - 90000 = \$60000$

Let 60000 = x% of 600000.

$\implies \frac{x}{100} \times 600000 = 60000$

$\implies x = 10$, i.e., the excess is 10% of the total expense.
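The conversion from central angle to amount generalizes to any slice. The following minimal Python sketch reproduces the labor and steel figures; `amount_for_angle` is a hypothetical helper, and the angles (90° and 54°) and the total of 600000 are taken from the example.

```python
TOTAL_COST = 600000  # total construction cost, from the example

def amount_for_angle(central_angle_degrees, total=TOTAL_COST):
    """Amount represented by a slice with the given central angle."""
    return central_angle_degrees / 360 * total

labor = amount_for_angle(90)    # slice with a 90 degree central angle
steel = amount_for_angle(54)    # slice with a 54 degree central angle
excess = labor - steel
excess_percent = excess / TOTAL_COST * 100

print(labor, steel, excess, excess_percent)
```

Dividing by 360 first expresses each slice as a fraction of the full circle, which is the same fraction of the total cost.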
Tableau – Data Types
As a data analysis tool, Tableau classifies every piece of data into one of four categories: String, Number, Boolean, and Date & Datetime. Once data is loaded from the source, Tableau automatically assigns the data types. You can also change some of the data types, provided the change satisfies the data conversion rules. The user has to specify the data type for calculated fields.

The following table describes the data types supported by Tableau.

| Data Type | Description | Example |
|---|---|---|
| STRING | Any sequence of zero or more characters, enclosed within single quotes. The quote itself can be included in a string by writing it twice. | 'Hello'; 'Quoted' 'quote' |
| NUMBER | Either integers or floating-point numbers. It is advisable to round floating-point numbers when using them in calculations. | 3; 142.58 |
| BOOLEAN | Logical values. | TRUE; FALSE |
| DATE & DATETIME | Tableau recognizes dates in almost all formats. To force Tableau to recognize a string as a date, enclose it in # signs. | "02/01/2015"; #3 March 1982# |
Tableau – Design Flow
Because Tableau helps in analyzing large amounts of data across diverse time periods, dimensions, and measures, creating a good dashboard or story requires meticulous planning. Hence, it is important to know how to approach dashboard design. As in any other field, there are many best practices to follow when creating worksheets and dashboards.

Though the final outcome expected from a Tableau project is ideally a dashboard with a story, there are many intermediate steps that need to be completed to reach this goal. Following is the flow of design steps that should ideally be followed to create effective dashboards.

Connect to Data Source

Tableau connects to all popular data sources. It has built-in connectors that take care of establishing the connection once the connection parameters are supplied. Be it simple text files, relational sources, SQL sources, or cloud databases, Tableau connects to nearly every data source.

Build Data Views

After connecting to a data source, all of its columns and data are available in the Tableau environment. You classify them as dimensions and measures, and create any hierarchy required. Using these you build views, which are traditionally known as reports. Tableau provides an easy drag-and-drop feature to build views.

Enhance the Views

The views created above need to be enhanced further through filters, aggregations, axis labels, formatting of colors and borders, etc.

Create Worksheets

Create different worksheets to build different views on the same or different data.

Create and Organize Dashboards

Dashboards contain multiple worksheets that are linked. Hence, an action in any one worksheet can change the result in the dashboard accordingly.

Create a Story

A story is a sheet that contains a sequence of worksheets or dashboards that work together to convey information. You can create stories to show how facts are connected, provide context, demonstrate how decisions relate to outcomes, or simply make a compelling case.
Tableau – Environment Setup
In this chapter, you will learn about the environment setup of Tableau.

Download Tableau Desktop

The Free Personal Edition of Tableau Desktop can be downloaded from Tableau Desktop. You need to register with your details to be able to download it.

After downloading, installation is a straightforward process in which you accept the license agreement and provide the target folder for installation. The following steps describe the entire setup process.

Start the Installation Wizard

Double-click TableauDesktop-64bit-9-2-2.exe. A screen appears asking you to allow the installation program to run. Click "Run".

Accept the License Agreement

Read the license agreement and, if you agree, choose the "I have read and accept the terms of this license agreement" option. Then, click "Install".

Start Trial

On completion of the installation, the screen prompts you to start the trial now or later. You may choose to start it now. If you have purchased Tableau, you may enter the license key instead.

Provide Your Details

Provide your name and organization details. Then, click "Next".

Registration Complete

The registration completion screen appears. Click "Continue".

Verify the Installation

You can verify the installation by going to the Windows Start menu and clicking the Tableau icon. The Tableau start screen appears.

You are now ready to learn Tableau.
Tableau – Navigation
In this chapter, you will get acquainted with the navigational features available in the Tableau interface. On running Tableau Desktop, you get the menu at the top, which shows all the commands you can navigate. Let's open a blank workbook and go through the important features under each menu.

Menu Commands

On closing the getting started window, you get the main interface with all the available menu commands. They represent the entire set of features available in Tableau. The details of each menu are described below.

File Menu

This menu is used to create a new Tableau workbook and open existing workbooks from both the local system and Tableau Server. The important features in this menu are:

- Workbook Locale sets the language to be used in the report.
- Paste Sheets pastes a sheet copied from another workbook into the current workbook.
- Export Packaged Workbook creates a packaged workbook that can be shared with other users.

Data Menu

This menu is used to create a new data source to fetch data for analysis and visualization. It also allows you to replace or upgrade the existing data source. The important features in this menu are:

- New Data Source lets you view all the available connection types and choose from them.
- Refresh All Extracts refreshes the data from the source.
- Edit Relationships defines the fields used to link more than one data source.

Worksheet Menu

This menu is used to create a new worksheet along with various display features, such as showing the title and captions. The important features in this menu are:

- Show Summary lets you view a summary of the data used in the worksheet, such as the count.
- Tooltip shows the tooltip when hovering over various data fields.
- Run Update updates the worksheet data or the filters used.

Dashboard Menu

This menu is used to create a new dashboard along with various display features, such as showing the title and exporting the image. The important features in this menu are:

- Format sets the layout in terms of colors and sections of the dashboard.
- Actions links the dashboard sheets to external URLs or other sheets.
- Export Image exports an image of the dashboard.

Story Menu

This menu is used to create a new story, which contains many sheets or dashboards with related data. The important features in this menu are:

- Format sets the layout in terms of colors and sections of the story.
- Run Update updates the story with the latest data from the source.
- Export Image exports an image of the story.

Analysis Menu

This menu is used for analyzing the data present in the sheet. Tableau provides many out-of-the-box features, such as calculating percentages and performing a forecast. The important features in this menu are:

- Forecast shows a forecast based on the available data.
- Trend Lines shows the trend line for a series of data.
- Create Calculated Field creates additional fields based on a calculation on the existing fields.

Map Menu

This menu is used for building map views in Tableau. You can assign geographic roles to fields in your data. The important features in this menu are:

- Map Layers hides and shows map layers, such as street names and country borders, and adds data layers.
- Geocoding creates new geographic roles and assigns them to the geographic fields in your data.

Format Menu

This menu is used for applying formatting options to enhance the look and feel of the dashboards created. It provides features such as borders, colors, and alignment of text. The important features in this menu are:

- Borders applies borders to the fields displayed in the report.
- Title & Caption assigns a title and caption to the reports.
- Cell Size customizes the size of the cells displaying the data.
- Workbook Theme applies a theme to the entire workbook.

Server Menu

This menu is used to log in to Tableau Server, if you have access, and publish your results for others to use. It is also used to access workbooks published by others. The important features in this menu are:

- Publish Workbook publishes the workbook to the server to be used by others.
- Publish Data Source publishes the source data used in the workbook.
- Create User Filters creates filters on the worksheet that are applied for various users when they access the report.