Tableau – Data Terminology

As a powerful data visualization tool, Tableau has many unique terms and definitions. You need to get acquainted with their meaning before you start using the features in Tableau. The following list of terms is comprehensive and explains the terms most frequently used.

1. Alias − An alternative name that you can assign to a field or to a dimension member.
2. Bin − A user-defined grouping of measures in the data source.
3. Bookmark − A .tbm file in the Bookmarks folder in the Tableau repository that contains a single worksheet. Much like web browser bookmarks, .tbm files are a convenient way to quickly display different analyses.
4. Calculated Field − A new field that you create by using a formula to modify the existing fields in your data source.
5. Crosstab − A text table view. Use text tables to display the numbers associated with dimension members.
6. Dashboard − A combination of several views arranged on a single page. Use dashboards to compare and monitor a variety of data simultaneously.
7. Data Pane − A pane on the left side of the workbook that displays the fields of the data sources to which Tableau is connected. The fields are divided into dimensions and measures. The data pane also displays custom fields such as calculations, binned fields, and groups. You build views of your data by dragging fields from the data pane onto the various shelves that are a part of every worksheet.
8. Data Source Page − A page where you can set up your data source. The data source page generally consists of four main areas − left pane, join area, preview area, and metadata area.
9. Dimension − A field of categorical data. Dimensions typically hold discrete data such as hierarchies and members that cannot be aggregated. Examples of dimensions include dates, customer names, and customer segments.
10. Extract − A saved subset of a data source that you can use to improve performance and analyze offline. You can create an extract by defining filters and limits that include the data you want in the extract.
11. Filters Shelf − A shelf on the left of the workbook that you can use to exclude data from a view by filtering it using measures and dimensions.
12. Format Pane − A pane that contains formatting settings that control the entire worksheet, as well as individual fields in the view. When open, the Format pane appears on the left side of the workbook.
13. Level Of Detail (LOD) Expression − A syntax that supports aggregation at dimensionalities other than the view level. With level of detail expressions, you can attach one or more dimensions to any aggregate expression.
14. Marks − A part of the view that visually represents one or more rows in a data source. A mark can be, for example, a bar, line, or square. You can control the type, color, and size of marks.
15. Marks Card − A card to the left of the view, where you can drag fields to control mark properties such as type, color, size, shape, label, tooltip, and detail.
16. Pages Shelf − A shelf to the left of the view that you can use to split a view into a sequence of pages based on the members and values in a discrete or continuous field. Adding a field to the Pages shelf is like adding a field to the Rows shelf, except that a new page is created for each new row.
17. Rows Shelf − A shelf at the top of the workbook that you can use to create the rows of a data table. The shelf accepts any number of dimensions and measures. When you place a dimension on the Rows shelf, Tableau creates headers for the members of that dimension. When you place a measure on the Rows shelf, Tableau creates quantitative axes for that measure.
18. Shelves − Named areas to the left and top of the view. You build views by placing fields onto the shelves. Some shelves are available only when you select certain mark types. For example, the Shape shelf is available only when you select the Shape mark type.
19. Workbook − A file with a .twb extension that contains one or more worksheets (and possibly also dashboards and stories).
20. Worksheet − A sheet where you build views of your data by dragging fields onto shelves.

Standard Error (SE)

The standard deviation of a sampling distribution is called the standard error. In sampling, the three most important characteristics are accuracy, bias, and precision. It can be said that:

The estimate derived from any one sample is inaccurate to the extent that it differs from the population parameter. Since population parameters could only be determined exactly by surveying the whole population, they are generally unknown, and the actual difference between the sample estimate and the population parameter cannot be measured.

The estimator is unbiased if the mean of the estimates derived from all the possible samples equals the population parameter.

Even if the estimator is unbiased, an individual sample is most likely to yield an inaccurate estimate and, as stated earlier, the inaccuracy cannot be measured. However, it is possible to measure the precision, i.e. the range within which the true value of the population parameter is expected to lie, using the concept of standard error.

Formula

$SE_{\bar{x}} = \frac{s}{\sqrt{n}}$

Where −
${s}$ = Standard deviation of the sample
${n}$ = Number of observations

Example

Problem Statement: Calculate the Standard Error for the following individual data:

Items: 14, 36, 45, 70, 105

Solution: Let's first compute the Arithmetic Mean $\bar{x}$

$\bar{x} = \frac{14 + 36 + 45 + 70 + 105}{5} \\[7pt] \, = \frac{270}{5} \\[7pt] \, = {54}$

Let's now compute the Standard Deviation ${s}$

$s = \sqrt{\frac{1}{n-1}((x_{1}-\bar{x})^{2}+(x_{2}-\bar{x})^{2}+...+(x_{n}-\bar{x})^{2})} \\[7pt] \, = \sqrt{\frac{1}{5-1}((14-54)^{2}+(36-54)^{2}+(45-54)^{2}+(70-54)^{2}+(105-54)^{2})} \\[7pt] \, = \sqrt{\frac{1}{4}(1600+324+81+256+2601)} \\[7pt] \, = {34.86}$

Thus the Standard Error $SE_{\bar{x}}$

$SE_{\bar{x}} = \frac{s}{\sqrt{n}} \\[7pt] \, = \frac{34.86}{\sqrt{5}} \\[7pt] \, = \frac{34.86}{2.236} \\[7pt] \, = {15.59}$

The Standard Error of the given numbers is 15.59.
When the sample is drawn without replacement from a finite population of size $N$, the standard error is multiplied by the finite population multiplier $\sqrt{\frac{N-n}{N-1}}$. The smaller the proportion of the population that is sampled, the smaller the effect of this multiplier, because the multiplier is then close to one and affects the standard error negligibly. Hence, if the sample size is less than 5% of the population, the finite multiplier is ignored.
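The worked example in this section can be checked with a short Python sketch. It uses only the standard library and an unrounded $\sqrt{5}$, which gives a standard error of about 15.59. The helper names are illustrative, the multiplier uses the usual finite population form $\sqrt{(N-n)/(N-1)}$, and the population size N = 1000 is a hypothetical value chosen only to show that the multiplier is close to one for a small sample.

```python
import math
import statistics

# Data from the worked example.
items = [14, 36, 45, 70, 105]


def standard_error(sample):
    """Standard error of the mean: s / sqrt(n), with s the sample standard deviation."""
    s = statistics.stdev(sample)  # stdev uses the n - 1 denominator
    return s / math.sqrt(len(sample))


def finite_population_multiplier(population_size, sample_size):
    """sqrt((N - n) / (N - 1)); close to 1 when the sample is a small share of N."""
    return math.sqrt((population_size - sample_size) / (population_size - 1))


mean = statistics.mean(items)
s = statistics.stdev(items)
se = standard_error(items)

# N = 1000 is a hypothetical population size, not a value from the text.
fpm = finite_population_multiplier(1000, len(items))

print(f"mean = {mean}, s = {s:.2f}, SE = {se:.2f}, multiplier = {fpm:.4f}")
```

With a sample of 5 out of 1000 (0.5% of the population), the multiplier is so close to one that applying it barely changes the standard error, which is the reason it is ignored for small sampling fractions.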

Tableau – Forecasting

Forecasting is about predicting the future value of a measure. There are many mathematical models for forecasting. Tableau uses the model known as exponential smoothing. In exponential smoothing, recent observations are given relatively more weight than older observations. These models capture the evolving trend or seasonality of the data and extrapolate them into the future. The result of a forecast can also become a field in the visualization created.

Tableau takes a time dimension and a measure field to create a forecast.

Creating a Forecast

Using the Sample-superstore, forecast the value of the measure Sales for the next two years. To achieve this objective, follow these steps.

Step 1 − Create a line chart with Order Date (Year) in the Columns shelf and Sales in the Rows shelf. Go to the Analysis tab as shown in the following screenshot and click Forecast under the Model category.

Step 2 − On completing the above step, you will find various options for the forecast. Choose the Forecast Length as 2 years and leave the Forecast Model set to Automatic as shown in the following screenshot.

Click OK, and you will get the final forecast result as shown in the following screenshot.

Describe Forecast

You can also get minute details of the forecast model by choosing the option Describe Forecast. To get this option, right-click on the Forecast diagram as shown in the following screenshot.
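To make the idea of weighting recent observations more heavily concrete, here is a minimal Python sketch of simple exponential smoothing. It is not Tableau's implementation, which selects among several exponential smoothing models automatically. The function name, the smoothing factor alpha = 0.5, and the sales figures are all hypothetical.

```python
def exponential_smoothing(values, alpha):
    """Return the smoothed level after each observation.

    Each level is alpha * current value + (1 - alpha) * previous level,
    so older observations receive geometrically decreasing weight.
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    levels = [values[0]]  # start from the first observation
    for value in values[1:]:
        levels.append(alpha * value + (1 - alpha) * levels[-1])
    return levels


# Hypothetical yearly sales figures, for illustration only.
sales = [10.0, 20.0, 30.0]
levels = exponential_smoothing(sales, alpha=0.5)

# With no trend or seasonal component, the forecast for every
# future period is simply the last smoothed level.
forecast = levels[-1]
print(levels, forecast)
```

A larger alpha makes the forecast react faster to recent changes, while a smaller alpha smooths out noise at the cost of lagging behind real shifts.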

Ti 83 Exponential Regression

Ti 83 Exponential Regression is used to compute the equation of the exponential curve that best fits the relationship between two sets of variables.

Formula

${ y = a \times b^x }$

Where − ${a, b}$ = coefficients of the exponential.

Example

Problem Statement: Calculate the Exponential Regression Equation (y) for the following data points.

Time (min), Ti          0     5     10    15
Temperature (°F), Te    140   129   119   112

Solution: Let a and b be the coefficients of the exponential regression. Here $\ln$ denotes the natural logarithm, which must be used because a and b are recovered as powers of e.

Step 1

${ b = e^{ \frac{n \times \sum Ti \ln(Te) - \sum (Ti) \times \sum \ln(Te)}{n \times \sum (Ti)^2 - \sum (Ti) \times \sum (Ti)} } }$

Where − ${n}$ = total number of items.

${ \sum Ti \ln(Te) = 0 \times \ln(140) + 5 \times \ln(129) + 10 \times \ln(119) + 15 \times \ln(112) = 142.8678 \\[7pt] \sum \ln(Te) = \ln(140) + \ln(129) + \ln(119) + \ln(112) = 19.2991 \\[7pt] \sum Ti = (0 + 5 + 10 + 15) = 30 \\[7pt] \sum Ti^2 = (0^2 + 5^2 + 10^2 + 15^2) = 350 \\[7pt] \implies b = e^{\frac{4 \times 142.8678 - 30 \times 19.2991}{4 \times 350 - 30 \times 30}} \\[7pt] \approx e^{-0.0150} \\[7pt] \approx 0.9851 }$

Step 2

${ a = e^{ \frac{\sum \ln(Te) - \sum (Ti) \times \ln(b)}{n} } \\[7pt] = e^{\frac{19.2991 - 30 \times (-0.0150)}{4}} \\[7pt] \approx e^{4.9373} \\[7pt] \approx 139.39 }$

Step 3

Putting the values of a and b in the Exponential Regression Equation (y), we get

${ y = a \times b^x \\[7pt] = 139.39 \times 0.9851^x }$

As a check, at x = 0 the equation gives y = 139.39, close to the observed 140 °F.
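The steps above can be sketched in Python as a least-squares line fitted to the natural logarithm of the temperatures. This is an illustrative sketch, not the calculator's internal routine; the function name is hypothetical, and natural logarithms are used throughout so that the exponentials in the formulas for a and b are consistent.

```python
import math

# Data points from the example.
times = [0, 5, 10, 15]          # Ti, minutes
temps = [140, 129, 119, 112]    # Te, degrees Fahrenheit


def exponential_regression(xs, ys):
    """Fit y = a * b**x by linear least squares on ln(y)."""
    n = len(xs)
    log_ys = [math.log(y) for y in ys]  # natural logarithm
    sum_x = sum(xs)
    sum_log_y = sum(log_ys)
    sum_x_log_y = sum(x * ly for x, ly in zip(xs, log_ys))
    sum_x_sq = sum(x * x for x in xs)

    # Slope and intercept of the line ln(y) = ln(a) + x * ln(b).
    slope = (n * sum_x_log_y - sum_x * sum_log_y) / (n * sum_x_sq - sum_x ** 2)
    intercept = (sum_log_y - slope * sum_x) / n
    return math.exp(intercept), math.exp(slope)


a, b = exponential_regression(times, temps)
print(f"y = {a:.2f} * {b:.4f}^x")

# Compare fitted values with the observed temperatures.
for t, observed in zip(times, temps):
    print(t, observed, round(a * b ** t, 2))
```

Comparing the fitted values with the observed temperatures is a quick sanity check: a fitted curve whose value at x = 0 is far from 140 would indicate a mistake such as mixing base-10 and natural logarithms.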

Tableau – Overview

As a leading data visualization tool, Tableau has many desirable and unique features. Its powerful data discovery and exploration application allows you to answer important questions in seconds. You can use Tableau's drag and drop interface to visualize any data, explore different views, and even combine multiple databases easily. It does not require any complex scripting. Anyone who understands the business problems can address them with a visualization of the relevant data. After analysis, sharing with others is as easy as publishing to Tableau Server.

Tableau Features

Tableau provides solutions for all kinds of industries, departments, and data environments. Following are some unique features which enable Tableau to handle diverse scenarios.

Speed of Analysis − As it does not require a high level of programming expertise, any user with access to data can start using it to derive value from the data.

Self-Reliant − Tableau does not need a complex software setup. The desktop version, which is used by most users, is easily installed and contains all the features needed to start and complete data analysis.

Visual Discovery − The user explores and analyzes the data by using visual tools like colors, trend lines, charts, and graphs. There is very little script to be written, as nearly everything is done by drag and drop.

Blend Diverse Data Sets − Tableau allows you to blend different relational, semistructured, and raw data sources in real time, without expensive up-front integration costs. The users don't need to know the details of how data is stored.

Architecture Agnostic − Tableau works on all kinds of devices where data flows. Hence, the user need not worry about specific hardware or software requirements to use Tableau.

Real-Time Collaboration − Users can filter, sort, and discuss data on the fly and embed a live dashboard in portals like a SharePoint site or Salesforce. You can save your view of data and allow colleagues to subscribe to your interactive dashboards so they see the very latest data just by refreshing their web browser.

Centralized Data − Tableau Server provides a centralized location to manage all of the organization's published data sources. You can delete, change permissions, add tags, and manage schedules in one convenient location. It's easy to schedule extract refreshes and manage them in the data server. Administrators can centrally define a schedule for extracts on the server for both incremental and full refreshes.

Probability Bayes Theorem

One of the most significant developments in the probability field has been the development of Bayesian decision theory, which has proved to be of immense help in making decisions under uncertain conditions. The Bayes Theorem was developed by a British mathematician, Rev. Thomas Bayes. The probability given under Bayes theorem is also known by the name of inverse probability, posterior probability, or revised probability. This theorem finds the probability of an event by considering the given sample information; hence the name posterior probability. The Bayes theorem is based on the formula of conditional probability.

The conditional probability of event ${A_1}$ given event ${B}$ is

${P(A_1/B) = \frac{P(A_1 \text{ and } B)}{P(B)}}$

Similarly, the probability of event ${A_2}$ given event ${B}$ is

${P(A_2/B) = \frac{P(A_2 \text{ and } B)}{P(B)}}$

Where

${P(B) = P(A_1 \text{ and } B) + P(A_2 \text{ and } B) \\[7pt] P(B) = P(A_1) \times P(B/A_1) + P(A_2) \times P(B/A_2) }$

${P(A_1/B)}$ can be rewritten as

${P(A_1/B) = \frac{P(A_1) \times P(B/A_1)}{P(A_1) \times P(B/A_1) + P(A_2) \times P(B/A_2)}}$

Hence the general form of Bayes Theorem is

${P(A_i/B) = \frac{P(A_i) \times P(B/A_i)}{\sum_{j=1}^{n} P(A_j) \times P(B/A_j)}}$

Where ${A_1}$, ${A_2}$, ..., ${A_i}$, ..., ${A_n}$ are a set of n mutually exclusive and exhaustive events.
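The general form of the theorem can be illustrated with a small Python sketch. The function name and the numbers are hypothetical: two mutually exclusive and exhaustive events with prior probabilities 0.6 and 0.4, and probabilities of B given each event of 0.02 and 0.05.

```python
def posterior_probabilities(priors, likelihoods):
    """Apply Bayes theorem to mutually exclusive and exhaustive events.

    priors[i] is P(A_i) and likelihoods[i] is P(B/A_i).
    Returns the list of posteriors P(A_i/B).
    """
    if len(priors) != len(likelihoods):
        raise ValueError("priors and likelihoods must have the same length")
    if abs(sum(priors) - 1.0) > 1e-9:
        raise ValueError("priors of exhaustive events must sum to 1")

    # Numerators P(A_i) * P(B/A_i); their sum is P(B).
    joint = [p * l for p, l in zip(priors, likelihoods)]
    prob_b = sum(joint)
    if prob_b == 0:
        raise ValueError("P(B) is zero, so the posterior is undefined")
    return [j / prob_b for j in joint]


# Hypothetical values, for illustration only.
priors = [0.6, 0.4]            # P(A_1), P(A_2)
likelihoods = [0.02, 0.05]     # P(B/A_1), P(B/A_2)

posteriors = posterior_probabilities(priors, likelihoods)
print(posteriors)
```

Although A_1 is the more likely event beforehand, observing B shifts the weight toward A_2, because B is more probable under A_2. This revision of a prior in light of evidence is why the result is called a posterior or revised probability.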

Tableau – Tree Map

The tree map displays data in nested rectangles. The dimensions define the structure of the tree map, and the measures define the size or color of the individual rectangles. The rectangles are easy to visualize, as both the size and the shade of color of each rectangle reflect the value of the measure.

A Tree Map is created using one or more dimensions with one or two measures.

Creating a Tree Map

Using the Sample-superstore, find the size of profits for each Ship Mode value. To achieve this objective, follow these steps.

Step 1 − Drag and drop the measure Profit two times to the Marks Card: once to the Size shelf and again to the Color shelf.

Step 2 − Drag and drop the dimension Ship Mode to the Label shelf. Choose the chart type Tree Map from Show Me. The following chart appears.

Tree Map with Two Dimensions

You can add the dimension Region to the above Tree Map chart. Drag and drop it twice: once to the Color shelf and again to the Label shelf. The chart that appears will show four outer boxes for the four regions, with the boxes for ship modes nested inside them. All the different regions will now have different colors.

Residual Sum of Squares

In statistics, the residual sum of squares (RSS), also known as the sum of squared residuals (SSR) or the sum of squared errors of prediction (SSE), is the sum of the squares of residuals (deviations of predicted values from actual empirical values of data). Residual Sum of Squares (RSS) is defined and given by the following function:

Formula

${RSS = \sum_{i=1}^n (\epsilon_i)^2 = \sum_{i=1}^n (y_i - (\alpha + \beta x_i))^2}$

Where −
${X, Y}$ = sets of values.
${\alpha, \beta}$ = constants (intercept and slope of the fitted line).
${n}$ = number of values in each set.

Example

Problem Statement: Consider two data sets, X = 1, 2, 3, 4 and Y = 4, 5, 6, 7, with constants ${\alpha}$ = 1 and ${\beta}$ = 2. Find the Residual Sum of Squares (RSS) of the two data sets.

Solution:

Given, ${X = 1, 2, 3, 4 \quad Y = 4, 5, 6, 7 \quad \alpha = 1 \quad \beta = 2}$

Substitute the given values in the Residual Sum of Squares formula:

${RSS = \sum_{i=1}^n (\epsilon_i)^2 = \sum_{i=1}^n (y_i - (\alpha + \beta x_i))^2 \\[7pt] = (4-(1+(2 \times 1)))^2 + (5-(1+(2 \times 2)))^2 + (6-(1+(2 \times 3)))^2 + (7-(1+(2 \times 4)))^2 \\[7pt] = (1)^2 + (0)^2 + (-1)^2 + (-2)^2 \\[7pt] = 6 }$
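The same calculation can be written as a short Python sketch. The function name is illustrative; the data and the constants are the ones from the example.

```python
def residual_sum_of_squares(xs, ys, alpha, beta):
    """Sum of squared residuals y_i - (alpha + beta * x_i)."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    return sum((y - (alpha + beta * x)) ** 2 for x, y in zip(xs, ys))


# Values from the example.
X = [1, 2, 3, 4]
Y = [4, 5, 6, 7]

residuals = [y - (1 + 2 * x) for x, y in zip(X, Y)]
rss = residual_sum_of_squares(X, Y, alpha=1, beta=2)
print(residuals, rss)
```

A smaller RSS means the line with the chosen constants follows the data more closely, so comparing RSS values is a simple way to rank candidate values of the constants.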

Tableau – Home

Tableau Tutorial

Tableau is a Business Intelligence tool for visually analyzing data. Users can create and distribute interactive and shareable dashboards, which depict the trends, variations, and density of the data in the form of graphs and charts. Tableau can connect to files, relational sources, and Big Data sources to acquire and process data. The software allows data blending and real-time collaboration, which sets it apart. It is used by businesses, academic researchers, and many government organizations for visual data analysis. It is also positioned as a leader among Business Intelligence and Analytics Platforms in the Gartner Magic Quadrant.

Audience

This tutorial is designed for all those readers who want to create, read, write, and modify Business Intelligence reports using Tableau. In addition, it will also be quite useful for those readers who would like to become a Data Analyst or a Data Scientist.

Prerequisites

Before proceeding with this tutorial, you should have a basic understanding of computer programming terminology and data analysis. You should also have some knowledge of various types of graphs and charts. Familiarity with SQL will be an added advantage.

Regression Intercept Confidence Interval

The Regression Intercept Confidence Interval gives a range that is likely to contain the true regression intercept, and is used to check the reliability of the estimate.

Formula

${R = \beta_0 \pm t(1 - \frac{\alpha}{2}, n-k-1) \times SE_{\beta_0} }$

Where −
${\beta_0}$ = Regression intercept.
${k}$ = Number of predictors.
${n}$ = Sample size.
${SE_{\beta_0}}$ = Standard error of the intercept.
${\alpha}$ = Significance level, i.e. one minus the confidence level.
${t}$ = t-value.

Example

Problem Statement: Compute the Regression Intercept Confidence Interval for the following data. The total number of predictors (k) is 1, the regression intercept ${\beta_0}$ is 5, the sample size (n) is 10, and the standard error ${SE_{\beta_0}}$ is 0.15.

Solution: Let us consider the case of a 99% Confidence Interval.

Step 1: Compute the t-value, where ${\alpha = 1 - 0.99 = 0.01}$.

${ = t(1 - \frac{\alpha}{2}, n-k-1) \\[7pt] = t(1 - \frac{0.01}{2}, 10-1-1) \\[7pt] = t(0.995, 8) \\[7pt] = 3.3554 }$

Step 2: Lower limit of the regression intercept:

${ = \beta_0 - t(1 - \frac{\alpha}{2}, n-k-1) \times SE_{\beta_0} \\[7pt] = 5 - (3.3554 \times 0.15) \\[7pt] = 5 - 0.50331 \\[7pt] = 4.49669 }$

Step 3: Upper limit of the regression intercept:

${ = \beta_0 + t(1 - \frac{\alpha}{2}, n-k-1) \times SE_{\beta_0} \\[7pt] = 5 + (3.3554 \times 0.15) \\[7pt] = 5 + 0.50331 \\[7pt] = 5.50331 }$

As a result, the Regression Intercept Confidence Interval is from ${4.49669}$ to ${5.50331}$ for a 99% confidence level.
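The interval arithmetic can be sketched in Python as follows. To keep the example free of third-party dependencies, the t-value 3.3554 for 8 degrees of freedom is taken from the worked example as an input rather than computed; the function name is illustrative.

```python
def intercept_confidence_interval(beta0, standard_error, t_value):
    """Return (lower, upper) limits: beta0 -/+ t_value * standard_error."""
    margin = t_value * standard_error
    return beta0 - margin, beta0 + margin


# Values from the example: intercept 5, standard error 0.15,
# and the t-value for a 99% interval with n - k - 1 = 8 degrees of freedom.
n, k = 10, 1
degrees_of_freedom = n - k - 1
t_value = 3.3554  # taken from the worked example, not computed here

lower, upper = intercept_confidence_interval(5, 0.15, t_value)
print(degrees_of_freedom, round(lower, 5), round(upper, 5))
```

The interval is symmetric around the estimated intercept, and its width grows with both the t-value (a higher confidence level) and the standard error (a less precise estimate).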