Tableau – String Calculations

In this chapter, you will learn about calculations in Tableau that involve strings. Tableau has many built-in string functions, which can be used for string manipulations such as comparing, concatenating, and replacing a few characters in a string. The following steps create a calculated field and use string functions in it.

Create Calculated Field

While connected to Sample Superstore, go to the Analysis menu and click ‘Create Calculated Field’ as shown in the following screenshot.

Calculation Editor

The above step opens a calculation editor, which lists all the functions that are available in Tableau. You can change the dropdown value to see only the functions related to strings.

Create a Formula

Suppose you want to find the sales in the cities whose names contain the letter “o”. For this, create the formula as shown in the following screenshot.

Using the Calculated Field

Now, to see the created field in action, drag it to the Rows shelf and drag the Sales field to the Columns shelf. The following screenshot shows the Sales values.
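The formula itself appears only in the screenshot, so it is not reproduced here. As a rough illustration of the same logic outside Tableau, the following Python sketch keeps only the cities whose name contains the letter "o" and totals their sales. The city names and sales figures are made-up sample values rather than Sample Superstore data, and the function name is hypothetical.

```python
from collections import defaultdict


def sales_for_cities_containing(rows, letter):
    """Sum sales per city, keeping only cities whose name contains `letter`."""
    totals = defaultdict(float)
    for city, sales in rows:
        # Case-sensitive substring test on the city name.
        if letter in city:
            totals[city] += sales
    return dict(totals)


# Hypothetical (city, sales) records, for illustration only.
rows = [
    ("Houston", 120.0),
    ("Seattle", 80.0),
    ("Boston", 50.0),
    ("Houston", 30.0),
]

result = sales_for_cities_containing(rows, "o")
print(result)
```

Cities without an "o" (here, Seattle) simply drop out of the result, which is the effect the calculated field has on the view.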

Sqoop – List Tables

This chapter describes how to list the tables of a particular database on a MySQL database server using Sqoop. The Sqoop list-tables tool parses and executes the ‘SHOW TABLES’ query against a particular database. It then lists the tables present in that database.

Syntax

The following syntax is used for the Sqoop list-tables command.

$ sqoop list-tables (generic-args) (list-tables-args)
$ sqoop-list-tables (generic-args) (list-tables-args)

Sample Query

The following command lists all the tables in the userdb database on the MySQL database server.

$ sqoop list-tables --connect jdbc:mysql://localhost/userdb --username root

If the command executes successfully, it displays the list of tables in the userdb database as follows.

…
13/05/31 16:45:58 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
emp
emp_add
emp_contact

Tableau – Basic Filters

Filtering is the process of removing certain values or ranges of values from a result set. Tableau's filtering feature supports both simple scenarios that use field values and advanced calculation-based or context-based filters. In this chapter, you will learn about the basic filters available in Tableau.

There are three types of basic filters available in Tableau:

Filter Dimensions are the filters applied on the dimension fields.
Filter Measures are the filters applied on the measure fields.
Filter Dates are the filters applied on the date fields.

Filter Dimensions

These filters are applied on the dimension fields. Typical examples include filtering on categories of text, or on numeric values using logical expressions with greater-than or less-than conditions.

Example

We use the Sample – Superstore data source to apply dimension filters on the sub-category of products. We create a view showing the profit for each sub-category of products according to their shipping mode. To do so, drag the dimension field “Sub-Category” to the Rows shelf and the measure field “Profit” to the Columns shelf.

Next, drag the Sub-Category dimension to the Filters shelf to open the Filter dialog box. Click the None button at the bottom of the list to deselect all members. Then, select the Exclude option in the lower right corner of the dialog box. Finally, select Labels and Storage and then click OK. The following screenshot shows the result with these two sub-categories excluded.

Filter Measures

These filters are applied on the measure fields. Filtering is based on the calculations applied to the measure fields. Hence, while dimension filters use only values to filter, measure filters use calculations based on fields.

Example

You can use the Sample – Superstore data source to apply measure filters on the average value of the profits. First, create a view with Ship Mode and Sub-Category as dimensions and the average of Profit as the measure, as shown in the following screenshot.

Next, drag the AVG(Profit) value to the filter pane. Choose Average as the filter mode. Next, choose “At least” and give a value to filter the rows that meet this criterion.

After completing the above steps, we get the final view below, showing only the sub-categories whose average profit is at least 20.

Filter Dates

Tableau treats a date field in three different ways when applying a date filter. It can filter by a relative date compared to today, by an absolute date, or by a range of dates. Each of these options is presented when a date field is dragged to the filter pane.

Example

We choose the Sample – Superstore data source and create a view with Order Date on the Columns shelf and Profit on the Rows shelf, as shown in the following screenshot.

Next, drag the “Order Date” field to the Filters shelf and choose Range of dates in the filter dialog box. Choose the dates as shown in the following screenshot.

On clicking OK, the final view appears showing the result for the chosen range of dates, as seen in the following screenshot.
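To make the difference between the three basic filter types concrete, here is a minimal Python sketch of the same ideas applied to a handful of records. It is an analogy only, not how Tableau evaluates filters internally; the records, dates, and variable names are invented for illustration.

```python
from datetime import date
from statistics import mean

# Hypothetical order records: (sub_category, profit, order_date).
orders = [
    ("Labels", 5.0, date(2014, 1, 10)),
    ("Storage", 40.0, date(2014, 3, 5)),
    ("Chairs", 30.0, date(2014, 6, 20)),
    ("Chairs", 20.0, date(2015, 2, 1)),
    ("Paper", 10.0, date(2015, 7, 15)),
]

# Dimension filter: exclude selected members of a dimension.
excluded = {"Labels", "Storage"}
dimension_filtered = [o for o in orders if o[0] not in excluded]

# Measure filter: aggregate first, then keep groups with AVG(profit) >= 20.
profits_by_sub_category = {}
for sub_category, profit, _ in orders:
    profits_by_sub_category.setdefault(sub_category, []).append(profit)
measure_filtered = {
    name: mean(values)
    for name, values in profits_by_sub_category.items()
    if mean(values) >= 20  # the "At least" condition
}

# Date filter: keep orders inside an inclusive range of dates.
start, end = date(2014, 1, 1), date(2014, 12, 31)
date_filtered = [o for o in orders if start <= o[2] <= end]

print(dimension_filtered)
print(measure_filtered)
print(date_filtered)
```

Note that the measure filter works on the aggregated value per group, whereas the dimension and date filters test each record directly.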

Z table

Standard Normal Probability Table

The following table shows the area under the curve to the left of a negative z-score. The row gives z to one decimal place and the column gives the second decimal place.

z     .00   .01   .02   .03   .04   .05   .06   .07   .08   .09
-3.4  .0003 .0003 .0003 .0003 .0003 .0003 .0003 .0003 .0003 .0002
-3.3  .0005 .0005 .0005 .0004 .0004 .0004 .0004 .0004 .0004 .0003
-3.2  .0007 .0007 .0006 .0006 .0006 .0006 .0006 .0005 .0005 .0005
-3.1  .0010 .0009 .0009 .0009 .0008 .0008 .0008 .0008 .0007 .0007
-3.0  .0013 .0013 .0013 .0012 .0012 .0011 .0011 .0011 .0010 .0010
-2.9  .0019 .0018 .0018 .0017 .0016 .0016 .0015 .0015 .0014 .0014
-2.8  .0026 .0025 .0024 .0023 .0023 .0022 .0021 .0021 .0020 .0019
-2.7  .0035 .0034 .0033 .0032 .0031 .0030 .0029 .0028 .0027 .0026
-2.6  .0047 .0045 .0044 .0043 .0041 .0040 .0039 .0038 .0037 .0036
-2.5  .0062 .0060 .0059 .0057 .0055 .0054 .0052 .0051 .0049 .0048
-2.4  .0082 .0080 .0078 .0075 .0073 .0071 .0069 .0068 .0066 .0064
-2.3  .0107 .0104 .0102 .0099 .0096 .0094 .0091 .0089 .0087 .0084
-2.2  .0139 .0136 .0132 .0129 .0125 .0122 .0119 .0116 .0113 .0110
-2.1  .0179 .0174 .0170 .0166 .0162 .0158 .0154 .0150 .0146 .0143
-2.0  .0228 .0222 .0217 .0212 .0207 .0202 .0197 .0192 .0188 .0183
-1.9  .0287 .0281 .0274 .0268 .0262 .0256 .0250 .0244 .0239 .0233
-1.8  .0359 .0351 .0344 .0336 .0329 .0322 .0314 .0307 .0301 .0294
-1.7  .0446 .0436 .0427 .0418 .0409 .0401 .0392 .0384 .0375 .0367
-1.6  .0548 .0537 .0526 .0516 .0505 .0495 .0485 .0475 .0465 .0455
-1.5  .0668 .0655 .0643 .0630 .0618 .0606 .0594 .0582 .0571 .0559
-1.4  .0808 .0793 .0778 .0764 .0749 .0735 .0721 .0708 .0694 .0681
-1.3  .0968 .0951 .0934 .0918 .0901 .0885 .0869 .0853 .0838 .0823
-1.2  .1151 .1131 .1112 .1093 .1075 .1056 .1038 .1020 .1003 .0985
-1.1  .1357 .1335 .1314 .1292 .1271 .1251 .1230 .1210 .1190 .1170
-1.0  .1587 .1562 .1539 .1515 .1492 .1469 .1446 .1423 .1401 .1379
-0.9  .1841 .1814 .1788 .1762 .1736 .1711 .1685 .1660 .1635 .1611
-0.8  .2119 .2090 .2061 .2033 .2005 .1977 .1949 .1922 .1894 .1867
-0.7  .2420 .2389 .2358 .2327 .2296 .2266 .2236 .2206 .2177 .2148
-0.6  .2743 .2709 .2676 .2643 .2611 .2578 .2546 .2514 .2483 .2451
-0.5  .3085 .3050 .3015 .2981 .2946 .2912 .2877 .2843 .2810 .2776
-0.4  .3446 .3409 .3372 .3336 .3300 .3264 .3228 .3192 .3156 .3121
-0.3  .3821 .3783 .3745 .3707 .3669 .3632 .3594 .3557 .3520 .3483
-0.2  .4207 .4168 .4129 .4090 .4052 .4013 .3974 .3936 .3897 .3859
-0.1  .4602 .4562 .4522 .4483 .4443 .4404 .4364 .4325 .4286 .4247
-0.0  .5000 .4960 .4920 .4880 .4840 .4801 .4761 .4721 .4681 .4641

The following table shows the area under the curve to the left of a positive z-score:

z     .00   .01   .02   .03   .04   .05   .06   .07   .08   .09
0.0   .5000 .5040 .5080 .5120 .5160 .5199 .5239 .5279 .5319 .5359
0.1   .5398 .5438 .5478 .5517 .5557 .5596 .5636 .5675 .5714 .5753
0.2   .5793 .5832 .5871 .5910 .5948 .5987 .6026 .6064 .6103 .6141
0.3   .6179 .6217 .6255 .6293 .6331 .6368 .6406 .6443 .6480 .6517
0.4   .6554 .6591 .6628 .6664 .6700 .6736 .6772 .6808 .6844 .6879
0.5   .6915 .6950 .6985 .7019 .7054 .7088 .7123 .7157 .7190 .7224
0.6   .7257 .7291 .7324 .7357 .7389 .7422 .7454 .7486 .7517 .7549
0.7   .7580 .7611 .7642 .7673 .7704 .7734 .7764 .7794 .7823 .7852
0.8   .7881 .7910 .7939 .7967 .7995 .8023 .8051 .8078 .8106 .8133
0.9   .8159 .8186 .8212 .8238 .8264 .8289 .8315 .8340 .8365 .8389
1.0   .8413 .8438 .8461 .8485 .8508 .8531 .8554 .8577 .8599 .8621
1.1   .8643 .8665 .8686 .8708 .8729 .8749 .8770 .8790 .8810 .8830
1.2   .8849 .8869 .8888 .8907 .8925 .8944 .8962 .8980 .8997 .9015
1.3   .9032 .9049 .9066 .9082 .9099 .9115 .9131 .9147 .9162 .9177
1.4   .9192 .9207 .9222 .9236 .9251 .9265 .9279 .9292 .9306 .9319
1.5   .9332 .9345 .9357 .9370 .9382 .9394 .9406 .9418 .9429 .9441
1.6   .9452 .9463 .9474 .9484 .9495 .9505 .9515 .9525 .9535 .9545
1.7   .9554 .9564 .9573 .9582 .9591 .9599 .9608 .9616 .9625 .9633
1.8   .9641 .9649 .9656 .9664 .9671 .9678 .9686 .9693 .9699 .9706
1.9   .9713 .9719 .9726 .9732 .9738 .9744 .9750 .9756 .9761 .9767
2.0   .9772 .9778 .9783 .9788 .9793 .9798 .9803 .9808 .9812 .9817
2.1   .9821 .9826 .9830 .9834 .9838 .9842 .9846 .9850 .9854 .9857
2.2   .9861 .9864 .9868 .9871 .9875 .9878 .9881 .9884 .9887 .9890
2.3   .9893 .9896 .9898 .9901 .9904 .9906 .9909 .9911 .9913 .9916
2.4   .9918 .9920 .9922 .9925 .9927 .9929 .9931 .9932 .9934 .9936
2.5   .9938 .9940 .9941 .9943 .9945 .9946 .9948 .9949 .9951 .9952
2.6   .9953 .9955 .9956 .9957 .9959 .9960 .9961 .9962 .9963 .9964
2.7   .9965 .9966 .9967 .9968 .9969 .9970 .9971 .9972 .9973 .9974
2.8   .9974 .9975 .9976 .9977 .9977 .9978 .9979 .9979 .9980 .9981
2.9   .9981 .9982 .9982 .9983 .9984 .9984 .9985 .9985 .9986 .9986
3.0   .9987 .9987 .9987 .9988 .9988 .9989 .9989 .9989 .9990 .9990
3.1   .9990 .9991 .9991 .9991 .9992 .9992 .9992 .9992 .9993 .9993
3.2   .9993 .9993 .9994 .9994 .9994 .9994 .9994 .9995 .9995 .9995
3.3   .9995 .9995 .9995 .9996 .9996 .9996 .9996 .9996 .9996 .9997
3.4   .9997 .9997 .9997 .9997 .9997 .9997 .9997 .9997 .9997 .9998
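The table entries can be cross-checked numerically. The area to the left of z under the standard normal curve is the cumulative distribution function, which can be written with the error function as Φ(z) = ½(1 + erf(z/√2)). The short Python sketch below uses only the standard library; the function name `phi` is a label chosen here and does not come from the chapter.

```python
import math


def phi(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Compare a few values with the table: row = z to one decimal place,
# column = second decimal place (for example -1.96 is row -1.9, column .06).
for z in (-1.96, -0.53, 0.0, 0.05, 1.0):
    print(f"{z:+.2f} -> {phi(z):.4f}")
```

For example, z = 1.00 gives an area of about .8413 and z = -1.96 gives about .0250, matching the corresponding table cells.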

Correlation Co-efficient

A correlation coefficient is a statistical measure of the degree to which changes in the value of one variable predict changes in the value of another. In positively correlated variables, the values increase or decrease in tandem. In negatively correlated variables, the value of one increases as the value of the other decreases.

Correlation coefficients are expressed as values between +1 and -1. A coefficient of +1 indicates a perfect positive correlation: a change in the value of one variable predicts a change in the same direction in the second variable. A coefficient of -1 indicates a perfect negative correlation: a change in the value of one variable predicts a change in the opposite direction in the second variable. Lesser degrees of correlation are expressed as non-zero decimals. A coefficient of zero indicates there is no discernible linear relationship between fluctuations of the variables.

Formula

${r = \frac{N \sum xy - (\sum x)(\sum y)}{\sqrt{[N\sum x^2 - (\sum x)^2][N\sum y^2 - (\sum y)^2]}} }$

Where:

${N}$ = Number of pairs of scores.
${\sum xy}$ = Sum of products of paired scores.
${\sum x}$ = Sum of x scores.
${\sum y}$ = Sum of y scores.
${\sum x^2}$ = Sum of squared x scores.
${\sum y^2}$ = Sum of squared y scores.

Example

Problem Statement:

Calculate the correlation co-efficient of the following:

X  Y
1  2
3  5
4  5
4  8

Solution:

Here ${N = 4}$. Dividing the numerator and the denominator of the formula by ${N}$ gives an equivalent form, which is used in the last step below.

${ \sum xy = (1)(2) + (3)(5) + (4)(5) + (4)(8) = 69 \\[7pt]
\sum x = 1 + 3 + 4 + 4 = 12 \\[7pt]
\sum y = 2 + 5 + 5 + 8 = 20 \\[7pt]
\sum x^2 = 1^2 + 3^2 + 4^2 + 4^2 = 42 \\[7pt]
\sum y^2 = 2^2 + 5^2 + 5^2 + 8^2 = 118 \\[7pt]
r = \frac{69 - \frac{(12)(20)}{4}}{\sqrt{(42 - \frac{(12)^2}{4})(118 - \frac{(20)^2}{4})}} \\[7pt]
= \frac{9}{\sqrt{(6)(18)}} \\[7pt]
= .866 }$
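The worked example can be checked with a few lines of Python. This sketch applies the formula from this chapter term by term using only the standard library; the function name `pearson_r` is a label chosen here for illustration.

```python
import math


def pearson_r(xs, ys):
    """Correlation coefficient computed from the raw-score formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


# The X and Y scores from the problem statement.
r = pearson_r([1, 3, 4, 4], [2, 5, 5, 8])
print(round(r, 3))
```

With these scores the numerator is 36 and the denominator is the square root of 24 × 72, so the result rounds to .866, in agreement with the solution.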

Tableau – Filter Operations

Any data analysis and visualization work involves extensive filtering of data. Tableau has a wide variety of filtering options to address these needs. There are many built-in functions for applying filters on the records using both dimensions and measures. The filter option for measures offers numeric calculations and comparison. The filter option for dimensions offers choosing string values from a list or using a custom list of values. In this chapter, you will learn about the various options as well as the steps to edit and clear the filters.

Creating Filters

Filters are created by dragging the required field to the Filters shelf located above the Marks card. Create a horizontal bar chart by dragging the measure Sales to the Columns shelf and the dimension Sub-Category to the Rows shelf. Then drag the measure Sales to the Filters shelf. Once this filter is created, right-click it and choose the Edit Filter option from the pop-up menu.

Creating Filters for Measures

Measures are numeric fields, so the filter options for such fields involve choosing values. Tableau offers the following types of filters for measures.

Range of Values: Specifies the minimum and maximum values of the range to include in the view.
At Least: Includes all values that are greater than or equal to a specified minimum value.
At Most: Includes all values that are less than or equal to a specified maximum value.
Special: Helps you filter on Null values. Include only Null values, Non-null values, or All Values.

The following worksheet shows these options.

Creating Filters for Dimensions

Dimensions are descriptive fields whose values are strings. Tableau offers the following types of filters for dimensions.

General Filter: lets you select specific values from a list.
Wildcard Filter: lets you specify wildcards such as cha* to match all string values starting with cha.
Condition Filter: applies conditions such as sum of sales.
Top Filter: chooses the records representing a range of top values.

The following worksheet shows these options.

Clearing Filters

Filters can be easily removed by choosing the Clear Filter option as shown in the following screenshot.
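The semantics of the measure filter types and of the wildcard filter can be sketched in plain Python. This is an illustration of the definitions listed in this chapter, not Tableau's implementation; the function names and sample values are hypothetical, and the wildcard match here is deliberately case-sensitive.

```python
from fnmatch import fnmatchcase


def range_of_values(values, low, high):
    """Keep non-null values between low and high, inclusive."""
    return [v for v in values if v is not None and low <= v <= high]


def at_least(values, minimum):
    """Keep non-null values greater than or equal to minimum."""
    return [v for v in values if v is not None and v >= minimum]


def at_most(values, maximum):
    """Keep non-null values less than or equal to maximum."""
    return [v for v in values if v is not None and v <= maximum]


def special(values, mode):
    """Null handling: mode is 'null', 'non-null', or 'all'."""
    if mode == "null":
        return [v for v in values if v is None]
    if mode == "non-null":
        return [v for v in values if v is not None]
    return list(values)


def wildcard(names, pattern):
    """Keep names matching a shell-style pattern such as 'cha*'."""
    return [n for n in names if fnmatchcase(n, pattern)]


# Hypothetical sample data; None stands in for a Null value.
sales = [120, None, 45, 300, 80]
names = ["chairs", "chains", "tables"]

print(range_of_values(sales, 50, 200))
print(at_least(sales, 100))
print(at_most(sales, 80))
print(special(sales, "null"))
print(wildcard(names, "cha*"))
```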

Tableau – Top Filters

The Top option in a Tableau filter is used to limit the result set from a filter. For example, from a large set of sales records you may want only the top 10 values. You can apply this filter either by using the built-in options, which limit the records in several ways, or by creating a formula. In this chapter, you will explore the built-in options.

Creating a Top Filter

Using the Sample – Superstore data source, find the sub-categories of products that represent the top 5 sales amounts. The steps are as follows.

Step 1: Drag the dimension Sub-Category to the Rows shelf and the measure Sales to the Columns shelf. Choose the horizontal bar as the chart type. Tableau shows the following chart.

Step 2: Right-click the field Sub-Category, open its filter dialog, and go to the tab named Top. Here, choose the second radio option, By field. From the drop-down, choose the option Top 5 by Sum of Sales.

On completion of the above step, you get the following chart, which shows the top 5 sub-categories of products by sales.
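The "Top 5 by Sum of Sales" option amounts to grouping, summing, and keeping the largest totals. A minimal Python sketch of that logic follows; the sub-category names and sales figures are invented sample values, and `top_n_by_sum` is a hypothetical helper name.

```python
from collections import Counter


def top_n_by_sum(rows, n):
    """Sum values per key and return the n largest (key, total) pairs."""
    totals = Counter()
    for key, value in rows:
        totals[key] += value
    # most_common sorts by total, largest first.
    return totals.most_common(n)


# Hypothetical (sub_category, sales) records.
rows = [
    ("Phones", 200),
    ("Chairs", 328),
    ("Phones", 130),
    ("Storage", 224),
    ("Tables", 207),
    ("Binders", 203),
    ("Labels", 12),
]

top5 = top_n_by_sum(rows, 5)
print(top5)
```

Six sub-categories go in and five come out; the one with the smallest total (Labels in this sample) is dropped.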

Zookeeper – Installation

Before installing ZooKeeper, make sure your system is running one of the following operating systems:

Any Linux OS: Supports development and deployment. It is preferred for demo applications.
Windows OS: Supports only development.
Mac OS: Supports only development.

The ZooKeeper server is written in Java and runs on the JVM. You need to use JDK 6 or greater.

Now, follow the steps given below to install the ZooKeeper framework on your machine.

Step 1: Verifying Java Installation

We assume you already have a Java environment installed on your system. Verify it using the following command.

$ java -version

If Java is installed on your machine, the command prints the installed Java version. Otherwise, follow the steps given below to install the latest version of Java.

Step 1.1: Download JDK

Download the latest version of the JDK from the Java download page. The latest version (while writing this tutorial) is JDK 8u60 and the file is “jdk-8u60-linux-x64.tar.gz”. Please download the file to your machine.

Step 1.2: Extract the files

Generally, files are downloaded to the downloads folder. Verify it and extract the tar archive using the following commands.

$ cd /go/to/download/path
$ tar -zxf jdk-8u60-linux-x64.tar.gz

Step 1.3: Move to opt directory

To make Java available to all users, move the extracted Java content to the “/opt/jdk” folder.

$ su
password: (type password of root user)
$ mkdir /opt/jdk
$ mv jdk1.8.0_60 /opt/jdk/

Step 1.4: Set path

To set the PATH and JAVA_HOME variables, add the following commands to the ~/.bashrc file.

export JAVA_HOME=/opt/jdk/jdk1.8.0_60
export PATH=$PATH:$JAVA_HOME/bin

Now, apply the changes to the current running system.

$ source ~/.bashrc

Step 1.5: Java alternatives

Use the following command to change Java alternatives.

update-alternatives --install /usr/bin/java java /opt/jdk/jdk1.8.0_60/bin/java 100

Step 1.6

Verify the Java installation using the verification command (java -version) explained in Step 1.

Step 2: ZooKeeper Framework Installation

Step 2.1: Download ZooKeeper

To install the ZooKeeper framework on your machine, visit the following link and download the latest version of ZooKeeper.

http://zookeeper.apache.org/releases.html

As of now, the latest version of ZooKeeper is 3.4.6 (zookeeper-3.4.6.tar.gz).

Step 2.2: Extract the tar file

Extract the tar file using the following commands:

$ cd opt/
$ tar -zxf zookeeper-3.4.6.tar.gz
$ cd zookeeper-3.4.6
$ mkdir data

Step 2.3: Create configuration file

Open the configuration file named conf/zoo.cfg using the command vi conf/zoo.cfg and set the following parameters as a starting point.

$ vi conf/zoo.cfg

tickTime = 2000
dataDir = /path/to/zookeeper/data
clientPort = 2181
initLimit = 5
syncLimit = 2

Once the configuration file has been saved, return to the terminal. You can now start the ZooKeeper server.

Step 2.4: Start ZooKeeper server

Execute the following command:

$ bin/zkServer.sh start

After executing this command, you will get a response as follows:

JMX enabled by default
Using config: /Users/../zookeeper-3.4.6/bin/../conf/zoo.cfg
Starting zookeeper … STARTED

Step 2.5: Start CLI

Type the following command:

$ bin/zkCli.sh

After typing the above command, you will be connected to the ZooKeeper server and you should get the following response.

Connecting to localhost:2181
…………….
…………….
…………….
Welcome to ZooKeeper!
…………….
…………….
WATCHER::
WatchedEvent state:SyncConnected type: None path:null
[zk: localhost:2181(CONNECTED) 0]

Stop ZooKeeper Server

After connecting to the server and performing all the operations, you can stop the ZooKeeper server by using the following command.

$ bin/zkServer.sh stop
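The chapter lists the zoo.cfg parameters without explaining their units. As background taken from the ZooKeeper documentation rather than from this chapter: tickTime is given in milliseconds, while initLimit and syncLimit are counted in ticks, so their effective timeouts are multiples of tickTime (these two limits matter for servers running as an ensemble). The Python sketch below parses the sample settings and derives those timeouts; the parser is a simplified, hypothetical helper, not ZooKeeper's own configuration loader.

```python
def parse_zoo_cfg(text):
    """Parse simple 'key = value' lines, skipping blanks and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


# The sample settings from Step 2.3 (dataDir is the placeholder path).
cfg_text = """
tickTime = 2000
dataDir = /path/to/zookeeper/data
clientPort = 2181
initLimit = 5
syncLimit = 2
"""

cfg = parse_zoo_cfg(cfg_text)
tick_ms = int(cfg["tickTime"])

# Limits are expressed in ticks, so multiply by the tick length.
init_timeout_ms = tick_ms * int(cfg["initLimit"])
sync_timeout_ms = tick_ms * int(cfg["syncLimit"])

print(f"client port: {cfg['clientPort']}")
print(f"init timeout: {init_timeout_ms} ms")
print(f"sync timeout: {sync_timeout_ms} ms")
```

With the values above, the initial synchronization window works out to 10 seconds and the sync window to 4 seconds.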

Tableau – Extracting Data

Data extraction in Tableau creates a subset of data from the data source. This is useful for improving performance by applying filters. It also makes it possible to apply Tableau features that may not be available in the data source itself, such as finding the distinct values in the data. However, the data extract feature is most frequently used to create an extract that is stored on the local drive for offline access by Tableau.

Creating an Extract

Extraction of data is done through the menu Data → Extract Data. It offers many options, such as limiting how many rows are extracted and choosing whether to aggregate data for dimensions. The following screen shows the Extract Data option.

Applying Extract Filters

To extract a subset of data from the data source, you can create filters that return only the relevant rows. Let’s consider the Sample Superstore data set and create an extract. In the filter option, choose Select from list and tick the checkbox for each value whose data you need to pull from the source.

Adding New Data to Extract

To add more data to an already created extract, choose the option Data → Extract → Append Data from File. Then browse to the file containing the data and click OK to finish. The number and data types of the columns in the file must match the existing data.

Extract History

You can check the history of data extracts to see how many times the extract has been run and at what times. For this, use the menu Data → Extract History.

Sqoop – Export

This chapter describes how to export data back from HDFS to an RDBMS database. The target table must exist in the target database. The files given as input to Sqoop contain records, which correspond to rows in the table. Sqoop reads these files and parses them into a set of records according to a user-specified delimiter.

The default operation is to insert all the records from the input files into the database table using the INSERT statement. In update mode, Sqoop generates UPDATE statements that replace existing records in the database.

Syntax

The following is the syntax for the export command.

$ sqoop export (generic-args) (export-args)
$ sqoop-export (generic-args) (export-args)

Example

Let us take an example of employee data in a file in HDFS. The employee data is available in the emp_data file in the ‘emp/’ directory in HDFS. The emp_data is as follows.

1201, gopal, manager, 50000, TP
1202, manisha, preader, 50000, TP
1203, kalil, php dev, 30000, AC
1204, prasanth, php dev, 30000, AC
1205, kranthi, admin, 20000, TP
1206, satish p, grp des, 20000, GR

It is mandatory that the table receiving the exported data is created manually and is present in the target database.

The following query is used to create the table ‘employee’ in the mysql command line.

$ mysql
mysql> USE db;
mysql> CREATE TABLE employee (
   id INT NOT NULL PRIMARY KEY,
   name VARCHAR(20),
   deg VARCHAR(20),
   salary INT,
   dept VARCHAR(10));

The following command is used to export the table data (which is in the emp_data file on HDFS) to the employee table in the db database of the MySQL database server.

$ sqoop export --connect jdbc:mysql://localhost/db --username root --table employee --export-dir /emp/emp_data

The following command is used to verify the table in the mysql command line.

mysql> select * from employee;

If the given data is stored successfully, then you can find the following table of given employee data.

+------+----------+-------------+--------+------+
| Id   | Name     | Designation | Salary | Dept |
+------+----------+-------------+--------+------+
| 1201 | gopal    | manager     | 50000  | TP   |
| 1202 | manisha  | preader     | 50000  | TP   |
| 1203 | kalil    | php dev     | 30000  | AC   |
| 1204 | prasanth | php dev     | 30000  | AC   |
| 1205 | kranthi  | admin       | 20000  | TP   |
| 1206 | satish p | grp des     | 20000  | GR   |
+------+----------+-------------+--------+------+
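The parse-then-insert flow described in this chapter can be imitated on a small scale without Hadoop. The Python sketch below is an analogy only, not how Sqoop is implemented: it splits the sample emp_data records on the comma delimiter, inserts them into an in-memory SQLite table with the same columns as the employee table, and then issues one UPDATE to mimic update mode. The new salary value is invented for the illustration. In Sqoop itself, update mode is selected with the --update-key argument.

```python
import csv
import io
import sqlite3

# The sample records from emp_data, one record per line.
emp_data = """1201, gopal, manager, 50000, TP
1202, manisha, preader, 50000, TP
1203, kalil, php dev, 30000, AC
1204, prasanth, php dev, 30000, AC
1205, kranthi, admin, 20000, TP
1206, satish p, grp des, 20000, GR
"""

# In-memory stand-in for the target database; the table must exist first.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "id INT NOT NULL PRIMARY KEY, name VARCHAR(20), "
    "deg VARCHAR(20), salary INT, dept VARCHAR(10))"
)

# Parse each line into a record using the delimiter.
reader = csv.reader(io.StringIO(emp_data), delimiter=",", skipinitialspace=True)
records = [(int(r[0]), r[1], r[2], int(r[3]), r[4]) for r in reader if r]

# Default (insert) mode: one INSERT per parsed record.
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?, ?)", records)

# Update mode analogue: replace an existing row, matched on its key.
# The salary 55000 is a made-up value for this sketch.
conn.execute("UPDATE employee SET salary = ? WHERE id = ?", (55000, 1201))

rows = conn.execute("SELECT * FROM employee ORDER BY id").fetchall()
for row in rows:
    print(row)
```

Running the inserts a second time would fail on the primary key, which is why an export of changed records needs update mode rather than the default insert mode.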