Apache Tajo – Database Creation

This section explains the Tajo DDL commands. Tajo has a built-in database named default.

Create Database Statement

Create Database is a statement used to create a database in Tajo. The syntax for this statement is as follows −

CREATE DATABASE [IF NOT EXISTS] <database_name>

Query

default> create database if not exists test;

Result

The above query will generate the following result.

OK

A database is a namespace in Tajo. A database can contain multiple tables, each with a unique name.

Show Current Database

To check the current database name, issue the \c meta command with no argument −

Query

default> \c

Result

The above query will generate the following result.

You are now connected to database "default" as user "user1".
default>

Connect to Database

As of now, you have created a database named "test". The following syntax is used to connect to the "test" database.

\c <database_name>

Query

default> \c test

Result

The above query will generate the following result.

You are now connected to database "test" as user "user1".
test>

You can now see the prompt change from the default database to the test database.

Drop Database

To drop a database, use the following syntax −

DROP DATABASE <database_name>

Query

test> \c default
You are now connected to database "default" as user "user1".
default> drop database test;

Result

The above query will generate the following result.

OK
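Because each database is a separate namespace, a table in one database can be queried from another by qualifying its name with the database name. The following is a minimal sketch; the table table1 and its columns are hypothetical and do not appear elsewhere in this tutorial.

default> \c test
test> create table table1 (id int, name text);
test> \c default
default> select id, name from test.table1;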
Apache Pig – Date-time Functions

Apache Pig provides the following Date and Time functions −

1. ToDate(milliseconds) − Returns a date-time object according to the given parameters. The other overloads of this function are ToDate(isostring), ToDate(userstring, format), and ToDate(userstring, format, timezone).
2. CurrentTime() − Returns the date-time object of the current time.
3. GetDay(datetime) − Returns the day of a month from the date-time object.
4. GetHour(datetime) − Returns the hour of a day from the date-time object.
5. GetMilliSecond(datetime) − Returns the millisecond of a second from the date-time object.
6. GetMinute(datetime) − Returns the minute of an hour from the date-time object.
7. GetMonth(datetime) − Returns the month of a year from the date-time object.
8. GetSecond(datetime) − Returns the second of a minute from the date-time object.
9. GetWeek(datetime) − Returns the week of a year from the date-time object.
10. GetWeekYear(datetime) − Returns the week year from the date-time object.
11. GetYear(datetime) − Returns the year from the date-time object.
12. AddDuration(datetime, duration) − Adds the duration to the date-time object and returns the result.
13. SubtractDuration(datetime, duration) − Subtracts the duration object from the date-time object and returns the result.
14. DaysBetween(datetime1, datetime2) − Returns the number of days between the two date-time objects.
15. HoursBetween(datetime1, datetime2) − Returns the number of hours between two date-time objects.
16. MilliSecondsBetween(datetime1, datetime2) − Returns the number of milliseconds between two date-time objects.
17. MinutesBetween(datetime1, datetime2) − Returns the number of minutes between two date-time objects.
18. MonthsBetween(datetime1, datetime2) − Returns the number of months between two date-time objects.
19. SecondsBetween(datetime1, datetime2) − Returns the number of seconds between two date-time objects.
20. WeeksBetween(datetime1, datetime2) − Returns the number of weeks between two date-time objects.
21. YearsBetween(datetime1, datetime2) − Returns the number of years between two date-time objects.
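As a quick illustration of how these functions combine in a script, the following hedged sketch parses date strings and extracts their parts. The HDFS file date_data.txt and its schema are hypothetical and not part of this tutorial.

grunt> date_data = LOAD 'hdfs://localhost:9000/pig_data/date_data.txt' USING PigStorage(',') AS (id:int, date_str:chararray);
grunt> parsed = FOREACH date_data GENERATE id, ToDate(date_str, 'yyyy-MM-dd') AS dt;
grunt> parts = FOREACH parsed GENERATE id, GetYear(dt) AS year, GetMonth(dt) AS month, GetDay(dt) AS day;
grunt> Dump parts;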
Apache Pig – Storing Data

In the previous chapter, we learnt how to load data into Apache Pig. You can store the loaded data in the file system using the Store operator. This chapter explains how to store data in Apache Pig using the Store operator.

Syntax

Given below is the syntax of the Store statement.

STORE Relation_name INTO 'required_directory_path' [USING function];

Example

Assume we have a file student_data.txt in HDFS with the following content.

001,Rajiv,Reddy,9848022337,Hyderabad
002,siddarth,Battacharya,9848022338,Kolkata
003,Rajesh,Khanna,9848022339,Delhi
004,Preethi,Agarwal,9848022330,Pune
005,Trupthi,Mohanthy,9848022336,Bhuwaneshwar
006,Archana,Mishra,9848022335,Chennai

And we have read it into a relation student using the LOAD operator as shown below.

grunt> student = LOAD 'hdfs://localhost:9000/pig_data/student_data.txt'
   USING PigStorage(',')
   AS (id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray);

Now, let us store the relation in the HDFS directory "/pig_Output/" as shown below.

grunt> STORE student INTO 'hdfs://localhost:9000/pig_Output/' USING PigStorage(',');

Output

After executing the Store statement, you will get the following output. A directory is created with the specified name and the data is stored in it.

2015-10-05 13:05:05,429 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2015-10-05 13:05:05,429 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:

HadoopVersion   PigVersion   UserId   StartedAt             FinishedAt            Features
2.6.0           0.15.0       Hadoop   2015-10-05 13:03:03   2015-10-05 13:05:05   UNKNOWN

Success!

Job Stats (time in seconds):

JobId          Maps   Reduces   MaxMapTime   MinMapTime   AvgMapTime   MedianMapTime
job_14459_06   1      0         n/a          n/a          n/a          n/a

MaxReduceTime   MinReduceTime   AvgReduceTime   MedianReducetime   Alias     Feature
0               0               0               0                  student   MAP_ONLY

Output folder: hdfs://localhost:9000/pig_Output/

Input(s): Successfully read 0 records from: "hdfs://localhost:9000/pig_data/student_data.txt"
Output(s): Successfully stored 0 records in: "hdfs://localhost:9000/pig_Output"

Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled : 0
Total records proactively spilled : 0

Job DAG: job_1443519499159_0006

2015-10-05 13:06:06,192 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!

Verification

You can verify the stored data as shown below.

Step 1

First of all, list out the files in the directory named pig_Output using the ls command as shown below.

hdfs dfs -ls 'hdfs://localhost:9000/pig_Output/'

Found 2 items
-rw-r--r--   1 Hadoop supergroup     0 2015-10-05 13:03 hdfs://localhost:9000/pig_Output/_SUCCESS
-rw-r--r--   1 Hadoop supergroup   224 2015-10-05 13:03 hdfs://localhost:9000/pig_Output/part-m-00000

You can observe that two files were created after executing the Store statement.

Step 2

Using the cat command, list the contents of the file named part-m-00000 as shown below.

$ hdfs dfs -cat 'hdfs://localhost:9000/pig_Output/part-m-00000'

1,Rajiv,Reddy,9848022337,Hyderabad
2,siddarth,Battacharya,9848022338,Kolkata
3,Rajesh,Khanna,9848022339,Delhi
4,Preethi,Agarwal,9848022330,Pune
5,Trupthi,Mohanthy,9848022336,Bhuwaneshwar
6,Archana,Mishra,9848022335,Chennai
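The USING clause controls the on-disk format, so the same relation can be written with a different field delimiter simply by passing it to PigStorage. A short hedged variant of the statement above; the output directory pig_Output_tsv is a hypothetical name chosen for this sketch.

grunt> STORE student INTO 'hdfs://localhost:9000/pig_Output_tsv/' USING PigStorage('\t');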
Apache Tajo – Installation

To install Apache Tajo, you must have the following software on your system −

Hadoop version 2.3 or greater
Java version 1.7 or higher
Linux or Mac OS

Let us now continue with the following steps to install Tajo.

Verifying Java Installation

Hopefully, you have already installed Java version 8 on your machine. Now, you just need to proceed by verifying it. To verify, use the following command −

$ java -version

If Java is successfully installed on your machine, you will see the version of the installed Java. If Java is not installed, follow these steps to install Java 8 on your machine.

Download JDK

Download the latest version of the JDK by visiting the following link −

https://www.oracle.com

The latest version is JDK 8u92 and the file is "jdk-8u92-linux-x64.tar.gz". Download the file to your machine, extract the files, and move them to a specific directory. Then set the Java alternatives. Java is now installed on your machine.

Verifying Hadoop Installation

You have already installed Hadoop on your system. Now, verify it using the following command −

$ hadoop version

If everything is fine with your setup, you will see the version of Hadoop. If Hadoop is not installed, download and install Hadoop by visiting the following link −

https://www.apache.org

Apache Tajo Installation

Apache Tajo provides two execution modes − local mode and fully distributed mode. After verifying the Java and Hadoop installations, proceed with the following steps to install a Tajo cluster on your machine. A local-mode Tajo instance requires only very simple configuration.

Download the latest version of Tajo by visiting the following link −

https://www.apache.org/dyn/closer.cgi/tajo

Now, download the file "tajo-0.11.3.tar.gz" to your machine.

Extract Tar File

Extract the tar file using the following commands −

$ cd opt/
$ tar -xzvf tajo-0.11.3.tar.gz
$ cd tajo-0.11.3

Set Environment Variables

Add the following changes to the "conf/tajo-env.sh" file −

$ cd tajo-0.11.3
$ vi conf/tajo-env.sh

# Hadoop home. Required
export HADOOP_HOME=/Users/path/to/Hadoop/hadoop-2.6.2

# The java implementation to use. Required.
export JAVA_HOME=/path/to/jdk1.8.0_92.jdk/

Here, you must specify the Hadoop and Java paths in the "tajo-env.sh" file. After making the changes, save the file and quit the terminal.

Start Tajo Server

To launch the Tajo server, execute the following command −

$ bin/start-tajo.sh

You will receive a response similar to the following −

Starting single TajoMaster
starting master, logging to /Users/path/to/Tajo/tajo-0.11.3/bin/../
localhost: starting worker, logging to /Users/path/to/Tajo/tajo-0.11.3/bin/../logs/

Tajo master web UI: http://localhost:26080
Tajo Client Service: localhost:26002

Now, type the command "jps" to see the running daemons.

$ jps
1010 TajoWorker
1140 Jps
933 TajoMaster

Launch the Tajo Shell (tsql)

To launch the Tajo shell client, use the following command −

$ bin/tsql

You will see a welcome banner similar to the following −

welcome to tsql 0.11.3
Try \? for help.

Quit Tajo Shell

Execute the following command to quit tsql −

default> \q
bye!

Here, default refers to the built-in database in Tajo.

Web UI

Type the following URL to launch the Tajo web UI −

http://localhost:26080/

You will now see a screen showing the cluster status along with an ExecuteQuery option.
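Once the shell is up, you can run a quick sanity check before moving on. This is a minimal sketch relying on the tsql meta commands \l (list databases) and \d (list tables in the current database); like \c and \q, they are assumed here to behave as in Tajo 0.11.

$ bin/tsql
default> \l
default> \d
default> \q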
Stop Tajo

To stop the Tajo server, use the following command −

$ bin/stop-tajo.sh

You will get the following response −

localhost: stopping worker
stopping master
Apache Tajo – Configuration Settings

Tajo's configuration is based on Hadoop's configuration system. This chapter explains the Tajo configuration settings in detail.

Basic Settings

Tajo uses the following two config files −

catalog-site.xml − configuration for the catalog server.
tajo-site.xml − configuration for other Tajo modules.

Distributed Mode Configuration

A distributed-mode setup runs on the Hadoop Distributed File System (HDFS). Let us follow the steps below to configure a Tajo distributed-mode setup.

tajo-site.xml

This file is available in the /path/to/tajo/conf directory and acts as the configuration for the other Tajo modules. To access Tajo in distributed mode, apply the following changes to "tajo-site.xml" −

<property>
   <name>tajo.rootdir</name>
   <value>hdfs://hostname:port/tajo</value>
</property>

<property>
   <name>tajo.master.umbilical-rpc.address</name>
   <value>hostname:26001</value>
</property>

<property>
   <name>tajo.master.client-rpc.address</name>
   <value>hostname:26002</value>
</property>

<property>
   <name>tajo.catalog.client-rpc.address</name>
   <value>hostname:26005</value>
</property>

Master Node Configuration

Tajo uses HDFS as its primary storage type. The configuration is as follows and should be added to "tajo-site.xml" −

<property>
   <name>tajo.rootdir</name>
   <value>hdfs://namenode_hostname:port/path</value>
</property>

Catalog Configuration

If you want to customize the catalog service, copy $path/to/Tajo/conf/catalog-site.xml.template to $path/to/Tajo/conf/catalog-site.xml and add any of the following configurations as needed. For example, if you use the "Hive catalog store" to access Tajo, the configuration should be as follows −

<property>
   <name>tajo.catalog.store.class</name>
   <value>org.apache.tajo.catalog.store.HCatalogStore</value>
</property>

If you need to use a MySQL catalog store, apply the following changes −

<property>
   <name>tajo.catalog.store.class</name>
   <value>org.apache.tajo.catalog.store.MySQLStore</value>
</property>

<property>
   <name>tajo.catalog.jdbc.connection.id</name>
   <value><mysql user name></value>
</property>

<property>
   <name>tajo.catalog.jdbc.connection.password</name>
   <value><mysql user password></value>
</property>

<property>
   <name>tajo.catalog.jdbc.uri</name>
   <value>jdbc:mysql://<mysql host name>:<mysql port>/<database name for tajo>?createDatabaseIfNotExist=true</value>
</property>

Similarly, you can register the other Tajo-supported catalogs in the configuration file.

Worker Configuration

By default, the TajoWorker stores temporary data on the local file system. It is defined in the "tajo-site.xml" file as follows −

<property>
   <name>tajo.worker.tmpdir.locations</name>
   <value>/disk1/tmpdir,/disk2/tmpdir,/disk3/tmpdir</value>
</property>

To increase the number of tasks each worker can run, use the following configuration −

<property>
   <name>tajo.worker.resource.cpu-cores</name>
   <value>12</value>
</property>

<property>
   <name>tajo.task.resource.min.memory-mb</name>
   <value>2000</value>
</property>

<property>
   <name>tajo.worker.resource.disks</name>
   <value>4</value>
</property>

To make the Tajo worker run in dedicated mode, use the following configuration −

<property>
   <name>tajo.worker.resource.dedicated</name>
   <value>true</value>
</property>
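As with most Hadoop-style services, Tajo reads its configuration at startup, so after editing "tajo-site.xml" or "catalog-site.xml" the daemons must be restarted for the changes to take effect, using the scripts from the installation chapter −

$ bin/stop-tajo.sh
$ bin/start-tajo.sh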
Apache Pig – Load & Store Functions

The Load and Store functions in Apache Pig are used to determine how data goes into and comes out of Pig. These functions are used with the load and store operators. Given below is the list of load and store functions available in Pig.

1. PigStorage() − To load and store structured files.
2. TextLoader() − To load unstructured data into Pig.
3. BinStorage() − To load and store data into Pig using a machine-readable format.
4. Handling Compression − In Pig Latin, we can load and store compressed data.
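The sketch below contrasts PigStorage() and TextLoader(), and stores a relation back with a different delimiter. The HDFS files student_data.txt and log.txt and the output directory pig_out_pipe are hypothetical names chosen for illustration.

grunt> student = LOAD 'hdfs://localhost:9000/pig_data/student_data.txt' USING PigStorage(',') AS (id:int, name:chararray);
grunt> log_lines = LOAD 'hdfs://localhost:9000/pig_data/log.txt' USING TextLoader() AS (line:chararray);
grunt> STORE student INTO 'hdfs://localhost:9000/pig_out_pipe/' USING PigStorage('|');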
Apache Pig – Limit Operator

The LIMIT operator is used to get a limited number of tuples from a relation.

Syntax

Given below is the syntax of the LIMIT operator.

grunt> Result = LIMIT Relation_name required_number_of_tuples;

Example

Assume that we have a file named student_details.txt in the HDFS directory /pig_data/ as shown below.

student_details.txt

001,Rajiv,Reddy,21,9848022337,Hyderabad
002,siddarth,Battacharya,22,9848022338,Kolkata
003,Rajesh,Khanna,22,9848022339,Delhi
004,Preethi,Agarwal,21,9848022330,Pune
005,Trupthi,Mohanthy,23,9848022336,Bhuwaneshwar
006,Archana,Mishra,23,9848022335,Chennai
007,Komal,Nayak,24,9848022334,trivendram
008,Bharathi,Nambiayar,24,9848022333,Chennai

And we have loaded this file into Pig with the relation name student_details as shown below.

grunt> student_details = LOAD 'hdfs://localhost:9000/pig_data/student_details.txt' USING PigStorage(',')
   AS (id:int, firstname:chararray, lastname:chararray, age:int, phone:chararray, city:chararray);

Now, let us get the first four tuples of the relation and store them in another relation named limit_data using the LIMIT operator as shown below.

grunt> limit_data = LIMIT student_details 4;

Verification

Verify the relation limit_data using the DUMP operator as shown below.

grunt> Dump limit_data;

Output

It will produce the following output, displaying the contents of the relation limit_data as follows.

(1,Rajiv,Reddy,21,9848022337,Hyderabad)
(2,siddarth,Battacharya,22,9848022338,Kolkata)
(3,Rajesh,Khanna,22,9848022339,Delhi)
(4,Preethi,Agarwal,21,9848022330,Pune)
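By itself, LIMIT makes no guarantee about which tuples are returned. To fetch a deterministic "top N", first sort the relation with the ORDER BY operator and then apply LIMIT, as in the following sketch (the relation names ordered_data and top3 are chosen for illustration):

grunt> ordered_data = ORDER student_details BY age DESC;
grunt> top3 = LIMIT ordered_data 3;
grunt> Dump top3;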
Pig Latin – Basics

Pig Latin is the language used to analyze data in Hadoop using Apache Pig. In this chapter, we are going to discuss the basics of Pig Latin such as Pig Latin statements, data types, general and relational operators, and Pig Latin UDFs.

Pig Latin – Data Model

As discussed in the previous chapters, the data model of Pig is fully nested. A Relation is the outermost structure of the Pig Latin data model. And it is a bag where −

A bag is a collection of tuples.
A tuple is an ordered set of fields.
A field is a piece of data.

Pig Latin – Statements

While processing data using Pig Latin, statements are the basic constructs. These statements work with relations. They include expressions and schemas. Every statement ends with a semicolon (;). We will perform various operations using operators provided by Pig Latin, through statements. Except LOAD and STORE, while performing all other operations, Pig Latin statements take a relation as input and produce another relation as output. As soon as you enter a Load statement in the Grunt shell, its semantic checking will be carried out. To see the contents of a relation, you need to use the Dump operator. Only after performing the dump operation will the MapReduce job that actually loads the data be carried out.

Example

Given below is a Pig Latin statement, which loads data into Apache Pig.

grunt> Student_data = LOAD 'student_data.txt' USING PigStorage(',') AS
   (id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray);

Pig Latin – Data Types

The following table describes the Pig Latin data types.

1. int − Represents a signed 32-bit integer. Example: 8
2. long − Represents a signed 64-bit integer. Example: 5L
3. float − Represents a signed 32-bit floating point. Example: 5.5F
4. double − Represents a 64-bit floating point. Example: 10.5
5. chararray − Represents a character array (string) in Unicode UTF-8 format. Example: 'tutorials point'
6. bytearray − Represents a byte array (blob).
7. boolean − Represents a Boolean value. Example: true/false
8. datetime − Represents a date-time. Example: 1970-01-01T00:00:00.000+00:00
9. biginteger − Represents a Java BigInteger. Example: 60708090709
10. bigdecimal − Represents a Java BigDecimal. Example: 185.98376256272893883

Complex Types

11. tuple − A tuple is an ordered set of fields. Example: (raja, 30)
12. bag − A bag is a collection of tuples. Example: {(raju,30),(Mohhammad,45)}
13. map − A map is a set of key-value pairs. Example: ['name'#'Raju', 'age'#30]

Null Values

Values for all the above data types can be NULL. Apache Pig treats null values in a similar way as SQL does. A null can be an unknown value or a non-existent value. It is used as a placeholder for optional values. These nulls can occur naturally or can be the result of an operation.

Pig Latin – Arithmetic Operators

The following table describes the arithmetic operators of Pig Latin. Suppose a = 10 and b = 20.

+ : Addition − Adds values on either side of the operator. Example: a + b will give 30
− : Subtraction − Subtracts the right-hand operand from the left-hand operand. Example: a − b will give −10
* : Multiplication − Multiplies values on either side of the operator. Example: a * b will give 200
/ : Division − Divides the left-hand operand by the right-hand operand. Example: b / a will give 2
% : Modulus − Divides the left-hand operand by the right-hand operand and returns the remainder. Example: b % a will give 0
? : Bincond − Evaluates a Boolean expression and returns one of two values. It has three operands, as shown below.
variable x = (expression) ? value1 (if true) : value2 (if false)

b = (a == 1) ? 20 : 30;
If a == 1, the value of b is 20.
If a != 1, the value of b is 30.

CASE WHEN THEN ELSE END : Case − The CASE operator is equivalent to a nested bincond operator. Example:

CASE f2 % 2
   WHEN 0 THEN 'even'
   WHEN 1 THEN 'odd'
END

Pig Latin – Comparison Operators

The following table describes the comparison operators of Pig Latin.

== : Equal − Checks whether the values of two operands are equal; if yes, the condition becomes true. Example: (a == b) is not true.
!= : Not Equal − Checks whether the values of two operands are equal; if the values are not equal, the condition becomes true. Example: (a != b) is true.
> : Greater than − Checks whether the value of the left operand is greater than the value of the right operand. If yes, the condition becomes true. Example: (a > b) is not true.
< : Less than − Checks whether the value of the left operand is less than the value of the right operand. If yes, the condition becomes true. Example: (a < b) is true.
>= : Greater than or equal to − Checks whether the value of the left operand is greater than or equal to the value of the right operand. If yes, the condition becomes true. Example: (a >= b) is not true.
<= : Less than or equal to − Checks whether the value of the left operand is less than or equal to the value of the right operand. If yes, the condition becomes true. Example: (a <= b) is true.
matches : Pattern matching − Checks whether the string on the left-hand side matches the constant on the right-hand side. Example: f1 matches '.*tutorial.*'

Pig Latin – Type Construction Operators

The following table describes the type construction operators of Pig Latin.

() : Tuple constructor operator − Used to construct a tuple. Example: (Raju, 30)
{} : Bag constructor operator − Used to construct a bag. Example: {(Raju, 30), (Mohammad, 45)}
[] : Map constructor operator − Used to construct a map. Example: [name#Raja, age#30]

Pig Latin – Relational Operations

The following table describes the relational operators of Pig Latin.

LOAD, STORE − To load data from and store data to the file system.
FILTER, DISTINCT, FOREACH GENERATE, STREAM − To filter and transform data.
JOIN, COGROUP, GROUP, CROSS − To group and join two or more relations.
ORDER, LIMIT − To sort a relation and to get a limited number of tuples from it.
UNION, SPLIT − To combine two or more relations and to split a relation into two or more.
DUMP, DESCRIBE, EXPLAIN, ILLUSTRATE − Diagnostic operators to verify statements.
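To see how the data types and the bincond operator fit together in a script, here is a minimal hedged sketch; the file student_age.txt, its schema, and the relation names are hypothetical.

grunt> students = LOAD 'student_age.txt' USING PigStorage(',') AS (id:int, name:chararray, age:int);
grunt> labelled = FOREACH students GENERATE id, name, (age >= 21 ? 'adult' : 'minor') AS category;
grunt> Dump labelled;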
Apache Pig – Illustrate Operator

The illustrate operator gives you the step-by-step execution of a sequence of statements.

Syntax

Given below is the syntax of the illustrate operator.

grunt> illustrate Relation_name;

Example

Assume we have a file student_data.txt in HDFS with the following content.

001,Rajiv,Reddy,9848022337,Hyderabad
002,siddarth,Battacharya,9848022338,Kolkata
003,Rajesh,Khanna,9848022339,Delhi
004,Preethi,Agarwal,9848022330,Pune
005,Trupthi,Mohanthy,9848022336,Bhuwaneshwar
006,Archana,Mishra,9848022335,Chennai

And we have read it into a relation student using the LOAD operator as shown below.

grunt> student = LOAD 'hdfs://localhost:9000/pig_data/student_data.txt'
   USING PigStorage(',')
   AS (id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray);

Now, let us illustrate the relation named student as shown below.

grunt> illustrate student;

Output

On executing the above statement, you will get the following output.

INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being processed per job phase (AliasName[line,offset]): M: student[1,10] C:  R:

----------------------------------------------------------------------------------------------------
| student | id:int | firstname:chararray | lastname:chararray | phone:chararray | city:chararray    |
----------------------------------------------------------------------------------------------------
|         | 002    | siddarth            | Battacharya        | 9848022338      | Kolkata           |
----------------------------------------------------------------------------------------------------
Apache Pig – Diagnostic Operators

The load statement will simply load the data into the specified relation in Apache Pig. To verify the execution of the Load statement, you have to use the Diagnostic Operators. Pig Latin provides four different types of diagnostic operators −

Dump operator
Describe operator
Explain operator
Illustrate operator

In this chapter, we will discuss the Dump operator of Pig Latin.

Dump Operator

The Dump operator is used to run Pig Latin statements and display the results on the screen. It is generally used for debugging purposes.

Syntax

Given below is the syntax of the Dump operator.

grunt> Dump Relation_Name

Example

Assume we have a file student_data.txt in HDFS with the following content.

001,Rajiv,Reddy,9848022337,Hyderabad
002,siddarth,Battacharya,9848022338,Kolkata
003,Rajesh,Khanna,9848022339,Delhi
004,Preethi,Agarwal,9848022330,Pune
005,Trupthi,Mohanthy,9848022336,Bhuwaneshwar
006,Archana,Mishra,9848022335,Chennai

And we have read it into a relation student using the LOAD operator as shown below.

grunt> student = LOAD 'hdfs://localhost:9000/pig_data/student_data.txt'
   USING PigStorage(',')
   AS (id:int, firstname:chararray, lastname:chararray, phone:chararray, city:chararray);

Now, let us print the contents of the relation using the Dump operator as shown below.

grunt> Dump student

Once you execute the above Pig Latin statement, it will start a MapReduce job to read data from HDFS. It will produce the following output.

2015-10-01 15:05:27,642 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2015-10-01 15:05:27,652 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:

HadoopVersion   PigVersion   UserId   StartedAt             FinishedAt            Features
2.6.0           0.15.0       Hadoop   2015-10-01 15:03:11   2015-10-01 15:05:27   UNKNOWN

Success!

Job Stats (time in seconds):

JobId: job_14459_0004
Maps: 1, Reduces: 0
MaxMapTime: n/a, MinMapTime: n/a, AvgMapTime: n/a, MedianMapTime: n/a
MaxReduceTime: 0, MinReduceTime: 0, AvgReduceTime: 0, MedianReducetime: 0
Alias: student
Feature: MAP_ONLY
Outputs: hdfs://localhost:9000/tmp/temp580182027/tmp757878456

Input(s): Successfully read 0 records from: "hdfs://localhost:9000/pig_data/student_data.txt"
Output(s): Successfully stored 0 records in: "hdfs://localhost:9000/tmp/temp580182027/tmp757878456"

Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled : 0
Total records proactively spilled : 0

Job DAG: job_1443519499159_0004

2015-10-01 15:06:28,403 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Success!
2015-10-01 15:06:28,441 [main] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2015-10-01 15:06:28,485 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2015-10-01 15:06:28,485 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1

(1,Rajiv,Reddy,9848022337,Hyderabad)
(2,siddarth,Battacharya,9848022338,Kolkata)
(3,Rajesh,Khanna,9848022339,Delhi)
(4,Preethi,Agarwal,9848022330,Pune)
(5,Trupthi,Mohanthy,9848022336,Bhuwaneshwar)
(6,Archana,Mishra,9848022335,Chennai)
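Of the four diagnostic operators listed above, Dump executes the whole pipeline. When you only want to confirm the schema of a relation without triggering a MapReduce job, the Describe operator is the lighter choice. A minimal sketch using the student relation from this chapter; the output format shown is typical of Pig 0.15 but may vary across versions.

grunt> describe student;
student: {id: int, firstname: chararray, lastname: chararray, phone: chararray, city: chararray}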