Discuss Apache Presto ”; Previous Next JApache Presto is an open source distributed SQL engine. Presto originated at Facebook for data analytics needs and later was open sourced. Now, Teradata joins Presto community and offers support. Apache Presto is very useful for performing queries even petabytes of data. Extensible architecture and storage plugin interfaces are very easy to interact with other file systems. Most of today’s best industrial companies are adopting Presto for its interactive speeds and low latency performance. This tutorial explores Presto architecture, configuration, and storage plugins. It discusses the basic and advanced queries and finally concludes with real-time examples. Print Page Previous Next Advertisements ”;
Category: apache Presto
Apache Presto – SQL Functions ”; Previous Next As of now we were discussing running some simple basic queries on Presto. This chapter will discuss the important SQL functions. Math Functions Math functions operate on mathematical formulas. Following table describes the list of functions in detail. S.No. Function & Description 1. abs(x) Returns the absolute value of x 2. cbrt(x) Returns the cube root of x 3. ceiling(x) Returns the x value rounded up to the nearest integer 4. ceil(x) Alias for ceiling(x) 5. degrees(x) Returns the degree value for x 6. e(x) Returns the double value for Euler’s number 7. exp(x) Returns the exponent value for Euler’s number 8. floor(x) Returns x rounded down to the nearest integer 9. from_base(string,radix) Returns the value of string interpreted as a base-radix number 10. ln(x) Returns the natural logarithm of x 11. log2(x) Returns the base 2 logarithm of x 12. log10(x) Returns the base 10 logarithm of x 13. log(x,y) Returns the base y logarithm of x 14. mod(n,m) Returns the modulus (remainder) of n divided by m 15. pi() Returns pi value. The result will be returned as a double value 16. power(x,p) Returns power of value ‘p’ to the x value 17. pow(x,p) Alias for power(x,p) 18. radians(x) converts the angle x in degree radians 19. rand() Alias for radians() 20. random() Returns the pseudo-random value 21. rand(n) Alias for random() 22. round(x) Returns the rounded value for x 23. round(x,d) x value rounded for the ‘d’ decimal places 24. sign(x) Returns the signum function of x, i.e., 0 if the argument is 0 1 if the argument is greater than 0 -1 if the argument is less than 0 For double arguments, the function additionally returns − NaN if the argument is NaN 1 if the argument is +Infinity -1 if the argument is -Infinity 25. sqrt(x) Returns the square root of x 26. to_base(x,radix) Return type is archer. The result is returned as the base radix for x 27. truncate(x) Truncates the value for x 28. width_bucket(x, bound1, bound2, n) Returns the bin number of x specified bound1 and bound2 bounds and n number of buckets 29. width_bucket(x, bins) Returns the bin number of x according to the bins specified by the array bins Trigonometric Functions Trigonometric functions arguments are represented as radians(). Following table lists out the functions. S.No Functions & Description 1. acos(x) Returns the inverse cosine value(x) 2. asin(x) Returns the inverse sine value(x) 3. atan(x) Returns the inverse tangent value(x) 4. atan2(y,x) Returns the inverse tangent value(y/x) 5. cos(x) Returns the cosine value(x) 6. cosh(x) Returns the hyperbolic cosine value(x) 7. sin(x) Returns the sine value(x) 8. tan(x) Returns the tangent value(x) 9. tanh(x) Returns the hyperbolic tangent value(x) Bitwise Functions The following table lists out the Bitwise functions. S.No Functions & Description 1. bit_count(x, bits) Count the number of bits 2. bitwise_and(x,y) Perform bitwise AND operation for two bits, x and y 3. bitwise_or(x,y) Bitwise OR operation between two bits x, y 4. bitwise_not(x) Bitwise Not operation for bit x 5. bitwise_xor(x,y) XOR operation for bits x, y String Functions Following table lists out the String functions. S.No Functions & Description 1. concat(string1, …, stringN) Concatenate the given strings 2. length(string) Returns the length of the given string 3. lower(string) Returns the lowercase format for the string 4. upper(string) Returns the uppercase format for the given string 5. lpad(string, size, padstring) Left padding for the given string 6. ltrim(string) Removes the leading whitespace from the string 7. replace(string, search, replace) Replaces the string value 8. reverse(string) Reverses the operation performed for the string 9. rpad(string, size, padstring) Right padding for the given string 10. rtrim(string) Removes the trailing whitespace from the string 11. split(string, delimiter) Splits the string on delimiter and returns an array of size at the most limit 12. split_part(string, delimiter, index) Splits the string on delimiter and returns the field index 13. strpos(string, substring) Returns the starting position of the substring in the string 14. substr(string, start) Returns the substring for the given string 15. substr(string, start, length) Returns the substring for the given string with the specific length 16. trim(string) Removes the leading and trailing whitespace from the string Date and Time Functions Following table lists out the Date and Time functions. S.No Functions & Description 1. current_date Returns the current date 2. current_time Returns the current time 3. current_timestamp Returns the current timestamp 4. current_timezone() Returns the current timezone 5. now() Returns the current date,timestamp with the timezone 6. localtime Returns the local time 7. localtimestamp Returns the local timestamp Regular Expression Functions The following table lists out the Regular Expression functions. S.No Functions & Description 1. regexp_extract_all(string, pattern) Returns the string matched by the regular expression for the pattern 2. regexp_extract_all(string, pattern, group) Returns the string matched by the regular expression for the pattern and the group 3. regexp_extract(string, pattern) Returns the first substring matched by the regular expression for the pattern 4. regexp_extract(string, pattern, group) Returns the first substring matched by the regular expression for the pattern and the group 5. regexp_like(string, pattern) Returns the string matches for the pattern. If the string is returned, the value will be true otherwise false 6. regexp_replace(string, pattern) Replaces the instance of the string matched for the expression with the pattern 7. regexp_replace(string, pattern, replacement) Replace the instance of the string matched for the expression with the pattern and replacement 8. regexp_split(string, pattern) Splits the regular expression for the given pattern JSON Functions The following table lists out JSON functions. S.No Functions & Description 1. json_array_contains(json, value) Check the value exists in a json array. If the value exists it will return true, otherwise false 2. json_array_get(json_array, index) Get the element for index in json array 3. json_array_length(json) Returns the length in json array 4. json_format(json) Returns the json structure format 5. json_parse(string) Parses the string as a json 6. json_size(json, json_path) Returns the size of the value URL Functions The following table lists
Apache Presto – HIVE Connector ”; Previous Next The Hive connector allows querying data stored in a Hive data warehouse. Prerequisites Hadoop Hive Hopefully you have installed Hadoop and Hive on your machine. Start all the services one by one in the new terminal. Then, start hive metastore using the following command, hive –service metastore Presto uses Hive metastore service to get the hive table’s details. Configuration Settings Create a file “hive.properties” under “etc/catalog” directory. Use the following command. $ cd etc $ cd catalog $ vi hive.properties connector.name = hive-cdh4 hive.metastore.uri = thrift://localhost:9083 After making all the changes, save the file and quit the terminal. Create Database Create a database in Hive using the following query − Query hive> CREATE SCHEMA tutorials; After the database is created, you can verify it using the “show databases” command. Create Table Create Table is a statement used to create a table in Hive. For example, use the following query. hive> create table author(auth_id int, auth_name varchar(50), topic varchar(100) STORED AS SEQUENCEFILE; Insert Table Following query is used to insert records in hive’s table. hive> insert into table author values (1,’ Doug Cutting’,Hadoop), (2,’ James Gosling’,java),(3,’ Dennis Ritchie’,C); Start Presto CLI You can start Presto CLI to connect Hive storage plugin using the following command. $ ./presto –server localhost:8080 –catalog hive —schema tutorials; You will receive the following response. presto:tutorials > List Schemas To list out all the schemas in Hive connector, type the following command. Query presto:tutorials > show schemas from hive; Result default tutorials List Tables To list out all the tables in “tutorials” schema, use the following query. Query presto:tutorials > show tables from hive.tutorials; Result author Fetch Table Following query is used to fetch all the records from hive’s table. Query presto:tutorials > select * from hive.tutorials.author; Result auth_id | auth_name | topic ———+—————-+——– 1 | Doug Cutting | Hadoop 2 | James Gosling | java 3 | Dennis Ritchie | C Print Page Previous Next Advertisements ”;
Custom Function Application
Apache Presto – Custom Function Application ”; Previous Next Create a Maven project to develop Presto custom function. SimpleFunctionsFactory.java Create SimpleFunctionsFactory class to implement FunctionFactory interface. package com.tutorialspoint.simple.functions; import com.facebook.presto.metadata.FunctionFactory; import com.facebook.presto.metadata.FunctionListBuilder; import com.facebook.presto.metadata.SqlFunction; import com.facebook.presto.spi.type.TypeManager; import java.util.List; public class SimpleFunctionFactory implements FunctionFactory { private final TypeManager typeManager; public SimpleFunctionFactory(TypeManager typeManager) { this.typeManager = typeManager; } @Override public List<SqlFunction> listFunctions() { return new FunctionListBuilder(typeManager) .scalar(SimpleFunctions.class) .getFunctions(); } } SimpleFunctionsPlugin.java Create a SimpleFunctionsPlugin class to implement Plugin interface. package com.tutorialspoint.simple.functions; import com.facebook.presto.metadata.FunctionFactory; import com.facebook.presto.spi.Plugin; import com.facebook.presto.spi.type.TypeManager; import com.google.common.collect.ImmutableList; import javax.inject.Inject; import java.util.List; import static java.util.Objects.requireNonNull; public class SimpleFunctionsPlugin implements Plugin { private TypeManager typeManager; @Inject public void setTypeManager(TypeManager typeManager) { this.typeManager = requireNonNull(typeManager, “typeManager is null”); //Inject TypeManager class here } @Override public <T> List<T> getServices(Class<T> type){ if (type == FunctionFactory.class) { return ImmutableList.of(type.cast(new SimpleFunctionFactory(typeManager))); } return ImmutableList.of(); } } Add Resource File Create a resource file which is specified in the implementation package. (com.tutorialspoint.simple.functions.SimpleFunctionsPlugin) Now move to the resource file location @ /path/to/resource/ Then add the changes, com.facebook.presto.spi.Plugin pom.xml Add the following dependencies to pom.xml file. <?xml version = “1.0”?> <project xmlns = “http://maven.apache.org/POM/4.0.0″ xmlns:xsi=”http://www.w3.org/2001/XMLSchema-instance” xsi:schemaLocation = “http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd”> <modelVersion>4.0.0</modelVersion> <groupId>com.tutorialspoint.simple.functions</groupId> <artifactId>presto-simple-functions</artifactId> <packaging>jar</packaging> <version>1.0</version> <name>presto-simple-functions</name> <description>Simple test functions for Presto</description> <properties> <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding> </properties> <dependencies> <dependency> <groupId>com.facebook.presto</groupId> <artifactId>presto-spi</artifactId> <version>0.149</version> </dependency> <dependency> <groupId>com.facebook.presto</groupId> <artifactId>presto-main</artifactId> <version>0.149</version> </dependency> <dependency> <groupId>javax.inject</groupId> <artifactId>javax.inject</artifactId> <version>1</version> </dependency> <dependency> <groupId>com.google.guava</groupId> <artifactId>guava</artifactId> <version>19.0</version> </dependency> </dependencies> <build> <finalName>presto-simple-functions</finalName> <plugins> <!– Make this jar executable –> <plugin> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-jar-plugin</artifactId> <version>2.3.2</version> </plugin> </plugins> </build> </project> SimpleFunctions.java Create SimpleFunctions class using Presto attributes. package com.tutorialspoint.simple.functions; import com.facebook.presto.operator.Description; import com.facebook.presto.operator.scalar.ScalarFunction; import com.facebook.presto.operator.scalar.StringFunctions; import com.facebook.presto.spi.type.StandardTypes; import com.facebook.presto.type.LiteralParameters; import com.facebook.presto.type.SqlType; public final class SimpleFunctions { private SimpleFunctions() { } @Description(“Returns summation of two numbers”) @ScalarFunction(“mysum”) //function name @SqlType(StandardTypes.BIGINT) public static long sum(@SqlType(StandardTypes.BIGINT) long num1, @SqlType(StandardTypes.BIGINT) long num2) { return num1 + num2; } } After the application is created compile and execute the application. It will produce the JAR file. Copy the file and move the JAR file into the target Presto server plugin directory. Compilation mvn compile Execution mvn package Now restart Presto server and connect Presto client. Then execute the custom function application as explained below, $ ./presto –catalog mysql –schema default Query presto:default> select mysum(10,10); Result _col0 ——- 20 Print Page Previous Next Advertisements ”;
Apache Presto – Quick Guide
Apache Presto – Quick Guide ”; Previous Next Apache Presto – Overview Data analytics is the process of analyzing raw data to gather relevant information for better decision making. It is primarily used in many organizations to make business decisions. Well, big data analytics involves a large amount of data and this process is quite complex, hence companies use different strategies. For example, Facebook is one of the leading data driven and largest data warehouse company in the world. Facebook warehouse data is stored in Hadoop for large scale computation. Later, when warehouse data grew to petabytes, they decided to develop a new system with low latency. In the year of 2012, Facebook team members designed “Presto” for interactive query analytics that would operate quickly even with petabytes of data. What is Apache Presto? Apache Presto is a distributed parallel query execution engine, optimized for low latency and interactive query analysis. Presto runs queries easily and scales without down time even from gigabytes to petabytes. A single Presto query can process data from multiple sources like HDFS, MySQL, Cassandra, Hive and many more data sources. Presto is built in Java and easy to integrate with other data infrastructure components. Presto is powerful, and leading companies like Airbnb, DropBox, Groupon, Netflix are adopting it. Presto − Features Presto contains the following features − Simple and extensible architecture. Pluggable connectors – Presto supports pluggable connector to provide metadata and data for queries. Pipelined executions – Avoids unnecessary I/O latency overhead. User-defined functions – Analysts can create custom user-defined functions to migrate easily. Vectorized columnar processing. Presto − Benefits Here is a list of benefits that Apache Presto offers − Specialized SQL operations Easy to install and debug Simple storage abstraction Quickly scales petabytes data with low latency Presto − Applications Presto supports most of today’s best industrial applications. Let’s take a look at some of the notable applications. Facebook − Facebook built Presto for data analytics needs. Presto easily scales large velocity of data. Teradata − Teradata provides end-to-end solutions in Big Data analytics and data warehousing. Teradata contribution to Presto makes it easier for more companies to enable all analytical needs. Airbnb − Presto is an integral part of the Airbnb data infrastructure. Well, hundreds of employees are running queries each day with the technology. Why Presto? Presto supports standard ANSI SQL which has made it very easy for data analysts and developers. Though it is built in Java, it avoids typical issues of Java code related to memory allocation and garbage collection. Presto has a connector architecture that is Hadoop friendly. It allows to easily plug in file systems. Presto runs on multiple Hadoop distributions. In addition, Presto can reach out from a Hadoop platform to query Cassandra, relational databases, or other data stores. This cross-platform analytic capability allows Presto users to extract maximum business value from gigabytes to petabytes of data. Apache Presto – Architecture The architecture of Presto is almost similar to classic MPP (massively parallel processing) DBMS architecture. The following diagram illustrates the architecture of Presto. The above diagram consists of different components. Following table describes each of the component in detail. S.No Component & Description 1. Client Client (Presto CLI) submits SQL statements to a coordinator to get the result. 2. Coordinator Coordinator is a master daemon. The coordinator initially parses the SQL queries then analyzes and plans for the query execution. Scheduler performs pipeline execution, assigns work to the closest node and monitors progress. 3. Connector Storage plugins are called as connectors. Hive, HBase, MySQL, Cassandra and many more act as a connector; otherwise you can also implement a custom one. The connector provides metadata and data for queries. The coordinator uses the connector to get metadata for building a query plan. 4. Worker The coordinator assigns task to worker nodes. The workers get actual data from the connector. Finally, the worker node delivers result to the client. Presto − Workflow Presto is a distributed system that runs on a cluster of nodes. Presto’s distributed query engine is optimized for interactive analysis and supports standard ANSI SQL, including complex queries, aggregations, joins, and window functions. Presto architecture is simple and extensible. Presto client (CLI) submits SQL statements to a master daemon coordinator. The scheduler connects through execution pipeline. The scheduler assigns work to nodes which is closest to the data and monitors progress. The coordinator assigns task to multiple worker nodes and finally the worker node delivers the result back to the client. The client pulls data from the output process. Extensibility is the key design. Pluggable connectors like Hive, HBase, MySQL, etc., provides metadata and data for queries. Presto was designed with a “simple storage abstraction” that makes it easy to provide SQL query capability against these different kind of data sources. Execution Model Presto supports custom query and execution engine with operators designed to support SQL semantics. In addition to improved scheduling, all processing is in memory and pipelined across the network between different stages. This avoids unnecessary I/O latency overhead. Apache Presto – Installation This chapter will explain how to install Presto on your machine. Let’s go through the basic requirements of Presto, Linux or Mac OS Java version 8 Now, let’s continue the following steps to install Presto on your machine. Verifying Java installation Hopefully, you have already installed Java version 8 on your machine right now, so you just verify it using the following command. $ java -version If Java is successfully installed on your machine, you could see the version of installed Java. If Java is not installed, follow the subsequent steps to install Java 8 on your machine. Download JDK. Download the latest version of JDK by visiting the following link. http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html The latest version is JDK 8u 92 and the file is “jdk-8u92-linux-x64.tar.gz”. Please download the file on your machine. After that, extract the files and move to the specific directory. Then set Java alternatives. Finally Java will be installed on your machine.
Apache Presto – Useful Resources ”; Previous Next The following resources contain additional information on Apache Presto. Please use them to get more in-depth knowledge on this. Useful Video Courses Apache Spark Online Training Course 47 Lectures 3.5 hours Tutorialspoint More Detail Delta Lake with Apache Spark using Scala 53 Lectures 2 hours Bigdata Engineer More Detail Apache Spark with Scala for Certified Databricks Professional 78 Lectures 5.5 hours Bigdata Engineer More Detail Apache Cassandra for Beginners 28 Lectures 2 hours Navdeep Kaur More Detail NGINX, Apache, SSL Encryption – Training Course 60 Lectures 3.5 hours YouAccel More Detail Learn Advanced Apache Kafka from Scratch Featured 154 Lectures 9 hours Learnkart Technology Pvt Ltd More Detail Print Page Previous Next Advertisements ”;
Apache Presto – Overview
Apache Presto – Overview ”; Previous Next Data analytics is the process of analyzing raw data to gather relevant information for better decision making. It is primarily used in many organizations to make business decisions. Well, big data analytics involves a large amount of data and this process is quite complex, hence companies use different strategies. For example, Facebook is one of the leading data driven and largest data warehouse company in the world. Facebook warehouse data is stored in Hadoop for large scale computation. Later, when warehouse data grew to petabytes, they decided to develop a new system with low latency. In the year of 2012, Facebook team members designed “Presto” for interactive query analytics that would operate quickly even with petabytes of data. What is Apache Presto? Apache Presto is a distributed parallel query execution engine, optimized for low latency and interactive query analysis. Presto runs queries easily and scales without down time even from gigabytes to petabytes. A single Presto query can process data from multiple sources like HDFS, MySQL, Cassandra, Hive and many more data sources. Presto is built in Java and easy to integrate with other data infrastructure components. Presto is powerful, and leading companies like Airbnb, DropBox, Groupon, Netflix are adopting it. Presto − Features Presto contains the following features − Simple and extensible architecture. Pluggable connectors – Presto supports pluggable connector to provide metadata and data for queries. Pipelined executions – Avoids unnecessary I/O latency overhead. User-defined functions – Analysts can create custom user-defined functions to migrate easily. Vectorized columnar processing. Presto − Benefits Here is a list of benefits that Apache Presto offers − Specialized SQL operations Easy to install and debug Simple storage abstraction Quickly scales petabytes data with low latency Presto − Applications Presto supports most of today’s best industrial applications. Let’s take a look at some of the notable applications. Facebook − Facebook built Presto for data analytics needs. Presto easily scales large velocity of data. Teradata − Teradata provides end-to-end solutions in Big Data analytics and data warehousing. Teradata contribution to Presto makes it easier for more companies to enable all analytical needs. Airbnb − Presto is an integral part of the Airbnb data infrastructure. Well, hundreds of employees are running queries each day with the technology. Why Presto? Presto supports standard ANSI SQL which has made it very easy for data analysts and developers. Though it is built in Java, it avoids typical issues of Java code related to memory allocation and garbage collection. Presto has a connector architecture that is Hadoop friendly. It allows to easily plug in file systems. Presto runs on multiple Hadoop distributions. In addition, Presto can reach out from a Hadoop platform to query Cassandra, relational databases, or other data stores. This cross-platform analytic capability allows Presto users to extract maximum business value from gigabytes to petabytes of data. Print Page Previous Next Advertisements ”;
Apache Presto – Administration Tools ”; Previous Next In this chapter, we will discuss the administration tools used in Presto. Let’s start with the Web Interface of Presto. Web Interface Presto provides a web interface for monitoring and managing queries. It can be accessed from the port number specified in the coordinator Config Properties. Start Presto server and Presto CLI. Then you can access the web interface from the following url − http://localhost:8080/ The output will be similar to the above screen. Here, the main page has a list of queries along with information like unique query ID, query text, query state, percentage completed, username and source from which this query is originated. Latest queries are running first, then completed or not completed queries are displayed at the bottom. Tuning the Performance on Presto If Presto cluster is having any performance-related issues, change your default configuration settings to the following settings. Config Properties task. info -refresh-max-wait − Reduces coordinator work load. task.max-worker-threads − Splits the process and assigns to each worker nodes. distributed-joins-enabled − Hash-based distributed joins. node-scheduler.network-topology − Sets network topology to scheduler. JVM Settings Change your default JVM settings to the following settings. This will be helpful for diagnosing garbage collection issues. -XX:+PrintGCApplicationConcurrentTime -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCCause -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:+PrintReferenceGC -XX:+PrintClassHistogramAfterFullGC -XX:+PrintClassHistogramBeforeFullGC -XX:PrintFLSStatistics = 2 -XX:+PrintAdaptiveSizePolicy -XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount = 1 Print Page Previous Next Advertisements ”;
Apache Presto – Configuration Settings ”; Previous Next This chapter will discuss the configuration settings for Presto. Presto Verifier The Presto Verifier can be used to test Presto against another database (such as MySQL), or to test two Presto clusters against each other. Create Database in MySQL Open MySQL server and create a database using the following command. create database test Now you have created “test” database in the server. Create the table and load it with the following query. CREATE TABLE verifier_queries( id INT NOT NULL AUTO_INCREMENT, suite VARCHAR(256) NOT NULL, name VARCHAR(256), test_catalog VARCHAR(256) NOT NULL, test_schema VARCHAR(256) NOT NULL, test_prequeries TEXT, test_query TEXT NOT NULL, test_postqueries TEXT, test_username VARCHAR(256) NOT NULL default ”verifier-test”, test_password VARCHAR(256), control_catalog VARCHAR(256) NOT NULL, control_schema VARCHAR(256) NOT NULL, control_prequeries TEXT, control_query TEXT NOT NULL, control_postqueries TEXT, control_username VARCHAR(256) NOT NULL default ”verifier-test”, control_password VARCHAR(256), session_properties_json TEXT, PRIMARY KEY (id) ); Add Config Settings Create a properties file to configure the verifier − $ vi config.properties suite = mysuite query-database = jdbc:mysql://localhost:3306/tutorials?user=root&password=pwd control.gateway = jdbc:presto://localhost:8080 test.gateway = jdbc:presto://localhost:8080 thread-count = 1 Here, in the query-database field, enter the following details − mysql database name, user name, and password. Download JAR File Download Presto-verifier jar file by visiting the following link, https://repo1.maven.org/maven2/com/facebook/presto/presto-verifier/0.149/ Now the version “presto-verifier-0.149-executable.jar” is downloaded on your machine. Execute JAR Execute the JAR file using the following command, $ mv presto-verifier-0.149-executable.jar verifier $ chmod+x verifier Run Verifier Run the verifier using the following command, $ ./verifier config.properties Create Table Let’s create a simple table in “test” database using the following query. create table product(id int not null, name varchar(50)) Insert Table After creating a table, insert two records using the following query, insert into product values(1,’Phone”) insert into product values(2,’Television’) Run Verifier Query Execute the following sample query in the verifier terminal (./verifier config.propeties) to check the verifier result. Sample Query insert into verifier_queries (suite, test_catalog, test_schema, test_query, control_catalog, control_schema, control_query) values (”mysuite”, ”mysql”, ”default”, ”select * from mysql.test.product”, ”mysql”, ”default”, ”select * from mysql.test.product”); Here, select * from mysql.test.product query refers to mysql catalog, test is database name and product is table name. In this way, you can access mysql connector using Presto server. Here, two same select queries are tested against each other to see the performance. Similarly, you can run other queries to test the performance results. You can also connect two Presto clusters to check the performance results. Print Page Previous Next Advertisements ”;
Apache Presto – MySQL Connector ”; Previous Next The MySQL connector is used to query an external MySQL database. Prerequisites MySQL server installation. Configuration Settings Hopefully you have installed mysql server on your machine. To enable mysql properties on Presto server, you must create a file “mysql.properties” in “etc/catalog” directory. Issue the following command to create a mysql.properties file. $ cd etc $ cd catalog $ vi mysql.properties connector.name = mysql connection-url = jdbc:mysql://localhost:3306 connection-user = root connection-password = pwd Save the file and quit the terminal. In the above file, you must enter your mysql password in connection-password field. Create Database in MySQL Server Open MySQL server and create a database using the following command. create database tutorials Now you have created “tutorials” database in the server. To enable database type, use the command “use tutorials” in the query window. Create Table Let’s create a simple table on “tutorials” database. create table author(auth_id int not null, auth_name varchar(50),topic varchar(100)) Insert Table After creating a table, insert three records using the following query. insert into author values(1,”Doug Cutting”,”Hadoop”) insert into author values(2,’James Gosling”,”java”) insert into author values(3,”Dennis Ritchie’,”C”) Select Records To retrieve all the records, type the following query. Query select * from author Result auth_id auth_name topic 1 Doug Cutting Hadoop 2 James Gosling java 3 Dennis Ritchie C As of now, you have queried data using MySQL server. Let’s connect Mysql storage plugin to Presto server. Connect Presto CLI Type the following command to connect MySql plugin on Presto CLI. ./presto –server localhost:8080 –catalog mysql –schema tutorials You will receive the following response. presto:tutorials> Here “tutorials” refers to schema in mysql server. List Schemas To list out all the schemas in mysql, type the following query in Presto server. Query presto:tutorials> show schemas from mysql; Result Schema ——————– information_schema performance_schema sys tutorials From this result, we can conclude the first three schemas as predefined and the last one as created by yourself. List Tables from Schema Following query lists out all the tables in tutorials schema. Query presto:tutorials> show tables from mysql.tutorials; Result Table ——– author We have created only one table in this schema. If you have created multiple tables, it will list out all the tables. Describe Table To describe the table fields, type the following query. Query presto:tutorials> describe mysql.tutorials.author; Result Column | Type | Comment ———–+————–+——— auth_id | integer | auth_name | varchar(50) | topic | varchar(100) | Show Columns from Table Query presto:tutorials> show columns from mysql.tutorials.author; Result Column | Type | Comment ———–+————–+——— auth_id | integer | auth_name | varchar(50) | topic | varchar(100) | Access Table Records To fetch all the records from mysql table, issue the following query. Query presto:tutorials> select * from mysql.tutorials.author; Result auth_id | auth_name | topic ———+—————-+——– 1 | Doug Cutting | Hadoop 2 | James Gosling | java 3 | Dennis Ritchie | C From this result, you can retrieve mysql server records in Presto. Create Table Using as Command Mysql connector doesn’t support create table query but you can create a table using as command. Query presto:tutorials> create table mysql.tutorials.sample as select * from mysql.tutorials.author; Result CREATE TABLE: 3 rows You can’t insert rows directly because this connector has some limitations. It cannot support the following queries − create insert update delete drop To view the records in the newly created table, type the following query. Query presto:tutorials> select * from mysql.tutorials.sample; Result auth_id | auth_name | topic ———+—————-+——– 1 | Doug Cutting | Hadoop 2 | James Gosling | java 3 | Dennis Ritchie | C Print Page Previous Next Advertisements ”;