Cassandra – Batch

Using Batch Statements

Using BATCH, you can execute multiple modification statements (INSERT, UPDATE, DELETE) simultaneously. Its syntax is as follows:

    BEGIN BATCH
    <insert-stmt> / <update-stmt> / <delete-stmt>
    APPLY BATCH

Example

Assume there is a table in Cassandra called emp having the following data:

    emp_id | emp_name | emp_city  | emp_phone  | emp_sal
    -------+----------+-----------+------------+--------
    1      | ram      | Hyderabad | 9848022338 | 50000
    2      | robin    | Delhi     | 9848022339 | 50000
    3      | rahman   | Chennai   | 9848022330 | 45000

In this example, we will perform the following operations:

Insert a new row with the following details (4, rajeev, Pune, 9848022331, 30000).
Update the salary of the employee with row id 3 to 50000.
Delete the city of the employee with row id 2.

To perform the above operations in one go, use the following BATCH command. Note that CQL string literals are written in single quotes.

    cqlsh:tutorialspoint> BEGIN BATCH
    ... INSERT INTO emp (emp_id, emp_city, emp_name, emp_phone, emp_sal) VALUES (4, 'Pune', 'rajeev', 9848022331, 30000);
    ... UPDATE emp SET emp_sal = 50000 WHERE emp_id = 3;
    ... DELETE emp_city FROM emp WHERE emp_id = 2;
    ... APPLY BATCH;

Verification

After making changes, verify the table using the SELECT statement. It should produce the following output:

    cqlsh:tutorialspoint> select * from emp;

     emp_id | emp_city  | emp_name | emp_phone  | emp_sal
    --------+-----------+----------+------------+---------
          1 | Hyderabad |      ram | 9848022338 |   50000
          2 |      null |    robin | 9848022339 |   50000
          3 |   Chennai |   rahman | 9848022330 |   50000
          4 |      Pune |   rajeev | 9848022331 |   30000

    (4 rows)

Here you can observe the table with modified data.

Batch Statements using Java API

You can run batch statements programmatically using the execute() method of the Session class. Follow the steps given below to execute multiple statements as a batch with the help of the Java API.

Step 1: Create a Cluster Object

Create an instance of the Cluster.Builder class of the com.datastax.driver.core package as shown below.

    //Creating Cluster.Builder object
    Cluster.Builder builder1 = Cluster.builder();

Add a contact point (IP address of the node) using the addContactPoint() method of the Cluster.Builder object. This method returns Cluster.Builder.

    //Adding contact point to the Cluster.Builder object
    Cluster.Builder builder2 = builder1.addContactPoint("127.0.0.1");

Using the new builder object, create a cluster object. To do so, you have a method called build() in the Cluster.Builder class. Use the following code to create the cluster object:

    //Building a cluster
    Cluster cluster = builder2.build();

You can build the cluster object using a single line of code as shown below.

    Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();

Step 2: Create a Session Object

Create an instance of the Session object using the connect() method of the Cluster class as shown below.

    Session session = cluster.connect();

This method creates a new session and initializes it. If you already have a keyspace, you can bind the session to it by passing the keyspace name as a string to this method as shown below.

    Session session = cluster.connect("Your keyspace name");

Here we are using the keyspace named tp. Therefore, create the session object as shown below.

    Session session = cluster.connect("tp");

Step 3: Execute Query

You can execute CQL queries using the execute() method of the Session class. Pass the query either as a string or as a Statement class object to the execute() method. A string passed to this method is sent to Cassandra and executed as a CQL statement, just as if you had typed it in cqlsh.

In this example, we will perform the following operations:

Insert a new row with the following details (4, rajeev, Pune, 9848022331, 30000).
Update the salary of the employee with row id 3 to 50000.
Delete the city of the employee with row id 2.

You have to store the query in a string variable and pass it to the execute() method as shown below.

    String query1 = "BEGIN BATCH "
       + "INSERT INTO emp (emp_id, emp_city, emp_name, emp_phone, emp_sal) VALUES (4, 'Pune', 'rajeev', 9848022331, 30000); "
       + "UPDATE emp SET emp_sal = 50000 WHERE emp_id = 3; "
       + "DELETE emp_city FROM emp WHERE emp_id = 2; "
       + "APPLY BATCH;";

Given below is the complete program to execute multiple statements simultaneously on a table in Cassandra using the Java API.

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Session;

    public class Batch {

       public static void main(String args[]) {

          //query (CQL string literals use single quotes)
          String query = "BEGIN BATCH "
             + "INSERT INTO emp (emp_id, emp_city, emp_name, emp_phone, emp_sal) "
             + "VALUES (4, 'Pune', 'rajeev', 9848022331, 30000); "
             + "UPDATE emp SET emp_sal = 50000 WHERE emp_id = 3; "
             + "DELETE emp_city FROM emp WHERE emp_id = 2; "
             + "APPLY BATCH;";

          //Creating Cluster object
          Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();

          //Creating Session object
          Session session = cluster.connect("tp");

          //Executing the query
          session.execute(query);

          System.out.println("Changes done");
       }
    }

Save the above program with the class name followed by .java, and browse to the location where it is saved. Compile and execute the program as shown below.

    $ javac Batch.java
    $ java Batch

Under normal conditions, it should produce the following output:

    Changes done
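The batch string above is assembled by hand, which makes it easy to drop a semicolon or a space between statements. As a minimal sketch of the same idea, the helper below collects the individual statements and wraps them in BEGIN BATCH ... APPLY BATCH. The class BatchBuilder and its methods are hypothetical names introduced for illustration; they are not part of the DataStax driver, and the sketch only builds the string that you would then pass to session.execute().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of the DataStax driver): collects CQL
// modification statements and wraps them in BEGIN BATCH ... APPLY BATCH.
class BatchBuilder {

    private final List<String> statements = new ArrayList<>();

    // Adds one statement; a trailing semicolon is appended if it is missing.
    BatchBuilder add(String cql) {
        String trimmed = cql.trim();
        statements.add(trimmed.endsWith(";") ? trimmed : trimmed + ";");
        return this;
    }

    // Returns the complete batch as a single CQL string.
    String build() {
        StringBuilder sb = new StringBuilder("BEGIN BATCH\n");
        for (String statement : statements) {
            sb.append("  ").append(statement).append('\n');
        }
        return sb.append("APPLY BATCH;").toString();
    }

    public static void main(String[] args) {
        String batch = new BatchBuilder()
            .add("INSERT INTO emp (emp_id, emp_city, emp_name, emp_phone, emp_sal) "
                + "VALUES (4, 'Pune', 'rajeev', 9848022331, 30000)")
            .add("UPDATE emp SET emp_sal = 50000 WHERE emp_id = 3")
            .add("DELETE emp_city FROM emp WHERE emp_id = 2;")
            .build();
        // The resulting string could be passed to session.execute(batch).
        System.out.println(batch);
    }
}
```

Keeping each statement as a separate add() call also makes it simple to build a batch from a list whose length is only known at run time.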

Cassandra – Quick Guide

Cassandra – Introduction

Apache Cassandra is a highly scalable, high-performance distributed database designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. It is a type of NoSQL database. Let us first understand what a NoSQL database does.

NoSQL Database

A NoSQL database (sometimes called Not Only SQL) is a database that provides a mechanism to store and retrieve data other than the tabular relations used in relational databases. These databases are schema-free, support easy replication, have a simple API, are eventually consistent, and can handle huge amounts of data.

The primary objective of a NoSQL database is to have simplicity of design, horizontal scaling, and finer control over availability.

NoSQL databases use different data structures compared to relational databases, which makes some operations faster in NoSQL. The suitability of a given NoSQL database depends on the problem it must solve.

NoSQL vs. Relational Database

The following table lists the points that differentiate a relational database from a NoSQL database.

    Relational Database                                               | NoSQL Database
    ------------------------------------------------------------------+--------------------------------------
    Supports powerful query language.                                 | Supports very simple query language.
    It has a fixed schema.                                            | No fixed schema.
    Follows ACID (Atomicity, Consistency, Isolation, and Durability). | It is only "eventually consistent".
    Supports transactions.                                            | Does not support transactions.

Besides Cassandra, we have the following NoSQL databases that are quite popular:

Apache HBase: HBase is an open source, non-relational, distributed database modeled after Google's BigTable and written in Java. It is developed as a part of the Apache Hadoop project and runs on top of HDFS, providing BigTable-like capabilities for Hadoop.

MongoDB: MongoDB is a cross-platform document-oriented database system that avoids the traditional table-based relational database structure in favor of JSON-like documents with dynamic schemas, making the integration of data in certain types of applications easier and faster.

What is Apache Cassandra?

Apache Cassandra is an open source, distributed, and decentralized storage system (database) for managing very large amounts of structured data spread out across the world. It provides highly available service with no single point of failure.

Listed below are some of the notable points of Apache Cassandra:

It is scalable, fault-tolerant, and consistent.
It is a column-oriented database.
Its distribution design is based on Amazon's Dynamo and its data model on Google's Bigtable.
Created at Facebook, it differs sharply from relational database management systems.
Cassandra implements a Dynamo-style replication model with no single point of failure, but adds a more powerful "column family" data model.
Cassandra is being used by some of the biggest companies such as Facebook, Twitter, Cisco, Rackspace, eBay, Netflix, and more.

Features of Cassandra

Cassandra has become popular because of its technical features. Given below are some of the features of Cassandra:

Elastic scalability: Cassandra is highly scalable; it allows you to add more hardware to accommodate more customers and more data as required.

Always on architecture: Cassandra has no single point of failure and it is continuously available for business-critical applications that cannot afford a failure.

Fast linear-scale performance: Cassandra is linearly scalable, i.e., it increases your throughput as you increase the number of nodes in the cluster. Therefore it maintains a quick response time.

Flexible data storage: Cassandra accommodates all possible data formats including structured, semi-structured, and unstructured. It can dynamically accommodate changes to your data structures according to your need.

Easy data distribution: Cassandra provides the flexibility to distribute data where you need by replicating data across multiple data centers.

Transaction support: Cassandra does not offer full relational-style ACID transactions, which is consistent with the comparison table above. Instead, it provides atomic, isolated, and durable writes at the level of a single partition, together with tunable consistency.

Fast writes: Cassandra was designed to run on cheap commodity hardware. It performs very fast writes and can store hundreds of terabytes of data without sacrificing read efficiency.

History of Cassandra

Cassandra was developed at Facebook for inbox search.
It was open-sourced by Facebook in July 2008.
Cassandra was accepted into the Apache Incubator in March 2009.
It became an Apache top-level project in February 2010.

Cassandra – Architecture

The design goal of Cassandra is to handle big data workloads across multiple nodes without any single point of failure. Cassandra has a peer-to-peer distributed system across its nodes, and data is distributed among all the nodes in a cluster.

All the nodes in a cluster play the same role. Each node is independent and at the same time interconnected to other nodes.
Each node in a cluster can accept read and write requests, regardless of where the data is actually located in the cluster.
When a node goes down, read/write requests can be served from other nodes in the network.

Data Replication in Cassandra

In Cassandra, one or more of the nodes in a cluster act as replicas for a given piece of data. If it is detected that some of the nodes responded with an out-of-date value, Cassandra will return the most recent value to the client. After returning the most recent value, Cassandra performs a read repair in the background to update the stale values.

The following figure shows a schematic view of how Cassandra uses data replication among the nodes in a cluster to ensure no single point of failure.
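The read-repair behaviour described above can be illustrated with a small toy model. The sketch below is only an approximation under stated assumptions: the class ReadRepairSketch, the Replica type, and the writeTime field are hypothetical names, each replica is modelled as an in-memory object holding a value and the time it was written, and the repair happens synchronously rather than in the background over the network as in a real cluster.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: replicas are plain objects, not Cassandra nodes.
class ReadRepairSketch {

    // One replica's copy of a value, tagged with the time it was written.
    static final class Replica {
        final String node;
        String value;
        long writeTime;

        Replica(String node, String value, long writeTime) {
            this.node = node;
            this.value = value;
            this.writeTime = writeTime;
        }
    }

    // Returns the most recently written value and overwrites stale copies.
    static String readWithRepair(List<Replica> replicas) {
        // Find the replica holding the newest write.
        Replica newest = replicas.get(0);
        for (Replica r : replicas) {
            if (r.writeTime > newest.writeTime) {
                newest = r;
            }
        }
        // "Read repair": bring every out-of-date replica up to the newest value.
        for (Replica r : replicas) {
            if (r.writeTime < newest.writeTime) {
                r.value = newest.value;
                r.writeTime = newest.writeTime;
            }
        }
        return newest.value;
    }

    public static void main(String[] args) {
        List<Replica> replicas = new ArrayList<>();
        replicas.add(new Replica("node1", "Delhi", 100L));
        replicas.add(new Replica("node2", "Pune", 200L)); // most recent write
        replicas.add(new Replica("node3", "Delhi", 100L));

        System.out.println("Returned to client: " + readWithRepair(replicas));
        for (Replica r : replicas) {
            System.out.println(r.node + " now holds " + r.value);
        }
    }
}
```

After the call, all three replicas hold the value that carried the latest write time, which is the state a read repair aims for.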
Note: Cassandra uses the Gossip Protocol in the background to allow the nodes to communicate with each other and detect any faulty nodes in the cluster.

Components of Cassandra

The key components of Cassandra are as follows:

Node: It is the place where data is stored.

Data center: It is a collection of related nodes.

Cluster: A cluster is a component that contains one or more data centers.

Commit log: The commit log is a crash-recovery mechanism in Cassandra. Every write operation is written to the commit log.

Mem-table: A mem-table is a memory-resident data structure. After the commit log, the data will be written to the mem-table. Sometimes, for a single column family, there will be multiple mem-tables.

SSTable: It is a disk file to

Cassandra – Create Data

Creating Data in a Table

You can insert data into the columns of a row in a table using the INSERT command. Given below is the syntax for creating data in a table.

    INSERT INTO <tablename>
    (<column1 name>, <column2 name>, ...)
    VALUES (<value1>, <value2>, ...)
    USING <option>

Example

Let us assume there is a table called emp with columns (emp_id, emp_name, emp_city, emp_phone, emp_sal) and you have to insert the following data into the emp table.

    emp_id | emp_name | emp_city  | emp_phone  | emp_sal
    -------+----------+-----------+------------+--------
    1      | ram      | Hyderabad | 9848022338 | 50000
    2      | robin    | Hyderabad | 9848022339 | 40000
    3      | rahman   | Chennai   | 9848022330 | 45000

Use the commands given below to fill the table with the required data. Note that CQL string literals are written in single quotes.

    cqlsh:tutorialspoint> INSERT INTO emp (emp_id, emp_name, emp_city, emp_phone, emp_sal) VALUES (1, 'ram', 'Hyderabad', 9848022338, 50000);
    cqlsh:tutorialspoint> INSERT INTO emp (emp_id, emp_name, emp_city, emp_phone, emp_sal) VALUES (2, 'robin', 'Hyderabad', 9848022339, 40000);
    cqlsh:tutorialspoint> INSERT INTO emp (emp_id, emp_name, emp_city, emp_phone, emp_sal) VALUES (3, 'rahman', 'Chennai', 9848022330, 45000);

Verification

After inserting data, use the SELECT statement to verify whether the data has been inserted. If you verify the emp table using the SELECT statement, it will give you the following output.

    cqlsh:tutorialspoint> SELECT * FROM emp;

     emp_id | emp_city  | emp_name | emp_phone  | emp_sal
    --------+-----------+----------+------------+---------
          1 | Hyderabad |      ram | 9848022338 |   50000
          2 | Hyderabad |    robin | 9848022339 |   40000
          3 |   Chennai |   rahman | 9848022330 |   45000

    (3 rows)

Here you can observe that the table has been populated with the data we inserted.

Creating Data using Java API

You can create data in a table using the execute() method of the Session class. Follow the steps given below to create data in a table using the Java API.

Step 1: Create a Cluster Object

Create an instance of the Cluster.Builder class of the com.datastax.driver.core package as shown below.

    //Creating Cluster.Builder object
    Cluster.Builder builder1 = Cluster.builder();

Add a contact point (IP address of the node) using the addContactPoint() method of the Cluster.Builder object. This method returns Cluster.Builder.

    //Adding contact point to the Cluster.Builder object
    Cluster.Builder builder2 = builder1.addContactPoint("127.0.0.1");

Using the new builder object, create a cluster object. To do so, you have a method called build() in the Cluster.Builder class. The following code shows how to create a cluster object.

    //Building a cluster
    Cluster cluster = builder2.build();

You can build a cluster object using a single line of code as shown below.

    Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();

Step 2: Create a Session Object

Create an instance of the Session object using the connect() method of the Cluster class as shown below.

    Session session = cluster.connect();

This method creates a new session and initializes it. If you already have a keyspace, you can bind the session to it by passing the keyspace name as a string to this method as shown below.

    Session session = cluster.connect("Your keyspace name");

Here we are using the keyspace called tp. Therefore, create the session object as shown below.

    Session session = cluster.connect("tp");

Step 3: Execute Query

You can execute CQL queries using the execute() method of the Session class. Pass the query either as a string or as a Statement class object to the execute() method. A string passed to this method is sent to Cassandra and executed as a CQL statement, just as if you had typed it in cqlsh.

In the following example, we are inserting data in a table called emp. You have to store the query in a string variable and pass it to the execute() method as shown below.

    String query1 = "INSERT INTO emp (emp_id, emp_name, emp_city, emp_phone, emp_sal) VALUES (1, 'ram', 'Hyderabad', 9848022338, 50000);";
    String query2 = "INSERT INTO emp (emp_id, emp_name, emp_city, emp_phone, emp_sal) VALUES (2, 'robin', 'Hyderabad', 9848022339, 40000);";
    String query3 = "INSERT INTO emp (emp_id, emp_name, emp_city, emp_phone, emp_sal) VALUES (3, 'rahman', 'Chennai', 9848022330, 45000);";

    session.execute(query1);
    session.execute(query2);
    session.execute(query3);

Given below is the complete program to insert data into a table in Cassandra using the Java API.

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Session;

    public class Create_Data {

       public static void main(String args[]) {

          //queries (CQL string literals use single quotes)
          String query1 = "INSERT INTO emp (emp_id, emp_name, emp_city, emp_phone, emp_sal)"
             + " VALUES (1, 'ram', 'Hyderabad', 9848022338, 50000);";

          String query2 = "INSERT INTO emp (emp_id, emp_name, emp_city, emp_phone, emp_sal)"
             + " VALUES (2, 'robin', 'Hyderabad', 9848022339, 40000);";

          String query3 = "INSERT INTO emp (emp_id, emp_name, emp_city, emp_phone, emp_sal)"
             + " VALUES (3, 'rahman', 'Chennai', 9848022330, 45000);";

          //Creating Cluster object
          Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();

          //Creating Session object
          Session session = cluster.connect("tp");

          //Executing the queries
          session.execute(query1);
          session.execute(query2);
          session.execute(query3);

          System.out.println("Data created");
       }
    }

Save the above program with the class name followed by .java, and browse to the location where it is saved. Compile and execute the program as shown below.

    $ javac Create_Data.java
    $ java Create_Data

Under normal conditions, it should produce the following output:

    Data created
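The three INSERT strings above differ only in their values, so they can also be generated from data. The following is a minimal sketch of that idea, assuming a hypothetical helper class named InsertBuilder (it is not part of the DataStax driver). Numbers are written as they are, and every other value is rendered as a single-quoted CQL string literal with embedded single quotes doubled.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: renders a row (column name -> value) as a CQL INSERT.
class InsertBuilder {

    // Numbers are emitted unchanged; everything else becomes a single-quoted
    // string literal in which each embedded single quote is doubled.
    static String literal(Object value) {
        if (value instanceof Number) {
            return value.toString();
        }
        return "'" + value.toString().replace("'", "''") + "'";
    }

    // Builds "INSERT INTO <table> (<columns>) VALUES (<literals>);"
    // Column order follows the iteration order of the map.
    static String insert(String table, Map<String, Object> row) {
        StringJoiner columns = new StringJoiner(", ");
        StringJoiner values = new StringJoiner(", ");
        for (Map.Entry<String, Object> entry : row.entrySet()) {
            columns.add(entry.getKey());
            values.add(literal(entry.getValue()));
        }
        return "INSERT INTO " + table + " (" + columns + ") VALUES (" + values + ");";
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("emp_id", 1);
        row.put("emp_name", "ram");
        row.put("emp_city", "Hyderabad");
        row.put("emp_phone", 9848022338L);
        row.put("emp_sal", 50000);
        // The resulting string could be passed to session.execute(...).
        System.out.println(insert("emp", row));
    }
}
```

This is only a string-building sketch. When the values come from user input, binding them through the driver's prepared statements is generally a safer choice than concatenating literals.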

Cassandra – Read Data

Reading Data using Select Clause

The SELECT clause is used to read data from a table in Cassandra. Using this clause, you can read a whole table, a single column, or a particular cell. Given below is the syntax of the SELECT clause.

    SELECT <columns> FROM <tablename>

Example

Assume there is a table named emp in the keyspace with the following details:

    emp_id | emp_name | emp_city  | emp_phone  | emp_sal
    -------+----------+-----------+------------+--------
    1      | ram      | Hyderabad | 9848022338 | 50000
    2      | robin    | null      | 9848022339 | 50000
    3      | rahman   | Chennai   | 9848022330 | 50000
    4      | rajeev   | Pune      | 9848022331 | 30000

The following example shows how to read a whole table using the SELECT clause. Here we are reading a table called emp.

    cqlsh:tutorialspoint> select * from emp;

     emp_id | emp_city  | emp_name | emp_phone  | emp_sal
    --------+-----------+----------+------------+---------
          1 | Hyderabad |      ram | 9848022338 |   50000
          2 |      null |    robin | 9848022339 |   50000
          3 |   Chennai |   rahman | 9848022330 |   50000
          4 |      Pune |   rajeev | 9848022331 |   30000

    (4 rows)

Reading Required Columns

The following example shows how to read particular columns of a table.

    cqlsh:tutorialspoint> SELECT emp_name, emp_sal from emp;

     emp_name | emp_sal
    ----------+---------
          ram |   50000
        robin |   50000
       rajeev |   30000
       rahman |   50000

    (4 rows)

Where Clause

Using the WHERE clause, you can put a constraint on the required columns. Its syntax is as follows:

    SELECT <columns> FROM <table name> WHERE <condition>;

Note: A WHERE clause can be used only on the columns that are a part of the primary key or have a secondary index on them.

In the following example, we are reading the details of the employees whose salary is 50000. First of all, create a secondary index on the column emp_sal.

    cqlsh:tutorialspoint> CREATE INDEX ON emp(emp_sal);
    cqlsh:tutorialspoint> SELECT * FROM emp WHERE emp_sal=50000;

     emp_id | emp_city  | emp_name | emp_phone  | emp_sal
    --------+-----------+----------+------------+---------
          1 | Hyderabad |      ram | 9848022338 |   50000
          2 |      null |    robin | 9848022339 |   50000
          3 |   Chennai |   rahman | 9848022330 |   50000

Reading Data using Java API

You can read data from a table using the execute() method of the Session class. Follow the steps given below to read data from a table with the help of the Java API.

Step 1: Create a Cluster Object

Create an instance of the Cluster.Builder class of the com.datastax.driver.core package as shown below.

    //Creating Cluster.Builder object
    Cluster.Builder builder1 = Cluster.builder();

Add a contact point (IP address of the node) using the addContactPoint() method of the Cluster.Builder object. This method returns Cluster.Builder.

    //Adding contact point to the Cluster.Builder object
    Cluster.Builder builder2 = builder1.addContactPoint("127.0.0.1");

Using the new builder object, create a cluster object. To do so, you have a method called build() in the Cluster.Builder class. Use the following code to create the cluster object.

    //Building a cluster
    Cluster cluster = builder2.build();

You can build the cluster object using a single line of code as shown below.

    Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();

Step 2: Create a Session Object

Create an instance of the Session object using the connect() method of the Cluster class as shown below.

    Session session = cluster.connect();

This method creates a new session and initializes it. If you already have a keyspace, you can bind the session to it by passing the keyspace name as a string to this method as shown below.

    Session session = cluster.connect("Your keyspace name");

Here we are using the keyspace called tp. Therefore, create the session object as shown below.

    Session session = cluster.connect("tp");

Step 3: Execute Query

You can execute CQL queries using the execute() method of the Session class. Pass the query either as a string or as a Statement class object to the execute() method. A string passed to this method is sent to Cassandra and executed as a CQL statement, just as if you had typed it in cqlsh.

In this example, we are retrieving the data from the emp table. Store the query in a string and pass it to the execute() method of the Session class as shown below.

    String query = "SELECT * FROM emp";
    session.execute(query);

Step 4: Get the ResultSet Object

A select query returns its result in the form of a ResultSet object. Therefore, store the result in an object of the ResultSet class as shown below.

    ResultSet result = session.execute(query);

Given below is the complete program to read data from a table.

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.ResultSet;
    import com.datastax.driver.core.Session;

    public class Read_Data {

       public static void main(String args[]) throws Exception {

          //query
          String query = "SELECT * FROM emp";

          //Creating Cluster object
          Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();

          //Creating Session object (pass the keyspace that contains the emp table)
          Session session = cluster.connect("tutorialspoint");

          //Getting the ResultSet
          ResultSet result = session.execute(query);

          System.out.println(result.all());
       }
    }

Save the above program with the class name followed by .java, and browse to the location where it is saved. Compile and execute the program as shown below.

    $ javac Read_Data.java
    $ java Read_Data

Under normal conditions, it should produce output similar to the following (the rows reflect the contents of emp at the time the program was run):

    [Row[1, Hyderabad, ram, 9848022338, 50000], Row[2, Delhi, robin, 9848022339, 50000],
    Row[4, Pune, rajeev, 9848022331, 30000], Row[3, Chennai, rahman, 9848022330, 50000]]
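The note in the Where Clause section says that a column must be part of the primary key or carry a secondary index before it can be used in a WHERE clause. As a rough illustration of why an index helps, the sketch below keeps rows in a map keyed by emp_id and maintains a second map from emp_sal to the matching ids, so a lookup by salary does not have to scan every row. This is a toy, insert-only model with hypothetical names (SecondaryIndexSketch, namesWhereSalEquals); it does not reflect how Cassandra stores or maintains its indexes internally.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy, insert-only model of a table with a secondary index on emp_sal.
class SecondaryIndexSketch {

    // "Table": primary key (emp_id) -> emp_name.
    private final Map<Integer, String> nameById = new LinkedHashMap<>();

    // "Secondary index": emp_sal -> ids of the rows having that salary.
    private final Map<Integer, List<Integer>> idsBySal = new HashMap<>();

    // Stores the row and records its id under its salary in the index.
    // Updating an existing id is not handled in this sketch.
    void insert(int id, String name, int sal) {
        nameById.put(id, name);
        idsBySal.computeIfAbsent(sal, key -> new ArrayList<>()).add(id);
    }

    // Mirrors SELECT emp_name FROM emp WHERE emp_sal = <sal>, using the index.
    List<String> namesWhereSalEquals(int sal) {
        List<String> names = new ArrayList<>();
        for (int id : idsBySal.getOrDefault(sal, Collections.<Integer>emptyList())) {
            names.add(nameById.get(id));
        }
        return names;
    }

    public static void main(String[] args) {
        SecondaryIndexSketch emp = new SecondaryIndexSketch();
        emp.insert(1, "ram", 50000);
        emp.insert(2, "robin", 50000);
        emp.insert(3, "rahman", 50000);
        emp.insert(4, "rajeev", 30000);
        System.out.println(emp.namesWhereSalEquals(50000));
    }
}
```

With the sample rows from this chapter, the lookup for 50000 returns ram, robin, and rahman, matching the three rows shown in the cqlsh output above.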

Cassandra – Alter Table

Altering a Table

You can alter a table using the command ALTER TABLE. Given below is the syntax for altering a table.

Syntax

    ALTER (TABLE | COLUMNFAMILY) <tablename> <instruction>

Using the ALTER command, you can perform the following operations:

Add a column
Drop a column

Adding a Column

Using the ALTER command, you can add a column to a table. While adding columns, you have to take care that the column name does not conflict with the existing column names and that the table is not defined with the compact storage option. Given below is the syntax to add a column to a table.

    ALTER TABLE <table name>
    ADD <new column> <datatype>;

Example

Given below is an example to add a column to an existing table. Here we are adding a column called emp_email of text datatype to the table named emp.

    cqlsh:tutorialspoint> ALTER TABLE emp
    ... ADD emp_email text;

Verification

Use the SELECT statement to verify whether the column is added or not. Here you can observe the newly added column emp_email.

    cqlsh:tutorialspoint> select * from emp;

     emp_id | emp_city | emp_email | emp_name | emp_phone | emp_sal
    --------+----------+-----------+----------+-----------+---------

Dropping a Column

Using the ALTER command, you can delete a column from a table. Before dropping a column from a table, check that the table is not defined with the compact storage option. Given below is the syntax to delete a column from a table using the ALTER command.

    ALTER TABLE <table name>
    DROP <column name>;

Example

Given below is an example to drop a column from a table. Here we are deleting the column named emp_email.

    cqlsh:tutorialspoint> ALTER TABLE emp DROP emp_email;

Verification

Verify whether the column is deleted using the SELECT statement, as shown below.

    cqlsh:tutorialspoint> select * from emp;

     emp_id | emp_city | emp_name | emp_phone | emp_sal
    --------+----------+----------+-----------+---------

    (0 rows)

Since the emp_email column has been deleted, you cannot find it anymore.

Altering a Table using Java API

You can alter a table using the execute() method of the Session class. Follow the steps given below to alter a table using the Java API.

Step 1: Create a Cluster Object

First of all, create an instance of the Cluster.Builder class of the com.datastax.driver.core package as shown below.

    //Creating Cluster.Builder object
    Cluster.Builder builder1 = Cluster.builder();

Add a contact point (IP address of the node) using the addContactPoint() method of the Cluster.Builder object. This method returns Cluster.Builder.

    //Adding contact point to the Cluster.Builder object
    Cluster.Builder builder2 = builder1.addContactPoint("127.0.0.1");

Using the new builder object, create a cluster object. To do so, you have a method called build() in the Cluster.Builder class. The following code shows how to create a cluster object.

    //Building a cluster
    Cluster cluster = builder2.build();

You can build a cluster object using a single line of code as shown below.

    Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();

Step 2: Create a Session Object

Create an instance of the Session object using the connect() method of the Cluster class as shown below.

    Session session = cluster.connect();

This method creates a new session and initializes it. If you already have a keyspace, you can bind the session to it by passing the keyspace name as a string to this method as shown below.

    Session session = cluster.connect("Your keyspace name");

Here we are using the keyspace named tp. Therefore, create the session object as shown below.

    Session session = cluster.connect("tp");

Step 3: Execute Query

You can execute CQL queries using the execute() method of the Session class. Pass the query either as a string or as a Statement class object to the execute() method. A string passed to this method is sent to Cassandra and executed as a CQL statement, just as if you had typed it in cqlsh.

In the following example, we are adding a column to a table named emp. To do so, you have to store the query in a string variable and pass it to the execute() method as shown below.

    //Query
    String query = "ALTER TABLE emp ADD emp_email text";
    session.execute(query);

Given below is the complete program to add a column to an existing table.

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Session;

    public class Add_Column {

       public static void main(String args[]) {

          //Query
          String query = "ALTER TABLE emp ADD emp_email text";

          //Creating Cluster object
          Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();

          //Creating Session object
          Session session = cluster.connect("tp");

          //Executing the query
          session.execute(query);

          System.out.println("Column added");
       }
    }

Save the above program with the class name followed by .java, and browse to the location where it is saved. Compile and execute the program as shown below.

    $ javac Add_Column.java
    $ java Add_Column

Under normal conditions, it should produce the following output:

    Column added

Deleting a Column

Given below is the complete program to delete a column from an existing table.

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Session;

    public class Delete_Column {

       public static void main(String args[]) {

          //Query
          String query = "ALTER TABLE emp DROP emp_email;";

          //Creating Cluster object
          Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();

          //Creating Session object
          Session session = cluster.connect("tp");

          //Executing the query
          session.execute(query);

          System.out.println("Column deleted");
       }
    }

Save the above program with the class name followed by .java, and browse to the location where it is saved. Compile and execute the program as shown below.

    $ javac Delete_Column.java
    $ java Delete_Column

Under normal conditions, it should produce the following output:

    Column deleted
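Because the ADD and DROP forms differ only in the names they mention, the two query strings can be produced by a small helper. The sketch below is one possible way to do that, assuming hypothetical names (AlterTableSketch, addColumn, dropColumn) that are not part of the DataStax driver. It accepts only unquoted identifiers made of a letter followed by letters, digits, or underscores, so simple type names such as text work, while collection types such as list<text> are outside its scope.

```java
import java.util.regex.Pattern;

// Hypothetical helper: builds ALTER TABLE statements from validated names.
class AlterTableSketch {

    // Unquoted identifier: a letter followed by letters, digits, or underscores.
    private static final Pattern IDENTIFIER = Pattern.compile("[A-Za-z][A-Za-z0-9_]*");

    // Returns the name unchanged if it is a valid unquoted identifier.
    static String identifier(String name) {
        if (!IDENTIFIER.matcher(name).matches()) {
            throw new IllegalArgumentException("Not a valid identifier: " + name);
        }
        return name;
    }

    // ALTER TABLE <table> ADD <column> <type>;
    static String addColumn(String table, String column, String type) {
        return "ALTER TABLE " + identifier(table)
            + " ADD " + identifier(column) + " " + identifier(type) + ";";
    }

    // ALTER TABLE <table> DROP <column>;
    static String dropColumn(String table, String column) {
        return "ALTER TABLE " + identifier(table) + " DROP " + identifier(column) + ";";
    }

    public static void main(String[] args) {
        // Either string could be passed to session.execute(...).
        System.out.println(addColumn("emp", "emp_email", "text"));
        System.out.println(dropColumn("emp", "emp_email"));
    }
}
```

Validating the names before concatenation keeps a stray space or semicolon in a column name from turning into a different statement.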

Cassandra – Installation

Cassandra can be accessed using cqlsh as well as drivers of different languages. This chapter explains how to set up both cqlsh and Java environments to work with Cassandra.

Pre-Installation Setup

Before installing Cassandra in a Linux environment, we need to set up Linux using ssh (Secure Shell). Follow the steps given below for setting up the Linux environment.

Create a User

At the beginning, it is recommended to create a separate user (named hadoop in this tutorial) to isolate the installation from the rest of the Unix file system. Follow the steps given below to create a user.

Open root using the command "su".
Create a user from the root account using the command "useradd username".
Now you can open an existing user account using the command "su username".

Open the Linux terminal and type the following commands to create a user.

    $ su
    password:
    # useradd hadoop
    # passwd hadoop
    New passwd:
    Retype new passwd

SSH Setup and Key Generation

SSH setup is required to perform different operations on a cluster such as starting, stopping, and distributed daemon shell operations. To authenticate the different users, it is required to provide a public/private key pair for the hadoop user and share it with the other users.

The following commands generate a key pair using SSH, copy the public key from id_rsa.pub to authorized_keys, and give the owner read and write permissions on the authorized_keys file.

    $ ssh-keygen -t rsa
    $ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    $ chmod 0600 ~/.ssh/authorized_keys

Verify ssh:

    $ ssh localhost

Installing Java

Java is the main prerequisite for Cassandra. First of all, you should verify the existence of Java in your system using the following command:

    $ java -version

If everything works fine, it will give you the following output.

    java version "1.7.0_71"
    Java(TM) SE Runtime Environment (build 1.7.0_71-b13)
    Java HotSpot(TM) Client VM (build 25.0-b02, mixed mode)

If you don't have Java in your system, then follow the steps given below for installing Java.

Step 1

Download Java (JDK <latest version> - X64.tar.gz). The file jdk-7u71-linux-x64.tar.gz will then be downloaded onto your system.

Step 2

Generally you will find the downloaded Java file in the Downloads folder. Verify it and extract the jdk-7u71-linux-x64.tar.gz file using the following commands.

    $ cd Downloads/
    $ ls
    jdk-7u71-linux-x64.tar.gz
    $ tar zxf jdk-7u71-linux-x64.tar.gz
    $ ls
    jdk1.7.0_71 jdk-7u71-linux-x64.tar.gz

Step 3

To make Java available to all users, you have to move it to the location "/usr/local/". Open root, and type the following commands.

    $ su
    password:
    # mv jdk1.7.0_71 /usr/local/
    # exit

Step 4

For setting up the PATH and JAVA_HOME variables, add the following commands to the ~/.bashrc file. Do not put spaces around the = sign.

    export JAVA_HOME=/usr/local/jdk1.7.0_71
    export PATH=$PATH:$JAVA_HOME/bin

Now apply all the changes to the current running system.

    $ source ~/.bashrc

Step 5

Use the following commands to configure Java alternatives. The paths point to the JDK directory moved to /usr/local/ in Step 3.

    # alternatives --install /usr/bin/java java /usr/local/jdk1.7.0_71/bin/java 2
    # alternatives --install /usr/bin/javac javac /usr/local/jdk1.7.0_71/bin/javac 2
    # alternatives --install /usr/bin/jar jar /usr/local/jdk1.7.0_71/bin/jar 2

    # alternatives --set java /usr/local/jdk1.7.0_71/bin/java
    # alternatives --set javac /usr/local/jdk1.7.0_71/bin/javac
    # alternatives --set jar /usr/local/jdk1.7.0_71/bin/jar

Now use the java -version command from the terminal as explained above.

Setting the Path

Set the Cassandra path in "~/.bashrc" as shown below.

    [hadoop@linux ~]$ gedit ~/.bashrc

    export CASSANDRA_HOME=~/cassandra
    export PATH=$PATH:$CASSANDRA_HOME/bin

Download Cassandra

Download Apache Cassandra using the following command.

    $ wget http://supergsego.com/apache/cassandra/2.1.2/apache-cassandra-2.1.2-bin.tar.gz

Unpack Cassandra using the command tar zxvf as shown below.

    $ tar zxvf apache-cassandra-2.1.2-bin.tar.gz

Create a new directory named cassandra and move the contents of the downloaded file to it as shown below.

    $ mkdir cassandra
    $ mv apache-cassandra-2.1.2/* cassandra

Configure Cassandra

Open the cassandra.yaml file, which will be available in the conf directory of Cassandra.

    $ gedit cassandra.yaml

Note: If you have installed Cassandra from a deb or rpm package, the configuration files will be located in the /etc/cassandra directory.

The above command opens the cassandra.yaml file. Verify the following configurations. By default, these values will be set to the specified directories.

    data_file_directories "/var/lib/cassandra/data"
    commitlog_directory "/var/lib/cassandra/commitlog"
    saved_caches_directory "/var/lib/cassandra/saved_caches"

Make sure these directories exist and can be written to, as shown below.

Create Directories

As super-user, create the two directories /var/lib/cassandra and /var/log/cassandra into which Cassandra writes its data.

    [root@linux cassandra]# mkdir /var/lib/cassandra
    [root@linux cassandra]# mkdir /var/log/cassandra

Give Permissions to Folders

Give read-write permissions to the newly created folders as shown below.

    [root@linux /]# chmod 777 /var/lib/cassandra
    [root@linux /]# chmod 777 /var/log/cassandra

Start Cassandra

To start Cassandra, open the terminal window, navigate to the Cassandra home directory where you unpacked Cassandra, and run the following command to start your Cassandra server.

    $ cd $CASSANDRA_HOME
    $ ./bin/cassandra -f

Using the -f option tells Cassandra to stay in the foreground instead of running as a background process. If everything goes fine, you can see the Cassandra server starting.
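Before starting the server, it can be useful to confirm the point made under Configure Cassandra, namely that the data, commit log, and log directories exist and can be written to. The following is a minimal sketch of such a check using only the standard java.nio.file API. The class name DirCheck and the method usable are hypothetical, and the paths listed in main are the default values quoted in this chapter, so adjust them if your cassandra.yaml differs.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical pre-flight check: reports whether each directory that
// Cassandra needs exists and is writable by the current user.
class DirCheck {

    // True only if the path is an existing directory that can be written to.
    static boolean usable(Path dir) {
        return Files.isDirectory(dir) && Files.isWritable(dir);
    }

    public static void main(String[] args) {
        // Default locations quoted in this chapter (assumed, not read from cassandra.yaml).
        String[] dirs = {
            "/var/lib/cassandra/data",
            "/var/lib/cassandra/commitlog",
            "/var/lib/cassandra/saved_caches",
            "/var/log/cassandra"
        };
        for (String dir : dirs) {
            String status = usable(Paths.get(dir)) ? "OK" : "missing or not writable";
            System.out.println(dir + ": " + status);
        }
    }
}
```

Run it as the same user that will start Cassandra, since write permission is evaluated for the current user.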
Programming Environment
To work with Cassandra programmatically, download the following jar files:
slf4j-api-1.7.5.jar
cassandra-driver-core-2.0.2.jar
guava-16.0.1.jar
metrics-core-3.0.2.jar
netty-3.9.0.Final.jar
Place them in a separate folder. For example, we are downloading these jars to a folder named "Cassandra_jars". Set the classpath for this folder in the ~/.bashrc file as shown below.
[hadoop@linux ~]$ gedit ~/.bashrc
# Set the following class path in the .bashrc file.
export CLASSPATH=$CLASSPATH:/home/hadoop/Cassandra_jars/*

Eclipse Environment
Open Eclipse and create a new project called Cassandra_Examples. Right-click on the project and select Build Path → Configure Build Path. This opens the properties window. Under the Libraries tab, select Add External JARs. Navigate to the directory where you saved your jar files, select all five jar files, and click OK. Under Referenced Libraries, you can then see all the required jars.

Maven Dependencies
Given below is the pom.xml for building a Cassandra project using Maven.
<project xmlns="http://maven.apache.org/POM/4.0.0"
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
   <build>
      <sourceDirectory>src</sourceDirectory>
      <plugins>
         <plugin>
            <artifactId>maven-compiler-plugin</artifactId>
            <version>3.1</version>
            <configuration>
               <source>1.7</source>
               <target>1.7</target>
            </configuration>
         </plugin>
      </plugins>
   </build>
   <dependencies>
      <dependency>
         <groupId>org.slf4j</groupId>
         <artifactId>slf4j-api</artifactId>
         <version>1.7.5</version>
      </dependency>
      <dependency>
         <groupId>com.datastax.cassandra</groupId>
         <artifactId>cassandra-driver-core</artifactId>
         <version>2.0.2</version>
      </dependency>
      <dependency>
         <groupId>com.google.guava</groupId>
         <artifactId>guava</artifactId>
         <version>16.0.1</version>
      </dependency>
      <dependency>
         <groupId>com.codahale.metrics</groupId>
         <artifactId>metrics-core</artifactId>
         <version>3.0.2</version>
      </dependency>
      <dependency>
         <groupId>io.netty</groupId>

Cassandra – Useful Resources

The following resources contain additional information on Cassandra. Use them to get more in-depth knowledge of this topic.
Nodejs Course with 10 real-world Projects (89 lectures, 17 hours), Eduonix Learning Solutions
Real-Time Spark Project For Beginners: Hadoop, Spark, Docker (25 lectures, 6.5 hours), Pari Margu
Apache Cassandra for Beginners (28 lectures, 2 hours), Navdeep Kaur

Cassandra – Drop Index

Dropping an Index
You can drop an index using the command DROP INDEX. Its syntax is as follows:
DROP INDEX <identifier>
Given below is an example that drops an index on a column of a table. Here we are dropping the index named name in the table emp.
cqlsh:tp> drop index name;

Dropping an Index using Java API
You can drop an index of a table using the execute() method of the Session class. Follow the steps given below to drop an index from a table.

Step 1: Create a Cluster Object
Create an instance of the Cluster.Builder class of the com.datastax.driver.core package as shown below.
//Creating Cluster.Builder object
Cluster.Builder builder1 = Cluster.builder();
Add a contact point (IP address of the node) using the addContactPoint() method of the Cluster.Builder object. This method returns Cluster.Builder.
//Adding contact point to the Cluster.Builder object
Cluster.Builder builder2 = builder1.addContactPoint("127.0.0.1");
Using the new builder object, create a cluster object by calling the build() method of the Cluster.Builder class.
//Building a cluster
Cluster cluster = builder2.build();
You can build a cluster object using a single line of code as shown below.
Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();

Step 2: Create a Session Object
Create an instance of the Session object using the connect() method of the Cluster class as shown below.
Session session = cluster.connect();
This method creates a new session and initializes it. If you already have a keyspace, you can bind the session to it by passing the keyspace name as a string to this method as shown below.
Session session = cluster.connect("Your keyspace name");
Here we are using the keyspace named tp. Therefore, create the session object as shown below.
Session session = cluster.connect("tp");

Step 3: Execute Query
You can execute CQL queries using the execute() method of the Session class. Pass the query either as a string or as a Statement object to the execute() method. A query passed as a string is sent to Cassandra and executed as CQL, just as if you had typed it in cqlsh.
In the following example, we are dropping the index name of the emp table. Store the query in a string variable and pass it to the execute() method as shown below.
//Query
String query = "DROP INDEX name;";
session.execute(query);

Given below is the complete program to drop an index in Cassandra using Java API.
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class Drop_Index {

   public static void main(String args[]){

      //Query
      String query = "DROP INDEX name;";

      //Creating cluster object
      Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();

      //Creating Session object
      Session session = cluster.connect("tp");

      //Executing the query
      session.execute(query);

      System.out.println("Index dropped");
   }
}
Save the above program with the class name followed by .java, and browse to the location where it is saved. Compile and execute the program as shown below. The file name must match the class name, including case.
$ javac Drop_Index.java
$ java Drop_Index
Under normal conditions, it should produce the following output:
Index dropped
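When the index name comes from a variable rather than a literal, it can help to validate it before it is concatenated into the query string. The sketch below is one way to do that using only the standard library; the class DropIndexQuery and its build() method are hypothetical names for this illustration, and the IF EXISTS clause is an assumption about the CQL version in use (it is a swapped-in variant of the plain DROP INDEX shown above, intended to make the statement a no-op when the index is absent). The resulting string would be passed to session.execute() exactly as in Step 3.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not part of the DataStax driver): builds a DROP INDEX query.
public class DropIndexQuery {

    // Unquoted CQL identifiers: a letter followed by letters, digits, or underscores.
    private static final Pattern IDENTIFIER = Pattern.compile("[A-Za-z][A-Za-z0-9_]*");

    // Builds the query text, rejecting anything that is not a plain identifier.
    public static String build(String indexName) {
        if (indexName == null || !IDENTIFIER.matcher(indexName).matches()) {
            throw new IllegalArgumentException("Not a valid index name: " + indexName);
        }
        // IF EXISTS is an assumption; drop it if your CQL version rejects it.
        return "DROP INDEX IF EXISTS " + indexName + ";";
    }

    public static void main(String[] args) {
        // The string printed here is what would be handed to session.execute(...).
        System.out.println(build("name"));
    }
}
```

Rejecting non-identifier input keeps a stray semicolon or space in the name from turning one statement into something else.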

Cassandra – Drop Keyspace

Dropping a Keyspace
You can drop a keyspace using the command DROP KEYSPACE. Given below is the syntax for dropping a keyspace.
Syntax
DROP KEYSPACE <identifier>
i.e.
DROP KEYSPACE "KeySpace name"
Example
The following code deletes the keyspace tutorialspoint.
cqlsh> DROP KEYSPACE tutorialspoint;
Verification
Verify the keyspaces using the command DESCRIBE and check whether the keyspace is dropped as shown below.
cqlsh> DESCRIBE keyspaces;
system system_traces
Since we have deleted the keyspace tutorialspoint, you will not find it in the keyspaces list.

Dropping a Keyspace using Java API
You can drop a keyspace using the execute() method of the Session class. Follow the steps given below to drop a keyspace using Java API.

Step 1: Create a Cluster Object
First of all, create an instance of the Cluster.Builder class of the com.datastax.driver.core package as shown below.
//Creating Cluster.Builder object
Cluster.Builder builder1 = Cluster.builder();
Add a contact point (IP address of the node) using the addContactPoint() method of the Cluster.Builder object. This method returns Cluster.Builder.
//Adding contact point to the Cluster.Builder object
Cluster.Builder builder2 = builder1.addContactPoint("127.0.0.1");
Using the new builder object, create a cluster object by calling the build() method of the Cluster.Builder class.
//Building a cluster
Cluster cluster = builder2.build();
You can build a cluster object using a single line of code as shown below.
Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();

Step 2: Create a Session Object
Create an instance of the Session object using the connect() method of the Cluster class as shown below.
Session session = cluster.connect();
This method creates a new session and initializes it. If you already have a keyspace, you can bind the session to it by passing the keyspace name as a string to this method as shown below.
Session session = cluster.connect("Your keyspace name");

Step 3: Execute Query
You can execute CQL queries using the execute() method of the Session class. Pass the query either as a string or as a Statement object to the execute() method. A query passed as a string is sent to Cassandra and executed as CQL, just as if you had typed it in cqlsh.
In the following example, we are deleting a keyspace named tp. Store the query in a string variable and pass it to the execute() method as shown below.
String query = "DROP KEYSPACE tp;";
session.execute(query);

Given below is the complete program to drop a keyspace in Cassandra using Java API.
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class Drop_KeySpace {

   public static void main(String args[]){

      //Query
      String query = "DROP KEYSPACE tp";

      //Creating Cluster object
      Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();

      //Creating Session object
      Session session = cluster.connect();

      //Executing the query
      session.execute(query);

      System.out.println("Keyspace deleted");
   }
}
Save the above program with the class name followed by .java, and browse to the location where it is saved. Compile and execute the program as shown below. The file name must match the class name.
$ javac Drop_KeySpace.java
$ java Drop_KeySpace
Under normal conditions, it should produce the following output:
Keyspace deleted
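Because dropping a keyspace removes all of its data, a program that builds the statement from a variable may want a small guard in front of it. The sketch below is a hedged illustration using only the standard library: the class DropKeyspaceQuery, its build() method, and the protected list (taken from the system and system_traces keyspaces shown in the DESCRIBE output above) are assumptions for this example, as is the IF EXISTS clause.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper (not part of the DataStax driver): builds a DROP KEYSPACE query.
public class DropKeyspaceQuery {

    // Keyspaces this sketch refuses to drop; the list is an assumption for illustration.
    private static final Set<String> PROTECTED =
            new HashSet<>(Arrays.asList("system", "system_traces"));

    public static String build(String keyspace) {
        // Accept only plain, unquoted identifiers.
        if (keyspace == null || !keyspace.matches("[A-Za-z][A-Za-z0-9_]*")) {
            throw new IllegalArgumentException("Not a valid keyspace name: " + keyspace);
        }
        // Unquoted identifiers are case-insensitive, so compare in lower case.
        if (PROTECTED.contains(keyspace.toLowerCase(Locale.ROOT))) {
            throw new IllegalArgumentException("Refusing to drop keyspace: " + keyspace);
        }
        // IF EXISTS is an assumption; drop it if your CQL version rejects it.
        return "DROP KEYSPACE IF EXISTS " + keyspace + ";";
    }

    public static void main(String[] args) {
        // The string printed here is what would be handed to session.execute(...).
        System.out.println(build("tp"));
    }
}
```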

Cassandra – Create Keyspace

Creating a Keyspace using Cqlsh
A keyspace in Cassandra is a namespace that defines data replication on nodes. Keyspaces belong to the cluster as a whole rather than to individual nodes; a cluster commonly holds one keyspace per application. Given below is the syntax for creating a keyspace using the statement CREATE KEYSPACE.
Syntax
CREATE KEYSPACE <identifier> WITH <properties>
i.e.
CREATE KEYSPACE "KeySpace Name" WITH replication = {'class': 'Strategy name', 'replication_factor': 'No. of replicas'};
CREATE KEYSPACE "KeySpace Name" WITH replication = {'class': 'Strategy name', 'replication_factor': 'No. of replicas'} AND durable_writes = 'Boolean value';
The CREATE KEYSPACE statement has two properties: replication and durable_writes.

Replication
The replication option specifies the replica placement strategy and the number of replicas wanted. The following table lists all the replica placement strategies.
Strategy name | Description
SimpleStrategy | Specifies a simple replication factor for the cluster.
NetworkTopologyStrategy | Using this option, you can set the replication factor for each data center independently.
OldNetworkTopologyStrategy | This is a legacy replication strategy.
Example
Given below is an example of creating a keyspace. Here we are creating a keyspace named tutorialspoint. We are using the first replica placement strategy, i.e., SimpleStrategy, and we are setting the replication factor to 3 replicas.
cqlsh> CREATE KEYSPACE tutorialspoint WITH replication = {'class':'SimpleStrategy', 'replication_factor' : 3};
Verification
You can verify whether the keyspace is created using the command DESCRIBE. If you use this command over keyspaces, it will display all the keyspaces created as shown below.
cqlsh> DESCRIBE keyspaces;
tutorialspoint system system_traces
Here you can observe the newly created keyspace tutorialspoint.

Durable_writes
Using this option, you can instruct Cassandra whether to use the commitlog for updates on the current keyspace. This option is not mandatory. By default, the durable_writes property of a keyspace is set to true; it can be set to false, but this should not be done on a keyspace that uses SimpleStrategy.
Example
Given below is an example demonstrating the usage of the durable_writes property.
cqlsh> CREATE KEYSPACE test
... WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'datacenter1' : 3 }
... AND DURABLE_WRITES = false;
Verification
You can verify whether the durable_writes property of the test keyspace was set to false by querying the system keyspace. This query gives you all the keyspaces along with their properties. The output below has the column layout of the system.schema_keyspaces table used by Cassandra 2.x; Cassandra 3.0 and later keep this information in system_schema.keyspaces with different columns.
cqlsh> SELECT * FROM system.schema_keyspaces;

 keyspace_name  | durable_writes | strategy_class                                       | strategy_options
----------------+----------------+------------------------------------------------------+------------------------------
 test           |          False | org.apache.cassandra.locator.NetworkTopologyStrategy | {"datacenter1" : "3"}
 tutorialspoint |           True | org.apache.cassandra.locator.SimpleStrategy          | {"replication_factor" : "3"}
 system         |           True | org.apache.cassandra.locator.LocalStrategy           | { }
 system_traces  |           True | org.apache.cassandra.locator.SimpleStrategy          | {"replication_factor" : "2"}

(4 rows)
Here you can observe that the durable_writes property of the test keyspace was set to false.

Using a Keyspace
You can use a created keyspace using the keyword USE. Its syntax is as follows:
Syntax: USE <identifier>
Example
In the following example, we are using the keyspace tutorialspoint.
cqlsh> USE tutorialspoint;
cqlsh:tutorialspoint>

Creating a Keyspace using Java API
You can create a keyspace using the execute() method of the Session class. Follow the steps given below to create a keyspace using Java API.

Step 1: Create a Cluster Object
First of all, create an instance of the Cluster.Builder class of the com.datastax.driver.core package as shown below.
//Creating Cluster.Builder object
Cluster.Builder builder1 = Cluster.builder();
Add a contact point (IP address of the node) using the addContactPoint() method of the Cluster.Builder object. This method returns Cluster.Builder.
//Adding contact point to the Cluster.Builder object
Cluster.Builder builder2 = builder1.addContactPoint("127.0.0.1");
Using the new builder object, create a cluster object by calling the build() method of the Cluster.Builder class.
//Building a cluster
Cluster cluster = builder2.build();
You can build a cluster object in a single line of code as shown below.
Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();

Step 2: Create a Session Object
Create an instance of the Session object using the connect() method of the Cluster class as shown below.
Session session = cluster.connect();
This method creates a new session and initializes it. If you already have a keyspace, you can bind the session to it by passing the keyspace name as a string to this method as shown below.
Session session = cluster.connect("Your keyspace name");

Step 3: Execute Query
You can execute CQL queries using the execute() method of the Session class. Pass the query either as a string or as a Statement object to the execute() method. A query passed as a string is sent to Cassandra and executed as CQL, just as if you had typed it in cqlsh.
In this example, we are creating a keyspace named tp. We are using the first replica placement strategy, i.e., SimpleStrategy, and we are setting the replication factor to 1 replica. Store the query in a string variable and pass it to the execute() method as shown below.
String query = "CREATE KEYSPACE tp WITH replication "
   + "= {'class':'SimpleStrategy', 'replication_factor':1};";
session.execute(query);

Step 4: Use the KeySpace
You can use a created keyspace using the execute() method as shown below.
session.execute("USE tp");

Given below is the complete program to create and use a keyspace in Cassandra using Java API.
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class Create_KeySpace {

   public static void main(String args[]){

      //Query
      String query = "CREATE KEYSPACE tp WITH replication "
         + "= {'class':'SimpleStrategy', 'replication_factor':1};";

      //Creating Cluster object
      Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();

      //Creating Session object
      Session session = cluster.connect();

      //Executing the query
      session.execute(query);

      //Using the KeySpace
      session.execute("USE tp");

      System.out.println("Keyspace created");
   }
}
Save the above program with the class name followed by .java, and browse to the location where it is saved. Compile and execute the program as shown below.
$ javac Create_KeySpace.java
$ java Create_KeySpace
Under normal conditions, it will produce the following output:
Keyspace created
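The two forms of CREATE KEYSPACE described in this chapter (SimpleStrategy with a replication factor, and NetworkTopologyStrategy with per-data-center counts and durable_writes) can be assembled from parameters instead of being written by hand. The following is a minimal sketch using only the standard library; the class CreateKeyspaceQuery and its methods are hypothetical names for this illustration, and the IF NOT EXISTS clause is an assumption about the CQL version in use. The returned strings would be passed to session.execute() as in Step 3.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical helper (not part of the DataStax driver): builds CREATE KEYSPACE queries.
public class CreateKeyspaceQuery {

    // SimpleStrategy: a single replication factor for the whole cluster.
    public static String simple(String keyspace, int replicationFactor) {
        if (replicationFactor < 1) {
            throw new IllegalArgumentException("Replication factor must be at least 1");
        }
        return "CREATE KEYSPACE IF NOT EXISTS " + keyspace
                + " WITH replication = {'class':'SimpleStrategy', 'replication_factor':"
                + replicationFactor + "};";
    }

    // NetworkTopologyStrategy: one replica count per data center, plus durable_writes.
    public static String networkTopology(String keyspace,
                                         Map<String, Integer> replicasPerDataCenter,
                                         boolean durableWrites) {
        StringBuilder cql = new StringBuilder("CREATE KEYSPACE IF NOT EXISTS ")
                .append(keyspace)
                .append(" WITH replication = {'class':'NetworkTopologyStrategy'");
        for (Map.Entry<String, Integer> dc : replicasPerDataCenter.entrySet()) {
            cql.append(", '").append(dc.getKey()).append("':").append(dc.getValue());
        }
        cql.append("} AND durable_writes = ").append(durableWrites).append(";");
        return cql.toString();
    }

    public static void main(String[] args) {
        // Mirrors the tp and test keyspaces used as examples in this chapter.
        System.out.println(simple("tp", 1));
        System.out.println(networkTopology(
                "test", Collections.singletonMap("datacenter1", 3), false));
    }
}
```

The keyspace and data center names are inserted as given, so this sketch assumes they come from trusted code rather than user input.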