Apache Solr Tutorial

Solr is a scalable, ready-to-deploy search/storage engine optimized for searching large volumes of text-centric data. Solr is enterprise-ready, fast, and highly scalable. In this tutorial, we are going to learn the basics of Solr and how you can use it in practice.

Audience

This tutorial will be helpful for all those developers who would like to understand the basic functionalities of Apache Solr in order to develop sophisticated and high-performing applications.

Prerequisites

Before proceeding with this tutorial, we expect that the reader has good Java programming skills (although it is not mandatory) and some prior exposure to the Lucene and Hadoop environment.
Apache Solr – Overview
Solr is an open-source search platform that is used to build search applications. It was built on top of Lucene (a full-text search engine library). Solr is enterprise-ready, fast, and highly scalable. The applications built using Solr are sophisticated and deliver high performance.

It was Yonik Seeley who created Solr in 2004 in order to add search capabilities to the company website of CNET Networks. In January 2006, it was made an open-source project under the Apache Software Foundation. Its latest version, Solr 6.0, was released in 2016 with support for the execution of parallel SQL queries.

Solr can be used along with Hadoop. As Hadoop handles a large amount of data, Solr helps us in finding the required information from such a large source. Not only search, Solr can also be used for storage purposes. Like other NoSQL databases, it is a non-relational data storage and processing technology.

In short, Solr is a scalable, ready-to-deploy search/storage engine optimized to search large volumes of text-centric data.

Features of Apache Solr

Solr is a wrapper around Lucene's Java API. Therefore, using Solr, you can leverage all the features of Lucene. Let us take a look at some of the most prominent features of Solr −

RESTful APIs − To communicate with Solr, it is not mandatory to have Java programming skills. Instead, you can use RESTful services to communicate with it. We send documents to Solr in file formats like XML, JSON, and CSV and get results in the same formats (see the sketch at the end of this chapter).

Full-text search − Solr provides all the capabilities needed for a full-text search such as tokens, phrases, spell check, wildcards, and auto-complete.

Enterprise ready − According to the needs of the organization, Solr can be deployed in any kind of system (big or small), such as standalone, distributed, cloud, etc.

Flexible and Extensible − By extending the Java classes and configuring them accordingly, we can customize the components of Solr easily.

NoSQL database − Solr can also be used as a big data-scale NoSQL database where we can distribute the search tasks along a cluster.

Admin Interface − Solr provides an easy-to-use, user-friendly, feature-rich user interface, using which we can perform all the possible tasks such as managing logs and adding, deleting, updating, and searching documents.

Highly Scalable − While using Solr with Hadoop, we can scale its capacity by adding replicas.

Text-Centric and Sorted by Relevance − Solr is mostly used to search text documents, and the results are delivered in order of relevance to the user's query.

Unlike Lucene, you don't need to have Java programming skills while working with Apache Solr. It provides a wonderful ready-to-deploy service to build a search box featuring autocomplete, which Lucene doesn't provide. Using Solr, we can scale, distribute, and manage indexes for large-scale (Big Data) applications.

Lucene in Search Applications

Lucene is a simple yet powerful Java-based search library. It can be used in any application to add search capability. Lucene is a scalable and high-performance library used to index and search virtually any kind of text. The Lucene library provides the core operations which are required by any search application, such as indexing and searching.

If we have a web portal with a huge volume of data, then we will most probably require a search engine in our portal to extract relevant information from the huge pool of data. Lucene works as the heart of any search application and provides the vital operations pertaining to indexing and searching.
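Since Solr exposes everything over HTTP, any language that can make an HTTP request can talk to it. The following is a minimal sketch (not part of the original tutorial) showing what such a RESTful call looks like from plain Java, without the SolrJ client. The class name SolrRestQuery, the core name my_core, the default port 8983, and the /select parameters are assumptions; adjust them to match your own installation.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical example class; queries a Solr core over its REST interface.
public class SolrRestQuery {
   public static void main(String[] args) throws Exception {
      // The /select handler returns matching documents; q=*:* matches every
      // document and wt=json asks for the response in JSON format.
      URL url = new URL("http://localhost:8983/solr/my_core/select?q=*:*&wt=json");
      HttpURLConnection connection = (HttpURLConnection) url.openConnection();
      connection.setRequestMethod("GET");

      // Read and print the raw JSON response returned by Solr.
      try (BufferedReader reader = new BufferedReader(
            new InputStreamReader(connection.getInputStream()))) {
         String line;
         while ((line = reader.readLine()) != null) {
            System.out.println(line);
         }
      }
   }
}

The same request can be issued with curl or from a browser address bar, which is what makes Solr usable from non-Java environments.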
Apache Solr – On Hadoop
Solr can be used along with Hadoop. As Hadoop handles a large amount of data, Solr helps us in finding the required information from such a large source. In this section, let us understand how you can install Hadoop on your system.

Downloading Hadoop

Given below are the steps to be followed to download Hadoop onto your system.

Step 1 − Go to the homepage of Hadoop. You can use the link − www.hadoop.apache.org/. Click the link Releases, as highlighted in the following screenshot.

It will redirect you to the Apache Hadoop Releases page, which contains links for mirrors of source and binary files of various versions of Hadoop as follows −

Step 2 − Select the latest version of Hadoop (in our tutorial, it is 2.6.4) and click its binary link. It will take you to a page where mirrors for the Hadoop binary are available. Click one of these mirrors to download Hadoop.

Download Hadoop from Command Prompt

Open a Linux terminal and log in as super-user.

$ su
password:

Go to the directory where you need to install Hadoop, and save the file there using the link copied earlier, as shown in the following code block.

# cd /usr/local
# wget http://redrockdigimark.com/apachemirror/hadoop/common/hadoop-2.6.4/hadoop-2.6.4.tar.gz

After downloading Hadoop, extract it using the following commands.

# tar zxvf hadoop-2.6.4.tar.gz
# mkdir hadoop
# mv hadoop-2.6.4/* hadoop/
# exit

Installing Hadoop

Follow the steps given below to install Hadoop in pseudo-distributed mode.

Step 1: Setting Up Hadoop

You can set the Hadoop environment variables by appending the following commands to the ~/.bashrc file.

export HADOOP_HOME=/usr/local/hadoop
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_INSTALL=$HADOOP_HOME

Next, apply all the changes to the currently running system.

$ source ~/.bashrc

Step 2: Hadoop Configuration

You can find all the Hadoop configuration files in the location "$HADOOP_HOME/etc/hadoop". You need to make changes in those configuration files according to your Hadoop infrastructure.

$ cd $HADOOP_HOME/etc/hadoop

In order to develop Hadoop programs in Java, you have to reset the Java environment variable in the hadoop-env.sh file by replacing the JAVA_HOME value with the location of Java on your system.

export JAVA_HOME=/usr/local/jdk1.7.0_71

The following is the list of files that you have to edit to configure Hadoop −

core-site.xml
hdfs-site.xml
yarn-site.xml
mapred-site.xml

core-site.xml

The core-site.xml file contains information such as the port number used for the Hadoop instance, the memory allocated for the file system, the memory limit for storing the data, and the size of the Read/Write buffers.

Open core-site.xml and add the following properties inside the <configuration>, </configuration> tags.

<configuration>
   <property>
      <name>fs.default.name</name>
      <value>hdfs://localhost:9000</value>
   </property>
</configuration>

hdfs-site.xml

The hdfs-site.xml file contains information such as the value of replication data, the namenode path, and the datanode paths of your local file systems, that is, the places where you want to store the Hadoop infrastructure.

Let us assume the following data.

dfs.replication (data replication value) = 1

(In the paths given below, /hadoop/ is the user name, and hadoopinfra/hdfs/namenode is the directory created by the hdfs file system.)
namenode path = /home/hadoop/hadoopinfra/hdfs/namenode

(hadoopinfra/hdfs/datanode is the directory created by the hdfs file system.)

datanode path = /home/hadoop/hadoopinfra/hdfs/datanode

Open this file and add the following properties inside the <configuration>, </configuration> tags.

<configuration>
   <property>
      <name>dfs.replication</name>
      <value>1</value>
   </property>
   <property>
      <name>dfs.name.dir</name>
      <value>file:///home/hadoop/hadoopinfra/hdfs/namenode</value>
   </property>
   <property>
      <name>dfs.data.dir</name>
      <value>file:///home/hadoop/hadoopinfra/hdfs/datanode</value>
   </property>
</configuration>

Note − In the above file, all the property values are user-defined and you can make changes according to your Hadoop infrastructure.

yarn-site.xml

This file is used to configure YARN in Hadoop. Open the yarn-site.xml file and add the following properties between the <configuration>, </configuration> tags in this file.

<configuration>
   <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
   </property>
</configuration>

mapred-site.xml

This file is used to specify which MapReduce framework we are using. By default, Hadoop contains only a template of this file, so first you need to copy mapred-site.xml.template to mapred-site.xml using the following command.

$ cp mapred-site.xml.template mapred-site.xml

Open the mapred-site.xml file and add the following properties inside the <configuration>, </configuration> tags.

<configuration>
   <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
   </property>
</configuration>

Verifying Hadoop Installation

The following steps are used to verify the Hadoop installation.

Step 1: Name Node Setup

Set up the namenode using the command "hdfs namenode -format" as follows.

$ cd ~
$ hdfs namenode -format

The expected result is as follows.

10/24/14 21:30:55 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = localhost/192.168.1.11
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 2.6.4
…
…
10/24/14 21:30:56 INFO common.Storage: Storage directory
/home/hadoop/hadoopinfra/hdfs/namenode has been successfully formatted.
10/24/14 21:30:56 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
10/24/14 21:30:56 INFO util.ExitUtil: Exiting with status 0
10/24/14 21:30:56 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at localhost/192.168.1.11
************************************************************/

Step 2: Verifying the Hadoop dfs

The following command is used to start the Hadoop dfs. Executing this command will start your Hadoop file system.

$ start-dfs.sh

The expected output is as follows −

10/24/14 21:37:56
Starting namenodes on [localhost]
localhost: starting namenode, logging to /home/hadoop/hadoop-2.6.4/logs/hadoop-hadoop-namenode-localhost.out
localhost: starting datanode, logging to /home/hadoop/hadoop-2.6.4/logs/hadoop-hadoop-datanode-localhost.out
Starting secondary namenodes [0.0.0.0]

Step 3: Verifying the Yarn Script

The following command is used to start the YARN script. Executing this command will start your YARN daemons.
$ start-yarn.sh

The expected output is as follows −

starting yarn daemons
starting resourcemanager, logging to /home/hadoop/hadoop-2.6.4/logs/yarn-hadoop-resourcemanager-localhost.out
localhost: starting nodemanager, logging to /home/hadoop/hadoop-2.6.4/logs/yarn-hadoop-nodemanager-localhost.out

Step 4: Accessing Hadoop on Browser

The default port number to access Hadoop is 50070. Use the following URL to get Hadoop services in the browser.

http://localhost:50070/

Installing Solr on Hadoop

Follow the steps given below to download and install Solr.

Step 1

Open the homepage of Apache Solr by clicking the following link − https://lucene.apache.org/solr/

Step 2

Click the download button (highlighted in the above screenshot). On clicking, you will be redirected to a page where you have various mirrors of Apache Solr. Select a mirror and click on it, which will redirect you to a page where you can download the source and binary files of Apache Solr.
Apache Solr – Core
A Solr Core is a running instance of a Lucene index that contains all the Solr configuration files required to use it. We need to create a Solr Core to perform operations like indexing and analyzing. A Solr application may contain one or multiple cores. If necessary, two cores in a Solr application can communicate with each other.

Creating a Core

After installing and starting Solr, you can connect to the client (web interface) of Solr. As highlighted in the following screenshot, initially there are no cores in Apache Solr. Now, we will see how to create a core in Solr.

Using the create command

One way to create a core is to create a schema-less core using the create command, as shown below −

[Hadoop@localhost bin]$ ./solr create -c Solr_sample

Here, we are trying to create a core named Solr_sample in Apache Solr. This command creates a core, displaying the following message.

Copying configuration to new core instance directory:
/home/Hadoop/Solr/server/solr/Solr_sample

Creating new core "Solr_sample" using command:
http://localhost:8983/solr/admin/cores?action=CREATE&name=Solr_sample&instanceDir=Solr_sample

{
   "responseHeader":{
      "status":0,
      "QTime":11550
   },
   "core":"Solr_sample"
}

You can create multiple cores in Solr. On the left-hand side of the Solr Admin, you can see a core selector where you can select the newly created core, as shown in the following screenshot.

Using the create_core command

Alternatively, you can create a core using the create_core command. This command has the following options −

-c core_name − Name of the core you want to create
-p port_name − Port at which you want to create the core
-d conf_dir − Configuration directory to be used for the core

Let's see how you can use the create_core command. Here, we will try to create a core named my_core.

[Hadoop@localhost bin]$ ./solr create_core -c my_core

On executing, the above command creates a core, displaying the following message −

Copying configuration to new core instance directory:
/home/Hadoop/Solr/server/solr/my_core

Creating new core "my_core" using command:
http://localhost:8983/solr/admin/cores?action=CREATE&name=my_core&instanceDir=my_core

{
   "responseHeader":{
      "status":0,
      "QTime":1346
   },
   "core":"my_core"
}

Deleting a Core

You can delete a core using the delete command of Apache Solr. Let's suppose we have a core named my_core in Solr, as shown in the following screenshot. You can delete this core using the delete command by passing the name of the core to this command as follows −

[Hadoop@localhost bin]$ ./solr delete -c my_core

On executing the above command, the specified core will be deleted, displaying the following message.

Deleting core "my_core" using command:
http://localhost:8983/solr/admin/cores?action=UNLOAD&core=my_core&deleteIndex=true&deleteDataDir=true&deleteInstanceDir=true

{
   "responseHeader":{
      "status":0,
      "QTime":170
   }
}

You can open the web interface of Solr to verify whether the core has been deleted or not.
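Cores can also be administered programmatically through SolrJ's CoreAdminRequest class instead of the bin scripts. The sketch below is an illustration under assumptions rather than part of the original tutorial: it expects SolrJ (6.x) on the classpath, Solr running on the default port, and, unlike bin/solr create, an instance directory named my_core with its conf directory already present on the server. The class name CoreAdminExample is hypothetical.

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.client.solrj.request.CoreAdminRequest;

// Hypothetical example class; creates and then unloads a core via the CoreAdmin API.
public class CoreAdminExample {
   public static void main(String[] args) throws Exception {
      // Connect to the Solr node itself (not to a particular core),
      // because core administration happens at the node level.
      SolrClient client = new HttpSolrClient.Builder("http://localhost:8983/solr").build();

      // Create a core named my_core using the instance directory my_core
      // (the directory and its configuration must already exist on the server).
      CoreAdminRequest.createCore("my_core", "my_core", client);
      System.out.println("Core created");

      // Unload (remove) the same core again.
      CoreAdminRequest.unloadCore("my_core", client);
      System.out.println("Core unloaded");

      client.close();
   }
}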
Apache Solr – Deleting Documents

Deleting the Document

To delete documents from the index of Apache Solr, we need to specify the IDs of the documents to be deleted between the <delete></delete> tags.

<delete>
   <id>003</id>
   <id>005</id>
   <id>004</id>
   <id>002</id>
</delete>

Here, this XML code is used to delete the documents with IDs 002, 003, 004, and 005. Save this code in a file with the name delete.xml.

If you want to delete the documents from the index which belong to the core named my_core, then you can post the delete.xml file using the post tool, as shown below.

[Hadoop@localhost bin]$ ./post -c my_core delete.xml

On executing the above command, you will get the following output.

/home/Hadoop/java/bin/java -classpath /home/Hadoop/Solr/dist/solr-core-6.2.0.jar
-Dauto=yes -Dc=my_core -Ddata=files org.apache.solr.util.SimplePostTool delete.xml
SimplePostTool version 5.0.0
Posting files to [base] url http://localhost:8983/solr/my_core/update…
Entering auto mode. File endings considered are
xml,json,jsonl,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log
POSTing file delete.xml (application/xml) to [base]
1 files indexed.
COMMITting Solr index changes to http://localhost:8983/solr/my_core/update…
Time spent: 0:00:00.179

Verification

Visit the homepage of the Apache Solr web interface and select the core my_core. Try to retrieve all the documents by passing the query "*:*" in the text area q and execute the query. On executing, you can observe that the specified documents are deleted.

Deleting a Field

Sometimes we need to delete documents based on fields other than ID. For example, we may have to delete the documents where the city is Chennai. In such cases, you need to specify the name and value of the field within the <query></query> tag pair.

<delete>
   <query>city:Chennai</query>
</delete>

Save it as delete_field.xml and perform the delete operation on the core named my_core using the post tool of Solr.

[Hadoop@localhost bin]$ ./post -c my_core delete_field.xml

On executing the above command, it produces the following output.

/home/Hadoop/java/bin/java -classpath /home/Hadoop/Solr/dist/solr-core-6.2.0.jar
-Dauto=yes -Dc=my_core -Ddata=files org.apache.solr.util.SimplePostTool delete_field.xml
SimplePostTool version 5.0.0
Posting files to [base] url http://localhost:8983/solr/my_core/update…
Entering auto mode. File endings considered are
xml,json,jsonl,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log
POSTing file delete_field.xml (application/xml) to [base]
1 files indexed.
COMMITting Solr index changes to http://localhost:8983/solr/my_core/update…
Time spent: 0:00:00.084

Verification

Visit the homepage of the Apache Solr web interface and select the core my_core. Try to retrieve all the documents by passing the query "*:*" in the text area q and execute the query. On executing, you can observe that the documents containing the specified field-value pair are deleted.

Deleting All Documents

Just like deleting documents by a specific field, if you want to delete all the documents from an index, you just need to pass the query "*:*" between the tags <query></query>, as shown below.

<delete>
   <query>*:*</query>
</delete>

Save it as delete_all.xml and perform the delete operation on the core named my_core using the post tool of Solr.

[Hadoop@localhost bin]$ ./post -c my_core delete_all.xml

On executing the above command, it produces the following output.
/home/Hadoop/java/bin/java -classpath /home/Hadoop/Solr/dist/solr-core-6.2.0.jar
-Dauto=yes -Dc=my_core -Ddata=files org.apache.solr.util.SimplePostTool delete_all.xml
SimplePostTool version 5.0.0
Posting files to [base] url http://localhost:8983/solr/my_core/update…
Entering auto mode. File endings considered are
xml,json,jsonl,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log
POSTing file delete_all.xml (application/xml) to [base]
1 files indexed.
COMMITting Solr index changes to http://localhost:8983/solr/my_core/update…
Time spent: 0:00:00.138

Verification

Visit the homepage of the Apache Solr web interface and select the core my_core. Try to retrieve all the documents by passing the query "*:*" in the text area q and execute the query. On executing, you can observe that all the documents in the index are deleted.

Deleting all the documents using Java (Client API)

Following is the Java program to delete all the documents from the Apache Solr index. Save this code in a file with the name DeletingAllDocuments.java.

import java.io.IOException;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrClient;

public class DeletingAllDocuments {
   public static void main(String args[]) throws SolrServerException, IOException {

      //Preparing the Solr client
      String urlString = "http://localhost:8983/solr/my_core";
      SolrClient solr = new HttpSolrClient.Builder(urlString).build();

      //Deleting all the documents from Solr
      solr.deleteByQuery("*:*");

      //Saving the changes
      solr.commit();
      System.out.println("Documents deleted");
   }
}

Compile and run the above code by executing the following commands in the terminal −

[Hadoop@localhost bin]$ javac DeletingAllDocuments.java
[Hadoop@localhost bin]$ java DeletingAllDocuments

On executing the above command, you will get the following output.

Documents deleted
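The selective deletions shown earlier with XML files can also be performed directly from the Java client. The following is a minimal SolrJ sketch (not from the original tutorial) that assumes the core my_core and reuses the IDs and the city field from the examples above; the class name DeletingSelectedDocuments is hypothetical.

import java.util.Arrays;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;

// Hypothetical example class; deletes documents by ID and by query.
public class DeletingSelectedDocuments {
   public static void main(String[] args) throws Exception {
      //Preparing the Solr client
      String urlString = "http://localhost:8983/solr/my_core";
      SolrClient solr = new HttpSolrClient.Builder(urlString).build();

      // Equivalent of <delete><id>...</id></delete>: delete by unique key.
      solr.deleteById(Arrays.asList("003", "005"));

      // Equivalent of <delete><query>city:Chennai</query></delete>.
      solr.deleteByQuery("city:Chennai");

      // Commit so the deletions become visible to searches.
      solr.commit();
      System.out.println("Selected documents deleted");

      solr.close();
   }
}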
Apache Solr – Indexing Data
In general, indexing is a systematic arrangement of documents (or other entities). Indexing enables users to locate information in a document. Indexing collects, parses, and stores documents. Indexing is done to increase the speed and performance of a search query while finding a required document.

Indexing in Apache Solr

In Apache Solr, we can index (add, delete, modify) various document formats such as XML, CSV, PDF, etc. We can add data to the Solr index in several ways −

Using the Solr web interface.
Using any of the client APIs like Java, Python, etc.
Using the post tool.

In this chapter, we will discuss how to add data to the index of Apache Solr using these interfaces (command line, web interface, and Java client API).

Adding Documents using the Post Command

Solr has a post command in its bin/ directory. Using this command, you can index various formats of files such as JSON, XML, and CSV in Apache Solr.

Browse through the bin directory of Apache Solr and execute the -h option of the post command, as shown in the following code block.

[Hadoop@localhost bin]$ cd $SOLR_HOME
[Hadoop@localhost bin]$ ./post -h

On executing the above command, you will get a list of options of the post command, as shown below.

Usage: post -c <collection> [OPTIONS] <files|directories|urls|-d ["..."]>
or post -help

collection name defaults to DEFAULT_SOLR_COLLECTION if not specified

OPTIONS
=======
Solr options:
   -url <base Solr update URL> (overrides collection, host, and port)
   -host <host> (default: localhost)
   -p or -port <port> (default: 8983)
   -commit yes|no (default: yes)

Web crawl options:
   -recursive <depth> (default: 1)
   -delay <seconds> (default: 10)

Directory crawl options:
   -delay <seconds> (default: 0)

stdin/args options:
   -type <content/type> (default: application/xml)

Other options:
   -filetypes <type>[,<type>,…] (default:
      xml,json,jsonl,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log)
   -params "<key>=<value>[&<key>=<value>…]" (values must be URL-encoded; these pass through to Solr update request)
   -out yes|no (default: no; yes outputs Solr response to console)
   -format solr (sends application/json content as Solr commands to /update instead of /update/json/docs)

Examples:
* JSON file: ./post -c wizbang events.json
* XML files: ./post -c records article*.xml
* CSV file: ./post -c signals LATEST-signals.csv
* Directory of files: ./post -c myfiles ~/Documents
* Web crawl: ./post -c gettingstarted http://lucene.apache.org/solr -recursive 1 -delay 1
* Standard input (stdin): echo '{commit: {}}' | ./post -c my_collection -type application/json -out yes -d
* Data as string: ./post -c signals -type text/csv -out yes -d $'id,value\n1,0.47'

Example

Suppose we have a file named sample.csv with the following content (in the bin directory).

Student ID | First Name | Last Name    | Phone      | City
001        | Rajiv      | Reddy        | 9848022337 | Hyderabad
002        | Siddharth  | Bhattacharya | 9848022338 | Kolkata
003        | Rajesh     | Khanna       | 9848022339 | Delhi
004        | Preethi    | Agarwal      | 9848022330 | Pune
005        | Trupthi    | Mohanty      | 9848022336 | Bhubaneshwar
006        | Archana    | Mishra       | 9848022335 | Chennai

The above dataset contains personal details like student ID, first name, last name, phone, and city. The CSV file of the dataset is shown below. Here, you must note that the first line of the file documents the schema (the field names).
id, first_name, last_name, phone_no, location
001, Pruthvi, Reddy, 9848022337, Hyderabad
002, kasyap, Sastry, 9848022338, Vishakapatnam
003, Rajesh, Khanna, 9848022339, Delhi
004, Preethi, Agarwal, 9848022330, Pune
005, Trupthi, Mohanty, 9848022336, Bhubaneshwar
006, Archana, Mishra, 9848022335, Chennai

You can index this data under the core named Solr_sample using the post command as follows −

[Hadoop@localhost bin]$ ./post -c Solr_sample sample.csv

On executing the above command, the given document is indexed under the specified core, generating the following output.

/home/Hadoop/java/bin/java -classpath /home/Hadoop/Solr/dist/solr-core-6.2.0.jar
-Dauto=yes -Dc=Solr_sample -Ddata=files org.apache.solr.util.SimplePostTool sample.csv
SimplePostTool version 5.0.0
Posting files to [base] url http://localhost:8983/solr/Solr_sample/update…
Entering auto mode. File endings considered are
xml,json,jsonl,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log
POSTing file sample.csv (text/csv) to [base]
1 files indexed.
COMMITting Solr index changes to http://localhost:8983/solr/Solr_sample/update…
Time spent: 0:00:00.228

Visit the homepage of the Solr web UI using the following URL −

http://localhost:8983/

Select the core Solr_sample. By default, the request handler is /select and the query is "*:*". Without doing any modifications, click the Execute Query button at the bottom of the page.

On executing the query, you can observe the contents of the indexed CSV document in JSON format (the default), as shown in the following screenshot.

Note − In the same way, you can index other file formats such as JSON, XML, CSV, etc.

Adding Documents using the Solr Web Interface

You can also index documents using the web interface provided by Solr. Let us see how to index the following JSON document.

[
   {
      "id" : "001",
      "name" : "Ram",
      "age" : 53,
      "Designation" : "Manager",
      "Location" : "Hyderabad"
   },
   {
      "id" : "002",
      "name" : "Robert",
      "age" : 43,
      "Designation" : "SR.Programmer",
      "Location" : "Chennai"
   },
   {
      "id" : "003",
      "name" : "Rahim",
      "age" : 25,
      "Designation" : "JR.Programmer",
      "Location" : "Delhi"
   }
]

Step 1

Open the Solr web interface using the following URL −

http://localhost:8983/

Step 2

Select the core Solr_sample. By default, the values of the fields Request-Handler, Commit Within, Overwrite, and Boost are /update, 1000, true, and 1.0 respectively, as shown in the following screenshot.

Now, choose the document format you want from JSON, CSV, XML, etc. Type the document to be indexed in the text area and click the Submit Document button, as shown in the following screenshot.

Adding Documents using the Java Client API

Following is the Java program to add documents to the Apache Solr index. Save this code in a file with the name AddingDocument.java.
import java.io.IOException;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class AddingDocument {
   public static void main(String args[]) throws Exception {

      //Preparing the Solr client
      String urlString = "http://localhost:8983/solr/my_core";
      SolrClient solr = new HttpSolrClient.Builder(urlString).build();

      //Preparing the Solr document
      SolrInputDocument doc = new SolrInputDocument();

      //Adding fields to the document
      doc.addField("id", "003");
      doc.addField("name", "Rajaman");
      doc.addField("age", "34");
      doc.addField("addr", "vishakapatnam");

      //Adding the document to Solr
      solr.add(doc);

      //Saving the changes
      solr.commit();
      System.out.println("Documents added");
   }
}

Compile and run the above code by executing the following commands in the terminal −

[Hadoop@localhost bin]$ javac AddingDocument.java
[Hadoop@localhost bin]$ java AddingDocument
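When you have several documents to index, SolrJ also accepts a whole collection in a single add() call, which avoids one round trip per document. The sketch below is an illustrative assumption (not part of the original tutorial): it reuses the Solr_sample core and the field names from the sample.csv example above, and the class name AddingMultipleDocuments is hypothetical.

import java.util.ArrayList;
import java.util.List;

import org.apache.solr.client.solrj.SolrClient;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

// Hypothetical example class; indexes a small batch of documents in one request.
public class AddingMultipleDocuments {
   public static void main(String[] args) throws Exception {
      SolrClient solr = new HttpSolrClient.Builder(
         "http://localhost:8983/solr/Solr_sample").build();

      //Building a small batch of documents in memory
      List<SolrInputDocument> docs = new ArrayList<>();
      String[][] rows = {
         {"001", "Rajiv", "Reddy", "9848022337", "Hyderabad"},
         {"002", "Siddharth", "Bhattacharya", "9848022338", "Kolkata"}
      };
      for (String[] row : rows) {
         SolrInputDocument doc = new SolrInputDocument();
         doc.addField("id", row[0]);
         doc.addField("first_name", row[1]);
         doc.addField("last_name", row[2]);
         doc.addField("phone_no", row[3]);
         doc.addField("location", row[4]);
         docs.add(doc);
      }

      //A single add() call sends the whole batch to Solr
      solr.add(docs);
      solr.commit();
      System.out.println("Documents added: " + docs.size());

      solr.close();
   }
}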
AWS Quicksight – Quick Guide
AWS Quicksight – Overview

AWS Quicksight is one of the most powerful Business Intelligence tools, which allows you to create interactive dashboards within minutes to provide business insights to organizations. There are a number of visualizations or graphical formats in which the dashboards can be created. The dashboards are automatically updated as their data is updated or on a schedule. You can also embed a dashboard created in Quicksight into your web application.

With ML Insights (Machine Learning insights), Quicksight uses its built-in algorithms to find anomalies or peaks in the historical data. This helps you prepare for business requirements ahead of time based on these insights.

Here is a quick guide to get started with Quicksight. Below is the official product description page from AWS −

https://aws.amazon.com/quicksight/

You can also subscribe for an AWS trial account by filling in the required information and clicking the Continue button.

AWS Quicksight – Landing Page

To access the AWS Quicksight tool, you can open it directly by passing this URL in a web browser or by navigating to AWS Console → Services −

https://aws.amazon.com/quicksight/

Once you open this URL, click on "Sign in to the Console" at the top right corner. You need to provide the following details to log in to the Quicksight tool −

Account ID or alias
IAM user name
Password

Once you log in to Quicksight, you will see the following screen −

As marked in the above image,

Section A − The "New Analysis" icon is used to create a new analysis. When you click on it, it asks you to select a data set. You can also create a new data set as shown below −

Section B − The "Manage data" icon shows all the data sets that have already been input to Quicksight. This option can be used to manage a dataset without creating any analysis.

Section C − This shows the various data sources you have already connected to. You can also connect to a new data source or upload a file.

Section D − This section contains icons for already created analyses, published dashboards, and tutorial videos explaining Quicksight in detail. You can click on each tab to view them as below −

All analyses

Here, you can see all the existing analyses in the AWS Quicksight account, including reports and dashboards.

All dashboards

This option shows only the existing dashboards in the AWS Quicksight account.

Tutorial videos

Another option to open the Quicksight console is to navigate to the AWS console using the below URL −

https://aws.amazon.com/console/

Once you log in, navigate to the Services tab and search for Quicksight in the search bar. If you have recently used Quicksight services in the AWS account, it will be seen under the History tab.

AWS Quicksight – Using Data Sources

AWS Quicksight accepts data from various sources. Once you click on "New Dataset" on the home page, it gives you options of all the data sources that can be used. Below are the sources, containing the list of all internal and external sources −

Let us go through connecting Quicksight with some of the most commonly used data sources −

Uploading a file from the system

It allows you to input .csv, .tsv, .clf, .elf, .xlsx, and JSON format files only. Once you select the file, Quicksight automatically recognizes the file and displays the data. When you click on the Upload a File button, you need to provide the location of the file which you want to use to create the dataset.

Using a file from S3

The screen will appear as below.
Under Data source name, you can enter the name to be displayed for the data set that would be created. Also, you are required to either upload a manifest file from your local system or provide the S3 location of the manifest file.

A manifest file is a JSON-format file which specifies the URL/location of the input files and their format. You can enter more than one input file, provided the format is the same. Here is an example of a manifest file. The "URIs" parameter is used to pass the S3 locations of the input files.

{
   "fileLocations": [
      {
         "URIs": [
            "url of first file",
            "url of second file",
            "url of 3rd file and so on"
         ]
      }
   ],
   "globalUploadSettings": {
      "format": "CSV",
      "delimiter": ",",
      "textqualifier": "\"",
      "containsHeader": "true"
   }
}

The parameters passed in globalUploadSettings are the default ones. You can change these parameters as per your requirements.

MySQL

You need to enter the database information in the fields to connect to your database. Once it is connected to your database, you can import the data from it.

The following information is required when you connect to any RDBMS database −

DSN name
Type of connection
Database server name
Port
Database name
User name
Password

The following RDBMS-based data sources are supported in Quicksight −

Amazon Athena
Amazon Aurora
Amazon Redshift
Amazon Redshift Spectrum
Amazon S3
Amazon S3 Analytics
Apache Spark 2.0 or later
MariaDB 10.0 or later
Microsoft SQL Server 2012 or later
MySQL 5.1 or later
PostgreSQL 9.3.1 or later
Presto 0.167 or later
Snowflake
Teradata 14.0 or later

Athena

Athena is the AWS tool used to run queries on tables. You can choose any table from Athena or run a custom query on those tables and use the output of those queries in Quicksight. There are a couple of steps to choose a data source.

When you choose Athena, the following screen appears. You can input any data source name which you want to give to your data source in Quicksight. Click on "Validate Connection". Once the connection is validated, click on the "Create new source" button.

Now choose the table name from the dropdown. The dropdown will show the databases present in Athena, which will further show the tables in those databases. Otherwise, you can click on "Use custom SQL" to run a query on the Athena tables.

Once done, you can click on "Edit/Preview data" or "Visualize" to either edit your data or directly visualize it.
Cassandra – Create Table
Creating a Table

You can create a table using the command CREATE TABLE. Given below is the syntax for creating a table.

Syntax

CREATE (TABLE | COLUMNFAMILY) <tablename>
('<column-definition>', '<column-definition>')
(WITH <option> AND <option>)

Defining a Column

You can define a column as shown below.

column name1 data type,
column name2 data type,

example:

age int,
name text

Primary Key

The primary key is a column that is used to uniquely identify a row. Therefore, defining a primary key is mandatory while creating a table. A primary key is made of one or more columns of a table. You can define the primary key of a table as shown below.

CREATE TABLE tablename(
   column1 name datatype PRIMARY KEY,
   column2 name data type,
   column3 name data type
)

or

CREATE TABLE tablename(
   column1 name datatype,
   column2 name data type,
   column3 name data type,
   PRIMARY KEY (column1)
)

Example

Given below is an example to create a table in Cassandra using cqlsh. Here we are −

Using the keyspace tutorialspoint
Creating a table named emp

It will have details such as employee name, id, city, salary, and phone number. The employee id is the primary key.

cqlsh> USE tutorialspoint;
cqlsh:tutorialspoint> CREATE TABLE emp(
   emp_id int PRIMARY KEY,
   emp_name text,
   emp_city text,
   emp_sal varint,
   emp_phone varint
);

Verification

The SELECT statement will show you the columns of the table. Verify the table using the SELECT statement as shown below.

cqlsh:tutorialspoint> select * from emp;

 emp_id | emp_city | emp_name | emp_phone | emp_sal
--------+----------+----------+-----------+---------

(0 rows)

Here you can observe that the table has been created with the given columns.

Creating a Table using Java API

You can create a table using the execute() method of the Session class. Follow the steps given below to create a table using the Java API.

Step 1: Create a Cluster Object

First of all, create an instance of the Cluster.builder class of the com.datastax.driver.core package as shown below.

//Creating Cluster.Builder object
Cluster.Builder builder1 = Cluster.builder();

Add a contact point (IP address of the node) using the addContactPoint() method of the Cluster.Builder object. This method returns Cluster.Builder.

//Adding contact point to the Cluster.Builder object
Cluster.Builder builder2 = builder1.addContactPoint("127.0.0.1");

Using the new builder object, create a cluster object. To do so, you have a method called build() in the Cluster.Builder class. The following code shows how to create a cluster object.

//Building a cluster
Cluster cluster = builder2.build();

You can also build a cluster object using a single line of code as shown below.

Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();

Step 2: Create a Session Object

Create an instance of the Session object using the connect() method of the Cluster class as shown below.

Session session = cluster.connect();

This method creates a new session and initializes it. If you already have a keyspace, you can set it to the existing one by passing the keyspace name in string format to this method, as shown below.

Session session = cluster.connect("Your keyspace name");

Here we are using the keyspace named tp. Therefore, create the session object as shown below.

Session session = cluster.connect("tp");

Step 3: Execute Query

You can execute CQL queries using the execute() method of the Session class. Pass the query either in string format or as a Statement class object to the execute() method.
Whatever you pass to this method in string format will be executed on cqlsh. In the following example, we are creating a table named emp. You have to store the query in a string variable and pass it to the execute() method as shown below.

//Query
String query = "CREATE TABLE emp(emp_id int PRIMARY KEY, "
   + "emp_name text, "
   + "emp_city text, "
   + "emp_sal varint, "
   + "emp_phone varint );";
session.execute(query);

Given below is the complete program to create a table in Cassandra using the Java API.

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class Create_Table {

   public static void main(String args[]){

      //Query
      String query = "CREATE TABLE emp(emp_id int PRIMARY KEY, "
         + "emp_name text, "
         + "emp_city text, "
         + "emp_sal varint, "
         + "emp_phone varint );";

      //Creating Cluster object
      Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();

      //Creating Session object
      Session session = cluster.connect("tp");

      //Executing the query
      session.execute(query);

      System.out.println("Table created");
   }
}

Save the above program with the class name followed by .java, and browse to the location where it is saved. Compile and execute the program as shown below.

$javac Create_Table.java
$java Create_Table

Under normal conditions, it should produce the following output −

Table created
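The verification done above with cqlsh can also be performed from Java using the same driver classes. The following is a small sketch (not part of the original tutorial) that assumes the keyspace tp and the emp table created above; the class name Verify_Table is hypothetical.

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

// Hypothetical example class; prints the columns and rows of the emp table.
public class Verify_Table {
   public static void main(String args[]){

      //Connecting to the cluster and the keyspace used above
      Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
      Session session = cluster.connect("tp");

      //Running the same verification query used in cqlsh
      ResultSet result = session.execute("SELECT * FROM emp;");

      //Printing the column definitions of the newly created table
      System.out.println(result.getColumnDefinitions());

      //Printing any existing rows (none right after creation)
      for (Row row : result) {
         System.out.println(row);
      }

      cluster.close();
   }
}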
Cassandra – Update Data
Updating Data in a Table

UPDATE is the command used to update data in a table. The following keywords are used while updating data in a table −

Where − This clause is used to select the row to be updated.
Set − Set the value using this keyword.
Note − The WHERE clause must include all the columns composing the primary key.

While updating rows, if a given row is unavailable, then UPDATE creates a fresh row.

Given below is the syntax of the UPDATE command −

UPDATE <tablename>
SET <column name> = <new value>,
<column name> = <value> ...
WHERE <condition>

Example

Assume there is a table named emp. This table stores the details of the employees of a certain company, and it has the following details −

emp_id | emp_name | emp_city  | emp_phone  | emp_sal
-------+----------+-----------+------------+--------
1      | ram      | Hyderabad | 9848022338 | 50000
2      | robin    | Hyderabad | 9848022339 | 40000
3      | rahman   | Chennai   | 9848022330 | 45000

Let us now update the emp_city of robin to Delhi, and his salary to 50000. Given below is the query to perform the required updates.

cqlsh:tutorialspoint> UPDATE emp SET emp_city='Delhi', emp_sal=50000 WHERE emp_id=2;

Verification

Use the SELECT statement to verify whether the data has been updated or not. If you verify the emp table using the SELECT statement, it will produce the following output.

cqlsh:tutorialspoint> select * from emp;

 emp_id | emp_city  | emp_name | emp_phone  | emp_sal
--------+-----------+----------+------------+---------
      1 | Hyderabad |      ram | 9848022338 |   50000
      2 |     Delhi |    robin | 9848022339 |   50000
      3 |   Chennai |   rahman | 9848022330 |   45000

(3 rows)

Here you can observe that the table data has been updated.

Updating Data using Java API

You can update data in a table using the execute() method of the Session class. Follow the steps given below to update data in a table using the Java API.

Step 1: Create a Cluster Object

Create an instance of the Cluster.builder class of the com.datastax.driver.core package as shown below.

//Creating Cluster.Builder object
Cluster.Builder builder1 = Cluster.builder();

Add a contact point (IP address of the node) using the addContactPoint() method of the Cluster.Builder object. This method returns Cluster.Builder.

//Adding contact point to the Cluster.Builder object
Cluster.Builder builder2 = builder1.addContactPoint("127.0.0.1");

Using the new builder object, create a cluster object. To do so, you have a method called build() in the Cluster.Builder class. Use the following code to create the cluster object.

//Building a cluster
Cluster cluster = builder2.build();

You can also build the cluster object using a single line of code as shown below.

Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();

Step 2: Create a Session Object

Create an instance of the Session object using the connect() method of the Cluster class as shown below.

Session session = cluster.connect();

This method creates a new session and initializes it. If you already have a keyspace, then you can set it to the existing one by passing the keyspace name in string format to this method, as shown below.

Session session = cluster.connect("Your keyspace name");

Here we are using the keyspace named tp. Therefore, create the session object as shown below.

Session session = cluster.connect("tp");

Step 3: Execute Query

You can execute CQL queries using the execute() method of the Session class. Pass the query either in string format or as a Statement class object to the execute() method. Whatever you pass to this method in string format will be executed on cqlsh. In the following example, we are updating the emp table.
You have to store the query in a string variable and pass it to the execute() method as shown below −

String query = "UPDATE emp SET emp_city='Delhi', emp_sal=50000 WHERE emp_id=2;";

Given below is the complete program to update data in a table using the Java API.

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class Update_Data {

   public static void main(String args[]){

      //Query
      String query = "UPDATE emp SET emp_city='Delhi', emp_sal=50000 WHERE emp_id=2;";

      //Creating Cluster object
      Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();

      //Creating Session object
      Session session = cluster.connect("tp");

      //Executing the query
      session.execute(query);

      System.out.println("Data updated");
   }
}

Save the above program with the class name followed by .java, and browse to the location where it is saved. Compile and execute the program as shown below.

$javac Update_Data.java
$java Update_Data

Under normal conditions, it should produce the following output −

Data updated
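When the same update runs repeatedly with different values, you can also use a prepared statement so that the CQL is parsed only once and the driver handles value quoting for you. The following is a sketch under assumptions (not part of the original tutorial), written against the same emp table with the DataStax driver used elsewhere in this chapter; note that a varint column such as emp_sal maps to java.math.BigInteger in the driver, and the class name Update_Data_Prepared is hypothetical.

import java.math.BigInteger;

import com.datastax.driver.core.BoundStatement;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.Session;

// Hypothetical example class; updates a row using a prepared statement.
public class Update_Data_Prepared {
   public static void main(String args[]){
      Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
      Session session = cluster.connect("tp");

      //Preparing the statement once; the ? markers are bound per execution
      PreparedStatement prepared = session.prepare(
         "UPDATE emp SET emp_city = ?, emp_sal = ? WHERE emp_id = ?");

      //emp_sal is varint, so the value is bound as a BigInteger
      BoundStatement bound = prepared.bind("Delhi", BigInteger.valueOf(50000), 2);
      session.execute(bound);

      System.out.println("Data updated");
      cluster.close();
   }
}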
Cassandra – Alter Table
Altering a Table

You can alter a table using the command ALTER TABLE. Given below is the syntax for altering a table.

Syntax

ALTER (TABLE | COLUMNFAMILY) <tablename> <instruction>

Using the ALTER command, you can perform the following operations −

Add a column
Drop a column

Adding a Column

Using the ALTER command, you can add a column to a table. While adding columns, you have to take care that the column name does not conflict with the existing column names and that the table is not defined with the compact storage option. Given below is the syntax to add a column to a table.

ALTER TABLE table name
ADD new column datatype;

Example

Given below is an example to add a column to an existing table. Here we are adding a column called emp_email of text datatype to the table named emp.

cqlsh:tutorialspoint> ALTER TABLE emp
   … ADD emp_email text;

Verification

Use the SELECT statement to verify whether the column has been added or not. Here you can observe the newly added column emp_email.

cqlsh:tutorialspoint> select * from emp;

 emp_id | emp_city | emp_email | emp_name | emp_phone | emp_sal
--------+----------+-----------+----------+-----------+---------

Dropping a Column

Using the ALTER command, you can delete a column from a table. Before dropping a column from a table, check that the table is not defined with the compact storage option. Given below is the syntax to delete a column from a table using the ALTER command.

ALTER table name
DROP column name;

Example

Given below is an example to drop a column from a table. Here we are deleting the column named emp_email.

cqlsh:tutorialspoint> ALTER TABLE emp DROP emp_email;

Verification

Verify whether the column is deleted using the SELECT statement, as shown below.

cqlsh:tutorialspoint> select * from emp;

 emp_id | emp_city | emp_name | emp_phone | emp_sal
--------+----------+----------+-----------+---------

(0 rows)

Since the emp_email column has been deleted, you cannot find it anymore.

Altering a Table using Java API

You can alter a table using the execute() method of the Session class. Follow the steps given below to alter a table using the Java API.

Step 1: Create a Cluster Object

First of all, create an instance of the Cluster.builder class of the com.datastax.driver.core package as shown below.

//Creating Cluster.Builder object
Cluster.Builder builder1 = Cluster.builder();

Add a contact point (IP address of the node) using the addContactPoint() method of the Cluster.Builder object. This method returns Cluster.Builder.

//Adding contact point to the Cluster.Builder object
Cluster.Builder builder2 = builder1.addContactPoint("127.0.0.1");

Using the new builder object, create a cluster object. To do so, you have a method called build() in the Cluster.Builder class. The following code shows how to create a cluster object.

//Building a cluster
Cluster cluster = builder2.build();

You can also build a cluster object using a single line of code as shown below.

Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();

Step 2: Create a Session Object

Create an instance of the Session object using the connect() method of the Cluster class as shown below.

Session session = cluster.connect();

This method creates a new session and initializes it. If you already have a keyspace, you can set it to the existing one by passing the keyspace name in string format to this method, as shown below.

Session session = cluster.connect("Your keyspace name");

Here we are using the keyspace named tp. Therefore, create the session object as shown below.

Session session = cluster.connect("tp");
Step 3: Execute Query

You can execute CQL queries using the execute() method of the Session class. Pass the query either in string format or as a Statement class object to the execute() method. Whatever you pass to this method in string format will be executed on cqlsh. In the following example, we are adding a column to a table named emp. To do so, you have to store the query in a string variable and pass it to the execute() method as shown below.

//Query
String query = "ALTER TABLE emp ADD emp_email text";
session.execute(query);

Given below is the complete program to add a column to an existing table.

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class Add_Column {

   public static void main(String args[]){

      //Query
      String query = "ALTER TABLE emp ADD emp_email text";

      //Creating Cluster object
      Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();

      //Creating Session object
      Session session = cluster.connect("tp");

      //Executing the query
      session.execute(query);

      System.out.println("Column added");
   }
}

Save the above program with the class name followed by .java, and browse to the location where it is saved. Compile and execute the program as shown below.

$javac Add_Column.java
$java Add_Column

Under normal conditions, it should produce the following output −

Column added

Deleting a Column

Given below is the complete program to delete a column from an existing table.

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class Delete_Column {

   public static void main(String args[]){

      //Query
      String query = "ALTER TABLE emp DROP emp_email;";

      //Creating Cluster object
      Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();

      //Creating Session object
      Session session = cluster.connect("tp");

      //Executing the query
      session.execute(query);

      System.out.println("Column deleted");
   }
}

Save the above program with the class name followed by .java, and browse to the location where it is saved. Compile and execute the program as shown below.

$javac Delete_Column.java
$java Delete_Column

Under normal conditions, it should produce the following output −

Column deleted