DynamoDB – Global Secondary Indexes

Applications requiring multiple query types over different attributes can use one or more global secondary indexes to perform these detailed queries. For example, consider a system keeping track of users, their login status, and their time logged in. As the data in the previous example grows, queries against the table slow down. Global secondary indexes accelerate queries by organizing a selection of attributes from a table. They employ primary keys to sort data, and require neither key table attributes nor a key schema identical to the table's.

Every global secondary index must include a partition key, with a sort key optional. The index key schema can differ from the table's, and index key attributes can use any top-level string, number, or binary table attribute. A projection can include other table attributes; however, queries against the index do not retrieve from the parent table.

Attribute Projections

A projection is the set of attributes copied from the table to the secondary index. The table partition key and sort key are always projected. In queries, projections allow DynamoDB access to any attribute of the projection; the projected attributes essentially exist as their own table. When creating a secondary index, you must specify which attributes to project. DynamoDB offers three ways to perform this task −

KEYS_ONLY − Each index item consists of the table partition and sort key values, plus the index key values. This creates the smallest index.
INCLUDE − It includes the KEYS_ONLY attributes plus specified non-key attributes.
ALL − It includes all source table attributes, creating the largest possible index.

Note the tradeoffs in projecting attributes into a global secondary index, which relate to throughput and storage cost. Consider the following points −

If you only need access to a few attributes with low latency, project only those you need. This reduces storage and write costs.
If an application frequently accesses certain non-key attributes, project them, because the storage costs pale in comparison to scan consumption.
You can project large sets of frequently accessed attributes; however, this carries a high storage cost.
Use KEYS_ONLY for infrequent table queries and frequent writes/updates. This controls size, but still offers good performance on queries.

Global Secondary Index Queries and Scans

You can utilize queries to access one or more items in an index. You must specify the index and table name, the desired attributes, and any conditions, with the option to return results in ascending or descending order. You can also utilize scans to get all index data; a scan requires the table and index name, and you utilize a filter expression to retrieve specific data.

Table and Index Data Synchronization

DynamoDB automatically synchronizes indexes with their parent table. Each modifying operation on items causes asynchronous index updates; applications never write to indexes directly. You need to understand the impact of DynamoDB maintenance on indexes. On creation of an index, you specify key attributes and data types, which means that on a write, the data types of those attributes must match the index key schema's data types. On item creation or deletion, indexes update in an eventually consistent manner; however, updates to data propagate in a fraction of a second (unless a system failure of some type occurs). You must account for this delay in applications.

Throughput Considerations in Global Secondary Indexes

Multiple global secondary indexes impact throughput.
Index creation requires capacity unit specifications, which exist separately from the table's, resulting in operations on the index consuming index capacity units rather than table units. This can result in throttling if a query or write exceeds provisioned throughput. View throughput settings by using DescribeTable.

Read Capacity − Global secondary indexes deliver eventual consistency. In queries, DynamoDB performs provisioning calculations identical to those used for tables, with the lone difference of using index entry size rather than item size. The limit on what a query returns remains 1 MB, which includes attribute name sizes and values across every returned item.

Write Capacity

When write operations occur, the affected index consumes write units. Write throughput costs are the sum of write capacity units consumed in table writes and units consumed in index updates. A successful write operation requires sufficient capacity, or it results in throttling. Write costs also depend on certain factors, some of which are as follows −

New items defining indexed attributes, or item updates defining previously undefined indexed attributes, use a single write operation to add the item to the index.
Updates changing an indexed key attribute value use two writes: one to delete the old item and one to write the new one.
A table write triggering deletion of an indexed attribute uses a single write to erase the old item projection in the index.
Items absent from the index both before and after an update operation use no writes.
Updates changing only a projected attribute value, and not an indexed key attribute value, use one write to update the values of the projected attributes in the index.

All these factors assume an item size of 1 KB or less.

Global Secondary Index Storage

On an item write, DynamoDB automatically copies the right set of attributes to any indexes where the attributes must exist. This impacts your account by charging it for table item storage and attribute storage. The space used results from the sum of these quantities −

Byte size of the table primary key
Byte size of the index key attributes
Byte size of the projected attributes
100 bytes of overhead per index item

You can estimate storage needs by estimating the average index item size and multiplying by the quantity of table items carrying the global secondary index key attributes. DynamoDB does not write item data for a table item with an undefined attribute defined as an index partition or sort key.

Global Secondary Index CRUD

Create a table with global secondary indexes by using the CreateTable operation paired with the GlobalSecondaryIndexes parameter. You must specify an attribute to serve as the index partition key, and may specify another as the index sort key. All index key attributes must be string, number, or binary scalars. You must also provide provisioned throughput settings for the index. A short sketch of such a request follows.
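As a rough illustration (a sketch, not the tutorial's own sample), the following Java program using the AWS SDK v1 creates a hypothetical ProductList table with a global secondary index named CategoryIndex on a Category attribute, projecting keys only. The table, attribute, and index names, as well as the throughput figures, are placeholders.

import java.util.ArrayList;

import com.amazonaws.auth.profile.ProfileCredentialsProvider;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient;
import com.amazonaws.services.dynamodbv2.model.AttributeDefinition;
import com.amazonaws.services.dynamodbv2.model.CreateTableRequest;
import com.amazonaws.services.dynamodbv2.model.GlobalSecondaryIndex;
import com.amazonaws.services.dynamodbv2.model.KeySchemaElement;
import com.amazonaws.services.dynamodbv2.model.KeyType;
import com.amazonaws.services.dynamodbv2.model.Projection;
import com.amazonaws.services.dynamodbv2.model.ProjectionType;
import com.amazonaws.services.dynamodbv2.model.ProvisionedThroughput;

public class CreateTableWithGsiSample {
   public static void main(String[] args) {
      AmazonDynamoDBClient client = new AmazonDynamoDBClient(new ProfileCredentialsProvider());

      // Attribute definitions must cover the table key and the index key
      ArrayList<AttributeDefinition> attrDefs = new ArrayList<AttributeDefinition>();
      attrDefs.add(new AttributeDefinition().withAttributeName("ID").withAttributeType("N"));
      attrDefs.add(new AttributeDefinition().withAttributeName("Category").withAttributeType("S"));

      // The index carries its own key schema, projection, and throughput settings
      GlobalSecondaryIndex categoryIndex = new GlobalSecondaryIndex()
         .withIndexName("CategoryIndex")
         .withKeySchema(new KeySchemaElement()
            .withAttributeName("Category").withKeyType(KeyType.HASH))
         .withProjection(new Projection().withProjectionType(ProjectionType.KEYS_ONLY))
         .withProvisionedThroughput(new ProvisionedThroughput(1L, 1L));

      CreateTableRequest request = new CreateTableRequest()
         .withTableName("ProductList")
         .withKeySchema(new KeySchemaElement()
            .withAttributeName("ID").withKeyType(KeyType.HASH))
         .withAttributeDefinitions(attrDefs)
         .withProvisionedThroughput(new ProvisionedThroughput(5L, 5L))
         .withGlobalSecondaryIndexes(categoryIndex);

      client.createTable(request);
      System.out.println("CreateTable with GlobalSecondaryIndexes submitted");
   }
}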
DynamoDB – Conditions

In granting permissions, DynamoDB allows you to specify conditions through a detailed IAM policy with condition keys. This supports settings like access to specific items and attributes.

Note − DynamoDB does not support any tags.

Detailed Control

Several conditions allow specificity down to items and attributes, such as granting read-only access to specific items based on user account. Implement this level of control with conditioned IAM policies, which manage the security credentials; then simply apply the policy to the desired users, groups, and roles. Web Identity Federation, a topic discussed later, also provides a way to control user access through Amazon, Facebook, and Google logins.

The Condition element of an IAM policy implements access control. You simply add it to a policy. One example of its use is denying or permitting access to table items and attributes. The Condition element can also employ condition keys to limit permissions. You can review the following two examples of condition keys −

dynamodb:LeadingKeys − It prevents item access by users without an ID matching the partition key value.
dynamodb:Attributes − It prevents users from accessing or operating on attributes outside of those listed.

On evaluation, IAM policies result in a true or false value. If any part evaluates to false, the whole policy evaluates to false, which results in denial of access. Be sure to specify all required information in condition keys to ensure users have appropriate access.

Predefined Condition Keys

AWS offers a collection of predefined condition keys, which apply to all services. They support a broad range of uses and fine detail in examining users and access.

Note − Condition keys are case sensitive.

You can review a selection of the following service-specific keys −

dynamodb:LeadingKeys − It represents a table's first key attribute: the partition key. Use the ForAllValues modifier in conditions.
dynamodb:Select − It represents a query/scan request's Select parameter. It must be one of ALL_ATTRIBUTES, ALL_PROJECTED_ATTRIBUTES, SPECIFIC_ATTRIBUTES, or COUNT.
dynamodb:Attributes − It represents an attribute name list within a request, or attributes returned from a request. Its values and their functions resemble API action parameters, e.g., BatchGetItem uses AttributesToGet.
dynamodb:ReturnValues − It represents a request's ReturnValues parameter, and can use these values: ALL_OLD, UPDATED_OLD, ALL_NEW, UPDATED_NEW, and NONE.
dynamodb:ReturnConsumedCapacity − It represents a request's ReturnConsumedCapacity parameter, and can use these values: TOTAL and NONE.
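To make dynamodb:LeadingKeys concrete, the following policy sketch follows the fine-grained access control pattern in the AWS documentation. The region, account ID, and table name are placeholders; it allows reads only of items whose partition key value matches the caller's Login with Amazon user ID.

{
   "Version": "2012-10-17",
   "Statement": [{
      "Effect": "Allow",
      "Action": ["dynamodb:GetItem", "dynamodb:Query"],
      "Resource": "arn:aws:dynamodb:us-west-2:123456789012:table/GameScores",
      "Condition": {
         "ForAllValues:StringEquals": {
            "dynamodb:LeadingKeys": ["${www.amazon.com:user_id}"]
         }
      }
   }]
}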
DynamoDB – Data Pipeline

Data Pipeline allows exporting and importing data to/from a table, file, or S3 bucket. This of course proves useful in backups, testing, and similar needs or scenarios.

In an export, you use the Data Pipeline console, which makes a new pipeline and launches an Amazon EMR (Elastic MapReduce) cluster to perform the export. The EMR cluster reads data from DynamoDB and writes it to the target. We discuss EMR in detail later in this tutorial.

In an import operation, you use the Data Pipeline console, which makes a pipeline and launches EMR to perform the import. It reads data from the source and writes it to the destination.

Note − Export/import operations carry a cost given the services used, specifically EMR and S3.

Using Data Pipeline

You must specify action and resource permissions when using Data Pipeline. You can utilize an IAM role or policy to define them. Users performing imports/exports should note that they require an active access key ID and secret key.

IAM Roles for Data Pipeline

You need two IAM roles to use Data Pipeline −

DataPipelineDefaultRole − This holds all the actions you permit the pipeline to perform for you.
DataPipelineDefaultResourceRole − This holds the resources you permit the pipeline to provision for you.

If you are new to Data Pipeline, you must create each role. Users who have used Data Pipeline before already possess these roles.

Use the IAM console to create IAM roles for Data Pipeline, and perform the following four steps −

Step 1 − Log in to the IAM console located at https://console.aws.amazon.com/iam/
Step 2 − Select Roles from the dashboard.
Step 3 − Select Create New Role. Then enter DataPipelineDefaultRole in the Role Name field, and select Next Step. In the AWS Service Roles list in the Role Type panel, navigate to Data Pipeline, and choose Select. Select Create Role in the Review panel.
Step 4 − Select Create New Role again, and repeat the process to create DataPipelineDefaultResourceRole.
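For orientation, what distinguishes DataPipelineDefaultRole is its trust relationship, which lets the Data Pipeline service assume the role. A sketch of the trust policy the console sets up might look like the following; the exact service principals are an assumption based on AWS documentation, so verify them in the IAM console for your account.

{
   "Version": "2012-10-17",
   "Statement": [{
      "Effect": "Allow",
      "Principal": {
         "Service": ["datapipeline.amazonaws.com", "elasticmapreduce.amazonaws.com"]
      },
      "Action": "sts:AssumeRole"
   }]
}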
DynamoDB – MapReduce

Amazon's Elastic MapReduce (EMR) allows you to quickly and efficiently process big data. EMR runs Apache Hadoop on EC2 instances, but simplifies the process. You utilize Apache Hive to query MapReduce job flows through HiveQL, a query language resembling SQL. Apache Hive serves as a way to optimize queries and your applications.

You can use the EMR tab of the management console, the EMR CLI, an API, or an SDK to launch a job flow. You also have the option to run Hive interactively or utilize a script.

EMR read/write operations impact throughput consumption; however, in large requests, it performs retries with the protection of a backoff algorithm. Also, running EMR concurrently with other operations and tasks may result in throttling.

The DynamoDB/EMR integration does not support binary and binary set attributes.

DynamoDB/EMR Integration Prerequisites

Review this checklist of necessary items before using EMR −

An AWS account
A populated table under the same account employed in EMR operations
A custom Hive version with DynamoDB connectivity
DynamoDB connectivity support
An S3 bucket (optional)
An SSH client (optional)
An EC2 key pair (optional)

Hive Setup

Before using EMR, create a key pair to run Hive in interactive mode. The key pair allows connection to EC2 instances and master nodes of job flows. You can perform this by following the subsequent steps −

Log in to the management console, and open the EC2 console located at https://console.aws.amazon.com/ec2/
Select a region in the upper, right-hand portion of the console. Ensure the region matches the DynamoDB region.
In the Navigation pane, select Key Pairs.
Select Create Key Pair. In the Key Pair Name field, enter a name and select Create.
Download the resulting private key file, which uses the format filename.pem.

Note − You cannot connect to EC2 instances without the key pair.

Hive Cluster

Create a Hive-enabled cluster to run Hive. It builds the required environment of applications and infrastructure for a Hive-to-DynamoDB connection. You can perform this task by using the following steps −

Access the EMR console, and select Create Cluster.
In the creation screen, set the cluster configuration: a descriptive name for the cluster, Yes for termination protection, Enabled for logging, an S3 destination for the log folder S3 location, and Enabled for debugging.
In the Software Configuration screen, ensure the fields hold Amazon for Hadoop distribution, the latest version for AMI version, a default Hive version for Applications to be Installed − Hive, and a default Pig version for Applications to be Installed − Pig.
In the Hardware Configuration screen, ensure the fields hold Launch into EC2-Classic for Network, No Preference for EC2 Availability Zone, the default for Master − Amazon EC2 Instance Type (no check for Request Spot Instances), the default for Core − Amazon EC2 Instance Type with 2 for Count (no check for Request Spot Instances), and the default for Task − Amazon EC2 Instance Type with 0 for Count (no check for Request Spot Instances). Be sure to set a limit providing sufficient capacity to prevent cluster failure.
In the Security and Access screen, ensure the fields hold your key pair in EC2 key pair, No other IAM users in IAM user access, and Proceed without roles in IAM role.
Review the Bootstrap Actions screen, but do not modify it.
Review settings, and select Create Cluster when finished.

A Summary pane appears on the start of the cluster.
Activate SSH Session

You need an active SSH session to connect to the master node and execute CLI operations. Locate the master node by selecting the cluster in the EMR console; it lists the master node as Master Public DNS Name.

Install PuTTY if you do not have it. Then launch PuTTYgen and select Load. Choose your PEM file, and open it. PuTTYgen informs you of a successful import. Select Save private key to save the key in PuTTY private key format (PPK), and choose Yes to save without a passphrase. Then enter a name for the PuTTY key, hit Save, and close PuTTYgen.

Use PuTTY to make a connection with the master node by first starting PuTTY. Choose Session from the Category list. Enter hadoop@DNS (substituting the Master Public DNS Name) within the Host Name field. Expand Connection > SSH in the Category list, and choose Auth. In the controlling options screen, select Browse for Private key file for authentication. Then select your private key file and open it. Select Yes for the security alert pop-up.

When connected to the master node, a Hadoop command prompt appears, which means you can begin an interactive Hive session.

Hive Table

Hive serves as a data warehouse tool allowing queries on EMR clusters using HiveQL. The previous setups give you a working prompt. Run Hive commands interactively by simply entering "hive", and then any commands you wish. See our Hive tutorial for more information on Hive.
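As a minimal sketch of such a session, the following HiveQL maps a hypothetical DynamoDB table into Hive through the EMR DynamoDB storage handler and queries it. The table name and column mapping are assumptions for illustration, not part of the tutorial's sample data.

-- Map a hypothetical DynamoDB table into Hive via the EMR storage handler
CREATE EXTERNAL TABLE hive_productlist (id bigint, nomenclature string, price bigint)
STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
TBLPROPERTIES (
   "dynamodb.table.name" = "ProductList",
   "dynamodb.column.mapping" = "id:ID,nomenclature:Nomenclature,price:Price"
);

-- Query DynamoDB data through Hive
SELECT * FROM hive_productlist WHERE price < 100;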
DynamoDB – Table Activity

DynamoDB Streams enable you to track and respond to table item changes. Employ this functionality to create an application which responds to changes by updating information across sources, to synchronize data for thousands of users of a large, multi-user system, or to send notifications to users on updates. Its applications prove diverse and substantial. DynamoDB Streams serve as the main tool used to achieve this functionality.

Streams capture time-ordered sequences of item modifications within a table. They hold this data for a maximum of 24 hours. Applications use them to view the original and modified items, almost in real time.

Streams enabled on a table capture all modifications. On any CRUD operation, DynamoDB creates a stream record with the primary key attributes of the modified items. You can configure streams to include additional information such as before and after images. Streams carry two guarantees −

Each record appears exactly once in the stream, and
Each item modification results in stream records in the same order as the modifications.

All streams process in real time to allow you to employ them for related functionality in applications.

Managing Streams

On table creation, you can enable a stream. Existing tables allow stream disabling or settings changes. Streams operate asynchronously, which means no table performance impact.

Utilize the AWS Management Console for simple stream management. First, navigate to the console, and choose Tables. In the Overview tab, choose Manage Stream. Inside the window, select the information added to a stream on table data modifications. After entering all settings, select Enable. If you want to disable any existing streams, select Manage Stream, and then Disable.

You can also utilize the APIs CreateTable and UpdateTable to enable or change a stream. Use the parameter StreamSpecification to configure the stream. StreamEnabled specifies status, meaning true for enabled and false for disabled. StreamViewType specifies the information added to the stream: KEYS_ONLY, NEW_IMAGE, OLD_IMAGE, and NEW_AND_OLD_IMAGES.

Stream Reading

Read and process streams by connecting to an endpoint and making API requests. Each stream consists of stream records, and every record represents a single modification owned by the stream. Stream records include a sequence number revealing publishing order. Records belong to groups known as shards. Shards function as containers for several records, and also hold information needed for accessing and traversing records. After 24 hours, records automatically delete.

Shards are generated and deleted as needed, and do not last long. They also divide into multiple new shards automatically, typically in response to write activity spikes. On stream disabling, open shards close. The hierarchical relationship between shards means applications must prioritize the parent shards for correct processing order. You can use the Kinesis Adapter to do this automatically.

Note − Operations resulting in no change do not write stream records.

Accessing and processing records requires performing the following tasks −

Determine the ARN of the target stream.
Determine the shard(s) of the stream holding the target records.
Access the shard(s) to retrieve the desired records.

Note − There should be a maximum of 2 processes reading a shard at once. Exceeding 2 processes can throttle the source.
The stream API actions available include −

ListStreams
DescribeStream
GetShardIterator
GetRecords

You can review the following example of stream reading −

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import com.amazonaws.auth.profile.ProfileCredentialsProvider;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBStreamsClient;
import com.amazonaws.services.dynamodbv2.model.AttributeAction;
import com.amazonaws.services.dynamodbv2.model.AttributeDefinition;
import com.amazonaws.services.dynamodbv2.model.AttributeValue;
import com.amazonaws.services.dynamodbv2.model.AttributeValueUpdate;
import com.amazonaws.services.dynamodbv2.model.CreateTableRequest;
import com.amazonaws.services.dynamodbv2.model.DescribeStreamRequest;
import com.amazonaws.services.dynamodbv2.model.DescribeStreamResult;
import com.amazonaws.services.dynamodbv2.model.DescribeTableResult;
import com.amazonaws.services.dynamodbv2.model.GetRecordsRequest;
import com.amazonaws.services.dynamodbv2.model.GetRecordsResult;
import com.amazonaws.services.dynamodbv2.model.GetShardIteratorRequest;
import com.amazonaws.services.dynamodbv2.model.GetShardIteratorResult;
import com.amazonaws.services.dynamodbv2.model.KeySchemaElement;
import com.amazonaws.services.dynamodbv2.model.KeyType;
import com.amazonaws.services.dynamodbv2.model.ProvisionedThroughput;
import com.amazonaws.services.dynamodbv2.model.Record;
import com.amazonaws.services.dynamodbv2.model.Shard;
import com.amazonaws.services.dynamodbv2.model.ShardIteratorType;
import com.amazonaws.services.dynamodbv2.model.StreamSpecification;
import com.amazonaws.services.dynamodbv2.model.StreamViewType;
import com.amazonaws.services.dynamodbv2.util.Tables;

public class StreamsExample {
   private static AmazonDynamoDBClient dynamoDBClient =
      new AmazonDynamoDBClient(new ProfileCredentialsProvider());
   private static AmazonDynamoDBStreamsClient streamsClient =
      new AmazonDynamoDBStreamsClient(new ProfileCredentialsProvider());

   public static void main(String args[]) {
      dynamoDBClient.setEndpoint("InsertDbEndpointHere");
      streamsClient.setEndpoint("InsertStreamEndpointHere");

      // Table creation
      String tableName = "MyTestingTable";
      ArrayList<AttributeDefinition> attributeDefinitions = new ArrayList<AttributeDefinition>();
      attributeDefinitions.add(new AttributeDefinition()
         .withAttributeName("ID")
         .withAttributeType("N"));
      ArrayList<KeySchemaElement> keySchema = new ArrayList<KeySchemaElement>();
      keySchema.add(new KeySchemaElement()
         .withAttributeName("ID")
         .withKeyType(KeyType.HASH));   // Partition key

      StreamSpecification streamSpecification = new StreamSpecification();
      streamSpecification.setStreamEnabled(true);
      streamSpecification.setStreamViewType(StreamViewType.NEW_AND_OLD_IMAGES);

      CreateTableRequest createTableRequest = new CreateTableRequest()
         .withTableName(tableName)
         .withKeySchema(keySchema)
         .withAttributeDefinitions(attributeDefinitions)
         .withProvisionedThroughput(new ProvisionedThroughput()
            .withReadCapacityUnits(1L)
            .withWriteCapacityUnits(1L))
         .withStreamSpecification(streamSpecification);

      System.out.println("Executing CreateTable for " + tableName);
      dynamoDBClient.createTable(createTableRequest);
      System.out.println("Creating " + tableName);

      try {
         Tables.awaitTableToBecomeActive(dynamoDBClient, tableName);
      } catch (InterruptedException e) {
         e.printStackTrace();
      }

      // Get the table's stream settings
      DescribeTableResult describeTableResult = dynamoDBClient.describeTable(tableName);
      String myStreamArn = describeTableResult.getTable().getLatestStreamArn();
      StreamSpecification myStreamSpec = describeTableResult.getTable().getStreamSpecification();
      System.out.println("Current stream ARN for " + tableName + ": " + myStreamArn);
      System.out.println("Stream enabled: " + myStreamSpec.getStreamEnabled());
      System.out.println("Update view type: " + myStreamSpec.getStreamViewType());

      // Add an item
      int numChanges = 0;
      System.out.println("Making some changes to table data");
      Map<String, AttributeValue> item = new HashMap<String, AttributeValue>();
      item.put("ID", new AttributeValue().withN("222"));
      item.put("Alert", new AttributeValue().withS("item!"));
      dynamoDBClient.putItem(tableName, item);
      numChanges++;

      // Update the item
      Map<String, AttributeValue> key = new HashMap<String, AttributeValue>();
      key.put("ID", new AttributeValue().withN("222"));
      Map<String, AttributeValueUpdate> attributeUpdates =
         new HashMap<String, AttributeValueUpdate>();
      attributeUpdates.put("Alert", new AttributeValueUpdate()
         .withAction(AttributeAction.PUT)
         .withValue(new AttributeValue().withS("modified item")));
      dynamoDBClient.updateItem(tableName, key, attributeUpdates);
      numChanges++;

      // Delete the item
      dynamoDBClient.deleteItem(tableName, key);
      numChanges++;

      // Get stream shards
      DescribeStreamResult describeStreamResult = streamsClient.describeStream(
         new DescribeStreamRequest().withStreamArn(myStreamArn));
      String streamArn = describeStreamResult.getStreamDescription().getStreamArn();
      List<Shard> shards = describeStreamResult.getStreamDescription().getShards();

      // Process shards
      for (Shard shard : shards) {
         String shardId = shard.getShardId();
         System.out.println("Processing " + shardId + " in " + streamArn);

         // Get shard iterator
         GetShardIteratorRequest getShardIteratorRequest = new GetShardIteratorRequest()
            .withStreamArn(myStreamArn)
            .withShardId(shardId)
            .withShardIteratorType(ShardIteratorType.TRIM_HORIZON);
         GetShardIteratorResult getShardIteratorResult =
            streamsClient.getShardIterator(getShardIteratorRequest);
         String nextItr = getShardIteratorResult.getShardIterator();

         while (nextItr != null && numChanges > 0) {
            // Read data records with the iterator
            GetRecordsResult getRecordsResult = streamsClient.getRecords(
               new GetRecordsRequest().withShardIterator(nextItr));
            List<Record> records = getRecordsResult.getRecords();
            System.out.println("Pulling records...");

            for (Record record : records) {
               System.out.println(record);
               numChanges--;
            }
            nextItr = getRecordsResult.getNextShardIterator();
         }
      }
   }
}
DynamoDB – Web Identity Federation

Web Identity Federation allows you to simplify authentication and authorization for large user groups. You can skip the creation of individual accounts and instead require users to log in to an identity provider to get temporary credentials or tokens. It uses the AWS Security Token Service (STS) to manage credentials. Applications use these tokens to interact with services.

Web Identity Federation supports identity providers such as Amazon, Google, and Facebook.

Function − In use, Web Identity Federation first calls an identity provider for user and app authentication, and the provider returns a token. This results in the app calling AWS STS and passing the token as input. STS authorizes the app and grants it temporary access credentials, which allow the app to use an IAM role and access resources based on policy.

Implementing Web Identity Federation

You must perform the following three steps prior to use −

Use a supported third-party identity provider to register as a developer.
Register your application with the provider to obtain an app ID.
Create one or more IAM roles, including policy attachment. You must use one role per provider, per app.

Assume one of your IAM roles to use Web Identity Federation. Your app must then perform a three-step process −

Authentication
Credential acquisition
Resource access

In the first step, your app uses its own interface to call the provider and then manages the token process. Step two manages tokens and requires your app to send an AssumeRoleWithWebIdentity request to AWS STS. The request holds the first token, the provider app ID, and the ARN of the IAM role. STS then provides credentials set to expire after a certain period. In the final step, your app receives a response from STS containing access information for DynamoDB resources. It consists of access credentials, the expiration time, the role, and the role ID.
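A minimal sketch of the credential-acquisition step in Java (AWS SDK v1) might look like the following. The role ARN is a placeholder, providerToken stands for the token returned by the identity provider in step one, and the ProviderId value applies to Login with Amazon/Facebook tokens (it is omitted for OpenID Connect providers such as Google).

import com.amazonaws.auth.AnonymousAWSCredentials;
import com.amazonaws.services.securitytoken.AWSSecurityTokenServiceClient;
import com.amazonaws.services.securitytoken.model.AssumeRoleWithWebIdentityRequest;
import com.amazonaws.services.securitytoken.model.AssumeRoleWithWebIdentityResult;
import com.amazonaws.services.securitytoken.model.Credentials;

public class WebIdentitySample {
   public static void main(String[] args) {
      String providerToken = args[0];   // token returned by the identity provider

      // AssumeRoleWithWebIdentity does not require AWS credentials of its own
      AWSSecurityTokenServiceClient stsClient =
         new AWSSecurityTokenServiceClient(new AnonymousAWSCredentials());

      AssumeRoleWithWebIdentityRequest request = new AssumeRoleWithWebIdentityRequest()
         .withWebIdentityToken(providerToken)
         .withProviderId("www.amazon.com")                                // for Amazon/Facebook tokens
         .withRoleArn("arn:aws:iam::123456789012:role/WebIdentityRole")   // placeholder role ARN
         .withRoleSessionName("web-user");

      AssumeRoleWithWebIdentityResult result = stsClient.assumeRoleWithWebIdentity(request);
      Credentials credentials = result.getCredentials();   // temporary key, secret, session token
      System.out.println("Credentials expire at: " + credentials.getExpiration());
   }
}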
DynamoDB – Indexes

DynamoDB uses indexes on primary key attributes to improve access. They accelerate application accesses and data retrieval, and support better performance by reducing application lag.

Secondary Index

A secondary index holds an attribute subset and an alternate key. You use it through either a query or scan operation, which targets the index. Its contents include attributes you project, or copy, from the table. In creation, you define an alternate key for the index and any attributes you wish to project into it. DynamoDB then copies the attributes into the index, including the primary key attributes sourced from the table. After performing these tasks, you simply use a query/scan as if performing it on a table.

DynamoDB automatically maintains all secondary indexes. On item operations, such as adding or deleting, it updates any indexes on the target table.

DynamoDB offers two types of secondary indexes −

Global Secondary Index − This index includes a partition key and sort key, which may differ from those of the source table. It uses the label "global" due to the capability of queries/scans on the index to span all table data, across all partitions.

Local Secondary Index − This index shares a partition key with the table, but uses a different sort key. Its "local" nature results from all of its partitions scoping to a table partition with an identical partition key value.

The best type of index to use depends on application needs. Consider the differences between the two −

Key Schema − A global secondary index uses a simple or composite primary key. A local secondary index always uses a composite primary key.

Key Attributes − A global index's partition key and sort key can be any string, number, or binary table attributes. A local index's partition key is the attribute shared with the table partition key; its sort key can be any string, number, or binary table attribute.

Size Limits Per Partition Key Value − Global indexes carry no size limitations. Local indexes impose a 10 GB maximum limit on the total size of indexed items associated with a partition key value.

Online Index Operations − You can spawn global indexes at table creation, add them to existing tables, or delete existing ones. You must create local indexes at table creation; you cannot delete them or add them to existing tables.

Queries − Global indexes allow queries covering the entire table and every partition. Local indexes address single partitions through the partition key value provided in the query.

Consistency − Queries of global indexes only offer the eventually consistent option. Queries of local indexes offer the options of eventually consistent or strongly consistent reads.

Throughput Cost − Global indexes include their own throughput settings for reads and writes; queries/scans consume capacity from the index, not the table, which also applies to table write updates. Queries/scans of local indexes consume table read capacity; table writes update local indexes and consume table capacity units.

Projection − Queries/scans on global indexes can only request attributes projected into the index, with no retrieval of table attributes. Queries/scans on local indexes can request attributes not projected; DynamoDB fetches them automatically.

When creating multiple tables with secondary indexes, do it sequentially, meaning make a table and wait for it to reach ACTIVE state before creating another and again waiting. DynamoDB does not permit concurrent creation.

Each secondary index requires certain specifications −

Type − Specify local or global.
Name − It uses naming rules identical to tables.
Key Schema − Only top-level string, number, or binary types are permitted, with the index type determining other requirements.
Attributes for Projection − DynamoDB automatically projects them, and allows any data type.
Throughput − Specify read/write capacity for global secondary indexes.

The limit on indexes remains 5 global and 5 local per table. You can access detailed information about indexes with DescribeTable. It returns the name, size, and item count.

Note − These values update approximately every 6 hours.

In queries or scans used to access index data, provide the table and index names, the desired attributes for the result, and any conditional statements. DynamoDB offers the option to return results in either ascending or descending order. A short query sketch follows this section.

Note − The deletion of a table also deletes all of its indexes.
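As a sketch of such an index query with the document API (assuming a hypothetical global secondary index named CategoryIndex on the ProductList table), the query targets the index rather than the base table −

import com.amazonaws.auth.profile.ProfileCredentialsProvider;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient;
import com.amazonaws.services.dynamodbv2.document.DynamoDB;
import com.amazonaws.services.dynamodbv2.document.Index;
import com.amazonaws.services.dynamodbv2.document.Item;
import com.amazonaws.services.dynamodbv2.document.ItemCollection;
import com.amazonaws.services.dynamodbv2.document.QueryOutcome;
import com.amazonaws.services.dynamodbv2.document.Table;
import com.amazonaws.services.dynamodbv2.document.spec.QuerySpec;
import com.amazonaws.services.dynamodbv2.document.utils.ValueMap;

public class QueryIndexSample {
   public static void main(String[] args) {
      DynamoDB dynamoDB = new DynamoDB(
         new AmazonDynamoDBClient(new ProfileCredentialsProvider()));
      Table table = dynamoDB.getTable("ProductList");
      Index index = table.getIndex("CategoryIndex");   // hypothetical index name

      QuerySpec spec = new QuerySpec()
         .withKeyConditionExpression("Category = :v_cat")
         .withValueMap(new ValueMap().withString(":v_cat", "Laser Cutter"));

      // The query runs against the index, not the base table
      ItemCollection<QueryOutcome> items = index.query(spec);
      for (Item item : items) {
         System.out.println(item.toJSONPretty());
      }
   }
}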
DynamoDB – Scan

Scan operations read all table items or secondary indices. The default function returns all data attributes of all items within an index or table. Employ the ProjectionExpression parameter to filter attributes. Every scan returns a result set, even on finding no matches, which results in an empty set. Scans retrieve no more than 1 MB, with the option to filter data.

Note − The parameters and filtering of scans also apply to querying.

Types of Scan Operations

Filtering − Scan operations offer fine filtering through filter expressions, which modify data after scans or queries, before returning results. The expressions use comparison operators. Their syntax resembles condition expressions, with the exception of key attributes, which filter expressions do not permit: you cannot use a partition or sort key in a filter expression.

Note − The 1 MB limit applies prior to any application of filtering.

Throughput Specifications − Scans consume throughput; however, consumption focuses on item size rather than returned data. Consumption remains the same whether you request every attribute or only a few, and using or not using a filter expression also does not impact consumption.

Pagination − DynamoDB paginates results, dividing them into specific pages. The 1 MB limit applies to returned results; when you exceed it, another scan becomes necessary to gather the rest of the data. The LastEvaluatedKey value allows you to perform this subsequent scan: simply apply the value to ExclusiveStartKey. When the LastEvaluatedKey value becomes null, the operation has completed all pages of data. However, a non-null value does not automatically mean more data remains; only a null value indicates status.

The Limit Parameter − The limit parameter manages the result size. DynamoDB uses it to establish the number of items to process before returning data, and it does not work outside of that scope. If you set a value of x, DynamoDB returns the first x matching items. The LastEvaluatedKey value also applies in cases where limit parameters yield partial results; use it to complete scans.

Result Count − Responses to queries and scans include ScannedCount and Count, which quantify the items scanned/queried and the items returned. If you do not filter, their values are identical. When you exceed 1 MB, the counts represent only the portion processed.

Consistency − Query results and scan results are eventually consistent reads; however, you can request strongly consistent reads as well. Use the ConsistentRead parameter to change this setting.

Note − Strongly consistent read settings impact consumption by using double the capacity units.

Performance − Queries offer better performance than scans, because scans crawl the full table or secondary index, resulting in a sluggish response and heavy throughput consumption. Scans work best for small tables and searches with fewer filters; however, you can design lean scans by obeying a few best practices, such as avoiding sudden, accelerated read activity and exploiting parallel scans. A query finds a certain range of keys satisfying a given condition, with performance dictated by the amount of data it retrieves rather than the volume of keys. The parameters of the operation and the number of matches specifically impact performance.

Parallel Scan

Scan operations perform processing sequentially by default.
They return data in 1 MB portions, which prompts the application to fetch the next portion. This results in long scans for large tables and indices. This characteristic also means scans may not always fully exploit the available throughput, since DynamoDB distributes table data across multiple partitions and a sequential scan's throughput remains limited to a single partition at a time.

A solution for this problem comes from logically dividing tables or indices into segments, with "workers" scanning segments in parallel (concurrently). Parallel scans use the parameters Segment and TotalSegments to specify the segment scanned by a certain worker and the total quantity of segments processed. A sketch appears after this chapter's example.

Worker Number

You must experiment with worker values (the Segment parameter) to achieve the best application performance.

Note − Parallel scans with large sets of workers impact throughput by possibly consuming all of it. Manage this issue with the Limit parameter, which you can use to stop a single worker from consuming all throughput.

The following is a deep scan example.

Note − The following program may assume a previously created data source. Before attempting to execute, acquire supporting libraries and create necessary data sources (tables with required characteristics, or other referenced sources). This example also uses Eclipse IDE, an AWS credentials file, and the AWS Toolkit within an Eclipse AWS Java Project.

import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

import com.amazonaws.auth.profile.ProfileCredentialsProvider;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient;
import com.amazonaws.services.dynamodbv2.document.DynamoDB;
import com.amazonaws.services.dynamodbv2.document.Item;
import com.amazonaws.services.dynamodbv2.document.ItemCollection;
import com.amazonaws.services.dynamodbv2.document.ScanOutcome;
import com.amazonaws.services.dynamodbv2.document.Table;

public class ScanOpSample {
   static DynamoDB dynamoDB = new DynamoDB(
      new AmazonDynamoDBClient(new ProfileCredentialsProvider()));
   static String tableName = "ProductList";

   public static void main(String[] args) throws Exception {
      findProductsUnderOneHun();   // finds products under 100 dollars
   }

   private static void findProductsUnderOneHun() {
      Table table = dynamoDB.getTable(tableName);
      Map<String, Object> expressionAttributeValues = new HashMap<String, Object>();
      expressionAttributeValues.put(":pr", 100);

      ItemCollection<ScanOutcome> items = table.scan(
         "Price < :pr",                                 // FilterExpression
         "ID, Nomenclature, ProductCategory, Price",    // ProjectionExpression
         null,                                          // No ExpressionAttributeNames
         expressionAttributeValues);

      System.out.println("Scanned " + tableName + " to find items under $100.");
      Iterator<Item> iterator = items.iterator();
      while (iterator.hasNext()) {
         System.out.println(iterator.next().toJSONPretty());
      }
   }
}
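As a rough sketch of the parallel scan described above (the worker count, table name, and Limit value are assumptions for illustration), each worker passes its own Segment value alongside TotalSegments −

import java.util.Map;

import com.amazonaws.auth.profile.ProfileCredentialsProvider;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient;
import com.amazonaws.services.dynamodbv2.model.AttributeValue;
import com.amazonaws.services.dynamodbv2.model.ScanRequest;
import com.amazonaws.services.dynamodbv2.model.ScanResult;

public class ParallelScanSample {
   static AmazonDynamoDBClient client =
      new AmazonDynamoDBClient(new ProfileCredentialsProvider());

   public static void main(String[] args) {
      final int totalSegments = 4;   // hypothetical worker count
      for (int i = 0; i < totalSegments; i++) {
         final int segment = i;
         new Thread(new Runnable() {
            public void run() {
               Map<String, AttributeValue> lastKey = null;
               do {
                  // Each worker scans only its own segment, page by page
                  ScanRequest scanRequest = new ScanRequest()
                     .withTableName("ProductList")
                     .withSegment(segment)
                     .withTotalSegments(totalSegments)
                     .withLimit(50)                    // caps per-call consumption
                     .withExclusiveStartKey(lastKey);
                  ScanResult result = client.scan(scanRequest);
                  System.out.println("Segment " + segment + " returned "
                     + result.getCount() + " items");
                  lastKey = result.getLastEvaluatedKey();
               } while (lastKey != null);
            }
         }).start();
      }
   }
}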
DynamoDB – Permissions API

The DynamoDB API offers a large set of actions, which require permissions. In setting permissions, you must establish the actions permitted, the resources permitted, and the conditions of each.

You can specify actions within the Action field of the policy, and the resource value within the Resource field. Do ensure that you use the correct syntax, containing the dynamodb: prefix paired with the API operation, for example − dynamodb:CreateTable. You can also employ condition keys to filter permissions.

Permissions and API Actions

The following list pairs each API action with its necessary permission −

BatchGetItem − dynamodb:BatchGetItem
BatchWriteItem − dynamodb:BatchWriteItem
CreateTable − dynamodb:CreateTable
DeleteItem − dynamodb:DeleteItem
DeleteTable − dynamodb:DeleteTable
DescribeLimits − dynamodb:DescribeLimits
DescribeReservedCapacity − dynamodb:DescribeReservedCapacity
DescribeReservedCapacityOfferings − dynamodb:DescribeReservedCapacityOfferings
DescribeStream − dynamodb:DescribeStream
DescribeTable − dynamodb:DescribeTable
GetItem − dynamodb:GetItem
GetRecords − dynamodb:GetRecords
GetShardIterator − dynamodb:GetShardIterator
ListStreams − dynamodb:ListStreams
ListTables − dynamodb:ListTables
PurchaseReservedCapacityOfferings − dynamodb:PurchaseReservedCapacityOfferings
PutItem − dynamodb:PutItem
Query − dynamodb:Query
Scan − dynamodb:Scan
UpdateItem − dynamodb:UpdateItem
UpdateTable − dynamodb:UpdateTable

Resources

The following list pairs each permitted API action with its associated resource −

BatchGetItem − arn:aws:dynamodb:region:account-id:table/table-name
BatchWriteItem − arn:aws:dynamodb:region:account-id:table/table-name
CreateTable − arn:aws:dynamodb:region:account-id:table/table-name
DeleteItem − arn:aws:dynamodb:region:account-id:table/table-name
DeleteTable − arn:aws:dynamodb:region:account-id:table/table-name
DescribeLimits − arn:aws:dynamodb:region:account-id:*
DescribeReservedCapacity − arn:aws:dynamodb:region:account-id:*
DescribeReservedCapacityOfferings − arn:aws:dynamodb:region:account-id:*
DescribeStream − arn:aws:dynamodb:region:account-id:table/table-name/stream/stream-label
DescribeTable − arn:aws:dynamodb:region:account-id:table/table-name
GetItem − arn:aws:dynamodb:region:account-id:table/table-name
GetRecords − arn:aws:dynamodb:region:account-id:table/table-name/stream/stream-label
GetShardIterator − arn:aws:dynamodb:region:account-id:table/table-name/stream/stream-label
ListStreams − arn:aws:dynamodb:region:account-id:table/table-name/stream/*
ListTables − *
PurchaseReservedCapacityOfferings − arn:aws:dynamodb:region:account-id:*
PutItem − arn:aws:dynamodb:region:account-id:table/table-name
Query − arn:aws:dynamodb:region:account-id:table/table-name, or arn:aws:dynamodb:region:account-id:table/table-name/index/index-name
Scan − arn:aws:dynamodb:region:account-id:table/table-name, or arn:aws:dynamodb:region:account-id:table/table-name/index/index-name
UpdateItem − arn:aws:dynamodb:region:account-id:table/table-name
UpdateTable − arn:aws:dynamodb:region:account-id:table/table-name
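Putting actions and resources together, a sketch of a policy permitting queries and scans against a table and its indexes might read as follows; the region, account ID, and table name are placeholders.

{
   "Version": "2012-10-17",
   "Statement": [{
      "Effect": "Allow",
      "Action": ["dynamodb:Query", "dynamodb:Scan"],
      "Resource": [
         "arn:aws:dynamodb:us-west-2:123456789012:table/ProductList",
         "arn:aws:dynamodb:us-west-2:123456789012:table/ProductList/index/*"
      ]
   }]
}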
DynamoDB – Update Items

Updating an item in DynamoDB mainly consists of specifying the full primary key and table name for the item. It requires a new value for each attribute you modify. The operation uses UpdateItem, which modifies existing items or creates them on discovery of a missing item.

In updates, you might want to track the changes by displaying the original and new values, before and after the operation. UpdateItem uses the ReturnValues parameter to achieve this.

Note − The operation does not report capacity unit consumption, but you can use the ReturnConsumedCapacity parameter.

Use the GUI console, Java, or any other tool to perform this task.

How to Update Items Using GUI Tools?

Navigate to the console. In the navigation pane on the left side, select Tables. Choose the table needed, and then select the Items tab. Choose the item desired for an update, and select Actions | Edit. Modify any attributes or values necessary in the Edit Item window.

Update Items Using Java

Using Java in item update operations requires creating a Table class instance and calling its updateItem method. Then you specify the item's primary key, and provide an UpdateExpression detailing attribute modifications. The following is an example of the same −

DynamoDB dynamoDB = new DynamoDB(new AmazonDynamoDBClient(
   new ProfileCredentialsProvider()));
Table table = dynamoDB.getTable("ProductList");

Map<String, String> expressionAttributeNames = new HashMap<String, String>();
expressionAttributeNames.put("#M", "Make");
expressionAttributeNames.put("#P", "Price");
expressionAttributeNames.put("#N", "ID");

Map<String, Object> expressionAttributeValues = new HashMap<String, Object>();
expressionAttributeValues.put(":val1",
   new HashSet<String>(Arrays.asList("Make1", "Make2")));
expressionAttributeValues.put(":val2", 1);   // Price

UpdateItemOutcome outcome = table.updateItem(
   "internalID",                                   // key attribute name
   111,                                            // key attribute value
   "add #M :val1 set #P = #P - :val2 remove #N",   // UpdateExpression
   expressionAttributeNames,
   expressionAttributeValues);

The updateItem method also allows for specifying conditions, which can be seen in the following example −

Table table = dynamoDB.getTable("ProductList");
Map<String, String> expressionAttributeNames = new HashMap<String, String>();
expressionAttributeNames.put("#P", "Price");

Map<String, Object> expressionAttributeValues = new HashMap<String, Object>();
expressionAttributeValues.put(":val1", 44);   // change Price to 44
expressionAttributeValues.put(":val2", 15);   // only if currently 15

UpdateItemOutcome outcome = table.updateItem(
   new PrimaryKey("internalID", 111),
   "set #P = :val1",   // UpdateExpression
   "#P = :val2",       // ConditionExpression
   expressionAttributeNames,
   expressionAttributeValues);

Update Items Using Counters

DynamoDB allows atomic counters, which means using UpdateItem to increment/decrement attribute values without impacting other requests; furthermore, the counters always update. The following is an example that explains how it can be done.

Note − The following sample may assume a previously created data source. Before attempting to execute, acquire supporting libraries and create necessary data sources (tables with required characteristics, or other referenced sources). This sample also uses Eclipse IDE, an AWS credentials file, and the AWS Toolkit within an Eclipse AWS Java Project.
package com.amazonaws.codesamples.document;

import java.io.IOException;
import java.util.Arrays;
import java.util.HashSet;

import com.amazonaws.auth.profile.ProfileCredentialsProvider;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient;
import com.amazonaws.services.dynamodbv2.document.DynamoDB;
import com.amazonaws.services.dynamodbv2.document.Item;
import com.amazonaws.services.dynamodbv2.document.Table;
import com.amazonaws.services.dynamodbv2.document.UpdateItemOutcome;
import com.amazonaws.services.dynamodbv2.document.spec.UpdateItemSpec;
import com.amazonaws.services.dynamodbv2.document.utils.NameMap;
import com.amazonaws.services.dynamodbv2.document.utils.ValueMap;
import com.amazonaws.services.dynamodbv2.model.ReturnValue;

public class UpdateItemOpSample {
   static DynamoDB dynamoDB = new DynamoDB(new AmazonDynamoDBClient(
      new ProfileCredentialsProvider()));
   static String tblName = "ProductList";

   public static void main(String[] args) throws IOException {
      createItems();

      // Execute updates
      updateAddNewAttribute();
   }

   private static void createItems() {
      Table table = dynamoDB.getTable(tblName);
      try {
         Item item = new Item()
            .withPrimaryKey("ID", 303)
            .withString("Nomenclature", "Polymer Blaster 4000")
            .withStringSet("Manufacturers",
               new HashSet<String>(Arrays.asList("XYZ Inc.", "LMNOP Inc.")))
            .withNumber("Price", 50000)
            .withBoolean("InProduction", true)
            .withString("Category", "Laser Cutter");
         table.putItem(item);

         item = new Item()
            .withPrimaryKey("ID", 313)
            .withString("Nomenclature", "Agitatatron 2000")
            .withStringSet("Manufacturers",
               new HashSet<String>(Arrays.asList("XYZ Inc.", "CDE Inc.")))
            .withNumber("Price", 40000)
            .withBoolean("InProduction", true)
            .withString("Category", "Agitator");
         table.putItem(item);
      } catch (Exception e) {
         System.err.println("Cannot create items.");
         System.err.println(e.getMessage());
      }
   }

   private static void updateAddNewAttribute() {
      Table table = dynamoDB.getTable(tblName);
      try {
         UpdateItemSpec updateItemSpec = new UpdateItemSpec()
            .withPrimaryKey("ID", 303)
            .withUpdateExpression("set #na = :val1")
            .withNameMap(new NameMap()
               .with("#na", "NewAttribute"))
            .withValueMap(new ValueMap()
               .withString(":val1", "A value"))
            .withReturnValues(ReturnValue.ALL_NEW);
         UpdateItemOutcome outcome = table.updateItem(updateItemSpec);

         // Confirm
         System.out.println("Displaying updated item...");
         System.out.println(outcome.getItem().toJSONPretty());
      } catch (Exception e) {
         System.err.println("Cannot add an attribute in " + tblName);
         System.err.println(e.getMessage());
      }
   }
}
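The sample above adds an attribute rather than incrementing one. As a minimal atomic counter sketch, reusing the dynamoDB and tblName fields from the sample above, an arithmetic UpdateExpression increments the value in place −

// Atomically increment Price by 1; concurrent callers do not interfere
Table table = dynamoDB.getTable(tblName);
UpdateItemSpec incrementSpec = new UpdateItemSpec()
   .withPrimaryKey("ID", 303)
   .withUpdateExpression("set Price = Price + :incr")
   .withValueMap(new ValueMap().withNumber(":incr", 1))
   .withReturnValues(ReturnValue.UPDATED_NEW);
UpdateItemOutcome outcome = table.updateItem(incrementSpec);
System.out.println(outcome.getItem().toJSONPretty());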