140

I want to know item count with DynamoDB querying.

I can querying for DynamoDB, but I only want to know 'total count of item'.

For example, 'SELECT COUNT(*) FROM ... WHERE ...' in MySQL

$result = $aws->query(array(
 'TableName' => 'game_table',
 'IndexName' => 'week-point-index',
 'KeyConditions' => array(
    'week' => array(
        'ComparisonOperator' => 'EQ',
        'AttributeValueList' => array(
            array(Type::STRING => $week)
        )
    ),
    'point' => array(
        'ComparisonOperator' => 'GE',
        'AttributeValueList' => array(
            array(Type::NUMBER => $my_point)
        )
    )
 ),
));
echo Count($result['Items']);

this code gets the all users data higher than my point.

If count of $result is 100,000, $result is too much big. And it would exceed the limits of the query size.

I need help.

mkobit
  • 43,979
  • 12
  • 156
  • 150
sam lee
  • 1,413
  • 2
  • 9
  • 6

18 Answers18

204

With the aws dynamodb cli you can get it via scan as follows:

aws dynamodb scan --table-name <TABLE_NAME> --select "COUNT"

The response will look similar to this:

{
    "Count": 123,
    "ScannedCount": 123,
    "ConsumedCapacity": null
}

notice that this information is in real time in contrast to the describe-table api

Daniel Bubenheim
  • 4,043
  • 3
  • 14
  • 20
  • Does selecting only count consume any capacity units at all? The response includes `ConsumedCapacity: null` which kind of suggests that it's "free"? – JHH Jan 17 '18 at 10:52
  • 4
    @JHH I've just tried this and it does appear to use read capacity units. According to the DynamoDB console my query consumed about 12 units (only 1 is configured); I assume it's utilising burst capacity to service the query. – Zodman Jan 23 '18 at 06:34
  • 4
    @Zodman thanks for trying it! I read through the docs once more and I actually found a statement regarding counts that I previously missed: "Getting the count of items uses the same quantity of read capacity units and is subject to the same item size calculations, because DynamoDB has to read each item in order to increment the count." https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/CapacityUnitCalculations.html#ItemSizeCalculations.Reads – JHH Jan 23 '18 at 07:38
  • 2
    One might think that caching the count, at least within eventually consistent boundaries, would be feasible to make it only consume one single CU, but maybe getting only the count isn't a very common scenario. – JHH Jan 23 '18 at 07:41
  • @Daniel Bubenheim does this give you the actual count, given that Item count update every 6 hours? I will try it anyway and see what I get. Thanks – Helen Neely Feb 26 '18 at 10:46
  • 2
    @HelenNeely Yes, this way provides the latest item count. – Daniel Bubenheim Feb 27 '18 at 19:37
  • I wonder what the java equivalent of this is. The java code in the accepted answer requires `KeyConditions` which I don't want. – Sridhar Sarnobat Aug 02 '18 at 00:40
  • I get this error when the table increases to a few hundred thousand entries while using ~700 WCU in parallel. Shouldn't this scan/count use only a single RCU? `An error occurred (ProvisionedThroughputExceededException) when calling the Scan operation (reached max retries: 9): The level of configured provisioned throughput for the table was exceeded. Consider increasing your provisioning level with the UpdateTable API.` – cahen Jan 18 '19 at 11:50
  • 7
    This answer is not correct, at least in the context of the question. The question is how to get a count of the records matching a filter expression, hence the comparison to the WHERE statement. AFAIK from trying to figure this out for months is you can't get a count of matching records. dynamoDB will go through each record in the table or index and return those matching the filter, 1000 records at a time. You may only have 20 matching records and would get 20 as the count. However there may be 40 in the next 1000 records, and so on. – Chris Love Jul 23 '20 at 15:34
  • Good to know, it still does. You're welcome @Motilal – Daniel Bubenheim Mar 19 '21 at 17:33
49

You can use the Select parameter and use COUNT in the request. It "returns the number of matching items, rather than the matching items themselves". Important, as brought up by Saumitra R. Bhave in a comment, "If the size of the Query result set is larger than 1 MB, then ScannedCount and Count will represent only a partial count of the total items. You will need to perform multiple Query operations in order to retrieve all of the results".

I'm Not familiar with PHP but here is how you could use it with Java. And then instead of using Count (which I am guessing is a function in PHP) on the 'Items' you can use the Count value from the response - $result['Count']:

final String week = "whatever";
final Integer myPoint = 1337;
Condition weekCondition = new Condition()
        .withComparisonOperator(ComparisonOperator.EQ)
        .withAttributeValueList(new AttributeValue().withS(week));
Condition myPointCondition = new Condition()
        .withComparisonOperator(ComparisonOperator.GE)
        .withAttributeValueList(new AttributeValue().withN(myPoint.toString()))

Map<String, Condition> keyConditions = new HashMap<>();
keyConditions.put("week", weekCondition);
keyConditions.put("point", myPointCondition);

QueryRequest request = new QueryRequest("game_table");
request.setIndexName("week-point-index");
request.setSelect(Select.COUNT);
request.setKeyConditions(keyConditions);

QueryResult result = dynamoDBClient.query(request);
Integer count = result.getCount();

If you don't need to emulate the WHERE clause, you can use a DescribeTable request and use the resulting item count to get an estimate.

The number of items in the specified table. DynamoDB updates this value approximately every six hours. Recent changes might not be reflected in this value.

Also, an important note from the documentation as noted by Saumitra R. Bhave in the comments on this answer:

If the size of the Query result set is larger than 1 MB, ScannedCount and Count represent only a partial count of the total items. You need to perform multiple Query operations to retrieve all the results (see Paginating Table Query Results).

mkobit
  • 43,979
  • 12
  • 156
  • 150
  • 21
    I think there is a small catch here though, as per http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Query.html#Query.Count which says "If the size of the Query result set is larger than 1 MB, then ScannedCount and Count will represent only a partial count of the total items", which basically means that Items are first fetch and then counted, and if the size of fetched items exceed 1MB even 'count' will be paginated – Saumitra R. Bhave Apr 16 '17 at 06:22
  • I don't want to steal @mkobit's original work, but this is a way to do it using fluent chaining: `AmazonDynamoDBClientBuilder.standard().withRegion(region).withCredentials(credentialsProvider).build() .query(new QueryRequest(freeKeysTableName).withSelect(Select.COUNT)).getCount()` – Sridhar Sarnobat Jul 27 '18 at 00:31
  • 3
    Just to complete your answer, as pointed out you may have to repeat the call until LastEvaluatedKey is no longer present in the results and this may look like this (comes to the end of the code above): `while(result.getLastEvaluatedKey()!=null){ request.setExclusiveStartKey(result.getLastEvaluatedKey()); result = dynamoDBClient.query(request); count+= result.getCount(); }` – Peter Koncz Apr 05 '19 at 07:03
29

Can be seen from UI as well. Go to overview tab on table, you will see item count. Hope it helps someone.

coder007
  • 299
  • 3
  • 2
  • 10
    Yes, but this has limitations: "Storage size and item count are not updated in real-time. They are updated periodically, roughly every six hours." – matthias Jun 21 '18 at 13:23
  • Also, if we want to see the count of a query, it cant be done in Overview tab. It just shows total rows in table, not the count of resultset of my query. – Jyotsana Nandwani Apr 08 '19 at 05:16
  • In 2023 there's a "Get live item count" button or this in "Items summary" section of "Overview" tab. – Aleksandr Kravets Jun 01 '23 at 12:26
21

I'm too late here but like to extend Daniel's answer about using aws cli to include filter expression.

Running

aws dynamodb scan \
    --table-name <tableName> \
    --filter-expression "#v = :num" \
    --expression-attribute-names '{"#v": "fieldName"}' \
    --expression-attribute-values '{":num": {"N": "123"}}' \
    --select "COUNT"

would give

{
    "Count": 2945,
    "ScannedCount": 7874,
    "ConsumedCapacity": null
}

That is, ScannedCount is total count and Count is the number of items which are filtered by given expression (fieldName=123).

bob
  • 2,674
  • 1
  • 29
  • 46
12

Replace the table name and use the below query to get the data on your local environment:

aws dynamodb scan --table-name <TABLE_NAME> --select "COUNT" --endpoint-url http://localhost:8000

Replace the table name and remove the endpoint url to get the data on production environment

aws dynamodb scan --table-name <TABLE_NAME> --select "COUNT"
Ravi Ranjan
  • 269
  • 3
  • 2
8

If you happen to reach here, and you are working with C#, here is the code:

var cancellationToken = new CancellationToken();

var request = new ScanRequest("TableName") {Select = Select.COUNT};

var result = context.Client.ScanAsync(request, cancellationToken).Result;

totalCount = result.Count;
ymerej
  • 727
  • 1
  • 8
  • 21
7

If anyone is looking for a straight forward NodeJS Lambda count solution:

const data = await dynamo.scan({ Select: "COUNT", TableName: "table" }).promise();

// data.Count -> number of elements in table.
Epic Speedy
  • 636
  • 1
  • 11
  • 25
6

I'm posting this answer for anyone using C# that wants a fully functional, well-tested answer that demonstrates using query instead of scan. In particular, this answer handles more than 1MB size of items to count.

public async Task<int> GetAvailableCount(string pool_type, string pool_key)
{
    var queryRequest = new QueryRequest
    {
        TableName = PoolsDb.TableName,
        ConsistentRead = true,
        Select = Select.COUNT,
        KeyConditionExpression = "pool_type_plus_pool_key = :type_plus_key",
        ExpressionAttributeValues = new Dictionary<string, AttributeValue> {
            {":type_plus_key", new AttributeValue { S =  pool_type + pool_key }}
        },
    };
    var t0 = DateTime.UtcNow;
    var result = await Client.QueryAsync(queryRequest);
    var count = result.Count;
    var iter = 0;
    while ( result.LastEvaluatedKey != null && result.LastEvaluatedKey.Values.Count > 0) 
    {
        iter++;
        var lastkey = result.LastEvaluatedKey.Values.ToList()[0].S;
        _logger.LogDebug($"GetAvailableCount {pool_type}-{pool_key} iteration {iter} instance key {lastkey}");
        queryRequest.ExclusiveStartKey = result.LastEvaluatedKey;
        result = await Client.QueryAsync(queryRequest);
        count += result.Count;
    }
    _logger.LogDebug($"GetAvailableCount {pool_type}-{pool_key} returned {count} after {iter} iterations in {(DateTime.UtcNow - t0).TotalMilliseconds} ms.");
    return count;
}

}

JohnOpincar
  • 5,620
  • 3
  • 35
  • 38
  • What do you pass for pool_type and pool_key ? – BUMA May 30 '22 at 08:42
  • 1
    @BUMA, the table you are trying to count items in would determine what your expressions and values would be. In this case, pool_type and pool_key identify a specific pool of resources that are being managed by the service this code was taken from. – JohnOpincar Jun 08 '22 at 13:48
4

Adding some additional context to this question. In some circumstances it makes sense to Scan the table to obtain the live item count. However, if this is a frequent occurrence or if you have large tables then it can be expensive from both a cost and performance point of view. Below, I highlight 3 ways to gain the item count for your tables.

1. Scan

Using a Scan requires you to read every item in the table, this works well for one off queries but it is not scalable and can become quite expensive. Using Select: COUNT will prevent returning data, but you must still pay for reading the entire table.

Pros

  • Gets you the most recent item count ("live")
  • Is a simple API call
  • Can be run in parallel to reduce time

Cons

  • Reads the entire dataset
  • Slow performance
  • High cost

CLI example

aws dynamodb scan \
--table-name test \
--select COUNT

2. DescribeTable

DynamoDB DescribeTable API provides you with an estimated value for ItemCount which is updated approx. every 6 hours.

The number of items in the specified table. DynamoDB updates this value approximately every six hours. Recent changes might not be reflected in this value. Ref.

Calling this API gives you an instant response, however, the value of the ItemCount could be up to 6 hours stale. In certain situations this value may be adequate.

Pros

  • Instant response
  • No cost to retrieve ItemCount
  • Can be called frequently

Cons

  • Data could be stale by up to 6 hours.

CLI Example

aws dynamodb describe-table \
--table-name test \
--query Table.ItemCount

DescribeTable and CloudWatch

As previously mentioned DescribeTable updates your tables ItemCount approx. every 6 hours. We can obtain that value and plot it on a custom CloudWatch graph which allows you to monitor your tables ItemCount over time, providing you historical data.

Pros

  • Provides historical data
  • Infer how your ItemCount changes over time
  • Reasonably easy to implement

Cons

  • Data could be stale by up to 6 hours.

Implementation

Tracking DynamoDB Storage History with CloudWatch showcases how to automatically push the value of DescribeTable to CloudWatch periodically using EventBridge and Lambda, however, it is designed to push TableSizeBytes instead of ItemCount. Some small modifications to the Lambda will allow you to record ItemCount.

Leeroy Hannigan
  • 11,409
  • 3
  • 14
  • 31
3

DynamoDB now has a 'Get Live Item Count' button in the UI. Please note the production caveat if you have a large table that will consume read capacity.

Live Item Count

Live Item Count

Brian Fegter
  • 693
  • 12
  • 21
2

In Scala:

import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder
import com.amazonaws.services.dynamodbv2.document.DynamoDB
val client = AmazonDynamoDBClientBuilder.standard().build()

val dynamoDB = new DynamoDB(client)
val tableDescription = dynamoDB.getTable("table name").describe().getItemCount()
1

Similar to Java in PHP only set Select PARAMETER with value 'COUNT'

$result = $aws->query(array(
 'TableName' => 'game_table',
 'IndexName' => 'week-point-index',
 'KeyConditions' => array(
    'week' => array(
        'ComparisonOperator' => 'EQ',
        'AttributeValueList' => array(
            array(Type::STRING => $week)
        )
    ),
    'point' => array(
        'ComparisonOperator' => 'GE',
        'AttributeValueList' => array(
            array(Type::NUMBER => $my_point)
        )
    )
 ),
 'Select' => 'COUNT'
));

and acces it just like this :

echo $result['Count'];

but as Saumitra mentioned above be careful with resultsets largers than 1 MB, in that case use LastEvaluatedKey til it returns null to get the last updated count value.

jajhonrod
  • 69
  • 3
0

You could use dynamodb mapper query.

PaginatedQueryList<YourModel> list = DymamoDBMapper.query(YourModel.class, queryExpression);
int count = list.size();

it calls loadAllResults() that would lazily load next available result until allResultsLoaded.

Ref: https://docs.amazonaws.cn/en_us/amazondynamodb/latest/developerguide/DynamoDBMapper.Methods.html#DynamoDBMapper.Methods.query

Feng Han
  • 361
  • 2
  • 7
0

This is how you would do it using the DynamoDBMapper (Kotlin syntax), example with no filters at all:

dynamoDBMapper.count(MyEntity::class.java, DynamoDBScanExpression())
Michael Böckling
  • 7,341
  • 6
  • 55
  • 76
0
$aws = new Aws\DynamoDb\DynamoDbClient([
    'region'  => 'us-west-2',
    'version' => 'latest',
]);

$result = $aws->scan(array(
    'TableName' => 'game_table',
    'Count' => true
));

echo $result['Count'];
Dragos Lupei
  • 592
  • 4
  • 10
-1
len(response['Items'])

will give you the count of the filtered rows

where,

fe = Key('entity').eq('tesla')
response = table.scan(FilterExpression=fe)
RendezAWS
  • 103
  • 9
-2

I used scan to get total count of the required tableName.Following is a Java code snippet for same

Long totalItemCount = 0;
do{
    ScanRequest req = new ScanRequest();
    req.setTableName(tableName);

    if(result != null){
        req.setExclusiveStartKey(result.getLastEvaluatedKey());
    }

    result = client.scan(req);

    totalItemCount += result.getItems().size();

} while(result.getLastEvaluatedKey() != null);

System.out.println("Result size: " + totalItemCount);
-5

This is solution for AWS JavaScript SDK users, it is almost same for other languages.

Result.data.Count will give you what you are looking for

 apigClient.getitemPost({}, body, {})

    .then(function(result){

        var dataoutput = result.data.Items[0];

        console.log(result.data.Count);
  }).catch( function(result){

});