Skip to main content
Version: 5.3

aggs

Description

Performs statistical operations on data using internal storage mechanisms.

danger

Using aggs in a query is valid only if it follows commands that also use internal storage mechanisms. These include source, inputlookup, search, and peval. This condition must also hold for all subqueries within the query.

Syntax

 | aggs [composite=<boolean>] <functions-expression> ["," <functions-expression>]  [<by_expression>]

Required Arguments

At least one function must be used:

ParameterSyntaxDescription
countcount | count(<field>)Calculates the number of events containing a field. If no field is specified, it calculates the total number of events.
valuesvalues(<field>)Computes an array of unique values for the given field.
avgavg(<field>)Computes the average value for the given field.
dcdc(<field>)Counts the number of unique values in the specified field.
earliestearliest(<field>)Computes the field value for the earliest event. An optional second parameter is the name of the timestamp field (default: @timestamp).
latestlatest(<field>)Computes the field value for the latest event. An optional second parameter is the name of the timestamp field (default: @timestamp).
maxmax(<field>)Calculates the maximum value for the given field.
minmin(<field>)Calculates the minimum value for the given field.
sumsum(<field>)Computes the sum of values for the given field.
percperc(<field>, <percent>)Calculates the percentile for the specified field and percentage.
medianmedian(<field>)Calculates the median for the specified field.

Optional Arguments

ParameterSyntaxDefaultDescription
compositecomposite=<boolean>falseAllows the use of scrolling in aggregations to obtain all possible segments (buckets) with multiple queries (similar to scroll in stats). It can only be used if there is grouping (by fields). The number of segments (buckets) that will be returned is fixed—1000.
<by_expression><by_field> ["," <by_field> ...]The name of the field (or fields) for grouping values.
Composite usage

The composite argument is available when querying OpenSearch.

Using Wildcards

The * symbol is supported for bulk field selection based on a pattern in aggregation functions. It can be used to select all fields matching the pattern, as well as to rename them using aliases.

Usage specifics in aggregations:

  • when using numerical aggregations (avg, sum, min, perc, etc.), only numerical fields will be considered. If the pattern matches other field types, the system will warn the user that these fields are excluded from the calculation
  • multiple uses of * within a single pattern are allowed
  • the number of fields for any aggregation is limited by the cluster configuration parameter max_fields_under_pattern, which defaults to 250
... | aggs avg(audit_*) as *
... | aggs count(audit_*_*)
Keyword usage in OpenSearch

Aggregations in OpenSearch perform statistical processing on numeric fields or keyword if the field is text-based. For text fields, you need to append <field-name>.keyword, which should be done for both specified and by fields. Exceptions are specified fields in functions like earliest and latest. Example:

...
| aggs avg(user.keyword), earliest(user) by event.keyword, user_count

Query Examples

warning

The examples below use datasets for mapping the .keyword field. If default indexing is configured, the .keyword suffix must be specified.

Mapping Settings

A text field is broken down into individual words during indexing, allowing for partial matches to be found (e.g., finding "cat" within the word "kitten"). This is convenient for full-text search but requires more resources.

A keyword field is not tokenized—it is stored and searched as a whole. It is suitable for exact matches (e.g., searching for "Dubai" only as a complete value). It performs faster.

Basic Examples

Example 1

Calculating the count of events for each combination of DNS status code and host name.

source winlogbeat_dns
| aggs count by event.code, host.keyword
| sort - count

When using by fields in a query, a row is returned for each unique value of the by field, containing that value and the results of the statistical functions. Since this example uses three by fields (event.code, status, and host), each unique combination of these values will be on a separate row. The sort command sorts by the count field by default.

The result of the query could be the following table:

countevent.codehost.keyword
15953VLG-DC-01
14985504MSK-DC-02
3564015SPB-DC-03
2544013VLG-DC-01
1234007SPB-DC-03
984013MSK-DC-02
795504SPB-DC-03

Example 2

Calculating the list of hosts and the count of actions for each user and their action on the hosts.

source wineventlog
| aggs count, values(host.name.keyword) as hosts by user.name.keyword, event.action.keyword

For each unique pair event.action and user.name, the query computes the count of events and a list of unique hosts, stored in the count and hosts fields respectively.

The result of the query could be the following table:

user.name.keywordevent.action.keywordcounthosts
IvanovMGcredential-validated41MSK-WS-01
SPB-WS-02
IvanovMGlogged-in26SPB-WS-02
VLG-WS-03
KZN-WS-05
IvanovMGlogged-out25SPB-WS-02
VLG-WS-03
KZN-WS-05
PetrovSIlogged-out33SPB-WS-02
KZN-WS-05
PetrovSIadded-member-to-group6MSK-WS-01
AndreevAEadded-member-to-group14SPB-WS-02
VLG-WS-03

Advanced Examples

Example 1

Getting a list of unique values for the user field:

... | aggs values(user.keyword)

Returns an array of unique values from the user.keyword field across all documents matching the query.

Sample input data:

usersource.addressevent
IvanovMG192.168.1.10login
PetrovSI192.168.1.15logout
IvanovMG192.168.1.20login
AndreevAE10.0.0.1password_reset
SidorovVA10.0.0.5password_reset

The query execution result may be the following table:

values (user.keyword)
IvanovMG
PetrovSI
AndreevAE
SidorovVA

Example 2

Getting a list of unique values for the user field for each unique event field value.

In this example, the event field in the data has values start and end. After running the command, the system will return pairs of results: start with a list of unique user values (event = "start"), and end with a list of unique user values (event = "end").

... | aggs values(user.keyword) by event.keyword

The query execution result may be the following table:

event.keywordvalues (user.keyword)
startIvanov
Petrov
endAndreev

Example 2.2

In this example, the response.status field in the data contains HTTP status values. After executing the command, the system will return pairs: 200 and a list of unique source.address.keyword values (where response.status = 200), 403 and a list of unique source.address.keyword values (where response.status = 403), and similarly for response.status = 401:

... | aggs values(source.address.keyword) by response.status

Sample input data:

user.keywordresponse.statussource.address.keywordrequest.referrer.keywordnetwork.bytes
IvanovMG200192.168.1.10/data2048
PetrovSI200192.168.1.15/login1024
IvanovMG403175.100.1.20/data512
AndreevAE40110.0.0.1/config128

The query execution result may be the following table:

response.statusvalues (source.address.keyword)
200192.168.1.10
192.168.1.15
403175.100.1.20
40110.0.0.1

Example 3

The following examples will use input data from user action audits, a portion of which is presented below. The dataset used in the examples does not have index mapping configured.

l_timehostusertargetactionoutcome
2025-05-30 13:47:08ws-01smithfordlocked-out-user-account
changed-password
added-member-to-group
success
2025-05-30 13:47:08ws-01avdeevfordreset-password
changed-password
failure
2025-05-30 13:47:08ws-01dawsoncooperenabled-user-accountsuccess
2025-05-30 13:47:08ws-02holmesjenkinschanged-passwordfailure
2025-05-30 13:47:08ws-01owenturnerremoved-member-from-groupsuccess
2025-05-30 13:47:08ws-01taylerwatsonadded-member-to-groupfailure
2025-05-30 13:47:08ws-02wrightlloydremoved-member-from-groupfailure

Example 3.1

List of all unique values for the user and action fields:

...
| aggs values(user.keyword), values(action.keyword)

Returns arrays of unique values for user.keyword and action.keyword independently of each other.

The query execution result may be the following table:

values (user.keyword)values (action.keyword)
denisov
avdeev
abramova
popov
shishkin
locked-out-user-account
reset-password
enabled-user-account
changed-password
removed-member-from-group
added-member-to-group

Example 3.2

List of unique values for user.keyword and action.keyword that share the same combination of user.keyword and action.keyword field values:

...
| aggs values(user.keyword), values(action.keyword) by user.keyword, action.keyword

Groups documents by unique combinations of user.keyword and action.keyword, returning the values of these fields for each group.

The query execution result may be the following table:

user.keywordaction.keywordvalues (user.keyword)values (action.keyword)
denisovlocked-out-user-accountdenisovlocked-out-user-account
changed-password
added-member-to-group
denisovchanged-passworddenisovlocked-out-user-account
changed-password
added-member-to-group
denisovadded-member-to-groupdenisovlocked-out-user-account
changed-password
added-member-to-group
avdeevreset-passwordavdeevreset-password
changed-password
avdeevchanged-passwordavdeevreset-password
changed-password
abramovaenabled-user-accountabramova["enabled-user-account"]
popovchanged-passwordpopovchanged-password
abramovaremoved-member-from-groupabramovaremoved-member-from-group
denisovadded-member-to-groupdenisovadded-member-to-group
shishkinremoved-member-from-groupshishkinremoved-member-from-group

Example 3.3

Number of documents containing the user.keyword field:

... | aggs count(user.keyword)

The query execution result may be the following table:

count (user.keyword)
5

Example 3.4

Number of documents containing the action.keyword field for each unique value of the user.keyword field:

... | aggs count(action.keyword) by user.keyword

The query execution result may be the following table:

user.keywordcount (user.keyword)
denisov955
avdeev34
abramova1007
popov1022
shishkin873

Example 3.5

Number of documents containing the user.keyword field and number of documents containing the action.keyword field:

... | aggs count(user.keyword), count(action.keyword)

The query execution result may be the following table:

count (user.keyword)count (action.keyword)
53891

Example 3.6

Number of documents containing the user.keyword field for each unique combination of user.keyword and action.keyword field values:

... | aggs count(user.keyword) by user.keyword, action.keyword

The query execution result may be the following table showing the first 7 entries:

user.keywordaction.keywordcount (user.keyword)
denisovlocked-out-user-account201
denisovchanged-password40
denisovadded-member-to-group46
shishkinlocked-out-user-account271
popovlocked-out-user-account511
denisovdisabled-user-account23
abramovaadded-user-account51
avdeevreset-password1
avdeevchanged-password2
popovadded-user-account59

Example 3.7

In this example, the system will return the contents of the action.keyword field for the first record in the sample, the contents of the action.keyword field for the last record in the sample, and the average value of the action_count field (which was calculated by the command from the previous example) for each unique user.keyword:

... | aggs count(user.keyword) as action_count by user.keyword, event.keyword
| aggs first(action.keyword), last(action.keyword), avg(action_count.keyword) by user.keyword

The query execution result may be the following table:

user.keywordfirst (action.keyword)last (action.keyword)avg (action_count.keyword)
denisovlocked-out-user-accountchanged-password95.8
abramovalocked-out-user-accountchanged-password100.1
shishkinremoved-member-from-groupreset-password102.2
popovremoved-member-from-groupreset-password113.22
avdeevchanged-passwordreset-password1.5

Example 3.8

In this example, the system will return the contents of the action.keyword field for the earliest record in the sample and the contents of the user field for the latest record in the sample:

... | aggs earliest(action.keyword), latest(user.keyword) by @timestamp

The query execution result may be the following table showing the first 7 entries:

@timestampearliest (action.keyword)latest (user.keyword)
2025-05-30T10:20:00.000000+0000removed-member-from-groupabramova
2025-05-30T10:21:00.000000+0000added-member-to-groupabramova
2025-05-30T10:22:00.000000+0000removed-member-from-groupdenisov
2025-05-30T10:23:00.000000+0000deleted-user-accountpopov
2025-05-30T10:24:00.000000+0000enabled-user-accountabramova
2025-05-30T10:25:00.000000+0000locked-out-user-accountabramova
2025-05-30T10:26:00.000000+0000unlocked-user-accountshishkin

Example 4

Number of unique values in the user field:

... | aggs dc(user)

Sample input data

user
Ivanov
Petrov
Andreev
Ivanov

The query execution result may be the following table:

dc(user)
3

Example 5

In this example, the system will return the value of the user field that occurs in 50 percent of cases:

... | aggs perc(user, 50)

In this example, the system groups data by delivery status status and for each group returns an array of unique vehicle identifiers vehicle, calculates the total distance distance, and determines the delivery time delivery_time that occurs in 90 percent of cases:

Example №1
... | aggs values(vehicle.keyword), sum(distance), perc(delivery_time, 90) by status.keyword

Sample input data:

vehiclestatusdistancedelivery_time@timestamp
MTL Pounderdelivered100242025-04-30T10:00:00Z
Maibatsu Muledelivered150302025-04-30T10:10:00Z
MTL Pounderin_transit200482025-04-30T10:20:00Z
Jobiult Haulerin_transit120362025-04-30T10:30:00Z
Maibatsu Muledelivered80282025-04-30T10:40:00Z
Vapid Bensonpending50122025-04-30T10:50:00Z

The query execution result may be the following table:

status.keywordvalues (vehicle_id.keyword)sum (distance)perc (delivery_time, 90)
deliveredMTL Pounder
Maibatsu Mule
33029.6
in_transitMTL Pounder
Jobiult Hauler
32046.8
pendingVapid Benson5012

Example 6

In this example, the system groups data by combinations of transaction type transaction_type and currency currency, and for each group counts the number of unique clients client, calculates the total transaction amount amount, and returns the identifier of the earliest transaction transaction_id:

... | aggs dc(client.keyword), sum(amount), earliest(transaction_id) by transaction_type.keyword, currency.keyword

Sample input data:

clienttransaction_typecurrencyamounttransaction_id@timestamp
IvanovdepositUSD100012025-04-30T10:00:00Z
PetrovdepositUSD50022025-04-30T10:10:00Z
IvanovwithdrawalEUR20032025-04-30T10:20:00Z
AndreevdepositEUR40042025-04-30T10:30:00Z
PetrovwithdrawalUSD30052025-04-30T10:40:00Z
IvanovdepositUSD60062025-04-30T09:50:00Z

The query execution result may be the following table:

transaction_type.keywordcurrency.keyworddc (client.keyword)sum (amount)earliest (transaction_id)
depositUSD215006
depositEUR14004
withdrawalEUR12003
withdrawalUSD13005

Example 7

This example will perform several calculations – for each field matching the pattern audit_* (all fields whose names start with audit_). The output name will be formed from the dynamic part of the field name * with the suffix _count:

... | aggs count(audit_*) as *_count

Sample input data:

audit_nameaudit_statusdistancedelivery_time@timestamp
MTL Pounderdelivered100242025-04-30T10:00:00Z
Maibatsu Muledelivered150302025-04-30T10:10:00Z
MTL Pounderin_transit200482025-04-30T10:20:00Z
Jobiult Haulerin_transit120362025-04-30T10:30:00Z
Maibatsu Muledelivered80282025-04-30T10:40:00Z
Vapid Bensonpending50122025-04-30T10:50:00Z

The query execution result might be the following table:

name_countstatus_count
66

Example 8

This example uses a more complex field pattern containing two * wildcards:

... | aggs max(audit_*_*) as *_count_*

Sample input data:

audit_nameaudit_delivery_statusaudit_distanceaudit_delivery_time@timestamp
MTL Pounderdelivered100242025-04-30T10:00:00Z
Maibatsu Muledelivered150302025-04-30T10:10:00Z
MTL Pounderin_transit200482025-04-30T10:20:00Z
Jobiult Haulerin_transit120362025-04-30T10:30:00Z
Maibatsu Muledelivered80282025-04-30T10:40:00Z
Vapid Bensonpending50122025-04-30T10:50:00Z

The query execution result might be the following table:

delivery_count_time
48

It is important to note that this query uses the numeric aggregation type max – meaning the calculation will be performed on numeric fields. In this case, two fields match the pattern: audit_delivery_status and audit_delivery_time. However, the values in the audit_delivery_status field are not numeric, so the calculation was performed only on the audit_delivery_time field.