Triggers

Triggers determine when Beam emits results for a window. They control the balance between latency (how quickly you get results) and completeness (how much data is included).

What are Triggers?

Triggers answer the question: “When should we emit results for this window?” From the Apache Beam examples:

/**
 * This example illustrates the basic concepts behind triggering. It shows how to use 
 * different trigger definitions to produce partial (speculative) results before all 
 * the data is processed and to control when updated results are produced for late data.
 *
 * Concepts:
 * 1. The default triggering behavior
 * 2. Late data with the default trigger
 * 3. How to get speculative estimates
 * 4. Combining late data and speculative estimates
 */

Why Triggers?

Triggers enable you to:

Get early results before all data arrives (speculative results)
Handle late data that arrives after the window closes
Balance latency and accuracy based on your requirements
Refine results as more data becomes available

Without triggers, windows only emit results once when the watermark passes the end of the window. Triggers give you fine-grained control over this timing.

Default Trigger Behavior

By default, Beam uses the following trigger:

AfterWatermark.pastEndOfWindow()

This means:

Results are emitted once when the watermark passes the end of the window
Late data is discarded (unless you configure allowed lateness)
No early (speculative) results

// Default trigger (implicit)
PCollection<String> windowed = input.apply(
    Window.<String>into(FixedWindows.of(Duration.standardMinutes(5))));
// Emits results once per window when watermark passes

// Equivalent explicit form
PCollection<String> windowed = input.apply(
    Window.<String>into(FixedWindows.of(Duration.standardMinutes(5)))
        .triggering(AfterWatermark.pastEndOfWindow())
        .discardingFiredPanes());

Built-in Trigger Types

Event Time Triggers

Fire based on the watermark (event time progress):

AfterWatermark

Fire when the watermark passes the end of the window:

import org.apache.beam.sdk.transforms.windowing.AfterWatermark;
import org.apache.beam.sdk.transforms.windowing.Window;

PCollection<String> results = input.apply(
    Window.<String>into(FixedWindows.of(Duration.standardMinutes(5)))
        .triggering(AfterWatermark.pastEndOfWindow())
        .discardingFiredPanes());

Processing Time Triggers

Fire based on processing time (wall clock time):

AfterProcessingTime

Fire after a certain amount of processing time has elapsed:

import org.apache.beam.sdk.transforms.windowing.AfterProcessingTime;
import org.apache.beam.sdk.transforms.windowing.Repeatedly;

// Fire every minute of processing time
PCollection<String> results = input.apply(
    Window.<String>into(FixedWindows.of(Duration.standardMinutes(5)))
        .triggering(
            Repeatedly.forever(
                AfterProcessingTime.pastFirstElementInPane()
                    .plusDelayOf(Duration.standardMinutes(1))))
        .discardingFiredPanes());

Count-based Triggers

Fire after a certain number of elements:

AfterCount

import org.apache.beam.sdk.transforms.windowing.AfterPane;

// Fire after every 100 elements
PCollection<String> results = input.apply(
    Window.<String>into(FixedWindows.of(Duration.standardMinutes(5)))
        .triggering(
            Repeatedly.forever(
                AfterPane.elementCountAtLeast(100)))
        .discardingFiredPanes());

Composite Triggers

Combine multiple triggers with logical operations:

AfterEach (Sequence)

Fire triggers in sequence:

import org.apache.beam.sdk.transforms.windowing.AfterEach;

// Fire after 100 elements, then after watermark
PCollection<String> results = input.apply(
    Window.<String>into(FixedWindows.of(Duration.standardMinutes(5)))
        .triggering(
            AfterEach.inOrder(
                AfterPane.elementCountAtLeast(100),
                AfterWatermark.pastEndOfWindow()))
        .discardingFiredPanes());

AfterAll

Fire when all sub-triggers have fired:

import org.apache.beam.sdk.transforms.windowing.AfterAll;

// Fire when BOTH conditions are met
PCollection<String> results = input.apply(
    Window.<String>into(FixedWindows.of(Duration.standardMinutes(5)))
        .triggering(
            AfterAll.of(
                AfterPane.elementCountAtLeast(100),
                AfterProcessingTime.pastFirstElementInPane()
                    .plusDelayOf(Duration.standardMinutes(1))))
        .discardingFiredPanes());

AfterAny

Fire when any sub-trigger fires:

import org.apache.beam.sdk.transforms.windowing.AfterFirst;

// Fire when EITHER condition is met
PCollection<String> results = input.apply(
    Window.<String>into(FixedWindows.of(Duration.standardMinutes(5)))
        .triggering(
            AfterFirst.of(
                AfterPane.elementCountAtLeast(1000),
                AfterProcessingTime.pastFirstElementInPane()
                    .plusDelayOf(Duration.standardMinutes(1))))
        .discardingFiredPanes());

Early and Late Firings

The most common pattern combines early (speculative) results with on-time and late results:

import org.apache.beam.sdk.transforms.windowing.*;

// Early results every minute, final result at watermark, late data every 5 minutes
PCollection<KV<String, Integer>> results = input.apply(
    Window.<KV<String, Integer>>into(
            FixedWindows.of(Duration.standardMinutes(30)))
        .triggering(
            AfterWatermark.pastEndOfWindow()
                // Speculative early results
                .withEarlyFirings(
                    AfterProcessingTime.pastFirstElementInPane()
                        .plusDelayOf(Duration.standardMinutes(1)))
                // Handle late data
                .withLateFirings(
                    AfterProcessingTime.pastFirstElementInPane()
                        .plusDelayOf(Duration.standardMinutes(5))))
        .withAllowedLateness(Duration.standardDays(1))
        .accumulatingFiredPanes());

This trigger:

Early: Fires every minute with partial results
On-time: Fires when watermark passes end of window
Late: Fires every 5 minutes for late-arriving data

Accumulation Modes

Control how results accumulate across trigger firings:

Discarding Mode

Each pane contains only new data since the last firing:

.discardingFiredPanes()

Window [0:00-0:05]
  Firing 1 (early):    elements [A, B, C]      → output: [A, B, C]
  Firing 2 (on-time):  elements [D, E]         → output: [D, E]
  Firing 3 (late):     elements [F]            → output: [F]

Use discarding mode when you want to avoid duplicate processing and can handle incremental updates.

Accumulating Mode

Each pane contains all data seen so far:

.accumulatingFiredPanes()

Window [0:00-0:05]
  Firing 1 (early):    elements [A, B, C]      → output: [A, B, C]
  Firing 2 (on-time):  elements [D, E]         → output: [A, B, C, D, E]
  Firing 3 (late):     elements [F]            → output: [A, B, C, D, E, F]

Use accumulating mode when downstream systems expect complete results and can handle retractions/updates.

Complete Example: Real-time Dashboard

From the Apache Beam examples:

public class RealtimeDashboard {
    
    public static void main(String[] args) {
        Pipeline p = Pipeline.create(options);
        
        PCollection<KV<String, Integer>> scores = p
            .apply("ReadScores", 
                PubsubIO.readStrings().fromTopic("game-scores"))
            .apply("ParseScores", 
                ParDo.of(new ParseEventFn()));
        
        // Get early updates every 30 seconds, final at watermark
        PCollection<KV<String, Integer>> teamScores = scores.apply(
            Window.<KV<String, Integer>>into(
                    FixedWindows.of(Duration.standardMinutes(5)))
                .triggering(
                    AfterWatermark.pastEndOfWindow()
                        .withEarlyFirings(
                            AfterProcessingTime.pastFirstElementInPane()
                                .plusDelayOf(Duration.standardSeconds(30))))
                .withAllowedLateness(Duration.standardMinutes(10))
                .accumulatingFiredPanes())
            .apply("SumScores", Sum.integersPerKey());
        
        // Write to dashboard
        teamScores
            .apply("FormatResults", ParDo.of(new FormatFn()))
            .apply("WriteToDashboard", 
                PubsubIO.writeStrings().to("dashboard-updates"));
        
        p.run();
    }
}

This example:

Early firings: Updates dashboard every 30 seconds with partial results
On-time firing: Emits final result when window closes
Late firings: Updates if late data arrives (within 10-minute lateness)
Accumulating: Each update contains complete results so far

Pane Information

Access metadata about trigger firings:

class ProcessPaneFn extends DoFn<KV<String, Integer>, String> {
    @ProcessElement
    public void processElement(
            @Element KV<String, Integer> element,
            BoundedWindow window,
            PaneInfo pane,
            OutputReceiver<String> out) {
        
        String timing = pane.getTiming().toString(); // EARLY, ON_TIME, LATE
        boolean isFirst = pane.isFirst();
        boolean isLast = pane.isLast();
        long index = pane.getIndex();
        
        String result = String.format(
            "Key: %s, Value: %d, Timing: %s, IsFirst: %s, IsLast: %s",
            element.getKey(), element.getValue(), timing, isFirst, isLast);
        
        out.output(result);
    }
}

Best Practices

Start simple: Begin with the default trigger. Add complexity only when you need early results or late data handling.

Cost vs. Latency: More frequent firings mean lower latency but higher processing costs. Balance based on your requirements.

Monitor pane counts: High pane counts may indicate issues with your trigger configuration or data arrival patterns.

Common Patterns

Real-time dashboard: Early firings every 30-60 seconds
Hourly reports: Watermark trigger with 1-hour allowed lateness
Session analysis: Default trigger with session windows
High-throughput counting: Process every N elements or M seconds
Critical updates: Watermark trigger with minimal latency tolerance

When to Use Which Trigger

Use Case	Recommended Trigger
Batch processing	Default (AfterWatermark)
Real-time dashboard	AfterWatermark with early firings
Low-latency alerts	AfterProcessingTime
Exactly-once processing	Default with no early/late firings
Approximate results OK	Frequent AfterProcessingTime
Late data important	AfterWatermark with late firings

Debugging Triggers

Log pane information to understand trigger behavior:

class LogPaneInfoFn<T> extends DoFn<T, T> {
    private static final Logger LOG = LoggerFactory.getLogger(LogPaneInfoFn.class);
    
    @ProcessElement
    public void processElement(
            @Element T element,
            BoundedWindow window,
            PaneInfo pane,
            OutputReceiver<T> out) {
        
        LOG.info("Element: {}, Window: {}, Pane: {}, Timing: {}",
            element, window, pane.getIndex(), pane.getTiming());
        
        out.output(element);
    }
}

What are Triggers?

Why Triggers?

Default Trigger Behavior

Built-in Trigger Types

Event Time Triggers

AfterWatermark

Processing Time Triggers

AfterProcessingTime

Count-based Triggers

AfterCount

Composite Triggers

AfterEach (Sequence)

AfterAll

AfterAny

Early and Late Firings

Accumulation Modes

Discarding Mode

Accumulating Mode

Complete Example: Real-time Dashboard

Pane Information

Best Practices

Common Patterns

When to Use Which Trigger

Debugging Triggers

Next Steps

Windowing

Transforms

Documentation Index

​What are Triggers?

​Why Triggers?

​Default Trigger Behavior

​Built-in Trigger Types

​Event Time Triggers

​AfterWatermark

​Processing Time Triggers

​AfterProcessingTime

​Count-based Triggers

​AfterCount

​Composite Triggers

​AfterEach (Sequence)

​AfterAll

​AfterAny

​Early and Late Firings

​Accumulation Modes

​Discarding Mode

​Accumulating Mode

​Complete Example: Real-time Dashboard

​Pane Information

​Best Practices

​Common Patterns

​When to Use Which Trigger

​Debugging Triggers

​Next Steps

Windowing

Transforms

What are Triggers?

Why Triggers?

Default Trigger Behavior

Built-in Trigger Types

Event Time Triggers

AfterWatermark

Processing Time Triggers

AfterProcessingTime

Count-based Triggers

AfterCount

Composite Triggers

AfterEach (Sequence)

AfterAll

AfterAny

Early and Late Firings

Accumulation Modes

Discarding Mode

Accumulating Mode

Complete Example: Real-time Dashboard

Pane Information

Best Practices

Common Patterns

When to Use Which Trigger

Debugging Triggers

Next Steps