Top N

Personalization · Ranking

Tracks the top N values for a given field, ranked by engagement count or accumulated numeric value. Purpose-built for personalization: it tracks per-event-type breakdowns and temporal metadata, and uses an internal pool larger than the returned set to prevent frozen leaderboards.

When to Use

  • Personalization: what are a user's top clips, songs, or products by engagement?
  • Recommendation inputs: feed ranked preference data into ML models
  • Engagement profiling: which items get deep engagement vs. shallow interactions?
  • Content curation: build "Your Top Songs" or "Most Viewed" features

How It Works

The top_n operation maintains an internal pool of tracked values (max_tracked, default: 100) that is larger than the number returned (n, default: 20). This separation matters because newcomers accumulate counts in the internal pool before competing for a spot in the top N response. When the pool exceeds max_tracked, the value with the oldest _last_seen timestamp is evicted, so a newcomer with recent activity is not evicted ahead of an incumbent that has gone stale.
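
The pool-and-eviction behavior described above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the service's implementation: the `Entry` and `TopNPool` classes, the `record` and `top` methods, and the use of float timestamps are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    total: int = 0
    first_seen: float = 0.0
    last_seen: float = 0.0


@dataclass
class TopNPool:
    """Illustrative in-memory pool; not the service's actual implementation."""
    n: int = 20
    max_tracked: int = 100
    entries: dict = field(default_factory=dict)
    evicted: int = 0

    def record(self, value: str, ts: float) -> None:
        entry = self.entries.get(value)
        if entry is None:
            # A new value enters the pool (not the top N) and starts counting.
            entry = self.entries[value] = Entry(first_seen=ts)
        entry.total += 1
        entry.last_seen = ts
        # Over capacity: drop the value with the oldest last_seen,
        # regardless of how high its count is.
        while len(self.entries) > self.max_tracked:
            stalest = min(self.entries, key=lambda v: self.entries[v].last_seen)
            del self.entries[stalest]
            self.evicted += 1

    def top(self) -> list:
        ranked = sorted(self.entries, key=lambda v: self.entries[v].total, reverse=True)
        return ranked[: self.n]


# Tiny pool so the eviction is easy to follow.
pool = TopNPool(n=1, max_tracked=2)
for value, ts in [("a", 1.0), ("a", 2.0), ("b", 3.0), ("c", 4.0)]:
    pool.record(value, ts)
print(sorted(pool.entries), pool.evicted)
```

In this run "a" has the highest count but the oldest last-seen time, so it is the value dropped when "c" arrives. The two newcomers stay in the pool and can keep accumulating counts.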

top_n vs unique

| | unique | top_n |
|---|---|---|
| Purpose | Cardinality tracking ("how many distinct X?") | Preference profiling ("what are the top X and how?") |
| Counts | Flat integer per value | Per-event-type breakdown with temporal metadata |
| Internal tracking | Only stores top/recent N values | Stores up to max_tracked values internally, returns top N |
| New value entry | Starts at 1, may be immediately evicted in ranked mode | Starts at 1, accumulates in the internal pool before competing |

Configuration

json
{
  "name": "top_clips_30d",
  "operation": "top_n",
  "group_by": "device_id",
  "fields": ["event_properties.clip_id"],
  "window_duration_seconds": 2592000,
  "operation_config": {
    "n": 20,
    "max_tracked": 100,
    "group_counts_by": "event_name",
    "snapshot_fields": ["event_properties.title", "event_properties.genre"]
  }
}

Configuration Options

| Option | Type | Default | Description |
|---|---|---|---|
| n | number | 20 | Number of top values to return in the response |
| max_tracked | number | 100 | Maximum values to track internally. Must be ≥ n |
| sum_by | string | | Dot-notation path to a numeric field to accumulate per value (e.g., "event_properties.playDuration"). Switches ranking from event count to accumulated sum |
| group_counts_by | string | | Field to sub-group counts by (e.g., "event_name") |
| snapshot_fields | string[] | | Dot-notation field paths to snapshot on each value. Captures the last-seen value of each field |
| evict_on | string[] | | Event names that trigger immediate removal from tracking (e.g., ["dislike", "block"]) |
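
A small client-side check can catch an invalid operation_config before it is submitted. The sketch below is an assumption-laden illustration: `check_top_n_config` is a hypothetical helper, and it only encodes the constraints stated in the table (the documented defaults, max_tracked ≥ n, and list-typed options).

```python
def check_top_n_config(operation_config: dict) -> list:
    """Return a list of problems; an empty list means the checks passed."""
    n = operation_config.get("n", 20)                       # documented default
    max_tracked = operation_config.get("max_tracked", 100)  # documented default
    problems = []
    if max_tracked < n:
        problems.append(f"max_tracked ({max_tracked}) must be >= n ({n})")
    for key in ("snapshot_fields", "evict_on"):
        if key in operation_config and not isinstance(operation_config[key], list):
            problems.append(f"{key} must be a list of strings")
    return problems


print(check_top_n_config({"n": 20, "max_tracked": 100}))
print(check_top_n_config({"n": 50, "max_tracked": 10}))
```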

Fields

fields specifies which values are tracked and ranked. Supports dot notation (e.g., "event_properties.clip_id"). When multiple fields are specified, their values are joined with : to form a compound key.
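
To make the compound key concrete, here is a rough Python sketch of resolving dot-notation paths and joining the results with ":". The helpers `resolve_path` and `compound_key` are hypothetical, and skipping events that lack one of the fields is an assumption, not documented behavior.

```python
def resolve_path(event: dict, path: str):
    """Follow a dot-notation path through nested dicts; None if any part is missing."""
    current = event
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def compound_key(event: dict, fields: list):
    values = [resolve_path(event, f) for f in fields]
    if any(v is None for v in values):
        return None  # assumption: events missing a field are skipped
    return ":".join(str(v) for v in values)


event = {
    "event_name": "PlaySong",
    "event_properties": {"clip_id": "clip_789", "genre": "lo-fi"},
}
key = compound_key(event, ["event_properties.clip_id", "event_properties.genre"])
print(key)  # clip_789:lo-fi
```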

Response

| Field | Description |
|---|---|
| values | Top N values sorted by _total_sum (if sum_by set) or _total descending |
| unique | Number of values in the response (≤ n) |
| unique_all_time | Total unique values ever seen in the window (including evicted) |
| evicted | Values no longer in the internal pool (capacity + rejection evictions) |
| rejected | Values explicitly removed via evict_on. Subset of evicted |
| total_events | Total engagement events processed (excludes rejections) |
| rankings[value]._total | Total event count for this value |
| rankings[value]._total_sum | Accumulated numeric sum (only when sum_by is set) |
| rankings[value]._first_seen | ISO timestamp of first event |
| rankings[value]._last_seen | ISO timestamp of most recent event |
| rankings[value]._snapshot | Last-seen values of snapshot_fields |
| rankings[value].[event_type] | Count per event type (when group_counts_by is set) |

Example (with group_counts_by and snapshot_fields)

json
{
  "behaviors": {
    "top_clips_30d": {
      "values": ["clip_789", "clip_456", "clip_123"],
      "unique": 3,
      "unique_all_time": 47,
      "evicted": 0,
      "total_events": 101,
      "rankings": {
        "clip_789": {
          "_total": 54,
          "_first_seen": "2025-05-18T09:12:00.000Z",
          "_last_seen": "2025-06-12T16:52:04.000Z",
          "_snapshot": { "title": "Midnight Drive", "genre": "lo-fi" },
          "PlaySong": 25,
          "SongCompleted": 20,
          "ShareSong": 5,
          "CreateSong": 1,
          "SkipSong": 3
        },
        "clip_456": {
          "_total": 27,
          "_first_seen": "2025-06-01T14:30:00.000Z",
          "_last_seen": "2025-06-02T11:15:00.000Z",
          "_snapshot": { "title": "Neon Skyline", "genre": "synthwave" },
          "PlaySong": 14,
          "SongCompleted": 4,
          "ShareSong": 1,
          "SkipSong": 8
        },
        "clip_123": {
          "_total": 20,
          "_first_seen": "2025-06-10T08:00:00.000Z",
          "_last_seen": "2025-06-12T16:50:00.000Z",
          "_snapshot": { "title": "Ocean Breeze", "genre": "ambient" },
          "PlaySong": 9,
          "SongCompleted": 9,
          "ShareSong": 2
        }
      },
      "timestamp": "2025-05-18T09:12:00.000Z",
      "remaining_window_seconds": 1547896
    }
  }
}

Example (without group_counts_by — flat counts)

json
{
  "behaviors": {
    "top_products_7d": {
      "values": ["product_A", "product_B", "product_C"],
      "unique": 3,
      "unique_all_time": 12,
      "evicted": 0,
      "total_events": 45,
      "rankings": {
        "product_A": {
          "_total": 22,
          "_first_seen": "2025-06-05T10:00:00.000Z",
          "_last_seen": "2025-06-12T14:30:00.000Z"
        },
        "product_B": {
          "_total": 15,
          "_first_seen": "2025-06-06T09:00:00.000Z",
          "_last_seen": "2025-06-12T12:00:00.000Z"
        },
        "product_C": {
          "_total": 8,
          "_first_seen": "2025-06-10T16:00:00.000Z",
          "_last_seen": "2025-06-12T16:00:00.000Z"
        }
      },
      "timestamp": "2025-06-05T10:00:00.000Z",
      "remaining_window_seconds": 518400
    }
  }
}

Ranking by Duration Instead of Count

Use sum_by to rank by accumulated numeric value instead of event count:

json
{
  "name": "top_clips_by_play_duration_30d",
  "operation": "top_n",
  "group_by": "device_id",
  "fields": ["event_properties.clip_id"],
  "window_duration_seconds": 2592000,
  "operation_config": {
    "n": 20,
    "sum_by": "event_properties.playDuration",
    "snapshot_fields": ["event_properties.title", "event_properties.genre"]
  }
}

Example response:

json
{
  "behaviors": {
    "top_clips_by_play_duration_30d": {
      "values": ["clip_789", "clip_456", "clip_123"],
      "unique": 3,
      "unique_all_time": 12,
      "total_events": 34,
      "rankings": {
        "clip_789": {
          "_total": 6,
          "_total_sum": 2400,
          "_first_seen": "2025-06-01T09:00:00.000Z",
          "_last_seen": "2025-06-12T16:52:04.000Z",
          "_snapshot": { "title": "Midnight Drive", "genre": "lo-fi" }
        },
        "clip_456": {
          "_total": 18,
          "_total_sum": 900,
          "_first_seen": "2025-06-05T14:30:00.000Z",
          "_last_seen": "2025-06-12T11:15:00.000Z",
          "_snapshot": { "title": "Neon Skyline", "genre": "synthwave" }
        },
        "clip_123": {
          "_total": 10,
          "_total_sum": 600,
          "_first_seen": "2025-06-10T08:00:00.000Z",
          "_last_seen": "2025-06-12T16:50:00.000Z",
          "_snapshot": { "title": "Ocean Breeze", "genre": "ambient" }
        }
      },
      "evicted": 0,
      "rejected": 0,
      "timestamp": "2025-06-01T09:00:00.000Z",
      "remaining_window_seconds": 1547896
    }
  }
}

Notice: clip_789 has only 6 plays but 2,400 seconds of total listening time — it ranks first. clip_456 has 18 plays but only 900 seconds total (the user repeatedly starts it and skips). Without sum_by, clip_456 would rank first. With sum_by, the ranking reflects actual engagement depth.
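
The two orderings can be reproduced from the response above with a short Python sketch. The `rank` helper is hypothetical; the numbers are copied from the example rankings.

```python
rankings = {
    "clip_789": {"_total": 6, "_total_sum": 2400},
    "clip_456": {"_total": 18, "_total_sum": 900},
    "clip_123": {"_total": 10, "_total_sum": 600},
}


def rank(rankings: dict, key: str) -> list:
    """Order values by the given metric, highest first."""
    return sorted(rankings, key=lambda v: rankings[v][key], reverse=True)


by_count = rank(rankings, "_total")
by_duration = rank(rankings, "_total_sum")
print(by_count)     # ['clip_456', 'clip_123', 'clip_789']
print(by_duration)  # ['clip_789', 'clip_456', 'clip_123']
```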


Tips

Count vs Duration — Two Different Signals

_total (event count) and _total_sum (play duration) often disagree, and that disagreement is signal. A clip with high _total but low _total_sum is getting started and skipped — the user is curious but not hooked. A clip with low _total but high _total_sum is getting full listens — it's a deep favorite.
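
One way to turn that disagreement into a single feature is the average accumulated value per event. The sketch below assumes sum_by points at a duration in seconds, as in the play-duration example; `seconds_per_event` is a hypothetical helper, not part of the response.

```python
def seconds_per_event(entry: dict) -> float:
    """Average accumulated value per event; 0.0 when there are no events."""
    total = entry.get("_total", 0)
    return entry.get("_total_sum", 0) / total if total else 0.0


deep_favorite = {"_total": 6, "_total_sum": 2400}
start_and_skip = {"_total": 18, "_total_sum": 900}
print(seconds_per_event(deep_favorite))   # 400.0
print(seconds_per_event(start_and_skip))  # 50.0
```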

Frozen Leaderboard Prevention

Unlike unique with ranked: true — where evicted values lose their counts permanently — top_n maintains an internal tracking pool (max_tracked, default 100) larger than the returned set. New values accumulate counts before needing to outrank the top N. Eviction is based on _last_seen (stalest value removed), not count.

Reading Engagement Profiles

When group_counts_by is set to "event_name", each value includes a per-event-type breakdown. A clip with { PlaySong: 25, SongCompleted: 20, ShareSong: 5 } signals deep engagement, while { PlaySong: 14, SkipSong: 8, SongCompleted: 4 } signals the user tried it but didn't love it.
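
As a rough illustration, the breakdown can be reduced to a couple of ratios. The event names below are the ones used in the examples on this page; `engagement_rates` and the choice of PlaySong as the denominator are assumptions for the sketch.

```python
def engagement_rates(entry: dict) -> dict:
    """Completion and skip rates relative to plays; zeros when nothing was played."""
    plays = entry.get("PlaySong", 0)
    if plays == 0:
        return {"completion_rate": 0.0, "skip_rate": 0.0}
    return {
        "completion_rate": entry.get("SongCompleted", 0) / plays,
        "skip_rate": entry.get("SkipSong", 0) / plays,
    }


deep = engagement_rates({"PlaySong": 25, "SongCompleted": 20, "ShareSong": 5})
shallow = engagement_rates({"PlaySong": 14, "SkipSong": 8, "SongCompleted": 4})
print(deep)
print(shallow)
```

The first profile completes 80% of plays with no skips, while the second skips more than half of its plays.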

User Exploration Signals

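unique_all_time and evicted reveal user behavior patterns. A user with unique: 20, unique_all_time: 47, evicted: 27 explores broadly with shifting preferences. Capacity evictions only begin once the pool exceeds max_tracked, so an evicted count like this implies a max_tracked below 47, removals via evict_on, or both. A user with unique: 5, unique_all_time: 5, evicted: 0 has a tight, stable set of favorites.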
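
A simple way to compare users on this axis is the share of all seen values that have been evicted. This is only a sketch: `exploration_ratio` is a hypothetical derived feature, not a field in the response.

```python
def exploration_ratio(behavior: dict) -> float:
    """Fraction of all values seen in the window that are no longer tracked."""
    seen = behavior.get("unique_all_time", 0)
    return behavior.get("evicted", 0) / seen if seen else 0.0


explorer = {"unique": 20, "unique_all_time": 47, "evicted": 27}
loyalist = {"unique": 5, "unique_all_time": 5, "evicted": 0}
print(round(exploration_ratio(explorer), 2))
print(exploration_ratio(loyalist))
```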

Snapshot Fields — Item Metadata Without a Catalog Join

Use snapshot_fields to attach lightweight item metadata directly to each top N entry — useful for display purposes or quick ML feature enrichment without a catalog lookup. Snapshots always reflect the last-seen values.
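
For display, a "Your Top Songs" list can be built from the response alone. The sketch below assumes snapshot keys are the last path segment (title, genre), as in the examples on this page; `render_top_list` and the fallback to the raw value are illustrative choices.

```python
def render_top_list(behavior: dict) -> list:
    """Build numbered display lines from values and their snapshots."""
    lines = []
    for position, value in enumerate(behavior["values"], start=1):
        snapshot = behavior["rankings"][value].get("_snapshot", {})
        title = snapshot.get("title", value)  # fall back to the raw value
        genre = snapshot.get("genre")
        suffix = f" ({genre})" if genre else ""
        lines.append(f"{position}. {title}{suffix}")
    return lines


behavior = {
    "values": ["clip_789", "clip_456"],
    "rankings": {
        "clip_789": {"_total": 54, "_snapshot": {"title": "Midnight Drive", "genre": "lo-fi"}},
        "clip_456": {"_total": 27},  # no snapshot captured for this value
    },
}
lines = render_top_list(behavior)
print("\n".join(lines))
```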

Evict On — Hard Negative Signals

Use evict_on to immediately remove values when the user sends a strong negative signal. Without it, a clip played 50 times and then disliked stays at #1. With evict_on: ["dislike"], the dislike hard-deletes it. The rejected field tells you how many values the user has actively removed.
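
The counter semantics described above (rejections count toward evicted and rejected but not toward total_events) can be sketched as follows. `apply_event` and the `state` dictionary are hypothetical, and ignoring a rejection for a value that is not currently tracked is an assumption.

```python
def apply_event(state: dict, value: str, event_name: str, evict_on: set) -> None:
    if event_name in evict_on:
        # Hard delete: the value and its accumulated count are dropped.
        if state["counts"].pop(value, None) is not None:
            state["evicted"] += 1
            state["rejected"] += 1
        return  # rejection events are not counted in total_events
    state["counts"][value] = state["counts"].get(value, 0) + 1
    state["total_events"] += 1


state = {"counts": {}, "total_events": 0, "evicted": 0, "rejected": 0}
for _ in range(50):
    apply_event(state, "clip_789", "PlaySong", {"dislike"})
apply_event(state, "clip_456", "PlaySong", {"dislike"})
apply_event(state, "clip_789", "dislike", {"dislike"})
print(state)
```

After the dislike, clip_789 is gone despite its 50 plays, and only clip_456 remains in the tracked counts.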