feat: implement automatic session linking and identity stitching (#3820)

Links anonymous browser sessions to authenticated user identities, enabling unified
user journey tracking across login boundaries. This solves the "logged-out anonymous
session → logged-in session" tracking gap, providing complete funnel visibility and
accurate visitor deduplication.

## Changes

- Client-side: Persistent visitor ID in localStorage (data-identity-stitching attribute)
- Server-side: identity_link table linking visitors to distinct IDs (authenticated users)
- Query updates: getWebsiteStats now deduplicates by resolved identity
- Graceful degradation: Works in Safari private browsing and when localStorage unavailable

## Implementation Details

Uses hybrid approach combining client-side persistence with server-side linking:
- Visitor ID generated once per browser, persists across sessions
- When user logs in, identify() creates identity link
- stats queries join through identity_link to deduplicate cross-device sessions

Both PostgreSQL and ClickHouse supported with appropriate query patterns:
- PostgreSQL: normalized schema, joins through session table
- ClickHouse: denormalized with ReplacingMergeTree for deduplication

## Edge Cases Handled

- Safari private browsing: localStorage throws, visitorId undefined, no link created
- localStorage cleared: new visitorId generated, creates new link
- Multiple tabs: same visitorId shared via localStorage
- Multiple devices: one visitor can link to multiple distinct_ids
- Multiple accounts: one distinct_id can link to multiple visitors

## Test Plan

- [ ] Enable feature on test website (default enabled)
- [ ] Anonymous pageview - confirm visitor_id in events table
- [ ] Call umami.identify('user1') - confirm identity_link created
- [ ] Stats show 1 visitor (deduplicated)
- [ ] Log out, browse anonymously, stats still show 1 visitor
- [ ] Test with data-identity-stitching="false" - no visitor_id collected
- [ ] Test in Safari private browsing - no errors, gracefully skips
- [ ] Test ClickHouse: verify identity_link table populated and FINAL keyword works
- [ ] Verify retroactive: historical anonymous session attributed correctly
This commit is contained in:
Arthur Sepiol 2025-12-03 16:54:56 +03:00
parent a902a87c08
commit 34db34759f
7 changed files with 76 additions and 24 deletions

View file

@ -321,11 +321,12 @@ model Pixel {
}
model IdentityLink {
id String @id @unique @map("identity_link_id") @db.Uuid
websiteId String @map("website_id") @db.Uuid
visitorId String @map("visitor_id") @db.VarChar(50)
distinctId String @map("distinct_id") @db.VarChar(50)
linkedAt DateTime @default(now()) @map("linked_at") @db.Timestamptz(6)
id String @id @unique @map("identity_link_id") @db.Uuid
websiteId String @map("website_id") @db.Uuid
visitorId String @map("visitor_id") @db.VarChar(50)
distinctId String @map("distinct_id") @db.VarChar(50)
createdAt DateTime @default(now()) @map("created_at") @db.Timestamptz(6)
linkedAt DateTime @default(now()) @updatedAt @map("linked_at") @db.Timestamptz(6)
website Website @relation(fields: [websiteId], references: [id], onDelete: Cascade)