Links anonymous browser sessions to authenticated user identities, enabling unified
user journey tracking across login boundaries. This solves the "logged-out anonymous
session → logged-in session" tracking gap, providing complete funnel visibility and
accurate visitor deduplication.
## Changes
- Client-side: Persistent visitor ID in localStorage (opt out via the data-identity-stitching attribute)
- Server-side: identity_link table linking visitors to distinct IDs (authenticated users)
- Query updates: getWebsiteStats now deduplicates by resolved identity
- Graceful degradation: tracking continues without a visitor ID in Safari private browsing and when localStorage is unavailable
## Implementation Details
Uses a hybrid approach that combines client-side persistence with server-side linking:
- Visitor ID generated once per browser, persists across sessions
- When the user logs in, identify() creates an identity link
- Stats queries join through identity_link to deduplicate cross-device sessions
Both PostgreSQL and ClickHouse are supported with appropriate query patterns:
- PostgreSQL: normalized schema, joins through session table
- ClickHouse: denormalized with ReplacingMergeTree for deduplication
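
The resolution step can be modelled in memory to show what "deduplicate by resolved identity" means. This is only a sketch: the PR does this in SQL, and the type names, field names, and the fallback order (linked distinct ID, then visitor ID, then session ID) are assumptions made for illustration.

```typescript
// Hypothetical in-memory model; names are illustrative, not the PR's schema.
interface SessionRow {
  sessionId: string;
  visitorId?: string; // undefined when the tracker could not persist an ID
}

interface IdentityLink {
  visitorId: string;
  distinctId: string;
}

// Resolve each session to one identity key: the linked distinct ID if one
// exists, otherwise the visitor ID, otherwise the session ID itself.
function countUniqueVisitors(sessions: SessionRow[], links: IdentityLink[]): number {
  const visitorToDistinct = new Map<string, string>();
  for (const link of links) {
    // Assumption: if a visitor has several links, the first one wins.
    if (!visitorToDistinct.has(link.visitorId)) {
      visitorToDistinct.set(link.visitorId, link.distinctId);
    }
  }

  const identities = new Set<string>();
  for (const s of sessions) {
    const linked = s.visitorId ? visitorToDistinct.get(s.visitorId) : undefined;
    if (linked !== undefined) {
      identities.add(`user:${linked}`);
    } else if (s.visitorId) {
      identities.add(`visitor:${s.visitorId}`);
    } else {
      identities.add(`session:${s.sessionId}`);
    }
  }
  return identities.size;
}

const sessions: SessionRow[] = [
  { sessionId: "s1", visitorId: "v-laptop" }, // anonymous visit on a laptop
  { sessionId: "s2", visitorId: "v-laptop" }, // same browser after login
  { sessionId: "s3", visitorId: "v-phone" },  // same user on a phone
  { sessionId: "s4", visitorId: "v-other" },  // unrelated anonymous visitor
];
const links: IdentityLink[] = [
  { visitorId: "v-laptop", distinctId: "user1" },
  { visitorId: "v-phone", distinctId: "user1" },
];
console.log(countUniqueVisitors(sessions, links)); // 2
```

Without any links the same sessions would count as three visitors, which is the gap this PR closes.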
## Edge Cases Handled
- Safari private browsing: localStorage throws, visitorId undefined, no link created
- localStorage cleared: new visitorId generated, creates new link
- Multiple tabs: same visitorId shared via localStorage
- Multiple devices: one distinct_id can link to multiple visitors
- Multiple accounts: one visitor can link to multiple distinct_ids
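
The storage-related cases above could be handled along these lines. This is a hedged sketch rather than the tracker's actual code: the `KeyValueStore` interface, the `VISITOR_KEY` name, and the injected `generateId` function are all assumptions chosen to keep the example self-contained.

```typescript
// Minimal storage shape so the sketch does not depend on DOM typings.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const VISITOR_KEY = "umami.visitor"; // hypothetical key name

// Returns the persisted visitor ID, creating one on first use.
// Returns undefined when storage is missing or throws, so the caller can
// skip identity linking instead of failing.
function getOrCreateVisitorId(
  store: KeyValueStore | undefined,
  generateId: () => string,
): string | undefined {
  if (!store) return undefined;
  try {
    const existing = store.getItem(VISITOR_KEY);
    if (existing) return existing;
    const created = generateId();
    store.setItem(VISITOR_KEY, created);
    return created;
  } catch {
    return undefined;
  }
}

// In-memory stand-ins for a working store and a blocked one.
function memoryStore(): KeyValueStore {
  const data = new Map<string, string>();
  return {
    getItem: (key) => data.get(key) ?? null,
    setItem: (key, value) => { data.set(key, value); },
  };
}
const throwingStore: KeyValueStore = {
  getItem: () => { throw new Error("storage disabled"); },
  setItem: () => { throw new Error("storage disabled"); },
};

let counter = 0;
const nextId = () => `visitor-${++counter}`;
const store = memoryStore();
const first = getOrCreateVisitorId(store, nextId);           // new ID is generated and stored
const second = getOrCreateVisitorId(store, nextId);          // same ID is read back
const blocked = getOrCreateVisitorId(throwingStore, nextId); // undefined, no error thrown
```

In a browser the caller would pass `window.localStorage` and something like `() => crypto.randomUUID()`. Reading `window.localStorage` can itself throw in some configurations, so that access would need the same guard.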
## Test Plan
- [ ] Enable feature on test website (default enabled)
- [ ] Anonymous pageview - confirm visitor_id in events table
- [ ] Call umami.identify('user1') - confirm identity_link created
- [ ] Stats show 1 visitor (deduplicated)
- [ ] Log out, browse anonymously, stats still show 1 visitor
- [ ] Test with data-identity-stitching="false" - no visitor_id collected
- [ ] Test in Safari private browsing - no errors, gracefully skips
- [ ] Test ClickHouse: verify identity_link table populated and FINAL keyword works
- [ ] Verify retroactive attribution: historical anonymous sessions are attributed correctly
Adds automatic identity stitching, which links anonymous
browsing sessions to authenticated user sessions.
## Changes
### Database Schema
- Add `identity_link` table (PostgreSQL + ClickHouse) to store mappings
between visitor IDs and authenticated user IDs
- Add `visitor_id` field to `Session` model
- Add `visitor_id` column to ClickHouse `website_event` table
### Client Tracker
- Generate and persist `visitor_id` in localStorage
- Include `vid` in all tracking payloads
- Support opt-out via `data-identity-stitching="false"` attribute
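
As a rough illustration of the opt-out behavior, the payload builder might look like the sketch below. The `TrackerPayload` shape is deliberately simplified and `buildPayload` is a hypothetical helper; only the `vid` field name and the `data-identity-stitching="false"` convention come from this PR.

```typescript
// Simplified payload; the real tracker sends more fields than shown here.
interface TrackerPayload {
  website: string;
  url: string;
  vid?: string;
}

// Hypothetical helper: `stitchingAttr` is the raw value of the script tag's
// data-identity-stitching attribute (null when the attribute is absent).
function buildPayload(
  website: string,
  url: string,
  visitorId: string | undefined,
  stitchingAttr: string | null,
): TrackerPayload {
  const payload: TrackerPayload = { website, url };
  const enabled = stitchingAttr !== "false"; // enabled by default
  if (enabled && visitorId) payload.vid = visitorId;
  return payload;
}

const withVid = buildPayload("site-1", "/pricing", "v-123", null);
const optedOut = buildPayload("site-1", "/pricing", "v-123", "false");
```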
### API
- Accept `vid` parameter in `/api/send` endpoint
- Auto-create identity links when `identify()` is called with both
visitor_id and distinct_id
- Store visitor_id in sessions and events
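
The link-creation rule can be sketched as an idempotent upsert that runs only when both identifiers are present. Everything here is illustrative: `SendBody`, `IdentityLinkStore`, and `linkIfIdentified` are hypothetical names, and the in-memory set stands in for the `identity_link` table.

```typescript
// Hypothetical shapes; field names are illustrative, not the PR's schema.
interface SendBody {
  website: string;
  vid?: string;        // visitor ID from the tracker, absent when opted out
  distinctId?: string; // set once identify() has been called
}

class IdentityLinkStore {
  private readonly keys = new Set<string>();

  // Returns true when a new link was stored, false when it already existed.
  upsert(website: string, visitorId: string, distinctId: string): boolean {
    const key = JSON.stringify([website, visitorId, distinctId]);
    if (this.keys.has(key)) return false;
    this.keys.add(key);
    return true;
  }

  get size(): number {
    return this.keys.size;
  }
}

// Creates a link only when both identifiers are present.
function linkIfIdentified(store: IdentityLinkStore, body: SendBody): boolean {
  if (!body.vid || !body.distinctId) return false;
  return store.upsert(body.website, body.vid, body.distinctId);
}

const linkStore = new IdentityLinkStore();
linkIfIdentified(linkStore, { website: "w1", vid: "v1" });                      // anonymous: no link
linkIfIdentified(linkStore, { website: "w1", vid: "v1", distinctId: "user1" }); // creates the link
linkIfIdentified(linkStore, { website: "w1", vid: "v1", distinctId: "user1" }); // repeat: no duplicate
console.log(linkStore.size); // 1
```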
### Query Updates
- Update `getWebsiteStats` to deduplicate visitors by resolved identity
- Visitors who browse anonymously then log in are now counted as one user
## Usage
When a user logs in, call `umami.identify(userId)`. If identity stitching
is enabled (default), the tracker automatically links the anonymous
visitor_id to the authenticated userId. Stats queries then resolve
linked identities to accurately count unique visitors.
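
A login handler might wire this up as follows. The sketch assumes only the `identify(userId)` call described above; `IdentifyCapable` and `onLoginSuccess` are hypothetical names, and the guard covers the case where the tracker script is blocked or not yet loaded.

```typescript
// Minimal shape of the tracker surface used here; in the browser the real
// object is the global `umami`.
interface IdentifyCapable {
  identify(userId: string): void;
}

function onLoginSuccess(tracker: IdentifyCapable | undefined, userId: string): void {
  // Skip silently when the tracker is unavailable.
  tracker?.identify(userId);
}

// Stand-in tracker that records calls, for demonstration only.
const calls: string[] = [];
const fakeTracker: IdentifyCapable = { identify: (id) => { calls.push(id); } };
onLoginSuccess(fakeTracker, "user1");
onLoginSuccess(undefined, "user2"); // tracker missing: nothing happens
```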
Resolves #3820