Commit graph

15 commits

Author SHA1 Message Date
Arthur Sepiol
34db34759f feat: implement automatic session linking and identity stitching (#3820)
Links anonymous browser sessions to authenticated user identities, enabling unified
user journey tracking across login boundaries. This solves the "logged-out anonymous
session → logged-in session" tracking gap, providing complete funnel visibility and
accurate visitor deduplication.

## Changes

- Client-side: Persistent visitor ID in localStorage (data-identity-stitching attribute)
- Server-side: identity_link table linking visitors to distinct IDs (authenticated users)
- Query updates: getWebsiteStats now deduplicates by resolved identity
- Graceful degradation: works in Safari private browsing and whenever localStorage is unavailable

## Implementation Details

Uses a hybrid approach that combines client-side persistence with server-side linking:
- A visitor ID is generated once per browser and persists across sessions
- When a user logs in, identify() creates an identity link
- Stats queries join through identity_link to deduplicate cross-device sessions
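
The deduplication step can be sketched in memory as follows. This is a minimal illustration, not the query from this commit: the `Session` and `IdentityLink` shapes and the `countUniqueVisitors` helper are hypothetical, and the sketch assumes each visitor resolves to at most one distinct ID (the most recent link wins).

```typescript
// Hypothetical shapes for illustration; the real schema may differ.
interface Session {
  sessionId: string;
  visitorId?: string; // absent when localStorage was unavailable
}

interface IdentityLink {
  visitorId: string;
  distinctId: string;
}

// Count unique visitors by resolved identity:
// linked distinct ID if present, else visitor ID, else the session itself.
function countUniqueVisitors(sessions: Session[], links: IdentityLink[]): number {
  const linkByVisitor = new Map<string, string>();
  for (const link of links) {
    linkByVisitor.set(link.visitorId, link.distinctId); // later links overwrite earlier ones
  }

  const resolved = new Set<string>();
  for (const s of sessions) {
    if (s.visitorId === undefined) {
      resolved.add(`session:${s.sessionId}`);
      continue;
    }
    const distinctId = linkByVisitor.get(s.visitorId);
    resolved.add(distinctId !== undefined ? `user:${distinctId}` : `visitor:${s.visitorId}`);
  }
  return resolved.size;
}
```

With two browsers linked to the same distinct ID, their sessions collapse into one visitor, while sessions without a visitor ID are still counted individually.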

Both PostgreSQL and ClickHouse supported with appropriate query patterns:
- PostgreSQL: normalized schema, joins through session table
- ClickHouse: denormalized with ReplacingMergeTree for deduplication

## Edge Cases Handled

- Safari private browsing: localStorage throws, visitorId undefined, no link created
- localStorage cleared: new visitorId generated, creates new link
- Multiple tabs: same visitorId shared via localStorage
- Multiple devices: one distinct_id can link to multiple visitors (one per browser)
- Multiple accounts: one visitor can link to multiple distinct_ids
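
The storage-related cases above can be sketched as a single defensive helper. This is an assumption-laden illustration rather than the tracker's actual code: `StorageLike`, `VISITOR_KEY`, and `getOrCreateVisitorId` are hypothetical names, and the storage object and ID generator are passed in so the logic can run outside a browser.

```typescript
// Minimal subset of the Web Storage API that the helper relies on.
interface StorageLike {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical key name; the real tracker may use a different one.
const VISITOR_KEY = 'umami.visitor';

// Returns the persisted visitor ID, creating one if none exists.
// Returns undefined when storage is missing or throws, so the caller
// can skip identity linking instead of failing.
function getOrCreateVisitorId(
  storage: StorageLike | undefined,
  generate: () => string,
): string | undefined {
  if (!storage) return undefined;
  try {
    const existing = storage.getItem(VISITOR_KEY);
    if (existing) return existing; // shared by every tab using the same storage
    const created = generate();    // first visit, or storage was cleared
    storage.setItem(VISITOR_KEY, created);
    return created;
  } catch {
    return undefined;              // storage access denied or quota exceeded
  }
}
```

In a browser this might be called with `window.localStorage` and a generator such as `() => crypto.randomUUID()`. Reading `window.localStorage` can itself throw in some privacy modes, so that lookup would also need a guard.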

## Test Plan

- [ ] Enable feature on test website (default enabled)
- [ ] Anonymous pageview - confirm visitor_id in events table
- [ ] Call umami.identify('user1') - confirm identity_link created
- [ ] Stats show 1 visitor (deduplicated)
- [ ] Log out, browse anonymously, stats still show 1 visitor
- [ ] Test with data-identity-stitching="false" - no visitor_id collected
- [ ] Test in Safari private browsing - no errors, gracefully skips
- [ ] Test ClickHouse: verify identity_link table populated and FINAL keyword works
- [ ] Verify retroactive attribution: historical anonymous sessions are attributed correctly
2025-12-03 16:54:56 +03:00
Arthur Sepiol
a902a87c08 feat: implement identity stitching for session linking (#3820)
Adds automatic session linking/identity stitching to link anonymous
browsing sessions with authenticated user sessions.

## Changes

### Database Schema
- Add `identity_link` table (PostgreSQL + ClickHouse) to store mappings
  between visitor IDs and authenticated user IDs
- Add `visitor_id` field to `Session` model
- Add `visitor_id` column to ClickHouse `website_event` table

### Client Tracker
- Generate and persist `visitor_id` in localStorage
- Include `vid` in all tracking payloads
- Support opt-out via `data-identity-stitching="false"` attribute

### API
- Accept `vid` parameter in `/api/send` endpoint
- Auto-create identity links when `identify()` is called with both
  visitor_id and distinct_id
- Store visitor_id in sessions and events
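
The auto-linking rule can be sketched as a small idempotent upsert. This is a sketch under stated assumptions, not the endpoint's code: only the `vid` parameter comes from this commit message, while `IdentifyPayload`, `StoredLink`, the `distinctId` field name, and the in-memory `Map` standing in for the `identity_link` table are hypothetical.

```typescript
// Hypothetical payload shape; only `vid` is named in the commit message.
interface IdentifyPayload {
  websiteId: string;
  vid?: string;        // visitor ID from the tracker, if collected
  distinctId?: string; // authenticated user ID, if identify() was called
}

interface StoredLink {
  websiteId: string;
  visitorId: string;
  distinctId: string;
}

// Creates a link only when both IDs are present; repeated calls with the
// same pair return the existing link instead of adding a duplicate.
function upsertIdentityLink(
  links: Map<string, StoredLink>,
  payload: IdentifyPayload,
): StoredLink | null {
  const { websiteId, vid, distinctId } = payload;
  if (!vid || !distinctId) return null; // opt-out, private browsing, or anonymous event

  const key = `${websiteId}:${vid}:${distinctId}`;
  let link = links.get(key);
  if (!link) {
    link = { websiteId, visitorId: vid, distinctId };
    links.set(key, link);
  }
  return link;
}
```

Keying on the full (website, visitor, distinct ID) triple keeps the relationship many-to-many, so one browser can be linked to several accounts and one account to several browsers.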

### Query Updates
- Update `getWebsiteStats` to deduplicate visitors by resolved identity
- Visitors who browse anonymously then log in are now counted as one user

## Usage

When a user logs in, call `umami.identify(userId)`. If identity stitching
is enabled (default), the tracker automatically links the anonymous
visitor_id to the authenticated userId. Stats queries then resolve
linked identities to accurately count unique visitors.
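
A login hook along these lines is one way to wire this up. It is only a sketch: `UmamiTracker` and `onLoginSuccess` are hypothetical names, and the tracker is passed in as a parameter because the script may be blocked or not yet loaded.

```typescript
// Minimal view of the tracker; only identify(userId) is taken from this commit message.
interface UmamiTracker {
  identify(userId: string): void;
}

// Hypothetical hook to call after a successful login.
// Returns false when the tracker is unavailable so callers can ignore it safely.
function onLoginSuccess(tracker: UmamiTracker | undefined, userId: string): boolean {
  if (!tracker) return false; // script blocked by an extension or still loading
  tracker.identify(userId);   // links the anonymous visitor_id to userId when stitching is enabled
  return true;
}
```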

Resolves #3820
2025-12-03 16:06:54 +03:00
Mike Cao
67105f2cff Updated packages. 2025-09-10 17:16:04 -07:00
Mike Cao
3c5c1e48e9 Refactored settings. Updated sidebar. 2025-08-15 22:16:28 -07:00
Mike Cao
eabdd18604 Updated Prisma build. 2025-08-15 12:29:33 -07:00
Mike Cao
88639dfe83 New schema for pixels and links. 2025-08-13 20:27:54 -07:00
Mike Cao
585706cc16 Fix css issue. 2020-08-13 00:29:07 -07:00
Mike Cao
000f84df96 Rename website column. Table component. 2020-08-06 23:02:24 -07:00
Mike Cao
e17c9da3d5 Added device collection. 2020-08-06 19:14:44 -07:00
Mike Cao
0f0b1e39e7 Build cli using rollup. 2020-07-24 17:00:56 -07:00
Mike Cao
cb0c912c5b Switch to json web tokens. 2020-07-22 20:45:09 -07:00
Mike Cao
d8c8df2955 Updated schema. 2020-07-18 23:54:25 -07:00
Mike Cao
58a1c63407 Added CORS middleware. Updated umami script. 2020-07-18 10:36:46 -07:00
Mike Cao
c681441601 Add indexes to tables. 2020-07-17 19:33:40 -07:00
Mike Cao
f7f0c05e12 Initial commit. 2020-07-17 01:03:38 -07:00