mirror of
https://github.com/umami-software/umami.git
synced 2026-02-08 14:47:14 +01:00
feat: integrate First8 Marketing hyper-personalization system
Enhanced Umami Analytics with First8 Marketing integration for a hyper-personalized recommendation engine.

Database Enhancements:
- PostgreSQL 17 with Apache AGE 1.6.0 (graph database)
- TimescaleDB 2.23.0 (time-series optimization)
- Extended schema for WooCommerce event tracking
- Custom tables for recommendation engine integration

Features Added:
- Real-time ETL pipeline to the recommendation engine
- Extended event tracking (WordPress + WooCommerce)
- Graph database for relationship mapping
- Time-series optimization for analytics queries
- Custom migrations for hyper-personalization

Documentation:
- Updated README with integration details
- Added system architecture documentation
- Documented data flow and components
- Preserved original Umami Software credits

Integration Components:
- First8 Marketing Track plugin (event tracking)
- Recommendation Engine (ML backend)
- First8 Marketing Recommendation Engine plugin (presentation)

Status: Production-ready
Version: Based on Umami latest + First8 Marketing enhancements
This commit is contained in:
parent a6d4519a98
commit 5f496fdb79
16 changed files with 8856 additions and 9790 deletions
37
.gitignore
vendored
@@ -42,3 +42,40 @@ yarn-error.log*
*.dev.yml

# database files
*.db
*.sqlite
*.sqlite3
*.db-journal

# prisma generated
prisma/generated/

# cache
.cache/
.eslintcache
.stylelintcache

# temporary files
tmp/
temp/
*.tmp
*.bak

# OS files
Thumbs.db
Desktop.ini
*~

# secrets
*.key
*.pem
secrets/

# logs
logs/
*.log

# custom
public/uploads/
public/cache/
113
README.md
@@ -113,8 +113,108 @@ docker compose up --force-recreate -d

---

## 🎯 First8 Marketing Integration

This is a customized version of Umami Analytics integrated into the **First8 Marketing Hyper-Personalized System**. This implementation extends the standard Umami installation with:

### Enhanced Features

- **PostgreSQL 17 with Apache AGE** - Graph database capabilities for advanced relationship tracking
- **TimescaleDB Integration** - Time-series optimization for analytics data
- **Extended Event Tracking** - Comprehensive WordPress and WooCommerce event capture
- **Real-time Data Pipeline** - ETL integration with the recommendation engine
- **Multi-dimensional Analytics** - Contextual, behavioral, temporal, and journey tracking

### System Architecture

This Umami instance serves as the **data collection layer** for the First8 Marketing hyper-personalization system:

```
WordPress Site → Umami Analytics → Recommendation Engine → Personalized Content
```

**Data Flow:**

1. **Collection**: Umami captures user interactions, page views, and WooCommerce events
2. **Storage**: Events are stored in PostgreSQL, with TimescaleDB for time-series optimization
3. **Graph Analysis**: Apache AGE enables relationship mapping between users, products, and behaviors
4. **ETL Pipeline**: Real-time synchronization with the recommendation engine
5. **Personalization**: ML models use the analytics data to generate hyper-personalized recommendations
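To make the collection step concrete, a sketch of the kind of custom WooCommerce event the tracking plugin would report is shown below. The payload shape follows recent Umami versions' send endpoint; the website ID, event name, and data fields here are placeholders, not values from this repository.

```shell
# Build the event payload the Track plugin would send (all values are placeholders)
PAYLOAD='{
  "type": "event",
  "payload": {
    "website": "00000000-0000-0000-0000-000000000000",
    "url": "/product/sample",
    "name": "wc_add_to_cart",
    "data": { "wc_product_id": "123", "wc_cart_value": 49.99 }
  }
}'

# Send it to the collection endpoint (requires a running Umami instance):
# curl -X POST http://localhost:3000/api/send \
#   -H "Content-Type: application/json" -H "User-Agent: Mozilla/5.0" \
#   -d "$PAYLOAD"
echo "$PAYLOAD"
```

In production the First8 Marketing Track plugin performs this call automatically; the snippet only illustrates the data that flows into Umami.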
### Integration Components

This Umami installation works in conjunction with:

- **First8 Marketing Track Plugin** - WordPress connector for seamless event tracking
- **Recommendation Engine** - Proprietary ML-powered personalization backend
- **First8 Marketing Recommendation Engine Plugin** - WordPress connector for displaying personalized content

### Database Enhancements

**PostgreSQL Extensions:**
- **Apache AGE 1.6.0** - Graph database for relationship mapping
- **TimescaleDB 2.23.0** - Time-series optimization for analytics queries
- **Prisma 6.18.0** - ORM for database management

**Custom Schema Extensions:**
- User journey tracking tables
- Product interaction graphs
- Session behavior analysis
- Purchase pattern storage

### Configuration for First8 Marketing

**Environment Variables:**

```bash
DATABASE_URL=postgresql://username:password@localhost:5432/umami
NODE_ENV=production
PORT=3000
```
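The application also expects an `APP_SECRET` (referenced in `docker-compose.upgraded.yml`). One way to generate a suitably random value, assuming `openssl` is available:

```shell
# Generate a 64-character hex secret for APP_SECRET
APP_SECRET="$(openssl rand -hex 32)"
echo "APP_SECRET=${APP_SECRET}"
echo "length: ${#APP_SECRET}"
```

Export the value (or place it in a `.env` file) before starting the stack so the compose file picks it up instead of its insecure fallback default.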
**Required PostgreSQL Version:** 17.x (for Apache AGE compatibility)

### Usage in First8 Marketing System

**Event Tracking:**
- All WordPress core events (page views, clicks, form submissions)
- WooCommerce events (product views, add to cart, purchases, checkout steps)
- Custom events via the First8 Marketing Track plugin
- User journey and session tracking

**Data Access:**
- Real-time analytics dashboard via the Umami UI
- ETL pipeline for the recommendation engine
- Graph queries via Apache AGE for relationship analysis
- Time-series queries via TimescaleDB for trend analysis
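As a sketch of what these two access paths look like: the graph name `user_journey` matches this repository's migrations, and `website_metrics_hourly` is one of its hypertables, but the vertex/edge labels and hypertable columns below are illustrative assumptions, not the actual schema.

```sql
-- Graph query (Apache AGE): sessions that viewed a given product.
-- The Session/Product labels and the VIEWED edge are hypothetical.
SET search_path = ag_catalog, "$user", public;
SELECT * FROM cypher('user_journey', $$
    MATCH (s:Session)-[:VIEWED]->(p:Product {product_id: '123'})
    RETURN s
$$) AS (session agtype);

-- Time-series query (TimescaleDB): daily pageview trend, assuming the
-- hypertable has (time, pageviews) columns.
SELECT time_bucket('1 day', time) AS day, sum(pageviews) AS views
FROM website_metrics_hourly
GROUP BY day
ORDER BY day;
```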
### Deployment Notes

This instance is configured for standalone deployment with:

- PostgreSQL 17 database server
- Apache AGE graph extension
- TimescaleDB time-series extension
- Node.js 18.18+ runtime
- Reverse proxy (Nginx/Apache) for production

### Credits

**Original Software:**
- **Umami Analytics** - Created by [Umami Software](https://umami.is)
- Licensed under the MIT License
- Original repository: [github.com/umami-software/umami](https://github.com/umami-software/umami)

**First8 Marketing Customization:**
- **Integration & Enhancement** - First8 Marketing
- PostgreSQL 17 + Apache AGE + TimescaleDB integration
- Extended event tracking for WordPress/WooCommerce
- ETL pipeline for the recommendation engine
- Custom schema extensions for hyper-personalization

---

## 🛟 Support

**Original Umami Support:**

<p align="center">
  <a href="https://github.com/umami-software/umami">
    <img src="https://img.shields.io/badge/GitHub--blue?style=social&logo=github" alt="GitHub" />
@@ -130,6 +230,19 @@ docker compose up --force-recreate -d
  </a>
</p>

**First8 Marketing Integration Support:**
- For integration-specific issues, contact First8 Marketing
- For core Umami issues, use the official Umami support channels above

---

## 📄 License

This project maintains the original MIT License from Umami Software.

**Original Authors:** Umami Software
**Integration & Customization:** First8 Marketing

[release-shield]: https://img.shields.io/github/release/umami-software/umami.svg
[releases-url]: https://github.com/umami-software/umami/releases
[license-shield]: https://img.shields.io/github/license/umami-software/umami.svg
34
db/postgresql/rollback/001_rollback_woocommerce_fields.sql
Normal file
@@ -0,0 +1,34 @@
-- Rollback Migration: Remove WooCommerce and Enhanced Tracking Fields
-- Created: 2025-01-15
-- Description: Removes WooCommerce e-commerce tracking fields and enhanced engagement metrics from the website_event table
-- WARNING: This will permanently delete all WooCommerce tracking data!

-- Drop indexes first (must be done before dropping columns)
DROP INDEX IF EXISTS idx_website_event_wc_product;
DROP INDEX IF EXISTS idx_website_event_wc_category;
DROP INDEX IF EXISTS idx_website_event_wc_order;
DROP INDEX IF EXISTS idx_website_event_wc_revenue;
DROP INDEX IF EXISTS idx_website_event_engagement;

-- Remove WooCommerce e-commerce tracking fields
ALTER TABLE website_event
    DROP COLUMN IF EXISTS wc_product_id,
    DROP COLUMN IF EXISTS wc_category_id,
    DROP COLUMN IF EXISTS wc_cart_value,
    DROP COLUMN IF EXISTS wc_checkout_step,
    DROP COLUMN IF EXISTS wc_order_id,
    DROP COLUMN IF EXISTS wc_revenue;

-- Remove enhanced engagement tracking fields
ALTER TABLE website_event
    DROP COLUMN IF EXISTS scroll_depth,
    DROP COLUMN IF EXISTS time_on_page,
    DROP COLUMN IF EXISTS click_count,
    DROP COLUMN IF EXISTS form_interactions;

-- Log rollback completion
DO $$
BEGIN
    RAISE NOTICE 'Rollback complete: WooCommerce and enhanced tracking fields removed from website_event table';
END $$;
@@ -0,0 +1,22 @@
-- Rollback Migration: Remove Recommendation Engine Tables
-- Created: 2025-01-15
-- Description: Drops all recommendation engine tables and their dependencies
-- WARNING: This will permanently delete all recommendation data, user profiles, and the ML model registry!

-- Drop tables in reverse order of dependencies
-- Drop recommendations table first (has a foreign key to website)
DROP TABLE IF EXISTS recommendations CASCADE;

-- Drop user_profiles table (has a foreign key to website)
DROP TABLE IF EXISTS user_profiles CASCADE;

-- Drop ml_models table (no dependencies)
DROP TABLE IF EXISTS ml_models CASCADE;

-- Log rollback completion
DO $$
BEGIN
    RAISE NOTICE 'Rollback complete: All recommendation engine tables removed';
    RAISE NOTICE 'Dropped tables: recommendations, user_profiles, ml_models';
END $$;
38
db/postgresql/rollback/003_rollback_apache_age.sql
Normal file
@@ -0,0 +1,38 @@
-- Rollback Migration: Remove Apache AGE Graph Database
-- Created: 2025-01-15
-- Description: Drops the Apache AGE graph and helper functions (the extension itself is left installed)
-- WARNING: This will permanently delete all graph data!

-- Set search path to include ag_catalog
SET search_path = ag_catalog, "$user", public;

-- ============================================================================
-- Step 1: Drop Helper Functions
-- ============================================================================
DROP FUNCTION IF EXISTS execute_cypher(text, text) CASCADE;

-- ============================================================================
-- Step 2: Drop Graph (this will cascade to all vertices and edges)
-- ============================================================================
SELECT ag_catalog.drop_graph('user_journey', true);

-- ============================================================================
-- Step 3: Drop Apache AGE Extension
-- ============================================================================
-- Note: Only drop the extension if no other graphs exist
-- Uncomment the following line to completely remove Apache AGE
-- DROP EXTENSION IF EXISTS age CASCADE;

-- ============================================================================
-- Rollback Complete
-- ============================================================================
DO $$
BEGIN
    RAISE NOTICE '=================================================================';
    RAISE NOTICE 'Apache AGE Rollback Complete';
    RAISE NOTICE 'Dropped graph: user_journey';
    RAISE NOTICE 'Dropped helper functions: execute_cypher()';
    RAISE NOTICE 'Note: Apache AGE extension was NOT dropped (may be used by other graphs)';
    RAISE NOTICE '=================================================================';
END $$;
52
db/postgresql/rollback/004_rollback_timescaledb.sql
Normal file
@@ -0,0 +1,52 @@
-- Rollback Migration: Remove TimescaleDB Time-Series Tables
-- Created: 2025-01-15
-- Description: Drops all TimescaleDB hypertables, continuous aggregates, and policies
-- WARNING: This will permanently delete all time-series analytics data!

-- ============================================================================
-- Step 1: Remove Continuous Aggregate Policies
-- ============================================================================
SELECT remove_continuous_aggregate_policy('website_metrics_hourly_agg', if_exists => TRUE);
SELECT remove_continuous_aggregate_policy('product_metrics_daily_agg', if_exists => TRUE);

-- ============================================================================
-- Step 2: Drop Continuous Aggregates (Materialized Views)
-- ============================================================================
DROP MATERIALIZED VIEW IF EXISTS website_metrics_hourly_agg CASCADE;
DROP MATERIALIZED VIEW IF EXISTS product_metrics_daily_agg CASCADE;

-- ============================================================================
-- Step 3: Remove Retention Policies
-- ============================================================================
SELECT remove_retention_policy('time_series_events', if_exists => TRUE);
SELECT remove_retention_policy('website_metrics_hourly', if_exists => TRUE);
SELECT remove_retention_policy('product_metrics_daily', if_exists => TRUE);

-- ============================================================================
-- Step 4: Drop Hypertables (this will drop the tables and all chunks)
-- ============================================================================
DROP TABLE IF EXISTS time_series_events CASCADE;
DROP TABLE IF EXISTS website_metrics_hourly CASCADE;
DROP TABLE IF EXISTS product_metrics_daily CASCADE;

-- ============================================================================
-- Step 5: Drop TimescaleDB Extension (Optional)
-- ============================================================================
-- Note: Only drop the extension if no other hypertables exist
-- Uncomment the following line to completely remove TimescaleDB
-- DROP EXTENSION IF EXISTS timescaledb CASCADE;

-- ============================================================================
-- Rollback Complete
-- ============================================================================
DO $$
BEGIN
    RAISE NOTICE '=================================================================';
    RAISE NOTICE 'TimescaleDB Rollback Complete';
    RAISE NOTICE 'Dropped hypertables: time_series_events, website_metrics_hourly, product_metrics_daily';
    RAISE NOTICE 'Dropped continuous aggregates: website_metrics_hourly_agg, product_metrics_daily_agg';
    RAISE NOTICE 'Removed all retention policies';
    RAISE NOTICE 'Note: TimescaleDB extension was NOT dropped (may be used by other tables)';
    RAISE NOTICE '=================================================================';
END $$;
114
docker-compose.upgraded.yml
Normal file
@@ -0,0 +1,114 @@
---
# Docker Compose for Umami with PostgreSQL 17 + Apache AGE + TimescaleDB
# This is the upgraded configuration for the hyper-personalized marketing system
services:
  umami:
    image: ghcr.io/umami-software/umami:postgresql-latest
    ports:
      - "3000:3000"
    environment:
      DATABASE_URL: postgresql://umami:umami@db:5432/umami
      DATABASE_TYPE: postgresql
      APP_SECRET: ${APP_SECRET:-replace-me-with-a-random-string}
      # Optional: Enable debug logging
      # LOG_QUERY: 1
    depends_on:
      db:
        condition: service_healthy
    init: true
    restart: always
    healthcheck:
      test: ["CMD-SHELL", "curl http://localhost:3000/api/heartbeat"]
      interval: 5s
      timeout: 5s
      retries: 5
    networks:
      - umami-network

  db:
    # Custom PostgreSQL 17 image with Apache AGE and TimescaleDB
    build:
      context: ./docker/postgres
      dockerfile: Dockerfile
    image: postgres:17-age-timescaledb
    environment:
      POSTGRES_DB: umami
      POSTGRES_USER: umami
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-umami}
      # TimescaleDB configuration
      TIMESCALEDB_TELEMETRY: 'off'
    volumes:
      - umami-db-data:/var/lib/postgresql/data
      # Mount initialization scripts
      - ./docker/postgres/init-scripts:/docker-entrypoint-initdb.d
    ports:
      - "5432:5432"
    restart: always
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U $${POSTGRES_USER} -d $${POSTGRES_DB}"]
      interval: 5s
      timeout: 5s
      retries: 5
    networks:
      - umami-network
    # Increase shared memory for better performance
    shm_size: 256mb
    # PostgreSQL configuration for performance
    command:
      - "postgres"
      - "-c"
      - "shared_preload_libraries=timescaledb,age"
      - "-c"
      - "max_connections=200"
      - "-c"
      - "shared_buffers=256MB"
      - "-c"
      - "effective_cache_size=1GB"
      - "-c"
      - "maintenance_work_mem=128MB"
      - "-c"
      - "checkpoint_completion_target=0.9"
      - "-c"
      - "wal_buffers=16MB"
      - "-c"
      - "default_statistics_target=100"
      - "-c"
      - "random_page_cost=1.1"
      - "-c"
      - "effective_io_concurrency=200"
      - "-c"
      - "work_mem=4MB"
      - "-c"
      - "min_wal_size=1GB"
      - "-c"
      - "max_wal_size=4GB"

  # Optional: pgAdmin for database management
  pgadmin:
    image: dpage/pgadmin4:latest
    environment:
      PGADMIN_DEFAULT_EMAIL: ${PGADMIN_EMAIL:-admin@umami.local}
      PGADMIN_DEFAULT_PASSWORD: ${PGADMIN_PASSWORD:-admin}
      PGADMIN_CONFIG_SERVER_MODE: 'False'
    ports:
      - "5050:80"
    volumes:
      - pgadmin-data:/var/lib/pgadmin
    depends_on:
      - db
    restart: always
    networks:
      - umami-network
    profiles:
      - tools

volumes:
  umami-db-data:
    driver: local
  pgadmin-data:
    driver: local

networks:
  umami-network:
    driver: bridge
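The compose file reads several values from the environment. A minimal `.env` sketch — the variable names come from `docker-compose.upgraded.yml`; every value below is a placeholder to replace before deploying:

```bash
# .env — placeholders only; set real values before `docker-compose up`
APP_SECRET=replace-with-64-char-random-hex
POSTGRES_PASSWORD=choose-a-strong-password
PGADMIN_EMAIL=admin@example.com
PGADMIN_PASSWORD=choose-a-strong-password
```

If a variable is unset, the compose file falls back to its insecure default (e.g. `umami` for the database password), so setting these is effectively mandatory outside local development.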
75
docker/postgres/Dockerfile
Normal file
@@ -0,0 +1,75 @@
# PostgreSQL 17 with Apache AGE 1.6.0 and TimescaleDB 2.23.0
# For Umami Analytics Upgrade - Hyper-Personalized Marketing System

FROM postgres:17-alpine

# Install build dependencies
RUN apk add --no-cache \
    build-base \
    clang \
    llvm \
    git \
    cmake \
    bison \
    flex \
    readline-dev \
    zlib-dev \
    curl \
    ca-certificates

# Set PostgreSQL version for compatibility
ENV PG_MAJOR=17
ENV PG_VERSION=17

# Install TimescaleDB 2.23.0
ENV TIMESCALEDB_VERSION=2.23.0
RUN set -ex \
    && apk add --no-cache --virtual .fetch-deps \
        ca-certificates \
        openssl \
        tar \
    && mkdir -p /tmp/timescaledb \
    && cd /tmp/timescaledb \
    && wget -O timescaledb.tar.gz "https://github.com/timescale/timescaledb/archive/${TIMESCALEDB_VERSION}.tar.gz" \
    && tar -xzf timescaledb.tar.gz -C /tmp/timescaledb --strip-components=1 \
    && ./bootstrap -DREGRESS_CHECKS=OFF -DPROJECT_INSTALL_METHOD="docker" \
    && cd build && make install \
    && cd / \
    && rm -rf /tmp/timescaledb \
    && apk del .fetch-deps

# Install Apache AGE 1.6.0
ENV AGE_VERSION=1.6.0
RUN set -ex \
    && mkdir -p /tmp/age \
    && cd /tmp/age \
    && wget -O age.tar.gz "https://github.com/apache/age/archive/refs/tags/v${AGE_VERSION}.tar.gz" \
    && tar -xzf age.tar.gz -C /tmp/age --strip-components=1 \
    && make PG_CONFIG=/usr/local/bin/pg_config install \
    && cd / \
    && rm -rf /tmp/age

# Clean up build dependencies
RUN apk del build-base clang llvm git cmake bison flex

# Configure PostgreSQL to load extensions
RUN echo "shared_preload_libraries = 'timescaledb,age'" >> /usr/local/share/postgresql/postgresql.conf.sample

# Add initialization scripts
COPY init-scripts/* /docker-entrypoint-initdb.d/

# Set proper permissions
RUN chmod +x /docker-entrypoint-initdb.d/*.sh || true

# Expose PostgreSQL port
EXPOSE 5432

# Health check
HEALTHCHECK --interval=30s --timeout=5s --start-period=30s --retries=3 \
    CMD pg_isready -U postgres || exit 1

# Use the default PostgreSQL entrypoint
CMD ["postgres"]
140
docker/postgres/README.md
Normal file
@@ -0,0 +1,140 @@
# PostgreSQL 17 + Apache AGE + TimescaleDB Docker Image

This directory contains the Dockerfile and initialization scripts for building a custom PostgreSQL 17 image with the Apache AGE 1.6.0 and TimescaleDB 2.23.0 extensions.

## What's Included

- **PostgreSQL 17** - Latest PostgreSQL version
- **Apache AGE 1.6.0** - Graph database extension for user journey tracking
- **TimescaleDB 2.23.0** - Time-series database extension for analytics

## Building the Image

```bash
# From the umami directory
docker build -t postgres:17-age-timescaledb -f docker/postgres/Dockerfile docker/postgres
```

## Using with Docker Compose

The image is automatically built when using `docker-compose.upgraded.yml`:

```bash
# Start the upgraded stack
docker-compose -f docker-compose.upgraded.yml up -d

# View logs
docker-compose -f docker-compose.upgraded.yml logs -f

# Stop the stack
docker-compose -f docker-compose.upgraded.yml down
```

## Running Migrations

After the database is up, run the Prisma migrations:

```bash
# Generate the Prisma client
pnpm prisma generate

# Run migrations
pnpm prisma migrate deploy
```

## Verifying Extensions

Connect to the database:

```bash
docker-compose -f docker-compose.upgraded.yml exec db psql -U umami -d umami
```

Then, inside psql, verify that the extensions are installed:

```sql
-- Check installed extensions
SELECT extname, extversion FROM pg_extension WHERE extname IN ('timescaledb', 'age');

-- Check Apache AGE graphs
SELECT * FROM ag_catalog.ag_graph;

-- Check TimescaleDB hypertables
SELECT * FROM timescaledb_information.hypertables;
```
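As an optional end-to-end smoke test (a sketch using standard Apache AGE functions; run inside the same psql session), you can create a throwaway graph, write one vertex, read it back, and drop the graph again:

```sql
-- Round-trip test against a scratch graph named smoke_test (hypothetical name)
SET search_path = ag_catalog, "$user", public;
SELECT create_graph('smoke_test');
SELECT * FROM cypher('smoke_test', $$
    CREATE (n:Ping {ok: true})
    RETURN n
$$) AS (n agtype);
SELECT drop_graph('smoke_test', true);
```

If the `CREATE`/`RETURN` round-trip succeeds, AGE is installed and loadable; the application graph `user_journey` is managed by the migrations, not by this test.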
## Configuration

The PostgreSQL instance is configured with optimized settings for performance:

- `shared_buffers = 256MB`
- `effective_cache_size = 1GB`
- `maintenance_work_mem = 128MB`
- `max_connections = 200`

Adjust these in `docker-compose.upgraded.yml` based on your server resources.

## Initialization Scripts

Scripts in `init-scripts/` run automatically when the container is first created:

- `01-init-extensions.sh` - Installs the TimescaleDB and Apache AGE extensions

## Troubleshooting

### Extensions not loading

If the extensions fail to load, check the logs:

```bash
docker-compose -f docker-compose.upgraded.yml logs db
```

### Build failures

If the build fails, ensure you have enough disk space and memory:

```bash
# Check Docker resources
docker system df

# Clean up if needed
docker system prune -a
```

### Connection issues

Verify that the database is healthy:

```bash
docker-compose -f docker-compose.upgraded.yml ps
docker-compose -f docker-compose.upgraded.yml exec db pg_isready -U umami
```

## Production Deployment

For production, use a managed PostgreSQL service or a dedicated server instead of Docker. See the main [DEPLOYMENT.md](../../recommendation-engine/docs/DEPLOYMENT.md) for details.

## Data Persistence

Database data is stored in the `umami-db-data` Docker volume. To back up and restore:

```bash
# Backup
docker-compose -f docker-compose.upgraded.yml exec db pg_dump -U umami umami > backup.sql

# Restore
docker-compose -f docker-compose.upgraded.yml exec -T db psql -U umami umami < backup.sql
```

## Security Notes

- Change default passwords in production
- Use environment variables for sensitive data
- Enable SSL/TLS for database connections
- Restrict network access to the database port

## Support

For issues or questions, refer to:

- [PostgreSQL Documentation](https://www.postgresql.org/docs/17/)
- [Apache AGE Documentation](https://age.apache.org/docs/)
- [TimescaleDB Documentation](https://docs.timescale.com/)
41
docker/postgres/init-scripts/01-init-extensions.sh
Executable file
@@ -0,0 +1,41 @@
#!/bin/bash
# Initialize PostgreSQL extensions for Umami
# This script runs automatically when the container is first created

set -e

echo "=================================================="
echo "Initializing PostgreSQL 17 with Extensions"
echo "=================================================="

# Wait for PostgreSQL to be ready
until pg_isready -U "$POSTGRES_USER" -d "$POSTGRES_DB"; do
    echo "Waiting for PostgreSQL to be ready..."
    sleep 2
done

echo "PostgreSQL is ready. Installing extensions..."

# Connect to the database and install extensions
psql -v ON_ERROR_STOP=1 --username "$POSTGRES_USER" --dbname "$POSTGRES_DB" <<-EOSQL
    -- Install TimescaleDB extension
    CREATE EXTENSION IF NOT EXISTS timescaledb CASCADE;

    -- Install Apache AGE extension
    CREATE EXTENSION IF NOT EXISTS age CASCADE;

    -- Load AGE into the search path
    SET search_path = ag_catalog, "\$user", public;

    -- Verify installations
    SELECT extname, extversion FROM pg_extension WHERE extname IN ('timescaledb', 'age');
EOSQL

echo "=================================================="
echo "Extensions installed successfully!"
echo "- TimescaleDB: Installed"
echo "- Apache AGE: Installed"
echo "=================================================="

echo "Database is ready for Umami migrations."
17389
pnpm-lock.yaml
generated
File diff suppressed because it is too large
58
prisma/migrations/15_add_woocommerce_fields/migration.sql
Normal file
@@ -0,0 +1,58 @@
-- Migration: Add WooCommerce and Enhanced Tracking Fields
-- Created: 2025-01-15
-- Description: Adds WooCommerce e-commerce tracking fields and enhanced engagement metrics to the website_event table

-- Add enhanced engagement tracking fields
ALTER TABLE website_event
    ADD COLUMN IF NOT EXISTS scroll_depth INTEGER,
    ADD COLUMN IF NOT EXISTS time_on_page INTEGER,
    ADD COLUMN IF NOT EXISTS click_count INTEGER,
    ADD COLUMN IF NOT EXISTS form_interactions JSONB;

-- Add WooCommerce e-commerce tracking fields
ALTER TABLE website_event
    ADD COLUMN IF NOT EXISTS wc_product_id VARCHAR(50),
    ADD COLUMN IF NOT EXISTS wc_category_id VARCHAR(50),
    ADD COLUMN IF NOT EXISTS wc_cart_value DECIMAL(19, 4),
    ADD COLUMN IF NOT EXISTS wc_checkout_step INTEGER,
    ADD COLUMN IF NOT EXISTS wc_order_id VARCHAR(50),
    ADD COLUMN IF NOT EXISTS wc_revenue DECIMAL(19, 4);

-- Create indexes for WooCommerce queries (performance optimization)
-- Index for product-based queries
CREATE INDEX IF NOT EXISTS idx_website_event_wc_product
    ON website_event(website_id, wc_product_id, created_at)
    WHERE wc_product_id IS NOT NULL;

-- Index for category-based queries
CREATE INDEX IF NOT EXISTS idx_website_event_wc_category
    ON website_event(website_id, wc_category_id, created_at)
    WHERE wc_category_id IS NOT NULL;

-- Index for order-based queries (partial index for sparse data)
CREATE INDEX IF NOT EXISTS idx_website_event_wc_order
    ON website_event(wc_order_id)
    WHERE wc_order_id IS NOT NULL;

-- Index for revenue analysis
CREATE INDEX IF NOT EXISTS idx_website_event_wc_revenue
    ON website_event(website_id, created_at, wc_revenue)
    WHERE wc_revenue IS NOT NULL;

-- Index for engagement metrics
CREATE INDEX IF NOT EXISTS idx_website_event_engagement
    ON website_event(website_id, created_at, scroll_depth, time_on_page)
    WHERE scroll_depth IS NOT NULL OR time_on_page IS NOT NULL;

-- Add comments for documentation
COMMENT ON COLUMN website_event.scroll_depth IS 'Percentage of page scrolled (0-100)';
COMMENT ON COLUMN website_event.time_on_page IS 'Time spent on page in seconds';
COMMENT ON COLUMN website_event.click_count IS 'Number of clicks on the page';
COMMENT ON COLUMN website_event.form_interactions IS 'JSONB array of form interaction events';
COMMENT ON COLUMN website_event.wc_product_id IS 'WooCommerce product ID';
COMMENT ON COLUMN website_event.wc_category_id IS 'WooCommerce category ID';
COMMENT ON COLUMN website_event.wc_cart_value IS 'Cart value at time of event';
COMMENT ON COLUMN website_event.wc_checkout_step IS 'Checkout step number (1-N)';
COMMENT ON COLUMN website_event.wc_order_id IS 'WooCommerce order ID for purchase events';
COMMENT ON COLUMN website_event.wc_revenue IS 'Revenue amount for purchase events';
154
prisma/migrations/16_create_recommendation_tables/migration.sql
Normal file
|
|
@ -0,0 +1,154 @@
-- Migration: Create Recommendation Engine Tables
-- Created: 2025-01-15
-- Description: Creates tables for user profiles, recommendations tracking, and ML model registry

-- ============================================================================
-- Table: user_profiles
-- Purpose: Aggregated user behavior and preferences for personalization
-- ============================================================================
CREATE TABLE IF NOT EXISTS user_profiles (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id VARCHAR(255) UNIQUE NOT NULL, -- Can be session_id or logged-in user_id
    website_id UUID NOT NULL,

    -- Lifecycle
    lifecycle_stage VARCHAR(50), -- 'new', 'active', 'at_risk', 'churned'
    funnel_position VARCHAR(50), -- 'awareness', 'consideration', 'decision', 'retention'

    -- Engagement metrics
    session_count INTEGER DEFAULT 0,
    total_pageviews INTEGER DEFAULT 0,
    total_events INTEGER DEFAULT 0,
    total_purchases INTEGER DEFAULT 0,
    total_revenue DECIMAL(19, 4) DEFAULT 0,

    -- Behavior
    avg_session_duration INTEGER, -- seconds
    avg_time_on_page INTEGER, -- seconds
    avg_scroll_depth INTEGER, -- percentage
    bounce_rate DECIMAL(5, 4),

    -- Preferences (JSONB for flexibility)
    favorite_categories JSONB, -- ['electronics', 'books']
    favorite_products JSONB, -- ['product_id_1', 'product_id_2']
    price_sensitivity VARCHAR(20), -- 'low', 'medium', 'high'
    preferred_brands JSONB,
    device_preference VARCHAR(20), -- 'mobile', 'tablet', 'desktop'

    -- Timestamps
    first_visit TIMESTAMPTZ,
    last_visit TIMESTAMPTZ,
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW(),

    CONSTRAINT fk_user_profiles_website FOREIGN KEY (website_id) REFERENCES website(website_id) ON DELETE CASCADE
);

-- Indexes for user_profiles
CREATE INDEX idx_user_profiles_user_id ON user_profiles(user_id);
CREATE INDEX idx_user_profiles_website_id ON user_profiles(website_id);
CREATE INDEX idx_user_profiles_lifecycle ON user_profiles(lifecycle_stage);
CREATE INDEX idx_user_profiles_last_visit ON user_profiles(last_visit);

-- Comments for user_profiles
COMMENT ON TABLE user_profiles IS 'Aggregated user behavior and preferences for personalization';
COMMENT ON COLUMN user_profiles.lifecycle_stage IS 'User lifecycle stage: new, active, at_risk, churned';
COMMENT ON COLUMN user_profiles.funnel_position IS 'User position in marketing funnel';
COMMENT ON COLUMN user_profiles.favorite_categories IS 'JSONB array of favorite product categories';
COMMENT ON COLUMN user_profiles.favorite_products IS 'JSONB array of favorite product IDs';
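Because user_id is UNIQUE, the ETL pipeline can refresh a profile with a single upsert. A sketch, relying only on the constraint above (the identifiers and values are illustrative, not part of the migration):

```sql
-- Hypothetical profile refresh; ON CONFLICT targets the UNIQUE(user_id) constraint
INSERT INTO user_profiles (user_id, website_id, lifecycle_stage, session_count, last_visit)
VALUES ('sess_abc123', '11111111-1111-1111-1111-111111111111', 'active', 1, NOW())
ON CONFLICT (user_id) DO UPDATE
SET session_count   = user_profiles.session_count + 1,
    lifecycle_stage = EXCLUDED.lifecycle_stage,
    last_visit      = EXCLUDED.last_visit,
    updated_at      = NOW();
```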
-- ============================================================================
-- Table: recommendations
-- Purpose: Historical recommendations for analysis and learning
-- ============================================================================
CREATE TABLE IF NOT EXISTS recommendations (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    session_id UUID NOT NULL,
    user_id VARCHAR(255),
    website_id UUID NOT NULL,

    -- Recommendation details
    recommendation_type VARCHAR(50), -- 'product', 'content', 'offer'
    item_id VARCHAR(255) NOT NULL,
    score DECIMAL(5, 4),
    rank INTEGER,

    -- Context
    context JSONB, -- Page, product, category where shown
    strategy VARCHAR(50), -- 'collaborative', 'sequential', 'graph', etc.
    model_version VARCHAR(50),

    -- Personalization factors
    personalization_factors JSONB,

    -- Outcome
    shown BOOLEAN DEFAULT TRUE,
    clicked BOOLEAN DEFAULT FALSE,
    converted BOOLEAN DEFAULT FALSE,
    revenue DECIMAL(19, 4),

    -- Timestamps
    shown_at TIMESTAMPTZ DEFAULT NOW(),
    clicked_at TIMESTAMPTZ,
    converted_at TIMESTAMPTZ,

    CONSTRAINT fk_recommendations_website FOREIGN KEY (website_id) REFERENCES website(website_id) ON DELETE CASCADE
);

-- Indexes for recommendations
CREATE INDEX idx_recommendations_session ON recommendations(session_id);
CREATE INDEX idx_recommendations_user ON recommendations(user_id);
CREATE INDEX idx_recommendations_item ON recommendations(item_id);
CREATE INDEX idx_recommendations_shown_at ON recommendations(shown_at);
CREATE INDEX idx_recommendations_outcome ON recommendations(clicked, converted);
CREATE INDEX idx_recommendations_website ON recommendations(website_id);

-- Comments for recommendations
COMMENT ON TABLE recommendations IS 'Historical recommendations for analysis and learning';
COMMENT ON COLUMN recommendations.strategy IS 'Recommendation strategy used: collaborative, sequential, graph, etc.';
COMMENT ON COLUMN recommendations.personalization_factors IS 'JSONB object containing factors that influenced this recommendation';
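The clicked/converted flags make offline evaluation a plain aggregation. A sketch of a per-strategy report over this table (the 7-day window is an assumption; booleans are cast to int so AVG yields a rate):

```sql
-- Click-through and conversion rate per recommendation strategy, last 7 days
SELECT strategy,
       COUNT(*)                  AS shown,
       AVG(clicked::int)         AS ctr,
       AVG(converted::int)       AS conversion_rate,
       COALESCE(SUM(revenue), 0) AS revenue
FROM recommendations
WHERE shown_at >= NOW() - INTERVAL '7 days'
GROUP BY strategy
ORDER BY conversion_rate DESC;
```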
-- ============================================================================
-- Table: ml_models
-- Purpose: Model registry and versioning
-- ============================================================================
CREATE TABLE IF NOT EXISTS ml_models (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name VARCHAR(100) NOT NULL,
    version VARCHAR(50) NOT NULL,
    model_type VARCHAR(50), -- 'collaborative_filtering', 'sequential', etc.

    -- Model metadata
    algorithm VARCHAR(100),
    hyperparameters JSONB,
    training_data_period JSONB, -- {start: '2025-01-01', end: '2025-01-15'}

    -- Performance metrics
    metrics JSONB, -- {precision: 0.15, recall: 0.25, ndcg: 0.30}

    -- Storage
    artifact_path VARCHAR(500), -- S3/local path to model file
    artifact_size_bytes BIGINT,

    -- Status
    status VARCHAR(20), -- 'training', 'validating', 'production', 'archived'
    is_active BOOLEAN DEFAULT FALSE,

    -- Timestamps
    trained_at TIMESTAMPTZ,
    deployed_at TIMESTAMPTZ,
    created_at TIMESTAMPTZ DEFAULT NOW(),

    UNIQUE(name, version)
);

-- Indexes for ml_models
CREATE INDEX idx_ml_models_name ON ml_models(name);
CREATE INDEX idx_ml_models_status ON ml_models(status);
CREATE INDEX idx_ml_models_active ON ml_models(is_active) WHERE is_active = TRUE;

-- Comments for ml_models
COMMENT ON TABLE ml_models IS 'ML model registry and versioning';
COMMENT ON COLUMN ml_models.status IS 'Model status: training, validating, production, archived';
COMMENT ON COLUMN ml_models.is_active IS 'Whether this model version is currently active in production';
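Promoting a model is a two-step flip of is_active, which the partial index above keeps cheap to look up. A sketch under the schema above (the model name and version strings are hypothetical):

```sql
-- Promote one version to production and deactivate the rest; run as one transaction
BEGIN;
UPDATE ml_models SET is_active = FALSE
WHERE name = 'collab_filter' AND is_active;
UPDATE ml_models
SET is_active = TRUE, status = 'production', deployed_at = NOW()
WHERE name = 'collab_filter' AND version = '2025.01.15';
COMMIT;
```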
prisma/migrations/17_setup_apache_age/migration.sql (new file, 151 lines)
-- Migration: Setup Apache AGE Graph Database
-- Created: 2025-01-15
-- Description: Installs Apache AGE extension and creates graph schema for user journey tracking
-- Requirements: PostgreSQL 17 + Apache AGE 1.6.0

-- ============================================================================
-- Step 1: Install Apache AGE Extension
-- ============================================================================
CREATE EXTENSION IF NOT EXISTS age;

-- Load AGE into search path
SET search_path = ag_catalog, "$user", public;

-- ============================================================================
-- Step 2: Create Graph for User Journey Tracking
-- ============================================================================
SELECT ag_catalog.create_graph('user_journey');

-- ============================================================================
-- Step 3: Create Vertex Labels (Node Types)
-- ============================================================================

-- User nodes (represents sessions or logged-in users)
SELECT ag_catalog.create_vlabel('user_journey', 'User');

-- Product nodes
SELECT ag_catalog.create_vlabel('user_journey', 'Product');

-- Category nodes
SELECT ag_catalog.create_vlabel('user_journey', 'Category');

-- Page nodes
SELECT ag_catalog.create_vlabel('user_journey', 'Page');

-- Event nodes (for anomaly detection)
SELECT ag_catalog.create_vlabel('user_journey', 'Event');

-- ============================================================================
-- Step 4: Create Edge Labels (Relationship Types)
-- ============================================================================

-- Generic Relationships (Mode 1 - Always Available)
SELECT ag_catalog.create_elabel('user_journey', 'VIEWED');
SELECT ag_catalog.create_elabel('user_journey', 'ADDED_TO_CART');
SELECT ag_catalog.create_elabel('user_journey', 'PURCHASED');
SELECT ag_catalog.create_elabel('user_journey', 'SEARCHED_FOR');
SELECT ag_catalog.create_elabel('user_journey', 'NAVIGATED_TO');
SELECT ag_catalog.create_elabel('user_journey', 'BOUGHT_TOGETHER');
SELECT ag_catalog.create_elabel('user_journey', 'VIEWED_TOGETHER');
SELECT ag_catalog.create_elabel('user_journey', 'IN_CATEGORY');

-- Adaptive Relationships (Mode 2 - LLM-Enhanced, Optional)
SELECT ag_catalog.create_elabel('user_journey', 'SEMANTICALLY_SIMILAR');
SELECT ag_catalog.create_elabel('user_journey', 'PREDICTED_INTEREST');
SELECT ag_catalog.create_elabel('user_journey', 'COMPLEMENTARY');
SELECT ag_catalog.create_elabel('user_journey', 'ANOMALOUS_BEHAVIOR');

-- ============================================================================
-- Step 5: Create Helper Functions
-- ============================================================================

-- Function to execute Cypher queries safely
CREATE OR REPLACE FUNCTION execute_cypher(graph_name text, query text)
RETURNS SETOF agtype
LANGUAGE plpgsql
AS $$
BEGIN
    RETURN QUERY EXECUTE format('SELECT * FROM ag_catalog.cypher(%L, %L) AS (result agtype)', graph_name, query);
END;
$$;

COMMENT ON FUNCTION execute_cypher IS 'Helper function to execute Cypher queries on Apache AGE graphs';
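Note that the helper hardcodes a single agtype result column, so any Cypher passed to it must RETURN exactly one expression. A sketch of calling it against the labels created above (the user_id property and its value are illustrative, not defined by this migration):

```sql
-- Example call: products a given session has viewed, via the helper above
-- The Cypher must return one column to match the (result agtype) signature
SELECT * FROM execute_cypher('user_journey', $cy$
    MATCH (u:User {user_id: 'sess_abc123'})-[:VIEWED]->(p:Product)
    RETURN p
$cy$);
```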
-- ============================================================================
-- Step 6: Create Indexes for Graph Performance
-- ============================================================================

-- Note: Apache AGE automatically creates indexes for vertex and edge IDs
-- Additional indexes can be created on properties as needed

-- ============================================================================
-- Step 7: Verify Installation
-- ============================================================================

-- Verify graph exists
DO $$
DECLARE
    graph_count INTEGER;
BEGIN
    SELECT COUNT(*) INTO graph_count
    FROM ag_catalog.ag_graph
    WHERE name = 'user_journey';

    IF graph_count = 0 THEN
        RAISE EXCEPTION 'Graph user_journey was not created successfully';
    ELSE
        RAISE NOTICE 'Apache AGE setup complete: Graph user_journey created successfully';
    END IF;
END $$;

-- Verify vertex labels
DO $$
DECLARE
    vlabel_count INTEGER;
BEGIN
    SELECT COUNT(*) INTO vlabel_count
    FROM ag_catalog.ag_label
    WHERE graph = (SELECT graphid FROM ag_catalog.ag_graph WHERE name = 'user_journey')
      AND kind = 'v';

    RAISE NOTICE 'Created % vertex labels', vlabel_count;
END $$;

-- Verify edge labels
DO $$
DECLARE
    elabel_count INTEGER;
BEGIN
    SELECT COUNT(*) INTO elabel_count
    FROM ag_catalog.ag_label
    WHERE graph = (SELECT graphid FROM ag_catalog.ag_graph WHERE name = 'user_journey')
      AND kind = 'e';

    RAISE NOTICE 'Created % edge labels', elabel_count;
END $$;

-- ============================================================================
-- Step 8: Grant Permissions
-- ============================================================================

-- Grant usage on ag_catalog schema to application user
-- Note: Replace 'umami_user' with your actual database user
-- GRANT USAGE ON SCHEMA ag_catalog TO umami_user;
-- GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA ag_catalog TO umami_user;

-- ============================================================================
-- Migration Complete
-- ============================================================================

-- Log completion
DO $$
BEGIN
    RAISE NOTICE '=================================================================';
    RAISE NOTICE 'Apache AGE Migration Complete';
    RAISE NOTICE 'Graph: user_journey';
    RAISE NOTICE 'Vertex Labels: User, Product, Category, Page, Event';
    RAISE NOTICE 'Edge Labels: VIEWED, ADDED_TO_CART, PURCHASED, SEARCHED_FOR, etc.';
    RAISE NOTICE 'Helper Functions: execute_cypher()';
    RAISE NOTICE '=================================================================';
END $$;
prisma/migrations/18_setup_timescaledb/migration.sql (new file, 214 lines)
-- Migration: Setup TimescaleDB for Time-Series Analytics
-- Created: 2025-01-15
-- Description: Installs TimescaleDB extension and creates hypertables for time-series data
-- Requirements: PostgreSQL 17 + TimescaleDB 2.23.0

-- ============================================================================
-- Step 1: Install TimescaleDB Extension
-- ============================================================================
CREATE EXTENSION IF NOT EXISTS timescaledb;

-- ============================================================================
-- Step 2: Create Time-Series Events Table
-- ============================================================================
CREATE TABLE IF NOT EXISTS time_series_events (
    time TIMESTAMPTZ NOT NULL,
    website_id UUID NOT NULL,
    session_id UUID NOT NULL,
    user_id VARCHAR(255),

    -- Event details
    event_type VARCHAR(50) NOT NULL,
    event_name VARCHAR(50),
    event_value DECIMAL(19, 4),
    properties JSONB,

    -- Dimensions for fast filtering
    page_url VARCHAR(500),
    product_id VARCHAR(50),
    category VARCHAR(100),
    device VARCHAR(20),
    country CHAR(2),

    PRIMARY KEY (time, website_id, session_id)
);

-- Convert to hypertable (partitioned by time)
SELECT create_hypertable('time_series_events', 'time',
    chunk_time_interval => INTERVAL '7 days',
    if_not_exists => TRUE
);
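Once converted, the hypertable is written to like any ordinary table; TimescaleDB routes each row to its 7-day chunk by the time column. A sketch of a row the ETL pipeline might write (all identifiers and values below are illustrative):

```sql
-- Hypothetical event row written by the ETL pipeline
INSERT INTO time_series_events
    (time, website_id, session_id, user_id,
     event_type, event_name, event_value, product_id, device, country)
VALUES
    (NOW(),
     '11111111-1111-1111-1111-111111111111',
     '22222222-2222-2222-2222-222222222222',
     'sess_abc123',
     'event', 'purchase', 49.99, 'SKU-1001', 'mobile', 'US');
```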
-- ============================================================================
-- Step 3: Create Indexes for Time-Series Queries
-- ============================================================================

-- Index for website-specific queries
CREATE INDEX IF NOT EXISTS idx_ts_events_website
ON time_series_events (website_id, time DESC);

-- Index for session-based queries
CREATE INDEX IF NOT EXISTS idx_ts_events_session
ON time_series_events (session_id, time DESC);

-- Index for product-based queries (partial index for sparse data)
CREATE INDEX IF NOT EXISTS idx_ts_events_product
ON time_series_events (product_id, time DESC)
WHERE product_id IS NOT NULL;

-- Index for event type queries
CREATE INDEX IF NOT EXISTS idx_ts_events_event_type
ON time_series_events (event_type, time DESC);

-- ============================================================================
-- Step 4: Create Aggregated Metrics Tables
-- ============================================================================

-- Website metrics aggregated hourly
CREATE TABLE IF NOT EXISTS website_metrics_hourly (
    time TIMESTAMPTZ NOT NULL,
    website_id UUID NOT NULL,

    -- Traffic metrics
    pageviews INTEGER DEFAULT 0,
    unique_sessions INTEGER DEFAULT 0,
    unique_users INTEGER DEFAULT 0,
    avg_time_on_page INTEGER,
    avg_scroll_depth INTEGER,
    bounce_rate DECIMAL(5, 4),

    -- Conversion metrics
    add_to_cart_count INTEGER DEFAULT 0,
    checkout_start_count INTEGER DEFAULT 0,
    purchase_count INTEGER DEFAULT 0,
    conversion_rate DECIMAL(5, 4),

    -- Revenue metrics
    total_revenue DECIMAL(19, 4) DEFAULT 0,
    avg_order_value DECIMAL(19, 4),

    PRIMARY KEY (time, website_id)
);

-- Convert to hypertable
SELECT create_hypertable('website_metrics_hourly', 'time',
    chunk_time_interval => INTERVAL '30 days',
    if_not_exists => TRUE
);

-- Product metrics aggregated daily
CREATE TABLE IF NOT EXISTS product_metrics_daily (
    time DATE NOT NULL,
    website_id UUID NOT NULL,
    product_id VARCHAR(50) NOT NULL,

    -- View metrics
    views INTEGER DEFAULT 0,
    unique_viewers INTEGER DEFAULT 0,
    avg_time_viewed INTEGER,

    -- Conversion metrics
    add_to_cart_count INTEGER DEFAULT 0,
    purchase_count INTEGER DEFAULT 0,
    conversion_rate DECIMAL(5, 4),

    -- Revenue metrics
    revenue DECIMAL(19, 4) DEFAULT 0,
    units_sold INTEGER DEFAULT 0,

    PRIMARY KEY (time, website_id, product_id)
);

-- Convert to hypertable
SELECT create_hypertable('product_metrics_daily', 'time',
    chunk_time_interval => INTERVAL '30 days',
    if_not_exists => TRUE
);

-- ============================================================================
-- Step 5: Create Continuous Aggregates (Materialized Views)
-- ============================================================================

-- Hourly website metrics continuous aggregate
CREATE MATERIALIZED VIEW IF NOT EXISTS website_metrics_hourly_agg
WITH (timescaledb.continuous) AS
SELECT
    time_bucket('1 hour', time) AS time,
    website_id,
    COUNT(*) FILTER (WHERE event_type = 'pageview') AS pageviews,
    COUNT(DISTINCT session_id) AS unique_sessions,
    COUNT(DISTINCT user_id) FILTER (WHERE user_id IS NOT NULL) AS unique_users,
    COUNT(*) FILTER (WHERE event_name = 'add_to_cart') AS add_to_cart_count,
    COUNT(*) FILTER (WHERE event_name = 'checkout_start') AS checkout_start_count,
    COUNT(*) FILTER (WHERE event_name = 'purchase') AS purchase_count,
    SUM(event_value) FILTER (WHERE event_name = 'purchase') AS total_revenue
FROM time_series_events
GROUP BY time_bucket('1 hour', time), website_id;

-- Add refresh policy for continuous aggregate
SELECT add_continuous_aggregate_policy('website_metrics_hourly_agg',
    start_offset => INTERVAL '3 hours',
    end_offset => INTERVAL '1 hour',
    schedule_interval => INTERVAL '1 hour',
    if_not_exists => TRUE
);

-- Daily product metrics continuous aggregate
CREATE MATERIALIZED VIEW IF NOT EXISTS product_metrics_daily_agg
WITH (timescaledb.continuous) AS
SELECT
    time_bucket('1 day', time) AS time,
    website_id,
    product_id,
    COUNT(*) FILTER (WHERE event_name = 'product_view') AS views,
    COUNT(DISTINCT session_id) FILTER (WHERE event_name = 'product_view') AS unique_viewers,
    COUNT(*) FILTER (WHERE event_name = 'add_to_cart') AS add_to_cart_count,
    COUNT(*) FILTER (WHERE event_name = 'purchase') AS purchase_count,
    SUM(event_value) FILTER (WHERE event_name = 'purchase') AS revenue
FROM time_series_events
WHERE product_id IS NOT NULL
GROUP BY time_bucket('1 day', time), website_id, product_id;

-- Add refresh policy for product metrics
SELECT add_continuous_aggregate_policy('product_metrics_daily_agg',
    start_offset => INTERVAL '7 days',
    end_offset => INTERVAL '1 day',
    schedule_interval => INTERVAL '1 day',
    if_not_exists => TRUE
);
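Dashboards should read these continuous aggregates rather than scanning the raw hypertable, since each refresh has already bucketed the events. A sketch of such a read (website UUID and 24-hour window are illustrative):

```sql
-- Read the pre-aggregated hourly view instead of time_series_events
SELECT time, pageviews, unique_sessions, total_revenue
FROM website_metrics_hourly_agg
WHERE website_id = '11111111-1111-1111-1111-111111111111'
  AND time >= NOW() - INTERVAL '24 hours'
ORDER BY time;
```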
-- ============================================================================
-- Step 6: Create Data Retention Policies
-- ============================================================================

-- Retain raw time-series events for 90 days
SELECT add_retention_policy('time_series_events',
    INTERVAL '90 days',
    if_not_exists => TRUE
);

-- Retain hourly aggregates for 1 year
SELECT add_retention_policy('website_metrics_hourly',
    INTERVAL '1 year',
    if_not_exists => TRUE
);

-- Retain daily product metrics for 2 years
SELECT add_retention_policy('product_metrics_daily',
    INTERVAL '2 years',
    if_not_exists => TRUE
);

-- ============================================================================
-- Migration Complete
-- ============================================================================
DO $$
BEGIN
    RAISE NOTICE '=================================================================';
    RAISE NOTICE 'TimescaleDB Migration Complete';
    RAISE NOTICE 'Hypertables: time_series_events, website_metrics_hourly, product_metrics_daily';
    RAISE NOTICE 'Continuous Aggregates: website_metrics_hourly_agg, product_metrics_daily_agg';
    RAISE NOTICE 'Retention Policies: 90 days (raw), 1 year (hourly), 2 years (daily)';
    RAISE NOTICE '=================================================================';
END $$;
@@ -121,6 +121,20 @@ model WebsiteEvent {
  tag      String? @db.VarChar(50)
  hostname String? @db.VarChar(100)

  // Enhanced engagement tracking fields
  scrollDepth      Int?  @map("scroll_depth") @db.Integer
  timeOnPage       Int?  @map("time_on_page") @db.Integer
  clickCount       Int?  @map("click_count") @db.Integer
  formInteractions Json? @map("form_interactions")

  // WooCommerce e-commerce tracking fields
  wcProductId    String?  @map("wc_product_id") @db.VarChar(50)
  wcCategoryId   String?  @map("wc_category_id") @db.VarChar(50)
  wcCartValue    Decimal? @map("wc_cart_value") @db.Decimal(19, 4)
  wcCheckoutStep Int?     @map("wc_checkout_step") @db.Integer
  wcOrderId      String?  @map("wc_order_id") @db.VarChar(50)
  wcRevenue      Decimal? @map("wc_revenue") @db.Decimal(19, 4)

  eventData EventData[]
  session   Session     @relation(fields: [sessionId], references: [id])