CampusEvolve Analytics Dashboard Guide¶
- URL: https://analytics.campusevolve.ai
- Data Source: MongoDB Atlas (pathways-prod) → Azure Data Lake → Synapse
- Refresh Frequency: Hourly (ETL runs on the hour)
- Excludes: Tester accounts, archived accounts, and developer accounts
Dashboard 1: Daily Operations¶
High-level operational metrics for monitoring platform health and usage trends.
KPI Cards (Top Row)¶
Total Users¶
Count of all student profiles, excluding:
- Tester accounts (isTester: true)
- Archived accounts (isArchived: true)
- Developer accounts (identified by having a first_name field set — real students don't have this field)
Total Messages¶
Total message exchanges (student message + AI response pairs) across all categories: activity, profilechat, and question. Excludes messages from tester/archived accounts.
Avg Quality Score¶
Average quality score across all scored messages. Score is 0–12, calculated as the sum of four dimensions (each scored 0–3):
| Dimension | What it measures | How it's scored |
|---|---|---|
| Relevance | Does the response advance pathway exploration? | 3 = substantive response >200 chars; 2 = substantive ≤200 chars; 0 = deflection |
| Grounding | Does it reference real WA institutions, programs, or URLs? | Count of pattern matches (WA colleges, financial aid programs, career resources, URLs), capped at 3 |
| Actionability | Does it provide concrete next steps? | Count of action patterns (visit, apply, contact, step 1, deadlines, etc.), capped at 3 |
| Readability | Is it well-structured and appropriate length? | 3 = 50–250 words with formatting (bullets/lists); 2 = moderate length; 1 = too short (<20 words) or too long (>300 words) |
Important: Scores are computed automatically by regex pattern matching during the ETL process, not by human review. They are useful for identifying trends and outliers but should not be treated as definitive quality assessments.
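To make the rubric concrete, here is a minimal sketch of regex-based scoring for two of the four dimensions. The pattern lists below are illustrative placeholders; the production ETL's actual regex lists (WA colleges, aid programs, action phrases, deflection markers) are not published in this guide.

```python
import re

# Illustrative patterns only -- stand-ins for the ETL's real pattern lists.
ACTION_PATTERNS = [r"\bvisit\b", r"\bapply\b", r"\bcontact\b", r"\bstep 1\b", r"\bdeadline"]
DEFLECTION = re.compile(r"\b(i can'?t help|i cannot answer)\b", re.IGNORECASE)

def score_actionability(response: str) -> int:
    """Count distinct action patterns present, capped at 3 per the rubric."""
    hits = sum(bool(re.search(p, response, re.IGNORECASE)) for p in ACTION_PATTERNS)
    return min(hits, 3)

def score_relevance(response: str) -> int:
    """0 for a deflection; 3 for a substantive response >200 chars; 2 otherwise."""
    if DEFLECTION.search(response):
        return 0
    return 3 if len(response) > 200 else 2
```

Grounding and readability follow the same shape: pattern counting capped at 3, and word-count plus formatting checks, respectively. The total score is the sum of the four dimensions.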
Messages (24h)¶
Number of messages created in the last 24 hours (rolling window from the current UTC time).
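A rolling window anchored at the current UTC time means the card's value shifts continuously rather than resetting at midnight. A minimal membership check, assuming timezone-aware `createdAt` timestamps:

```python
from datetime import datetime, timedelta, timezone

def in_last_24h(created_at: datetime, now: datetime) -> bool:
    """True if a message falls in the rolling 24-hour window ending at `now`.

    Anchored at the current UTC time, not midnight, so the KPI card
    moves continuously instead of resetting once per day.
    """
    return now - timedelta(hours=24) < created_at <= now
```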
CARA Funnel (Cumulative)¶
Line chart showing cumulative student progression through the CARA Quality Advising Framework over time, grouped by profile signup date.
Funnel Stages:
| Stage | Definition | How it's tracked |
|---|---|---|
| Profiles | Account created | Profile exists in the system |
| Authenticated | Completed login or signup | Has a login or signup action |
| Disclosed | Accepted disclosure/terms | Has a disclosure action |
| Onboarded | Completed onboarding questions | Has an onboarding action |
| Has Pathway | AI generated a learning pathway | Has an entry in the paths collection |
| Active | Sent at least one message | Has an entry in the messages collection |
Cumulative logic: If a student reached a later stage, they are counted in all prior stages. For example, a student with an onboarding action but no disclosure action is counted as disclosed. This prevents the funnel from appearing non-monotonic due to 18 early users (Feb 3–10 Yakima sessions) who completed onboarding before the disclosure step was added to the flow.
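The cumulative counting rule can be sketched as follows (stage names here are shorthand for the action types in the table above):

```python
# Ordered CARA funnel stages, earliest to latest.
STAGES = ["profiles", "authenticated", "disclosed", "onboarded", "has_pathway", "active"]

def cumulative_stages(direct_hits: set[str]) -> set[str]:
    """Credit a student with every stage up to the deepest one they reached.

    A student with an onboarding action but no disclosure action is still
    counted as disclosed, keeping the funnel monotonically decreasing.
    """
    if not direct_hits:
        return set()
    deepest = max(STAGES.index(s) for s in direct_hits)
    return set(STAGES[: deepest + 1])
```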
Funnel Snapshot¶
Table showing current funnel totals with drop-off percentage at each stage.
Drop-off formula: (Previous Stage Count − Current Stage Count) / Previous Stage Count × 100
A 0% drop-off means no students were lost at that transition. The funnel should always be monotonically decreasing (each stage ≤ the previous stage).
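The drop-off formula translates directly to code (the zero-division guard for an empty stage is an added assumption, not part of the stated formula):

```python
def drop_off_pct(prev_count: int, curr_count: int) -> float:
    """Percent of students lost between two consecutive funnel stages."""
    if prev_count == 0:
        return 0.0  # guard: an empty previous stage has no one to lose
    return (prev_count - curr_count) / prev_count * 100
```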
Daily Message Volume by Category¶
Stacked bar chart showing daily message count broken down by category:
| Category | Color | Description | Typical % |
|---|---|---|---|
| activity | Green | Messages within pathway activities — career, education, financial, wellness explorations | ~91% |
| profilechat | Red | Onboarding conversations and profile-building chat | ~7% |
| question | Purple | Open Q&A questions outside of structured activities | ~1.5% |
Daily Active Users¶
Dual-axis line chart showing:
- Purple line: Number of unique users who sent at least one message that day
- Green line: Total message count for that day
Useful for distinguishing high activity from a few users vs. broad engagement across many users. Spikes in the green line without corresponding increases in the purple line indicate power users driving volume.
Dashboard 2: Quality & Engagement¶
AI response quality analysis and user engagement metrics.
Quality Score Trend (Daily)¶
Average quality score (0–12) per day across all scored messages. Shows quality improvement over time as prompts have been refined. Key context:
- Feb 3–10: Yakima co-creation sessions — lower scores due to initial prompt versions and limited RAG content
- Late Feb: Prompt engineering sprint improved scores significantly
- Mar onward: Stabilized at 9–10 range for activity messages
Quality Score Distribution¶
Histogram showing how many messages received each total score (0–12). A healthy distribution clusters toward the right (10–12). A bimodal distribution may indicate different quality levels across message categories (activity tends to score 9–12, profilechat tends to score 3–6).
Quality Dimensions by Category¶
Stacked bar chart comparing the four quality dimensions across message categories. Each segment represents the average score (0–3):
- Bottom (green): Relevance
- Next (orange): Grounding
- Next (blue): Actionability
- Top (purple): Readability
Why profilechat scores lower: Onboarding responses are conversational and don't typically reference specific institutions or provide concrete action steps, so grounding (~0.2) and actionability (~0.2) are inherently low. This is expected behavior, not a quality problem.
Message Categories¶
Pie chart showing the proportion of messages by category. Activity messages (pathway explorations) should dominate since that's the core product experience.
Response Lengths¶
Distribution of AI response lengths in characters:
| Bucket | Description |
|---|---|
| 0–99 | Very short — may indicate deflections, errors, or safety refusals |
| 100–299 | Brief responses |
| 300–499 | Moderate responses |
| 500–999 | Detailed responses |
| 1000–1999 | Comprehensive responses (the "sweet spot" for quality) |
| 2000+ | Very long responses — may affect readability |
The quality rubric scores readability highest for responses of 50–250 words (~300–1500 characters) with structured formatting.
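The bucketing above reduces to a simple threshold lookup; the boundaries come straight from the table:

```python
def length_bucket(n_chars: int) -> str:
    """Map an AI response length in characters to its dashboard bucket."""
    for upper, label in [(100, "0–99"), (300, "100–299"), (500, "300–499"),
                         (1000, "500–999"), (2000, "1000–1999")]:
        if n_chars < upper:
            return label
    return "2000+"
```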
Quality Scores by Message Category¶
Detailed table showing message count and average score for each quality dimension, broken down by category.
Benchmarks:
- Activity: Target avg_total ≥ 9.0
- Profilechat: Expected ~5.0 (inherently less grounded/actionable)
- Question: Small sample size — interpret with caution
Post-Secondary Directions¶
Horizontal bar chart showing the distribution of post-secondary directions selected by students during onboarding. "Not Set" indicates students who haven't completed onboarding or early users from before the direction question was added.
User Engagement¶
Per-user engagement table sorted by message count:
| Column | Description |
|---|---|
| User | Truncated profile ID (first 8 characters) |
| District | School district from onboarding (if provided) |
| Direction | Post-secondary direction(s) selected |
| Joined | Profile creation date |
| Msgs | Total message count |
| Activities | Number of distinct pathway activities touched |
| Avg Qual | Average quality score across the user's messages |
| Last Active | Date of most recent message |
Dashboard 3: Cohort Deep Dive¶
Detailed cohort analysis aligned with the CARA Quality Advising Framework. Mirrors the cohort_journey_analysis.ipynb notebook.
Funnel Drop-off¶
Bar chart showing absolute student counts at each CARA funnel stage. Visual height differences between bars indicate where students are lost. Uses the same cumulative logic as Dashboard 1.
Task Completion¶
Breakdown of learning path task statuses. Each pathway generates approximately 30 tasks (3 tasks per activity × ~10 activities). Tasks are concrete action items like "Research federal aid programs" or "Compare salary ranges."
| Status | Description |
|---|---|
| not_started | Task created but student hasn't engaged with it |
| completed | Student marked the task as done |
Context: The ~95% not_started rate is a known product issue. Students engage via chat conversations but rarely use the task checklist UI to mark items complete. This doesn't necessarily mean students aren't exploring the topics — they may be discussing the same content through chat without clicking the task checkboxes.
Engagement by Direction¶
Engagement metrics grouped by post-secondary direction:
| Column | Description |
|---|---|
| direction | Selected post-secondary pathway(s) |
| students | Number of students with this direction |
| msgs | Total messages from these students |
| per_student | Average messages per student (engagement depth) |
| avg_qual | Average quality score for these students' messages |
Useful for identifying which directions have the most engaged students and whether AI response quality varies by direction.
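The table's columns are a straightforward aggregation over per-message rows. A minimal sketch, with hypothetical sample data (the real pipeline joins messages to profiles in Synapse):

```python
from collections import defaultdict

# Hypothetical per-message rows: (student_id, direction, quality_score).
rows = [
    ("a", "4-year college", 10),
    ("a", "4-year college", 8),
    ("b", "4-year college", 9),
    ("c", "Apprenticeship", 6),
]

acc = defaultdict(lambda: {"students": set(), "msgs": 0, "qual_sum": 0})
for student, direction, qual in rows:
    acc[direction]["students"].add(student)
    acc[direction]["msgs"] += 1
    acc[direction]["qual_sum"] += qual

summary = {
    d: {
        "students": len(s["students"]),          # distinct students
        "msgs": s["msgs"],                       # total messages
        "per_student": round(s["msgs"] / len(s["students"]), 2),
        "avg_qual": round(s["qual_sum"] / s["msgs"], 2),
    }
    for d, s in acc.items()
}
```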
Quality by Message Category¶
Quality metrics by category with additional detail:
| Column | Description |
|---|---|
| Activities | Number of distinct activity IDs (0 for non-activity categories) |
| Messages | Total message count |
| Avg Quality | Mean total quality score (0–12) |
| Relevance / Grounding / Action / Read | Mean score per dimension (0–3 each) |
| Min / Max | Range of quality scores observed |
Top 20 Most Engaged Students¶
The 20 students with the most messages:
| Column | Description |
|---|---|
| Activity / QnA / PChat | Message breakdown by category |
| Avg Qual | Average quality score |
| First / Last | Date range of engagement |
Useful for identifying power users and understanding deep engagement patterns.
All Students¶
Complete student roster (113 rows) with all engagement metrics:
| Column | Description |
|---|---|
| Location Pref | Distance preference from the redesigned onboarding (e.g., "Stay close to home", "Within an hour or two", "Stay in Washington") |
| Activity / QnA / PChat | Message count by category |
| Activities | Number of distinct pathway activities touched |
| Avg Qual / Rel / Grd / Act | Quality scores — total and per-dimension |
| Last Active | Most recent message date |
Sorted by total message count descending.
Glossary¶
| Term | Definition |
|---|---|
| CARA | Quality Advising Framework used by CampusEvolve — 21 steps across 4 categories mapping to 8 post-secondary directions |
| Activity | A structured exploration within a learning pathway (e.g., Career Exploration, Financial Aid Planning) |
| Pathway | An AI-generated learning path with ~10 activities across career, education, financial, and wellness domains |
| RAG | Retrieval-Augmented Generation — the AI retrieves relevant WA State educational content before generating responses |
| Deflection | When the AI refuses or redirects a legitimate question instead of answering it |
| Grounding | References to real, verifiable WA State institutions, programs, URLs, or resources |
| Tester | Internal test accounts excluded from analytics (identified by isTester flag, isArchived flag, or first_name field) |
| Watermark | Timestamp tracking the last successful data extraction — ensures incremental syncs only pull new data |
Data Pipeline¶
MongoDB Atlas (pathways-prod)
↓ hourly ETL (VM cron job)
Azure Data Lake (JSONL → Parquet)
↓ Synapse serverless SQL
Metabase dashboards
- ETL runs: Every hour on the hour
- Latency: Up to 1 hour between a student interaction and dashboard update
- Quality scoring: Computed during ETL via regex pattern matching (not real-time)
- Tester exclusion: Applied at extraction time — tester data never enters the data lake
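The watermark pattern from the glossary can be sketched as follows. Here `docs` is an in-memory stand-in for a MongoDB collection, and the `createdAt` field name is an assumption; the real extractor would issue a filtered query such as `find({"createdAt": {"$gt": watermark}})` and advance the watermark only after the batch lands successfully, so a failed run simply re-pulls the same documents next hour.

```python
from datetime import datetime, timezone

def extract_since(docs, watermark):
    """Return documents newer than the watermark, plus the advanced watermark.

    Incremental sync: each hourly run pulls only documents created after the
    last successful extraction, then moves the watermark forward.
    """
    fresh = [d for d in docs if d["createdAt"] > watermark]
    new_watermark = max((d["createdAt"] for d in fresh), default=watermark)
    return fresh, new_watermark
```

Because the watermark never moves past data that was actually extracted, re-running a failed hour is idempotent rather than lossy.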