uw-ssec

lumen-ai

Master AI-powered natural language data exploration with Lumen AI. Use this skill when building conversational data analysis interfaces, enabling natural language queries to databases, creating custom AI agents for domain-specific analytics, implementing RAG with document context, or deploying self-service analytics with LLM-generated SQL and visualizations.


Install

npx skillscat add uw-ssec/rse-plugins/lumen-ai

Install via the SkillsCat registry.

SKILL.md

Lumen AI Skill

Overview

Lumen AI is an open-source, agent-based framework for conversational data exploration. Users ask questions in plain English and receive visualizations, SQL queries, and insights automatically generated by large language models.

Key Features

  • Natural Language Interface: Ask questions in plain English
  • Multi-LLM Support: OpenAI, Anthropic, Google, Mistral, local models
  • Agent Architecture: Specialized agents for SQL, charts, analyses
  • Extensible: Custom agents, tools, and analyses
  • Privacy-Focused: Full local deployment option

When to Use Lumen AI

| Feature   | Lumen AI           | Lumen Dashboards |
|-----------|--------------------|------------------|
| Interface | Conversational     | Declarative YAML |
| Use Case  | Ad-hoc exploration | Fixed dashboards |
| Users     | Non-technical      | Developers       |

Use Lumen AI when: users need ad-hoc exploration, questions vary unpredictably, or you want to enable self-service analytics.

Use Lumen Dashboards when: the dashboard structure is fixed or you want to avoid LLM costs.

Quick Start

Installation

pip install "lumen[ai]"  # quoted so shells such as zsh do not expand the brackets
pip install openai  # or anthropic for Claude

Launch Built-in Interface

export OPENAI_API_KEY="sk-..."
lumen-ai serve data/sales.csv
# Or with database
lumen-ai serve "postgresql://user:pass@localhost/mydb"

Python API

import lumen.ai as lmai
import panel as pn
from lumen.sources.duckdb import DuckDBSource

pn.extension()

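# Configure the LLM by passing a provider instance to the UI.
# Class and keyword names follow the Lumen AI docs; verify them against your installed version.
llm = lmai.llm.AnthropicAI(
    model_kwargs={"default": {"model": "claude-3-5-sonnet-20241022"}}
)

# Load data
source = DuckDBSource(tables=["./data/sales.csv"])

# Create UI
ui = lmai.ExplorerUI(data=source, llm=llm, title="Sales Analytics AI")
ui.servable()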

Example Queries

  • "What tables are available?"
  • "Show me total sales by region"
  • "Create a scatter plot of price vs quantity"
  • "What were the top 10 products last month?"

Core Concepts

1. Agents

Specialized components that handle specific tasks:

  • TableListAgent: Shows available tables and schemas
  • ChatAgent: General conversation and summaries
  • SQLAgent: Generates and executes SQL queries
  • hvPlotAgent: Creates interactive visualizations
  • VegaLiteAgent: Publication-quality charts
  • AnalysisAgent: Custom domain-specific analyses

See: Built-in Agents Reference
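
To make the idea of purpose-driven agents concrete, the following toy router picks an agent by counting the words a question shares with each agent's purpose text. It is only an illustration: Lumen AI's coordinator makes this choice with the LLM rather than by keyword matching, and the `ToyAgent` class, the `route` function, and the purpose strings are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyAgent:
    """Hypothetical stand-in for an agent: a name plus a purpose description."""
    name: str
    purpose: str


# Purpose strings are invented for this sketch, loosely based on the list above.
AGENTS = [
    ToyAgent("TableListAgent", "list available tables and schemas"),
    ToyAgent("SQLAgent", "generate and run sql queries total sum count top by"),
    ToyAgent("hvPlotAgent", "create interactive plot chart scatter line bar"),
    ToyAgent("ChatAgent", "general conversation explain summarize"),
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def route(question: str, agents: list[ToyAgent] = AGENTS) -> ToyAgent:
    """Return the agent whose purpose shares the most words with the question.

    On a tie the earlier agent wins; with no overlap at all, the last
    agent in the list (ChatAgent here) is used as the fallback.
    """
    words = tokenize(question)
    best, best_score = agents[-1], 0
    for agent in agents:
        score = len(words & tokenize(agent.purpose))
        if score > best_score:
            best, best_score = agent, score
    return best
```

For example, `route("What tables are available?").name` selects the table-listing agent, while a greeting with no matching words falls through to the chat agent.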

2. LLM Providers

| Use Case       | Provider  | Model             |
|----------------|-----------|-------------------|
| Production     | OpenAI    | gpt-4o            |
| Complex SQL    | Anthropic | claude-3-5-sonnet |
| High volume    | OpenAI    | gpt-4o-mini       |
| Sensitive data | Ollama    | llama3.1          |

See: LLM Provider Configuration
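
The recommendations above can be kept in one place as a small lookup, so that application code asks for a use case rather than hardcoding a model name. This is a minimal sketch: the `RECOMMENDED` mapping and the `recommend` helper are hypothetical names, and the values are copied from the table rather than from any Lumen AI API.

```python
# (provider, model) pairs copied from the table above; the keys are
# hypothetical labels chosen for this sketch.
RECOMMENDED = {
    "production": ("OpenAI", "gpt-4o"),
    "complex_sql": ("Anthropic", "claude-3-5-sonnet"),
    "high_volume": ("OpenAI", "gpt-4o-mini"),
    "sensitive_data": ("Ollama", "llama3.1"),
}


def recommend(use_case: str) -> tuple[str, str]:
    """Map a use case such as "Complex SQL" to a (provider, model) pair."""
    key = use_case.strip().lower().replace(" ", "_")
    if key not in RECOMMENDED:
        raise ValueError(
            f"Unknown use case {use_case!r}; expected one of {sorted(RECOMMENDED)}"
        )
    return RECOMMENDED[key]
```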

3. Memory and Tools

Agents share a memory system for context persistence. Extend capabilities with tools:

  • DocumentLookup: RAG for document context
  • TableLookup: Schema and metadata access
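
The retrieval step behind a document-lookup tool can be sketched without any dependencies. The example below ranks text chunks by word overlap with the question and prepends the best match to the prompt. It is a simplified stand-in: real RAG setups typically use embeddings and a vector store, and `lookup`, `build_prompt`, and the sample chunks are hypothetical, not part of Lumen AI.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into words (underscores are kept)."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def lookup(question: str, chunks: list[str], top_k: int = 1) -> list[str]:
    """Return the top_k chunks that share the most words with the question."""
    words = tokenize(question)
    ranked = sorted(chunks, key=lambda c: len(words & tokenize(c)), reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, chunks: list[str]) -> str:
    """Prepend the retrieved context to the question before it goes to the LLM."""
    context = "\n".join(lookup(question, chunks))
    return f"Context:\n{context}\n\nQuestion: {question}"


# Invented data-dictionary snippets, used only to exercise the functions.
CHUNKS = [
    "net_rev is revenue after refunds and discounts",
    "region codes follow ISO 3166",
]
```

Calling `build_prompt("What does net_rev mean?", CHUNKS)` yields a prompt whose context section contains the `net_rev` definition, which is the kind of grounding a data dictionary provides for SQL generation.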

Common Patterns

Pattern 1: Basic Analytics Interface

import lumen.ai as lmai
from lumen.sources.duckdb import DuckDBSource

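# Pass the provider instance to the UI (names per the Lumen AI docs; verify for your version)
llm = lmai.llm.OpenAI(model_kwargs={"default": {"model": "gpt-4o"}})

source = DuckDBSource(tables=["sales.csv"])
ui = lmai.ExplorerUI(data=source, llm=llm, title="Business Analytics")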
ui.servable()

Pattern 2: With Document Context (RAG)

source = DuckDBSource(
    tables=["sales.csv", "products.parquet"],
    documents=["./docs/data_dictionary.pdf", "./docs/business_rules.md"]
)

ui = lmai.ExplorerUI(data=source, tools=[lmai.tools.DocumentLookup])

Pattern 3: Custom Agent

import lumen.ai as lmai
import param
from lumen.ai.agents import Agent

class SentimentAgent(Agent):
    """Analyze sentiment in text data."""
    requires = param.List(default=["current_source"])
    provides = param.List(default=["sentiment_analysis"])

    purpose = """
    Analyzes sentiment in text columns.
    Keywords: sentiment, emotion, positive, negative, tone
    """

    async def respond(self, query: str):
        source = self.memory["current_source"]
        yield "Sentiment analysis results..."

ui = lmai.ExplorerUI(data=source, agents=[SentimentAgent, lmai.agents.ChatAgent])

See: Custom Agents Guide

Pattern 4: Custom Analysis

import lumen.ai as lmai
import param
from lumen.ai.analysis import Analysis  # module name per the Lumen source tree; verify for your version
from lumen.pipeline import Pipeline

class CohortAnalysis(Analysis):
    """Customer cohort retention analysis."""
    columns = param.List(default=['customer_id', 'signup_date', 'purchase_date'])

    def __call__(self, pipeline: Pipeline):
        df = pipeline.data
        results = df[self.columns]  # start from the required columns
        # ... calculate cohorts and store them in `results` ...
        return results

ui = lmai.ExplorerUI(
    data=source,
    agents=[lmai.agents.AnalysisAgent(analyses=[CohortAnalysis])]
)

Pattern 5: Multi-Source Data

source = DuckDBSource(
    tables={
        "sales": "./data/sales.parquet",
        "customers": "./data/customers.csv",
        "products": "https://data.company.com/products.csv"
    }
)
ui = lmai.ExplorerUI(data=source)

Configuration

Agent Selection

agents = [
    lmai.agents.TableListAgent,
    lmai.agents.SQLAgent,
    lmai.agents.hvPlotAgent,
]
ui = lmai.ExplorerUI(data=source, agents=agents)

Coordinator Types

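# DependencyResolver (default): Recursively resolves agent dependencies
# (coordinators are passed as classes here; verify against your installed version)
ui = lmai.ExplorerUI(data=source, coordinator=lmai.coordinator.DependencyResolver)

# Planner: Creates execution plan upfront
ui = lmai.ExplorerUI(data=source, coordinator=lmai.coordinator.Planner)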
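
The phrase "recursively resolves agent dependencies" can be illustrated with a small sketch built on the `requires` / `provides` idea from the custom agent pattern. Starting from the agent that answers the question, it walks back through whatever each agent requires until every input is available. The `SPECS` table, its memory keys, and the `resolve` function are hypothetical and are not Lumen AI's implementation. A planner, by contrast, would decide the full list of steps before running any of them.

```python
# Hypothetical agent specs: name -> (requires, provides). The memory keys
# ("tables", "sql_result", "plot") are invented for this sketch.
SPECS = {
    "TableListAgent": ((), ("tables",)),
    "SQLAgent": (("tables",), ("sql_result",)),
    "hvPlotAgent": (("sql_result",), ("plot",)),
}


def resolve(target: str, specs=SPECS, available=frozenset()) -> list[str]:
    """Return agent names in an order that lets `target` run last."""
    order: list[str] = []
    have = set(available)  # memory keys that already exist

    def visit(name: str, stack: tuple[str, ...]) -> None:
        if name in order:
            return
        if name in stack:
            chain = " -> ".join(stack + (name,))
            raise ValueError(f"Circular dependency: {chain}")
        requires, provides = specs[name]
        for key in requires:
            if key in have:
                continue
            # Pick the first agent that can provide the missing key.
            providers = [n for n, (_, p) in specs.items() if key in p]
            if not providers:
                raise ValueError(f"No agent provides {key!r}")
            visit(providers[0], stack + (name,))
        order.append(name)
        have.update(provides)

    visit(target, ())
    return order
```

With these specs, `resolve("hvPlotAgent")` schedules the table listing first, then the SQL query, then the plot. If `"tables"` is already in memory, `resolve("SQLAgent", available={"tables"})` schedules only the SQL agent.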

UI Customization

ui = lmai.ExplorerUI(
    data=source,
    title="Custom Analytics AI",
    accent_color="#00aa41",
    suggestions=["Show me revenue trends", "What are the top products?"]
)

Best Practices

1. Security

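import os
import lumen.ai as lmai

# Read the API key from the environment; never hardcode secrets.
# Passing api_key to the provider class follows the Lumen AI docs; verify for your version.
llm = lmai.llm.OpenAI(api_key=os.environ["OPENAI_API_KEY"])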
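
A fail-fast check at startup makes a missing key obvious before the first user query reaches the LLM. The sketch below uses only the standard library. `require_env` and `mask` are hypothetical helper names, not part of Lumen AI.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before starting the app")
    return value


def mask(secret: str, visible: int = 4) -> str:
    """Mask a secret for log output, keeping only the last few characters."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]
```

A typical use is `api_key = require_env("OPENAI_API_KEY")` at the top of `app.py`, logging `mask(api_key)` rather than the key itself.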

2. Performance

# Limit table sizes for exploration
source = DuckDBSource(
    tables=["large_table.parquet"],
    table_kwargs={"large_table": {"nrows": 100000}}
)
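
If the source cannot cap rows itself, one alternative is to pre-compute a smaller file and point the source at that. The sketch below does this for CSV input with the standard library only. `write_sample` is a hypothetical helper, and it keeps the first rows rather than a random sample, which is enough for schema exploration but not for statistics.

```python
import csv
from itertools import islice


def write_sample(src_path: str, dst_path: str, max_rows: int = 100_000) -> int:
    """Copy the header and at most max_rows data rows from src_path to dst_path.

    Returns the number of data rows written (the header is not counted).
    """
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        header = next(reader, None)
        if header is None:  # empty input file
            return 0
        writer.writerow(header)
        count = 0
        for row in islice(reader, max_rows):
            writer.writerow(row)
            count += 1
        return count
```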

3. User Experience

# Provide example queries
ui = lmai.ExplorerUI(
    data=source,
    suggestions=["Show me revenue trends", "Top 10 products by sales"]
)

Deployment

Development

panel serve app.py --autoreload --show

Production

panel serve app.py \
  --port 80 \
  --num-procs 4 \
  --allow-websocket-origin=analytics.company.com

Docker

FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
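COPY app.py ./
# Copy data/ into its own directory; copying it to ./ would flatten its contents into /app
COPY data/ ./data/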
CMD ["panel", "serve", "app.py", "--port", "5006", "--address", "0.0.0.0"]

See: Deployment Guide

Troubleshooting

LLM Not Responding

# Test API connection
curl https://api.openai.com/v1/models -H "Authorization: Bearer $OPENAI_API_KEY"
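
The same check can be run from Python, which is convenient inside the environment the app actually runs in. This is a minimal sketch using only the standard library against the same endpoint as the curl command. `build_request` and `check_connection` are hypothetical helper names.

```python
import json
import urllib.error
import urllib.request

MODELS_URL = "https://api.openai.com/v1/models"  # same endpoint as the curl command


def build_request(api_key: str, url: str = MODELS_URL) -> urllib.request.Request:
    """Build a GET request that carries the API key as a bearer token."""
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {api_key}"})


def check_connection(api_key: str, timeout: float = 10.0) -> tuple[bool, str]:
    """Return (ok, detail) without raising on HTTP or network errors."""
    try:
        with urllib.request.urlopen(build_request(api_key), timeout=timeout) as resp:
            models = json.load(resp).get("data", [])
            return True, f"OK: {len(models)} models visible"
    except urllib.error.HTTPError as exc:  # e.g. 401 for a bad key
        return False, f"HTTP {exc.code}: {exc.reason}"
    except (urllib.error.URLError, TimeoutError) as exc:  # DNS, proxy, timeout
        return False, f"Network error: {exc}"
```

An HTTP 401 points at the key, while a network error points at connectivity or proxy settings.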

Agent Not Selected

# Debug which agent was selected
print(ui.agent_manager.last_selected_agent)

# View agent purposes
for agent in ui.agents:
    print(f"{agent.__class__.__name__}: {agent.purpose}")

SQL Generation Errors

  • Add data dictionary as document for context
  • Provide example queries in agent prompts
  • Check table schemas match query expectations
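
The last check can be automated in a rough way: create empty tables with the expected columns and ask a SQL engine to compile the generated query against them. The sketch below uses the standard library's SQLite as a stand-in, so it only catches missing tables or columns in portable SQL and says nothing about DuckDB-specific syntax. `validate_sql` is a hypothetical helper.

```python
import sqlite3
from typing import Optional


def validate_sql(query: str, schema: dict[str, list[str]]) -> Optional[str]:
    """Return None if the query compiles against the schema, else the error text."""
    conn = sqlite3.connect(":memory:")
    try:
        # Create empty, untyped tables that mirror the expected schema.
        for table, columns in schema.items():
            cols = ", ".join(f'"{c}"' for c in columns)
            conn.execute(f'CREATE TABLE "{table}" ({cols})')
        # EXPLAIN compiles the statement without running it against data.
        conn.execute(f"EXPLAIN {query}")
        return None
    except sqlite3.Error as exc:
        return str(exc)
    finally:
        conn.close()
```

With a schema of `{"sales": ["region", "amount"]}`, a query that selects `revenue` from `sales` returns an error naming the missing column, while a query over `region` and `amount` returns `None`.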

Summary

Lumen AI lets users explore data by asking questions in natural language, with LLM-backed agents generating the SQL, visualizations, and summaries.

Strengths: No SQL required for users, flexible LLM support, extensible architecture, privacy-focused options.

Ideal for: Ad-hoc exploration, non-technical users, rapid insights, self-service analytics.

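Consider alternatives when: the dashboard structure is fixed or LLM costs are not acceptable (see the Lumen Dashboards comparison above).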
