No description

Python 78.8%
TypeScript 16.8%
CSS 3.7%
Dockerfile 0.5%
HTML 0.2%

Find a file

Griffin b37e291a44 user		2026-06-12 09:52:49 +08:00
.github/agents	0612 refactoring backend	2026-06-12 01:49:10 +08:00
.vscode	version 0610	2026-06-10 10:12:20 +08:00
backend	user	2026-06-12 09:52:49 +08:00
deployment	0612 refactoring backend	2026-06-12 01:49:10 +08:00
frontend	user	2026-06-12 09:52:49 +08:00
.env	0612 refactoring backend	2026-06-12 01:49:10 +08:00
.env.example	0612 refactoring backend	2026-06-12 01:49:10 +08:00
.gitignore	gitignore	2026-06-12 02:17:58 +08:00
.notes.swp	version 0610	2026-06-10 10:12:20 +08:00
CallItADay.png	Prototype	2026-05-28 11:27:41 +08:00
docker-compose.debug.yml	version 3 with trace/semantic splitting/cache/ configable model with better retrieval	2026-05-28 11:27:43 +08:00
docker-compose.full.yml	0612 refactoring backend	2026-06-12 01:49:10 +08:00
docker-compose.langfuse.yml	0612 refactoring backend	2026-06-12 01:49:10 +08:00
docker-compose.observability.yml	0612 refactoring backend	2026-06-12 01:49:10 +08:00
docker-compose.yml	0612 refactoring backend	2026-06-12 01:49:10 +08:00
LICENSE	Initial commit	2026-05-25 18:57:44 +08:00
README.md	0612 refactoring backend	2026-06-12 01:49:10 +08:00

README.md

CallItADay

CallItADay is a diary and chat application built around model-agnostic agents, structured diary storage, hybrid retrieval, and visible tool traces.

The application lets a user write diary entries, search them through dense and sparse retrieval, and chat with an assistant that can call explicit skills for chat history, diary memory, and soul document management.

Current Capabilities

Diary writing with PostgreSQL as the structured source of truth.
Semantic splitting for diary chunks before indexing.
Milvus vector storage for dense embedding recall.
Elasticsearch sparse index for BM25, metadata, and time filtering.
Hybrid retrieval with RRF fusion and CrossEncoder reranking.
LangGraph-based chat workflow with a first-layer intent/tool loop and second-layer response generation.
Three model configurations managed from the UI and persisted to DB:
- intent_recognition
- tool_enrichment
- response_generation
Runtime chat settings managed from the UI, including first-layer max iterations.
Tool trace visualization in the frontend.
Soul documents persisted in DB with changelog support:
- diary-soul.md
- user-soul.md
- soul_system_prompt.md

Architecture

React + Vite frontend
        |
        v
FastAPI backend
        |
        +-- LangGraph chat workflow
        |      |
        |      +-- chat-manager skill
        |      +-- diary-manager skill
        |      +-- soul-manager skill
        |
        +-- Model client
        |      |
        |      +-- OpenAI-compatible HTTP endpoint
        |      +-- User-configured provider/model/base URL/API key
        |
        +-- PostgreSQL
        |      |
        |      +-- chat messages
        |      +-- diary records
        |      +-- diary chunk mapping
        |      +-- model/runtime configs
        |      +-- soul docs and changelogs
        |      +-- local tool trace events
        |
        +-- Milvus
        |      |
        |      +-- dense diary chunk embeddings
        |
        +-- Elasticsearch
               |
               +-- BM25 text
               +-- metadata
               +-- time filters

The backend is provider agnostic at the application layer. It expects a chat model endpoint compatible with the OpenAI chat API shape, but that endpoint can be backed by any provider or gateway that implements the protocol.

The implementation is organized in a cleaner agentic architecture:

Tool layer: atomic operations for chat, diary, and soul document access.
Skill layer: grouped domain capabilities that expose tool definitions, validation, and utility behavior.
Agent layer: LangGraph workflows for deterministic orchestration, reasoning, and tool selection.
Observability layer: Prometheus metrics, OpenTelemetry spans, and LangSmith/Langfuse-friendly trace hooks.

Observability

Prometheus metrics exposed at /metrics
Optional OpenTelemetry tracing with OTLP exporter support
Grafana dashboards can be wired to the Prometheus data source
Jaeger/Tempo support via the OpenTelemetry Collector

Diary Ingestion

Diary ingestion uses this order:

raw diary
  -> semantic splitting
  -> document + chunk metadata extraction
  -> local embedding
  -> PostgreSQL diary/chunk rows
  -> Milvus dense vectors
  -> Elasticsearch sparse/metadata docs

Semantic splitting happens before metadata extraction so the metadata model can produce both document-level fields and chunk-level fields. This makes sparse retrieval and metadata filtering more precise than assigning one coarse metadata object to every chunk.

The semantic splitter uses LangChain's SemanticChunker when available, with a recursive character splitter fallback for local resilience.

Retrieval

Diary search uses a staged retrieval pipeline:

query
  -> query understanding
  -> Milvus dense recall
  -> Elasticsearch BM25 sparse recall
  -> RRF fusion
  -> optional LambdaMART stage
  -> optional ColBERT stage
  -> CrossEncoder rerank

LambdaMART and ColBERT are pluggable stages. CrossEncoder reranking is the first concrete reranker enabled by default.

Chat Workflow

The chat workflow is implemented with LangGraph.

Layer 1 performs intent recognition and information enrichment. It receives:

the layer-1 prompt
length-bounded chat history
diary-soul.md
skill schemas
current iteration and max iteration count

Layer 1 can call:

chat-manager
- query_chat_messages
- count_chat_messages
diary-manager
- search_diaries
- add_diary
soul-manager
- read_soul_docs
- apply_soul_change

If the user request only modifies soul documents and the modification is handled, the workflow can end after layer 1. Otherwise layer 2 generates the final user-facing response.

Model And Runtime Configuration

The UI exposes model configuration for:

intent recognition
tool enrichment
response generation

Configurations are stored in PostgreSQL and loaded into an in-memory registry at backend startup. Updates through the API write to DB and immediately refresh the in-memory registry, so each chat request does not need to reload config from DB.

The runtime config includes chat settings such as:

layer1_max_iterations
history_max_chars
message_max_chars
tool_trace_enabled

Tool Trace

Tool trace is stored locally in PostgreSQL and shown in the frontend after each chat response.

Trace events include:

graph node
layer
event type
tool name
input JSON
output JSON
latency

LangSmith can also be enabled through environment variables for deeper LangGraph/LangChain tracing.

Tech Stack

Area	Technology
Frontend	React, Vite, TypeScript
Backend	FastAPI, SQLAlchemy
Agent workflow	LangGraph
Structured DB	PostgreSQL
Vector DB	Milvus
Sparse search	Elasticsearch
Embedding/rerank	sentence-transformers
Model protocol	OpenAI-compatible chat API
Local trace	PostgreSQL
Optional trace	LangSmith

Quick Start

Prerequisites

Docker and Docker Compose
A model endpoint compatible with the OpenAI chat API shape

Configure

Create or edit .env:

MODEL_BASE_URL=https://your-model-gateway.example.com/v1
MODEL_API_KEY=your-api-key
DEFAULT_LLM_MODEL=your-default-chat-model

EMBEDDING_MODEL=BAAI/bge-small-zh-v1.5
CROSS_ENCODER_MODEL=cross-encoder/ms-marco-MiniLM-L6-v2

LANGSMITH_TRACING=false
LANGSMITH_PROJECT=call-it-a-day
LANGSMITH_API_KEY=
LANGFUSE_ENABLED=false
LANGFUSE_PUBLIC_KEY=
LANGFUSE_SECRET_KEY=
LANGFUSE_HOST=https://api.langfuse.com  # or set LANGFUSE_BASE_URL for local/self-hosted installations
# For the local LangFuse Docker override use LANGFUSE_HOST=http://langfuse-web:3000
LANGCHAIN_VERBOSE=false

You can also change the three chat model configs from the application UI after startup.

If the backend is run with LangChain-compatible callbacks, set LANGCHAIN_VERBOSE=true to print AI trace events to stdout for local debugging. Disable it in production to avoid noisy logs.

Start

docker compose up --build

Optional: local self-hosted LangFuse

To run the repo with a local LangFuse stack, start the main services plus the LangFuse override:

docker compose -f docker-compose.yml -f docker-compose.langfuse.yml up --build

Or use the new full-stack compose file that includes the repo services, local LangFuse, and observability together:

docker compose -f docker-compose.full.yml up --build

If you prefer the override style, you can also run all three files together:

docker compose -f docker-compose.yml -f docker-compose.langfuse.yml -f docker-compose.observability.yml up --build

When using the local LangFuse stack:

Frontend: http://localhost:3000
Backend: http://localhost:8080
LangFuse UI: http://localhost:3002
Grafana: http://localhost:3001
ClickHouse: http://localhost:8123
LangFuse API inside Docker: http://langfuse-web:3000

The backend will automatically target the local LangFuse service via LANGFUSE_HOST=http://langfuse-web:3000.

This compose stack uses explicit platform: linux/amd64 settings for the LangFuse and ClickHouse services, which helps compatibility on Apple Silicon / macOS M1.

For observability, start the backend with Prometheus, Grafana, and OpenTelemetry only if you are not using docker-compose.full.yml:

docker compose -f docker-compose.yml -f docker-compose.observability.yml up --build

Open:

Frontend: http://localhost:3000
Backend API: http://localhost:8080
API docs: http://localhost:8080/docs
Elasticsearch / Kibana: http://localhost:5601
Milvus Insight: http://localhost:8081

Local Development

Debug startup (VS Code + debugpy)

Option A: debug the backend in Docker

If you want to debug the backend in Docker, start the stack with the debug override file:

docker compose -f docker-compose.yml -f docker-compose.debug.yml up --build

Then in VS Code, open the Run and Debug view and choose Python Attach: backend debugpy to attach to port 5678.

After the backend is attached, open:

Frontend: http://localhost:3000
Backend API: http://localhost:8080
Debugger endpoint: http://localhost:5678

Option B: debug the backend locally (no Docker)

If you prefer to run the backend directly on your machine and still debug it in VS Code:

cd backend
conda run -n callItADay python -m pip install -r requirements.txt
cd ..
docker compose up -d postgres milvus elasticsearch

Then open the Run and Debug panel and choose Python Launch: FastAPI (local). This launches the backend with uvicorn --reload directly from the workspace, so you can set breakpoints without starting the Docker backend container.

After the local backend starts, open:

Backend API: http://localhost:8080/docs
Frontend: http://localhost:5173 (started separately with cd frontend && npm run dev)

Backend

cd backend
conda run -n callItADay python -m pip install -r requirements.txt
conda run -n callItADay uvicorn main:app --reload --port 8080

Frontend:

cd frontend
npm install
npm run dev

Verification:

conda run -n callItADay python -m compileall backend
cd frontend && npm run build

Important Notes

The application no longer assumes a specific cloud provider.
The model client uses OpenAI-compatible request semantics as a transport protocol.
Milvus and Elasticsearch are started locally by Docker Compose.
The generated local folders such as frontend/node_modules, frontend/dist, and Python __pycache__ may be kept to speed up repeated local runs.

License

MIT