- Python 78.8%
- TypeScript 16.8%
- CSS 3.7%
- Dockerfile 0.5%
- HTML 0.2%
| .github/agents | ||
| .vscode | ||
| backend | ||
| deployment | ||
| frontend | ||
| .env | ||
| .env.example | ||
| .gitignore | ||
| .notes.swp | ||
| CallItADay.png | ||
| docker-compose.debug.yml | ||
| docker-compose.full.yml | ||
| docker-compose.langfuse.yml | ||
| docker-compose.observability.yml | ||
| docker-compose.yml | ||
| LICENSE | ||
| README.md | ||
CallItADay
CallItADay is a diary and chat application built around model-agnostic agents, structured diary storage, hybrid retrieval, and visible tool traces.
The application lets a user write diary entries, search them through dense and sparse retrieval, and chat with an assistant that can call explicit skills for chat history, diary memory, and soul document management.
Current Capabilities
- Diary writing with PostgreSQL as the structured source of truth.
- Semantic splitting for diary chunks before indexing.
- Milvus vector storage for dense embedding recall.
- Elasticsearch sparse index for BM25, metadata, and time filtering.
- Hybrid retrieval with RRF fusion and CrossEncoder reranking.
- LangGraph-based chat workflow with a first-layer intent/tool loop and second-layer response generation.
- Three model configurations managed from the UI and persisted to DB:
intent_recognitiontool_enrichmentresponse_generation
- Runtime chat settings managed from the UI, including first-layer max iterations.
- Tool trace visualization in the frontend.
- Soul documents persisted in DB with changelog support:
diary-soul.mduser-soul.mdsoul_system_prompt.md
Architecture
React + Vite frontend
|
v
FastAPI backend
|
+-- LangGraph chat workflow
| |
| +-- chat-manager skill
| +-- diary-manager skill
| +-- soul-manager skill
|
+-- Model client
| |
| +-- OpenAI-compatible HTTP endpoint
| +-- User-configured provider/model/base URL/API key
|
+-- PostgreSQL
| |
| +-- chat messages
| +-- diary records
| +-- diary chunk mapping
| +-- model/runtime configs
| +-- soul docs and changelogs
| +-- local tool trace events
|
+-- Milvus
| |
| +-- dense diary chunk embeddings
|
+-- Elasticsearch
|
+-- BM25 text
+-- metadata
+-- time filters
The backend is provider agnostic at the application layer. It expects a chat model endpoint compatible with the OpenAI chat API shape, but that endpoint can be backed by any provider or gateway that implements the protocol.
The implementation is organized in a cleaner agentic architecture:
- Tool layer: atomic operations for chat, diary, and soul document access.
- Skill layer: grouped domain capabilities that expose tool definitions, validation, and utility behavior.
- Agent layer: LangGraph workflows for deterministic orchestration, reasoning, and tool selection.
- Observability layer: Prometheus metrics, OpenTelemetry spans, and LangSmith/Langfuse-friendly trace hooks.
Observability
- Prometheus metrics exposed at
/metrics - Optional OpenTelemetry tracing with OTLP exporter support
- Grafana dashboards can be wired to the Prometheus data source
- Jaeger/Tempo support via the OpenTelemetry Collector
Diary Ingestion
Diary ingestion uses this order:
raw diary
-> semantic splitting
-> document + chunk metadata extraction
-> local embedding
-> PostgreSQL diary/chunk rows
-> Milvus dense vectors
-> Elasticsearch sparse/metadata docs
Semantic splitting happens before metadata extraction so the metadata model can produce both document-level fields and chunk-level fields. This makes sparse retrieval and metadata filtering more precise than assigning one coarse metadata object to every chunk.
The semantic splitter uses LangChain's SemanticChunker when available, with a recursive character splitter fallback for local resilience.
Retrieval
Diary search uses a staged retrieval pipeline:
query
-> query understanding
-> Milvus dense recall
-> Elasticsearch BM25 sparse recall
-> RRF fusion
-> optional LambdaMART stage
-> optional ColBERT stage
-> CrossEncoder rerank
LambdaMART and ColBERT are pluggable stages. CrossEncoder reranking is the first concrete reranker enabled by default.
Chat Workflow
The chat workflow is implemented with LangGraph.
Layer 1 performs intent recognition and information enrichment. It receives:
- the layer-1 prompt
- length-bounded chat history
diary-soul.md- skill schemas
- current iteration and max iteration count
Layer 1 can call:
chat-managerquery_chat_messagescount_chat_messages
diary-managersearch_diariesadd_diary
soul-managerread_soul_docsapply_soul_change
If the user request only modifies soul documents and the modification is handled, the workflow can end after layer 1. Otherwise layer 2 generates the final user-facing response.
Model And Runtime Configuration
The UI exposes model configuration for:
- intent recognition
- tool enrichment
- response generation
Configurations are stored in PostgreSQL and loaded into an in-memory registry at backend startup. Updates through the API write to DB and immediately refresh the in-memory registry, so each chat request does not need to reload config from DB.
The runtime config includes chat settings such as:
layer1_max_iterationshistory_max_charsmessage_max_charstool_trace_enabled
Tool Trace
Tool trace is stored locally in PostgreSQL and shown in the frontend after each chat response.
Trace events include:
- graph node
- layer
- event type
- tool name
- input JSON
- output JSON
- latency
LangSmith can also be enabled through environment variables for deeper LangGraph/LangChain tracing.
Tech Stack
| Area | Technology |
|---|---|
| Frontend | React, Vite, TypeScript |
| Backend | FastAPI, SQLAlchemy |
| Agent workflow | LangGraph |
| Structured DB | PostgreSQL |
| Vector DB | Milvus |
| Sparse search | Elasticsearch |
| Embedding/rerank | sentence-transformers |
| Model protocol | OpenAI-compatible chat API |
| Local trace | PostgreSQL |
| Optional trace | LangSmith |
Quick Start
Prerequisites
- Docker and Docker Compose
- A model endpoint compatible with the OpenAI chat API shape
Configure
Create or edit .env:
MODEL_BASE_URL=https://your-model-gateway.example.com/v1
MODEL_API_KEY=your-api-key
DEFAULT_LLM_MODEL=your-default-chat-model
EMBEDDING_MODEL=BAAI/bge-small-zh-v1.5
CROSS_ENCODER_MODEL=cross-encoder/ms-marco-MiniLM-L6-v2
LANGSMITH_TRACING=false
LANGSMITH_PROJECT=call-it-a-day
LANGSMITH_API_KEY=
LANGFUSE_ENABLED=false
LANGFUSE_PUBLIC_KEY=
LANGFUSE_SECRET_KEY=
LANGFUSE_HOST=https://api.langfuse.com # or set LANGFUSE_BASE_URL for local/self-hosted installations
# For the local LangFuse Docker override use LANGFUSE_HOST=http://langfuse-web:3000
LANGCHAIN_VERBOSE=false
You can also change the three chat model configs from the application UI after startup.
If the backend is run with LangChain-compatible callbacks, set LANGCHAIN_VERBOSE=true to print AI trace events to stdout for local debugging. Disable it in production to avoid noisy logs.
Start
docker compose up --build
Optional: local self-hosted LangFuse
To run the repo with a local LangFuse stack, start the main services plus the LangFuse override:
docker compose -f docker-compose.yml -f docker-compose.langfuse.yml up --build
Or use the new full-stack compose file that includes the repo services, local LangFuse, and observability together:
docker compose -f docker-compose.full.yml up --build
If you prefer the override style, you can also run all three files together:
docker compose -f docker-compose.yml -f docker-compose.langfuse.yml -f docker-compose.observability.yml up --build
When using the local LangFuse stack:
- Frontend: http://localhost:3000
- Backend: http://localhost:8080
- LangFuse UI: http://localhost:3002
- Grafana: http://localhost:3001
- ClickHouse: http://localhost:8123
- LangFuse API inside Docker: http://langfuse-web:3000
The backend will automatically target the local LangFuse service via LANGFUSE_HOST=http://langfuse-web:3000.
This compose stack uses explicit platform: linux/amd64 settings for the LangFuse and ClickHouse services, which helps compatibility on Apple Silicon / macOS M1.
For observability, start the backend with Prometheus, Grafana, and OpenTelemetry only if you are not using docker-compose.full.yml:
docker compose -f docker-compose.yml -f docker-compose.observability.yml up --build
Open:
- Frontend: http://localhost:3000
- Backend API: http://localhost:8080
- API docs: http://localhost:8080/docs
- Elasticsearch / Kibana: http://localhost:5601
- Milvus Insight: http://localhost:8081
Local Development
Debug startup (VS Code + debugpy)
Option A: debug the backend in Docker
If you want to debug the backend in Docker, start the stack with the debug override file:
docker compose -f docker-compose.yml -f docker-compose.debug.yml up --build
Then in VS Code, open the Run and Debug view and choose Python Attach: backend debugpy to attach to port 5678.
After the backend is attached, open:
- Frontend: http://localhost:3000
- Backend API: http://localhost:8080
- Debugger endpoint: http://localhost:5678
Option B: debug the backend locally (no Docker)
If you prefer to run the backend directly on your machine and still debug it in VS Code:
cd backend
conda run -n callItADay python -m pip install -r requirements.txt
cd ..
docker compose up -d postgres milvus elasticsearch
Then open the Run and Debug panel and choose Python Launch: FastAPI (local). This launches the backend with uvicorn --reload directly from the workspace, so you can set breakpoints without starting the Docker backend container.
After the local backend starts, open:
- Backend API: http://localhost:8080/docs
- Frontend: http://localhost:5173 (started separately with
cd frontend && npm run dev)
Backend
cd backend
conda run -n callItADay python -m pip install -r requirements.txt
conda run -n callItADay uvicorn main:app --reload --port 8080
Frontend:
cd frontend
npm install
npm run dev
Verification:
conda run -n callItADay python -m compileall backend
cd frontend && npm run build
Important Notes
- The application no longer assumes a specific cloud provider.
- The model client uses OpenAI-compatible request semantics as a transport protocol.
- Milvus and Elasticsearch are started locally by Docker Compose.
- The generated local folders such as
frontend/node_modules,frontend/dist, and Python__pycache__may be kept to speed up repeated local runs.
License
MIT