AI tools
How you should choose tools
You don’t pick tools to look modern; you pick them to reduce risk and cycle time. Start with the workflow you want: data ingestion, retrieval, evaluation, deployment, and monitoring. Then choose the smallest set of tools that makes the workflow repeatable.
- Prefer boring glue. You want tools that integrate cleanly with identity, CI/CD, and observability.
- Optimize for evaluation. If you can’t measure quality, you can’t improve it.
- Separate experiments from production. Keep a clean path from prototype to hardened service.
App and agent frameworks
- LangChain / LangGraph. Useful when you need structured orchestration, tool-calling, and multi-step flows.
- Semantic Kernel. Helpful if you prefer a plugin-style model for tools and prompts, especially in .NET shops.
- OpenAI / Anthropic SDKs directly. Often the best default when your workflow is simple and you want fewer abstractions.
Retrieval (RAG) building blocks
- Embedding models. You use these to turn text into vectors for similarity search; pick based on language/domain fit.
- Vector stores. Store vectors + metadata; make sure filtering and tenancy boundaries match your data model.
- Chunking and re-ranking. You tune chunk size and add re-ranking to improve relevance and reduce hallucination risk.
- Document pipelines. You need ingestion, deduplication, permissions, and freshness controls.
Evaluation and safety tools
- Offline eval. Golden datasets, regression suites, and rubric-based grading for prompts and RAG.
- Online monitoring. Track latency, cost, refusal rates, tool errors, and user feedback signals.
- Policy gates. You add prompt injection defenses, PII handling, and allowlists for tools/actions.
Production basics
- Identity and access. Your agents should use least privilege and short-lived credentials.
- Observability. You log prompts safely, capture tool-call traces, and build dashboards for failures.
- Cost controls. Use budgets, caching, and rate limits so experimentation doesn’t become a surprise bill.
- Change management. Version prompts, datasets, and evals the same way you version code.
References
- Anthropic — model and safety research.
- OpenAI — platform docs and model capabilities.
- Semantic Kernel docs — plugins, planners, and integrations.
- LangChain docs — chains, tools, and RAG patterns.