Field Journal.ai
More stories
All articles
-
Guides
Shipping Safe Tooling: Schemas, Validation, and Failure Modes in Tool Calling
A production guide to tool calling safety: designing tight tool contracts, validating outputs, limiting agency, and handling retries, idempotency, and audit logs for tool-using agents.
-
Guides
The Return of RAG in 2026
RAG is back in 2026 because long context did not solve freshness, permissions, or reliability. Modern RAG looks like search engineering: hybrid retrieval, reranking, and tight evals.
-
Opinion
Why Frontier Models Are Getting More Restrictive
Moderation is no longer a thin filter on top of a chatbot. For frontier labs, it is becoming an end-to-end product and risk system shaped by capability jumps, regulation, and enterprise expectations.
-
Guides
LLM Evals for Chat and Tool-Using Agents: A Practical Guide to Test Suites and Graders
A production-first guide to evaluating chat assistants and tool-using agents with a small, reliable eval suite: datasets, grader types, flake reduction, and CI gates.