Challenges and Opportunities of Agentic Systems

Part 1: Evolution of Agentic Systems Complexity
- Early AI: simple prompt-to-answer large language model (LLM) systems.
- Retrieval-Augmented Generation (RAG) systems: embed internal data into vector/graph/regular DBs for better context.
- Agentic RAG: adds pre-processing, query splitting, routing to different data sets, plus reflection loops to improve answers.
- Agentic systems: LLM-based agents with tools, planning, memory to act autonomously.
- Semi-multi-agent systems: master agent coordinating sub-agents sharing memory locally.
- Multi-agent systems: distributed agents communicating over the Internet (Internet of Agents).
- Challenges: non-determinism, testing difficulties, memory communication, efficient context usage.
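
The agentic RAG step above (query splitting, routing to different data sets, and a reflection loop) can be sketched roughly as follows. This is a minimal illustration, not the speaker's implementation: the `DATASETS` dictionary, the keyword-based `route` rule, and the `fake_generate` / `fake_critique` stand-ins for LLM calls are all assumptions made for the example.

```python
from typing import Callable, Dict, List

# Hypothetical in-memory data sets; a real system would query vector, graph, or regular DBs.
DATASETS: Dict[str, List[str]] = {
    "hr": ["Employees get 25 vacation days per year."],
    "it": ["Password resets are handled through the IT self-service portal."],
}

def split_query(query: str) -> List[str]:
    """Pre-processing: split a compound question into sub-queries (naive split on ' and ')."""
    return [part.strip() for part in query.split(" and ") if part.strip()]

def route(sub_query: str) -> str:
    """Routing: pick a data set with a keyword rule (stand-in for an LLM-based router)."""
    return "it" if "password" in sub_query.lower() else "hr"

def retrieve(sub_query: str, dataset: str) -> List[str]:
    """Retrieval: return every document in the routed data set (stand-in for similarity search)."""
    return DATASETS[dataset]

def agentic_rag(
    query: str,
    generate: Callable[[str, List[str]], str],
    critique: Callable[[str, str], bool],
    max_rounds: int = 2,
) -> str:
    """Split, route, retrieve, then generate and regenerate until the critique accepts."""
    context: List[str] = []
    for sub_query in split_query(query):
        context.extend(retrieve(sub_query, route(sub_query)))
    answer = generate(query, context)
    for _ in range(max_rounds):  # reflection loop, bounded to avoid endless retries
        if critique(query, answer):
            break
        answer = generate(query + " (be more complete)", context)
    return answer

# Stand-ins for LLM calls (assumptions for this sketch only).
def fake_generate(query: str, context: List[str]) -> str:
    return " ".join(context)

def fake_critique(query: str, answer: str) -> bool:
    return len(answer) > 0

answer = agentic_rag(
    "How many vacation days do I get and how do I reset my password",
    fake_generate,
    fake_critique,
)
```

Bounding the reflection loop with `max_rounds` is one simple way to keep a non-deterministic critique step from running indefinitely.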

Part 2: Frameworks to Manage Agentic Systems
- "Cowboy Agentic System Hierarchy of Needs":
  - POC: use existing AI labs’ models, simple prototypes.
  - MVP: add proprietary data storage and orchestration frameworks.
  - Beta: add model routing for reliability, observability to reduce black-box issues.
  - GA: evaluation and security for production readiness, add memory.
  - Caution: rushing through the stages in this order leads to unreliability; observability and evaluation are crucial early.
- "Enterprise Agentic System Hierarchy of Needs":
  - Focus shifts to evaluation-driven development.
  - Observability and evaluation prioritized before orchestration and routing.
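
One common reading of "model routing for reliability" is falling back to a second model when the first one fails. The sketch below assumes that interpretation; the summary does not say which routing strategy the talk had in mind. The model functions `flaky_primary` and `stable_backup` are hypothetical placeholders for real provider calls.

```python
from typing import Callable, List, Tuple

ModelFn = Callable[[str], str]

def route_with_fallback(prompt: str, models: List[Tuple[str, ModelFn]]) -> Tuple[str, str]:
    """Try each (name, model) in priority order; return (name, answer) from the first success."""
    errors: List[str] = []
    for name, call in models:
        try:
            return name, call(prompt)
        except Exception as exc:  # record the failure and move on to the next model
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))

# Hypothetical models used only to exercise the router.
def flaky_primary(prompt: str) -> str:
    raise TimeoutError("primary timed out")

def stable_backup(prompt: str) -> str:
    return f"backup answer to: {prompt}"

used_model, reply = route_with_fallback(
    "hi", [("primary", flaky_primary), ("backup", stable_backup)]
)
```

Returning the name of the model that actually answered keeps the routing decision visible to whatever observability layer records the run.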

Part 3: Practical Implementation Steps for Enterprise Applications
- Define a clear business problem that is a good fit for LLMs, to avoid wasted effort.
- Build a prototype (it can be simple, even an Excel sheet).
- Define success metrics ("North Star metric") and break into leading/input metrics for incremental optimization.
- Define evaluations for each metric and task in the agentic topology.
- Instrument application from day one:
  - Trace all runs, collect human feedback.
  - Store evaluations centrally and focus on failing cases for improvement.
- Continuous improvement cycle through prompt engineering, topology changes, fine-tuning.
- Integrate evaluation step into CI/CD pipelines as a first-class citizen.
- Production monitoring includes LLM-specific metrics (latency, token counts, cost) and error tracking.
- Stop development when gains plateau, then pursue new improvement problems.
- Use routers to manage multiple problems in one app, keep evolving topology and evaluations.
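
The idea of evaluation as a first-class CI/CD step can be sketched as a gate that runs a set of evaluation cases and fails the build below a pass-rate threshold. Everything here is illustrative: the `EvalCase` shape, the substring check, the `0.9` default threshold, and `toy_app` are assumptions, and real evaluations may use rubrics, human labels, or LLM judges instead.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class EvalCase:
    name: str
    prompt: str
    expected_substring: str  # hypothetical check; stands in for a real evaluator

def run_evals(app: Callable[[str], str], cases: List[EvalCase]) -> List[EvalCase]:
    """Return the cases whose output does not contain the expected substring."""
    return [c for c in cases if c.expected_substring.lower() not in app(c.prompt).lower()]

def evaluation_gate(
    app: Callable[[str], str], cases: List[EvalCase], min_pass_rate: float = 0.9
) -> bool:
    """Print failing cases and report whether the pass rate meets the threshold."""
    if not cases:
        raise ValueError("no evaluation cases defined")
    failing = run_evals(app, cases)
    pass_rate = 1 - len(failing) / len(cases)
    for case in failing:  # failing cases are where improvement work should focus
        print(f"FAIL {case.name}")
    print(f"pass rate: {pass_rate:.2f}")
    return pass_rate >= min_pass_rate

# Toy application and cases, for illustration only.
def toy_app(prompt: str) -> str:
    return "Paris is the capital of France."

CASES = [
    EvalCase("capital", "What is the capital of France?", "paris"),
    EvalCase("currency", "What is the currency of Japan?", "yen"),
]

gate_passed = evaluation_gate(toy_app, CASES)
```

In a pipeline, a small wrapper script could call `sys.exit(0 if gate_passed else 1)` so that a failed gate blocks the merge or deployment.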

Additional Notes and Actionable Items:
- Pre-process data with care and implement re-ranking to solve accuracy issues.
- Start with evaluation and observability platforms for reliability; orchestration frameworks may be optional initially.
- Adopt evaluation-driven development mindset for Gen AI projects.
- Instrument application and set up observability/feedback from day one.
- Integrate evaluation into CI/CD pipelines to automate testing of AI improvements.
- Continuous monitoring includes both AI and traditional software engineering metrics.
- Consider joining the intent AI engineering boot camp starting June 22nd for hands-on project experience.
- Follow the speaker on LinkedIn and Substack for regular insights.
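
The re-ranking step mentioned in the notes above can be illustrated with a very small second-stage scorer. Production re-rankers typically use a trained model; the token-overlap score below is a deliberately simple stand-in, and the function names are assumptions made for this sketch.

```python
import re
from typing import List, Set

def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def rerank(query: str, candidates: List[str], top_k: int = 2) -> List[str]:
    """Re-order first-stage retrieval results by the fraction of query tokens each contains."""
    query_tokens = tokenize(query)

    def score(doc: str) -> float:
        if not query_tokens:
            return 0.0
        return len(query_tokens & tokenize(doc)) / len(query_tokens)

    # sorted() is stable, so ties keep their first-stage order.
    return sorted(candidates, key=score, reverse=True)[:top_k]

first_stage = [
    "Office opening hours are 9 to 5.",
    "Vacation policy: 25 days per year.",
    "How to request vacation days in the HR portal.",
]
top_docs = rerank("how do I request vacation days", first_stage)
```

Only the top-ranked passages are then passed to the model, which is also one way to use the context window more efficiently.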

Key Actionable Tasks for Practitioners:
- Define and validate problem fit before development.
- Build simple prototypes rapidly, involving stakeholders early.
- Define clear performance metrics and evaluations.
- Implement traceability and human feedback collection from start.
- Build CI/CD pipelines with mandatory evaluation steps.
- Continuously monitor AI system performance and costs.
- Focus on improving data preprocessing, retrieval accuracy, and re-ranking.
- When scaling, add model routing, observability, security, and memory features as per maturity.
- Explore multi-agent distributed architectures only after mastering single-agent complexities.
- Engage in training programs like the intent AI engineering boot camp for practical knowledge.
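
Tracing every run, collecting human feedback, and tracking LLM-specific metrics (latency, token counts, cost) could look roughly like the sketch below. The in-memory `TRACES` store, the whitespace token count, and the `PRICE_PER_1K_TOKENS` value are placeholders chosen for the example, not real vendor figures or a specific observability platform's API.

```python
import time
import uuid
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

PRICE_PER_1K_TOKENS = 0.002  # assumed placeholder price, not a real vendor rate

@dataclass
class Trace:
    trace_id: str
    prompt: str
    output: str
    latency_s: float
    tokens: int
    cost: float
    feedback: Optional[int] = None  # e.g. +1 or -1 from a human reviewer

TRACES: Dict[str, Trace] = {}  # stand-in for a central trace store

def count_tokens(text: str) -> int:
    """Crude whitespace token count; real tokenizers give different numbers."""
    return len(text.split())

def traced_call(model: Callable[[str], str], prompt: str) -> Trace:
    """Call the model and record latency, token count, and estimated cost."""
    start = time.perf_counter()
    output = model(prompt)
    latency = time.perf_counter() - start
    tokens = count_tokens(prompt) + count_tokens(output)
    cost = tokens / 1000 * PRICE_PER_1K_TOKENS
    trace = Trace(uuid.uuid4().hex, prompt, output, latency, tokens, cost)
    TRACES[trace.trace_id] = trace
    return trace

def record_feedback(trace_id: str, score: int) -> None:
    """Attach a human feedback score to a stored trace."""
    TRACES[trace_id].feedback = score

def failing_traces() -> List[Trace]:
    """Return traces with negative feedback, the cases to focus improvement on."""
    return [t for t in TRACES.values() if t.feedback is not None and t.feedback < 0]

# Example run with a hypothetical model.
trace = traced_call(lambda prompt: "four", "what is two plus two")
record_feedback(trace.trace_id, -1)
```

Keeping feedback on the same record as the trace makes it straightforward to turn negatively rated runs into new evaluation cases.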

15:20 - 15:50, 28th of May (Wednesday) 2025 / DEV TRENDS STAGE

2025 is positioned to be the year of Agentic Systems - AI agents capable of autonomous decision-making that are transforming the software landscape. In just a few months, we have already seen the technology transition through multiple hype cycles. In this talk, we’ll cut through the noise to explore the real challenges and emerging opportunities in the space. We will examine where we are, where we're headed, and what it all means for developers shaping the future of AI.

LEVEL:

Basic Advanced Expert

TRACK:

AI/ML Data

TOPICS:

AI ML/DL

Aurimas Griciūnas

SwirlAI