Agentic AI - The Evolution of Language Models

Published on Thursday, 26-06-2025

#Tutorials

(Summarized from Stanford Webinar - Agentic AI: A Progression of Language Model Usage: https://www.youtube.com/watch?v=kJLiOGle3Lw&t=837s)

Language models have become core tools in artificial intelligence. In a Stanford webinar, Insop Song presented an overview of agentic AI, tracing how language model usage has progressed and where it can go next. Here are the key takeaways from his presentation.

Understanding Language Models: The Foundation

At its core, a language model (LM) is a machine learning model that predicts the next word in a sequence given an input text. Think of it like this: if you input “The students open their,” an LM can predict “books” or “laptops” as the most probable next words.
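
To make next-word prediction concrete, here is a minimal sketch that counts which word follows which in a tiny corpus. The corpus and the function names (`train_bigram`, `predict_next`) are made up for illustration. Real language models use neural networks over tokens rather than word counts, so treat this only as a picture of the idea.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus, for illustration only.
CORPUS = [
    "the students open their books",
    "the students open their laptops",
    "the students open their books quietly",
]

def train_bigram(sentences):
    """Count, for every word, which words follow it."""
    following = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for current, nxt in zip(words, words[1:]):
            following[current][nxt] += 1
    return following

def predict_next(following, text, k=2):
    """Return the k most probable next words with their probabilities."""
    last_word = text.lower().split()[-1]
    counts = following[last_word]
    total = sum(counts.values())
    return [(word, count / total) for word, count in counts.most_common(k)]

model = train_bigram(CORPUS)
predictions = predict_next(model, "The students open their")
print(predictions)  # "books" ranks first, "laptops" second
```

In this toy corpus "books" follows "their" twice and "laptops" once, so the sketch ranks them in that order.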

The training of language models typically involves two phases:

Pre-training: Models are exposed to vast amounts of text data from the internet, books, and other public sources. During this phase, they learn to predict the next word or token, developing a broad understanding of language and world knowledge.
Post-training: This stage refines the pre-trained model for practical use. It includes:
- Instruction Following Training: The model is trained on datasets of specific instructions or questions paired with expected, user-friendly answers. This makes the model easier to interact with and enables it to respond in desired styles.
- Reinforcement Learning from Human Feedback (RLHF): Humans rank or compare model outputs, and those preferences are turned into a reward signal that is used to fine-tune the model, aligning its responses more closely with human expectations.
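
The talk does not spell out how preferences become a reward signal. One common formulation, assumed here rather than taken from the presentation, trains a reward model with a pairwise loss: the loss is small when the response humans preferred receives a higher score than the one they rejected. The function name `preference_loss` is hypothetical.

```python
import math

def preference_loss(reward_chosen, reward_rejected):
    """Negative log-probability that the chosen response beats the rejected one."""
    margin = reward_chosen - reward_rejected
    prob_chosen_wins = 1.0 / (1.0 + math.exp(-margin))  # sigmoid of the margin
    return -math.log(prob_chosen_wins)

# The reward model agrees with the human preference: low loss.
agree = preference_loss(2.0, 0.0)
# The reward model disagrees with the human preference: high loss.
disagree = preference_loss(0.0, 2.0)
print(agree, disagree)
```

Minimizing this loss pushes the reward model to score preferred answers higher, and the language model is then tuned to produce answers that the reward model scores well.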

These well-trained language models are highly capable of generating text based on instructions and are widely used in applications like AI coding assistants, domain-specific AI copilots, and conversational interfaces such as ChatGPT. Developers can access these models through cloud-based API calls or by hosting smaller models on local or mobile devices.

Crafting Effective Prompts: The Art of Communication

The quality of a language model’s output heavily depends on the input it receives, often referred to as prompting. Insop emphasized several best practices for preparing effective prompts:

Clear and Detailed Instructions: Be specific and descriptive. The more clearly you articulate your request, the better the model will understand and generate the desired output.
Few-Shot Examples: Providing a few examples of input-output pairs helps the model understand the desired format or style of the response.
Relevant Context and References: For factual information, supplying the model with context or references helps reduce “hallucination” (generating incorrect information) and ensures accuracy. For instance, instructing the model to “only answer based on the article provided” can significantly improve reliability.
Enable Reasoning (Chain-of-Thought): Instead of asking for an immediate answer, guide the model to “think step-by-step” or “work out its own solution first.” This encourages the model to write out intermediate reasoning before committing to an answer, which often leads to more accurate and robust outputs.
Break Down Complex Tasks: For intricate problems, divide them into smaller, simpler stages. The output of one stage can then become the input for the next, chaining together simple prompts to achieve a complex goal.
Systematic Tracing, Logging, and Automated Evaluation: Good engineering practices like logging every interaction and setting up automated evaluation from the beginning are crucial for tracking progress, debugging, and adapting to rapidly evolving models.
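
Several of the practices above can be combined in a few lines. The sketch below assembles a prompt from an instruction, optional context, few-shot examples, and a step-by-step cue, then chains two stages while logging each call. All names (`build_prompt`, `run_chain`, `fake_llm`) are hypothetical, and `fake_llm` is a stand-in that only wraps its input in angle brackets; a real setup would call an actual model there.

```python
def build_prompt(instruction, examples=(), context=None, step_by_step=False):
    """Assemble one prompt string from the parts described above."""
    parts = [instruction.strip()]
    if context:
        parts.append("Answer only from the article below.\n---\n" + context.strip() + "\n---")
    for example_input, example_output in examples:
        parts.append(f"Input: {example_input}\nOutput: {example_output}")
    if step_by_step:
        parts.append("Think step by step before giving the final answer.")
    return "\n\n".join(parts)

def run_chain(llm, stages, first_input, log):
    """Feed each stage's output into the next stage and log every call."""
    current = first_input
    for stage_instruction in stages:
        prompt = build_prompt(stage_instruction) + "\n\nInput: " + current
        current = llm(prompt)
        log.append({"prompt": prompt, "output": current})
    return current

def fake_llm(prompt):
    # Stand-in for a real model call: wraps the stage input in angle brackets.
    text = prompt.rsplit("Input: ", 1)[-1]
    return f"<{text}>"

prompt = build_prompt(
    "Classify the sentiment of the review.",
    examples=[("Great battery life", "positive"), ("Screen cracked in a week", "negative")],
    step_by_step=True,
)
print(prompt)

log = []
result = run_chain(fake_llm, ["Summarize the text.", "Translate the summary to French."], "some text", log)
print(result, len(log))
```

Keeping the log as plain dictionaries makes it easy to replay prompts later or feed them into an automated evaluation script.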

Overcoming Limitations: Addressing Common Challenges

Despite their power, language models have limitations:

Hallucination: Generating incorrect or non-existent information.
Knowledge Cut-off: Models are trained on data up to a certain point, meaning they lack recent information.
Lack of Attribution: Models don’t typically cite their sources.
Data Privacy: Models trained on public data haven’t seen proprietary information.
Limited Context Length: While increasing, the amount of text a model can process at once is finite.

To mitigate these, Retrieval Augmented Generation (RAG) is a popular solution. RAG involves indexing your own data, converting it into numerical representations (embeddings), and storing them in a database. When a query comes in, relevant chunks of your data are retrieved and provided to the language model as context, allowing it to generate answers based on your specific and up-to-date information. This approach can reduce hallucination, provide citations, use proprietary data, and make efficient use of context length.
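
The retrieval flow described above can be sketched without any external services. This example assumes a toy "embedding" that is just a bag-of-words count vector and an in-memory list as the "database"; production systems use a learned embedding model and a vector store. The documents are made-up sample data, and the names `embed`, `retrieve`, and `build_rag_prompt` are hypothetical.

```python
import math
from collections import Counter

# Made-up sample data standing in for your own documents.
DOCUMENTS = [
    "Refunds are accepted within 30 days of purchase with a receipt.",
    "Standard shipping takes five business days.",
    "Support is available by email on weekdays.",
]

def embed(text):
    """Toy embedding: a bag-of-words count vector (real systems use a learned model)."""
    words = [w.strip(".,?!").lower() for w in text.split()]
    return Counter(w for w in words if w)

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Indexing step: embed every document once and keep the vectors.
INDEX = [(doc, embed(doc)) for doc in DOCUMENTS]

def retrieve(query, k=1):
    """Return the k documents most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(INDEX, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def build_rag_prompt(query, k=1):
    """Put the retrieved chunks in front of the question, numbered for citation."""
    chunks = retrieve(query, k)
    context = "\n".join(f"[{i + 1}] {chunk}" for i, chunk in enumerate(chunks))
    return f"Answer using only the sources below and cite them.\n{context}\n\nQuestion: {query}"

question = "How many days do I have for refunds?"
print(build_rag_prompt(question))
```

The numbered sources are what make attribution possible: the model can be asked to cite `[1]`, and the application can map that back to the original document.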

Another powerful technique is Tool Usage, also known as function calling. Language models can be instructed to generate specific API calls or code that can interact with external systems (e.g., a weather API, a calculator, or a search engine). This allows the model to access real-time information or perform computations that it cannot do intrinsically.
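
On the application side, tool usage comes down to parsing what the model emitted and dispatching it to real code. The sketch below assumes the model has been instructed to reply with a JSON object containing `name` and `arguments`; that format, the tool registry `TOOL_REGISTRY`, and both tools are assumptions for illustration, not a specific vendor's API.

```python
import json

def add(a, b):
    return a + b

def get_weather(city):
    # Hypothetical stub; a real tool would call a weather API here.
    return f"No live data for {city} in this sketch"

# Registry of functions the model is allowed to call.
TOOL_REGISTRY = {"add": add, "get_weather": get_weather}

def run_tool_call(model_output):
    """Parse a JSON tool call emitted by the model and execute it."""
    try:
        call = json.loads(model_output)
        name, arguments = call["name"], call["arguments"]
    except (json.JSONDecodeError, KeyError, TypeError):
        return {"error": "malformed tool call"}
    if name not in TOOL_REGISTRY:
        return {"error": f"unknown tool: {name}"}
    return {"result": TOOL_REGISTRY[name](**arguments)}

# Pretend the model produced this string in response to "What is 2 + 40?".
model_output = '{"name": "add", "arguments": {"a": 2, "b": 40}}'
print(run_tool_call(model_output))  # {'result': 42}
```

Returning errors as data rather than raising lets the application hand the error back to the model, which can then try a corrected call.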

The Rise of Agentic Language Models: Reasoning and Action

Agentic language models represent a significant leap forward. Unlike simple LMs that just take text in and spit text out, agentic LMs can:

Interact with the Environment: They can generate tool usage or retrieval requests and then receive observations from the environment (anything outside the LM) to inform subsequent actions.
Reason and Act (ReAct): This core concept enables models to reason through a problem (e.g., using chain-of-thought) and then take an action (e.g., making an API call, searching the web, generating code).

By combining reasoning and action, agentic models can tackle far more complex tasks. For example, a customer support AI agent could break down a refund request into sub-tasks: checking refund policy (via retrieval), checking customer and product information (via API calls or direct questions), and then deciding on the appropriate action. This iterative process, where the language model reviews information, makes external tool calls, and stores observations in a “memory” (like conversational history), allows for sophisticated problem-solving.
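
The refund example above can be written as a small reason-and-act loop. In this sketch the model is replaced by `scripted_llm`, a hard-coded stand-in that returns a (thought, action, argument) triple; the tools, the order ID, and the policy text are all made up. A real agent would ask an actual model for the next step and parse its reply.

```python
def lookup_policy(topic):
    # Hypothetical retrieval tool with made-up policy text.
    return "Refunds are allowed within 30 days of delivery."

def lookup_order(order_id):
    # Hypothetical order API with made-up data.
    return "Order A17 was delivered 12 days ago."

SUPPORT_TOOLS = {"lookup_policy": lookup_policy, "lookup_order": lookup_order}

def scripted_llm(memory):
    """Stand-in for a real model: picks the next step from how many observations exist."""
    observations = [m for m in memory if m.startswith("Observation:")]
    script = [
        ("I should check the refund policy first.", "lookup_policy", "refund"),
        ("Now I need the order details.", "lookup_order", "A17"),
        ("12 days is inside the 30-day window.", "finish", "Approve the refund."),
    ]
    return script[len(observations)]

def react_loop(llm, task, max_steps=5):
    """Alternate reasoning and tool calls, storing every step in memory."""
    memory = [f"Task: {task}"]
    for _ in range(max_steps):
        thought, action, argument = llm(memory)
        memory.append(f"Thought: {thought}")
        if action == "finish":
            return argument, memory
        observation = SUPPORT_TOOLS[action](argument)
        memory.append(f"Action: {action}({argument})")
        memory.append(f"Observation: {observation}")
    return "Stopped: step limit reached.", memory

answer, trace = react_loop(scripted_llm, "Customer asks for a refund on order A17.")
print(answer)
print("\n".join(trace))
```

The `max_steps` cap is a deliberate design choice: without it, a model that never decides to finish would loop indefinitely.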

Agentic Design Patterns: Building Intelligent Systems

Several design patterns are crucial for building effective agentic AI systems:

Planning: Asking the model to break down a complex task into simpler sub-tasks and make a plan of action is fundamental.
Reflection: The model can generate an output, then critically review its own output (or take feedback from a “junior engineer” persona) and use that critique to improve the next iteration. This iterative self-correction leads to better results, especially in tasks like code refactoring.
Tool Usage: As discussed, this enables interaction with external systems for real-time data or computations.
Multi-Agent Collaboration: For highly complex tasks, you can split the work among different specialized agents (e.g., a climate control agent and a lighting control agent in a smart home system). Each agent can have a distinct persona and prompt, potentially using different models best suited for its task, all coordinated by a central scaffolding.
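
As one illustration of these patterns, the reflection loop can be sketched as generate, critique, regenerate. Both `draft` and `critique` below are hard-coded stand-ins for model calls (hypothetical names), with the critic checking a single rule; in practice each would be a prompt to a language model, possibly with a different persona.

```python
def draft(task, feedback):
    # Stand-in generator: "improves" the code only when it receives feedback.
    if "docstring" in feedback:
        return 'def area(w, h):\n    """Return the area of a rectangle."""\n    return w * h'
    return "def area(w, h):\n    return w * h"

def critique(output):
    # Stand-in critic: one hard-coded check instead of a second model call.
    if '"""' not in output:
        return "Add a docstring."
    return ""

def reflect(task, generate, review, max_rounds=3):
    """Generate, critique, and regenerate until the critic has no feedback."""
    feedback = ""
    history = []
    for _ in range(max_rounds):
        output = generate(task, feedback)
        feedback = review(output)
        history.append((output, feedback))
        if not feedback:
            break
    return output, history

final, history = reflect("Write a function that computes a rectangle's area.", draft, critique)
print(final)
print(len(history), "rounds")
```

The same loop shape extends to multi-agent setups: swap `draft` and `critique` for agents with different prompts or models, and let the surrounding code act as the coordinating scaffolding.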

The Future is Agentic

Agentic language models are more than an incremental improvement. They make it possible to complete tasks that a simple text-in, text-out interaction cannot handle, even with the same underlying model. Real-world applications are emerging in areas like:

Software Development: Code generation, bug fixing, and automated testing.
Research and Analysis: Gathering, synthesizing, and summarizing information.
Task Automation: Automating complex workflows.

The move towards agentic AI means that the language model acts as the “smart intern” or the “reasoning core,” leveraging external tools and iterative processes to accomplish complex goals. As these methods continue to evolve, the possibilities for AI-powered solutions will only expand.