One of the most thought-provoking moments at Pega Cloud Summit 2026 came from Andrzej Lassak, Senior Director of AI Engineering, who spent an hour breaking down what GenAI and Agents actually mean in a production enterprise context — and, just as importantly, what they don't mean. His session cut through the noise of the current AI hype cycle with clarity, candor, and a healthy dose of caution.
You can watch the full recording here: 👉 Pega Cloud Summit 2026 – From Blueprint to Go-Live
Where Are We on the Hype Cycle?
Andrzej opened by anchoring the audience in reality. Using the Gartner Hype Cycle as a reference, he noted that AI Agents are currently at the Peak of Inflated Expectations — the moment when the gap between promise and delivery is at its widest. Meanwhile, Generative AI and Knowledge Buddy (RAG-based retrieval) are entering the Slope of Enlightenment, as clients move past early disillusionment and begin to understand what real-world value actually looks like.
The framing was deliberate: before you can design AI responsibly, you have to know where you are.
Agents Are Not LLMs — and the Difference Matters
A core theme of the session was the distinction between Large Language Models and Agents — two terms that are often used interchangeably but shouldn't be. Andrzej was direct: LLMs generate text. Agents take actions.
An LLM, on its own, is stateless. It produces one-shot responses, has no access to real-world systems, and holds no goals. An Agent is the wrapper that gives an LLM hands, a mouth, and ears — enabling it to call APIs, retain memory across steps, and pursue a specific objective over a multi-step process.
His practical advice on agent design was equally clear: don't try to build a digital replica of a person. Instead, decompose workflows into small, focused agents and inject them at specific steps where they add the most value. Narrow scope, narrow risk.
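The LLM-vs-agent distinction above can be made concrete with a minimal sketch: the "agent" is just a loop that wraps a stateless text generator with a goal, memory, and tool access. Everything here is illustrative — the model is a stub standing in for a real LLM API call, and the tool and function names are hypothetical, not Pega APIs:

```python
# Minimal agent loop: the LLM is stateless; the loop supplies goal, memory,
# and the ability to act. The "LLM" here is a stub, not a real model call.

def stub_llm(prompt: str) -> str:
    """Stateless one-shot text generator (stands in for a real LLM call)."""
    if "ORDER-42" in prompt and "status=shipped" in prompt:
        return "FINAL: Order ORDER-42 has shipped."
    return "ACTION: lookup_order ORDER-42"

def lookup_order(order_id: str) -> str:
    """A 'tool' the agent can call -- a stand-in for a real API."""
    return f"{order_id} status=shipped"

TOOLS = {"lookup_order": lookup_order}

def run_agent(goal: str, max_steps: int = 5) -> str:
    memory = []  # the agent, not the LLM, holds state across steps
    for _ in range(max_steps):
        prompt = f"Goal: {goal}\nMemory: {memory}"
        reply = stub_llm(prompt)
        if reply.startswith("FINAL:"):
            return reply.removeprefix("FINAL: ").strip()
        _, tool_name, arg = reply.split(" ", 2)
        memory.append(TOOLS[tool_name](arg))  # take an action, remember the result
    return "gave up"

print(run_agent("Report the status of ORDER-42"))
```

Note how the loop, not the model, enforces scope: the agent can only call the tools it is given and only run for a bounded number of steps — a small-scale version of the "narrow scope, narrow risk" advice.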
The "Intern Iris" Cautionary Tale
To illustrate what happens when agents are given too much autonomy without proper guardrails, Andrzej shared an internal cautionary story about an agent at Pega — affectionately named "Intern Iris" — that was tasked with drafting a contract. Acting with a little too much initiative, the agent sent the offer directly to a senior leader at another company without any human review.
The lesson: agents must be governed like employees. They need job descriptions, defined hierarchies, and strict workflow boundaries — not just a prompt and a set of permissions. The architecture of control matters as much as the capability itself.
Pega's Predictable AI Strategy
At the heart of Andrzej's talk was Pega's overarching approach to GenAI: use AI's creative reasoning at design time, and keep runtime execution governed and predictable.
This means using Blueprint to map every path and decision in advance, so that what happens at runtime is audited, consistent, and explainable — rather than relying on an LLM to improvise in the moment. For enterprise clients operating in regulated industries, this distinction is not academic. It is the difference between a system you can stand behind and one you cannot.
Blueprint itself is available for free, is enterprise-grade, and respects data residency — data stays within the region of login, whether that's the US or the EU.
How Clients Are Using Knowledge Buddy Today
Andrzej outlined three distinct usage patterns that are emerging among Pega clients deploying Knowledge Buddy:
- Deep Search — thorough, slower, suited to complex document analysis requiring smart model reasoning.
- Interactive — standard chatbot speed, ideal for quick business questions.
- Sub-agentic — ultra-fast, shallow retrieval used by other agents to pull specific information chunks in sub-seconds.
Each pattern has its place, and the right choice depends on the use case — not on which sounds most impressive.
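One way to reason about choosing between these three patterns is to treat them as retrieval "profiles" selected by use case. The sketch below is purely illustrative — the profile names mirror the patterns above, but the parameters and selection logic are assumptions, not Knowledge Buddy's actual API:

```python
# Illustrative only: the three Knowledge Buddy usage patterns modeled as
# retrieval profiles. Parameter names and values are hypothetical.

from dataclasses import dataclass

@dataclass(frozen=True)
class RetrievalProfile:
    name: str
    top_k: int             # how many knowledge chunks to retrieve
    use_reasoning: bool    # whether a smart model reasons over the results
    latency_budget_ms: int

PROFILES = {
    "deep_search": RetrievalProfile("deep_search", top_k=50,
                                    use_reasoning=True, latency_budget_ms=30_000),
    "interactive": RetrievalProfile("interactive", top_k=8,
                                    use_reasoning=True, latency_budget_ms=3_000),
    "sub_agentic": RetrievalProfile("sub_agentic", top_k=3,
                                    use_reasoning=False, latency_budget_ms=500),
}

def pick_profile(called_by_agent: bool, complex_analysis: bool) -> RetrievalProfile:
    """Choose the pattern from the use case, not from what sounds impressive."""
    if called_by_agent:
        return PROFILES["sub_agentic"]   # another agent needs chunks in sub-seconds
    if complex_analysis:
        return PROFILES["deep_search"]   # thorough, slower document analysis
    return PROFILES["interactive"]       # quick business questions
```

The point of the sketch is the decision order: the caller's context (is it another agent? does the task need deep reasoning?) determines the pattern, not the other way around.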
Choosing the Right Model: A Practical Guide
One of the session's most actionable insights concerned LLM selection. Andrzej explained that LLMs process tokens (roughly one word each), and that models can read approximately 20 tokens for every 1 token they generate — meaning output generation is significantly slower than input processing.
The practical implication: use the least capable model that can successfully complete the task. Fast models like Nova Micro can output 600 tokens per second but are less sophisticated; smarter, larger models such as Claude Opus are slower. Matching model capability to task complexity is one of the simplest and most impactful optimisations available.
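The read/generate asymmetry makes for a useful back-of-envelope latency model. In the sketch below, the 600 tokens/second figure for a fast model comes from the session; the 20:1 read-to-generate ratio is the rule of thumb quoted above, and the slower model's throughput is an illustrative assumption:

```python
# Back-of-envelope latency model for the ~20:1 read/generate asymmetry.
# Only the 600 tok/s fast-model figure comes from the session; the rest
# are illustrative assumptions.

def response_time_s(input_tokens: int, output_tokens: int,
                    output_tps: float) -> float:
    """Rough estimate: reading is ~20x faster than generating."""
    input_tps = 20 * output_tps  # prefill ~20 tokens per generated token
    return input_tokens / input_tps + output_tokens / output_tps

# Same prompt and answer length, fast model vs. a hypothetical slower one:
fast = response_time_s(input_tokens=4000, output_tokens=600, output_tps=600.0)
smart = response_time_s(input_tokens=4000, output_tokens=600, output_tps=60.0)
print(f"fast:  {fast:.2f}s")   # 4000/12000 + 600/600 = 1.33s
print(f"smart: {smart:.2f}s")  # 4000/1200  + 600/60  = 13.33s
```

The arithmetic shows why output length dominates: even with a large prompt, generation accounts for most of the wall-clock time, so trimming requested output (or using a faster model where quality allows) pays off more than trimming input.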
A Note on Cloud and On-Premises
For clients asking about on-premises deployments: Pega GenAI is a Pega Cloud exclusive due to the underlying infrastructure complexity, but it can be integrated into on-prem systems via hybrid widgets and connectors. On the model governance front, Andrzej advised against pointing core Pega functions to external or client-managed LLMs due to operational risk — recommending instead that clients use Knowledge Buddy to feed their specific knowledge delta into Pega's supported and tested models.
For EU-based clients, Pega is a lead partner for the AWS EU Sovereign Cloud, ensuring that all LLM interactions remain within the defined sovereign boundary.
Reflection
In his post-summit reflection, Andrzej described these sessions as a strong way for people to get up to speed quickly on what's happening and where things are heading.
"It's a great place for people to get up to speed with what's happening and where things are heading. Sometimes we overshare in these meetings, so attendees can get glimpses into the future, beyond what they can read in our official documentation 🙂 What's important is that we really get the people who matter in Pega to present: very seasoned and influential people, usually Sr. Director and above." – Andrzej Lassak
Join the Conversation
If Andrzej's session raised questions for your own GenAI journey, the conversation continues in the Pega as-a-Service Expert Circle — home to regular webinars, technical deep dives, and peer expertise from across the Pega community.
👉 Visit the Pega Community Expert Circles
Have a follow-up question that didn't get answered during the session? Post it in the Q&A discussion thread — the team is monitoring and responding. Whether it's a use case you're exploring, a governance challenge you're navigating, or a topic you'd like to see covered in a future session, your input shapes what comes next.