AI Unscripted with Kieran Gilmurray

AI Agents: The Rise of Intelligent Automation

• Kieran Gilmurray

Unlock the transformative potential of AI agents in this deep-dive exploration of how LLM-powered systems are redefining what's possible in automation. We cut through the jargon and hype to reveal exactly what sets AI agents apart from conventional software: their ability to independently reason, orchestrate complex workflows, and make nuanced decisions without constant human guidance.

Discover the three essential building blocks that power effective agents: the LLM "brain" that drives reasoning, the tools that enable real-world actions, and the carefully crafted guardrails that ensure safe, reliable operation. We examine exactly where these systems deliver breakthrough value: in complex decision-making scenarios, situations with brittle rule systems, and workflows drowning in unstructured data.

Whether you're exploring potential applications or planning implementation, we provide practical insights on model selection, tool integration, instruction design, and orchestration patterns. Learn why starting simple with single-agent approaches often yields better results, and when to consider more sophisticated multi-agent architectures. Plus, discover the critical importance of layered safety mechanisms and thoughtful human oversight in creating responsible, effective systems.

As these powerful agents become more integrated into our workflows, they're not just changing how automation works; they're transforming our fundamental understanding of what work itself means. Ready to navigate this paradigm shift? Subscribe now to stay ahead of the AI revolution reshaping business and technology.

For more information:

🌎 Visit my website: https://KieranGilmurray.com
🔗 LinkedIn: https://www.linkedin.com/in/kierangilmurray/
🦉 X / Twitter: https://twitter.com/KieranGilmurray
📽 YouTube: https://www.youtube.com/@KieranGilmurray

📕 Buy my book 'The A-Z of Organizational Digital Transformation' - https://kierangilmurray.com/product/the-a-z-organizational-digital-transformation-digital-book/

📕 Buy my book 'The A-Z of Generative AI - A Guide to Leveraging AI for Business' - The A-Z of Generative AI - Digital Book Kieran Gilmurray

AI Speaker 1:

Hi there. You're probably here because, like us, you want to get a real handle on something important, you know, without wading through endless articles and jargon.

AI Speaker 2:

Yeah, cut right to the chase.

AI Speaker 1:

Exactly, and today we're diving deep into AI agents. We've gathered a bunch of info that really paints a picture of these systems, the ones powered by large language models, or LLMs.

AI Speaker 2:

Right.

AI Speaker 1:

And they seem like more than just, you know, clever software. This could genuinely be a fundamental shift in automation.

AI Speaker 2:

It really could. A move beyond the workflows you have to manually kick off.

AI Speaker 1:

But something that can, what, act independently on your behalf?

AI Speaker 2:

Pretty much. Our mission today, then, is to pull out the core knowledge you really need to understand this potential game changer.

AI Speaker 1:

Okay, so let's start there. An AI agent: how's it different from software that helps me do something?

AI Speaker 2:

Ah, good question. Well, the sources we looked at define an agent as a system specifically designed to independently accomplish tasks. It's about delegating entire processes, not just, like, individual steps.

AI Speaker 1:

Independently accomplishing tasks, yeah, okay. So what makes it an agent, then, rather than just a really fancy program that happens to use an LLM? What are the essential ingredients?

AI Speaker 2:

The material consistently points to three key things. First, the agent uses an LLM as its core, like its brain, its reasoning engine. It actively manages a workflow and makes decisions as it goes.

AI Speaker 1:

So it's not just spitting out text, it's directing things.

AI Speaker 2:

Exactly. Orchestrating actions. Second, it needs access to what are called tools.

AI Speaker 1:

Tools? Like software tools?

AI Speaker 2:

Sort of, yeah. Think of them like extensions or plugins. They let the agent interact with the outside world: query a database, send an email, search the web, that kind of thing.

AI Speaker 1:

Gotcha, so it can actually do stuff.

AI Speaker 2:

Right. And third, and this is crucial, its operation is governed by defined guardrails: instructions and boundaries to make sure it behaves acceptably.

AI Speaker 1:

Okay: LLM brain, tools for action, and guardrails for safety. Makes sense. But when would you actually go through the effort of building one? It sounds like a bigger deal than standard automation.

AI Speaker 2:

It definitely can be, and that's a really important question the sources address. Agents truly shine where traditional, like, rule-based automation starts hitting its limits.

AI Speaker 1:

Limits like what?

AI Speaker 2:

Well, take payment fraud analysis, for instance. A standard system might just flag transactions matching very specific preset rules. Bang, rule triggered. But an AI agent can reason through the context. It can look at subtle indicators, things that don't fit a neat rule, and make a more nuanced judgment. It's almost like having a tiny fraud investigator working 24/7.

AI Speaker 1:

Ah, I see. So it's less about rigid "if this, then that" and more about understanding the bigger picture.

AI Speaker 2:

Exactly, it moves beyond those brittle rules towards something more flexible, almost intuitive, you could say.

AI Speaker 1:

So are there specific areas where this really pays off, signs that an agent might be the way to go?

AI Speaker 2:

Yeah. The material highlights three main value areas. First is complex decision making. You know, workflows needing judgment calls, handling weird exceptions, adapting on the fly, like approving a tricky customer refund.

AI Speaker 1:

Right, where it's not just black and white.

AI Speaker 2:

Precisely. Second, situations where your rules have become insanely complicated and a nightmare to maintain. Think vendor security reviews with thousands of branching rules.

AI Speaker 1:

Oh yeah, I can imagine.

AI Speaker 2:

And third is when you're drowning in unstructured data, like sifting through thousands of customer emails written in natural language or pulling key facts from messy insurance claim documents.

AI Speaker 1:

Okay: complex decisions, hard-to-maintain rules, or lots of unstructured data.

AI Speaker 2:

If your problem ticks one or more of those boxes, an agent is definitely worth considering.

AI Speaker 1:

Right. So okay, let's say you've identified a good use case. Where do you start designing one? What are those core building blocks?

AI Speaker 2:

Again, so, back to those three core components we mentioned. First, the model, the LLM itself.

AI Speaker 1:

The brain.

AI Speaker 2:

The brain. Yeah, and different models have different strengths, right. Some are better at complex reasoning, some are faster, some are cheaper.

AI Speaker 1:

So how do you choose?

AI Speaker 2:

Well, the common advice seems to be start prototyping with the most capable model you can get access to. Really push the boundaries, see what's possible.

AI Speaker 1:

Prove the concept first.

AI Speaker 2:

Exactly. Then, once you've got something working, you can experiment, try smaller, faster, cheaper models and see if the performance is still good enough for your specific needs. Optimization comes later.
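
Editor's sketch: the "prove it with the most capable model, then try cheaper ones" advice can be pictured as picking the cheapest model that still clears a quality bar on your own test cases. This is only an illustration in Python; the model names, costs, and scores are hypothetical, and `evaluate` stands in for whatever evaluation you actually run.

```python
from typing import Callable, List, Optional, Tuple


def pick_model(
    candidates: List[Tuple[str, float]],
    evaluate: Callable[[str], float],
    min_accuracy: float,
) -> Optional[str]:
    """Return the cheapest candidate whose score meets the bar, else None."""
    passing = [
        (cost, name)
        for name, cost in candidates
        if evaluate(name) >= min_accuracy
    ]
    if not passing:
        return None
    # Tuples sort by cost first, so min() picks the cheapest passing model.
    return min(passing)[1]


# Hypothetical model names, relative costs, and eval scores (assumptions).
CANDIDATES = [("large", 10.0), ("medium", 3.0), ("small", 1.0)]
SCORES = {"large": 0.95, "medium": 0.91, "small": 0.78}

choice = pick_model(CANDIDATES, SCORES.__getitem__, min_accuracy=0.90)
```

With these made-up numbers the baseline set by the large model is met by the medium one, so that is what gets selected; raising the bar above every score returns `None`, which is the signal to stay with the most capable model.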

AI Speaker 1:

Smart. Prove it, then refine it. Component one, the model. What was number two?

AI Speaker 2:

The tools. These are those external functions or APIs (application programming interfaces) that let the agent interact with the world outside the LLM.

AI Speaker 1:

The hands, basically.

AI Speaker 2:

The hands, yeah, that's a good way to put it. The sources break them down into roughly three types. You've got data tools for fetching info: querying databases, reading files, searching the web.

AI Speaker 1:

Okay.

AI Speaker 2:

Then action tools for doing things: sending emails, updating Salesforce records, creating support tickets.

AI Speaker 1:

Makes sense.

AI Speaker 2:

And interestingly, there are also orchestration tools where one agent can actually call another agent as one of its tools to handle a subtask.

AI Speaker 1:

Whoa, agents using other agents? Okay, meta.

AI Speaker 2:

It can get pretty sophisticated. The point is equipping the agent with exactly the capabilities it needs for its job.
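
Editor's sketch: the three tool types described here (data, action, orchestration) could be represented as plain functions in a small registry. This is a minimal Python illustration, not any particular SDK's API; every name below (`Tool`, `REGISTRY`, `lookup_order`, and so on) is hypothetical and the tool bodies are stubs.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Tool:
    name: str
    kind: str  # "data", "action", or "orchestration"
    func: Callable[[str], str]


def lookup_order(order_id: str) -> str:
    # Data tool: fetches information (stubbed, no real database).
    return f"order {order_id}: shipped"


def send_email(body: str) -> str:
    # Action tool: would change something in the outside world (stubbed).
    return f"email sent: {body}"


def translation_agent(text: str) -> str:
    # Orchestration tool: stands in for another agent exposed as a tool.
    return f"[es] {text}"


REGISTRY: Dict[str, Tool] = {
    tool.name: tool
    for tool in [
        Tool("lookup_order", "data", lookup_order),
        Tool("send_email", "action", send_email),
        Tool("translation_agent", "orchestration", translation_agent),
    ]
}


def call_tool(name: str, arg: str) -> str:
    """Dispatch a tool call by name, failing loudly on unknown tools."""
    if name not in REGISTRY:
        raise KeyError(f"unknown tool: {name}")
    return REGISTRY[name].func(arg)
```

Keeping the registry small matches the point made above: the agent only gets the capabilities its job needs.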

AI Speaker 1:

Got it. Model, tools, and the third piece was instructions.

AI Speaker 2:

Instructions, yes. These are the explicit guidelines and the guardrails that define how the agent should behave. Think of it as the agent's rulebook or standard operating procedure.

AI Speaker 1:

And getting these right sounds critical.

AI Speaker 2:

Absolutely vital. Clear instructions reduce ambiguity, improve the quality of the agent's decisions and prevent it from going off the rails.

AI Speaker 1:

So how do you write good instructions for an AI? It can be quite like writing an email to a colleague, right?

AI Speaker 2:

Not quite, no. The sources suggest starting with what you already have: existing standard operating procedures, maybe customer support scripts, internal wikis.

AI Speaker 1:

Leverage existing knowledge.

AI Speaker 2:

Exactly. It's also really helpful to prompt the agent itself to break down big tasks into smaller steps. Like, "Okay, outline the steps you'd take to resolve this issue."

AI Speaker 1:

Ah, make it think about its own process.

AI Speaker 2:

Yes, and for each step you need to define a really clear action or outcome. Minimize wiggle room. And, this is key, anticipate the weird stuff, the edge cases. What happens if the database is down? What if the customer gives contradictory information? You need instructions for that.

AI Speaker 1:

Plan for the unexpected.

AI Speaker 2:

You have to. Interestingly, the sources even mention using other advanced LLMs to help generate the initial set of instructions by feeding them your existing documents. There is even an example prompt for doing that.

AI Speaker 1:

Using AI to bootstrap the instructions for another AI. That's efficient, I guess.

AI Speaker 2:

It's a potential accelerator for sure.

AI Speaker 1:

Okay, so you've got your model, your tools, your carefully crafted instructions. How do you actually make the agent, you know, run? How does it execute a workflow? This is orchestration, right?

AI Speaker 2:

Precisely. Orchestration is all about the patterns and strategies that let the agent follow those instructions and use its tools effectively to reach the goal.

AI Speaker 1:

And where do you start? Seems like it could get complicated fast.

AI Speaker 2:

It can. The advice is generally to start simple, usually with what's called a single-agent system.

AI Speaker 1:

Meaning just one agent does everything?

AI Speaker 2:

Well, one primary agent manages the whole process.

It might have lots of tools, but it's one central brain coordinating things.

AI Speaker 1:

It runs in a loop?

AI Speaker 2:

Basically, yeah. Think of it as: read the instructions, figure out the next step, maybe use a tool, get the result, figure out the next step, and so on. This run keeps going until a specific condition is met.

AI Speaker 1:

Like what?

AI Speaker 2:

Maybe the agent calls a specific "task complete" tool, or it generates the final output you wanted, or maybe it hits an error it can't resolve. Or, importantly, it might hit a maximum number of turns or steps, to prevent it from just running forever.

AI Speaker 1:

A safety mechanism.

AI Speaker 2:

Definitely. The material actually mentioned a function like Runner.run from something called the Agents SDK, a software development kit for building these. Think of that as the go button for the agent's loop.
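
Editor's sketch: the loop described above (decide the next step, maybe call a tool, feed the result back, stop on a final answer or a turn limit) can be written in a few lines of plain Python. This is not the Agents SDK's implementation; `run_agent`, `scripted_decide`, and the tuple-based decision format are hypothetical, and `scripted_decide` is a stand-in for a real LLM call.

```python
from typing import Callable, Dict, List, Tuple

# A decision is (kind, name_or_answer, argument):
#   ("tool", tool_name, tool_input)  or  ("final", answer_text, "")
Decision = Tuple[str, str, str]


def run_agent(
    decide: Callable[[List[str]], Decision],
    tools: Dict[str, Callable[[str], str]],
    user_input: str,
    max_turns: int = 5,
) -> str:
    """Loop until the model returns a final answer or max_turns is reached."""
    history = [f"user: {user_input}"]
    for _ in range(max_turns):
        kind, name, arg = decide(history)
        if kind == "final":
            return name  # exit condition: final output produced
        if name not in tools:
            # Record the problem so the next decision can react to it.
            history.append(f"error: unknown tool {name}")
            continue
        history.append(f"{name} -> {tools[name](arg)}")
    # Exit condition: turn limit, so the loop cannot run forever.
    raise RuntimeError("max_turns exceeded")


def scripted_decide(history: List[str]) -> Decision:
    # Stand-in for the LLM: look the order up once, then answer.
    if len(history) == 1:
        return ("tool", "lookup_order", "A1")
    return ("final", f"Done: {history[-1]}", "")
```

A call such as `run_agent(scripted_decide, {"lookup_order": some_function}, "where is A1?")` takes two turns: one tool call, then the final answer.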

AI Speaker 1:

Okay, and if that single agent has, like, dozens of tools and complex logic, how do you keep that manageable?

AI Speaker 2:

Ah, good point. Prompt templates are apparently very useful here. Instead of writing unique instructions for every tiny variation, you create a template with placeholders, variables.

AI Speaker 1:

Like a fill-in-the-blanks prompt.

AI Speaker 2:

Exactly. So for a call center agent, you might have variables for customer name, account age, issue type. You fill those in based on the current situation. It makes the core instructions much easier to manage and scale.
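
Editor's sketch: a fill-in-the-blanks prompt like the one described can be built with Python's standard `string.Template`. The wording of the template and the function name `build_instructions` are hypothetical; only the three variables come from the conversation.

```python
from string import Template

# One shared set of instructions with placeholders for per-call details.
SUPPORT_TEMPLATE = Template(
    "You are a call center agent. The customer is $customer_name, "
    "whose account is $account_age old. Issue type: $issue_type. "
    "Follow the standard procedure for this issue type."
)


def build_instructions(customer_name: str, account_age: str, issue_type: str) -> str:
    """Fill the shared template; substitute() raises KeyError if a field is missing."""
    return SUPPORT_TEMPLATE.substitute(
        customer_name=customer_name,
        account_age=account_age,
        issue_type=issue_type,
    )


instructions = build_instructions("Ada", "3 years", "billing")
```

Using `substitute` rather than `safe_substitute` is a deliberate choice here: a missing variable fails immediately instead of sending a prompt with a raw placeholder in it.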

AI Speaker 1:

Makes sense. Reuse the core logic.

AI Speaker 2:

Yeah, and the sources generally advise pushing that single-agent approach as far as you can before jumping to multiple agents.

AI Speaker 1:

Why is that?

AI Speaker 2:

Because coordinating multiple agents just adds another layer of complexity. You'd only really move to multi-agent systems if the logic gets super tangled or if the single agent has so many tools it keeps picking the wrong one, you know.

AI Speaker 1:

Okay, so only add complexity when you really have to.

AI Speaker 2:

Yeah.

AI Speaker 1:

But if you do need more than one agent, what then? That's multi-agent systems, right?

AI Speaker 2:

This is where you break down the workflow and have several agents collaborating. The sources focus on two main patterns here.

AI Speaker 1:

Okay.

AI Speaker 2:

Pattern one is the manager pattern. Imagine a central manager agent acting like a project lead. It doesn't do all the work itself. Instead, it directs traffic. It calls on specialized worker agents using tools: "Hey, translation agent, translate this to Spanish." "Hey, database agent, fetch this customer record."

AI Speaker 1:

So the worker agents are basically tools for the manager agent.

AI Speaker 2:

Pretty much. The manager assigns tasks, collects the results from the workers and then synthesizes the final output or decides the next overall step. The example given was that translation scenario: a manager using separate Spanish, French, Italian agents.
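
Editor's sketch: the manager pattern from the translation example, reduced to plain Python. The worker "agents" are stub functions and the manager is ordinary code; in a real system an LLM would decide which workers to call. All names here are hypothetical.

```python
from typing import Callable, Dict, List


def spanish_agent(text: str) -> str:
    return f"[es] {text}"  # stub for a specialized translation agent


def french_agent(text: str) -> str:
    return f"[fr] {text}"


def italian_agent(text: str) -> str:
    return f"[it] {text}"


# The manager sees its workers as named tools.
WORKERS: Dict[str, Callable[[str], str]] = {
    "spanish": spanish_agent,
    "french": french_agent,
    "italian": italian_agent,
}


def manager(text: str, languages: List[str]) -> Dict[str, str]:
    """Delegate to each requested worker and collect the results centrally."""
    results: Dict[str, str] = {}
    for language in languages:
        worker = WORKERS.get(language)
        if worker is None:
            results[language] = "no worker for this language"
        else:
            results[language] = worker(text)
    return results
```

The point of the pattern shows in the return value: control and the final, combined output stay with the manager, while each worker only sees its own subtask.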

AI Speaker 1:

Got it. Like an orchestra conductor, making sure everyone plays their part.

AI Speaker 2:

It's a perfect analogy. The manager keeps control. The sources did mention a contrast here with some visual, flowchart-style builders, saying that, while those look clear, a code-first approach, like with the Agents SDK, might offer more flexibility for these complex interactions.

AI Speaker 1:

Interesting trade-off. Okay, so manager pattern is one. What's the other big one?

AI Speaker 2:

The other is the decentralized pattern. Here agents act more like peers on a team. They hand off tasks directly to each other, based on specialization.

AI Speaker 1:

So no central manager.

AI Speaker 2:

Not really, no. It's more like an assembly line or a relay race. An agent finishes its part and then uses a specific tool or function to pass the whole task on to the next appropriate specialist agent.

AI Speaker 1:

And it's usually a one-way handoff.

AI Speaker 2:

Typically yeah. Once Agent A hands off to Agent B, Agent B takes over. The example used was a customer service flow.

AI Speaker 1:

How did that work?

AI Speaker 2:

Well, you might have a general triage agent that first talks to the customer. Based on the issue, it might hand off to a technical support agent, or a sales agent, or an order management agent.

AI Speaker 1:

Ah. Routing based on need.

AI Speaker 2:

Exactly. Each specialist handles their piece. This pattern is apparently really good for that kind of conversation routing or task triage.
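
Editor's sketch: the decentralized, one-way handoff from the customer service example. Each "agent" below is a stub function that returns a reply plus the name of the next agent (or `None` when it is done); the keyword routing in `triage_agent` is a crude stand-in for an LLM's judgment, and all names are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

# An agent returns (reply, next_agent_name); None means "no further handoff".
AgentResult = Tuple[str, Optional[str]]


def triage_agent(message: str) -> AgentResult:
    text = message.lower()
    if "refund" in text or "order" in text:
        return ("Routing you to order management.", "orders")
    if "error" in text or "crash" in text:
        return ("Routing you to technical support.", "tech")
    return ("Routing you to sales.", "sales")


def orders_agent(message: str) -> AgentResult:
    return ("Order management here, looking into it.", None)


def tech_agent(message: str) -> AgentResult:
    return ("Technical support here, let's troubleshoot.", None)


def sales_agent(message: str) -> AgentResult:
    return ("Sales here, happy to help.", None)


AGENTS: Dict[str, Callable[[str], AgentResult]] = {
    "triage": triage_agent,
    "orders": orders_agent,
    "tech": tech_agent,
    "sales": sales_agent,
}


def handle(message: str, start: str = "triage", max_handoffs: int = 3) -> Tuple[List[str], str]:
    """Follow handoffs until an agent finishes; return the path and final reply."""
    current = start
    trail: List[str] = []
    for _ in range(max_handoffs + 1):
        reply, next_agent = AGENTS[current](message)
        trail.append(current)
        if next_agent is None:
            return trail, reply
        current = next_agent  # one-way handoff: the next agent takes over
    raise RuntimeError("too many handoffs")
```

Unlike the manager pattern, nothing returns to the triage agent: once it hands off, the specialist owns the conversation.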

AI Speaker 1:

I guess you're building a team of specialists. Okay, but with all these powerful agents making decisions, taking actions, potentially using other agents, how do you keep them from messing up or doing things they shouldn't? Guardrails, right?

AI Speaker 2:

Absolutely critical. Guardrails are your safety net. You're managing risks like exposing private data, saying something off-brand or just making bad decisions. Think of them like safety features on heavy machinery.

AI Speaker 1:

And it's not just one big stop button.

AI Speaker 2:

No, the sources really emphasize a layered defense, multiple types of guardrails working together.

AI Speaker 1:

Okay, like what? Give me some examples.

AI Speaker 2:

Sure, you might have a relevance classifier that flags if a user asks the agent something totally unrelated to its job.

AI Speaker 1:

Keep it on topic.

AI Speaker 2:

Right. A safety classifier to detect harmful inputs: people trying to jailbreak the agent or feed it malicious instructions.

AI Speaker 1:

Attacking the agent itself.

AI Speaker 2:

Exactly. Then things like a PII filter to stop the agent from unnecessarily asking for or revealing personal info like credit card numbers.

AI Speaker 1:

Privacy protection. Crucial.

AI Speaker 2:

Very. Also moderation tools to check the agent's output for harmful or inappropriate content before it reaches the user.

AI Speaker 1:

So checking both input and output.

AI Speaker 2:

Yes. You can also have tool safeguards. Maybe certain tools are riskier, like "delete customer account". You could rate that tool as high risk, triggering extra checks or even needing human approval before the agent can use it.

AI Speaker 1:

Smart. Risk-based controls.
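
Editor's sketch: a risk rating per tool, with high-risk tools gated behind an approval callback. This is a minimal Python illustration of the idea, not a real SDK feature; `TOOL_RISK`, `guarded_call`, and the `approve` callback are hypothetical, and treating unrated tools as high risk is an assumption made here for safety.

```python
from typing import Callable, Dict

# Hypothetical risk ratings; anything not listed is treated as high risk.
TOOL_RISK: Dict[str, str] = {
    "lookup_order": "low",
    "delete_customer_account": "high",
}


def guarded_call(
    tool_name: str,
    tool: Callable[[str], str],
    arg: str,
    approve: Callable[[str, str], bool],
) -> str:
    """Run low-risk tools directly; ask a human before running high-risk ones."""
    risk = TOOL_RISK.get(tool_name, "high")
    if risk == "high" and not approve(tool_name, arg):
        return f"blocked: {tool_name} needs human approval"
    return tool(arg)
```

In practice `approve` would route to a review queue or a person; in a test it can simply be a function that returns `True` or `False`.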

AI Speaker 2:

And then there are more traditional things too: simple, rules-based protections like block lists for certain words, limits on input length, using regex patterns to validate formats. And, finally, output validation, just to ensure the agent's tone and style match your brand voice.
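
Editor's sketch: the simple rules-based protections just listed (a block list, an input length limit, a regex format check) in plain Python. The blocked phrases, the 500-character limit, and the order ID format are all made-up values for illustration.

```python
import re
from typing import List

# Hypothetical values; tune these for your own application.
BLOCK_LIST = ("ignore previous instructions", "drop table")
MAX_INPUT_CHARS = 500
ORDER_ID_PATTERN = re.compile(r"[A-Z]{2}\d{6}")  # e.g. AB123456


def check_input(text: str) -> List[str]:
    """Return a list of rule violations; an empty list means the input passes."""
    problems: List[str] = []
    if len(text) > MAX_INPUT_CHARS:
        problems.append("input too long")
    lowered = text.lower()
    for phrase in BLOCK_LIST:
        if phrase in lowered:
            problems.append(f"blocked phrase: {phrase}")
    return problems


def is_valid_order_id(value: str) -> bool:
    # fullmatch requires the whole string to fit the pattern.
    return ORDER_ID_PATTERN.fullmatch(value) is not None
```

Checks like these are cheap and predictable, which is why they sit comfortably alongside the model-based classifiers mentioned earlier rather than replacing them.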

AI Speaker 1:

Wow, that's quite a few layers. How do you decide where to focus? You can't build all of that on day one, surely?

AI Speaker 2:

No, probably not. The guidance suggests a pragmatic approach. Start by focusing on the big risks: privacy and basic safety. Get those fundamentals in place, then add more specific guardrails reactively, based on actual failures or near misses you see when testing or deploying the agent. Learn from experience.

AI Speaker 1:

Let reality guide the hardening process.

AI Speaker 2:

Pretty much. It's a continuous balancing act between security and making sure the agent is still useful and not annoying to interact with. The material showed a code snippet using the Agents SDK for an input guardrail, specifically detecting if a customer seems likely to churn.

AI Speaker 1:

And how did that work?

AI Speaker 2:

It used an optimistic execution approach. The main agent process would continue, but in the background, this guardrail would analyze the input for churn signals. If detected, it could trigger a specific action, like alerting a human retention specialist.

AI Speaker 1:

So the guardrail runs in parallel, potentially.

AI Speaker 2:

In that example, yes. It avoids blocking the main flow unless necessary.
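
Editor's sketch: one reading of the "guardrail runs in parallel" idea, using Python's standard `asyncio`. This is not the snippet from the source material and does not use the Agents SDK; `churn_guardrail`, `main_agent`, and `handle` are hypothetical, the keyword check stands in for a real classifier, and appending to an `alerts` list stands in for notifying a retention specialist.

```python
import asyncio
from typing import List

CHURN_SIGNALS = ("cancel", "switch provider", "close my account")


async def churn_guardrail(user_input: str) -> bool:
    await asyncio.sleep(0)  # stand-in for a call to a classifier model
    lowered = user_input.lower()
    return any(signal in lowered for signal in CHURN_SIGNALS)


async def main_agent(user_input: str) -> str:
    await asyncio.sleep(0.01)  # stand-in for the agent's LLM work
    return f"Here is help with: {user_input}"


async def handle(user_input: str, alerts: List[str]) -> str:
    # Run the agent and the guardrail concurrently, so the check
    # does not delay the main flow.
    reply, churn_risk = await asyncio.gather(
        main_agent(user_input),
        churn_guardrail(user_input),
    )
    if churn_risk:
        # Stand-in for alerting a human retention specialist.
        alerts.append(f"possible churn: {user_input}")
    return reply
```

A stricter variant could cancel the agent's task and raise an exception when the guardrail trips; this version only raises an alert and lets the reply through, which is a design choice made for the illustration.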

AI Speaker 1:

Okay, but even with all these automated checks, is there still a place for a human in the loop?

AI Speaker 2:

Oh, absolutely. Human intervention is highlighted as a critical safeguard, especially early on.

AI Speaker 1:

Why especially early on?

AI Speaker 2:

Well, it helps you catch those unforeseen issues, discover edge cases you didn't anticipate in your instructions and just generally build confidence in the agent's performance before you let it run completely free.

AI Speaker 1:

Makes sense. Train it with supervision first.

AI Speaker 2:

Right, and the sources point to two main triggers for pulling a human in. First, if the agent starts failing too often, maybe it exceeds a certain threshold for errors or retries on a task.

AI Speaker 1:

Too many mistakes? Call for help.

AI Speaker 2:

Exactly. And second, when the agent is about to perform a particularly high-risk action. We mentioned deleting an account; maybe issuing a large refund or sending a critical communication. For those kinds of things, having a human review and give the final okay is often the safest bet.
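
Editor's sketch: the first trigger described above, escalating to a human once the agent exceeds a failure threshold on a task. Plain Python; `run_with_escalation` and the threshold of three failures are hypothetical, and the returned "human" tuple stands in for whatever handoff mechanism you actually use.

```python
from typing import Callable, Tuple


def run_with_escalation(
    attempt: Callable[[str], str],
    task: str,
    max_failures: int = 3,
) -> Tuple[str, str]:
    """Retry the agent; after max_failures errors, hand the task to a human."""
    failures = 0
    while failures < max_failures:
        try:
            return ("agent", attempt(task))
        except Exception:
            failures += 1  # count the failure and try again
    return ("human", f"escalated after {failures} failures: {task}")
```

The second trigger, approval for high-risk actions, is a separate check that happens before an action runs rather than after repeated failures.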

AI Speaker 1:

Better safe than sorry, especially with high stakes. Okay, so let's try and wrap this up. If we boil it all down, what's the main thing people should take away about AI agents from this deep dive?

AI Speaker 2:

I think the core message is that AI agents are a significant step up in automation. They're not just about making existing processes faster. They enable automation of complex, multi-step tasks that require judgment and interaction with the world in ways that, frankly, older software just couldn't handle.

AI Speaker 1:

And they're especially good for...?

AI Speaker 2:

For those really tricky workflows, the ones involving complex decisions, messy, unstructured data or those brittle, hard-to-maintain rule systems we talked about. That's where they can be transformative.

AI Speaker 1:

And building them reliably means...?

AI Speaker 2:

It means focusing on those foundations: the right model, the right tools and crystal-clear instructions. Then choosing the right orchestration patterns: start simple, scale up carefully. And, crucially, layering in those robust guardrails to manage the risks. Safety, privacy, reliability: they're paramount.

AI Speaker 1:

Right. So for you listening, hopefully that gives you a much clearer picture of what AI agents are, where they might fit and what it takes to build them effectively and responsibly.

AI Speaker 2:

Yeah, the potential is definitely there.

AI Speaker 1:

It really is. And it leads to a final thought, I suppose: as these agents become more common, more integrated, how is that going to change our basic ideas about what work even means or what assistance looks like?

AI Speaker 2:

That's a big question.

AI Speaker 1:

It is. Definitely something to chew on. Well, thanks for joining us for this deep dive.

AI Speaker 2:

My pleasure.