Welcome to the tenth edition of Barefoot Bytes. In this edition, I want to talk about AI Agents. I went 42 full trips around the sun without using the word “agentic”, but I’ve already used it like three times today. So what are agents? What can they do? And where are we headed with them?
IBM defines an agent as “a system or program that is capable of autonomously performing tasks on behalf of a user or another system.” More simply, an AI agent autonomously takes actions to achieve goals. Traditional programming requires humans to tell a computer explicitly how to do something, while an agent only needs to be told what to do.
There are a few key components of agents. I’m going to use an LLM-based agent just to simplify the discussion, but there can be others. The key components are Knowledge, Tools, and Goals.
Knowledge in this context represents what a model already knows going in. This might mean what is encoded in the billions of parameters an LLM learned during training. It also might mean some supplementary content layered on top, like proprietary company data. It can also include information about how to use tools, like API documentation.
Tools here represent capabilities that an LLM might have external to its own knowledge. This could include searching the web for more information, making an API call to a third party, or even using your computer on your behalf. These are often pre-defined, but it is possible for an agent to define new tools on its own, given the right environment. Remember when ChatGPT couldn’t tell us who won last night’s baseball game because of the cutoff date of its training data? Web search was one of the first tools introduced by OpenAI to address that limitation, and tool calling is a defining characteristic of an AI agent.
And finally, goals represent the purpose of the agent. What is it trying to accomplish? These would typically be set externally by a human or perhaps another agent.
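To make the three components concrete, here is a minimal sketch in Python. The `Agent` class, its field names, and the `fake_web_search` stub are all hypothetical names made up for illustration; real agent frameworks organize these pieces in their own ways.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Agent:
    """Hypothetical container for the three components described above."""
    goal: str                                            # what the agent is trying to accomplish
    knowledge: List[str] = field(default_factory=list)   # supplementary facts layered on top of the model
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)  # named external capabilities

    def can_use(self, tool_name: str) -> bool:
        """Report whether a tool with this name has been registered."""
        return tool_name in self.tools


def fake_web_search(query: str) -> str:
    """Stand-in for a real search tool; it only echoes the query back."""
    return f"(stub) results for: {query}"


agent = Agent(
    goal="Find out who won the Padres game last night",
    knowledge=["In baseball, the team with more runs wins."],
    tools={"web_search": fake_web_search},
)
```

The LLM itself is deliberately left out of this sketch. The point is only that knowledge, tools, and goals are separate inputs that a human (or another agent) hands over before any work starts.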
So take a simple example, “Who won the Padres game last night?” An agent might put together the following plan:
Step 1: Call a tool to figure out today’s date (funny quirk about LLMs – they don’t inherently know the date)
Step 2: Calculate the date and time of “last night”
Step 3: Call the web search tool to find the team that played the Padres and the final score
Step 4: Check knowledge to understand how scoring in baseball works
Step 5: Determine the winner, based on the scoring rules, by checking which team has the higher score
Step 6: Generate a response for the user that the Padres did, in fact, win the game last night.
This agent used a combination of knowledge and tool calling to meet my goal of figuring out if the Padres won last night.
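The six-step plan above can be sketched in a few lines of Python. Treat this as an illustration under loud assumptions: the function names are hypothetical, the date is hard-coded so the example is repeatable, and the box score is made-up placeholder data rather than a real result. In a real agent the LLM would decide which steps to take; here the plan is written out by hand.

```python
from datetime import date, timedelta


def get_today() -> date:
    """Tool (stub): report today's date. Hard-coded placeholder, not a real clock."""
    return date(2025, 6, 15)


def web_search(query: str) -> dict:
    """Tool (stub): stand-in for a real web search. Returns a made-up box score."""
    return {"Padres": 5, "Opponent": 3}


def pick_winner(scores: dict) -> str:
    """Knowledge: in baseball, the team with more runs wins."""
    return max(scores, key=scores.get)


def answer_who_won(team: str) -> str:
    today = get_today()                                             # Step 1: call the date tool
    last_night = today - timedelta(days=1)                          # Step 2: work out "last night"
    scores = web_search(f"{team} score {last_night.isoformat()}")   # Step 3: call the search tool
    winner = pick_winner(scores)                                    # Steps 4 and 5: apply the scoring rule
    return f"The {winner} won on {last_night.isoformat()}."         # Step 6: respond to the user


print(answer_who_won("Padres"))
```

Swapping the two stubs for a real clock and a real search API is where the engineering effort goes; the overall shape of the loop stays the same.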
Agents have evolved over time, and the future will likely see further iterations. We started with reflexive agents like your thermostat. We then moved to model-based agents like a Roomba or a Tesla. Now we’re talking about goal-based agents. Moving forward we’ll likely see utility-based agents that still achieve a goal, but optimize for a particular reward function while doing so, like maximizing the number of dollars earned or minimizing the amount of human suffering. And then finally we’ll have learning-based agents that continuously incorporate new information back into their core knowledge.
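The gap between the first and fourth stops on that list is easier to see side by side. This is a toy sketch with hypothetical function names and made-up numbers: the reflexive agent maps a reading straight to an action using a fixed rule, while the utility-based agent scores every option it has and picks the best one.

```python
def reflex_thermostat(temp_f: float, setpoint_f: float = 70.0) -> str:
    """Reflexive agent: a fixed rule maps the current reading to an action."""
    if temp_f < setpoint_f - 1:
        return "heat"
    if temp_f > setpoint_f + 1:
        return "cool"
    return "idle"


def utility_agent(expected_reward: dict) -> str:
    """Utility-based agent: choose the action with the highest expected reward."""
    return max(expected_reward, key=expected_reward.get)


# Made-up reward estimates (dollars earned) for three candidate actions.
options = {"email_campaign": 1200.0, "cold_calls": 800.0, "do_nothing": 0.0}

print(reflex_thermostat(65.0))
print(utility_agent(options))
```

The hard part of a utility-based agent is producing trustworthy reward estimates in the first place, which this sketch simply assumes.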
In my lifetime I believe that I will have my very own Jarvis (the AI from Iron Man), but his name will be Gunther. Gunther will manage a team of specialty agents that he can call upon for all sorts of work and personal tasks. He will know everything about me, my style, preferences, strengths and weaknesses. He will have ingested everything I’ve ever written. He will know the composition of my family and my social and professional networks.
Hey Gunther …
Order Chinese takeout. Prepare a competitive analysis for my new startup. Remind Kep that we’re playing Padel tonight. Schedule a 30-minute call with Peter. Plan a trip to Scotland next summer with my wife and kid. Monitor my biosensors and plan meals for this week to optimize health. Deploy some capital in a moderately risk-averse portfolio.
Now throw in a robot. Hey Gunther …
Cook us dinner. Go grab the mail and bring me anything important. I can’t find my keys. Can you fix this hard drive? Take my son to school.
Now zoom out beyond just a personal or work agent …
Write and execute this three-year business plan. Build and operate this hospital.
That escalated quickly. Just to even see a potential path to some of those things is incredible. This is the beginning of a wild ride and we have front row seats. What a time to be alive.
While it’s fun to imagine what the future could hold, I always try to bring things back to the practical. What can you do with agents right now? It’s simple – workflow automation. Closely examine how you and others at your organization spend your time. If people are reading or writing long documents, updating internal systems, doing data entry, generating marketing content, providing customer support, answering HR questions, or generally performing fairly repetitive tasks, then it’s never been easier to automate those things. Not sure how? Reach out and I would be happy to point you in the right direction.
Well that’s a wrap! I’d like to take this last minute to thank all of you for reading and giving great feedback, which motivates me to keep writing. It is much appreciated.