
Software Testing Podcast - Agentic AI Quality Engineering - The Evil Tester Show Episode 030

Agentic AI Quality Engineering for Software Testing and Development: a discussion with Dragan Spiridonov.

Subscribe to the show:

Watch on YouTube




Show Notes

This podcast episode is a chat with Dragan Spiridonov, who created the Agentic QE Fleet: agentic AI for quality engineering. It can work with code or websites. We explore agentic AI, what it means, and how to use it to improve your software development and testing. We also introduce the Agentic QE Fleet and how it augments Claude Code and Claude Flow.

Agentic AI Quality Engineering

Dragan Spiridonov created the Agentic QE Fleet tooling:

This augments Claude Code to extend agentic processing to cover quality reviews and insights. The AI tooling can use Playwright, Vibium, and other visual automation tools to automate browsers and capture screenshots.

You can learn more about Dragan on his websites:

For a more technical overview of the Agentic QE tool in action:

Other links mentioned in the video:

Transcript

Alan Richardson: Welcome to the Evil Tester Show. In this episode we’re gonna be talking about agentic AI and software development. I am joined by a guest, the creator of the Agentic QE tool, Dragan Spiridonov.

Alan Richardson: And you can find his blog, the Quality Forge, at Forge Quality Dev.

Uh, so hello, Dragan.

Dragan Spiridonov: Hello, Alan. Nice to be here as a guest on your podcast, as I’ve been a fan of yours for a long time.

Alan Richardson: Well, thanks for turning up, ‘cause I’m a fan of your stuff, and it’s making me think and learn new things, and that’s important. And you have an awful lot of websites, so I’m not going to read them all out; I’m just gonna link to them all in the show notes. But I think Forge Quality Dev is the easiest one to remember, and you’ve got your blog on it, and you’re listing lots of experiments in there, because what you’re doing with agentic stuff is creating a tool, but you’re also learning how agentic stuff works, and you’re learning in public and putting all this content out there.

So I thought it might be worth starting if you gave a quick overview of yourself and your work.

Dragan Spiridonov: A quick one? That will be hard. So, in short: as we are now in 2026, I would say I’m 30 years in. It started as a computer service technician; I went into network administration, was teaching computer science in a grammar high school, and was an assistant professor at university, combining, you know, web development, web administration, database administration. In the end, somewhere like 12 or 13 years ago, I ended up in the quality assurance world. All the pieces and things I’d been doing in previous years just clicked, and I have been learning and building on the quality side for, let’s say, 13 years now. For the last nine years I helped one startup build out quality assurance and quality engineering completely; I left that four months ago to start my own consultancy. Now, because of these agentic-related things, I felt obligated, as I’m learning about them, to start sharing more, because I’m noticing that not many people in the industry are embracing this new wave that is coming.

Alan Richardson: So, I mean, that’s hard, because the agent is not the first thing you encounter when you’re doing AI, right? We all start with the chat interfaces, ChatGPT, we do stuff. Uh, and I started with Claude Code in the chat system and created some applications, and then essentially gave up on Claude Code because you can only do so much in there; it runs out of context and then stops building, and then you have to move into the CLI.

So there’s a kind of natural progression through this. So did you work through that progression? Did you start on the chat systems, then other code systems? Or did you just dive straight in with Claude?

Dragan Spiridonov: No, two years ago I started experimenting with prompts, so, you know, how I can prompt an agent. At that time we only had chats to experiment with, so I would apply a tester mindset.

I would give the same prompt to ChatGPT, Gemini, or Claude and compare the results I was getting. A year ago, or I would say maybe a year and a half ago now, I started diving deeper into real prompt engineering: understanding how I can improve those general prompts using role-based things, providing context markups, giving examples. So really going into prompt engineering. And that was still not enough, because I would get maybe 50 or 60% of results that I could keep. The next level was context engineering. I would say a year ago I started diving deeper into AI coding assistant tools, like Cline or Roo Code, that provide their own memory banks and give you a first level of context engineering, where you are grounding your agents in a certain context of your project when you are letting them run in your codebase. That improved the results to over 80% of accepted results. Sometime nine months ago I discovered agentic development, and this approach, agentic engineering, is like the next level after context engineering. And, you know, the results were mind-blowing, and now, I would say, I can keep more than 90% of the things that the agents produce.

Alan Richardson: So we should probably then explain what agent-based AI is and how it differs.

Dragan Spiridonov: Yeah. So with classical ChatGPT, LLM approaches, you are only prompting the LLM, and there is a general behavior of the LLMs as standard transformers: they will try to find the best next word and give you some answer. In the agentic approach, agents first need to observe the environment. Let’s say you give them some goal they need to achieve: they need to observe and collect the data they need. They need to reason; that’s the moment when they’re either calling the LLMs or using some other mechanisms that some of the folks from the agentic foundations are developing, and I’m using things like Reasoning Bank and some locally, specially developed models for specific purposes. And then they’re autonomously executing towards the targeted goal. So you are giving the goal, but you are letting the agent do all the work and heavy lifting and reach the goal. And you, as a human in the loop, are observing the results. And if you implement it correctly, you can implement your agents to do their work in an explainable way, so you can observe what they’re doing while they’re doing it.

Alan Richardson: So that was an interesting description, because what you said was that the agent, when it does reasoning, will call the LLM, which suggests that the agent can do things without calling an LLM, where I’d always just assumed the agent was working as essentially a prompt into that LLM doing stuff.

Dragan Spiridonov: So agents are also programs in their own right. They’re pieces of software that have their own logic. So you can code a lot of deterministic logic into an agent for the types of tasks that you are certain must be done in a deterministic way. But you call the LLM only when there is, like, a new crossroads, or a point where they need to decide in which direction to go, you know, what would be the next step to execute to get to the desired goal.

Alan Richardson: So is the agent more than just the agent’s MD file, or is that the agent? Because the agent’s MD file is readable; it’s English language. So an LLM must have to read it to understand it, but you’ve got code embedded in there to let it do things.

Dragan Spiridonov: that’s not the agent.

Alan Richardson: Okay.

Dragan Spiridonov: It’s only the agent definition that Claude or Codex or Gemini will use to start a subtask, using that like a warm-up prompt. And as part of that call, if in the MD agent definition you have a call to a platform, like in this case my Agentic QE platform, it extends it, and it has additional capabilities on top of what is defined in the MD file.

Alan Richardson: Okay. Right. So the agent is kind of like a thread that starts up. It’s given some additional context: some of the context you feed in through the agent’s file, some of it comes from somewhere else. And you could essentially create specialized agents that do certain things that don’t even necessarily need to use LLMs, depending on the kind of orchestration framework that you have.

Alright, see, this is why I wanted you in here, to explain this stuff properly. And this is just gonna be an overview of what we’re covering, ‘cause you’ve done so many presentations on this; I’m gonna link to some of them in the show notes so people can go through this in more detail, because I’m actually giving you a hard task.

‘Cause I’m asking you to describe all this stuff in words, without any slides, without any IDE open. You’re not allowed to look at code; you’re just gonna have to verbalize it. So we’ll see how we get on. That’s interesting with the agents, and you’ve pretty much built the Agentic QE system on top of Claude at the moment.

So how would you describe the agentic QE system that you have built?

Dragan Spiridonov: It’s a quality-oriented orchestration system with specialized agents and skills, covering the quality-related tasks of the complete software development lifecycle. So, whatever I was doing in the previous 13 or 14 years, like trying to become that comb-shaped person that picked up different specialties along the way: testing the requirements, creating test plans and test strategies, generating test cases on different levels, analyzing the tests for flakiness. So, creating like a combo tool for all the tasks where I, as a tester, previously needed to create separate tools. You know, you are also like a toolsmith; I know you’re a colleague that creates his own tools to help him with testing tasks. So we are using tools to surface certain types of information in certain phases of the project as we are working on it. My idea behind the whole fleet was to create an agentic-based system that provides tools I can use in different phases of the project and the product development, to surface the information I would be interested in as a tester, to identify risks faster, and maybe to uncover things that I’m not a specialist in. So, you know, I have a general knowledge of security testing, but I can imbue my agent with some practices that I am not aware of, and I can use that to surface some information that I would miss if I didn’t have a person with that specialization in my team.

Alan Richardson: Do you think that your background... ‘cause when you described your background, it was an evolution across the entire software development process, where you’re doing development work, you’re doing networking, there’s very technical stuff, you’re teaching it. It wasn’t just an “I test things” role; it was very much a software development role.

Then the testing on top of that showed you the quality aspect that’s required across the software development process. So your agent stack kind of reflects that. It’s not just a testing agent; that’s why the QE, the quality engineering aspect, encompasses that whole thing. So what’s interesting about that is, when you start creating agents that cover topics that you’re not an expert in, how do you evaluate them?

Like, how do you evaluate the security one? How do you evaluate the accessibility agent?

Dragan Spiridonov: I ask experts to evaluate that. So I’m asking other colleagues that I know are in that field to check the results that these types of agents are producing and to give me feedback on what should be improved. And I’m also never satisfied with the first version that I get. That’s why I developed certain skills I’m using now to evaluate the work of my agents while developing my agent fleet. So I’m doing a combo of agents and skills, re-evaluating the results that they are producing.

Alan Richardson: So since you’ve mentioned skills, what’s the difference between a skill and an agent?

Dragan Spiridonov: Oh, this is a metaphor that I think people understand easiest. Everybody knows the Matrix, right? The movie. You have Neo, you have Agent Smith; both of them learned a skill just by uploading it. So a skill is knowledge of how you do some things. An agent is, let’s say, the object that executes that skill. So: I know kung fu, but I need a physical body, or in this case an agent, to execute the skill and fight kung fu with some other agent.

Alan Richardson: Okay. And are they pulled in dynamically when the agent needs them? If you tell the agent to use skill X, then it’ll go and pull it in.

Dragan Spiridonov: Yes. And skills have their own related agents that they are calling, because a skill is not using only one agent. For different types of skills you can use multiple agents, in parallel, sequentially, or hierarchically, depending on the context of the task, again, that you are trying to solve.

Alan Richardson: Okay, so since we’re doing this as audio, we’re gonna start describing the Agentic QE fleet. So, in order to use Agentic QE, you have to install Claude Code and get it working, then you do an npm install -g for the Agentic QE fleet. That installs something you can run, aqe, to initialize information in your project, which then creates a set of agents and other files in there.


Alan Richardson: And then you can start using aqe on a project, and you can do that within Claude and say: Claude, use the quality engineering fleet to do something. What kind of things would you use it to do on your project?
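A rough sketch of that setup flow in shell. The package names and the `aqe init` step here are taken from the conversation, not verified against the current release, so check the project’s README first:

```shell
# Setup sketch based on the conversation; verify names against the
# agentic-qe README before running.

# 1. Install Claude Code first (prerequisite)
npm install -g @anthropic-ai/claude-code

# 2. Install the Agentic QE fleet globally
npm install -g agentic-qe

# 3. Initialize the fleet in a project: creates the agent, skill,
#    and configuration files that Claude Code will pick up
cd my-project
aqe init
```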

Dragan Spiridonov: Um, you can use it for different types of things, depending on where your project is. For example, if your project is in a greenfield phase, so you’re starting with the requirements, you are generating some ideas, you’ve created maybe a product requirements document or some documents that you want, you can use different agents to verify that.

So there is a QE requirements validator you can use. It’ll use INVEST and SMART criteria and other good practices from the field to analyze the requirements, to see if we are really missing something, and a lot of the time it’ll find things that are missing. And there’s Lalitkumar Bhamare; he created a QCSD version of skills, so there is a QCSD ideation swarm: a whole heuristic-based software testing model, like SFDPO and similar analyses. So complete heuristic-based testing applied to the ideation, or to the development, or the refinement. So, depending on which of the phases you’re in, there are different swarm skills.

He calls them swarms, like the QCSD ideation swarm or something like that. There’s a shift-left testing skill if you are on the left, like requirements analysis, feature analysis, and the things before we start implementation; we have skills for analyzing what’s happening there. Okay, now we have requirements: let’s create a test strategy, let’s create a test plan, let’s define the test charters that we want. And that’s usually something you get automatically if you use that QCSD requirements or ideation swarm. It’ll create suggestions for you, not only on which tests to implement at the integration or end-to-end level, but on which test sessions you should explore next to the automation that it provided.

Alan Richardson: Okay. So, because earlier on you mentioned that the agent will go out and kind of find its context: you could instantiate AQE in an existing project, it’ll have all the code, and you could then ask it to do things on the code. Or you could instantiate a project in just any folder.

You could then start adding text files that describe what testing you’ve done, what you might want to do, how the architecture works, and it’ll use that. Or you could then say: AQE, here’s a website, here’s a URL, go away, have a look at it. Then it will go and fetch the content, pull it down, use that as the context, and then do stuff there.

Dragan Spiridonov: Yes. And there are visual agents, so there are MCPs and, let’s say, libraries that the agentic fleet’s agents and skills are using to go and really drive the browser. I’m even using Vibium for some stuff. So some of our visual testers have a fallback: if Vibium is not working because there is some problem on the Mac, it falls back to the agent browser, as the agent browser is already using Playwright as the engine behind it. If that is not available, it’ll just use pure Playwright and run Chromium headless. It’ll capture screenshots, it’ll follow URLs. So whatever is needed, it can also do from that side.

Alan Richardson: And do I have to install the Playwright MCP in Claude first? Or when I use AQE, does it dynamically bring down the MCP?

Dragan Spiridonov: It’s dynamically pulled as a dependency of my npm package. It should pull Playwright as a dependency, but if by some chance it’s not there, you can always ask Claude: please add the Playwright MCP and install Chromium, run it in headless mode. And once you’ve set it up in your environment,

you are good to go.
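If you’d rather add the Playwright MCP by hand than via a prompt, the commands are roughly as follows. This is a sketch based on the Playwright MCP and Claude Code documentation rather than anything stated in the episode, so double-check the package names:

```shell
# Register the Playwright MCP server with Claude Code
# (server package name per the Playwright MCP project; verify in its README)
claude mcp add playwright -- npx @playwright/mcp@latest

# Install the Chromium browser binary that Playwright will drive headless
npx playwright install chromium
```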

Alan Richardson: Okay. So one of the interesting things about your website, the Quality Forge: you have on there, I think it’s on the front page, a list of experiments. So it’s experiments that you’ve done, experiments you’re currently doing, experiments you’re gonna do. Then you blog about those experiments so we can see it.

You also have the, I can’t remember what it’s called, the Serbian Agentics Foundation.

Dragan Spiridonov: The Serbian Agentics Foundation chapter. I’m starting there as an ambassador.

Alan Richardson: Yep. And that’s all streamed on YouTube. And those videos are interesting to watch, because you see the evolution of the tool, and you can see that you’re learning in public and experimenting in public. Some people might not like the videos ‘cause things are going wrong in them, but I like that, because then I can see it’s not just me.

There’s other stuff that we need to learn, and then you see how to overcome it, and you get the thinking process, ‘cause there’s a lot to learn in here. So do you view this as a finished product, or a versioned product? Or is this an experiment that’s ongoing?

Dragan Spiridonov: Um, I mean, it’s definitely a long way from finished, but it’s quite useful currently, and I’m already getting a lot of feedback from people using it on their products and getting value, especially from developers. Testers are, you know, a little skeptical by nature, I would call it like that, and that’s okay. And as people start to use it and experiment and see the results, I’m expecting them to provide feedback, because again, my experience is limited to my knowledge. I can build the fleet best based on my knowledge. And there are contributors already; so Ali started contributing, and as users are using it, when they find problems, they will report a problem and I will try to fix it as my time allows. But I’m trying to allocate time to work daily on this project and keep it live. It brings value in my other projects, as I’m using it to test all the other projects that I’m working on using the agentic approach.

Alan Richardson: So it’s interesting that, I mean, I can understand why people who are more specialized in programming get more out of it than people in the testing field. It’s partly because they’re the ones that are using these tools more; they’re using Claude much more, it’s part of their workflow. They haven’t necessarily studied security testing or accessibility testing, even to the limited extent that we might have done just as part of testing. So the fact that something has a set of encoded information that they can use, and it generates a human-readable report, right? Because that’s one of the things when I’m using the Agentic QE: I’m getting it to generate markdown, human-readable reports.

It’s not doing stuff for me. It’s not, like, raising defects around this. It’s generating a report that I read and evaluate, and then I can see the gaps in it. And I had it running against one of my podcast transcription apps this morning. It was telling me in black and white all the things I knew were wrong with it,

and I have to work on it. So when you’re using this, the Agentic QE part is the quality information and evaluation. Does that mean that you’re using Claude as your development programming tool, and then Agentic QE is an evaluation tool on top of that?

Dragan Spiridonov: Both of them, but not only Claude. I am using an orchestration framework developed by the founder of the Agentics Foundation, and that really enabled me to get here. I use claude-flow for orchestration. But it’s not only orchestration; it’s planning, research, development, CI/CD, GitHub Actions.

So, whatever I need development-related. There are a couple of testing agents there, but they were not on the level that I wanted to have. That’s why I created it; the initial idea of the Agentic QE fleet was as an extension for claude-flow. So when I originally developed it, it was really tightly coupled with claude-flow.

You couldn’t even use it without claude-flow. And then, over the versions, it evolved into what it is now: a standalone platform that can be used not only with Claude, but with any tool that can use MCP. Because my Agentic QE fleet provides an MCP server, any tool with an MCP client can access it. You can also access it through the CLI, so the capabilities of the platform are not limited to the Claude Code interface. But they provide the best results if you use the Claude Code interface, because of the subtask system that it created and the prompts that Claude Code prepares when it’s activating the agents or calling the tools in the sequences I need.
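Since the fleet exposes an MCP server, wiring it into an MCP client should be a short registration step. The `npx agentic-qe mcp` server command below is a hypothetical invocation inferred from the conversation, not taken from the project’s docs, so confirm it before use:

```shell
# Hypothetical: register the Agentic QE fleet's MCP server with Claude Code.
# The "npx agentic-qe mcp" server command is an assumption; check the
# agentic-qe README for the real invocation.
claude mcp add agentic-qe -- npx agentic-qe mcp

# Confirm the server is registered
claude mcp list
```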

Alan Richardson: So that’s interesting, ‘cause I didn’t realize that. So when I installed Agentic QE, I did see after the installation that it said: now type this in Claude to install the MCP. So I could have done that in any of the CLI tools. I could be using that in Gemini, I could use that in OpenCode as an MCP server, so I might try that.

That’s interesting. Because also, at the moment, I am abusing Claude, because I’ve got Claude pointed to OpenRouter, ‘cause you can now configure that. I used to have to use a proxy, but now you can configure Claude to use it. So I’m using very different models in Claude than the Opus ones. It’s running a little bit more slowly for me, but I can use this quite happily on free models. Completely free models, so you don’t need a Claude plan for this. But that’s even more interesting, that you’ve got the MCP server, so you could use that directly somewhere else. So, also: you’ve got Claude Code for coding, then you’ve got Claude Flow for development. So Claude Code gives you the ability to plan and then execute,

but Claude Flow gives you all this requirements analysis, everything.

Dragan Spiridonov: With the claude-flow agents.

Alan Richardson: Okay.

Dragan Spiridonov: So in my approach, if you watch the videos, I’m using certain agents from claude-flow. There is a research agent for doing the research. There is a goal-based agent that creates a goal-oriented action plan for me that I can then use to execute. There are different agents, and recently a domain-driven design approach. You know, we introduced that two months ago, and claude-flow v3 is rewritten using domain-driven design and now heavily uses architecture decision records. I’m now describing why some implementation decision was made, and that keeps agents more on track, going much deeper and not sidetracking while working on the task. So this is, again, one small trick that significantly... okay, “significantly” when we are above 90% means increases from 90 to 95% in the results that you can accept from the agents.

Alan Richardson: So you’re kind of hinting at this memory system that’s in there. One of the risks with a memory system for any kind of AI is that it learns the wrong things, so how do you avoid that?

Dragan Spiridonov: Oh, I have a number... I would need to go to a list of all of them, but I have a lot of novel ways to try to prevent that. There are a couple of things built by, again, other Agentics Foundation members. rUv created a tool called Mean Cut that will try to evaluate whether some of them are really going in the right direction or not. You know, I’m not even qualified enough to explain how some of these things work, because a lot of them are really, like, PhD-level mathematics, or academic-level new research things that are being done, and as soon as something new in the mathematics pops up, the guys are implementing a tool around it. So, let’s remember a couple of the last things that I did. One was a governance system. I implemented a complete governance system that evaluates all the skills we have in the system, and keeps that evaluation against every model change that comes on the market.

And it has a built-in evaluation that tests against three models to see if a change in the model changes the skill’s behavior or expected results. So there is a lot of stuff. You know, sometimes I even need to ask Claude: what do we have here? So I’m creating NotebookLMs from different versions of my repo, because you can copy-paste a URL of a GitHub repo; I give it my release notes page or something like that. And then I’m building NotebookLMs so I can have a history and ask questions related to the things that we developed, because it’s going really fast.

I cannot even track what we built, but there are things that I don’t think anybody is doing right now in the industry. And I have one competitive analysis document, I will share you the link later. I did that just to see how the Agentic QE fleet compares to what exists currently in the market, because...

Alan Richardson: So I will be very interested in that competitive analysis. But since you’ve mentioned the competitive space, what I find interesting: when I first heard about your work on the Agentic QE, in my head it was in the competitive space of testing tooling. And I don’t think it really is; it’s an evaluation of the entire development process.

So what else would you be comparing it to?

Dragan Spiridonov: I can compare parts of the fleet with vendors, and that’s usually when I see, when you see the list, you know: I have this, and each of them has only one box or two boxes or something like that.

You know, only parts of it, because they’re covering different things and they’re focusing. I saw that most of the tools were focused on, like, UI test generation; not to count the tools that developers use that check code quality at the unit and integration level. But testers would focus their attention on UI-level test generation, self-healing and things, and that’s wasted effort, I would call it. There is information, and testing gold nuggets, all over the project and the process that we are working on, and there are places where we can make a bigger difference by removing one of the bottlenecks than by working on creating better UI tests.

Alan Richardson: One of the benefits is that the Agentic QE fleet is open source, so anyone can download it. Just in case anyone wants to: I’ve had the best results installing it on a Mac; in fact, for all the AI tools, I’ve had the best results installing them on the Mac, or on a virtual machine running Linux, or a Linux machine. I’ve tried installing this stuff on Windows directly.

Some of it works, but when some of the coding tools start using tools, they all default to Bash, so you have to run it under the Git Bash CLI. But then when Node gets involved, some of the tools struggle, because the Windows Node installation is non-standard and doesn’t work as well.

I mean, you’ve mentioned that you’re trying to improve that in your tool, to get it working more on a Windows system, so that’ll be interesting. But it’s open source, so anyone can try it, and it’s free.

Dragan Spiridonov: Fork it and adapt it. So this is my experience, and in this case Lalit’s, because he was contributing a lot from his QCSD framework, combined and condensed into the fleet itself. But this is a showcase to show you what’s possible. You can fork it, you can build more specialized agents. My context is not everyone’s context.

I got feedback from one user: “but we are not doing it like that”. Okay, so use it: you have everything, the engine is there. Modify the agent definitions to your context, modify the domain definitions to your context. You can always ask Claude to modify it to your context after forking it. If you see a contribution, something that everybody would benefit from, please create a PR back and let’s include that, and let’s work as a community to make this tool valuable to everybody who wants to use it. Because I think it will not make you ten times better, but it’ll give you some examples from which you can start building and leveling up your knowledge and experience, and using it in your process to get results faster than needing to sit and code for three hours, and checking why your flaky tests run like five times or something like that. So, yeah.

Alan Richardson: So, I mean, I would advise people not to do what I did when I first looked at the tool, which was to download it and go: well, I don’t use Claude, so I’m gonna take all the agents and the skills and just plug them into OpenCode and experiment and try it. That was an interesting experiment, but ultimately I didn’t know whether it was working or not.

So I’d say: try and get it working properly out of the box, then go and review the agents that are in there. That’s an interesting experience, ‘cause you can see what’s codified in there. That actually got me thinking things like: what would it take for me to codify my knowledge? What would an agent that does what I do look like? And that’s an interesting thought experiment, an interesting thing, I think, to try and practice and do.

So how would you recommend people go about learning to use your tool?

Dragan Spiridonov: Uh, I would first direct them to... you already mentioned, I think... oh, you didn’t mention that URL, but you will share it: agentic-qe.dev. It’s a whole explanation of the Agentic QE framework playbook, with the steps and with an assessment of where you are right now. Because it’s grounded in classical quality engineering, you can assess your current maturity status to see how ready your processes are to go agentic. So if you want to implement this in your company, in your team and processes, you can be guided there. But if you just want to experiment, take a look at the examples. There are a number of use cases. You can review the skills, and then go into the repo and find the skill definitions there. You can even use this not only in Claude Code, but in Claude Desktop or in Claude on the web. But instead of npm install, ask Claude to run npx agentic-qe at latest, init, dash dash auto, and it’ll install the Agentic QE fleet in a sandbox environment, and you can run it like that. For the first step, just ask it to explain what you can do with the Agentic QE fleet, and start experimenting with it.
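Transcribed into a command, that sandboxed try-out looks roughly like this; the `init --auto` arguments are quoted from the conversation, so verify them against the package’s help output:

```shell
# One-shot initialization without a global install, as described in the episode.
# The init/--auto arguments are as spoken; check `npx agentic-qe@latest --help`.
npx agentic-qe@latest init --auto
```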

Alan Richardson: I didn’t know you could do that. So can you do that on the free Claude plan as well?

Dragan Spiridonov: I don't think you have Claude Code available in the free plan. You need a Pro license at least.

Alan Richardson: Okay, because you mentioned you could do it on the web.

Dragan Spiridonov: But Claude Code...

Alan Richardson: Okay. Okay. Because Claude on the web will generate code, but that’s different from Claude code. Okay.

Dragan Spiridonov: Claude on the web is just the chat, but with /code you get Claude Code; you need a Pro or Max account to get access to that, I think.

Alan Richardson: So one of the things, when I'm looking at the Agentic QE tool and the whole agent concept: it's one of the motivations for me to look at paid plans of things like Claude, 'cause I've been really avoiding committing to a tool.

Dragan Spiridonov: I need to jump in here. I'm working on really making the fleet vendor independent. And I see now that Claude can be used with any LLM. So Claude itself,

Alan Richardson: Yep.

Dragan Spiridonov: the Agentic QE Fleet already has that capability. You can configure it to use OpenRouter, to use other models.

There is a configuration option; you can ask it to configure itself to use other models. It already uses something called Tiny Dancer that allocates tasks of different complexity to different models, so it won't burn through your costs, even if you're going through the API rather than a paid account. Simple tasks go to the Haiku models, medium-complexity tasks go to Sonnet, and things like test strategy go to the Opus model. So it already allocates tasks to different models automatically when the agents are spun up.
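
The complexity-based routing Dragan describes can be sketched in a few lines. This is a hypothetical illustration of the idea only, not the Tiny Dancer implementation or the fleet's actual configuration; the tier names and the mapping are assumptions.

```python
# Hypothetical sketch of complexity-based model routing, as described above:
# cheap tasks go to a small model so expensive models are reserved for
# work that needs deeper reasoning. Tier names are illustrative.

MODEL_TIERS = {
    "simple": "haiku",    # small fixes, formatting, boilerplate
    "medium": "sonnet",   # typical coding tasks
    "complex": "opus",    # test strategy, architecture, deep reasoning
}

def route_model(complexity: str) -> str:
    """Pick a model tier for a task, defaulting to the middle tier."""
    return MODEL_TIERS.get(complexity, "sonnet")
```

The real fleet presumably classifies task complexity itself when agents are spun up; in this sketch the caller supplies it.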

Alan Richardson: Because the Haiku models,

Dragan Spiridonov: Yeah.

Alan Richardson: yeah, the Haiku models are certainly very affordable through an API,

Dragan Spiridonov: Yep.

Alan Richardson: the Sonnet and other ones are less so. And when I'm looking at the agents now, I imagine this is just gonna burn through my API credits, so I've assigned it all to free models. But then, I'm working on code content.

I don't really care, so they can train their models on it; it doesn't matter to me.

Dragan Spiridonov: I'm working on setting up an Ollama version. I cannot run everything on my Mac, so I'm working on setting up a cloud version where I will have my own model. The capability is already there, but I need a model I have access to and can connect. And when I connect both Claude and my fleet to that model, I'm vendor independent.

Alan Richardson: Yep.

Dragan Spiridonov: That's in the plan soon.

Alan Richardson: I feel kind of vendor independent at the moment, 'cause Claude Code itself is pointing to other models and your tool is working absolutely fine in that instance. So we've covered how to learn the Agentic QE tool. How would people go about learning agent-based AI in general?

Dragan Spiridonov: Oh yeah, a lot of places. But which are the good ones? I'm suggesting, if you have never experienced anything related to this, start with the basics. I watched the Stanford CS 230 set of lectures from the 2025 fall/winter season. Those seven or eight videos give a really good explanation of all the concepts you need to understand about agentic development, and a good starting point to dive deeper into the agentic engineering approaches promoted by the Agentics Foundation. We already have a 101 course, and we are starting work on 201, 301, and 401, so different courses are coming. There will be an accreditation for an agentic engineering role; that's a future role coming. And everyone from the Agentics Foundation is sharing as they're learning in public.

So, next to my Agentic Serbian Foundation YouTube channel, there are regular Agentics Foundation events happening every Thursday and Friday, at 6:00 PM Central European time; that's noon American Eastern time. Links are usually shared on LinkedIn, and people can join there. There is also the videos.agentics.org website: a collection of recordings of all the previous sessions, webcasts, and spaces from the last year and a half of these Thursday and Friday meetups. So there is a lot of content. And a really interesting thing on that website: there's a chat agent you can ask questions.

For instance: how can I use Claude Flow and the QE Fleet for software development or agentic development? And you will get a slide deck, flashcards with the lessons, and links to videos you can access to look deeper into the topics.

Alan Richardson: I’ve started.

Dragan Spiridonov: Yeah, those are good starting points, and then you can follow people on LinkedIn that are sharing things, and you grow your network from there.

Alan Richardson: Yeah, so I've started going through videos.agentics.org. And that summary system is necessary because the videos are not hosted on YouTube; if they were on YouTube, I would just take the link, post it into NotebookLM and do it that way. But again, it gives you more control over what is happening.

So one of the things you mentioned there is that this is leading up to accreditation for a different role. How do you view the use of agentic AI as changing your skillsets and your particular role?

Dragan Spiridonov: This is moving into a new role. For six months or more I've been calling it "orchestrator". It needed a mindset shift: I needed to let go of some things and unlearn some of the things I've been learning for twenty-plus years. What helps me is that critical mindset. People say, "but you will become dumber". No, I'm just thinking on a different level; it's a different level of abstraction. Are you looking at the code?

Yes, I'm looking at the code, meaning if I see hard-coded values, or something that logically shouldn't be there, I catch it at a glance. But as the amount of code grows, the code itself is becoming irrelevant; focus on the outputs instead. Is the code doing what is expected? Is it developed to the quality characteristics that are important for our project? Is the performance below 200 milliseconds? Is the security SOC 2 or ISO certified? Those are the things I'm interested in. I needed to let go of the code, to lean heavily on this, and to let agents do all the heavy lifting from there. What I need to do is the thinking, the controlling, and, as I say, the decisive correction measures when I notice behavior that's trying to go off track; that was happening more before the guardrails. And as you start using the Agentic QE Fleet, there is a whole memory system where agents learn patterns and self-improve over time. Most tasks are probabilistic, because they need LLM reasoning to make decisions at certain points in their task execution; but once the agents have patterns they can recognize and reuse, those tasks become deterministic again. And now, covering the whole SDLC, I feel like I have superpowers. Yeah, but...
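
The outcome-focused checks Dragan lists (performance under 200 milliseconds, security certification) could be expressed as simple quality gates over a report. This is a sketch of the idea only; the report keys and thresholds here are assumptions, not the fleet's actual output format.

```python
# Sketch of outcome-focused quality gates: instead of reading every line of
# code, assert the quality characteristics that matter for the project.
# The report keys and thresholds are illustrative assumptions.

def check_quality_gates(report: dict) -> list[str]:
    failures = []
    if report.get("p95_latency_ms", 0) > 200:        # performance budget
        failures.append("performance: p95 latency above 200 ms")
    if not report.get("security_certified", False):  # e.g. SOC 2 / ISO check
        failures.append("security: certification check not passed")
    return failures
```

An empty list means every gate passed; anything in the list is a prompt for a human to go and look.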

Alan Richardson: I do know that feeling. Yep.

Dragan Spiridonov: Okay, but step back: I'm still not at the point of making it fully autonomous. There are people in the Foundation doing everything without a human in the loop. They just do the final verification: OK, the app is built, this is the new version, deploy to staging, let's go and check. They're not even checking the git diff. And I'm still heavy on diffs.

Alan Richardson: I guess it depends on what kind of app you're building and how you're building it. It sounds as though when you're using Claude Code with Claude Flow, you're getting a lot of unit tests in there at the same time, and potentially requirement-level and domain-level tests too. That gives you confidence, so you can review those tests rather than necessarily reviewing all the code.

Alan Richardson: And then it depends, I guess, how much you care about the design patterns in the code, as to whether you review the code to check those patterns are being met. But also, in the QE Fleet, it will tell you whether the code complexity is too high on certain classes, in which case you can say: go in and refactor this.

So you're getting some prompts back from the QE Fleet as to where you maybe should be concerned, and that can direct some human reviews of the code. But it is interesting when you start using these tools, 'cause when it's a really small prototype, I don't really review the code. I review "does it work", and I go into test mode.

So I've gone into product mode to say "do this thing", then I go into test mode. So all the skills that we've learned in the quality and testing process are really important. The skills that you learn at a more senior engineering level, as a staff engineer leading teams, those are now important, because that's what you're doing with the coding tools.

Dragan Spiridonov: You have a team of agents and you need to coordinate them, give them context, and assign tasks in a clear enough way that they are not lost in what they need to do. And that's a good thing you pointed out: I've noticed that leadership and management skills are also important, along with critical thinking, communication, collaboration, and creativity. Those are the skills that stay human-related; that's something you cannot automate away.

Alan Richardson: But having said that, I did buy a SaaS a couple of weeks ago that I refunded, 'cause I was absolutely sure they were vibe coding it, 'cause there were bugs. And I asked for a feature request; the feature came in about an hour later, and it's like, there's no way this has been tested properly. And then I started using it and it was buggy, and I thought, I'm not using this thing.

So there are many people out there using these tools that do not yet have the attitude that you just mentioned, or that is clearly embodied in the Agentics Foundation. So, when you've been building this tool: I'm conscious of time, but I want to try and squeeze in some extra lessons learned from building it.

'cause a lot of people might not jump into agentic AI yet, but there are probably lessons there that they could apply to any AI tooling they use. So what major lessons have you picked up from the use of AI tools?

Dragan Spiridonov: Yeah: spend your time on understanding prompt and context engineering better. That's the ground base for going a step further, because you need to be familiar with what good prompts look like, and how to set up a good context for your agents to ground them when they start working. Each of the coding agent assistants has its own grounding file with basic rules: CLAUDE.md, AGENTS.md, GEMINI.md. In the root of each project you get a basic grounding file, but don't try to put everything in it. I notice people making that mistake, and it was my mistake too: your CLAUDE.md shouldn't be more than 500 lines. Put the basic rules there, and if you need more than 500 lines, put in links to helper files instead, because a big file takes more context when Claude is loaded; you can watch the size of the context change as your CLAUDE.md grows. That's a basic thing that helps people keep their CLAUDE.md clean and neat. I also think Anthropic just released a constitution; give it that. It works better if it has something that some people from the Agentics Foundation call a North Star: create your golden examples, your golden documentation, so you have your own way of doing things. Give that to the agent and ask it: now create me a prompt so I can repeat this every time, or create me a skill that will do this, or create me an agent that will do exactly this. Start from your experience and then use the agent to augment and amplify it, because it's like a library on wheels: you can ask it things. But at the end, again, it comes back to critical thinking.
So that's why I have a brutal honesty review skill, and a six thinking hats skill, to observe the problem from different perspectives, not only from one trajectory.
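
The "keep your CLAUDE.md under about 500 lines" advice above lends itself to a trivial automated check. This is my own sketch, not part of the fleet; the 500-line threshold comes from the conversation, and the function name is hypothetical.

```python
# Sketch: warn when a grounding file (CLAUDE.md, AGENTS.md, GEMINI.md)
# grows past ~500 lines, per the advice above. Not part of the fleet.
from pathlib import Path

def check_grounding_file(path: str = "CLAUDE.md", max_lines: int = 500) -> str:
    # Count the lines in the grounding file and report against the budget.
    n = len(Path(path).read_text(encoding="utf-8").splitlines())
    if n > max_lines:
        return f"{path}: {n} lines; move detail into linked helper files"
    return f"{path}: {n} lines; within budget"
```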

I also have a devil's advocate agent, but that's a single trajectory. What also works: ask an agent to act as a team of users; it changes the behavior. So learn the small tricks from prompt and context engineering, and start experimenting with building and testing, because until you feel it in your fingers, until you experience that mindset shift, you can't see how it can take you to another level. You need to experience it. Yeah.

Alan Richardson: Thank you. Yeah, I think that's really important, that people get hands-on. The theory is useful, but the hands-on practical use is where you really get it. And the theory can be really intimidating, right? Not everyone has a computer science background or has done the university degrees, but everyone has done the job.

And you can get hands-on and use this. You can install it quite easily if you've got the right Linux-type setup and just start working with it. So that was a useful brain dump of knowledge. Now I think the hardest question: where is this going in the future? How is agentic AI going to impact testing and development?


Dragan Spiridonov: So, yeah: more and more autonomous. Smaller models, specialized models, or specific reasoning banks packed for specific tasks. What I'm seeing already: people from the Foundation are training small coding models, like half a billion or one billion parameters, for specific tasks

and for your tech stack. That's where I see it going. When you have these specialized trained models, or reasoning banks with the specialized agents, you again get the deterministic behavior that most of us expect in software development, but done autonomously with agents.

Alan Richardson: All right, well thanks very much, Dragan. Everyone can find you at Forge Quality Dev, and I will include a whole bunch of other links in the show notes, including some of the other ones you've mentioned. That was incredibly useful for me, and I'm hoping it'll be useful for everyone else.

But the key point is:

Alan Richardson: get hands-on, experiment, and have a look at agentic-qe.dev. Try and use it; install it even at a very loose level, just on a website, you don't even need a code base. Just see how you get on. All right, thanks very much.

Dragan Spiridonov: Thank you, Alan for having me.