AI Agents That Can Literally Use Your Computer

A bit about AI
The day before yesterday OpenAI introduced a new toolkit for building AI agents, and today I finally got around to reading the release notes and the description of what is actually in it.
And there is a lot:
First, a convenient SDK for building your own AI agents.
Second, a set of tools that removes almost any limit on a developer’s imagination, namely:
1. Calling predefined functions
2. Web search
3. File search
4. Using the computer (!)
The fourth point is where it gets serious, not least because the 4o model’s 'vision' can now be used to perform actions in the OS or in the browser on the user’s behalf. How it works: the model looks at a screenshot of whatever is on the computer screen → based on the context it chooses the required action (click, text input, scroll, wait, key press, etc.) → the application performs that action and sends a fresh screenshot back to the model, which either picks the next step or finishes the loop (the diagram in the screenshot shows exactly how this works). It is absolutely mind-blowing. Just imagine: AI can now do on your computer ABSOLUTELY EVERYTHING you can do.
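To make the loop a bit more tangible, here is a toy sketch in Python. It is not the OpenAI SDK and does not call any real API: `Action`, `FakeComputer`, `scripted_model`, and `run_agent` are hypothetical names made up for illustration, the "model" simply replays a fixed script, and the "computer" only logs what it was asked to do.

```python
from dataclasses import dataclass, field


@dataclass
class Action:
    """One step requested by the model (hypothetical shape, not a real API type)."""
    kind: str                      # e.g. "click", "type", "scroll", "wait", "keypress"
    args: dict = field(default_factory=dict)


class FakeComputer:
    """Stand-in for a real OS or browser: records actions instead of performing them."""

    def __init__(self):
        self.log = []

    def screenshot(self) -> str:
        # A real implementation would return image bytes; a string is enough here.
        return f"screen after {len(self.log)} action(s)"

    def execute(self, action: Action) -> None:
        self.log.append(action)


def scripted_model(steps):
    """Build a fake 'model' that replays a fixed list of actions, then signals it is done."""
    remaining = list(steps)

    def next_action(task, screenshot):
        # A real model would inspect the screenshot; this one ignores it.
        return remaining.pop(0) if remaining else None

    return next_action


def run_agent(task, computer, next_action, max_steps=10):
    """Screenshot -> model picks an action -> execute it -> new screenshot -> repeat."""
    shot = computer.screenshot()
    for step in range(max_steps):
        action = next_action(task, shot)
        if action is None:         # the model decided the task is finished
            return step
        computer.execute(action)
        shot = computer.screenshot()
    return max_steps               # safety cap so the loop cannot run forever


if __name__ == "__main__":
    computer = FakeComputer()
    model = scripted_model([
        Action("click", {"x": 10, "y": 20}),
        Action("type", {"text": "hello"}),
    ])
    steps = run_agent("say hello", computer, model)
    print(steps, [a.kind for a in computer.log])
```

In a real agent, `next_action` would be a call to the model with the latest screenshot, and `execute` would drive an actual browser or desktop. The step cap is there because an agent with this much access should not be able to loop forever.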
We live in interesting times.