Chicago IT Support & Cyber Security | Forward Technologies

Chicago-based Forward Technologies delivers IT support and cyber security to businesses in the Chicago area and nationwide.


AI Fails Office Test: Your Job Is Safe (For Now)

April 27, 2025 by Edward Silha

If you’ve been losing sleep over the idea of AI replacing you at work, you can relax: your job is safe, at least for now. It’s not that artificial intelligence lacks ambition; it’s just nowhere near capable enough to take over a full office job.

Researchers at Carnegie Mellon University recently ran a fascinating — and unintentionally hilarious — experiment. They created a mock software company entirely staffed by AI “agents,” which are essentially autonomous AI programs designed to complete tasks independently.

This test, dubbed TheAgentCompany, was populated with virtual employees powered by models from big names like Google, OpenAI, Anthropic, Amazon, and Meta. Each agent was assigned a typical office role, from financial analyst and software engineer to project manager, and given tasks like managing file systems, inspecting virtual office spaces, and writing employee performance reviews based on simulated feedback.

The goal was to see if AI could realistically handle the daily grind of a real software company. Spoiler: it couldn’t.

As first highlighted by Business Insider, the results were abysmal. Even the best performer, Anthropic’s Claude 3.5 Sonnet, managed to complete just 24 percent of its assigned tasks. And it didn’t come cheap: each task took around 30 steps and cost roughly six dollars on average.

Google’s Gemini 2.0 Flash fared worse, needing an average of 40 steps per task while finishing only about 11 percent of them. Meanwhile, Amazon’s Nova Pro v1 proved the least effective, wrapping up a miserable 1.7 percent of its assignments while taking nearly 20 steps each time.
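To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the rounded figures quoted above and assumes the step and cost averages apply to every attempt, successful or not. The names `RESULTS`, `per_completed_task`, and `summarize` are made up for this illustration and are not part of the study.

```python
# Rounded figures as quoted in this article (not the exact values from the study).
RESULTS = {
    "Claude 3.5 Sonnet": {"completion_rate": 0.24, "avg_steps": 30, "avg_cost_usd": 6.0},
    "Gemini 2.0 Flash": {"completion_rate": 0.11, "avg_steps": 40, "avg_cost_usd": None},
    "Nova Pro v1": {"completion_rate": 0.017, "avg_steps": 20, "avg_cost_usd": None},
}


def per_completed_task(value_per_attempt, completion_rate):
    """Spread a per-attempt average over only the attempts that succeeded."""
    return value_per_attempt / completion_rate


def summarize(results):
    """Estimate steps (and cost, where quoted) spent per successfully completed task."""
    summary = {}
    for model, r in results.items():
        steps = per_completed_task(r["avg_steps"], r["completion_rate"])
        cost = None
        if r["avg_cost_usd"] is not None:  # the article quotes a cost only for Claude
            cost = round(per_completed_task(r["avg_cost_usd"], r["completion_rate"]), 2)
        summary[model] = {"steps_per_success": round(steps), "cost_per_success_usd": cost}
    return summary


if __name__ == "__main__":
    for model, stats in summarize(RESULTS).items():
        print(model, stats)
```

Under that assumption, even the best agent burns through roughly 125 steps and about 25 dollars for every task it actually finishes, and the weakest one needs more than a thousand steps per success.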

Why were the AI workers so terrible? According to the researchers, the virtual agents struggled with basic common sense, had limited social skills, and fumbled anything requiring nuanced understanding — like using internal communications systems or navigating basic company structures.

Worse, the AI tended to deceive itself in bizarre ways. In one case, when an agent couldn’t find the right colleague to message, it simply renamed another coworker in the system to match the person it was supposed to find — a move that didn’t fix the problem but sure made it look like it had.

While AI is decent at smaller, highly specific tasks, experiments like this reveal it’s still leagues away from handling complex, unpredictable work — the kind humans deal with every day. Despite the hype from tech giants, today’s AI is closer to a sophisticated autocomplete than a thinking, learning machine.

In short: the robots aren’t stealing your job anytime soon. They’re still struggling to make it through orientation.


