
Can AI replace tools like Asana? I spent 15 minutes building an app to find out.

Just 15 minutes, from concept to publish. That’s how long it took me to create a basic version of a work management tool like Asana without writing a single line of code.

The idea struck me one weekday when I asked my team: “What annoys you most about our workflow? What’s the one thing you want me to change?”

One pain point came up: They weren’t huge fans of the Google Doc I’d been using to track our long-term reporting. So, I thought: I have 15 minutes for a coffee break. Let me see how far I get using vibe coding (the concept of using prompts to create software with AI) on Base44, an app-building platform.

While my dashboard app has far fewer features than Asana, the results impressed me. And it taught me an important lesson about the “SaaSpocalypse,” the idea that AI can create products for free that work as well as software products companies would usually sell. Those fears have battered software stocks, with Asana down about 54% this year, and Atlassian, the company behind project management app Trello, down about 59%.

An Asana spokesperson told Business Insider that a productivity dashboard is “a small piece of what companies need to run work effectively, at scale.” They added that Asana’s tools help coordinate work across “many teams and large departments” — including between humans and AI agents.

My project also gave me other reasons why AI may not mean game over for productivity software companies.

From prompt to publish in minutes

I’ve used a range of productivity tools, from Notion to Monday.com and Asana. I find all of them useful, particularly Notion’s high level of customizability and Asana’s flexibility for collaborative teamwork.

They gave me a good idea of what I wanted, so I started off with a simple prompt:

I want to vibe code a slick dashboard for a small team of reporters at Business Insider. I want it to be a slate for enterprise reporting, allowing each user to input their stories to a common dashboard. I also want functionality that lets a user drag and drop entries into a publishing calendar, with daily/weekly/monthly/yearly view toggles.

I plopped this prompt into ChatGPT and asked it to generate a detailed prompt for Base44 and Lovable. These are two of many one-stop shops on the market that let users build and launch the app directly on their platform.

ChatGPT gave me tailor-made prompts for each platform. I refined the prompts by asking for more functionality, then prompted ChatGPT to troubleshoot in advance if the instructions might create any issues on the Lovable and Base44 backends.

After five minutes of planning, I had my detailed prompts locked and loaded.

10 minutes to build

This wasn’t my first time using Base44 or Lovable. I’d used vibe-coding platforms to try to code other apps, including one for tracking collectible cards, so I didn’t face the learning curve a newcomer would.

It was extremely easy to get started. All I did was plug my ChatGPT-generated prompt into both platforms. I walked away for five minutes as the platforms’ chatbots “thought” their way through my request, figuring out how best to execute it.

When I returned to my laptop with a warm mug of tea, I had two complete prototypes, one on each platform.

I dedicated 5 minutes to ensuring the app was secure, adding login and authorization permissions for each reporter and editor. That’s something that’s baked into off-the-shelf apps like Asana, and security has caused headaches for other apps built with AI. I also got picky about customizing the dashboard’s aesthetics, and spent a minute or so changing the font types and colors on each platform.

It was important that the app allowed me to sort projects by progress and see at a glance all the work each reporter had on their plate. I also wanted a broad calendar view to see which stories I was planning to publish in the next month. And I wanted a repository of works-in-progress.


Image: A screenshot of Cheryl Teh's newsroom dashboard on Base44. Photo credit: Cheryl Teh.

It was also essential that the app include tabs for a dashboard, a calendar view, and a section for works in progress.

I also asked the vibe-coding apps to make sure all the dashboard data could be downloadable in one click, so my writers have fast, easy access to their complete story slate.

After some back-and-forth prompting, I got all of these features, though I burned through all my free credits on Lovable before the app was ready to use. On Base44, however, my dashboard for drafts was ready for launch in under 15 minutes and within the free credit limit.

The hype train for vibe coding is real

I’m no coding wizard. I have distinct and embarrassing memories from college of having a minor crisis trying to build a website on Dreamweaver and struggling to build a codebase for my master’s thesis. As I see it, vibe coding has opened the door wide for nontechnical people like me to build the bones of simple applications in a short time.


Image: Lee Chong Ming, Cheryl Teh, and Aditi Bharade. Photo credit: Amanda Goh.

My team and I recently vibe-coded apps on various platforms as an experiment to see how the products stack up. We built several apps, including a thumbnail composite-maker, a writing companion, and an AI-powered photo critic. In most cases, we got these apps to a usable state in under 30 minutes.

Those experiences make it easy to see why AI is such a problem for software companies like Asana. In an interview with Business Insider’s Alistair Barr, Asana’s CEO, Dan Rogers, acknowledged the existential threat that companies like his face. He said this threat also presents a new opportunity for Asana: to go all in on coordinating a workforce in which humans need to work hand in hand with AI.

Still, I’m hesitant to write these firms off. For many users, Asana’s links to email, Slack, and apps like Canva and Zoom remain valuable. That infrastructure, plus things like cybersecurity, is typically baked into off-the-shelf software and lacking in vibe-coded projects. And, obviously, my dashboard doesn’t have the capability to track AI agents and their workflows, as Asana plans to do.

“Orchestrating humans and AI is an incredibly complex thing to do — and that complexity is underscored by the fact that many AI-native startups and foundational model providers use Asana to run their own work,” the Asana spokesperson said.

Since I made the tool in March, my team’s been using it every day, and it’s front-and-center during team pow-wows and at our 1:1s. It’s safe to say my vibe-coded app meets my basic workflow needs — and for free.





AI agents failed at real-world consulting tasks — but Mercor’s CEO says they’re still on track to replace consultants

New research suggests an AI agent can’t fully replace a human consultant — at least for now.

Mercor, the AI training giant, tested how well leading AI models, acting as agents, performed real-world consulting, banking, and legal tasks.

The models failed most of the time, but Mercor’s CEO, Brendan Foody, told Business Insider that the results tell only part of the story.

The consulting tasks in Mercor’s APEX-Agents benchmark were designed to simulate real management consulting work, based on expert surveys and input from consultants at McKinsey, BCG, Deloitte, Accenture, and EY.

Across all task categories, the AI agents successfully completed the tasks less than 25% of the time on the first try. Given eight attempts, the agents could only complete 40% of the tasks. For the management consulting tasks, OpenAI’s GPT 5.2 initially performed the best, completing nearly 23% of the tasks on its first attempt. Anthropic’s Opus 4.6, released this week, performed even better at nearly 33%.

While many of the tasks were not completed, Foody said the success rate for GPT 3 was only 3%, compared to 23% for GPT 5.2. Anthropic’s model went from 13% to 33% on consulting tasks in a matter of months. Foody said he expects the success rate of the models to be closer to 50% by the end of the year.

“These are some of the hardest tasks in the economy that people pay millions of dollars to consulting firms to do, and the models are finally being able to do them with an incredible rate of progress,” Foody said.

AI has already disrupted the consulting industry, changing the way firms hire and make money, but the likelihood of agents displacing consultants grows as the models continue to improve.

McKinsey chief Bob Sternfels recently said the prestigious management consulting firm had 60,000 employees, 25,000 of which were AI agents.

Sternfels also said recently that it’s the first time in McKinsey’s history that the company has been able to grow without growing its head count.

Where AI agents fail in consulting tasks

The frontier models Mercor tested included those from OpenAI, Google, and Anthropic, among others.

One example consulting task instructed the AI agent to “analyze category consumption patterns and market penetration using the Category Penetration Score methodology for PureLife’s portfolio strategy,” asking for several specific outputs in response. The AI agents failed to produce an accurate response.

“No model is ready to replace a professional end-to-end,” the findings concluded.

Mercor found the AI agents were great at research and pretty good at data analysis, Foody said.

Where they consistently got tripped up was on longer-horizon tasks: the biggest indicator that a model might struggle was how long a task would take a human to complete, or how many steps it required.

Unlike a human, Foody said, the models struggle to understand where in a specific file system they should look for the right information, so they often end up looking at the wrong files. They struggle with the planning side of figuring out how to work with multiple tools and cross-referencing files at the same time.

For tasks that can be done in an hour or less or that only require the use of a single tool, the models perform relatively well.

Foody said the agents are almost like interns who might have a 50% pass rate, with the partner still noticing a lot of issues in their work.

Frank Jones, a former KPMG consultant who now works as an expert contractor for Mercor, said that in his experience training AI, the models can get close on certain tasks, but some human refinement is often needed.

He also said the models need very specific prompts because they don’t always understand common expectations or phrases in consulting, like “client-ready.”

“Most consultants, they know what that means. But for AI, I think there’s a lot of nuance in that,” he said.

The AI models are quickly improving

According to Foody, continuing to improve the models doesn’t require a breakthrough — it requires more and better training, which the frontier labs are already investing heavily in.

“That’s why we have so much revenue,” he said, adding, “We’re in the business of replacing human judgment.”

Mercor, whose clients have included OpenAI, Anthropic, and Meta, secured a funding deal in the fall that valued the company at $10 billion. Mercor employs more than 30,000 contractors around the world who help train AI models through tasks like rewriting chatbot responses. Foody previously said the company grew its revenue in 2025 by 4,658%.

Foody said he’s confident that consulting jobs, especially lower-level roles, are among those that will be displaced by AI. He said the next version of the AI agents benchmark will expand to evaluate the whole value chain of a professional services firm: “Instead of evaling the analyst, we’re evaling McKinsey itself.”

Right now, he said, Mercor’s AI agent benchmark tells an appealing story for McKinsey, because the company could say it shows that it can use AI to add value but not replace humans.

“The next version of APEX tells a very scary story for McKinsey,” he said, adding, “In the coming two years, we’re going to have chatbots that are as good as the best consulting firm.”





A Nobel Prize-winning physicist explains how to use AI without letting it replace your thinking

Think AI makes you smarter?

Probably not, according to Saul Perlmutter, a Nobel Prize-winning physicist who was credited with discovering that the universe’s expansion is accelerating.

He said AI’s biggest danger is psychological: it can give people the illusion they understand something when they don’t, weakening judgment just as the technology becomes more embedded in our daily work and learning.

“The tricky thing about AI is that it can give the impression that you’ve actually learned the basics before you really have,” Perlmutter said on a podcast episode with Nicolai Tangen, CEO of Norges Bank Investment Management, on Wednesday.

“There’s a little danger that students may find themselves just relying on it a little bit too soon before they know how to do the intellectual work themselves,” he added.

Rather than rejecting AI outright, Perlmutter said the answer is to treat it as a tool — one that supports thinking instead of doing it for you.

Use AI as a tool — not a substitute

Perlmutter said that AI can be powerful — but only if users already know how to think critically.

“The positive is that when you know all these different tools and approaches to how to think about a problem, AI can often help you find the bit of information that you need,” he said.

At UC Berkeley, where Perlmutter teaches, he and his colleagues developed a critical-thinking course centered on scientific reasoning, including probabilistic thinking, error-checking, skepticism, and structured disagreement. It is taught through games, exercises, and discussion designed to make those habits automatic in everyday decisions.

“I’m asking the students to think very hard about how would you use AI to make it easier to actually operationalize this concept — to really use it in your day-to-day life,” he said.

The confidence problem

One of Perlmutter’s concerns is that AI often speaks with far more certainty than it deserves and can be “overly confident” in what it says.

The challenge, Perlmutter said, is that AI’s confident tone can short-circuit skepticism, making people more likely to accept its answers at face value rather than question whether they’re correct.

That confidence, he said, mirrors one of the most dangerous human cognitive biases: trusting information that appears authoritative or confirms our existing beliefs.

To counter that instinct, Perlmutter said people should evaluate AI outputs the same way they would any human claim — weighing credibility, uncertainty, and the possibility of error rather than accepting answers at face value.

Learning to catch when you’re being fooled

In science, Perlmutter said, researchers assume they are making mistakes and build systems to catch them. For example, scientists hide their results from themselves, he said, until they’ve exhaustively checked for errors, thereby reducing confirmation bias.

The same mindset applies to AI, he added.

“Many of [these concepts] are just tools for thinking about where are we getting fooled,” he said. “We can be fooling ourselves, the AI could be fooling itself, and then could fool us.”

That’s why AI literacy also involves knowing when not to trust the output, he said — and being comfortable with uncertainty, rather than treating AI outputs as absolute truth.

Still, Perlmutter is clear that this isn’t a problem with a permanent solution.

“AI will be changing,” he said, “and we’ll have to keep asking ourselves: is it helping us, or are we getting fooled more often? Are we letting ourselves get fooled?”



