Mike Monegan saw the writing on the wall in January. For weeks, he’d had difficulty sleeping.
Monegan, vice president of product management at Australian artificial intelligence software vendor Appen, and many of his colleagues had been doing their best to keep things afloat as tech behemoths slashed their spending on the company’s AI training data.
Five customers — Microsoft, Apple, Meta, Google, and Amazon — accounted for 80% of Appen’s revenue, and this was supposed to be the company’s moment to shine. Across the industry, companies were committing to hefty investments in generative AI, trying to ensure they weren’t left behind in the sudden race to embed the latest large language models into all of their projects.
Appen has a platform of about one million freelance workers in more than 170 countries. In the past, it’s used that network of people to train some of the world’s leading AI systems, working for a star-studded list of tech companies, including the top consumer names as well as Adobe, Salesforce and Nvidia.
But just as AI’s big moment was arriving, Appen was losing business — and fast. Revenue declined 13% in 2022, a drop the company attributed in part to “challenging external operating and macro conditions.” Former employees, who asked not to be named for fear of retaliation, told CNBC that the company’s current struggle to pivot to generative AI reflects years of weak quality controls and a disjointed organizational structure.
In mid-December, Appen announced a change at the top. Armughan Ahmad, a 25-year veteran of the tech industry, would be taking over as CEO, replacing Mark Brayan, who had helmed the company for the prior seven years. Upon starting the following month, Ahmad called generative AI “one of the most exciting advancements” in the industry and noted that he “was happy to learn that our team has already put the technology to work on our marketing content.”
Monegan wasn’t buying it. He told CNBC that after his first meeting with Ahmad he began looking for another job. Monegan had been watching Appen fall behind, and he didn’t see Ahmad, whose LinkedIn profile says he’s based in Seattle, presenting a realistic path out.
Monegan left in March to help start his own company.
The numbers seem to prove him right.
Despite Appen’s enviable client list and its nearly 30-year history, the company’s struggles have intensified this year. Revenue in the first half of 2023 tumbled 24% to $138.9 million, amid what it called a “broader technology slowdown.” The company said its underlying loss widened to $34.2 million from $3.8 million a year earlier.
“Our data and services power the world’s leading AI models,” Ahmad said on last week’s earnings call. “However, our results are far from satisfactory. They reflect the ongoing global macroeconomic pressures and continued slowdown in tech spending, particularly amongst our largest customers.”
In August 2020, Appen’s shares peaked at AU$42.44 on the Australian Securities Exchange, sending its market cap to the equivalent of $4.3 billion. Now, the stock is trading at around AU$1.52, for a market cap of around $150 million.
‘Resetting the business’
Along with its troubled financials, the company is dealing with a string of executive departures. Helen Johnson, who was appointed finance chief in May, left after just seven weeks in the role. Marketing chief Fab Dolan, whose departure was announced on the earnings call, spent just over two months in the position. The departure of Chief Product Officer Sujatha Sagiraju was also just announced.
“In the environment of a turnaround, we anticipate changes,” a representative for Appen told CNBC.
Elena Sagunova, global human resources director, left in April, followed by Jen Cole, senior vice president of enterprise, in July and Jukka Korpi, senior manager of business development for the Europe, Middle East and Africa Region, in August.
Still, Ahmad said on the earnings call that the company remains “laser-focused on resetting the business” as it pivots to providing data for generative AI models. He added that “the benefits from our turnaround have yet to show meaningful results” and that “the revenue growth does not offset the declines we are experiencing in the remainder of the business.”
Appen’s past work for tech companies has been on projects like evaluating the relevance of search results, helping AI assistants understand requests in different accents, categorizing e-commerce images using AI and building out map locations of electric vehicle charging stations, according to public information and interviews conducted by CNBC.
Appen has also touted its work on search relevance for Adobe and on translation services for Microsoft, as well as in providing training data for lidar companies, security applications and automotive manufacturers.
Depending on the data a customer requires, an Appen freelancer might sit at a laptop labeling or categorizing images or search results, or use Appen’s mobile application to capture the sounds of glass breaking or background noise in a vehicle.
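For illustration, the sketch below shows the kind of task record and worker response such a labeling platform might pass around. It is a hypothetical example, assuming invented field names and values, and is not Appen’s actual schema or tooling.

```python
# Hypothetical sketch of a crowd-labeling task and a freelancer's response.
# Field names are invented for illustration; this is not Appen's schema.
from dataclasses import dataclass


@dataclass
class LabelingTask:
    task_id: str
    image_url: str
    question: str        # e.g. "Is this a hot dog?"
    options: list[str]   # allowed answers


@dataclass
class Annotation:
    task_id: str
    worker_id: str
    answer: str


task = LabelingTask(
    task_id="t-001",
    image_url="https://example.com/photo.jpg",
    question="Does this image show a hot dog?",
    options=["yes", "no"],
)

# A freelancer reviews the image and submits one of the allowed answers.
annotation = Annotation(task_id=task.task_id, worker_id="w-123", answer="yes")
print(annotation)
```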
During Appen’s growth years, that manual collection of data was key for the state of AI at the time. But today’s LLMs have changed the game. The underlying models behind OpenAI’s ChatGPT and Google’s Bard scour the digital universe to provide sophisticated answers and advanced images in response to simple text queries.
To fuel their LLMs, which run largely on state-of-the-art processors from Nvidia, companies are spending less on Appen and a lot more on competing services that already specialize in generative AI.
Ahmad told CNBC in a statement that, while the company’s financials are being hurt by the economy and a reduction in spending by top customers, “I’m confident that our disciplined focus and the early progress we are making to turn around the business will enable us to capture value from the growing generative AI market and return Appen to growth.”
Cash-strapped
Ahmad said on the earnings call that there’s customer interest in niche types of data that’s more difficult to acquire. For Appen, that would mean finding specialists in particular types of information that can bolster generative AI systems. That also means it needs to expand its base of workers while simultaneously finding ways to preserve cash.
Appen’s cash on hand was $55 million as of June 30, thanks to proceeds from a $38 million equity raise. Prior to the new infusion, cash had been dwindling, from $48 million at the end of 2021 to $23.4 million a year later.
Even before the generative AI transition, wages for Appen’s data labelers were a sticking point. In 2019, Google said its contractors would need to pay their workers $15 an hour. Appen didn’t meet that requirement, according to public letters written by some workers.
In January, after months of organizing, raises went into effect for Appen freelancers working on the Bard chatbot and other Google products. The rates went up to between $14 and $14.50 per hour.
That wasn’t the end of the story. In May, Appen was accused of squeezing freelancers focused on generative AI, allotting strict time limits for time-consuming tasks such as evaluating a complex answer for accuracy. One worker, Ed Stackhouse, wrote a letter to two senators stating his concerns about the dangers of such constrained working conditions.
“The fact that raters are exploited leads to a faulty, and ultimately more dangerous product,” he wrote. “Raters are not given the time to deliver and test a perfect AI model under the Average Estimated Time (AET) model they are paid for,” a practice that “leads raters to spot check only a handful of facts before the task must be submitted,” he added.
In June, Appen faced charges from the U.S. National Labor Relations Board after allegedly firing six freelancers who spoke out publicly about frustrations with workplace conditions. The workers were later reinstated.
Appen employees who spoke to CNBC on behalf of the company in recent months said the rapidly changing AI environment poses challenges. Erik Vogt, vice president of solutions at Appen, told CNBC in May that the sector was in a state of flux.
“There’s a lot of uncertainty, a lot of tentativeness for experimentation, and new startups trying out new things,” Vogt said. “How to make new use cases a reality usually means acquiring unusual data – sometimes astronomical volumes of data, or highly rare resource types. There’s a need for specialists in a wide range of different capabilities.”
For recent projects, Vogt said Appen needed to enlist the help of doctors, lawyers and people with experience using project-tracking software Jira.
“People you wouldn’t necessarily think of as being gig workers, we had to engage with these specialists for these expert systems in a way there hadn’t been a huge demand for before,” Vogt said.
Kim Stagg, Appen’s vice president of product, said the work required for generative AI services was different from the work the company has needed to do in the past.
“A lot of work we’ve done has been around the relevance of search for big engines – a lot of those are more, ‘Is this a hot dog or not,’ ‘Is this a good search or not,'” Stagg said. “With generative AI, we see a different demand.”
One focus Stagg highlighted was the need to find “what we would call really good quality creative people,” or those who are particularly good with language. “And another is domain experts: sports, hobbies, medical.”
However, former employees expressed deep skepticism of Appen’s ability to succeed given its tumultuous position and the executive shuffling taking place. Part of the problem, they say, is the organizational structure.
Appen was divided into a global business unit and an enterprise business unit, which were at one time made up of about five clients and more than 250 clients, respectively. Each had a separate team, and communication between them was limited, creating inefficiencies internally, ex-employees said. One former manager said it felt like two separate companies. Appen said it integrated the global and enterprise business units in the last quarter.
Appen’s plunging stock price suggests that investors don’t see its existing offerings transferring to the generative AI space.
Lisa Braden-Harder, who served as CEO of Appen until 2015, echoed that sentiment, telling CNBC that “data-labeling is completely different” than how data collection works in a ChatGPT world.
“I am not clear that their past experience of data labeling is a competitive advantage now,” she said.
Former Appen employees say the company has in recent years been dealing with quality control problems, hurting its ability to provide valuable training data for AI models. For example, one former department manager said people would annotate rows of data using automated tools instead of the manual data labeling required for accuracy, which is what clients thought they were buying.
Customers’ expectations of a “clean data set” were often not met, the person said, leading them to leave Appen for competitors such as Labelbox and Scale AI. When the manager started at the company, there were more than 250 clients in the enterprise business unit. Within 18 months, he said, that number had dwindled to fewer than 100.
Appen told CNBC that in the first half of the year it “secured 89 new client wins.”
Monegan recalled that many customer relationships were “hanging on by a thread.”
Following the earnings report, Canaccord Genuity analysts cut their price target on Appen by more than half to AU$1.56. One concern the analysts referenced was a 34% reduction in spending by Appen’s top customer, a number that Appen wouldn’t confirm or deny.
The more existential problem, the analysts note, revolves around Appen’s effort to win business while also looking to cut costs by 31% in fiscal 2023.
“That seems like a brutal level of cost reduction,” they wrote, as the company tries to stabilize its “core revenue base while growing a business around Generative AI.”
Beta Technologies shares surged more than 9% after air taxi maker Eve Air Mobility announced a deal worth up to $1 billion to buy motors from the Vermont-based company.
Eve, which was started by Brazilian airplane maker Embraer and now operates under Eve Holding, said the manufacturing deal could be worth as much as $1 billion over 10 years. The Florida-based company said it has a backlog of 2,800 vehicles.
Shares of Eve Holding gained 14%.
Eve CEO Johann Bordais called the deal a “pivotal milestone” in the advancement of the company’s electric vertical takeoff and landing, or eVTOL, technology.
“Their electric motor technology will play a critical role in powering our aircraft during cruise, supporting the maturity of our propulsion architecture as we progress toward entry into service,” he said in a release.
Amazon’s cloud unit on Tuesday announced AI-enabled software designed to help clients better understand and recover from outages.
DevOps Agent, as the artificial intelligence tool from Amazon Web Services is called, predicts the cause of technical hiccups using input from third-party tools such as Datadog and Dynatrace. AWS said customers can sign up to use the tool Tuesday in a preview, before Amazon starts charging for the service.
The AI outage tool from AWS is intended to help companies more quickly figure out what caused an outage and implement fixes, Swami Sivasubramanian, vice president of agentic AI at AWS, told CNBC. It’s what site reliability engineers, or SREs, do at many companies that provide online services.
SREs try to prevent downtime and jump into action during live incidents. Startups such as Resolve and Traversal have started marketing AI assistants for these experts. Microsoft’s Azure cloud group introduced an SRE Agent in May.
Rather than waiting for on-call staff members to figure out what happened, the AWS DevOps Agent automatically assigns work to agents that look into different hypotheses, Sivasubramanian said.
“By the time the on-call ops team member dials in, they have an incident report with preliminary investigation of what could be the likely outcome, and then suggest what could be the remediation as well,” Sivasubramanian told CNBC ahead of AWS’ Reinvent conference in Las Vegas this week.
Commonwealth Bank of Australia has tested the AWS DevOps Agent. In under 15 minutes, the software found the root cause of an issue that would have taken a veteran engineer hours, AWS said in a statement.
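As a rough illustration of the pattern Sivasubramanian describes, the sketch below fans out candidate hypotheses to investigation routines in parallel and compiles a preliminary incident report before an on-call engineer joins. This is a generic sketch, not AWS code: the function names, confidence scores and remediation text are all invented for the example.

```python
# Generic sketch of agent-style root-cause triage: investigate several
# hypotheses in parallel, then assemble a preliminary incident report.
# Not AWS code; all names and values here are placeholders.
from concurrent.futures import ThreadPoolExecutor


def investigate(hypothesis: str) -> dict:
    # In a real system this step would query metrics and logs (for example
    # from Datadog or Dynatrace) and score how well the evidence fits.
    evidence = f"checked dashboards and recent deploys for: {hypothesis}"
    return {"hypothesis": hypothesis, "evidence": evidence, "confidence": 0.5}


def build_incident_report(hypotheses: list[str]) -> dict:
    with ThreadPoolExecutor() as pool:
        findings = list(pool.map(investigate, hypotheses))
    likely = max(findings, key=lambda f: f["confidence"])
    return {
        "likely_cause": likely["hypothesis"],
        "findings": findings,
        "suggested_remediation": "roll back the most recent deployment",  # placeholder
    }


if __name__ == "__main__":
    report = build_incident_report(
        ["bad deployment", "database connection exhaustion", "upstream DNS failure"]
    )
    print(report["likely_cause"])
```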
The tool relies on Amazon’s in-house AI models and those from other providers, a spokesperson said.
AWS has been selling software in addition to raw infrastructure for many years. Amazon was early in renting out server space and storage to developers, starting in the mid-2000s, and technology companies such as Google, Microsoft and Oracle have followed.
Since the launch of ChatGPT in 2022, these cloud infrastructure providers have been trying to demonstrate how generative AI models, which are often trained in large cloud computing data centers, can speed up work for software developers.
Over the summer, Amazon announced Kiro, a so-called vibe coding tool that produces and modifies source code based on user text prompts. In November, Google debuted similar software for individual software developers called Antigravity, and Microsoft sells subscriptions to GitHub Copilot.
Amazon has found a way to let cloud clients extensively customize generative AI models. The catch is that the system costs $100,000 per year.
The Nova Forge offering from Amazon Web Services gives organizations access to Amazon’s AI models in various stages of training so they can incorporate their own data earlier in the process.
Already, companies can fine-tune large language models after they’ve been trained. The results with Nova Forge will lean more heavily on the data that customers supply. Nova Forge customers will also have the option to refine open-weight models, but training data and computing infrastructure are not included.
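The sketch below is a conceptual illustration of that distinction only. It is not the Nova Forge API, which is not documented here, and the training function is a placeholder: the point is the difference between fine-tuning a finished model on customer data and blending that data into an earlier training stage.

```python
# Conceptual sketch: fine-tuning after pretraining vs. mixing customer data
# into an earlier training stage. The train() function is a placeholder for
# an expensive training run; nothing here reflects AWS's actual service.

def train(model_state: dict, corpus: list[str], stage: str) -> dict:
    # Record which stage ran and how much data it saw.
    stages = model_state.get("stages", []) + [(stage, len(corpus))]
    return {**model_state, "stages": stages}


public_corpus = ["web text ..."] * 1000
customer_corpus = ["proprietary domain text ..."] * 100

# (a) Conventional approach: pretrain on public data, then fine-tune afterward.
base = train({}, public_corpus, stage="pretraining")
fine_tuned = train(base, customer_corpus, stage="fine-tuning")

# (b) Earlier-stage customization: blend customer data into the training mix
#     so it shapes the model before any post-training step.
blended = public_corpus + customer_corpus * 5  # oversample the smaller corpus
mid_trained = train({}, blended, stage="mid-training")

print(fine_tuned["stages"])
print(mid_trained["stages"])
```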
Organizations that assemble their own models might end up spending hundreds of millions or even billions of dollars, making Nova Forge more affordable by comparison, Amazon said.
AWS released its own models under the Nova brand in 2024, but they aren’t the first choice for most software developers. A July survey from Menlo Ventures found that by the middle of this year, Amazon-backed Anthropic controlled 32% of the market for enterprise LLMs, followed by OpenAI with 25%, Google with 20% and Meta with 9%. Amazon’s Nova had a share of less than 5%, a Menlo spokesperson said.
The Nova models are available through AWS’ Bedrock service for running models on Amazon cloud infrastructure, as are Anthropic’s Claude 4.5 models.
“We are a frontier lab that has focused on customers,” Rohit Prasad, Amazon head scientist for artificial general intelligence, told CNBC in an interview. “Our customers wanted it. We have invented on their behalf to make this happen.”
Nova Forge is also in use by internal Amazon customers, including teams that work on the company’s stores and the Alexa AI assistant, Prasad said.
Reddit needed an AI model for moderating content that would be sophisticated about the many subjects people discuss on the social network. Engineers found that a Nova model enhanced with Reddit data through Forge performed better than commercially available large-scale models, Prasad said. Booking.com, Nimbus Therapeutics, the Nomura Research Institute and Sony are also building models with Forge, Amazon said.
Organizations can request that Amazon engineers help them build their Forge models, but that assistance is not included in the new service’s $100,000 annual fee.
AWS is also introducing new models for developers at its Reinvent conference in Las Vegas this week.
Nova 2 Pro is a reasoning model that performs at least as well as Anthropic’s Claude Sonnet 4.5, OpenAI’s GPT-5 and GPT-5.1, and Google’s Gemini 3.0 Pro Preview in Amazon’s tests, the company said. Reasoning involves running a series of additional computations, which can take extra time, to produce better answers. Nova 2 Pro will be available in early access to AWS customers with Forge subscriptions, Prasad said, which means Forge customers and Amazon engineers will be able to try it at the same time.
Nova 2 Omni is another reasoning model that can process incoming images, speech, text and videos, and it generates images and text. It’s the first reasoning model with that range of capability, Amazon said. Amazon hopes that, by delivering a multifaceted model, it can lower the cost and complexity of incorporating AI models into applications.
Tens of thousands of organizations are using Nova models each week, Prasad said. AWS has said it has millions of customers. Nova is the second-most popular family of models in Bedrock, behind Anthropic’s, Prasad said.