On Wednesday, Google previewed what could be one of the largest changes to the search engine in its history.
Google will use AI models to combine and summarize information from around the web in response to search queries, a product it calls Search Generative Experience.
Instead of “ten blue links,” the phrase that describes Google’s usual search results, Google will show some users paragraphs of AI-generated text and a handful of links at the top of the results page.
The new AI-based search is being tested now with a select group of users and isn’t widely available yet. But website publishers are already worried that if it becomes Google’s default way of presenting search results, it could hurt them by sending fewer visitors to their sites and keeping those visitors on Google.com.
The controversy highlights a long-running tension between Google and the websites it indexes, with a new artificial intelligence twist. Publishers have long worried that Google repurposes their verbatim content in snippets on its own website, but now Google is using advanced machine learning models that scrape large parts of the web to “train” the software to spit out human-like text and responses.
Rutledge Daugette, CEO of TechRaptor, a site focusing on gaming news and reviews, said that Google’s move was made without considering the interests of publishers and that Google’s AI amounts to lifting content.
“Their focus is on zero-click searches that use information from publishers and writers who spend time and effort creating quality content, without offering any benefit other than the potential of a click,” Daugette told CNBC. “Thus far, AI has been quick to reuse others’ information with zero benefit to them, and in cases like Google Bard doesn’t even offer attribution as to where the information it’s using came from.”
Luther Lowe, a longtime Google critic and chief of public policy at Yelp, said that Google’s update is part of a decades-long strategy to keep users on the site for longer, instead of sending them to the sites that originally hosted the information.
“The exclusionary self-preferencing of Google’s ChatGPT clone into search is the final chapter of bloodletting the web,” Lowe told CNBC.
According to SearchEngineLand, a news website that closely tracks changes to Google’s search engine, the AI-generated results are displayed above the organic search results in testing so far.
SGE comes in a differently colored box — green in the example — and includes boxed links to three websites on the right side. In Google’s primary example, all three of the website headlines were cut off.
Google says that the information isn’t taken from the websites, but is instead corroborated by the links. SearchEngineLand said the SGE approach was an improvement and a “healthier” way to link than Google’s Bard chatbot, which rarely linked to publisher websites.
Some publishers are wondering if they can prevent AI firms such as Google from scraping their content to train their models. Companies such as the firm behind Stable Diffusion are already facing lawsuits from data owners, but the right to scrape web data for AI remains an undecided frontier. Other companies, such as Reddit, have announced plans to charge for access to their data.
Leading the charge in the publishing world is Barry Diller, Chairman of IAC, which owns websites including All Recipes, People Magazine and The Daily Beast.
“If all the world’s information is able to be sucked up into this maw, and then essentially repackaged in declarative sentences, in what’s called chat, but it isn’t chat — as many grafs as you want, 25 on any subject — there will be no publishing, because it will be impossible,” Diller said last month at a conference.
“What you have to do is get the industry to say that you cannot scrape our content, until you work out systems where the publisher gets some avenue towards payment,” Diller continued, saying that Google will face this problem.
Diller says he believes publishers can sue AI firms under copyright law and that current “fair use” restrictions need to be redefined. The Financial Times reported on Wednesday that Diller is leading a group of publishers “that is going to say we are going to change copyright law if necessary.” An IAC spokesperson declined to make Diller available for an interview.
One challenge facing publishers is confirming that their content is being used by an AI. Google did not reveal the training sources for PaLM 2, the large language model that underpins SGE. Daugette says that while he’s seen examples of quotes and review scores from competitors repurposed on Bard without attribution, it’s hard to tell when the information is from his site without directly linked sources.
Google didn’t respond to a request for comment. “PaLM 2 is trained on a wide range of openly available data on the internet and we obviously value the health of the web ecosystem. And that’s really part of the way we think about how we build our products, to ensure that we have a healthy ecosystem where creators are a part of that thriving ecosystem,” Google VP of Research Zoubin Ghahramani said in a media briefing earlier this week.
Daugette says that Google’s moves make being an independent publisher tough.
“I think it’s really frustrating for our industry to have to worry about our hard work being taken, when so many colleagues are being laid off,” Daugette said. “It’s just not okay.”