Start Up No.2263: how LLMs degenerate, Crowdstrike’s $10 miss, how news reaches us now, Chernobyl’s real effects, and more


Golf fans were able to follow last week’s Open via a VR system that captured every detail of holes such as Royal Troon’s “Postage Stamp”. CC-licensed photo by easylocum 2.0 on Flickr.

You can sign up to receive each day’s Start Up post by email. You’ll need to click a confirmation link, so no spam.


There’s another post coming this week at the Social Warming Substack on Friday at 0845 UK time. Free signup.


A selection of 9 links for you. Use them wisely. I’m @charlesarthur on Twitter. On Threads: charles_arthur. On Mastodon: https://newsie.social/@charlesarthur. Observations and links welcome.


AI models fed AI-generated data quickly spew nonsense • Nature

Elizabeth Gibney:

»

Training artificial intelligence (AI) models on AI-generated text quickly leads to the models churning out nonsense, a study has found. This cannibalistic phenomenon, termed model collapse, could halt the improvement of large language models (LLMs) as they run out of human-derived training data and as increasing amounts of AI-generated text pervade the Internet.

“The message is we have to be very careful about what ends up in our training data,” says co-author Zakhar Shumaylov, an AI researcher at the University of Cambridge, UK. Otherwise “things will always, provably, go wrong,” he says. The team used a mathematical analysis to show that the problem of model collapse is likely to be universal, affecting all sizes of language model that use uncurated data, as well as simple image generators and other types of AI.

The researchers began by using an LLM to create Wikipedia-like entries, then trained new iterations of the model on text produced by its predecessor. As the AI-generated information — known as synthetic data — polluted the training set, the model’s outputs became gibberish. The ninth iteration of the model completed a Wikipedia-style article about English church towers with a treatise on the many colours of jackrabbit tails (see ‘AI gibberish’).

More subtly, the study, published in Nature on 24 July, showed that even before complete collapse, learning from AI-derived texts caused models to forget the information mentioned least frequently in their data sets as their outputs became more homogeneous.

«

There’s a neat illustration of this, in visual terms, on the story.
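The tail-forgetting effect described in the extract can be reproduced with a toy far simpler than an LLM. The sketch below is an illustration only, not the method used in the Nature study: it assumes a “model” that is nothing more than a table of token frequencies, a Zipf-like starting distribution, and arbitrary sizes (50 tokens, 200 samples per generation, 30 generations). The function names `train_on_own_output` and `simulate` are made up for this example.

```python
import random


def train_on_own_output(probs, n_samples, rng):
    """Sample text from the current model, then refit by counting frequencies.

    `probs` maps token -> probability. The returned dict only contains
    tokens that actually appeared in the sample.
    """
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    sample = rng.choices(tokens, weights=weights, k=n_samples)
    counts = {}
    for t in sample:
        counts[t] = counts.get(t, 0) + 1
    return {t: c / n_samples for t, c in counts.items()}


def simulate(vocab_size=50, n_samples=200, generations=30, seed=0):
    """Return the number of distinct tokens the model knows, per generation."""
    rng = random.Random(seed)
    # Assumed "human" data: token k has weight 1/(k+1), so most tokens are rare.
    raw = [1 / (k + 1) for k in range(vocab_size)]
    total = sum(raw)
    probs = {k: w / total for k, w in enumerate(raw)}

    support = [len(probs)]
    for _ in range(generations):
        # Each new model is trained only on its predecessor's output.
        probs = train_on_own_output(probs, n_samples, rng)
        support.append(len(probs))
    return support
```

Calling `simulate()` returns the number of distinct tokens the model can still produce after each generation. The count can only fall: once a rare token fails to appear in a generation’s sample, its probability is zero and no later generation can recover it.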


CrowdStrike offers a $10 apology gift card to say sorry for outage • TechCrunch

Lorenzo Franceschi-Bicchierai:

»

CrowdStrike, the cybersecurity firm that crashed millions of computers with a botched update all over the world last week, is offering its partners a $10 Uber Eats gift card as an apology, according to several people who say they received the gift card, as well as a source who also received one.

On Tuesday, a source told TechCrunch that they received an email from CrowdStrike offering them the gift card because the company recognizes “the additional work that the July 19 incident has caused.” 

“And for that, we send our heartfelt thanks and apologies for the inconvenience,” the email read, according to a screenshot shared by the source. The same email was also posted on X by someone else. “To express our gratitude, your next cup of coffee or late night snack is on us!”

The email was sent from a CrowdStrike email address in the name of Daniel Bernard, the company’s chief business officer, according to a screenshot of the email seen by TechCrunch. According to one post on X, in the United Kingdom the voucher was worth £7.75, or roughly $10 at today’s exchange rate.

On Wednesday, some of the people who posted about the gift card said that when they went to redeem the offer, they got an error message saying the voucher had been canceled. When TechCrunch checked the voucher, the Uber Eats page provided an error message that said the gift card “has been canceled by the issuing party and is no longer valid.”

CrowdStrike spokesperson Kevin Benacci confirmed to TechCrunch that the company sent the gift cards.

“We did send these to our teammates and partners who have been helping customers through this situation. Uber flagged it as fraud because of high usage rates,” Benacci said in an email.

«

When it rains, it pours. Though this is more like a shower after a hurricane.


Microsoft’s China-based AI staff face relocation decision deadline • Rest of World

Viola Zhou:

»

Alan, a young engineer at Microsoft, has been living a comfortable life in Beijing working for the tech giant on cloud computing. He earns six times the average income in the city, allowing him to dine out frequently and take taxis whenever he wants. 

But Microsoft is now asking Alan to start a new life across the Pacific. For the past two months, he’s been weighing up a request made to hundreds of Chinese employees who work on artificial intelligence and cloud computing to consider relocating to places including Canada, Australia, or Microsoft’s headquarters in Redmond, Washington.

Alan, who spoke under a pseudonym, received an offer to go to Vancouver, but he just couldn’t make up his mind. “No matter how comfortable Chinese people could be in Vancouver, it wouldn’t be as comfortable as Beijing,” he told Rest of World. 

…the kind of US-China tech collaboration Microsoft once pioneered might be facing an end. The Biden administration has blocked China from accessing chips used to develop AI technologies, proposed restrictions on tech investments in China, and threatened bans on Chinese-owned platforms like TikTok. In Washington, Microsoft’s presence in China is increasingly viewed as a national security threat.

…“Emigration is not that appealing to many Microsoft people in China,” a Beijing-based employee told Rest of World, after declining an offer to relocate to Vancouver. “If you deduct taxes, every place except Seattle may come with a pay cut compared to Beijing. The living quality would really suffer.”

While Microsoft says it makes about 1.5% of its global sales revenue in China, research contributions from its Chinese engineers are far more valuable to the company, Jean-Marc Blanchard, executive director of California-based think tank Wong MNC Center, told Rest of World.

«



The ways people hear about big news these days: “into a million pieces,” says source • Nieman Journalism Lab

Joshua Benton:

»

Here’s an obviously incomplete list of some of the ways that Americans and others around the world heard the news [that Joe Biden was stepping aside].

A lifeguard on the loudspeaker at a D.C. pool
Live TV playing at the gym
A Twitter account that tracks whether or not Liza Minnelli has yet outlived a particular person or thing
A friend’s text with a link to The Guardian
Top-of-the-hour headlines on a public radio station
An “old-school message board”
Horny copypasta
Phone call from college-aged son while doing housework
A New York Times push alert
A push alert at Whataburger
A text from someone sitting nearby at a Nationals game
The chat in a Twitch stream of a live crossword competition
In a discipleship class at church
In a bar-exam-prep Discord channel
From the happy screams of 100-plus women at a mass outdoor workout
Push alerts while taping a Dungeons & Dragons podcast
BBC push alert during a Dungeons & Dragons game
From the radio announcers at a Yankees game
From the radio announcers at a Mets game
An American Girl doll-themed Instagram meme account
From a Try Guy at a farmer’s market
“In a text from my husband…from across the room…because he didn’t want to say it out loud because we were at my conservative dad’s house for a family birthday party.”
A news alert on a patient’s Apple Watch mid-exam
A Discord notification: “It’s Kamencing”
The Twitter account of a Minecraft server
An Alexa notification
An email alert from The New York Times
“For the first time ever, Apple News push alert actually broke news to me”
A push alert in an ice cave
A spouse asking a smart speaker: “hey Google, who’s going to replace Joe Biden?”
“On a queer camping trip on the yuba river, a friend got an update text from her girlfriend that their pepper plant had new peppers on it…and also Joe Biden pulled out”
Overheard in an hour-long lobster-roll line in Maine
A text: “YOOOOOO”
Watching MSNBC
“From a reality TV Instagram account posting how the current Big Brother candidates won’t know that Biden dropped out of the race til October.”
“FB alert from a fellow journalist while sitting at a dog agility trial waiting to run my corgi”
Overheard at an art gallery
Wall Street Journal push notification while grocery shopping at Publix
Slack

«



The data, networking and GenAI driving The British Open golf championship • Computer Weekly

Bryan Glick:

»

Typically, the temporary village set up to cater to the quarter of a million fans and the world’s broadcasters and media requires on-site links into the fibre backbone using standard Cat5 cabling that’s installed for the event and thrown away afterwards.

To replace this, NTT Data trialled the use of a private 5G “network in a box” at Troon, focused around the hospitality area. This required the short-term purchase of radio spectrum from telecoms regulator Ofcom – a requirement for any private 5G installation in the UK. But it meant that connectivity within its 2km range was available to any device with a suitable nanoSIM card, offering 400Mbps bandwidth.

The next generation of the networking equipment will offer eSIM capabilities, which means that fans visiting the Open could simply scan a QR code to activate eSIM software to connect to the private 5G. That’s important to The R&A because they want to maximise fans’ engagement with the event – and the more they use the digital offerings available, the better.

For example, for the 150th Open in 2022, The R&A and NTT Data launched Shot View, a precise virtual representation of each course, which allows fans to track every shot played by the competing golfers using a digital twin that represents the actual trajectory of every shot, in as close to real-time as can be achieved.

Every one of The Open courses has been mapped using drones and Lidar scanning to capture every bump, bunker and slope to 2cm accuracy. During the championship, computer vision cameras set up at every hole track golf balls across the green, while 60 people around the course use GPS trackers to record the location on the fairway where every ball comes to rest.

All of that data is fed into a virtual reality (VR) environment running on Unreal Engine, one of the most popular gaming engines, to plot every movement of the ball. As players tee up at each hole, fans can use Shot View to see exactly how they played the hole on previous days, as well as keep up with what’s happening around the whole course.

«

Amazing – didn’t even know there was this app. It’s an amazing effort. And also, hello, Apple, can you see the possibilities of a VR environment following a golf tournament?


The political economic determinants of nuclear power: evidence from Chernobyl • NBER

Alexey Makarin, Nancy Qian and Shaoda Wang:

»

The rapid growth in the number of nuclear power plants (NPP) declined dramatically after Chernobyl [in April 1986], especially in countries with democratic governments which had the highest number of NPPs at the time. To understand the mechanisms driving such change, we examine two case studies in detail: the United States and the United Kingdom.

In the US, we document that: (a) after the Chernobyl accident, campaign contributions to House and Senate races from fossil fuel special interest groups became strongly associated with negative votes on nuclear-related bills, and such donations increased significantly; and (b) newspapers with more fossil fuel advertisements published more anti-nuclear articles after Chernobyl, while we do not observe significant changes in advertisement spending by the fossil fuel industry.

In the UK, MPs sponsored by mining unions were much more likely to give anti-nuclear speeches in parliament after Chernobyl. We examine air pollution as a downstream outcome of reduced nuclear investment. We estimate that the decline in NPP caused by Chernobyl led to the loss of approximately 141 million expected life years in the US, 33 million in the UK and 318 million globally.

…Nuclear energy competes with and threatens the fossil fuel industry. This paper asks a simple question: did fossil-fuel special interests leverage the 1986 Chernobyl reactor meltdown and the public fear that it triggered to influence government policy against nuclear investment in the democracies with the most NPPs at the time?

«

It’s an academic study, but full of rigour for that reason. And concludes that fossil fuel interests did jump on the opportunity to diss nuclear post-Chernobyl (and then Fukushima in 2011).

Such a butterfly wing. Chernobyl’s chief wanted to run the safety test near the end of the month. But it couldn’t run the low-power test during the day because grid power was needed for factories meeting quotas. Which meant the test was run at night with inexperienced operators, with the reactor in a state where the test itself was certain to fail – and by trying to make it happen, they ran into an incredible edge case of the RBMK reactor design that could cause an explosion.

And nuclear power everywhere was stymied.


Inrupt debuts data wallet as digital wallet use grows • Pymnts

»

Enterprise software firm Inrupt has introduced a digital wallet designed to hold user data.

Businesses and governments, the company wrote on its blog Tuesday (July 23), can use the technology to give customers and citizens a place to store their data.

“Over 60% of the world’s population is expected to use digital wallets regularly by 2026, and over half of consumers report interest in using them for a broader range of purposes,” the company said. “But the existing market has focused largely on financial transactions and is dominated by a handful of Big Tech vendors.”

The Data Wallet, Inrupt said, differs from alternatives by accepting a range of different data, and makes it easy for users to consent to access to their data. The company argued that the Data Wallet creates opportunities for organizations as the web pivots toward a “user-centric approach” to sharing, using and managing personal data.

“Browsers shaped the Web 1.0 era, and Web 2.0 was all about apps. But Web 3.0 is all about empowered individuals and personal data,” said Sir Tim Berners-Lee, co-founder of Inrupt and esteemed computer scientist.

(Berners-Lee is widely credited for inventing the World Wide Web, the first web browser, and the building blocks that allowed the internet to scale.)

“The Data Wallet becomes a fundamental tool for users,” he added. “By making this key piece of technology available, Inrupt is ensuring that the opportunities and benefits of secure personal Data Wallets are open for everyone.”

«

Yes, you were wondering what TBL had got up to. Look, it’s pretty hard to follow inventing the web and opening the 2012 Olympics.


Google’s plan to get rid of cookies crumbles • Inc.com

Kit Eaton:

»

Google’s 2020 plan sounded simple. Cookies really work: they’re why you keep seeing ads for, say, Stanley cups, after you click on a single ad for one online. [Why keep showing you an ad for something you already clicked an ad for? – Overspill Ed.] But by accruing extensive info on users’ browsing habits, cookies seemed more and more like a privacy nightmare, especially when comparing Google to rivals. Apple, in particular, shapes its brand identity around always placing customer privacy front and center.

At the time of its initial announcement, Google said cookies would go inside two years and be replaced by newer “privacy-preserving and open-standard” systems. The intended result? A “healthy ad-supported web” would exist, just as it did when third-party cookies were supported on Chrome, but with stronger privacy protections in place. The new tracking tech–which was always short on detail–would still allow targeted ads, but in a way that wouldn’t store as much user data.

The new “Privacy Sandbox” system sounded like a great thing for end users, who would enjoy increased privacy while using Chrome. But it also was potentially very bad news for digital advertisers because it could undo the effectiveness of targeting ads–simply because users’ interests, likes and habits weren’t being tracked as closely. 

The outcry from advertisers was the main reason Google subsequently failed to turn off cookies for four years after its initial promise, and why the company has changed its mind completely.

The Wall Street Journal reports that Google’s U-turn was actually forced by “digital advertising companies and regulators.” There were numerous objections to the plan to end cookies and replace them with newer Google tech that worked in the company’s Privacy Sandbox. When Google began a trial switching off a tiny fraction of Chrome users’ cookies in January this year, research by Adobe found that 75% of marketers were shunning a raft of alternative ad-tracking systems, and were still relying on traditional cookies.

«

So, in short, inertia, and leverage. Not even Google can – or wants to – stand up to the might of the digital advertising lobby.


There are no good options left with bird flu • The Atlantic

Yasmin Tayag:

»

Of all the news about bird flu, this month has brought some of the most concerning yet. Six people working on a chicken farm in Colorado have tested positive for the virus—the biggest human outbreak detected in the U.S. The country’s tally is now up to 11 since 2022, but that’s almost certainly a significant undercount considering the lack of routine testing.

Since the current strain of bird flu, known as “highly pathogenic avian influenza H5N1,” began spreading around the world in late 2021, it has become something like a “super virus” in its spread among animals, Richard Webby, an influenza expert at St. Jude Children’s Research Hospital in Memphis, told me. Wild birds have been decimated, as have poultry farms: The virus has been detected in more than 100 million birds in 48 states. H5N1 has been around for longer than 25 years, but only recently has it regularly jumped to mammals, infecting cats, sea lions, and bears. In March, it was detected for the first time in American cattle and, since then, has already spread to 163 herds in 13 states.

All of that would be worrying enough without reports of people also falling sick. Everyone who has tested positive in the U.S. has worked closely with farm animals, but each additional case makes the prospect of another human pandemic feel more real. “That’s absolutely the worst-case scenario,” Webby said. It’s a possibility, although not the likeliest one. For now, the virus seems poised to continue its current trajectory: circulating among wild birds, wreaking havoc on poultry farms, and spreading among cattle herds. That outcome wouldn’t be as catastrophic as a pandemic. But it’s still not one to look forward to.

«

*sighs* Just a watching brief, honest.


• Why do social networks drive us a little mad?
• Why does angry content seem to dominate what we see?
• How much of a role do algorithms play in affecting what we see and do online?
• What can we do about it?
• Did Facebook have any inkling of what was coming in Myanmar in 2016?

Read Social Warming, my latest book, and find answers – and more.


Errata, corrigenda and ai no corrida: none notified
