Hiya, folks, welcome to TechCrunch's regular AI newsletter. If you want this in your inbox every Wednesday, sign up here.
It's been just a few days since OpenAI revealed its latest flagship generative model, o1, to the world. Marketed as a "reasoning" model, o1 essentially takes longer to "think" about questions before answering them, breaking down problems and checking its own answers.
There's a great many things o1 can't do well, and OpenAI itself admits this. But on some tasks, like physics and math, o1 excels despite not necessarily having more parameters than OpenAI's previous top-performing model, GPT-4o. (In AI and machine learning, "parameters," usually numbering in the billions, roughly correspond to a model's problem-solving skills.)
And this has implications for AI regulation.
California's proposed bill SB 1047, for example, imposes safety requirements on AI models that either cost over $100 million to develop or were trained using compute power beyond a certain threshold. Models like o1, however, demonstrate that scaling up training compute isn't the only way to improve a model's performance.
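For a rough sense of how a training-compute threshold works in practice, here's a minimal sketch using the common "6 × parameters × training tokens" approximation for total training FLOPs. The threshold constant and example figures are illustrative assumptions, not numbers taken from the bill's text:

```python
# Rough sketch: does a training run cross a regulatory compute threshold?
# Uses the common approximation FLOPs ~= 6 * parameters * training tokens.
# The threshold and example figures below are illustrative assumptions.

THRESHOLD_FLOPS = 1e26  # assumed threshold for this sketch


def training_flops(params: float, tokens: float) -> float:
    """Approximate total training compute (forward + backward passes)."""
    return 6 * params * tokens


# A hypothetical 405-billion-parameter model trained on 15 trillion tokens:
flops = training_flops(405e9, 15e12)
print(f"{flops:.2e} FLOPs -> regulated: {flops > THRESHOLD_FLOPS}")
# ~3.65e+25 FLOPs -> regulated: False
```

The catch, as o1 suggests, is that a model can sit under a threshold like this and still gain capability by spending more compute at inference time, which the metric never sees.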
In a post on X, Nvidia research manager Jim Fan posited that future AI systems may rely on small, easier-to-train "reasoning cores" as opposed to the training-intensive architectures (e.g., Meta's Llama 405B) that've been the trend lately. Recent academic studies, he notes, have shown that small models can greatly outperform large models when given more time to noodle on questions.
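The idea Fan is describing is test-time compute: instead of training a bigger model, you let a smaller one sample many reasoning attempts and aggregate them. Here's a minimal sketch of the simplest such technique, self-consistency majority voting; `query_model` is a hypothetical stand-in for any LLM call, not a real API, and o1 itself reportedly does something richer than this internally:

```python
import random
from collections import Counter


def query_model(question: str, temperature: float = 0.8) -> str:
    """Hypothetical stand-in for an LLM call returning a final answer.
    Stubbed with randomness so the sketch runs on its own."""
    return random.choice(["42", "42", "42", "41"])  # noisy but usually right


def answer_with_majority_vote(question: str, n_samples: int = 16) -> str:
    """Self-consistency: sample several independent reasoning attempts and
    keep the most common final answer. More samples means more inference
    compute and (often) better accuracy, with no change to model size or
    training compute."""
    answers = [query_model(question) for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]


print(answer_with_majority_vote("What is 6 * 7?"))
```

A regulation keyed to training compute alone never observes this knob at all, which is exactly the gap critics are pointing to.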
So was it short-sighted for policymakers to tie AI regulatory measures to compute? Yes, says Sara Hooker, head of AI startup Cohere's research lab, in an interview with TechCrunch:
[o1] kind of points out how incomplete a viewpoint this is, using model size as a proxy for risk. It doesn't take into account everything you can do with inference or running a model. For me, it's a combination of bad science combined with policies that put the emphasis not on the current risks that we see in the world now, but on future risks.
Now, does that mean legislators should rip AI bills up from their foundations and start over? No. Many were written to be easily amendable, under the assumption that AI would evolve far beyond their enactment. California's bill, for instance, would give the state's Government Operations Agency the authority to redefine the compute thresholds that trigger the law's safety requirements.
The admittedly tricky part will be figuring out which metric could be a better proxy for risk than training compute. Like so many other aspects of AI regulation, it's something to ponder as bills around the U.S. (and the world) march toward passage.
News
First reactions to o1: Max got initial impressions from AI researchers, startup founders, and VCs on o1, and tried the model himself.
Altman departs safety committee: OpenAI CEO Sam Altman stepped down from the startup's committee responsible for reviewing the safety of models such as o1, likely in response to concerns that he wouldn't act impartially.
Slack turns into an agent hub: At its parent company Salesforce's annual Dreamforce conference, Slack announced new features, including AI-generated meeting summaries and integrations with tools for image generation and AI-driven web searches.
Google begins flagging AI images: Google says that it plans to roll out changes to Google Search to make clearer which images in results were AI generated, or edited by AI tools.
Mistral launches a free tier: French AI startup Mistral launched a new free tier to let developers fine-tune and build test apps with the startup's AI models.
Snap launches a video generator: At its annual Snap Partner Summit on Tuesday, Snapchat announced that it's introducing a new AI video-generation tool for creators. The tool will allow select creators to generate AI videos from text prompts and, soon, from image prompts.
Intel inks major chip deal: Intel says it will co-develop an AI chip with AWS using Intel's 18A chip fabrication process. The companies described the deal as a "multi-year, multi-billion-dollar framework" that could potentially involve additional chip designs.
Oprah's AI special: Oprah Winfrey aired a special on AI with guests such as OpenAI's Sam Altman, Microsoft's Bill Gates, tech influencer Marques Brownlee, and current FBI director Christopher Wray.
Research paper of the week
We know that AI can be persuasive, but can it dig someone out of a deep conspiracy rabbit hole? Well, not all by itself. But a new model from Costello et al. at MIT and Cornell can make a dent in beliefs about untrue conspiracies that persists for at least a couple of months.
In the experiment, they had people who believed in conspiracy-related statements (for example, "9/11 was an inside job") talk with a chatbot that gently, patiently, and endlessly offered counterevidence to their arguments. These conversations led the humans involved to report a 20% decrease in the associated belief two months later, at least as far as these things can be measured.
It's unlikely that those deep into reptilian and deep state conspiracies would consult or believe an AI like this, but the approach could be more effective if it were used at a critical juncture, like a person's first foray into these theories. For instance, if a teenager searches for "Can jet fuel melt steel beams?" they may experience a learning moment instead of a tragic one.
Model of the week
It's not a model, but it has to do with models: Researchers at Microsoft this week published an AI benchmark called Eureka aimed at (in their words) "scaling up [model] evaluations … in an open and transparent manner."
AI benchmarks are a dime a dozen. So what makes Eureka different? Well, the researchers say that, for Eureka (which is actually a collection of existing benchmarks), they chose tasks that remain challenging for "even the most capable models." Specifically, Eureka tests for capabilities often overlooked in AI benchmarks, like visual-spatial navigation skills.
To show just how difficult Eureka can be for models, the researchers tested systems, including Anthropic's Claude, OpenAI's GPT-4o, and Meta's Llama, on the benchmark. No single model scored well across all of Eureka's tests, which the researchers say underscores the importance of "continued innovation" and "targeted improvements" to models.
Grab bag
In a win for professional actors, California passed two laws, AB 2602 and AB 1836, restricting the use of AI digital replicas.
The legislation, which was backed by SAG-AFTRA, the performers' union, requires that companies relying on a performer's digital replica (e.g., cloned voice or likeness) give a "reasonably specific" description of the replica's intended use and negotiate with the performer's legal counsel or labor union. It also requires that entertainment employers gain the consent of a deceased performer's estate before using a digital replica of that person.