Ice Lounge Media

Reports that Dominion Software, which provides voting tabulation tools to about half the states in the U.S., “deleted” millions of votes have been soundly debunked after outgoing President Trump parroted numbers from a random internet forum.

Tweeting Thursday morning about baseless claims of election fraud, Trump cited OANN, a right-wing news outlet, which itself seemed to have found its numbers in a thread on pro-Trump Reddit knock-off thedonald.win. (The tweet was quickly wrapped in a warning that the contents are disputed.)

The anonymous person posting there claimed to have compared numbers from Edison Research, a company that does exit polls and other election-related measures, to those from Dominion, and come up with very different sums. Neither the methods nor the results are well explained. It’s not clear what is being compared to what, or why the company supposedly perpetrating this alleged fraud would publish the evidence publicly. No one has verified (if that’s the word) this analysis in any way.

In a comment to PolitiFact, Edison Research president Larry Rosin wrote that “we have no evidence of any voter fraud,” adding that the company has essentially no idea what the purported analysis is referring to.

Dominion attracted attention earlier in the week when it seemed that a glitch had caused a number of votes to be registered for President-elect Joe Biden instead of Trump. But the miscount was immediately caught and found to be the result of human error. The company has dedicated a page to combating the misinformation around its software.

PolitiFact rated Trump’s claim “Pants on Fire,” calling it “ridiculous” for good measure. It’s worth noting that the tweet didn’t even state the numbers of the supposed fraud correctly.

There doesn’t seem to be any merit to the “analysis” at all, but it provides an excellent example of how people who are unfamiliar with how the voting apparatus works — which is to say almost everyone not directly involved — tend to find the software portion inherently untrustworthy.

Yet there is no way to count, tabulate and verify millions of ballots in the hours or days after an election without relying heavily on private software tools, and that software is in fact highly reliable and secure. The process of elections is bipartisan and extremely closely monitored.

Election commissioners and state leaders have been unanimous in declaring the election a surprisingly smooth one, considering the difficulties of holding it during a pandemic and with extremely high turnout both in person and by mail.

A major federal committee under the Cybersecurity and Infrastructure Security Agency today called last week’s election “the most secure in American history… There is no evidence that any voting system deleted or lost votes, changed votes, or was in any way compromised. We can assure you we have the utmost confidence in the security and integrity of our elections, and you should too.”

Despite accusations from a dwindling number of highly placed individuals in the government, no evidence has been presented of any significant voter fraud or other irregularities in last week’s election, which resulted in the victory of former Vice President, now President-elect, Joe Biden.

Read more

The latest Mac operating system arrives, Amazon faces a lawsuit over PPE and Disney+ turns one. This is your Daily Crunch for November 12, 2020.

The big story: Apple releases macOS Big Sur

This update, which was first announced five months ago at WWDC, includes a number of design changes that continue to blur the line between macOS and iOS.

One of the big additions is the Control Center, an iOS/iPadOS feature that presents a translucent pane down the right side of the screen. Meanwhile, Safari added features like built-in translation. And app icons and sounds have been updated throughout.

Brian Heater has been using the beta since June, and he concluded that Big Sur “boasts some key upgrades to apps and the system at large, but more importantly from Apple’s perspective, it lays the groundwork for the first round of Arm-powered Macs and continues its march toward a uniformity between the company’s two primary operating systems.”

The tech giants

Facebook’s Snapchat-like ‘Vanish Mode’ feature arrives on Messenger and Instagram — The feature, meant for more casual conversations, allows users to set chats to automatically delete after the message is seen and the chat is closed.

Amazon faces lawsuit alleging failure to provide PPE to workers during pandemic — The class action suit alleges Amazon failed to properly protect its warehouse workers and violated elements of New York City’s human rights law.

Apple HomePod Mini review: Remarkably big sound — A smart speaker for the masses.

Startups, funding and venture capital

Menlo Security announces $100M Series E on $800M valuation — CEO and co-founder Amir Ben-Efraim told us the startup remains focused on web and email as major attack vectors.

Livestorm raises $30M for its browser-based meeting and webinar platform — It’s purely browser based, without requiring presenters or attendees to install any software.

Nana nabs $6M for an online academy and marketplace dedicated to appliance repair — Nana runs a free academy to teach people how to fix appliances, then gives them the option to become a part of its repair marketplace.

Advice and analysis from Extra Crunch

Are subscription services the future of fintech? — As subscriptions become an increasingly alluring business model, fintechs will have to consider whether this strategy is worth the risk.

Conflicts in California’s trade secret laws on customer lists create uncertainty — Read this before you jump ship or hire a salesperson who already has.

As public investors reprice edtech bets, what’s ahead for the hot startup sector? — Selling edtech on the vaccine news (as investors did) was a bet that growth in the sector would be constrained by a return to normalcy.

(Reminder: Extra Crunch is our membership program, which aims to democratize information about startups. You can sign up here.)

Everything else

Disney+ has more than 73M subscribers — The streaming service launched one year ago today.

L’Oréal rolls out a line of ‘virtual makeup’ — This builds on L’Oréal’s 2018 acquisition of an augmented reality filter company called Modiface.

The Daily Crunch is TechCrunch’s roundup of our biggest and most important stories. If you’d like to get this delivered to your inbox every day at around 3pm Pacific, you can subscribe here.

Read more

Just over a week after the U.S. elections, Twitter has offered a breakdown of some of its efforts to label misleading tweets. The site says that from October 27 to November 11, it labeled some 300,000 tweets as part of its Civic Integrity Policy. That amounts to around 0.2% of the total number of election-related tweets sent during that roughly two-week period.

Of course, not all Twitter warnings are created equal. Only 456 of those tweets received a warning that covered the text and limited user engagement by disabling retweets, replies and likes. That specific warning did go some way toward limiting engagement, with around three-fourths of those who encountered the tweets seeing the obscured text (by clicking through the warning). Quote tweets of those so labeled decreased by around 29%, according to Twitter’s figures.
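For scale, the figures above imply the following rough totals; this is a back-of-the-envelope calculation from the numbers Twitter shared, not something the company reported directly:

```latex
\frac{300{,}000}{0.002} \approx 1.5 \times 10^{8} \ \text{election-related tweets in the period},
\qquad \frac{456}{300{,}000} \approx 0.15\% \ \text{of labeled tweets received the strictest warning}
```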

The president of the United States received a disproportionate number of those labels; The New York Times notes that just over a third of Trump’s tweets between November 3 and 6 were hit with such a warning. The end of the election (insofar as the election has actually ended, I suppose) appears to have slowed the site’s response time somewhat, though Trump continues to get flagged, as he continues to devote a majority of his feed to disputing election results that have been confirmed by nearly every major news outlet.

His latest tweet as of this writing has been labeled disputed, but not hidden, as Trump repeats claims against voting machine maker Dominion. “We also want to be very clear that we do not see our job as done,” Legal, Policy and Trust & Safety Lead Vijaya Gadde and Product Lead Kayvon Beykpour wrote. “Our work here continues and our teams are learning and improving how we address these challenges.”

Twitter and other social media sites were subject to intense scrutiny following the 2016 election for the roles the platforms played in the spread of misinformation. Twitter sought to address the issue by tweaking recommendations and retweets, as well as individually labeling tweets that violate its policies.

Earlier today, YouTube defended its decision to keep controversial election-related videos, noting, “Like other companies, we’re allowing these videos because discussion of election results & the process of counting votes is allowed on YT. These videos are not being surfaced or recommended in any prominent way.”

Read more

Truebill, a startup offering a variety of tools to help users take control of their finances, announced today that it has raised $17 million in Series C funding.

When I first wrote about the startup in 2016, it was focused on helping users track and cancel unwanted subscriptions. Since then, it’s expanded into other financial products, like reports on your personal expenses and the ability to negotiate lower bills.

This week, Chief Revenue Officer Yahya Mokhtarzada told me that with the pandemic leading to a dramatic reduction in ad costs, Truebill was able to make TV advertising a key channel for reaching new users.

And of course, the financial uncertainty has made the product more appealing too — particularly its smart savings tool, where users can automatically set aside money for their goals.

“People became aware of the need to have some cushion,” Mokhtarzada said. “You should start saving when things are going well, before you need it, but [saving during the pandemic] is better than not doing it at all. We’ve seen a big bump in smart savings adoption, which is at an all-time high.”

Truebill’s net worth screen.
Image Credits: Truebill

The new round brings Truebill’s total funding to $40 million. It was led by Bessemer Venture Partners, with participation from Eldridge Capital, Cota Capital, Firebolt Ventures and Day One Ventures.

The startup says the round will allow it to develop new products and features, including net worth tracking, automated debt payments and shared accounts.

Mokhtarzada added that the company will be making big investments in data science to help follow its “north star” of financial health. “The data challenge is significant,” he said.

Sure, it’s pretty straightforward to recognize whether someone’s doing well or poorly financially, but the real goal is to “recognize trends and shortfalls before they happen.”

For example, instead of simply alerting users when they’ve been charged an overdraft fee on their account, Mokhtarzada said, “What is helpful is to have predictive models analyze data to anticipate a cashflow shortage and have the right tools in place that prevent it.”
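To make that concrete, here is a minimal sketch of the kind of cashflow check Mokhtarzada is describing. It is purely illustrative and not Truebill’s actual model; the account structure, recurring charges and buffer threshold below are hypothetical. The idea is simply to project a balance forward against known recurring charges and flag a shortfall before it happens.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class RecurringCharge:
    name: str
    amount: float        # dollars, charged on day_of_month each month
    day_of_month: int

def project_balance(balance, charges, start, days_ahead=14):
    """Project the balance day by day, applying recurring charges as they come due."""
    projection = []
    for offset in range(1, days_ahead + 1):
        day = start + timedelta(days=offset)
        for charge in charges:
            if day.day == charge.day_of_month:
                balance -= charge.amount
        projection.append((day, balance))
    return projection

def shortfall_alert(balance, charges, start, buffer=50.0):
    """Return the first projected day the balance dips below the buffer, or None."""
    for day, projected in project_balance(balance, charges, start):
        if projected < buffer:
            return day, projected
    return None

# Hypothetical example: a $120 balance with two bills coming due.
bills = [RecurringCharge("streaming", 15.99, 14), RecurringCharge("utilities", 110.00, 15)]
print(shortfall_alert(120.00, bills, date(2020, 11, 12)))
# -> roughly (datetime.date(2020, 11, 15), -5.99): a predicted overdraft, three days out
```

A real system would learn charge amounts and timing from transaction history rather than taking them as fixed inputs, which is where the data science investment comes in.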

Read more

Netflix already borrowed the concept of short-form video “Stories” from social apps like Snapchat and Instagram for its Previews feature back in 2018. Now, the company is looking to the full-screen vertical video feed, popularized by TikTok, for further inspiration. With its latest experiment, Fast Laughs, Netflix is offering a new feed of short-form comedy clips drawn from its full catalog.

The feed includes clips from both originals and licensed programming, Netflix says. It also includes video clips from the existing Netflix social channel, “Netflix Is A Joke,” which today runs clips, longer videos and other social content across YouTube, Twitter, Facebook and Instagram.

Fast Laughs resembles TikTok in the sense that it’s swiped through vertically, offers full-screen videos and places its engagement buttons on the right side. But it’s not trying to become a place to waste time while being entertained.

Like many of Netflix’s experiments, the goal with the Fast Laughs feed is to help users discover something new to watch.

Instead of liking and commenting on videos, as you would in a social video app, the feed is designed to encourage users to add shows to their Netflix watch list for later viewing. In this sense, it’s serving a similar purpose to Netflix’s “Previews” feature, which helps users discover shows by watching clips and trailers from popular and newly released programming.

As users scroll through the new Fast Laughs feed, they’ll encounter a wide range of comedy clips — like a clip from a Kevin Hart stand-up special or a funny bit from “The Office,” for example. The clips will also range in length anywhere from 15 to 45 seconds.

In addition to adding clips to Netflix’s “My List” feature, users can also react to clips with a laughing emoji button, share the clip with friends across social media, or tap a “More” button to see other titles related to the clip they’re viewing.

The feature was first spotted by social media consultant Matt Navarra, based in the U.K. In his app, Fast Laughs appeared in front of the row of Previews, where it was introduced with text that said “New!”

Netflix confirmed to TechCrunch that the experiment had been tested with a small number of users earlier this year and began rolling out to a wider group this month — including users in the U.K., the U.S. and other select markets.

It’s currently available to a subset of Netflix users with adult profiles or other profiles without parental controls on iOS devices only. However, users don’t need to be opted in to experiments nor do they need to be on a beta version of the Netflix app to see the feature. It’s more of a standard A/B test, Netflix says.

And because it’s a test, users may see slightly different versions of the same feature. The product may also evolve over time, in response to user feedback.
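Netflix hasn’t described how the rollout works under the hood, but a “standard A/B test” of this kind typically buckets eligible profiles deterministically into a test percentage. The sketch below is a generic illustration under that assumption; the eligibility gates echo the rollout described above, while the function name and the 10% figure are hypothetical.

```python
import hashlib

def sees_fast_laughs(profile_id: str, is_adult_profile: bool,
                     has_parental_controls: bool, platform: str,
                     rollout_pct: float = 10.0) -> bool:
    """Deterministically decide whether an eligible profile lands in the test group."""
    # Eligibility: adult profiles, or profiles without parental controls, on iOS only.
    if platform != "ios":
        return False
    if not is_adult_profile and has_parental_controls:
        return False
    # Stable hash of experiment name + profile id -> bucket in [0, 9999].
    digest = hashlib.sha256(f"fast_laughs:{profile_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 10_000
    return bucket < rollout_pct * 100   # e.g. 10% of buckets fall in the test group

print(sees_fast_laughs("profile-123", is_adult_profile=True,
                       has_parental_controls=False, platform="ios"))
```

Hashing on a stable ID rather than sampling at random keeps each profile in the same variant across sessions, which is also why different users can see slightly different versions of the same feature.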

Netflix is hardly the first to “borrow” the TikTok format for its own app. Social media platforms, like Instagram and Snapchat, have also launched their own TikTok rivals in recent months.

But Netflix isn’t a direct competitor with TikTok — except to the extent that any mobile app competes for users’ time and attention, as there are only so many hours in a day.

Instead, the new feed is more of an acknowledgment that the TikTok format of a full-screen vertical video feed with quick engagement buttons on the side is becoming a default style of sorts for presenting entertaining content.

“We’re always looking for new ways to improve the Netflix experience,” a Netflix spokesperson said, confirming the experiment. “A lot of our members love comedy so we thought this would be an exciting new way to help them discover new shows and enjoy classic scenes. We experiment with these types of tests in different countries and for different periods of time — and only make them broadly available if people find them useful,” they added.

Read more

The news: Digital contact tracing apps have faced a wide range of difficulties, but that doesn’t mean we should abandon the idea, according to the authors of a new essay in the journal Science. Instead, they argue, successful digital contact tracing needs to be ethical, trustworthy, locally rooted, and adaptive to new data on what works.

The problem: Modern public health relies on contact tracing during disease outbreaks, and digital apps promised to add jet fuel to the fight against covid-19. Early in the pandemic, companies and governments spun up contact tracing apps as part of a massive effort to stop the spread of the disease. Improbably, Google and Apple even joined forces. Now we’re seeing the flaws in this premise play out. Download rates are low, usage rates appear even lower, and apps face lots of other logistical hurdles. Contact tracing, both manual and automated, still isn’t delivering desperately needed results at scale. A recent Pew survey shows that people struggle to trust public health officials with their data, and don’t like answering the phone when it’s an unknown caller (like a health department), among other obstacles.

Not only that, but digital contact tracing has clearly failed to effectively reach many people. It’s not just those without a smartphone, but also marginalized groups like the elderly, the unhoused, and those who are worried about law enforcement and immigration. 

What to do instead: In their Science essay, authors Alessandro Blasimme and Effy Vayena, bioethicists at ETH Zurich in Switzerland, say “adaptive governance” is one important missing ingredient. It’s all about acting collaboratively, nimbly, and locally: stop looking for centralized, top-down campaigns and strategies that may fizzle out when they don’t fit local needs. It’s time to rely on local partnerships, cross-border collaborations, and all the human teamwork that’s easy to forget when there’s a shiny new button to click.

The US doesn’t currently have a national contact tracing app, but if the authors are correct, perhaps that’s not a major issue. They say instead that if we want more people to adopt new technologies, we need to rely on “the piecemeal creation of public trust.” It’s an ongoing process of authorities learning from their mistakes and listening to users. It’s also important to create genuine oversight, so that people feel their data isn’t being misused, and put effort into cross-border collaborations so that your app doesn’t stop working when you move from one place to another. 

The bottom line: There are still plenty of questions to be answered about the effectiveness and development of contact tracing apps. But instead of dropping digital contact tracing efforts or scaling up existing efforts without taking a hard look, it’s time to reconsider. Digital contact tracing is just one part of a toolkit that needs research-based, on-the-ground teamwork to build trust and relationships among users, governments, and the technologies themselves.

Read more

Several years ago, researchers at the Polytechnic University of Catalonia and the University of Cambridge performed a series of simple experiments that could have huge implications for cooling and refrigeration.

They placed plastic crystals of neopentyl glycol—a common chemical used to produce paints and lubricants—into a chamber, added oil, and cranked down a piston. As the fluid compressed and applied pressure, the temperature of the crystals rose by around 40 °C.

It was the largest temperature shift ever recorded from placing materials under pressure, at least as of when the findings were published in a Nature Communications paper last year. And releasing the pressure has the opposite effect, cooling the crystals dramatically.

The research team said the results highlight a promising approach to replacing traditional refrigerants, potentially delivering “environmentally friendly cooling without compromising performance.” Such advances are crucial, since increasing wealth, growing populations, and rising temperatures could triple energy demands from indoor cooling by 2050 without major technological improvements, the International Energy Agency projects.

The temperature change in the materials was comparable to those that occur in the hydrofluorocarbons that drive cooling in standard air conditioning systems and refrigerators. Hydrofluorocarbons, however, are powerful greenhouse gases.

The work is based on a long-known phenomenon, familiar if you’ve ever stretched a balloon and touched it to your lips, in which so-called caloric materials release heat when placed under pressure or stress. Submitting certain materials to magnetic or electrical fields, or some combination of these forces, also does the trick in some cases.

Scientists have been developing magnetic refrigerators based on these principles for decades, though they tend to require large, powerful, and expensive magnets. But researchers are making considerable strides in the field, according to a review paper in Science on Thursday, written by Xavier Moya and N.D. Mathur, materials scientists at the University of Cambridge who worked on the experiments described above.

Research teams are pinpointing numerous caloric materials that undergo large temperature shifts and putting them to work in prototype heating and cooling devices, the authors note. Materials and devices that can release and transfer large amounts of heat using electricity, strain, and pressure—approaches that only really took off starting a little more than a decade ago—are already catching up with the performance achieved through decades of work in magnet-based cooling devices.

In addition to reducing the need for hydrofluorocarbons, the hope is that the technology could eventually be more energy efficient than standard cooling devices, given the heat released relative to the amount of energy needed to drive the change. A critical difference with this technology is that the materials remain in a solid state, while traditional refrigerants, like hydrofluorocarbons, work by shifting between gas and liquid phases.

Triggering a phase change

Here’s how the technology works:

Many materials exhibit small temperature changes under certain forces. But researchers have been hunting for materials that undergo large shifts, ideally from as little added energy as possible. Among other materials, certain metal alloys have shown promising results under strain; some ceramics and polymers respond well to electrical fields; and inorganic salts and rubber look promising for pressure.

The forces or fields line up the atoms or molecules within the materials in more orderly ways, bringing about a phase change similar to what occurs when free-flowing water molecules turn into compact ice crystals. (In the case of caloric materials, however, the phase change occurs while the materials remain in a solid state, though one that is more rigid.) This process releases enough latent heat to account for the energy difference between the two states. When the materials revert back as the forces are released, it produces a temperature decrease that can then be exploited for cooling.

This isn’t very different from how cooling devices work today: they decompress hydrofluorocarbons to the point that they switch from a liquid to a gas. But this solid-state cooling approach can be far more energy efficient, at least in part because you don’t have to move the molecules nearly as far to bring about the phase change, says Jun Cui, a senior scientist with Ames Laboratory.
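The size of the effect is usually summarized by two standard thermodynamic quantities. The relation below is textbook thermodynamics for caloric materials generally, not a formula quoted from the papers discussed here: applying the field or pressure drives an isothermal entropy change ΔS_iso, and applying it adiabatically instead shifts the temperature by roughly

```latex
\Delta T_{\mathrm{ad}} \approx -\frac{T}{c_{p}}\,\Delta S_{\mathrm{iso}}
```

where T is the operating temperature and c_p is the material’s specific heat. The latent heat of the field-induced transition is what makes ΔS_iso, and hence the usable temperature swing, large in these materials.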

Moving into the market

The key to delivering competitive commercial devices is identifying affordable materials that undergo large temperature shifts, easily revert back, and withstand extended cycles of these changes without breaking down (commercial refrigerators can run for millions of cycles).

Certain materials and use cases are getting close to reaching the commercial market, says Ichiro Takeuchi, a materials scientist at the University of Maryland. About a decade ago, he launched a company, Maryland Energy & Sensor Technology, to produce cooling devices from materials that respond to stress.

His research group developed a prototype cooling device that compresses and releases tubes made from nickel titanium to induce heating and cooling. Water running through the tubes absorbs and dissipates heat during the initial phase, and the process then runs in reverse to chill water that can be used to cool a container or living space.

The prototype cooling device developed by Ichiro Takeuchi’s research group.
COURTESY: ICHIRO TAKEUCHI

The company plans to produce a wine cooler, which doesn’t require the same cooling power as a large refrigerator or window AC unit, as an initial product, using an unspecified but less expensive material.

Moya, one of the authors of the Science paper, cofounded his own startup about a year and a half ago. Barocal, based in Cambridge, England, has developed a prototype heat pump relying on plastic crystals that are “related to neopentyl glycol but better,” he says.

All told, a dozen or so startups have been formed to commercialize the technology, and a number of existing companies, including Chinese home appliances giant Haier and Astronautics Corporation of America, have explored its potential as well.

Cui expects we’ll see some of the first commercial products based on materials that change temperature in response to force and stress within the next five to 10 years, but he says it will likely take years longer for prices to become competitive with standard cooling products.

Update: This story was updated to clarify the timing of the neopentyl glycol experiments.

Read more

Axiom Space has signed three private astronauts to join former NASA astronaut Michael López-Alegría on Ax-1, planned as the first fully private crewed mission to the International Space Station.

The mission: In March, Axiom Space announced plans to launch “history’s first fully private human spaceflight mission to the International Space Station.” The mission, dubbed Ax-1, would go forward using SpaceX’s Crew Dragon vehicle to deliver private astronauts to the ISS for at least eight days. 

At the International Astronautical Congress last month, Axiom CEO Michael Suffredini said the company was aiming to launch in the fourth quarter of 2021.

The crew: Details are still sparse. We know that former NASA astronaut Michael López-Alegría will be part of the mission, but the three other astronauts have not been announced yet. The promotional image Axiom posted Wednesday features three male silhouettes, suggesting there will be no female astronauts on board.

There is some excitable chatter on Twitter and other places suggesting that two of the other astronauts might be actor Tom Cruise and director Doug Liman, who have been in talks with NASA about filming a movie on the ISS. NASA administrator Jim Bridenstine mentioned in June that Axiom was involved in those talks.  

Broader implications: Beyond moviemaking, Ax-1 would be a milestone for NASA’s goal of opening up the ISS to private industry activity, and using the space station as a platform to spur increased commercialization of low Earth orbit, before its planned life span ends by 2030. Axiom plans to launch a habitat module to attach to the ISS in 2024, which is supposed to be just the first part of a larger private space station to be constructed and assembled throughout the rest of the decade. 

Sign up for our space newsletter, The Airlock, here.

Read more

The news: When a German hospital patient died in September while ransomware disrupted emergency care at the facility, police launched a negligent-homicide investigation and said they might hold the hackers responsible. The case attracted worldwide attention because it could have been the first time law enforcement considered a cyberattack to be directly responsible for a death.

But after months of investigation, police now say the patient was in such poor health that she likely would have died anyway, and that the cyberattack was not responsible. 

The findings: “The delay was of no relevance to the final outcome,” Markus Hartmann, the chief public prosecutor at Cologne public prosecutor’s office, told Wired. “The medical condition was the sole cause of the death, and this is entirely independent from the cyberattack.”

Although police have dropped the claim that hackers are responsible for the patient’s death, German law enforcement is still investigating the case. Hartmann and many cybersecurity experts believe it’s only a matter of time before an attack on hospitals causes such a tragedy.

The warning: In October, a wave of ransomware attacks hit American hospitals just as coronavirus cases started spiking. No one died as a result, but the prolific hackers involved did make their money, which means all the incentives are there for more attacks—just as coronavirus rates continue to rise rapidly around the western world. 

Read more

Last month Nature published a damning response written by 31 scientists to a study from Google Health that had appeared in the journal earlier this year. Google was describing successful trials of an AI that looked for signs of breast cancer in medical images. But according to its critics, the Google team provided so little information about its code and how it was tested that the study amounted to nothing more than a promotion of proprietary tech.

“We couldn’t take it anymore,” says Benjamin Haibe-Kains, the lead author of the response, who studies computational genomics at the University of Toronto. “It’s not about this study in particular—it’s a trend we’ve been witnessing for multiple years now that has started to really bother us.”

Haibe-Kains and his colleagues are among a growing number of scientists pushing back against a perceived lack of transparency in AI research. “When we saw that paper from Google, we realized that it was yet another example of a very high-profile journal publishing a very exciting study that has nothing to do with science,” he says. “It’s more an advertisement for cool technology. We can’t really do anything with it.”

Science is built on a bedrock of trust, which typically involves sharing enough details about how research is carried out to enable others to replicate it, verifying results for themselves. This is how science self-corrects and weeds out results that don’t stand up. Replication also allows others to build on those results, helping to advance the field. Science that can’t be replicated falls by the wayside.

At least, that’s the idea. In practice, few studies are fully replicated because most researchers are more interested in producing new results than reproducing old ones. But in fields like biology and physics—and computer science overall—researchers are typically expected to provide the information needed to rerun experiments, even if those reruns are rare.

Ambitious noob

AI is feeling the heat for several reasons. For a start, it is a newcomer. It has only really become an experimental science in the past decade, says Joelle Pineau, a computer scientist at Facebook AI Research and McGill University, who coauthored the complaint. “It used to be theoretical, but more and more we are running experiments,” she says. “And our dedication to sound methodology is lagging behind the ambition of our experiments.”

The problem is not simply academic. A lack of transparency prevents new AI models and techniques from being properly assessed for robustness, bias, and safety. AI moves quickly from research labs to real-world applications, with direct impact on people’s lives. But machine-learning models that work well in the lab can fail in the wild—with potentially dangerous consequences. Replication by different researchers in different settings would expose problems sooner, making AI stronger for everyone. 

AI already suffers from the black-box problem: it can be impossible to say exactly how or why a machine-learning model produces the results it does. A lack of transparency in research makes things worse. Large models need as many eyes on them as possible, more people testing them and figuring out what makes them tick. This is how we make AI in health care safer, AI in policing more fair, and chatbots less hateful.

What’s stopping AI replication from happening as it should is a lack of access to three things: code, data, and hardware. According to the 2020 State of AI report, a well-vetted annual analysis of the field by investors Nathan Benaich and Ian Hogarth, only 15% of AI studies share their code. Industry researchers are bigger offenders than those affiliated with universities. In particular, the report calls out OpenAI and DeepMind for keeping code under wraps.

Then there’s the growing gulf between the haves and have-nots when it comes to the two pillars of AI, data and hardware. Data is often proprietary, such as the information Facebook collects on its users, or sensitive, as in the case of personal medical records. And tech giants carry out more and more research on enormous, expensive clusters of computers that few universities or smaller companies have the resources to access.

To take one example, training the language generator GPT-3 is estimated to have cost OpenAI $10 to $12 million—and that’s just the final model, not including the cost of developing and training its prototypes. “You could probably multiply that figure by at least one or two orders of magnitude,” says Benaich, who is founder of Air Street Capital, a VC firm that invests in AI startups. Only a tiny handful of big tech firms can afford to do that kind of work, he says: “Nobody else can just throw vast budgets at these experiments.”
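Taking Benaich’s multiplier at face value gives a rough sense of the total development bill (an illustrative calculation, not a figure from the report):

```latex
\$10\text{--}12\ \text{million} \times 10^{1\text{ to }2} \approx \$100\ \text{million to } \$1.2\ \text{billion}
```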

The rate of progress is dizzying, with thousands of papers published every year. But unless researchers know which ones to trust, it is hard for the field to move forward. Replication lets other researchers check that results have not been cherry-picked and that new AI techniques really do work as described. “It’s getting harder and harder to tell which are reliable results and which are not,” says Pineau.

What can be done? Like many AI researchers, Pineau divides her time between university and corporate labs. For the last few years, she has been the driving force behind a change in how AI research is published. For example, last year she helped introduce a checklist of things that researchers must provide, including code and detailed descriptions of experiments, when they submit papers to NeurIPS, one of the biggest AI conferences.

Replication is its own reward

Pineau has also helped launch a handful of reproducibility challenges, in which researchers try to replicate the results of published studies. Participants select papers that have been accepted to a conference and compete to rerun the experiments using the information provided. But the only prize is kudos.

This lack of incentive is a barrier to such efforts throughout the sciences, not just in AI. Replication is essential, but it isn’t rewarded. One solution is to get students to do the work. For the last couple of years, Rosemary Ke, a PhD student at Mila, a research institute in Montreal founded by Yoshua Bengio, has organized a reproducibility challenge where students try to replicate studies submitted to NeurIPS as part of their machine-learning course. In turn, some successful replications are peer-reviewed and published in the journal ReScience. 

“It takes quite a lot of effort to reproduce another paper from scratch,” says Ke. “The reproducibility challenge recognizes this effort and gives credit to people who do a good job.” Ke and others are also spreading the word at AI conferences via workshops set up to encourage researchers to make their work more transparent. This year Pineau and Ke extended the reproducibility challenge to seven of the top AI conferences, including ICML and ICLR. 

Another push for transparency is the Papers with Code project, set up by AI researcher Robert Stojnic when he was at the University of Cambridge. (Stojnic is now a colleague of Pineau’s at Facebook.) Launched as a stand-alone website where researchers could link a study to the code that went with it, this year Papers with Code started a collaboration with arXiv, a popular preprint server. Since October, all machine-learning papers on arXiv have come with a Papers with Code section that links directly to code that authors wish to make available. The aim is to make sharing the norm.

Do such efforts make a difference? Pineau found that last year, when the checklist was introduced, the number of researchers including code with papers submitted to NeurIPS jumped from less than 50% to around 75%. Thousands of reviewers say they used the code to assess the submissions. And the number of participants in the reproducibility challenges is increasing.

Sweating the details

But it is only a start. Haibe-Kains points out that code alone is often not enough to rerun an experiment. Building AI models involves making many small changes—adding parameters here, adjusting values there. Any one of these can make the difference between a model working and not working. Without metadata describing how the models are trained and tuned, the code can be useless. “The devil really is in the detail,” he says.

It’s also not always clear exactly what code to share in the first place. Many labs use special software to run their models; sometimes this is proprietary. It is hard to know how much of that support code needs to be shared as well, says Haibe-Kains.

Pineau isn’t too worried about such obstacles. “We should have really high expectations for sharing code,” she says. Sharing data is trickier, but there are solutions here too. If researchers cannot share their data, they might give directions so that others can build similar data sets. Or you could have a process where a small number of independent auditors were given access to the data, verifying results for everybody else, says Haibe-Kains.

Hardware is the biggest problem. But DeepMind claims that big-ticket research like AlphaGo or GPT-3 has a trickle-down effect, where money spent by rich labs eventually leads to results that benefit everyone. AI that is inaccessible to other researchers in its early stages, because it requires a lot of computing power, is often made more efficient—and thus more accessible—as it is developed. “AlphaGo Zero surpassed the original AlphaGo using far less computational resources,” says Koray Kavukcuoglu, vice president of research at DeepMind.

In theory, this means that even if replication is delayed, at least it is still possible. Kavukcuoglu notes that Gian-Carlo Pascutto, a Belgian coder at Mozilla who writes chess and Go software in his free time, was able to re-create a version of AlphaGo Zero called Leela Zero, using algorithms outlined by DeepMind in its papers. Pineau also thinks that flagship research like AlphaGo and GPT-3 is rare. The majority of AI research is run on computers that are available to the average lab, she says. And the problem is not unique to AI. Pineau and Benaich both point to particle physics, where some experiments can only be done on expensive pieces of equipment such as the Large Hadron Collider.

In physics, however, university labs run joint experiments on the LHC. Big AI experiments are typically carried out on hardware that is owned and controlled by companies. But even that is changing, says Pineau. For example, a group called Compute Canada is putting together computing clusters to let universities run large AI experiments. Some companies, including Facebook, also give universities limited access to their hardware. “It’s not completely there,” she says. “But some doors are opening.”

Haibe-Kains is less convinced. When he asked the Google Health team to share the code for its cancer-screening AI, he was told that it needed more testing. The team repeats this justification in a formal reply to Haibe-Kains’s criticisms, also published in Nature: “We intend to subject our software to extensive testing before its use in a clinical environment, working alongside patients, providers and regulators to ensure efficacy and safety.” The researchers also said they did not have permission to share all the medical data they were using.

It’s not good enough, says Haibe-Kains: “If they want to build a product out of it, then I completely understand they won’t disclose all the information.” But he thinks that if you publish in a scientific journal or conference, you have a duty to release code that others can run. Sometimes that might mean sharing a version that is trained on less data or uses less expensive hardware. It might give worse results, but people will be able to tinker with it. “The boundaries between building a product versus doing research are getting fuzzier by the minute,” says Haibe-Kains. “I think as a field we are going to lose.” 

Research habits die hard

If companies are going to be criticized for publishing, why do it at all? There’s a degree of public relations, of course. But the main reason is that the best corporate labs are filled with researchers from universities. To some extent the culture at places like Facebook AI Research, DeepMind, and OpenAI is shaped by traditional academic habits. Tech companies also win by participating in the wider research community. All big AI projects at private labs are built on layers and layers of public research. And few AI researchers haven’t made use of open-source machine-learning tools like Facebook’s PyTorch or Google’s TensorFlow.

As more research is done in house at giant tech companies, certain trade-offs between the competing demands of business and research will become inevitable. The question is how researchers navigate them. Haibe-Kains would like to see journals like Nature split what they publish into separate streams: reproducible studies on one hand and tech showcases on the other.

But Pineau is more optimistic. “I would not be working at Facebook if it did not have an open approach to research,” she says. 

Other large corporate labs stress their commitment to transparency too. “Scientific work requires scrutiny and replication by others in the field,” says Kavukcuoglu. “This is a critical part of our approach to research at DeepMind.”

“OpenAI has grown into something very different from a traditional laboratory,” says Kayla Wood, a spokesperson for the company. “Naturally that raises some questions.” She notes that OpenAI works with more than 80 industry and academic organizations in the Partnership on AI to think about long-term publication norms for research.

Pineau believes there’s something to that. She thinks AI companies are demonstrating a third way to do research, somewhere between Haibe-Kains’s two streams. She contrasts the intellectual output of private AI labs with that of pharmaceutical companies, for example, which invest billions in drugs and keep much of the work behind closed doors.

The long-term impact of the practices introduced by Pineau and others remains to be seen. Will habits be changed for good? What difference will it make to AI’s uptake outside research? A lot hangs on the direction AI takes. The trend for ever larger models and data sets—favored by OpenAI, for example—will continue to make the cutting edge of AI inaccessible to most researchers. On the other hand, new techniques, such as model compression and few-shot learning, could reverse this trend and allow more researchers to work with smaller, more efficient AI.

Either way, AI research will still be dominated by large companies. If it’s done right, that doesn’t have to be a bad thing, says Pineau: “AI is changing the conversation about how industry research labs operate.” The key will be making sure the wider field gets the chance to participate. Because the trustworthiness of AI, on which so much depends, begins at the cutting edge. 

Read more