By Kathryn Palmer
Academic researchers know that artificial intelligence (AI) technology has the potential to revolutionize the technical aspects of nearly every industry. And while they're trained to apply such innovations in ethical, equitable ways, they have limited access, compared to profit-driven tech companies, to the expensive, powerful technology required for AI research.
That divide has scholars and other government-funded researchers concerned that the developments emerging from the AI Gold Rush could leave marginalized populations behind.
For instance, a radiology technician could use a generative AI agent to read X-rays, in theory leading to more accurate diagnoses and better health outcomes. But if that AI agent were trained solely on data from a hospital in an affluent neighborhood, it might fail to pick up on signs and symptoms that are more common in lower-income communities.
The wealthier population "could have a fundamentally different distribution of tell-tale signs that would not necessarily match that same distribution in a population of folks who, for example, have a hard time making it to a medical practitioner regularly," said Bronson Messer, the director of science for the U.S. Department of Energy's Oak Ridge Leadership Computing Facility in Tennessee, which houses Summit, one of the most powerful publicly funded supercomputers in the nation, which some academics are using for AI research.
"There's this persistent concern that the data that's being used to train generative AI could have inherent biases that are almost impossible to discern until after the fact, because a generative AI agent can only interpret what it's been given."
The Resource Divide
Removing that bias is one of the overarching goals of the National Artificial Intelligence Research Resource pilot (NAIRR), which the National Science Foundation (NSF) helped launch in January.
"It's something that needs to be paid attention to, and the U.S. academic community is most well-positioned to suss that out," said Messer, who is a member of the NAIRR Allocations Working Group. "I don't want to leave that to Meta or Google. That's a problem that should be debated in the open literature."
The NAIRR pilot is the result of President Joe Biden's executive order on the safe, secure and trustworthy development and use of AI, according to a news release from the NSF, which is leading the pilot in partnership with the Energy Department.
Through the two-year pilot, 77 projects so far, the majority of them affiliated with universities, have received an allocation of computing and data resources and services, including remote access to Summit and other publicly funded supercomputers. The pilot has prioritized projects that focus on using AI to address "societal challenges" in sectors such as agriculture and health care.
Although university researchers have long spearheaded innovation in those and other fields, access to the increasingly complex infrastructure necessary to do AI-driven research and development, known as AI compute, is expensive and highly concentrated among private tech companies, such as OpenAI and Meta, in select geographic locations such as the Bay Area, New York City and Seattle.
Compared to tech companies, even the nation's most well-endowed research universities don't have anywhere near the compute power needed for "querying, fine-tuning, and training Large Language Models (LLMs) to develop their own advances," as a recent paper from the Brookings Institution put it. And smaller institutions, most of them located far from major tech hubs, have even fewer resources and less expertise to undertake AI research.
To put it in perspective, Reuters reported in April that Meta plans to accumulate about 600,000 state-of-the-art graphics processing units (GPUs), the computer chips that power AI applications, by year's end. Summit, at Oak Ridge National Laboratory in Tennessee, has about 27,000 GPUs.
"Fundamental Democracy Issue"
The possible implications of these AI research resource disparities "are a real fundamental democracy issue," said Mark Muro, a senior fellow at Brookings Metro who specializes in the interplay of technology, people and place. "To the extent that AI becomes a huge driver of productivity, if that turns out to be true, then it is a problem if only a short list of places are truly benefiting from its impact on the economy."
The same goes for choosing which research questions to investigate, as well as where and how to pursue them.
"We may end up with a narrow set of research questions chosen solely by big tech," Muro said.
And those companies may not have the interest or monetary incentives to address region-specific problems, such as a particular health-care crisis or forest-fire management. "Those things could be really energized by an AI solution, but there may be nobody researching in that place, though sometimes place-based solutions can be really important."
Race and gender biases are also a concern because, much like the upper echelons of academic scientific research communities, the tech industry is dominated by white men.
"If we don't try to include more Minority Serving Institutions, HBCUs and broaden who gets to do AI research, we're just going to reinforce this issue of a lack of diversity in the field," said Jennifer Wang, a computer science student at Brown University who co-authored a paper with Muro on the AI research divide, published by Brookings earlier this month.
"Right now, a lot of AI research is focused on developing better, more performant models, and less attention is given to the biases within these models," Wang said. "There's less thought given to capturing linguistic nuances and cultural contexts because those models aren't really built with certain populations in mind."
Democratizing AI research is one of the primary goals of the NAIRR pilot. It is expected to run for two years, though the directors of both the NSF and the federal Office of Science and Technology Policy have expressed hope for enough funding to allow NAIRR to carry on beyond that.
While several big-name research universities, including Brown, Harvard and Stanford, have received allocations, research institutions with smaller profiles, including the University of Memphis, Florida State University and Iowa State University, are also part of the NAIRR pilot.
Iowa State's project, for one, aims to use the Frontera supercomputer housed at the Texas Advanced Computing Center at the University of Texas at Austin to develop "large, vision-based artificial intelligence tools to identify and eventually recommend controls for agricultural pests," according to a news release from the university.
Equitable Access
But without support from the NAIRR pilot, the Iowa State project might never have launched.
"University research is often early-stage and has the flexibility to concentrate on societally relevant problems that industry may not currently be interested in, ensuring that critical issues like agricultural resilience receive the attention they deserve," Baskar Ganapathysubramanian, an engineering professor leading the project and director of Iowa State's AI Institute for Resilient Agriculture, wrote in an email. "This allows academic research to prioritize benefits to the public good over commercial interests, focusing on ethical considerations and long-term societal impacts."
For as long as resources are available, the NSF expects "the next cohort of projects to be announced shortly, and approximately each month thereafter," Katie Antypas, director of NSF's Office of Advanced Cyberinfrastructure, wrote in an email.
But considering that Congress slashed the NSF's 2024 budget by 8 percent, it's not clear that NAIRR will become a permanent fixture of the academic research enterprise.
"We believe the two-year NAIRR pilot has great potential," Julia Jester, deputy vice president for government relations and public policy for the Association of American Universities, said in an email. "But to meet the project's goal of widely improving access to the AI infrastructure needed to advance research and train the next generation of researchers, NSF and other agencies will need significantly more resources."
Suresh Venkatasubramanian, a professor of computer and data science at Brown and director of the universityâs Center for Tech Responsibility, is just getting started on his NAIRR pilot project, which aims to develop tools that bring more transparency to the data used to train LLMs.
As AI technology begins to permeate every profession, including those in research, business and health care, reckoning with its full implications is critical.
"It's really important that across the board, institutions of higher learning can embrace what we're seeing with AI and learn about it, play with it and reimagine how we should use it beyond the imaginations and the solutions being provided by the people at the tech companies," Venkatasubramanian said. "We can't do that without equitable access to the core compute units."