What Research Data Infrastructure Requires from Library Leadership
For a long time, the conversation about research data infrastructure was dominated by two camps: IT departments, who were responsible for compute and storage, and researchers, who generated the data and had opinions about how it should be managed.
Libraries were usually not in that conversation. We were downstream. We helped people cite datasets. We held workshops on data management plans.
That was not wrong, but it was insufficient.
The gap that libraries are uniquely positioned to fill
When we built UCLA Dataverse as a campus publishing platform, we were not just installing software. We were making policy decisions: what counts as publishable data, how curation works, what metadata standards apply, which data requires restricted access, how long data must be retained, and who is responsible for retaining it.
Those decisions required domain knowledge, community relationships, and a long-term institutional perspective. IT could not make them unilaterally. Researchers did not always agree with each other. Someone had to hold the policy surface and negotiate across those interests.
That is a library function.
The same dynamic emerged when we licensed Redivis for restricted data. The platform provides the technical capability. But the governance required sustained library leadership attention: who can request access, what Data Use Agreements look like, how the approval workflow is structured, and how to keep that workflow from becoming a bottleneck.
Libraries occupy a position between researchers, IT, funders, and the institution. We are trusted by all of them in different ways. We have obligations that run across projects and beyond individual grant cycles. That position is valuable precisely because it is not fully captured by any one interest.
What this requires of library leaders
It requires systems thinking, not just service delivery.
Service delivery asks: what does this researcher need today? Systems thinking asks: what does the campus need in five years, and what policies, platforms, and capacity have to be in place to support that?
Both matter. But library leaders who stay only in service delivery mode will not build infrastructure. They will build a series of one-off responses, each of which is individually helpful and collectively incoherent.
It also requires a willingness to be in rooms where decisions are being made — budget conversations, IT governance, research computing strategy — and to make an affirmative case for what libraries contribute. Not just to advocate for library resources, but to clarify that certain problems (curation, policy, community trust, long-term access) genuinely require library expertise to solve.
Infrastructure as scholarly stewardship
I find the word “infrastructure” clarifying because it redirects attention from service to substrate. The goal is not just to help individual researchers. The goal is to build the conditions under which research of a certain kind can happen at all.
When we built the restricted data environment, we were not just helping a few research groups with sensitive datasets. We were making it possible for UCLA researchers to work with health records, voting data, and other high-sensitivity sources that would otherwise require travel to a federal data enclave or could not be used at all.
When we scaled computational training across the UC system, we were not just running workshops. We were changing the baseline skill level of a researcher cohort that will produce scholarship for the next 30 years.
That is stewardship at a different scale than what we usually claim as library work.
The claim worth making
Libraries should be central to research data infrastructure — not as vendors of services, but as stewards of the scholarly ecosystem that infrastructure enables.
That claim requires us to show up differently. It requires governance work, policy work, technical partnership, and a long-term view that survives individual grant cycles and personnel changes.
It is work that most research institutions badly need and that most IT departments cannot do alone.
Libraries can do it. But we have to be willing to lead.