Groups Step Up to Rescue At-Risk Public Data

Public Data Under Threat: Groups Rush to Preserve Government Datasets

The Threat to Public Data

Public data is the lifeblood of open research and scientific inquiry. However, the possibility of losing public datasets, including academic, government, and scientific data generated as part of research, is now spurring several groups to take action to save them.

The Cuts Run Deep

In early February, the New York Times reported that more than 8,000 Web pages had been taken down across more than a dozen websites as part of President Trump’s orders to eliminate controversial diversity, equity, and inclusion (DEI) programs. Unfortunately, the cuts have gone deeper than gender and racial ideology. According to the Times, they spanned 3,000 pages from CDC websites, including 1,000 research articles on everything from chronic disease prevention to the warning signs of Alzheimer’s disease.

Efforts to Document the Data

One of the groups racing to document the data before it disappears is the End of Term Web Archive, which is dedicated to documenting government websites every four years when the reins of power are handed to the next president. The group has worked to document every transition since 2008.

Other Groups Working to Save Data

Another group working to save data is the Environmental Data & Governance Initiative, which bills itself as a research collaborative and network of professionals working to promote scientific data. The group formed following President Trump’s first election in 2016 and helped to save 200 terabytes of data from government websites maintained under the Obama administration.

The Data Rescue Project, founded by members of the International Association for Social Science Information Service & Technology (IASSIST), the Research Data Access & Preservation Association (RDAP), and the Data Curation Network, is also working to save data. The group encourages volunteers to document at-risk datasets by using DataLumos, a crowdsourced repository for government data created by the Inter-university Consortium for Political and Social Research (ICPSR) at the University of Michigan.

Harvard University’s Data.gov Archive

Harvard University’s Library Innovation Lab is also working to help protect data. Last month, the group launched a new project called the Data.gov Archive, which is designed to preserve datasets that have been linked to Data.gov, the Federal Government’s home for open data. The university group says it has "harvested" more than 310,000 datasets linked through Data.gov, for a total of 15 terabytes of data.

The Importance of Data Preservation

"It’s not uncommon for data to get lost under the normal course of business," said Lynda Kellam from the Data Rescue Project. "The difference is that we are seeing data being removed from studies that don’t match up with the ideology of the administration. This pace of takedown has been much quicker than it’s been in the past."

Conclusion

As Philip Bourne, the dean of the School of Data Science at the University of Virginia, wrote in a blog post, "As deans and university leaders, we need to make clear to governments that to be a public university means public accessibility to all the scholarship we produce, including the data from which that scholarship is derived."

FAQs

Q: What is the End of Term Web Archive?
A: The End of Term Web Archive is a group dedicated to documenting government websites every four years when the reins of power are handed to the next president.

Q: What is the Environmental Data & Governance Initiative?
A: The Environmental Data & Governance Initiative is a research collaborative and network of professionals working to promote scientific data.

Q: What is the Data Rescue Project?
A: The Data Rescue Project is a group founded by members of the International Association for Social Science Information Service & Technology (IASSIST), the Research Data Access & Preservation Association (RDAP), and the Data Curation Network to save at-risk datasets.

Q: What is DataLumos?
A: DataLumos is a crowdsourced repository for government data created by the Inter-university Consortium for Political and Social Research (ICPSR) at the University of Michigan.

Q: What is the Data.gov Archive?
A: The Data.gov Archive is a project launched by Harvard University’s Library Innovation Lab to preserve datasets linked to Data.gov, the Federal Government’s home for open data.
