The RDAP Summit provides a venue for reaching across disciplines, institutions, and organizations to learn about common solutions to issues surrounding research data management. Attendees of the summit have multiple opportunities to expand professional networks and acquire practical knowledge and skills that can be applied to their own work and projects.
The 2021 summit was held virtually.
Amelia Kallaher and Reid Otsuji
Colgate University Libraries, International Association for Social Science Information Services and Technology (IASSIST), University of Wisconsin-Madison iSchool, Iowa State University Library, University of Illinois Libraries, University of Wisconsin-Madison Libraries, Ex Libris, FigShare, National Network of Libraries of Medicine – New England and Greater Midwest, Purdue Libraries and School of Information Studies, SPARC (New Venture Fund), University of Arizona Libraries, Northeastern University, Syracuse iSchool, and an anonymous donor.
All times are listed in PST (12:00 noon EST start time)
9:00am - 9:10am Opening Remarks
9:10am - 10:10am Opening Keynote
No! Thinking About Critical Refusal as Data Practice
Dr. Tonia Sutherland, PhD
University of Hawaiʻi at Mānoa
Refusal does not just imply an act of negation, a response to authority, or resistance reimagined. Rather, refusal is intentionally generative and strategic, signifying a deliberate move away from one belief or practice and a considered reorientation toward another. In a world suffused with corporatized, militarized big data practices and exploitative, extractive artificial intelligence systems, how might we reorient our data practices through acts of critical refusal? Using feminist, Black, and Indigenous frameworks, this talk engages ideas around critical refusal and data sovereignty to imagine, and make the case for, more equitable and just futures. In this talk, Sutherland draws on the work of Sara Ahmed, Audra Simpson, Ruha Benjamin, and the authors of the Feminist Data Manifest-No, in addition to her own work, to discuss critical refusal as an informed practice of “talking back” and of knowing “when to stop,” arguing that critical refusal can be a generative concept for challenging harmful data practices as we simultaneously negotiate and develop viable strategies for change.
Dr. Tonia Sutherland is assistant professor in the Department of Information and Computer Sciences at the University of Hawaiʻi at Mānoa. Global in scope, Sutherland’s research focuses on entanglements of technology and culture, with particular emphases on critical and liberatory work within the fields of archival studies, digital studies, and science and technology studies (STS). In her work, Sutherland focuses on various national infrastructures–technological, social, human, cultural–addressing important concerns such as gaps and vagaries; issues of inclusivity and equality; and developing more liberatory praxes. Sutherland is the author of Digital Remains: Race and the Digital Afterlife (forthcoming from University of California Press). She is a faculty affiliate of the Center for Critical Race and Digital Studies at New York University and a member of the Center for Critical Internet Inquiry (C2i2)’s Scholar’s Council at UCLA.
10:10am - 10:20am Break
10:20am - 11:20am Panel 1: Responsible Data Practices
Diversity Scholars' Data: Practices, Gaps, and Potential Resources
This presentation will discuss the results of original research conducted with the National Center for Institutional Diversity’s Diversity Scholars Network (DSN), a self-identified, multidisciplinary group of scholars at various institutions whose work advances understanding of identity, difference, culture, representation, power, oppression, and inequality. Using an exploratory mixed-methods approach (interviews followed by a survey), we collected information about scholars’ data practices as well as the areas of the data lifecycle where they would benefit from more support. We also solicited feedback on a set of potential data management toolkit resources specifically targeted toward supporting practitioners around diversity, equity, inclusion, and accessibility concerns regarding their data. The presentation will address our progress toward building such a toolkit from new and existing resources.
Supporting Responsible Research with Big Social Data by Connecting Communities of Practice
Big social data (e.g. social media and blogs) represent a radical change in the way research is conducted and the way data are curated, and thus introduce new ethical, legal, and epistemological challenges. This presentation suggests that big social research has key similarities to qualitative data reuse (e.g. research using archived interview transcripts and diaries). Both types of research repurpose existing data to advance discoveries in social science, and both also present challenges relating to context, informed consent, privacy, and intellectual property. However, despite these similarities, big social research has not yet been widely framed as a form of qualitative data reuse, and these communities of practice remain under-connected. This presentation will review preliminary results of interviews with qualitative researchers, big social researchers, and data curators, ultimately suggesting that robust data curation strategies developed to support responsible qualitative data reuse can inform similar practices for big social data.
Data Consultations, Racism, and Critiquing Colonialism in Demographic Datasheets
Nina Exner, Erin Carrillo, Sam Leif
Race in the U.S. is a colonial construct. Racial demographic terms are heavily influenced by early European concepts of race. “Asian-American” is used to refer to groups from Pakistani-American to Vietnamese-American. Very different tribal cultures are combined into “Native Americans”. Children of African immigrants are grouped with people whose great-grandparents were forcibly taken from Africa. Nevertheless, most of our researchers automatically use colonial racial categories. It rarely occurs to non-demographers to consider how to address race in their demographics sheets. How might this affect a researcher exploring factors that contribute to educational or health disparities? How might stepping away from colonial racial constructs affect research publication ability if journals aren’t moving forward with us? We would like to give examples of how this affects our own consultations and data collection and discuss how other data librarians might use this awareness to advise researchers on more authentically representative data collection.
11:20am - 11:30am Break
11:30am - 12:00noon Networking with Sponsors in GatherTown
12:00noon - 12:50pm Lightning Talks
Data Visualizations for Everybody - A Lesson on Accessibility
Ever wondered how to make your data visualizations appealing to everybody? Discover and learn the do's and don'ts of data visualizations as they relate to accessibility. Effective data visualizations should be more than just something "pretty" for the select few; effective visualizations should be functional and accessible to everybody! Learn the basics of accessible data visualizations in this 5-minute lightning talk. No prior knowledge required.
Do I have to be an “other” to be myself? Nonbinary Gender in Taxonomy, Data Collection, and through the Lifecycle
Sam Leif, Ari Gofman, Hannah Gunderman, Nina Exner
Relatively stringent meta-analyses suggest 0.3% of adults (3 per 1,000) in the U.S. identify as a nonbinary gender or otherwise gender nonconforming. Data structures that limit gender to “male” and “female” or ontological structures that use mapping to collapse gender demographics to binary values are excluding this whole population. We sometimes see “Other” as a gender response option, but “other” is a very excluding category. Critical examinations of taxonomies and ontologies are suggesting revisions to gender description. As RDAP members well know, the repository for sharing/discovering the dataset has to support the metadata describing gender identity. A repository would need particular ontological mapping to allow for long-term representation of diverse gender identities. In this talk, we will summarize the problem of gender inclusion and discuss some of the solutions underway. We will discuss some of the critical taxa being used, data collection options, how inclusive practices interact with ontological approaches, and how inclusive gender representation looks throughout the data lifecycle.
Caseload: Expanding REDCap Support to Host COVID-19 Testing and Tracking Projects
The University of Washington (UW) hosts one of the world’s largest REDCap instances, with more than 12,000 active users and 16,000 projects since launching in 2012. User support for REDCap, a HIPAA-compliant data collection platform, is administered through a ticketing system supported by librarians and staff from the Health Sciences Library and UW Medicine IT.
With Seattle emerging as one of the first COVID-19 hot spots in North America, UW’s REDCap instance became a key clinical resource for UW Medicine and local and county government health agencies, supporting projects from early case tracking to contact tracing to back-to-campus testing. Ticketing agents triaged more than 10,000 tickets between March and October 2020, an increase of almost 70 percent on the year before. This presentation will cover examples of how health officials used REDCap to support COVID-19 projects and how UW staff and librarians responded to the sudden surge in high-priority usage.
Reflecting on Teaching Carpentries Workshops Online
This presentation explores the lessons learned from pivoting a Data Carpentries workshop from in-person to online. Before the initial onslaught of tightened COVID-19 restrictions, Carpentries workshops worldwide were traditionally held in-person, facilitating a learning environment where personalized instruction was further enhanced by helpers who would streamline troubleshooting processes. Two instructors, one librarian and one Wildlife Biology PhD student, jumpstarted and co-taught the University of Montana’s first in-person Carpentries workshop focused on the R programming language during February 2020. Due to COVID-19 complications, a repeated workshop was postponed to the 2020 fall semester and was adapted for a fully online setting. These circumstances granted the instructors a unique opportunity to reflect upon and compare the effectiveness of facilitating Carpentries workshops for novice learners in a face-to-face environment versus a virtual modality.
What's Next: Sustaining Data Science Support with a Community of Practice
Many academic libraries at research-intensive institutions have now offered data services in some shape for years. How do we evolve our support for emerging Data Science initiatives to foster strong partnerships with our institutions' researchers, research offices, and campus IT? This short talk will cover one experience forming a cross-library data science community of practice to keep up with shifting priorities and opportunities, while staying grounded in the libraries' mission to provide distinctive and diverse user-centered services.
Update of the SPARC Data Sharing Resource
Reid Boehm, Jonathan Petters
Working under the auspices of RDAP, a group has updated the content of the SPARC data sharing resource. This community resource allows for tracking, comparing, and understanding current U.S. federal funder research data sharing policies. We will introduce this resource and its updates as useful for data management professionals, especially those supporting US government-funded research.
American Archives and Climate Change
Climate change is a major threat to archives. Archival materials are located at numerous diverse institutions in the public and private sector. Examples include a county health department’s birth and death records, a state archive’s land surveys, a national archive’s records of government activities, a university research library’s collection of papers from authors and politicians, or a local historical society’s neighborhood scrapbooks. Archives are a critical part of both democratic accountability and our cultural heritage. Climate change has the potential to impact everything from demands on building cooling and heating systems, to disaster response strategies, to permanent relocation of records from repositories in geographically vulnerable areas. Archivist Eira Tansey (University of Cincinnati) will discuss her current research activities concerning the climate change risks to geographically vulnerable American archives. Much of this research concerns an unprecedented effort to gather geospatial data about the locations of thousands of American archives.
Preparing a Data Archive or Repository for Changing Research Data and Materials Retention Policies
Journal- and university-based research data policies increasingly require researchers to make available their code and other reproduction materials. Many data archives and repositories were designed only with the data itself in mind. Data librarians and archivists need to consider how best to archive these additional materials. The recently redesigned CISER Data and Reproduction Archive offers one model to build upon. Data and reproduction materials are housed in the same searchable archive but built to display differing metadata requirements. For example, as the reproduction materials are effectively an extension of the article, a “reference article” field is needed to differentiate it from citations of the material.
12:50pm - 1:00pm Break
1:00pm - 2:00pm Panel 2: DMPs and FAIR Practices
Finding Way Between Assuring Trustworthiness of Digital Repositories and Enabling FAIR data
Trustworthy repository services and FAIR data have wide benefits for data users, research communities, and society at large. At the same time, they both support the mission and objectives of repositories to offer valuable services to their designated communities. The trustworthiness of digital repositories can be evaluated at different levels of complexity: (1) Core (CoreTrustSeal), (2) Extended (Nestor/DIN 31644), and (3) Formal (ISO 16363). The question becomes to what extent and how data repositories can support the principles of FAIR data. CoreTrustSeal (CTS) certification is highly relevant for repositories in this respect because the CTS requirements align well with and complement the FAIR data principles. Even though they address the issue from different perspectives (i.e. data vs. repository level), they share the same goal – making data reusable. The paper aims to present the framework for audit and certification and to illustrate the alignment between CoreTrustSeal requirements and the FAIR principles.
The DMPs / IRB Nexus and How Data Librarians Can Help
Data librarians play a key role in addressing specific project inquiries on data management and potential sharing, mediating among the various stakeholders involved: investigators, institutional review boards (IRBs), and data repositories. At the same time, there is little formal interaction between data librarians and IRB professionals, and the former might not be aware of evolving views among the latter on data-sharing practices, especially those for sensitive data.
Data Management Planning for Systematic Reviews/Knowledge Syntheses: Development of a DMP Template and LibGuide
Heather Ganshorn, Zahra Premji
There is little formal guidance around data management for systematic reviews and other forms of knowledge synthesis. As librarian co-authors on many systematic reviews, we are often involved in data-gathering and management for these types of projects. With funding from the Canadian Association of Research Libraries’ Portage Network, we developed a DMP template and supplementary LibGuide for use with Portage’s DMP Assistant tool. Our intention is that this guidance will be a living document that evolves over time with community feedback. While the template itself is structured based on the existing DMP Assistant template categories, the LibGuide mirrors the workflow of a systematic review project (planning, locating studies, screening, data extraction, synthesis, and manuscript preparation). We will discuss how the template and LibGuide can be used together to create an effective DMP for systematic reviews and knowledge synthesis reviews.
All times are listed in PST (12:00 noon EST start time)
9:00am - 10:00am Panel 3: Collaborative Data Projects
No-nonsense, Practical Guide to Implementing Effective Data Practices
Data curation, discovery, reuse, and citation are accelerating scholarship across diverse fields and disciplines, but fall far short of the potential. Stakeholders across campuses are asked to “do better” at managing research data outputs. Often this comes with jargon-filled mandates that are difficult to translate into concrete action. In the summer of 2019, NSF published a clear, concrete call to action for researchers in a Dear Colleague Letter. In an effort to amplify that call and offer clarity to campus stakeholders, ARL, AAU, APLU, and CDL partnered to convene 40 experts in order to create clear, actionable, easy-to-understand guidance. The resulting report Implementing Effective Data Practices: Stakeholder recommendations for collaborative research support was recently released. This report provides specific recommendations for adopting and implementing persistent identifiers and supporting machine readable data management plans across an institution and within an organization or technology platform. It also provides key considerations for funders in adopting and requiring these critical infrastructure components. While the adoption and implementation of these best practices may be straightforward for some, communication about the importance of this infrastructure and the ease with which it may be implemented is needed. To support institutional and organizational efforts in the adoption of this infrastructure, the project team has developed a communication toolkit that includes various slide decks and talking points. This presentation will introduce these key areas and provide attendees with an introduction to the toolkit and how they can leverage it to promote adoption of persistent identifiers for various stakeholders.
Research as Design-Design as Research: Developing a Researcher-Driven Collaborative Model for Data Services
Cinthya Ippoliti and Kay Bjornen
Managing research data is challenging for researchers, and academic libraries are expanding their services to help support them. Library surveys have been conducted to understand researcher behavior, motivation, and habits related to data management, but they have some limitations: surveys condense complex issues to a few broad categories that may not be applicable to researchers even within the same discipline, and proposed solutions are seldom designed in concert with researchers’ input. In our IMLS-funded project, we worked with Oklahoma State University faculty and used customer journey mapping and design thinking to document their practices throughout the research lifecycle and develop workflows for their data needs based on their own insights. We will also discuss how we are transitioning these workflows into a toolkit that we are piloting with local institutions to help us refine these processes and provide a way for smaller organizations to apply these tools to their own context.
Radical Change for RDM in Canada – Stakeholders, Services, and Synergies
Building on many years of effort and progress, Research Data Management (RDM) is entering an exciting and radical new era in Canada with the creation of a New Digital Research Infrastructure Organization (NDRIO) that brings RDM, Advanced Research Computing (ARC), and Research Software (RS) together into one cohesive enterprise. The significant government investment in RDM this represents reflects confidence in work accomplished to date and a shared aspirational vision for the future. This presentation will outline strategic service and platform directions for RDM in Canada – making data more open and discoverable – and the synergies that merging with ARC and RS will foster. We’ll also look at the foundational and ongoing role of a Canada-wide Network of Experts coordinated by a national RDM Secretariat and aligned closely with the research data life cycle. We’ll introduce new tools, resources, and platforms and how community engagement has underpinned everything we do.
10:00am - 10:10am Break
10:10am - 11:10am Panel 4: Meeting Community Needs
The Zooming Winds of Change: Developing a New Curriculum for RDM Instruction from the Virtual Ground Up
Alisa Beth Rod, NuRee Lee, and Sandy Hervieux
This presentation will discuss a rapidly developed series of 5 workshops on research data management (RDM) at a large Canadian university in the context of a fully virtual academic year due to the COVID-19 pandemic. Across the world, many institutions had to adapt quickly to a “new normal” of online teaching. With this change, instructors endeavored to adapt courses and training to a virtual environment while maintaining rigor and pedagogical best practices. The presenters will outline and discuss specific lessons learned in developing a completely new curriculum model for virtual RDM instruction at their institution, including team teaching (relying on both the RDM specialist and subject liaison librarians), keeping the content and ambiance light-hearted and fun, building interactive elements throughout the workshops using Poll Everywhere, and incorporating an instructional scaffolding approach through the provision of digital resources.
Meeting the Challenges for Data Curation and Access in Lock Down
The COVID-19 pandemic has sparked even greater demand for high-quality social, economic, and health data to assist our understanding of the virus and the social and economic impact of the pandemic. The UK has a wealth of high-quality data collections, many of which have reacted quickly to COVID-19. This pace is unprecedented and the challenge has been to ensure that both the data quality and security are not compromised in the process. The closure of academic institutions during lockdown meant many researchers were not able to access secure data only available under strict conditions, but the UK Data Archive has reacted quickly to this and negotiated new access pathways with the data owners, enabling policy-changing research to continue. In this presentation, we will discuss the work involved in meeting these challenges as well as what impact these changes will have on the future of data curation and access.
Adapting to Faculty & Student Needs for Data and Support for Social Justice and Other Projects
Sarah A. Norris
During 2020, the need for access and equity has been more important than ever, particularly with data. At the University of Central Florida (UCF), a group of librarians representing various library units, including Research & Information Services, Scholarly Communication, and Technology Solutions & Digital Initiatives, has been engaged in data services and outreach since forming a working group in 2019. This group undertook several projects in 2020 related to open access data and social justice resources. One such project included creating an interdisciplinary research guide to identify data resources around topics such as diversity and inclusion, public health, and cultural issues related to race, ethnicity, gender, and identity. Another project included identifying open access data related to COVID-19, while another emphasized open and freely available digital textbooks and resources related to social justice. This session will highlight how these projects support research, new program certificates, and program accreditations at UCF.
11:10am - 11:20am Break
11:20am - 11:50am Lightning Talks
Update on the Data Curation Network
Jonathan Petters, Wanda Marsolek
Begun in 2016, the Data Curation Network is a network of US research data curators who pool their expertise to improve the reusability of datasets. In this lightning talk the recent and future activities of the Network will be discussed. The Network is soon ending its Sloan Foundation-funded phase and will be transitioning to a member-sustaining network. This transition and sustainability plan and how other research data curation services may participate in the Network in the future will be discussed. The Network’s newer initiative to address racial justice through its actions will be highlighted.
Reflections on “Best Practices” from a Neurodivergent and Anxious RDM Librarian
The term “best practices” is used heavily in librarianship, including within research data management support. While following best practices can help ensure a person is appropriately and ethically managing their data and meeting benchmarks, the term itself can have a negative effect on a researcher who lives with anxiety and/or panic disorders. In this lightning talk, I reflect on how small changes in our language within data management consultations can create more inclusive, welcoming learning environments. Data management is difficult, and many people aren’t taught these concepts in their coursework. For a person living with anxiety, hearing the term “best practices” can add barriers to adopting data management into their workflow, especially if they cannot meet those best practice standards. RDM librarians have an incredible opportunity to make small changes to their language within data consultations to create a more welcoming community around data management education.
Meet the Data Jobs Data Set
Abigail Goben, Liang Tang
Are you curious how data librarian jobs have emerged and evolved now that we are a decade into the NSF DMP requirement? Meet the data set that might help you ask questions, find answers, and determine if your job ad has too many requirements. This dataset has been compiled over the past decade and will be introduced as a resource for the data librarian community to learn how data jobs have evolved, identify trends and challenges, and plan for what might come next.
A Sensitive Data Toolkit for Researchers: Supporting Sensitive Data Sharing in Canada
Following the Open Science movement, sensitive data are becoming increasingly discoverable and accessible. Researchers need creative tools and solutions to address the ethics and privacy considerations involved in making their sensitive data shareable. The Portage Sensitive Data Expert Group has produced a Sensitive Data Toolkit to help researchers and research ethics boards support sensitive data sharing. This toolkit will help to steward the paradigm shift from the present model, in which sensitive data are typically presumed to be single-use only, to a new model in which valuable sensitive data are continuously re-used and preserved as appropriate. This new model enhances participant autonomy and supports the principles of equity and justice embedded in Canadian research ethics policy. Innovations in sensitive data management increase Canadian research capacity while providing Canadians with greater opportunity to contribute to research.
11:50am - 12:00noon Break
12:00noon - 1:00pm Closing Keynote
Christina Gosnell and Zane Selvans
Distributing power with open data
Lessons learned from getting electricity system data into the hands of advocates and researchers to shift power in energy policy making. How does data actually make change? What does accessibility mean? When should data be a flow and not an archive?
Christina Gosnell (she/her) has spent the last ten years researching and organizing to advance climate and energy policy. Christina is the president and co-founder of Catalyst Cooperative, where she enjoys working at the intersection of public data curation, utility policy, and data analysis. Zane Selvans (he/him) is Catalyst Cooperative’s Chief Data Wrangler. After spending a PhD studying other planets with NASA, he decided to work on keeping Earth habitable instead.
1:00pm - 2:00pm RDAP Business Meeting
All times are listed in PST (12:00 noon EST start time)
9:00am - 11:45am Workshop 1
Using Storytelling for New Ways of Teaching Data Management
Margaret M Janz
University of Pennsylvania
At the beginning of the pandemic, some libraries noticed an uptick in attendance at online workshops. As time has worn on, researchers and librarians alike have burned out on virtual workshops. This workshop will teach participants to use elements of storytelling as a basis for reimagining data management education. Examples of data management themes in popular stories and tropes will be given.
In addition to storytelling, we’ll discuss options for turning stories into engaging synchronous and asynchronous lessons using novel delivery methods such as podcasts, games, and crowdsourced activities. Participants will workshop stories and lesson plans in small groups during the workshop.
This workshop will draw on the presenter’s seven years of teaching data management workshops, her work on research communication in her position at the University of Pennsylvania Libraries, and her research on the connections between science communication and educational practices as a graduate student at Hamline University.
Times are listed in PST (12:00 noon EST start time)
9:00am - 11:45am Workshop 2
Changes and Continuities in Curating Interdisciplinary Data
This workshop will address key issues surrounding the management and curation of data generated by interdisciplinary and highly collaborative research (IHCR), defined broadly as research that crosses disciplinary and institutional boundaries. IHCR addresses complex problems and identifies paths for positive change in such areas as public health, the environment and the diversity and equity movements. As IHCR grows, information professionals must respond accordingly with the necessary tools, resources, and services.
The workshop will be organized in two parts. During the first part participants will learn about the practices of interdisciplinary research and approaches to interdisciplinary data curation in the context of the research data lifecycle. During the second part participants will share their experiences with interdisciplinary data curation and discuss how data professionals can change or improve their services to respond to the needs of IHCR. The workshop will be useful to anyone interested in working with IHCR data.
11:45am - 12:15pm Break
Times are listed in PST (3:15pm EST start time)
12:15pm - 3:00pm Workshop 3
Three birds, One stone: Advancing Data Literacy through RDM and Carpentry Instruction Integration
University of California, Santa Barbara
Carpentry workshops and research data management (RDM) instruction focus on the mechanics of working with data: manipulating and plotting data in the Carpentries; managing, citing, and archiving data in RDM. These topics are essential but not exhaustive, for they fit under the broader umbrella of data literacy where other essential topics are to be found. Data quality, fitness, and credibility are important attributes that should be assessed by anyone working with data. Understanding how data and visualizations can be misleading is an essential skill. We believe that instruction in general data literacy could take advantage of well-established pedagogies to minimize learners’ vulnerability to hidden errors and to avoid propagating errors. This is a participatory workshop in which educators are invited to explore how additional data literacy topics can be introduced and integrated into existing RDM and Carpentry offerings. The output of the workshop will be a published summary of the suggestions produced.
Times are listed in PST (3:15pm EST start time)
12:15pm - 3:00pm Workshop 4
How Data Professionals Can Help Make Science a Tool for Social Change
Institution for Social and Policy Studies, Yale University
During times of change, public trust in science matters more than ever. Scientists by and large determine whether good questions are asked, the right data are collected, the correct analyses are conducted, and the conclusions are well supported. But academic data professionals also have a role to play in enhancing the public’s trust in science: they help ensure scientific transparency and reproducibility. This workshop suggests that traditional stewardship responsibilities must evolve to meet the moment. Participants will learn practical strategies for enhancing research materials for transparency and reproducibility. The workshop will focus on computational reproducibility and on the activities that ensure that statistical and analytic claims about given data can be reproduced with that data. Participants will use YARD (the Yale Application for Research Data) to experience the CURE workflow using examples and hands-on activities. This 3-hour workshop is intended for librarians, data curators, and researchers of diverse professional backgrounds and experience.