Teaching Computational Archival Science: DCMI Keynote

Unlocking the Power of Data in the Archives

Archival science is undergoing a significant shift, driven by computational technologies. As the volume of data being generated continues to grow, archives face increasing pressure to manage, preserve, and extract meaningful insights from their records at scale.

Computational Archival Science (CAS) is an emerging transdisciplinary field that combines the expertise of archivists, information scientists, and computer scientists. CAS applies computational methods and resources to the complex challenges faced by modern archives, from large-scale data processing to improving access and preservation.

This article covers the significance of CAS, the key pedagogical developments that have emerged, and future directions in metadata, including the potential of Generative AI and Large Language Models. Together, we’ll explore how this approach can change the way we understand and interact with our collective historical record.

The Rise of Computational Archival Science

Computational Archival Science (CAS) was formally established in 2016 as a response to the transformative effects of computing technology on archival practice and theory. A group of technologists and archivists from the US, Canada, and the UK recognized the need for a new transdisciplinary field that could bridge the gap between traditional archival methods and emerging computational tools and techniques.

The foundational paper, “Archival Records and Training in the Age of Big Data,” published in 2018, defined CAS as:

“A transdisciplinary field grounded in archival, information, and computational science that is concerned with the application of computational methods and resources, design patterns, sociotechnical constructs, and human-technology interaction, to large-scale (big data) records/archives processing, analysis, storage, long-term preservation, and access problems, with the aim of improving and optimizing efficiency, authenticity, truthfulness, provenance, productivity, computation, information structure and design, precision, and human technology interaction in support of acquisition, appraisal, arrangement and description, preservation, communication, transmission, analysis, and access decisions.”

This comprehensive definition highlights the multifaceted nature of CAS, encompassing the integration of computational thinking, data science, and traditional archival practices. By leveraging the power of technology, CAS seeks to enhance the efficiency, accuracy, and accessibility of archival processes, ultimately serving the needs of both archives and their diverse user communities.

Bridging the Gap: Integrating Computational Thinking into Archival Science

One of the most significant lessons learned in the development of CAS has been the importance of integrating computational thinking (CT) concepts into archival science. This approach mirrors the growing emphasis on incorporating CT into mathematics and science curricula in K-12 education.

Computational thinking involves a set of problem-solving skills and strategies that are essential for navigating the digital age. These skills include:

  1. Decomposition: Breaking down complex problems or datasets into smaller, more manageable parts.
  2. Pattern Recognition: Identifying and leveraging patterns and trends within data to inform decision-making and optimize processes.
  3. Abstraction: Focusing on the essential elements of a problem or system, while filtering out unnecessary details.
  4. Algorithm Design: Developing step-by-step procedures or instructions to solve specific problems efficiently.

By weaving these CT principles into archival education and practice, CAS empowers archivists to tackle the challenges of data-driven archives with a more systematic and technology-savvy approach. This integration not only enhances the technical skills of future archivists but also fosters a deeper understanding of how computational tools and techniques can be leveraged to improve archival workflows and enhance user experiences.
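The four skills listed above can be illustrated with a small sketch. It assumes a handful of catalog records whose date fields were entered in inconsistent formats; the sample records, the three date shapes, and the function names are all invented for illustration and are not drawn from any particular archive or curriculum.

```python
import re
from collections import Counter

# Hypothetical sample records; titles and dates are invented for illustration.
RECORDS = [
    {"title": "Letter to the Mayor", "date": "1923-04-05"},
    {"title": "Council minutes", "date": "05/04/1923"},
    {"title": "Photograph of harbour", "date": "circa 1920"},
]

# Pattern recognition: each regex names one date shape seen in the records.
DATE_PATTERNS = {
    "iso": re.compile(r"^(\d{4})-(\d{2})-(\d{2})$"),
    "day_month_year": re.compile(r"^(\d{2})/(\d{2})/(\d{4})$"),
    "circa": re.compile(r"^circa (\d{4})$"),
}


def classify_date(value):
    """Abstraction: reduce a raw date string to (pattern_name, year)."""
    for name, pattern in DATE_PATTERNS.items():
        match = pattern.match(value)
        if match:
            groups = match.groups()
            # The year is the first group except in day/month/year dates.
            year = groups[2] if name == "day_month_year" else groups[0]
            return name, int(year)
    return "unknown", None


def summarize(records):
    """Algorithm design: classify each record, then tally the patterns."""
    tally = Counter()
    years = []
    # Decomposition: the collection is handled one record at a time.
    for record in records:
        name, year = classify_date(record["date"])
        tally[name] += 1
        if year is not None:
            years.append(year)
    return dict(tally), sorted(years)


if __name__ == "__main__":
    print(summarize(RECORDS))
```

A real collection would need many more date shapes and a policy for ambiguous values; the point here is only how the four skills map onto a concrete cleanup task.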

Pedagogical Developments in Computational Archival Science

As CAS continues to evolve, educators have introduced new pedagogical approaches to prepare the next generation of archivists for digital work. Here are some key developments:

Interdisciplinary Curricula

Many iSchools and archival programs are incorporating CAS-related coursework into their undergraduate and graduate programs. These interdisciplinary curricula blend traditional archival studies with computer science, data analytics, and information management. Exposure to this cross-disciplinary approach gives students a holistic understanding of the challenges and opportunities that computational technologies present in the archival field.

Hands-on Experiential Learning

To bridge the gap between theory and practice, CAS programs are emphasizing hands-on, project-based learning opportunities. Students may engage in digitization projects, develop custom metadata schemas, or participate in the creation of digital archives. These experiential learning activities allow students to apply computational concepts directly to real-world archival challenges, preparing them for the demands of the modern workplace.
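As one possible classroom exercise of this kind, the sketch below builds a simple descriptive record using the fifteen elements of the Dublin Core Metadata Element Set. The namespace URI and element names are those published by DCMI; the `record` wrapper element, the function name, and the sample values are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"

# The fifteen elements defined in that element set.
DC_ELEMENTS = {
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
}

# Serialize with the conventional "dc" prefix instead of ns0.
ET.register_namespace("dc", DC_NS)


def build_dc_record(fields):
    """Build an XML element holding one Dublin Core description.

    `fields` maps an element name to a string or a list of strings.
    The <record> wrapper is a hypothetical container, not part of DCMI.
    """
    root = ET.Element("record")
    for name, values in fields.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name}")
        if isinstance(values, str):
            values = [values]
        # Dublin Core elements are repeatable, so emit one child per value.
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            child.text = value
    return root


if __name__ == "__main__":
    record = build_dc_record({
        "title": "Photograph of harbour",        # invented sample values
        "date": "1920",
        "subject": ["harbours", "photographs"],
    })
    print(ET.tostring(record, encoding="unicode"))
```

Rejecting unknown element names early is a deliberate choice: it turns a typo in a field name into an immediate error rather than a silently malformed record.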

Collaborative Partnerships

Recognizing the value of interdisciplinary collaboration, CAS educators are forging partnerships with local archives, libraries, and technology companies. These partnerships provide students with access to diverse datasets, mentorship from industry experts, and opportunities to work on innovative, community-driven projects. By fostering these connections, students gain a deeper understanding of the practical applications of CAS and develop the skills to navigate the evolving landscape of archival practices.

Ethical and Social Justice Considerations

As CAS continues to advance, it is crucial to maintain a strong focus on ethical and social justice principles. Educators are incorporating discussions on topics such as algorithmic bias, digital privacy, and equitable access to archival resources. By instilling these values in CAS curricula, students learn to develop computational solutions that prioritize the rights and needs of diverse user communities, ensuring that the archives of the future are inclusive and responsive to societal needs.

The Future of Metadata in Computational Archival Science

One of the most exciting frontiers in CAS is the field of metadata, which plays a pivotal role in organizing, preserving, and providing access to archival collections. As technology continues to evolve, the future of metadata in CAS holds immense promise, particularly in the areas of Generative AI and Large Language Models (LLMs).

Generative AI and CAS

Generative AI models such as GPT-3 (text) and DALL-E (images) have demonstrated the ability to create human-like text and images from input prompts, and related models can generate audio. In the context of CAS, these models could be leveraged to assist in the creation and enhancement of metadata, potentially streamlining the cataloging and description of archival materials.

Imagine a scenario where archivists provide a brief description or a few keywords about a document, and a generative AI model drafts detailed, contextual metadata to accompany the item. This automation could improve productivity, promote more consistent metadata application, and free up valuable time for archivists to focus on higher-level analysis and decision-making.
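The scenario above might be sketched as follows. This is a minimal illustration under stated assumptions, not a description of any existing system: the model is represented by an arbitrary `generate` callable supplied by the caller, the expected field names are hypothetical, and `stub_generate` returns a fixed, invented response so that the example runs without any external service. Every draft is marked for human review rather than accepted automatically.

```python
import json

# Hypothetical set of fields the archivist wants drafted.
EXPECTED_FIELDS = ("title", "description", "subject")


def build_prompt(keywords, note=""):
    """Turn an archivist's keywords and optional note into a text prompt."""
    lines = [
        "Draft descriptive metadata for an archival item.",
        "Return a JSON object with the keys: " + ", ".join(EXPECTED_FIELDS) + ".",
        "Keywords: " + ", ".join(keywords),
    ]
    if note:
        lines.append("Archivist note: " + note)
    return "\n".join(lines)


def draft_metadata(keywords, generate, note=""):
    """Ask `generate` (any callable: prompt -> str) for a draft and check it.

    The result is never accepted outright; valid drafts are flagged
    "needs_review" so that an archivist makes the final decision.
    """
    raw = generate(build_prompt(keywords, note))
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return {"status": "rejected", "reason": "output was not valid JSON"}
    if not isinstance(data, dict):
        return {"status": "rejected", "reason": "output was not a JSON object"}
    missing = [field for field in EXPECTED_FIELDS if field not in data]
    if missing:
        return {"status": "rejected", "reason": "missing fields: " + ", ".join(missing)}
    # Keep only the requested fields; anything else the model added is dropped.
    draft = {field: data[field] for field in EXPECTED_FIELDS}
    return {"status": "needs_review", "draft": draft}


def stub_generate(prompt):
    """Stand-in for a model call; returns a fixed, invented response."""
    return json.dumps({
        "title": "Harbour photograph",
        "description": "Black-and-white view of a harbour.",
        "subject": ["harbours", "photographs"],
        "extra": "ignored",
    })


if __name__ == "__main__":
    print(draft_metadata(["harbour", "photograph"], stub_generate))
```

Passing the model in as a callable keeps the validation logic independent of any particular vendor or API, which also makes it easy to test with a stub.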

Large Language Models and CAS

Large Language Models (LLMs), such as those developed by OpenAI, Google, and others, have the potential to transform the way we interact with and extract insights from archival records. These powerful language models can understand natural language, identify patterns, and generate human-like responses based on the information they have been trained on.

In the CAS context, LLMs could be used to enhance search and discovery, enable more intuitive and natural language-based querying of archival collections, and even assist in the analysis and interpretation of historical documents. Imagine being able to ask an LLM-powered system questions about the content, context, or significance of a particular archival item, and receiving detailed, contextual responses that draw upon the collective knowledge of the model.

Moreover, LLMs could aid in the identification of trends, themes, and relationships within large-scale archival datasets, empowering archivists and researchers to uncover new insights and connections that were previously hidden or difficult to discern.
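For comparison, relationships between themes can also be surfaced without a language model at all. The sketch below is a deliberately simple stand-in, not an LLM technique: it counts how often pairs of subject terms appear together on the same record. The sample subject lists and the function name are invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical subject terms for four records; values are invented.
SAMPLE = [
    ["harbours", "photographs", "shipping"],
    ["harbours", "shipping"],
    ["council", "minutes"],
    ["harbours", "photographs"],
]


def cooccurring_subjects(records, top_n=3):
    """Return the `top_n` most frequent pairs of subjects sharing a record."""
    pairs = Counter()
    for subjects in records:
        # Sort and de-duplicate so each pair is counted once per record
        # and always appears in the same (alphabetical) order.
        unique = sorted(set(subjects))
        pairs.update(combinations(unique, 2))
    return pairs.most_common(top_n)


if __name__ == "__main__":
    for pair, count in cooccurring_subjects(SAMPLE, top_n=2):
        print(pair, count)
```

A baseline like this gives archivists something concrete to compare against when evaluating whether a model-generated list of themes adds anything beyond simple counting.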

Embracing the Future of Archival Science

As we’ve explored, Computational Archival Science represents a transformative shift in the field of archival science, harnessing the power of computational technologies to enhance the efficiency, accessibility, and preservation of our collective historical record.

By integrating computational thinking into archival curricula and fostering collaborative partnerships, educational institutions are preparing the next generation of archivists to navigate the digital landscape with confidence and innovation. Moreover, the emergence of Generative AI and Large Language Models holds the promise of revolutionizing the way we approach metadata, search, and analysis within archival collections.

As we move forward, it is crucial that we remain mindful of the ethical and social justice implications of these technological advancements, ensuring that the archives of the future are inclusive, equitable, and responsive to the needs of diverse user communities.

At Stanley Park High School, we recognize the importance of staying at the forefront of these exciting developments in Computational Archival Science. By incorporating CAS-related coursework and providing hands-on learning opportunities, we are committed to empowering our students to become the next generation of archivists, ready to tackle the challenges and opportunities of the digital age.

We encourage you, whether you’re a student, parent, or member of the community, to explore the wealth of resources and information available through the Dublin Core Metadata Initiative (DCMI) and other leading organizations in the field of CAS. Together, we can unlock the full potential of our archives and ensure that our collective history remains accessible, preserved, and cherished for generations to come.