Web Archiving

This guide covers web archiving, UTSA's web archiving efforts, and considerations for researchers.

Scope

UTSA Special Collections selectively captures, preserves, and makes accessible websites that document the culture and history of UTSA, San Antonio, and our surrounding South Texas communities.

Collections are curated into thematic subject areas and captured using web crawling tools provided by the Internet Archive’s Archive-It web archiving service. Special Collections currently has 19 active collections, which have captured and preserved over 87 million web documents.

Web content sought for preservation includes:

  • University of Texas at San Antonio websites, including official websites hosted by the university and social media sites created by staff, faculty, and student organizations.
  • Websites from organizations and communities in the following subject areas:
    • Bilingual Education
    • Border Studies
    • Food Culture
    • Gender Studies
    • Race/Ethnicity Studies
    • San Antonio Culture and History
    • Sexuality Studies
    • South Texas Culture and History

Partner description

UTSA Libraries Special Collections sustains the university's teaching, research, and outreach mission by preserving and providing access to valuable primary resources, and by creating digital collections for use by students and scholars at UTSA and from around the world. Through its web archiving program, Special Collections uses Archive-It to

  1. regularly capture and preserve web content created by UTSA’s administrative units, academic programs, and student life groups as part of our University Archives’ collecting mandate; and
  2. capture websites of organizations and web coverage of topical events that complement and/or supplement our physical collection development strategies.

Methods

The UTSA Special Collections Web Archiving Team:

  • Identifies a topic/subject/theme for a collection (if creating a new collection)
  • Selects relevant, specific web resources to crawl
  • Administers crawl mechanics and adds descriptions (metadata)
  • Determines the frequency of content changes/updates and sets crawl frequency accordingly
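
As a rough illustration of the last step, the sketch below estimates how often a page changes by comparing content digests across earlier captures, then maps that estimate to a crawl schedule. It is not the Web Archiving Team's actual procedure. The function names, the day thresholds, and the schedule labels are all hypothetical and are not taken from Archive-It.

```python
import hashlib
from datetime import date


def content_digest(body: bytes) -> str:
    """Return a SHA-256 digest so two captures can be compared cheaply."""
    return hashlib.sha256(body).hexdigest()


def average_days_between_changes(captures):
    """Estimate the mean number of days between observed content changes.

    `captures` is a list of (capture_date, digest) pairs sorted by date.
    Returns None when no change was observed.
    """
    change_dates = [
        captures[i][0]
        for i in range(1, len(captures))
        if captures[i][1] != captures[i - 1][1]
    ]
    if not change_dates:
        return None
    span_days = (change_dates[-1] - captures[0][0]).days
    return span_days / len(change_dates)


def suggest_frequency(avg_days):
    """Map an average change interval to a schedule label.

    Thresholds and labels are hypothetical, chosen only for illustration.
    """
    if avg_days is None:
        return "annual"
    if avg_days <= 7:
        return "weekly"
    if avg_days <= 31:
        return "monthly"
    if avg_days <= 92:
        return "quarterly"
    return "annual"


# Example: four weekly captures, with content changing twice.
history = [
    (date(2024, 1, 1), content_digest(b"version 1")),
    (date(2024, 1, 8), content_digest(b"version 2")),
    (date(2024, 1, 15), content_digest(b"version 2")),
    (date(2024, 1, 22), content_digest(b"version 3")),
]
suggestion = suggest_frequency(average_days_between_changes(history))
```

In practice, a digest comparison overstates change on pages with rotating banners or timestamps, so a curator's judgment about the site would still decide the final schedule.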

Web Archiving Team discussion and approval are required for creating new web collections/groups, selecting seeds, and making major changes to existing collections.

How to be an ethical researcher

Your responsibility as a researcher/end user:

Researchers are responsible for ensuring that they use archived web resources in an ethical manner. Ethical use includes citing archived web resources accurately and never representing an archived capture as the most current version of a website. Researchers must also ensure that they are observing all copyright, property, and libel laws that apply. If necessary, researchers must obtain formal consent from copyright holders to republish or reuse the content of archived web resources.
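
One way to keep a citation accurate is to state the capture date alongside the original address. The sketch below is a minimal example, not a prescribed citation style. It assumes the capture URL follows the Wayback-style pattern of a 14-digit timestamp (YYYYMMDDhhmmss, treated here as UTC) followed by the original URL. The collection number `1234`, the page title, and the function names are hypothetical.

```python
import re
from datetime import datetime, timezone

# Assumed capture-URL shape: .../<14-digit timestamp>/<original URL>
CAPTURE_RE = re.compile(r"/(\d{14})/(https?://.+)$")


def parse_capture(capture_url):
    """Split a Wayback-style capture URL into (capture datetime, original URL)."""
    match = CAPTURE_RE.search(capture_url)
    if match is None:
        raise ValueError("no 14-digit capture timestamp found in URL")
    captured = datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
    return captured.replace(tzinfo=timezone.utc), match.group(2)


def cite_capture(title, capture_url):
    """Build a citation string that makes the capture date explicit."""
    captured, original_url = parse_capture(capture_url)
    return (
        f'"{title}." {original_url}. '
        f"Archived capture of {captured:%Y-%m-%d}, {capture_url}."
    )


# Hypothetical capture URL used only for illustration.
example_url = (
    "https://wayback.archive-it.org/1234/20190314120000/https://www.utsa.edu/"
)
citation = cite_capture("UTSA Home", example_url)
```

Stating both the original address and the capture date makes clear that the cited page is a historical snapshot and not the live site.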

Our responsibility as the collector:

Special Collections strives to respect the rights of content owners and to follow professional best practices for intellectual property rights management in web resource preservation. The Web Archiving Team seeks to follow the Section 108 Study Group’s recommendations for changes to the Copyright Act for web resource preservation. This group of copyright experts asserts that archives and libraries have the right to capture “publicly available” content (i.e., materials that do not require passwords, entry forms, or subscriptions) and that all governmental web resources should be freely accessible to web crawlers.

The Web Archiving Team does not determine the copyright status of web resources. All intellectual property rights are retained by the legal copyright holders. If UTSA does not clearly hold the copyright to a web resource, we cannot grant or deny permission to use the material.

Info for archived website creators

Special Collections acknowledges that the organizations and individuals who create web resources have agency over their born-digital content. If you believe we may have harvested your web content in error, or that maintaining your content in our web archive does not adequately reflect your organization, please contact us. Also, if you are the copyright owner of material found in our web archive and believe UTSA has used the work beyond fair use and without permission, we want to hear from you. Please email us at specialcollections@utsa.edu.

While we are able to remove captured web resources from UTSA’s Archive-It partner page, we cannot remove these resources from the Wayback Machine. Removing a web resource from our partner page will diminish its discoverability, but we do not have the power to fully delete a captured website.

As a preemptive measure, the Web Archiving Team strives to capture content that was created for public consumption (e.g., a public official’s Twitter feed or the website of a business). Content within official UTSA web resources (any content on utsa.edu or content created by UTSA for University purposes) is predominantly considered to be public record. In the rare event that we select for capture web resources created by private individuals for personal purposes, an opt-out notification will be distributed to those groups or individuals before any crawl takes place.

Copyright

It is the sole responsibility of the user to determine the copyright status of archival collections before publishing materials.