Personal Digital Archiving

File Format Considerations

When archiving your files, review them to ensure they are in preservation-ready file formats. Technology changes quickly, and over time some file formats are phased out.

Important considerations for a file format include whether it supports high-quality files, whether it is well documented and widely used, and whether it uses lossless or lossy compression. There are many file formats in use, so good resources to consult are the National Archives’ list of file formats and the Smithsonian's Recommended Preservation Formats. To learn more about digital content that is deemed most at risk of being lost, see the Digital Preservation Coalition’s Global ‘Bit List’ of Digital Material.

Some file formats produce large files or are complex to access and archive over time. It is also important to consider whether you are interested only in the content of the file, e.g. the body of an email, or whether you also want to save the look and feel of the content, e.g. by saving an email as a PDF that closely mirrors the original appearance of the email.
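
As a starting point for reviewing your files, the short Python sketch below counts the files in a folder by extension and lists the extensions that are not on a shortlist. The shortlist (PREFERRED_EXTENSIONS) and the function names are hypothetical and only for illustration; consult the resources named above for actual recommendations.

```python
from collections import Counter
from pathlib import Path

# Hypothetical shortlist for illustration only; it is not an official list.
PREFERRED_EXTENSIONS = {".pdf", ".txt", ".csv", ".tif", ".tiff", ".png", ".wav", ".flac"}


def inventory_formats(folder):
    """Count the files under `folder` (including subfolders) by lowercase extension."""
    counts = Counter()
    for path in Path(folder).rglob("*"):
        if path.is_file():
            counts[path.suffix.lower() or "(no extension)"] += 1
    return counts


def needs_review(counts, preferred=PREFERRED_EXTENSIONS):
    """Return, in sorted order, the extensions that are not on the shortlist."""
    return sorted(ext for ext in counts if ext not in preferred)
```

For example, needs_review(inventory_formats("my_archive")) would return the extensions found in a folder named my_archive (a placeholder name) that deserve a closer look. An extension is only a hint about the format, so treat the result as a prompt for review rather than a verdict.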

Important Digital Archiving Concepts

  • File Formats: Encoding standards that allow computers to store and access information as different types of media so it can be rendered by software. File formats are commonly referred to by the extension that appears at the end of a file name, such as .docx, .mp3, etc. File formats that are widely adopted by large populations of users are likely to continue to be supported by software, browsers, and other rendering platforms.
  • Compression: Some file formats use compression algorithms to reduce file sizes by eliminating data that the algorithm considers redundant. Lossless compression preserves all of the original data, while lossy compression permanently discards some of it, which reduces the quality of the content. This may be fine for everyday uses of your files, but you may want to be aware of the type of compression used by your file formats when saving important files over the long term or when you will be editing and re-saving files multiple times. Each time a file with lossy compression is edited and re-saved, it loses quality. Saving it a few times is usually not something the average user would notice, but over many saves the loss can become very noticeable.
  • Open Source vs. Proprietary: If a file format's specifications (the documentation that describes how the format works) are open to public view (an open format, often loosely called open source), the format is likely to remain accessible across different platforms. If a format is proprietary, its specifications are maintained by an organization that may want to keep the format tied to specific commercial software, which can make it more difficult to preserve. In general, open formats have fewer software dependencies and are therefore better for preservation.
  • Obsolescence: When a file format, disk media, or computer system is no longer widely used or supported, there's a growing risk that the digital content it contains may one day become completely inaccessible. Digital preservation experts are concerned with migrating content from at-risk formats and storage media to more stable standards that are likely to remain accessible over time.
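
The difference between lossless and lossy compression can be illustrated with a small Python sketch. It uses the standard zlib module for the lossless case and a deliberately crude stand-in (toy_lossy, a hypothetical helper that drops the low bits of every byte) for the lossy case. Real lossy formats work very differently; the sketch only shows that discarded detail cannot be recovered afterwards, and it does not model the gradual loss from repeated re-saving.

```python
import zlib


def lossless_round_trip(data: bytes) -> bytes:
    """Compress and then decompress with zlib, a lossless algorithm."""
    return zlib.decompress(zlib.compress(data))


def toy_lossy(data: bytes, keep_bits: int = 4) -> bytes:
    """Toy 'lossy' step: keep only the top `keep_bits` bits of each byte.

    This is an illustration, not how any real image or audio codec works.
    """
    mask = (0xFF << (8 - keep_bits)) & 0xFF
    return bytes(b & mask for b in data)


if __name__ == "__main__":
    original = bytes(range(256))
    # Lossless: the exact original bytes come back.
    print("lossless identical:", lossless_round_trip(original) == original)
    # Lossy: the low bits are gone for good.
    degraded = toy_lossy(original)
    print("lossy identical:", degraded == original)
    print("distinct values before:", len(set(original)), "after:", len(set(degraded)))
```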

Text, Photos, Audio, and Video

For these formats, it is important to consider what you want to do with the files long-term. High-quality, preservation-ready file formats are generally preferred, but they often yield larger file sizes and may not be necessary for your specific use. For instance, if you have iPhone photos from a vacation and you intend to use them casually, for everyday things like viewing and sharing them online, then a compact file format with lossy compression may suffice. If, on the other hand, the files are photos from a digital camera, you may want to retain the original raw files, which are larger but of higher quality.

Email

Archiving email can be complicated. You can export a single email, a group of emails, or your entire email account. Different email systems may export emails in different formats, some of which can be difficult to access outside of the email application. Be sure to download any needed attachments. Links inside an email are preserved, but if they point to files on a cloud or network drive, those files are not downloaded with the email.

You can choose to save a single email as a PDF, or learn more about archiving your Microsoft 365 email from the Help Desk.

Google has a similar process for Google data, including email, documents, calendars, photos, etc. Learn more about how to download your Google data.
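
Exported mail is often delivered as an mbox file (Google's export uses this format for Gmail). As a hedged sketch of how such a file can be opened outside of an email application, the Python below uses the standard mailbox module to list each message's subject and attachment names and to save the attachments. The function names and output naming scheme are illustrative assumptions, and subjects are returned as stored in the file, so encoded subjects are not decoded.

```python
import mailbox
from pathlib import Path


def summarize_mbox(mbox_path):
    """Return a list of (subject, attachment filenames) for an mbox file."""
    summary = []
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for message in box:
            names = [p.get_filename() for p in message.walk() if p.get_filename()]
            summary.append((str(message.get("Subject", "(no subject)")), names))
    finally:
        box.close()
    return summary


def save_attachments(mbox_path, out_dir):
    """Write every named attachment in the mbox file into `out_dir`."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for index, message in enumerate(box):
            for part in message.walk():
                name = part.get_filename()
                payload = part.get_payload(decode=True)
                if name and payload is not None:
                    # Prefix with the message position and drop any folder
                    # parts so names cannot collide or escape `out_dir`.
                    target = out / f"{index:05d}_{Path(name).name}"
                    target.write_bytes(payload)
                    saved.append(target)
    finally:
        box.close()
    return saved
```

Files that an email only links to on a cloud or network drive are not part of the mbox file, so they still need to be downloaded separately.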

Websites

When it comes to archiving a publicly accessible website, there are a few different approaches:

  1. Use the Internet Archive Wayback Machine Save Page Now feature.
    • The Wayback Machine is a collaborative digital archive of information on the internet. The service enables users to see archived versions of web pages across time. Type a URL into the Save Page Now feature to capture a webpage. Be sure to check the URL later to see if it captured the webpage in the desired fashion.
      • Please note that static websites usually archive better than dynamic websites. Static websites have stable content, where every user sees the exact same thing on each individual page, whereas dynamic websites pull content on the fly, so their content can change from user to user. If you have material that doesn’t archive well with the Wayback Machine, consider exploring Webrecorder’s tools.
  2. Create a screen recording.
    • If you aren’t sure you want your website to be accessible indefinitely, you could capture a personal screen recording of the site. 
  3. Export the files and/or intellectual content from your website.
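
For the first approach, checking later whether a page was captured can also be done programmatically. The Python sketch below builds a Save Page Now address and queries the Wayback Machine's availability API. The endpoint addresses and the shape of the JSON response reflect the Internet Archive's public documentation as understood here and should be treated as assumptions that may change; the function names are illustrative.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoints; verify against the Internet Archive's current documentation.
SAVE_ENDPOINT = "https://web.archive.org/save/"
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def save_page_url(page_url):
    """Build the Save Page Now address for a page (open it in a browser)."""
    return SAVE_ENDPOINT + page_url


def availability_query(page_url):
    """Build the availability API address that asks about `page_url`."""
    return AVAILABILITY_ENDPOINT + "?" + urlencode({"url": page_url})


def closest_snapshot(response_text):
    """Return the closest snapshot address from an API response, or None."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def check_capture(page_url, timeout=30):
    """Ask the availability API (over the network) whether a capture exists."""
    with urlopen(availability_query(page_url), timeout=timeout) as response:
        return closest_snapshot(response.read().decode("utf-8"))
```

Calling check_capture with your page's address returns the address of the closest archived copy, or None if no capture is reported. A reported capture does not guarantee that the page looks right, so it is still worth opening the archived copy and checking it by eye.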

Social Media

If you’re interested in archiving your social media accounts, each platform has instructions for how to export your content. Consider what you are interested in retaining: any photos, comments on posts, metrics on your posts, etc. Sometimes it may be easier to take a screenshot or screen recording if you’re just interested in capturing the look of a single post and its comments. Learn more about exporting a copy of your Facebook information and your Instagram information.