How to Download Entire Website from the Wayback Machine

If you need to download a website from the Wayback Machine but don’t know how, you’re in the right place! After my friend asked me to download her old WordPress blog, I figured wget was probably the best way to do it. (Turns out it is.)

Below, find a complete tutorial on how to download an entire website from the Internet Archive’s Wayback Machine.

What Is wget?

Real quick, wget is a small program that downloads content from web servers. It runs as a command-line application, meaning you type commands instead of clicking buttons. Even without buttons, wget is simple, lightweight, and safe to install.

While this blog post won’t teach you how to install wget, there are plenty of tutorials available for Mac and Windows.

Use wget to Download Wayback Machine Website

When you have wget installed on your computer, launch Terminal if you’re using Mac or Command Prompt on Windows.

Next, type in the command shown below. I’m assuming that you already have the URL of your own Wayback Machine snapshot. Most likely, it will be in the format of:

https://web.archive.org/web/YYYYMMDDHHmmSS/https://yourdomain.com/
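If you’d rather not paste the pieces together by hand, here’s a minimal sketch that builds the snapshot URL from a timestamp and a site address. The helper name snapshot_url is my own invention for illustration, and the timestamp and domain are just the example values used in this post.

```shell
# Hypothetical helper: build a Wayback Machine snapshot URL.
#   $1 = 14-digit timestamp (YYYYMMDDHHmmSS)
#   $2 = original site URL
snapshot_url() {
  printf 'https://web.archive.org/web/%s/%s\n' "$1" "$2"
}

# Print the snapshot URL for the example site in this post.
snapshot_url 20180314211747 https://tonyflorida.com
```

The output is the same URL that appears in the wget command in this post, so you can swap in your own timestamp and domain.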

Be sure to replace the URL in the wget command below with your own.

wget --recursive --no-clobber --page-requisites --convert-links --domains web.archive.org --no-parent https://web.archive.org/web/20180314211747/https://tonyflorida.com

When everything looks good, hit enter. The wget program will begin to recursively retrieve the contents of your website from the Wayback Machine as it existed at that point in time.

The options that we passed to the wget program do the following:

  • --recursive: follow the links in each downloaded HTML page to the next page
  • --no-clobber: skip files that already exist locally instead of downloading them again
  • --page-requisites: download all the files that are necessary to properly display a page (images, stylesheets, scripts)
  • --convert-links: convert the links to make them suitable for offline viewing
  • --domains: only follow links to the listed domains (here, web.archive.org)
  • --no-parent: never climb above the starting directory, so only files and directories below it are downloaded
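If you want to go a little easier on the Internet Archive’s servers, the same options can be wrapped in a small function with a pause between requests and a retry limit. This is only a sketch: --wait and --tries are standard wget options, but the values of 1 second and 3 tries are my assumptions, and download_snapshot is a hypothetical helper name.

```shell
# Hypothetical helper: run the wget command from this post against one
# snapshot URL, with a pause between requests and a cap on retries.
#   $1 = Wayback Machine snapshot URL
download_snapshot() {
  wget --recursive --no-clobber --page-requisites --convert-links \
       --domains web.archive.org --no-parent \
       --wait=1 --tries=3 \
       "$1"
}

# Usage (uncomment and substitute your own snapshot URL):
# download_snapshot "https://web.archive.org/web/20180314211747/https://tonyflorida.com"
```

Keeping the options in one function means you only change the URL when you download a different snapshot.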

Examine Your Website Download

After wget finishes downloading your website archive to your computer, you can check it out. Open the folder containing your downloaded website. In most cases, it defaults to a folder called web.archive.org.

You will have to navigate into a few folders until you reach the folder with your domain name. In here you’ll find a file called index.html. Double-click this file to open the home page of your website in your web browser.
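If clicking through nested folders gets tedious, you can let the command line find the home page for you. Here’s a rough sketch, assuming the download landed in the default web.archive.org folder; list_index_pages is a hypothetical helper name.

```shell
# Hypothetical helper: print the path of every index.html under a folder.
#   $1 = folder to search (defaults to web.archive.org)
list_index_pages() {
  find "${1:-web.archive.org}" -type f -name 'index.html'
}

# Usage (run from the folder where you ran wget):
# list_index_pages web.archive.org
```

The shortest path in the output is usually the home page. Longer paths are the index pages of subfolders.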

[Image: website download from the Wayback Machine]

Because we used the --convert-links wget option, you can navigate around your website as if it were hosted on a server. The only difference is that the pages are loaded straight from your local computer.

Potential Drawbacks to Wayback Machine

The Wayback Machine isn’t perfect. Far from it actually. The internet is huge, and the Wayback Machine is only so smart.

In its simplest form, the Wayback Machine automatically follows links from one website to the next while saving copies of the web pages that it visits. Some websites or web pages may not exist in the Wayback Machine because it isn’t aware of their existence. In other words, you may only be able to download a portion of a website, if anything at all.
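Before you start a long download, it can help to check whether the Wayback Machine has a capture at all. The sketch below builds a request URL for the Internet Archive’s public availability endpoint. The helper name availability_url is hypothetical, and the curl line is commented out because it needs a network connection.

```shell
# Hypothetical helper: build a request URL for the Internet Archive's
# availability endpoint.
#   $1 = site to look up (for example tonyflorida.com)
#   $2 = timestamp to search near (YYYYMMDDHHmmSS)
availability_url() {
  printf 'https://archive.org/wayback/available?url=%s&timestamp=%s\n' "$1" "$2"
}

# Usage (requires curl and a network connection):
# curl -s "$(availability_url tonyflorida.com 20180314211747)"
```

The response is JSON. As far as I know, it lists the closest snapshot when a capture exists and comes back with no snapshots when the site was never archived.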

Additionally, some functionality of Wayback Machine websites may not work. This happens when certain JavaScript and CSS files aren’t captured by the Wayback Machine.
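One quick way to see whether stylesheets and scripts made it into your download is to count the files by extension. This is a rough check under my own assumptions: count_by_extension is a hypothetical helper name, and it only matches file names that end in the extension, so files saved with a query string on the end are not counted.

```shell
# Hypothetical helper: count the files under a folder with a given extension.
#   $1 = folder to search
#   $2 = extension without the dot (for example css or js)
count_by_extension() {
  find "$1" -type f -name "*.$2" | wc -l | tr -d ' '
}

# Usage (run from the folder where you ran wget):
# count_by_extension web.archive.org css
# count_by_extension web.archive.org js
```

A count of zero for css or js is a hint that the page will look or behave differently from the original.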


Consider yourself lucky if you find a copy of an old website on the Wayback Machine. My hope is that this blog post has taught you how to download sites from the Wayback Machine.

If you have any questions about Wayback Machine downloads, let me know in the comments below.


Meet Tony

With a strong software engineering background, Tony is determined to demystify the web. Discover why Tony quit his job to pursue this mission. You can join the Tony Teaches Tech community here.

4 thoughts on “How to Download Entire Website from the Wayback Machine”

    • But you need to pay to get more than 200 files…
      Is there another solution?

      I tried Teleport: nothing was downloaded.
      Tried WinHTTrack: downloaded just the first page.
      Wget gave the same result as WinHTTrack: only the first page.
