How to Download Entire Website from the Wayback Machine

If you need to download a website from the Wayback Machine but don’t know how, you’re in the right place! When my friend asked me to download her old WordPress blog, I figured wget was probably the best way to do it. (Turns out it is.)

Below, find a complete tutorial on how to download an entire website from the Internet Archive’s Wayback Machine.

What Is wget?

Real quick, wget is a small computer program that allows you to download content from web servers. It runs as a command-line application, meaning you type commands instead of clicking buttons. Even so, wget is simple, lightweight, and safe to install.

While this blog post won’t teach you how to install wget, there are plenty of tutorials available for Mac and Windows.
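Before moving on, it can help to confirm that wget is actually reachable from your shell. Here is a minimal sketch for a POSIX shell such as Terminal on Mac (it will not run as-is in Windows Command Prompt). The helper name have_command is my own invention, not part of wget.

```shell
# Hypothetical helper: succeeds if the named program is on your PATH.
have_command() {
    command -v "$1" >/dev/null 2>&1
}

if have_command wget; then
    echo "wget is installed"
else
    echo "wget was not found; install it before continuing"
fi
```

If you see the "not found" message, the wget commands later in this post will fail until wget is installed and on your PATH.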

Use wget to Download Wayback Machine Website

When you have wget installed on your computer, launch Terminal if you’re using Mac or Command Prompt on Windows.

Next, you’ll type in a wget command. At this point, I’m assuming that you already have the URL of your own Wayback Machine snapshot. Most likely, the URL will be in the format of:

https://web.archive.org/web/YYYYMMDDHHmmSS/https://yourdomain.com/
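If you only have the timestamp and the original address, the format above can be assembled with a few lines of shell. This is just a sketch: wayback_url is a hypothetical helper of mine, and it assumes the full 14-digit timestamp shown above.

```shell
# Hypothetical helper: build a snapshot URL from a 14-digit timestamp
# (YYYYMMDDHHmmSS) and the original site address.
wayback_url() {
    timestamp=$1
    site=$2
    # Reject an empty timestamp or one containing non-digit characters.
    case $timestamp in
        *[!0-9]*|"") echo "timestamp must be 14 digits" >&2; return 1 ;;
    esac
    # Reject timestamps that are not exactly 14 digits long.
    if [ "${#timestamp}" -ne 14 ]; then
        echo "timestamp must be 14 digits" >&2
        return 1
    fi
    printf 'https://web.archive.org/web/%s/%s\n' "$timestamp" "$site"
}

wayback_url 20180314211747 https://tonyflorida.com
```

The last line prints the same snapshot URL used in the example in this post.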

Be sure to replace the URL in the wget command below with your own.

wget --recursive --no-clobber --page-requisites --convert-links --domains web.archive.org --no-parent https://web.archive.org/web/20180314211747/https://tonyflorida.com

When everything looks good, hit Enter. The wget program will begin to recursively retrieve the contents of your website from the Wayback Machine as it existed at that point in time.

The options that we passed to the wget program do the following:

  • --recursive: follow links from one page to the next, downloading each page along the way
  • --no-clobber: skip any file that already exists locally instead of downloading it again
  • --page-requisites: download all the files that are necessary to properly display a page
  • --convert-links: convert the links to make them suitable for offline viewing
  • --domains: only follow links to these domains
  • --no-parent: never climb above the starting URL; only download files and directories below it
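If you plan to run this more than once, the options above can be wrapped in a small shell function. This is a sketch under my own assumptions, not an official recipe: wayback_mirror and the DRY_RUN variable are names I made up, and the flags are exactly the ones from this post. With DRY_RUN set to 1, it only prints the command so you can check it before anything is downloaded.

```shell
# Hypothetical wrapper around the wget command from this post.
# Set DRY_RUN=1 to print the command instead of running it.
wayback_mirror() {
    snapshot_url=$1
    # Rebuild the argument list as the full wget command line.
    set -- wget \
        --recursive \
        --no-clobber \
        --page-requisites \
        --convert-links \
        --domains web.archive.org \
        --no-parent \
        "$snapshot_url"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Print the command without downloading anything.
DRY_RUN=1
wayback_mirror "https://web.archive.org/web/20180314211747/https://tonyflorida.com"
```

Set DRY_RUN=0 (or leave it unset) and call the function again to start the real download.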

Examine Your Website Download

After wget finishes downloading your website archive to your computer, you can check it out. Open the folder containing your downloaded website. In most cases, it defaults to a folder called web.archive.org.

You will have to navigate into a few folders until you open the folder with your domain name. In here you’ll find a file called index.html. Double-click this file to open the home page of your website in your web browser.
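If clicking through the nested folders gets tedious, the standard find command can locate the file for you. A minimal sketch, assuming the download landed in the default web.archive.org folder; find_index_pages is a hypothetical helper name.

```shell
# Hypothetical helper: list every index.html under a download folder.
# Defaults to the web.archive.org folder if no folder is given.
find_index_pages() {
    find "${1:-web.archive.org}" -type f -name index.html
}

# Only search if the default download folder exists in this directory.
if [ -d web.archive.org ]; then
    find_index_pages web.archive.org
fi
```

Each line of output is a path to an index.html file that you can open in your browser.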

Because we used the “convert links” wget option, you can navigate around your website as if it were hosted on a server. The only difference is that the pages are loaded from files on your local computer.

Potential Drawbacks to Wayback Machine

The Wayback Machine isn’t perfect. Far from it, actually. The internet is huge, and the Wayback Machine is only so smart.

In its simplest form, the Wayback Machine automatically follows links from one website to the next while saving copies of the web pages that it visits. Some websites or web pages may not exist in the Wayback Machine because it isn’t aware of their existence. In other words, you may only be able to download a portion of a website, if anything at all.
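One way to check whether a site was captured at all is the Internet Archive’s availability API, which answers queries of the form https://archive.org/wayback/available?url=... with a small JSON document. The sketch below only builds the query URL; wayback_availability_url is a hypothetical helper name of mine, and how you fetch and read the response is up to you.

```shell
# Hypothetical helper: print the availability API query URL for a site.
wayback_availability_url() {
    printf 'https://archive.org/wayback/available?url=%s\n' "$1"
}

wayback_availability_url tonyflorida.com
```

Open the printed URL in a browser or pass it to a download tool. As far as I know, a response with no snapshots listed means the Wayback Machine has nothing for that address.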

Additionally, some functionality of Wayback Machine websites may not work. This happens when certain JavaScript and CSS files aren’t captured by the Wayback Machine.


Consider yourself lucky if you find a copy of an old website on the Wayback Machine. My hope is that this blog post has taught you how to download sites from the Wayback Machine.

If you have any questions about Wayback Machine downloads, let me know in the comments below.


Meet Tony

With a background in computer science and engineering, Tony is determined to demystify the web.

7 thoughts on “How to Download Entire Website from the Wayback Machine”

  1. But you need to pay to get more than 200 files…
    Is there another solution?

    I tried Teleport and got nothing.
    I tried WinHTTrack, which downloaded just the first page.
    Wget did the same as WinHTTrack: only the first page.

  2. Howdy! This is kind of off topic but I need some advice from an established blog. Is it very difficult to set up your own blog? I’m not very technical but I can figure things out pretty fast. I’m thinking about making my own but I’m not sure where to start. Do you have any tips or suggestions? With thanks

  3. I think I did what you said but the CMD line tells me that ‘wget’ is not recognized as an internal or external command, operable program or batch file.
    Any thoughts on how to fix this, please?

