Limitations


Whilst MakeStaticSite is functional and configurable, it grew from the particular needs of an individual and has many limitations in the face of the scale and diversity of websites.

  • This is prototype software, provided as-is and tested on only a few sites, in the hope that it will prove useful and become community-supported.
  • This is a static crawler: it does not execute JavaScript, unlike a dynamic crawler, which runs a page’s scripts and then captures the rendered result. The workflow may still download the JavaScript files needed for such processing, but offline usage may be restricted; for example, the use of AJAX is not supported in this mode.
  • It uses GNU Wget for crawling, whereas most development effort is now on GNU Wget2.
  • Any Web crawler makes assumptions about the websites it crawls, and these affect the directory layout of its output. In particular, with Wget’s directory-based limits in force (the --no-parent option switched on), an anchor link to a blog directory at http://example.org/blog will be saved in the root as blog.html, whereas a link to http://example.org/blog/ will be saved as blog/index.html.
  • It has been designed for individual sites. Even though MakeStaticSite can capture assets from multiple domains and has batch support, it is not intended to index huge swathes of the Internet.
  • Static site generation is generally not a good fit for collections databases with a large inventory, though it can work quite well if the site architecture supports simple navigation to individual record pages.
  • The script can only provide a snapshot of comments, discussions, surveys and so on; the interactivity of such components along with the persistence of user-contributed data is generally lost. (In the long run a project might isolate these in a hybrid setup.)
  • Whilst Wget is a mature product that embodies a deep understanding of Internet protocols and the networking environment, it doesn’t have intimate CMS knowledge and so it might not retrieve everything. This may be the case for orphan pages, which a WordPress plugin, for example, might be able to access. For Wget they need to be added explicitly as extra input.
  • Any Web components with deeply nested external URLs, for example in JavaScript and CSS, may be omitted. This is often the case with fonts and icons.
  • Performance: MakeStaticSite is not compiled code; it requires a command-line interpreter and the scripts have not yet been optimised much for speed. It typically takes up to a few minutes to build a small- to medium-sized site, which, depending on the usage scenario, may or may not be a significant duration. A substantial part of this is due to the reliance on Wget to re-crawl many pages at each run, but there is further overhead with Wayback Machine sites, for which additional routines are needed to properly limit Wget requests.
    For phase 3 (fetching additional assets), there is the wget_threads option that can reduce the time to download from the Web.
  • Links generated dynamically by JavaScript are not included.
  • For WordPress sites, using WP-CLI remotely over SSH may not be fully supported by hosting providers that run jailed shells for shared hosting. In that case, WordPress updates need to be done manually.
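
The effect of a trailing slash on the output layout, described in the list above, can be illustrated with a small sketch. This is a simplified model of the mapping, not Wget’s or MakeStaticSite’s actual code: the function name local_path is hypothetical, and it assumes that every URL returns an HTML page and that files are saved relative to the site root without a host directory.

```python
from urllib.parse import urlsplit
import posixpath


def local_path(url: str) -> str:
    """Approximate the local file name for an HTML page at `url`.

    Simplified model (an assumption for illustration only): the host
    name is dropped and every URL is treated as an HTML page.
    """
    path = urlsplit(url).path or "/"
    if path.endswith("/"):
        # Directory-style URL: the page is stored as index.html in that directory.
        return (path + "index.html").lstrip("/")
    _, ext = posixpath.splitext(path)
    if ext.lower() not in (".html", ".htm"):
        # No HTML extension: one is appended to the final path segment.
        path += ".html"
    return path.lstrip("/")


for url in ("http://example.org/blog", "http://example.org/blog/"):
    print(url, "->", local_path(url))
```

Under these assumptions the first URL maps to blog.html and the second to blog/index.html, so internal links that differ only by a trailing slash lead to two different files in the output.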

To overcome at least some of these limitations, there are alternatives that may be explored.

This page was published on 31 October 2022 and last updated on 3 September 2025.