Troubleshooting


A few tips to begin with, covering some common situations: poorly formatted pages; incomplete mirrors; missing page elements due to cross-origin resource sharing restrictions; and tweaking runs of MakeStaticSite by editing the constants file.

Poor Formatting and Displaced Images

When viewed offline, pages may appear to be complete in content, but the layout and formatting have gone awry. There may be large gaps where banner images are meant to be, and other images may be displaced and enlarged, appearing outside their intended holders. The text may be shown in default system fonts such as Times Roman.

Whilst disconcerting, this is quite a common issue and is generally due to a page’s use of CSS and/or JavaScript. In either case, there may be a straightforward remedy.

If the page lacks any styling, then that strongly suggests that one or more CSS files have failed to load. This can happen when CSS file names contain version numbers as query strings. If so, you can try removing the query strings completely with the following settings, which you can put in your .cfg file:

prune_query_strings=y
prune_filename_extensions_querystrings=y
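
As a rough illustration of what pruning does to a saved file name, here is a minimal Python sketch. It is not MakeStaticSite's own code: the function name prune_query_string and the file names are hypothetical, chosen only to show the intended effect.

```python
def prune_query_string(saved_name: str) -> str:
    """Return the file name with any '?query' suffix removed."""
    name, _sep, _query = saved_name.partition("?")
    return name


# Hypothetical names, as a crawler might save them on disk.
for saved in ["style.css?ver=5.8.1", "theme.min.css", "app.js?v=3"]:
    print(f"{saved!r} -> {prune_query_string(saved)!r}")
```

Names without a query string pass through unchanged, so applying the idea to a whole mirror only affects the versioned files.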

JavaScript issues are more varied. They typically arise when the page assumes the site is deployed on a web server. If you are converting a dynamic site to a static one with a view to hosting as a production site, then the desired formatting should become apparent when deploying to the host.

Irrespective of whether the site is to be deployed, it is recommended that you set up a local web server, at least to preview the pages for convenience.

Local Web Server as a Solution

Setting up a web server is not as difficult as it might sound. There are tutorials available for different operating systems, and many programming languages provide packages that require few installation steps: for example, on Windows, you can install Python and then use the web server included with it.

Rest assured that a default configuration should suffice for the static output; there’s no need for additional modules to support server-side scripting. Once the server has been set up, simply copy the output directory to the web root of the server and launch a web browser to view the site.
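
For a quick preview without installing a full web server, Python's standard library can serve a directory of static files. The sketch below is one possible approach, not part of MakeStaticSite; the function name make_server and the default port 8000 are assumptions for illustration.

```python
import functools
import http.server


def make_server(directory: str, port: int = 8000) -> http.server.ThreadingHTTPServer:
    """Create (but do not start) a server for the static files in `directory`."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    # Bind to the loopback address only, so the preview is not exposed
    # to other machines on the network.
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
```

Calling make_server("path/to/output").serve_forever() and then browsing to http://127.0.0.1:8000/ previews the site; press Ctrl+C to stop. The same module can also be run directly from the command line as python -m http.server.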

Incomplete Mirror and/or Broken Links

If the output generated by MakeStaticSite is missing pages and/or assets, try the following:

(A) URL seed

The URL that you enter is the one that is first fetched during the crawl by Wget. It is called the seed URL.

  • Check the value of the seed URL in the config file (option url). If the URL has a path, then parent pages will not be retrieved, nor will pages that are children of those parents.
  • Some websites use mixed domains, serving off the primary domain and subdomains, such as example.org and www.example.org. However, whereas the home page may be the same for both, the inner pages may vary along with the content. Try separate runs with and without the subdomain.
  • The mixed domain issue is also reflected in Wayback Machine archives. In this case, again try separate runs, modifying the original URL embedded within the Wayback (Memento) URL, not the Wayback host. Also, make sure that in constants.sh you have set wayback_links_relative_rewrite=yes
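
To make the last tip concrete, the following Python sketch switches the www subdomain on or off in the original URL while leaving the Wayback host alone. It assumes the common URL shape of Wayback host, then /web/, then a timestamp, then the original URL; the function name toggle_www and the example address are hypothetical.

```python
import re
from urllib.parse import urlsplit

# Assumed shape: <Wayback host>/web/<timestamp>/<original URL>
_WAYBACK = re.compile(r"^(https?://[^/]+/web/[^/]+/)(https?://.+)$")


def toggle_www(wayback_url: str) -> str:
    """Add or remove 'www.' in the archived (original) URL only."""
    match = _WAYBACK.match(wayback_url)
    if match is None:
        raise ValueError(f"not a recognised Wayback URL: {wayback_url}")
    prefix, original = match.groups()
    parts = urlsplit(original)
    host = parts.netloc
    host = host[4:] if host.startswith("www.") else "www." + host
    return prefix + parts._replace(netloc=host).geturl()


# Hypothetical Memento URL, for illustration only.
print(toggle_www("https://web.archive.org/web/20240215145332/https://example.org/about/"))
```

Each of the two variants can then be used as the seed URL for a separate run.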

(B) Site checks

There are a number of general runtime checks you can make to ensure that the site has been captured as intended.

  • Check the MakeStaticSite log, particularly for network connectivity; if you lose Internet access during a run, then the download of files may be skipped.
  • Check the robots exclusion file (robots.txt) on the server from which you are retrieving files. Wget respects this by default and recommends not changing this behaviour, for good reason.

(C) Configuration Tuning

MakeStaticSite has various options that can affect the extent of the output.

  • Wget will by default save filenames to match URLs on the original site and that will be reflected in MakeStaticSite, both in the mirror and in the Zip file.
    If you are viewing the output on MS Windows and the URLs contain characters that are illegal in Windows filenames (any of \, /, :, *, ?, ", <, >, and |), then these files will either not be accessible or, if you extract them with a tool such as 7-Zip, the auto-correction (e.g. of ? to _) will lead to broken links.
    The solution here is to edit the configuration file and add the Wget option --restrict-file-names=windows to wget_extra_options.
  • Try increasing the value of wget_extra_urls_depth.
  • If you are trying to download assets that have an extension that’s not recognised, then add it to the list for the constants asset_extensions and/or asset_extensions_external.
  • If you are trying to download assets that don’t have an extension, then set asset_extensions / asset_extensions_external to the empty string.
  • If the site contains punctuation characters (apart from ‘-’, ‘_’ and ‘.’) in filenames, then ideally these characters should be removed or replaced and any links updated. If that’s not possible, one or two characters, particularly round brackets ), might be removed from url_grep_search_pattern provided they are not used as URL boundaries in the web pages.
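
Before moving a mirror to MS Windows, it can help to check whether any saved names contain the characters listed above. The following Python sketch is a standalone helper, not a MakeStaticSite feature; the names has_windows_illegal_chars and find_problem_files are hypothetical.

```python
import os

# Characters that Windows does not allow in file names.
WINDOWS_ILLEGAL = set('\\/:*?"<>|')


def has_windows_illegal_chars(filename: str) -> bool:
    """True if a single file or directory name contains an illegal character."""
    return any(ch in WINDOWS_ILLEGAL for ch in filename)


def find_problem_files(mirror_root: str) -> list[str]:
    """Walk a mirror and list paths whose final component Windows would reject."""
    problems = []
    for dirpath, dirnames, filenames in os.walk(mirror_root):
        for name in dirnames + filenames:
            if has_windows_illegal_chars(name):
                problems.append(os.path.join(dirpath, name))
    return sorted(problems)
```

An empty result from find_problem_files suggests that the mirror can be unpacked on Windows without renaming; otherwise, rerunning with --restrict-file-names=windows, as described above, is likely to be the simpler fix.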

If none of the above tips resolves the situation, then for any page with missing content, open the browser’s web console and look for the errors reported on the page. (One way to access these is to right-click on the relevant area of a page and select ‘Inspect’.)

Some omissions may be rooted in errors arising from web security policy as variously determined by the server / browser / markup. There is no general remedy; solutions are devised on a case-by-case basis, but as case history is gradually built up, MakeStaticSite can encode a successively more thorough solution. We illustrate a particular case below.

Cross-Origin Resource Sharing

A relatively common issue arises when scripts (usually in JavaScript) make requests for assets from another origin (usually a domain). This results in page elements not displaying or even blank screens.

Cross-Origin Resource Sharing (CORS) is a W3C standard that defines how to allow some cross-origin requests, relaxing the same-origin policy, while rejecting others. Restrictions on such assets have been gradually introduced and expanded. The scope presently includes:

  • Invocations of fetch() or XMLHttpRequest.
  • Web Fonts (for cross-domain font usage in @font-face within CSS), so that servers can deploy TrueType fonts that can only be loaded cross-origin and used by websites that are permitted to do so.
  • WebGL textures.
  • Images/video frames drawn to a canvas using drawImage().
  • CSS Shapes from images.

(For further details, see the Mozilla developer docs.)

Particularly with respect to offline use of static sites, the following error might be reported when trying to read JavaScript files that were originally from another origin:

Cross-Origin Request Blocked: The Same Origin Policy disallows reading the remote resource at … (Reason: CORS request not http).

Here, the ‘same origin policy’ means that, whereas these files work on the original server, once they have been downloaded and are accessed on the file system (i.e. via file:// rather than over HTTP), they are considered to come from an opaque origin, i.e. they are not trusted and hence are blocked.
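
The notion of an origin can be sketched in a few lines of Python. This is a deliberately simplified model, not the algorithm that browsers implement: it treats an origin as the combination of scheme, host and port for http and https URLs, and treats everything else, including file:// URLs, as opaque. The function names and URLs are hypothetical.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str):
    """Return (scheme, host, port), or None for an opaque origin."""
    parts = urlsplit(url)
    if parts.scheme not in DEFAULT_PORTS or parts.hostname is None:
        return None  # e.g. file:// URLs: treated as opaque in this model
    return (parts.scheme, parts.hostname, parts.port or DEFAULT_PORTS[parts.scheme])


def same_origin(url_a: str, url_b: str) -> bool:
    """Two URLs share an origin only if neither is opaque and the tuples match."""
    origin_a, origin_b = origin(url_a), origin(url_b)
    return origin_a is not None and origin_a == origin_b


# Served from a local web server: same origin.
print(same_origin("http://localhost:8000/index.html", "http://localhost:8000/js/app.js"))
# Opened from the file system: opaque, so never the same origin in this model.
print(same_origin("file:///home/user/site/index.html", "file:///home/user/site/js/app.js"))
```

In this model, two files in the same folder on disk still do not count as same origin, which matches the behaviour reported in the error message above.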

Example (for illustration):

<script> source URI is not allowed in this document: “file:///home/paul/Downloads/websites/www_example_com20240215_145332/www.example.com/imports/assets.example.com/universal/scripts-compressed/extract-css-runtime-39e87d4f1d6ff921db43-min.en-US.js”. index.html:36:160
Cross-Origin Request Blocked: The Same Origin Policy disallows reading the remote resource at file:///home/paul/Downloads/websites/www_example_com20240215_145332/www.example.com/imports/assets.example.com/universal/scripts-compressed/extract-css-moment-js-vendor-675f9459672cf966ca51-min.en-US.js. (Reason: CORS request not http).

This restriction has been in place since a security vulnerability was identified in 2019, reflecting a general trend towards increasing strictness.

The suggested remedy is to set up a local Web server, as above.

“Developers who need to perform local testing should now set up a local server. As all files are served from the same scheme and domain (localhost) they all have the same origin, and do not trigger cross-origin errors.”

Source: MDN Web Docs, Reason: CORS request not HTTP.

Workarounds

Fortunately, at least in some cases, there is a straightforward solution in two steps suitable for offline usage. The first step is to simply fetch assets directly outside of JavaScript (e.g., using Wget).

For the second step, inspect the HTML source. If you see lines such as:

<script src=" ... " crossorigin="anonymous">

then simply delete occurrences of crossorigin="anonymous". The rationale is that you are mirroring a website you have designed or, at least, trust. If it needs to retrieve and work with assets from particular locations, then that should still apply offline.

Accordingly, MakeStaticSite uses Wget in phase 3 to fetch these additional assets and, when the constant cors_enable=yes is set, removes all occurrences of the crossorigin attribute. There may be other attributes that are not required in the specific offline context.
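
For anyone wishing to apply the second step by hand to a few pages, here is a minimal Python sketch of removing the crossorigin attribute. It is not how MakeStaticSite itself does it: the function name strip_crossorigin is hypothetical, the choice of tags to inspect is an assumption, and a regular expression is only a rough tool for HTML, so results should be checked.

```python
import re

# Assumption: only inspect opening tags of elements that commonly carry
# the attribute, so that ordinary page text is left alone.
_TAG = re.compile(r"<(?:script|link|img|audio|video)\b[^>]*>", re.IGNORECASE)

# Matches crossorigin, crossorigin="...", crossorigin='...' or crossorigin=value.
_CROSSORIGIN = re.compile(
    r"""\s+crossorigin(?![\w-])(?:\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+))?""",
    re.IGNORECASE,
)


def strip_crossorigin(html: str) -> str:
    """Remove crossorigin attributes from the selected opening tags."""
    return _TAG.sub(lambda m: _CROSSORIGIN.sub("", m.group(0)), html)


sample = '<script src="js/app.js" crossorigin="anonymous"></script>'
print(strip_crossorigin(sample))
```

Restricting the substitution to opening tags means that any mention of the word crossorigin in ordinary page text is left untouched.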

Otherwise, a possible workaround might be to incorporate the content of such files into an existing file that is accepted as same origin.

MakeStaticSite Constants

If something doesn’t work out as expected, then quite often the situation can be addressed by reviewing and tweaking the default settings in lib/constants.sh, as listed in the configuration guide.

This page was published on 14 February 2024 and last updated on 9 April 2025.