Runtime Options

In addition to the site-specific configuration generated and stored as a .cfg file, MakeStaticSite has further options, stored in lib/, that control various runtime settings. These are not just a list of constants: some shell scripting is involved, for example to derive some variables from others. For an overview, refer to the page on options management.

These may be modified according to your preferences, but it is strongly recommended that you make a backup first.

As at version 0.29.9, the options are:

Default: 5
Maximum number of redirects allowed for determining the effective URL being mirrored. In this case the URL originally entered will be replaced by the effective URL.
Default: /etc/hosts
Location of hosts file. When creating the source website locally, it can be useful for url_base and deploy_domain to have the same domain, particularly to test certain functionality such as reCAPTCHA. In this case, with the aid of the constants ip4re and ip6re, MakeStaticSite will inspect the hosts file for an entry that anticipates the DNS for the domain and temporarily comment it out when it comes to deployment, so that there’s no interruption to site editing.
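The hosts-file handling described above might look roughly like the following. This is only a sketch on a throwaway copy, not MakeStaticSite's own code: ip4_pattern is a simplified stand-in for ip4re (whose actual value is not shown on this page), and the domain and file contents are made up.

```shell
# Demo on a temporary copy; never experiment on /etc/hosts directly
hosts_copy=$(mktemp)
printf '%s\n' '127.0.0.1 localhost' '192.168.1.10 example.com' > "$hosts_copy"

domain='example.com'
domain_re=$(printf '%s' "$domain" | sed 's/\./\\./g')   # escape dots for the regex
ip4_pattern='[0-9]{1,3}(\.[0-9]{1,3}){3}'               # assumed, simplified IPv4 pattern

# Prefix '#' to any line that maps an IPv4 address to the domain
hosts_result=$(sed -E "/^${ip4_pattern}[[:space:]]+${domain_re}([[:space:]]|\$)/ s/^/#/" "$hosts_copy")
printf '%s\n' "$hosts_result"
rm -f "$hosts_copy"
```

Reversing the change after deployment would be the mirror image: strip the leading # from the same line.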
Default: 600
Default Unix file permissions for file creation.
Default: 700
Default Unix file permissions for directory creation.
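As an illustration of what the two permission defaults mean in practice, the sketch below applies them with chmod to a temporary directory and file. The variable names file_mode and dir_mode are hypothetical, and whether MakeStaticSite applies the modes this way is an assumption.

```shell
file_mode=600   # owner may read and write; no access for others
dir_mode=700    # owner may read, write and enter; no access for others

work_dir=$(mktemp -d)
mkdir "$work_dir/tmp" && chmod "$dir_mode" "$work_dir/tmp"
: > "$work_dir/tmp/cookies.txt" && chmod "$file_mode" "$work_dir/tmp/cookies.txt"

# Capture the listings, then clean up
dir_listing=$(ls -ld "$work_dir/tmp")
file_listing=$(ls -l "$work_dir/tmp/cookies.txt")
printf '%s\n%s\n' "$dir_listing" "$file_listing"
rm -rf "$work_dir"
```

The listings should begin with drwx------ for the directory and -rw------- for the file.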
Default: tmp
Directory where temporary files are to be stored. These are mainly to support Wget, including input files and cookies.
Default: " "
Tab spacing for file outputs, e.g. the site map (XML) file.
Default: auto
Host directory mode when creating a site mirror with Wget; empty or ‘no’ corresponds to -nH (--no-host-directories), effectively removing one directory level. Otherwise, the host directory is included in the output.
Default: .netrc
‘Run commands’ file for (temporary) storage of credentials — either .wgetrc or .netrc
Default: yes
Delete references to credentials in temp files and .rc file on completion of run (y/n)?
Default: pass
Path to binary for managing (and encrypting) credentials.
URL where credentials manager may be downloaded.
Default: MSS
MakeStaticSite-specific directory for storing credentials (usernames, passwords, tokens, etc.).
Default: plain
How to store credentials: config to store in the configuration file, as-is; plain to store separately, as-is, in plain text; encrypt to store separately and encrypt.
Default: gpg
Encryption file type extension.
Default: "$HOME/.password-store"
Password-designated directory under which credentials are stored.
Default: wget
Path to Wget binary. If wget is available in PATH, then simply enter wget. Otherwise, enter its full path.
Default: 6
The lowest Wget error code tolerated or else aborts (>8 for no tolerance).
Default: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15)
The browser user agent to be used by Wget (if not supplied, then access might be refused by the host’s web application firewall).
Default: yes
Allow protocol-relative URLs to be fetched by Wget by prefixing a protocol (y/n).
Default: https
Protocol to prefix protocol-relative URLs.
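A protocol-relative URL begins with // and inherits the scheme of the page that references it, so it needs a scheme before Wget can fetch it on its own. The following sketch shows the idea; the names protocol_prefix and add_protocol are hypothetical, not taken from MakeStaticSite.

```shell
protocol_prefix='https'   # the documented default prefix

add_protocol() {
  case "$1" in
    //*) printf '%s:%s\n' "$protocol_prefix" "$1" ;;   # protocol-relative: add scheme
    *)   printf '%s\n' "$1" ;;                         # anything else is unchanged
  esac
}

add_protocol '//cdn.example.com/lib.js'
add_protocol 'https://example.com/page.html'
```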
Default: user
Wget’s user login field for HTTP authentication.
Default: password
Wget’s password field for HTTP authentication.
Default: cookies.txt
The name of the cookies file used by Wget.
Default: 5
The minimum number of lines for a valid non-empty Wget cookies file.
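A line-count check of this kind could be written as below. This is a sketch under assumptions: cookies_min_lines and cookies_file_valid are hypothetical names, and the file contents are made up for the demonstration.

```shell
cookies_min_lines=5   # hypothetical name for the documented default

cookies_file_valid() {
  # True when the file exists and has at least the minimum number of lines
  [ -f "$1" ] && [ "$(( $(wc -l < "$1") ))" -ge "$cookies_min_lines" ]
}

cookie_dir=$(mktemp -d)
printf '%s\n' '# header 1' '# header 2' '# header 3' '' \
  'example.com FALSE / TRUE 0 session abc123' > "$cookie_dir/cookies.txt"
printf '%s\n' '# header 1' > "$cookie_dir/short.txt"

for f in cookies.txt short.txt; do
  if cookies_file_valid "$cookie_dir/$f"; then
    echo "$f: usable"
  else
    echo "$f: treated as empty"
  fi
done
rm -rf "$cookie_dir"
```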
Default: no
When wget_user_agent is defined above as a non-empty string, should it be reset to null for handling cookies (yes/no)?
Default: wget_post.txt
The name of the file containing POST data.
Default: wget_inputs_main.txt
The name of the input file for wget, used in the first run. This file comprises URLs that might not be reachable by standard crawls.
Default: wget_inputs_extra.txt
The name of the input file for wget, used in subsequent runs. This file is auto-generated during a deep search of URLs.
Default: (--recursive --timestamping --level=inf --no-remove-listing)
The standard default settings for wget to generate a mirror. They can be tweaked. For example, to create only a partial mirror, set --level to be a number.
Default: ("${wget_mirror_options[@]}" --convert-links --adjust-extension --page-requisites)
These are the basic options for wget to crawl a URL and download a static version. This should only be changed if there’s undesirable behaviour. Additional options should be specified per site in the .cfg file.
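The two defaults above are Bash arrays, the second built from the first. The sketch below shows how such arrays compose into a command line; site_options and url are hypothetical stand-ins for per-site values, and the command is printed rather than run.

```shell
wget_mirror_options=(--recursive --timestamping --level=inf --no-remove-listing)
wget_core_options=("${wget_mirror_options[@]}" --convert-links --adjust-extension --page-requisites)

# Hypothetical per-site additions, standing in for options from a .cfg file
site_options=(--wait=1)
url='https://example.com/'

# Assemble the command as an array so each option stays a separate word
wget_command=(wget "${wget_core_options[@]}" "${site_options[@]}" "$url")
printf '%s ' "${wget_command[@]}"; echo
```

Keeping options in arrays rather than a single string avoids quoting problems when an option value contains spaces.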
Default: auto
Should capturing URLs with directories include the --no-parent option? Set to auto or yes to check and add automatically; manual to check and ask during runtime; otherwise no intervention.
Default: (-r -l inf -nc --adjust-extension)
Used in phase 3 (augment assets). Similar to wget_core_options, these are the basic options for wget to crawl a URL and download a static version. They are slightly different, with -nc (no clobber) instead of --page-requisites, reflecting the context of targeting supporting assets (such as images) to augment an existing site. As the retrieval method is blunt, not specifying this could be very time-consuming.
Default: (--show-progress --progress=bar:force:noscroll)
Wget progress bar, currently used when output_level=quiet and applied when running Wget in both phases 2 and 3 (leave empty to omit). It gives minimal updates per download during site capture, whilst more details may be recorded in the log file.
Default: 1
The number of parallel threads for running Wget (integer). This is a recently-introduced feature and should be regarded as experimental.
Default: 5
The number of times to call wget_extra_urls() to scan for and fetch extra URLs (integer).
Default: feed/index.html
Newsfeeds are generally XML standards, whereas Wget typically saves these with a .html extension and updates anchors accordingly. The URLs of such feeds need replacing and this setting, currently targeted at WordPress, which stores feeds in a number of feed/ folders, specifies the tail of the invalid URLs.
Default: feed/index.xml
This setting specifies the tail of valid replacement feed URLs (ending .xml) for feed_html URLs. To properly support this in deployment, on the web server, add index.xml as the last entry to the DirectoryIndex directive in .htaccess at the site’s root.
Default: 3
For determining the capture level (0 fewest, 5 most) for URL matching of assets to download and localise.
Default: no
Use a wildcard for matching URLs in asset processing (y/n)? If set to ‘yes’, when capturing asset URLs on pages, a simple regex capture group will be used instead of the input file of itemised URLs generated in phases 2 and 3.
Default: "[,:]"
Additional class of separator characters (regular expression capture class) of URLs to be captured: for example, data-src (comma) and JSON (colon). Leave empty to omit.
Default: htm,html,xml,txt
List of web document file extensions, intended for assets search.
Comma-separated list of directories to exclude (relative to working mirror directory).
Default: js,css,svg,map,ico
Comma-separated list of file extensions for standard Web page components.
Default: cff,ttf,eot,woff,woff2
Comma-separated list of file extensions for Web fonts.
Default: jpeg,jpg,gif,png
Comma-separated list of file extensions for Web images.
Default: heic,webp,mp3,m4a,ogg,wav,avi,mpg,mp4,mov,ogv,wmv,3gp,3gp2
Comma-separated list of file extensions for audio and video assets.
Default: pdf,doc,docx,odt,ppt,xls,xlsx
Comma-separated list of file extensions for office documents.
Default: $web_element_extensions,$image_extensions,$audiovideo_extensions,$doc_extensions,$font_extensions
List of file extensions for assets that may be retrieved by Wget in phase 3 (derived from allowable upload file types). If no extensions are defined, then cURL will be used to remove non-HTML assets, but all other assets will be accepted.
Default: $web_element_extensions,$image_extensions,$font_extensions
List of file extensions for assets from external (3rd-party) domains, a more limited set than for asset_extensions.
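These two defaults are derived from the component lists rather than typed out, which is an instance of the shell scripting mentioned in the introduction. A minimal sketch of the pattern follows; asset_extensions_external and extension_allowed are hypothetical names used only for illustration.

```shell
web_element_extensions='js,css,svg,map,ico'
font_extensions='cff,ttf,eot,woff,woff2'
image_extensions='jpeg,jpg,gif,png'

# Derived list, following the documented default for external assets
asset_extensions_external="$web_element_extensions,$image_extensions,$font_extensions"

# Succeed if extension $1 appears in comma-separated list $2
extension_allowed() {
  case ",$2," in
    *",$1,"*) return 0 ;;
    *)        return 1 ;;
  esac
}

for ext in woff2 pdf; do
  if extension_allowed "$ext" "$asset_extensions_external"; then
    echo "$ext: accepted from external domains"
  else
    echo "$ext: not accepted from external domains"
  fi
done
```

Changing one component list, such as image_extensions, then updates every list derived from it.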
Default: yes
Convert absolute links to relative links for primary domain assets (y/n)?
Default: no
Remove query strings appended to paths and URLs in anchors limited to files of type given in query_prune_list (y/n)?
Default: js,css,svg,png,$font_extensions
List of file extensions in requests that may have query string appended for versioning or other non-essential purposes that can be pruned without loss of functionality.
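Query-string pruning restricted to listed file types could be sketched as follows. The function prune_query is hypothetical and handles only simple URLs; it is meant to show the idea, not the tool's actual implementation.

```shell
font_extensions='cff,ttf,eot,woff,woff2'
query_prune_list="js,css,svg,png,$font_extensions"

prune_query() {
  url=$1
  path=${url%%\?*}    # everything before the first '?'
  ext=${path##*.}     # text after the last '.'
  case ",$query_prune_list," in
    *",$ext,"*) printf '%s\n' "$path" ;;   # listed type: drop the query string
    *)          printf '%s\n' "$url" ;;    # other types: keep as-is
  esac
}

prune_query 'https://example.com/style.css?ver=6.4'
prune_query 'https://example.com/search.php?q=static'
```

The stylesheet loses its versioning suffix, whereas the search URL keeps its query string because the query there is essential to the request.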
Default: yes
Allow Wget to fetch additional URLs with query strings in phase 3 (y/n)?
Default: 100000
Only fetch URLs with query strings when the total number of assets is less than this number.
Default: contain
How assets from extra domains should be incorporated: empty or ‘off’ to keep in separate directories under mirror ID; ‘contain’ will move the directories inside the assets directory (see separate constant).
Default: webassets
Directory immediately under primary domain directory where extra assets are stored per extra domain (set empty to place assets in root).
Default: imports
Directory immediately under assets_directory for storing assets imported for extra domains.
Default: contain
For URLs with directories, specify what to do with assets that lie outside the mirrored directory: empty or off to keep assets where they are after the Wget mirror; contain to move the directories inside the assets directory.
Specify what to do with links to resources on same domain, but outside the mirrored tree: empty or off to not make relative, only point to the deployment domain; local to make relative, to the assets directory.
Default: yes
Option to cut directories, effectively shortening the URL. Enter yes or on for a MakeStaticSite-specific cut that moves content from the directory path specified in the URL up to the root directory. When this is enabled, there is no need (and it’s not recommended) to specify Wget option --cut-dirs. Leave empty or enter no or off to disable (when Wget option --cut-dirs may be used instead).
Default: yes
Enable cross-origin resources once downloaded (y/n)?
Default: yes
Include <link rel="canonical"...> tag in header (yes or no)? This helps search engines to index the site.
Default: /
The tail of canonical URLs and internal links, e.g. index.html or a trailing slash, /, which is assumed if left blank.
The tail for internal links, e.g. index.html or / (leave blank for /). The value should normally match link_href_tail.
Default: yes
Generate and overwrite robots.txt (yes or no)? Whilst a CMS may generate a virtual robots file, it might be unduly restrictive or not be a good fit for the static output. Selecting ‘yes’ signals the generation of a new robots.txt file.
Default: robots.txt
File name for default robots.txt (inside lib/files/). A sitemap will subsequently be appended.
Default: yes
Generate and overwrite the site map file (yes or no)? Whilst a CMS may generate a virtual site map, it might not be a good fit for the static output. Selecting ‘yes’ signals the generation of a new site map, which currently is constructed from a listing of all pages on the site.
Default: sitemap.xml
Name of sitemap (XML) file.
Site map XML schema URL.
Default: htm,html
A comma-separated list of file extensions allowed for inclusion in the sitemap file.
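To make the sitemap settings concrete, here is a rough sketch that builds a site map from a listing of pages with the allowed extensions. It rests on several assumptions: base_url and the directory layout are invented, the indentation is two spaces, and the namespace is the standard one from the Sitemap protocol rather than a value read from MakeStaticSite.

```shell
site_dir=$(mktemp -d)                 # stand-in for a mirror directory
mkdir -p "$site_dir/about"
: > "$site_dir/index.html"
: > "$site_dir/about/index.html"
: > "$site_dir/logo.png"              # not a listed extension, so skipped

base_url='https://example.com'        # hypothetical deployment URL
tab='  '                              # indentation for the XML output

sitemap=$(
  echo '<?xml version="1.0" encoding="UTF-8"?>'
  echo '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
  ( cd "$site_dir" && find . -type f \( -name '*.htm' -o -name '*.html' \) | sort ) |
    while IFS= read -r page; do
      printf '%s<url><loc>%s/%s</loc></url>\n' "$tab" "$base_url" "${page#./}"
    done
  echo '</urlset>'
)
printf '%s\n' "$sitemap"
rm -rf "$site_dir"
```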
Wayback Machine module filename.
Default: no
Use a third-party client to download sites from the Wayback Machine (y/n)? If not set to ‘yes’, then any Wayback sites will be retrieved natively using default (Wget).
Comma-separated list of domains where a Wayback Machine is hosted.
Default: no
Perform dynamic check for Memento site using HTTP request header (y/n)?
Default: Memento-Datetime:
The search string that will be used in the HTTP header request to identify support for Memento URLs.
Earliest date timestamp (YYYYMMDDhhmmss) for Wayback Machine snapshot files.
Latest date timestamp (YYYYMMDDhhmmss) for Wayback Machine snapshot files.
Default: prefix
Wayback Machine CDX server match type: domain will return all results from host domain and all its subdomains; host will return results from host domain, but no other domains; exact will return results matching URL exactly; and ‘prefix’ will return results for all results under a URL path. Currently, the only options supported are prefix (the default) or exact.
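For orientation, the match type and date range correspond to the matchType, from and to query parameters of the Internet Archive's CDX server. The sketch below only assembles a query URL and makes no network request; the target path, the dates and the variable names are all hypothetical, and how MakeStaticSite itself builds its queries is not shown here.

```shell
cdx_endpoint='https://web.archive.org/cdx/search/cdx'   # Internet Archive CDX server
target='example.com/blog/'       # hypothetical URL path to query
match_type='prefix'              # 'prefix' or 'exact'
date_from='20200101000000'       # hypothetical bounds, YYYYMMDDhhmmss
date_to='20201231235959'

cdx_query="${cdx_endpoint}?url=${target}&matchType=${match_type}&from=${date_from}&to=${date_to}"
printf '%s\n' "$cdx_query"
```

A real query would also need the url value to be percent-encoded where it contains reserved characters.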
URL of Hartator’s Wayback Machine Downloader GitHub repository.
Default: wayback_machine_downloader
[Path to] binary for the Wayback Machine downloader.
Restrict downloading to URLs that match this filter (enclose in slashes // to treat as a regex and place in quotes). For example, to include only HTML files with .html extension use: "/.*\.html/"
Skip downloading of URLs that match this filter (enclose in slashes // to treat as a regex and place in quotes). For example, to exclude ASP files use: "/.*\.asp.*/"
Accepted status codes. The default is 200 — OK. Enter all for 30x (redirections), 40x (not found, forbidden) and 50x (server error).
Default: *login*,*logout*
For connections that require a login, wget is run with a --reject parameter to avoid logouts.
Filename of the WordPress module, as stored in the lib/ directory.
The URL of where to install WP-CLI.
Default: yes
The permalinks structure has a key bearing on the output. This setting will force it to make use of the post name rather than post ID or dates.
The URL of a (temporary) version of the WP Static Search plugin tweaked to work offline.
Default: wp-static-search
Directory name of search plugin. Within the standard WordPress layout, a directory of this name will be created under the wp-plugins/ directory.
Default: yes
Remove query strings from WordPress core URLs.
Default: yes
Remove WordPress shortlinks.
Default: yes
Disable embeds in WordPress.
Default: yes
Disable support for XML-RPC in WordPress.
Default: yes
Remove Windows Live Writer <link> tag from header.
Default: yes
Remove support for REST API in WordPress.
Default: yes
Remove Really Simple Discovery (RSD) tag in WordPress.
Default: tidy
The command to invoke HTML Tidy, which is usually tidy.
Default: -m -q -indent --indent-spaces 2 --show-filename yes --tidy-mark no
Command line options for HTML Tidy. Errors will be collated in a single file in the MakeStaticSite root folder
Default: errors_htmltidy.txt
The error reporting generated by HTML Tidy will be saved in this file.
Default: "htm,html"
List of web document file extensions intended for HTML Tidy.
Default: red
(Similarly ink_warning (amber), ink_ok (green), ink_info (lime).) Ink colours supported on all displays, using standard labels: black, red, green, yellow, blue, magenta, cyan, and white. A few additional colours that need 256-colour support, with custom labels: amber, lime, paleblue.
Default: no
Remove query strings from filenames (yes/no).
Default: Thumbs.db,.DS_Store
List of unwanted system files, to be removed from mirror output.
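Removing such files from a mirror can be sketched with find, as below. The variable name unwanted_files and the temporary directory are assumptions made for the demonstration.

```shell
unwanted_files='Thumbs.db,.DS_Store'   # hypothetical name for the documented default
mirror_dir=$(mktemp -d)                # stand-in for the mirror output
mkdir -p "$mirror_dir/images"
: > "$mirror_dir/index.html"
: > "$mirror_dir/.DS_Store"
: > "$mirror_dir/images/Thumbs.db"

# Split the comma-separated list and delete each named file wherever it occurs
old_ifs=$IFS; IFS=','
for name in $unwanted_files; do
  find "$mirror_dir" -type f -name "$name" -exec rm -f {} +
done
IFS=$old_ifs

remaining=$(cd "$mirror_dir" && find . -type f | sort)
printf '%s\n' "$remaining"
rm -rf "$mirror_dir"
```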
Default: local
Timestamps are used for marking the creation of .cfg files and for mirror directories. There are three options: local (local time), utc (UTC time, with no local adjustment), and utclocal (local time specified in relation to UTC).
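The three timestamp options can be illustrated with the date command. The format strings and the variable name timezone are assumptions chosen for this sketch; the formats MakeStaticSite actually uses may differ.

```shell
timezone='utc'   # hypothetical variable: local, utc or utclocal

case "$timezone" in
  utc)      stamp=$(date -u +%Y%m%d_%H%M%S) ;;    # UTC, with no local adjustment
  utclocal) stamp=$(date +%Y%m%d_%H%M%S%z) ;;     # local time plus its offset from UTC
  *)        stamp=$(date +%Y%m%d_%H%M%S) ;;       # local time
esac
printf 'mirror_%s\n' "$stamp"
```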
Default: quiet
This determines the level of reporting to the terminal at runtime. There are four options with increasing levels of output: silent, quiet, normal and verbose. The setting for output_level tends to be quieter than that for logs (see following entry).
Default: normal
This determines the level of logging to file at runtime. There are four options with increasing levels of output: silent, quiet, normal and verbose. The setting for log_level tends to be more verbose than that for terminal output (see previous entry).
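One way to picture how output_level and log_level interact is to rank the four levels and compare each message against both thresholds. The functions level_rank and report below are hypothetical, and the temporary file stands in for the real log.

```shell
level_rank() {
  case "$1" in
    silent)  echo 0 ;;
    quiet)   echo 1 ;;
    normal)  echo 2 ;;
    verbose) echo 3 ;;
    *)       echo 2 ;;   # assumed fallback for unrecognised values
  esac
}

output_level='quiet'
log_level='normal'
log_file=$(mktemp)       # stand-in for the log file

# report LEVEL MESSAGE: print and/or log, depending on the two thresholds
report() {
  msg_rank=$(level_rank "$1")
  if [ "$msg_rank" -le "$(level_rank "$output_level")" ]; then
    printf '%s\n' "$2"
  fi
  if [ "$msg_rank" -le "$(level_rank "$log_level")" ]; then
    printf '%s\n' "$2" >> "$log_file"
  fi
  return 0
}

terminal_output=$(
  report quiet   'Mirroring site'
  report normal  'Fetched 120 files'
  report verbose 'Full Wget option list'
)
log_contents=$(cat "$log_file")
rm -f "$log_file"
printf 'terminal:\n%s\nlog:\n%s\n' "$terminal_output" "$log_contents"
```

With the defaults shown, the terminal receives only the first message, the log receives the first two, and the verbose message goes nowhere.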
Default: makestaticsite.log
The file name for logs. A single file stores all logged activity; separate processes (manual or automated) can carry out log rotation, as required.
Default: no
Trap errors with immediate script termination (yes/no). This is used to support debugging during development. It stops the script if any command [in a pipeline] fails, if a variable is unset, or an exit code indicates failure, i.e. is nonzero. It then reports the system error.
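The behaviour described matches Bash's strict-mode settings (set -e, set -u and set -o pipefail) combined with an ERR trap. The sketch below demonstrates them in a separate Bash process so that the demonstration itself is not terminated; it is a generic illustration and is not taken from MakeStaticSite's source.

```shell
# Run the strict script in a child Bash process and observe how it ends
strict_script='
set -euo pipefail                     # exit on error, unset variable, or failed pipeline
trap "echo trapped-error >&2" ERR     # report before terminating
echo "step 1"
false | true                          # fails only because pipefail is set
echo "step 2 (never reached)"
'
strict_output=$(bash -c "$strict_script" 2>/dev/null) && strict_status=0 || strict_status=$?
printf 'output: %s, exit status: %s\n' "$strict_output" "$strict_status"
```

Only the first step is printed, and the child process ends with a nonzero exit status.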
Default: no
In a few instances, MakeStaticSite may prompt the user with a warning message and then ask whether or not to continue; for example, after encountering an error code on running wget or when it is about to write data to a non-empty directory. If run_unattended is set to yes, it will generally be assumed that the choice is always to continue, without manual intervention.
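A confirmation prompt that respects run_unattended could be structured as follows. The function confirm_continue and its wording are hypothetical; the sketch only shows the control flow.

```shell
run_unattended='yes'

confirm_continue() {
  # $1 is the warning to show; returns 0 to continue, 1 to stop
  if [ "$run_unattended" = 'yes' ]; then
    return 0                               # unattended: assume "continue"
  fi
  printf '%s Continue (y/n)? ' "$1" >&2
  read -r reply
  [ "$reply" = 'y' ]
}

if confirm_continue 'Target directory is not empty.'; then
  echo 'continuing'
else
  echo 'aborted'
fi
```

With run_unattended set to no, the same call waits for a reply on standard input, which is why unattended mode matters for scheduled runs.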
Default: extras
This is the name of the directory containing any files — in nested folders relative to the site’s web root — that should be added after the mirror has been generated.
Default: yes
Convert anchors to deployment domain to https (yes/no). The name of this constant deliberately echoes the use in WordPress.
Default: yes
Automatically replace occurrences of the source domain with the deployment domain (yes/no). If set to ‘no’, then a prompt will be issued at runtime reporting on the number of matches found.
Default: //
Domain prefix for matches (in sed).
Default: //
Domain prefix for substitutions (in sed).
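The two prefixes presumably anchor the sed substitution so that only URL-style occurrences of the domain are rewritten. A minimal sketch follows, in which the domains and the names domain_match_prefix and domain_subs_prefix are hypothetical.

```shell
source_domain='dev.example.com'     # hypothetical source domain
deploy_domain='www.example.org'     # hypothetical deployment domain
domain_match_prefix='//'            # hypothetical names for the two prefix options
domain_subs_prefix='//'

source_re=$(printf '%s' "$source_domain" | sed 's/\./\\./g')   # escape dots

html='<a href="https://dev.example.com/about/">Mail admin@dev.example.com</a>'
converted=$(printf '%s\n' "$html" |
  sed "s|${domain_match_prefix}${source_re}|${domain_subs_prefix}${deploy_domain}|g")
printf '%s\n' "$converted"
```

The link target changes to the deployment domain, while the email address, which has no // before the domain, is left alone.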
Default: (-a -z -h)
Core rsync options (excludes the output level). -a archive mode preserves permissions, ownership, and modification times, etc.; -z compression during transfer; -h outputs numbers in human-readable format

It’s recommended that other options are left as they are.

This page was published on 1 November 2022 and last updated on 3 June 2024.