Runtime Options


In addition to the site-specific configuration generated by setup.sh and stored as a .cfg file, MakeStaticSite has further options, stored in lib/constants.sh, that control various runtime settings. The file is not just a list of constants: it also contains some shell scripting, for example to derive some variables from others.

These may be modified according to your preferences, but it is strongly recommended that you make a backup first.

As at version 0.29.6, the options are:

max_redirects
Default: 5
Maximum number of redirects allowed for determining the effective URL being mirrored. In this case the URL originally entered will be replaced by the effective URL.
etc_hosts
Default: /etc/hosts
Location of hosts file. When creating the source website locally, it can be useful for url_base and deploy_domain to have the same domain, particularly to test certain functionality such as reCAPTCHA. In this case, with the aid of the constants ip4re and ip6re, MakeStaticSite will inspect the hosts file for an entry that anticipates the DNS for the domain and temporarily comment it out when it comes to deployment, so that there’s no interruption to site editing.
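The hosts-file handling described for etc_hosts can be pictured with a small sketch. It is an illustration only, not MakeStaticSite's own code: the function name comment_out_host, the simplified IPv4 pattern (standing in for ip4re) and the sample entries are all assumptions, and it reads a temporary copy rather than the real /etc/hosts.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print a hosts file with IPv4 entries for a domain commented out.
comment_out_host() {
  local file="$1" domain="$2" line
  local tab_char=$'\t'
  # Simplified IPv4 pattern (an assumption standing in for ip4re)
  local ip4re='^[0-9]{1,3}(\.[0-9]{1,3}){3}[[:space:]]'
  while IFS= read -r line; do
    # Turn tabs into spaces and pad with spaces so the domain matches as a whole word
    if [[ $line =~ $ip4re ]] && [[ " ${line//$tab_char/ } " == *" $domain "* ]]; then
      printf '# %s\n' "$line"
    else
      printf '%s\n' "$line"
    fi
  done < "$file"
}

# Work on a temporary copy with sample entries, never on the live hosts file
hosts_copy="$(mktemp)"
printf '%s\n' '127.0.0.1 localhost' '192.168.1.10 example.com' > "$hosts_copy"
comment_out_host "$hosts_copy" "example.com"
```

Only the entry for example.com gains a leading "# "; removing that prefix later would restore the original line.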
mss_file_permissions
Default: 600
Default Unix file permissions for file creation.
mss_dir_permissions
Default: 700
Default Unix file permissions for directory creation.
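As a rough illustration of what the two permission defaults mean in practice, the following sketch creates a directory and a file with those modes. The directory and file names are placeholders, and this is not necessarily how MakeStaticSite applies the settings internally.

```bash
#!/usr/bin/env bash
mss_file_permissions=600   # owner may read and write; no access for group or others
mss_dir_permissions=700    # owner may list, enter and modify; no access for group or others

# Placeholder working directory inside a fresh temporary location
work_dir="$(mktemp -d)/tmp"
mkdir -m "$mss_dir_permissions" "$work_dir"

# Placeholder file, created and then restricted to the file mode
touch "$work_dir/cookies.txt"
chmod "$mss_file_permissions" "$work_dir/cookies.txt"

ls -ld "$work_dir" "$work_dir/cookies.txt"
```

Restrictive modes like these are a sensible default when the directory may hold cookies or credentials.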
tmp_dir
Default: tmp
Directory where temporary files are to be stored. These are mainly to support Wget, including input files and cookies.
tab
Default: " "
Tab spacing for file outputs, e.g. the site map (XML) file.
credentials_rc_file
Default: .netrc
‘Run commands’ file for (temporary) storage of credentials — either .wgetrc or .netrc
credentials_cleanup
Default: yes
Delete references to credentials in temp files and .rc file on completion of run (y/n)?
credentials_manage_cmd
Default: pass
Path to binary for managing (and encrypting) credentials.
credentials_manage_cmd_url
Default: https://www.passwordstore.org/#download
URL where credentials manager may be downloaded.
credentials_storage_namespace
Default: MSS
MakeStaticSite-specific directory for storing credentials (usernames, passwords, tokens, etc.).
credentials_storage_mode
Default: plain
How to store credentials: config to store in the configuration file, as-is; plain to store separately, as-is, in plain text; encrypt to store separately and encrypt.
credentials_extension
Default: gpg
Encryption file type extension.
credentials_home
Default: "$HOME/.password-store"
Password-designated directory under which credentials are stored.
wget_cmd
Default: wget
Path to Wget binary. If wget is available in PATH, then simply enter wget. Otherwise, enter its full path.
wget_error_level
Default: 6
The lowest Wget exit code that is tolerated; any non-zero code below this value aborts the run (set a value greater than 8 for no tolerance).
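Wget exit codes run from 0 (success) to 8 (server error response), with lower non-zero codes generally indicating more fundamental failures. Under the reading that codes at or above wget_error_level are tolerated, the check might look like the sketch below. The function name is hypothetical and the logic is an interpretation of the description, not a copy of MakeStaticSite's code.

```bash
#!/usr/bin/env bash
wget_error_level=6

# Hypothetical helper: succeed if a Wget exit code is acceptable.
# Assumption: 0 is success; non-zero codes at or above wget_error_level are tolerated.
wget_exit_tolerated() {
  local code="$1"
  (( code == 0 || code >= wget_error_level ))
}

for code in 0 4 6 8; do
  if wget_exit_tolerated "$code"; then
    echo "exit code $code: continue"
  else
    echo "exit code $code: abort"
  fi
done
```

With the default of 6, authentication failures (6), protocol errors (7) and server error responses (8) would be tolerated, whereas a network failure (4) would not.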
wget_user_agent
Default: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15)
The browser user agent to be used by Wget (if not supplied, then access might be refused by the host’s web application firewall).
wget_protocol_relative_urls
Default: yes
Allow protocol-relative URLs to be fetched by Wget by prefixing a protocol (y/n).
wget_protocol_prefix
Default: https
Protocol to prefix protocol-relative URLs.
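To show the effect of wget_protocol_relative_urls and wget_protocol_prefix together, here is a minimal sketch. The function name prefix_protocol and the sample URLs are assumptions made for the example.

```bash
#!/usr/bin/env bash
wget_protocol_relative_urls="yes"
wget_protocol_prefix="https"

# Hypothetical helper: make a protocol-relative URL (starting with //) fetchable
prefix_protocol() {
  local url="$1"
  if [[ $wget_protocol_relative_urls == "yes" && $url == //* ]]; then
    printf '%s:%s\n' "$wget_protocol_prefix" "$url"
  else
    # URLs that already carry a protocol, and relative paths, pass through unchanged
    printf '%s\n' "$url"
  fi
}

prefix_protocol "//cdn.example.com/lib/app.js"
prefix_protocol "http://example.com/page.html"
```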
wget_http_login_field
Default: user
Wget’s user login field for HTTP authentication.
wget_http_password_field
Default: password
Wget’s password field for HTTP authentication.
wget_cookies
Default: cookies.txt
The name of the cookies file used by Wget.
wget_cookies_min_filelength
Default: 5
The minimum number of lines for a valid non-empty Wget cookies file.
wget_cookies_nullify_user_agent
Default: no
When wget_user_agent is defined above as a non-empty string, should it be reset to null for handling cookies (yes/no)?
wget_post
Default: wget_post.txt
The name of the file containing POST data.
wget_inputs_main
Default: wget_inputs_main.txt
The name of the input file for wget, used in the first run. This file comprises URLs that might not be reachable by standard crawls.
wget_inputs_extra
Default: wget_inputs_extra.txt
The name of the input file for wget, used in subsequent runs. This file is auto-generated during a deep search of URLs.
wget_mirror_options
Default: (--recursive --timestamping --level=inf --no-remove-listing)
The standard default settings for wget to generate a mirror. They can be tweaked. For example, to create only a partial mirror, set --level to be a number.
wget_core_options
Default: ("${wget_mirror_options[@]}" --convert-links --adjust-extension --page-requisites)
These are the basic options for wget to crawl a URL and download a static version. This should only be changed if there’s undesirable behaviour. Additional options should be specified per site in the .cfg file.
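Both settings are Bash arrays, with wget_core_options built on top of wget_mirror_options. The sketch below reproduces the two defaults and then derives a partial mirror by swapping the --level value. The partial_mirror_options name, the depth of 2 and the example URL are illustrative assumptions, and the command is only printed, not run.

```bash
#!/usr/bin/env bash
wget_mirror_options=(--recursive --timestamping --level=inf --no-remove-listing)
wget_core_options=("${wget_mirror_options[@]}" --convert-links --adjust-extension --page-requisites)

# Illustrative tweak: limit recursion to two levels for a partial mirror
partial_mirror_options=("${wget_core_options[@]/--level=inf/--level=2}")

# Print the resulting command line rather than running it
printf '%q ' wget "${partial_mirror_options[@]}" "https://example.com/"
echo
```

Keeping options in arrays means each one is passed to wget as a separate argument, without relying on word splitting.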
wget_no_parent
Default: auto
Should capturing URLs with directories include the --no-parent option? Set to auto or yes to check and add automatically; manual to check and ask during runtime; otherwise no intervention.
wget_extra_core_options
Default: (-r -l inf -nc --adjust-extension)
Used in phase 3 (augment assets). Similar to wget_core_options, these are the basic options for wget to crawl a URL and download a static version. They are slightly different, with -nc (no clobber) instead of --page-requisites, reflecting the context of targeting supporting assets (such as images) to augment an existing site. As the retrieval method is blunt, not specifying this could be very time-consuming.
wget_progress_indicator
Default: (--show-progress --progress=bar:force:noscroll)
Wget progress bar, currently used when output_level=quiet (leave empty to omit), when running Wget in both phases 2 and 3. It gives minimal updates per download during site capture, whilst more details may be recorded in the log file.
wget_threads
Default: 1
The number of parallel threads for running Wget (integer). This is a recently-introduced feature and should be regarded as experimental.
wget_extra_urls_depth
Default: 5
The number of times to call wget_extra_urls() to scan for and fetch extra URLs (integer).
feed_html
Default: feed/index.html
Newsfeeds are generally served as XML, whereas Wget typically saves them with a .html extension and updates anchors accordingly, so the URLs of such feeds need replacing. This setting specifies the tail of the invalid URLs. It is currently targeted at WordPress, which stores feeds in a number of feed/ folders.
feed_xml
Default: feed/index.xml
This setting specifies the tail of valid replacement feed URLs (ending .xml) for feed_html URLs. To properly support this in deployment, on the web server, add index.xml as the last entry to the DirectoryIndex directive in .htaccess at the site’s root.
url_asset_capture_level
Default: 3
Determines the capture level (0 fewest, 5 most) for URL matching of assets to download and localise.
url_wildcard_capture
Default: no
Use a wildcard for matching URLs in asset processing (y/n)? If set to ‘yes’, when capturing asset URLs on pages, a simple regex capture group will be used instead of the input file of itemised URLs generated in phases 2 and 3.
url_separator_chars
Default: "[,:]"
Additional class of separator characters (regular expression capture class) of URLs to be captured: for example, data-src (comma) and JSON (colon). Leave empty to omit.
web_source_extensions
Default: htm,html,xml,txt
List of web document file extensions, intended for assets search.
web_source_exclude_dirs
Default:
Comma-separated list of directories to exclude (relative to working mirror directory).
web_element_extensions
Default: js,css,svg,map,ico
Comma-separated list of file extensions for standard Web page components.
font_extensions
Default: cff,ttf,eot,woff,woff2
Comma-separated list of file extensions for Web fonts.
image_extensions
Default: jpeg,jpg,gif,png
Comma-separated list of file extensions for Web images.
audiovideo_extensions
Default: heic,webp,mp3,m4a,ogg,wav,avi,mpg,mp4,mov,ogv,wmv,3gp,3gp2
Comma-separated list of file extensions for audio and video assets.
doc_extensions
Default: pdf,doc,docx,odt,ppt,xls,xlsx
Comma-separated list of file extensions for office documents.
asset_extensions
Default: $web_element_extensions,$image_extensions,$audiovideo_extensions,$doc_extensions,$font_extensions
List of file extensions for assets that may be retrieved by Wget in phase 3 (derived from WordPress.com allowable upload file types). If no extensions are defined, then cURL will be used to remove non-HTML assets, but all other assets will be accepted.
asset_extensions_external
Default: $web_element_extensions,$image_extensions,$font_extensions
List of file extensions for assets from external (third-party) domains, a more limited set than for asset_extensions.
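The composite lists are an example of variables being derived from others. The following sketch rebuilds asset_extensions_external from the component defaults, splits it into an array, and tests a filename against it. The array name external_exts and the helper is_external_asset are assumptions made for the illustration.

```bash
#!/usr/bin/env bash
web_element_extensions="js,css,svg,map,ico"
font_extensions="cff,ttf,eot,woff,woff2"
image_extensions="jpeg,jpg,gif,png"

# Derived list, composed from the three lists above
asset_extensions_external="$web_element_extensions,$image_extensions,$font_extensions"

# Split the comma-separated string into an array
IFS=',' read -r -a external_exts <<< "$asset_extensions_external"
echo "${#external_exts[@]} external asset extensions"

# Hypothetical helper: does a filename carry one of the listed extensions?
is_external_asset() {
  local ext="${1##*.}"
  # Wrap both sides in commas so that only whole extensions match
  [[ ",$asset_extensions_external," == *",$ext,"* ]]
}

if is_external_asset "logo.png"; then echo "logo.png: accepted"; fi
if ! is_external_asset "report.pdf"; then echo "report.pdf: not accepted"; fi
```

Because the composite value is built by interpolation, changing a component list (for example, adding avif to image_extensions) carries through automatically.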
prune_query_strings
Default: yes
Remove query strings appended to paths and URLs in anchors, limited to files of the types given in query_prune_list (y/n)?
query_prune_list
Default: js,css,svg,png,$font_extensions
List of file extensions in requests that may have a query string appended for versioning or other non-essential purposes and that can be pruned without loss of functionality.
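A minimal sketch of the pruning rule follows, assuming a hypothetical function prune_query_string that works on a single URL or path. MakeStaticSite itself processes files in bulk, so this only illustrates the decision being made.

```bash
#!/usr/bin/env bash
font_extensions="cff,ttf,eot,woff,woff2"
prune_query_strings="yes"
query_prune_list="js,css,svg,png,$font_extensions"

# Hypothetical helper: drop the query string only for extensions in query_prune_list
prune_query_string() {
  local url="$1" path ext
  path="${url%%\?*}"   # everything before the first '?'
  ext="${path##*.}"    # text after the last '.'
  if [[ $prune_query_strings == "yes" && ",$query_prune_list," == *",$ext,"* ]]; then
    printf '%s\n' "$path"
  else
    printf '%s\n' "$url"
  fi
}

prune_query_string "wp-includes/css/style.css?ver=6.4"
prune_query_string "search.php?q=static"
```

The stylesheet loses its versioning suffix, whereas the query string on search.php is kept because it may carry meaning.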
extra_assets_allow_query_strings
Default: yes
Allow Wget to fetch additional URLs with query strings in phase 3 (y/n)?
extra_assets_query_strings_limit
Default: 100000
Only fetch URLs with query strings when the total number of assets is less than this number.
extra_assets_mode
Default: contain
How assets from extra domains should be incorporated: empty or ‘off’ to keep in separate directories under mirror ID; ‘contain’ will move the directories inside the assets directory (see separate constant).
assets_directory
Default: webassets
Directory immediately under primary domain directory where extra assets are stored per extra domain (set empty to place assets in root).
imports_directory
Default: imports
Directory immediately under assets_directory for storing assets imported for extra domains.
parent_dirs_mode
Default: contain
For URLs with directories, specify what to do with assets that lie outside the mirrored directory: empty or off to keep assets where they are after the Wget mirror; contain to move the directories inside the assets directory.
external_dir_links
Default:
Specify what to do with links to resources on same domain, but outside the mirrored tree: empty or off to not make relative, only point to the deployment domain; local to make relative, to the assets directory.
mss_cut_dirs
Default: yes
Option to cut directories, effectively shortening the URL. Enter yes or on for a MakeStaticSite-specific cut that moves content from the directory path specified in the URL up to the root directory. When this is enabled, there is no need (and it’s not recommended) to specify Wget option --cut-dirs. Leave empty or enter no or off to disable (when Wget option --cut-dirs may be used instead).
cors_enable
Default: yes
Enable cross-origin resources once downloaded (y/n)?
link_rel_canonical
Default: yes
Include <link rel="canonical"...> tag in header (yes or no)? This helps search engines to index the site.
link_href_tail
Default: /
The tail of canonical URLs and internal links, e.g. index.html or a trailing slash, /, which is assumed if left blank.
a_href_tail
Default:
The tail for internal links, e.g. index.html or / (leave blank for /). The value should normally match link_href_tail.
robots_create
Default: yes
Generate and overwrite robots.txt (yes or no)? Whilst a CMS may generate a virtual robots file, it might be unduly restrictive or not be a good fit for the static output. Selecting ‘yes’ signals the generation of a new robots.txt file.
robots_default_file
Default: robots.txt
File name for default robots.txt (inside lib/files/). A sitemap will subsequently be appended.
sitemap_create
Default: yes
Generate and overwrite the site map file (yes or no)? Whilst a CMS may generate a virtual site map, it might not be a good fit for the static output. Selecting ‘yes’ signals the generation of a new site map, which currently is constructed from a listing of all pages on the site.
sitemap_file
Default: sitemap.xml
Name of sitemap (XML) file.
sitemap_schema
Default: http://www.sitemaps.org/schemas/sitemap/0.9
Site map XML schema URL.
sitemap_file_extensions
Default: htm,html
A comma-separated list of file extensions allowed for inclusion in the sitemap file.
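The sitemap settings fit together roughly as in the sketch below, which builds a small site map from a list of page paths. The function generate_sitemap, the two-space indentation and the example domain are assumptions. The real script derives its listing from the mirror directory, and this sketch does not escape special characters in URLs.

```bash
#!/usr/bin/env bash
sitemap_schema="http://www.sitemaps.org/schemas/sitemap/0.9"
sitemap_file_extensions="htm,html"
tab="  "   # indentation for the XML output (illustrative value)

# Hypothetical helper: emit a site map for the given base URL and page paths
generate_sitemap() {
  local base_url="$1" page ext
  shift
  printf '<?xml version="1.0" encoding="UTF-8"?>\n'
  printf '<urlset xmlns="%s">\n' "$sitemap_schema"
  for page in "$@"; do
    ext="${page##*.}"
    # Include only pages whose extension appears in sitemap_file_extensions
    if [[ ",$sitemap_file_extensions," == *",$ext,"* ]]; then
      printf '%s<url>\n' "$tab"
      printf '%s%s<loc>%s/%s</loc>\n' "$tab" "$tab" "$base_url" "$page"
      printf '%s</url>\n' "$tab"
    fi
  done
  printf '</urlset>\n'
}

generate_sitemap "https://www.example.com" index.html about/index.html style.css
```

Here style.css is skipped because css is not among the permitted extensions.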
mod_wayback
Default: mod_wayback.sh
Wayback Machine module filename.
wayback_enable
Default: yes
Enable Wayback Machine module (y/n)? If not enabled, then any Wayback sites will be retrieved using the default mirroring method (Wget).
wayback_hosts
Default: web.archive.org,www.webarchive.org.uk
Comma-separated list of domains where a Wayback Machine is hosted.
wayback_date_from
Default:
Earliest date timestamp (YYYYMMDDhhmmss) for Wayback Machine snapshot files.
wayback_date_to
Default:
Latest date timestamp (YYYYMMDDhhmmss) for Wayback Machine snapshot files.
wayback_matchtype
Default: prefix
Wayback Machine CDX server match type: domain will return all results from the host domain and all its subdomains; host will return results from the host domain, but no other domains; exact will return results matching the URL exactly; and prefix will return all results under a URL path. Currently, the only options supported are prefix (the default) or exact.
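For orientation, the Internet Archive's CDX server accepts url, matchType, from and to as query parameters, so the three settings above could map onto a request roughly as sketched here. The function build_cdx_query and the sample date are assumptions, the URL is not percent-encoded, and MakeStaticSite may assemble its requests differently.

```bash
#!/usr/bin/env bash
wayback_matchtype="prefix"
wayback_date_from="20200101000000"   # sample value; the default is empty
wayback_date_to=""

# Hypothetical helper: build a CDX query URL for a given site URL
build_cdx_query() {
  local url="$1"
  local query="https://web.archive.org/cdx/search/cdx?url=${url}&matchType=${wayback_matchtype}"
  # Add date bounds only when they are set
  if [[ -n $wayback_date_from ]]; then query+="&from=${wayback_date_from}"; fi
  if [[ -n $wayback_date_to ]]; then query+="&to=${wayback_date_to}"; fi
  printf '%s\n' "$query"
}

build_cdx_query "example.com/blog/"
```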
wayback_machine_downloader_url
Default: https://github.com/hartator/wayback-machine-downloader
URL of Hartator’s Wayback Machine Downloader GitHub repository.
wayback_machine_downloader_cmd
Default: wayback_machine_downloader
[Path to] binary for the Wayback Machine downloader.
wayback_machine_only
Default:
Restrict downloading to URLs that match this filter (enclose in slashes // to treat as a regex and place in quotes). For example, to include only HTML files with .html extension use: "/.*\.html/"
wayback_machine_excludes
Default:
Skip downloading of URLs that match this filter (enclose in slashes // to treat as a regex and place in quotes). For example, to exclude ASP files use: "/.*\.asp.*/"
wayback_machine_statuscodes
Default:
Accepted status codes. The default is 200 — OK. Enter all for 30x (redirections), 40x (not found, forbidden) and 50x (server error).
wget_reject_clause
Default: *login*,*logout*
For connections that require a login, wget is run with a --reject parameter to avoid logouts.
mod_wp
Default: mod_wp.sh
Filename of the WordPress module, as stored in the lib/ directory.
wp_cli_install
Default: https://wp-cli.org/#installing
The URL of the installation instructions for WP-CLI.
wp_permalinks_postname
Default: yes
The permalinks structure has a key bearing on the output. This setting will force it to make use of the post name rather than post ID or dates.
wp_search_plugin
Default: https://makestaticsite.sh/download/contrib/wp-static-search-1-1-1.zip
The URL of a (temporary) version of the WP Static Search plugin tweaked to work offline.
wp_search_dir
Default: wp-static-search
Directory name of search plugin. Within the standard WordPress layout, a directory of this name will be created under the wp-plugins/ directory.
wp_remove_query_strings
Default: yes
Remove query strings from WordPress core URLs.
wp_remove_shortlink
Default: yes
Remove WordPress shortlinks.
wp_disable_embeds
Default: yes
Disable embeds in WordPress.
wp_disable_xmlrpc
Default: yes
Disable support for XML-RPC in WordPress.
wp_remove_wlwmanifest_link
Default: yes
Remove Windows Live Writer <link> tag from header.
wp_remove_rest_api_links
Default: yes
Remove support for REST API in WordPress.
wp_remove_rsd_link
Default: yes
Remove Really Simple Discovery (RSD) tag in WordPress.
htmltidy_cmd
Default: tidy
The command to invoke HTML Tidy, which is usually tidy.
htmltidy_options
Default: -m -q -indent --indent-spaces 2 --show-filename yes --tidy-mark no
Command line options for HTML Tidy. Errors will be collated in a single file in the MakeStaticSite root folder.
htmltidy_errors_file
Default: errors_htmltidy.txt
The error reporting generated by HTML Tidy will be saved in this file.
htmltidy_source_extensions
Default: "htm,html"
List of web document file extensions intended for HTML Tidy.
ink_error
Default: red
(Similarly ink_warning (amber), ink_ok (green), ink_info (lime).) Ink colours supported on all displays, using standard labels: black, red, green, yellow, blue, magenta, cyan, and white. A few additional colours that need 256-colour support, with custom labels: amber, lime, paleblue.
clean_query_extensions
Default: no
Remove query strings from filenames (yes/no).
system_files_cleanup
Default: Thumbs.db,.DS_Store
List of unwanted system files, to be removed from mirror output.
timezone
Default: local
Timestamps are used for marking the creation of .cfg files and for mirror directories. There are three options: local (local time), utc (UTC time, with no local adjustment), and utclocal (local time specified in relation to UTC).
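The three timezone modes could be produced with date along the following lines. The function name timestamp and the exact format string are assumptions, so the timestamps MakeStaticSite actually writes may be laid out differently.

```bash
#!/usr/bin/env bash
timezone="local"

# Hypothetical helper: print a timestamp according to the timezone mode
timestamp() {
  local mode="${1:-$timezone}"
  case "$mode" in
    utc)      date -u +%Y%m%d_%H%M%S ;;    # UTC, with no local adjustment
    utclocal) date +%Y%m%d_%H%M%S%z ;;     # local time plus its offset from UTC
    *)        date +%Y%m%d_%H%M%S ;;       # plain local time
  esac
}

for mode in local utc utclocal; do
  printf '%-9s %s\n' "$mode" "$(timestamp "$mode")"
done
```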
output_level
Default: quiet
This determines the level of reporting to the terminal when running makestaticsite.sh. There are four options with increasing levels of output: silent, quiet, normal and verbose. The setting for output_level tends to be quieter than that for logs (see following entry).
log_level
Default: normal
This determines the level of logging to file when running makestaticsite.sh. There are four options with increasing levels of output: silent, quiet, normal and verbose. The setting for log_level tends to be more verbose than that for terminal output (see previous entry).
log_filename
Default: makestaticsite.log
The file name for logs. A single file stores all logged activity; separate processes (manual or automated) can carry out log rotation, as required.
trap_errors
Default: no
Trap errors with immediate script termination (yes/no). This is used to support debugging during development. It stops the script if any command [in a pipeline] fails, if a variable is unset, or an exit code indicates failure, i.e. is nonzero. It then reports the system error.
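The behaviour described corresponds closely to Bash's errexit, nounset and pipefail options combined with an ERR trap. A sketch of how such a switch might be wired up is given below. The function name enable_error_trapping and the message format are assumptions rather than MakeStaticSite's actual code.

```bash
#!/usr/bin/env bash
trap_errors="no"

# Hypothetical helper: switch on strict error handling when trap_errors is 'yes'
enable_error_trapping() {
  if [[ $trap_errors == "yes" ]]; then
    # Exit on a failed command, on an unset variable, and on a failure anywhere in a pipeline
    set -o errexit -o nounset -o pipefail
    # Report where the failure happened before the script terminates
    trap 'echo "Error on line $LINENO (exit code $?)" >&2' ERR
  fi
}

enable_error_trapping
if [[ $- == *e* ]]; then
  echo "errexit is on"
else
  echo "errexit is off"
fi
```

Leaving the option at no keeps the script tolerant of minor failures, which suits routine runs. Setting it to yes is most useful when tracking down a fault.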
run_unattended
Default: no
In a few instances, makestaticsite.sh may prompt the user with a warning message and then ask whether or not to continue; for example, after encountering an error code on running wget or when it is about to write data to a non-empty directory. If run_unattended is set to yes, it will be generally assumed that the choice is made to always continue, without manual intervention.
extras_dir
Default: extras
This is the name of the directory containing any files — in nested folders relative to the site’s web root — that should be added after the mirror has been generated.
force_ssl
Default: yes
Convert anchors to deployment domain to https (yes/no). The name of this constant deliberately echoes the use in WordPress.
force_domains
Default: yes
Automatically replace occurrences of the source domain with the deployment domain (yes/no). If set to ‘no’, then a prompt will be issued at runtime reporting on the number of matches found.
domain_match_prefix
Default: //
Domain prefix for matches (in sed).
domain_subs_prefix
Default: //
Domain prefix for substitutions (in sed).
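To illustrate how the two prefixes anchor a domain substitution in sed, here is a minimal sketch. The source and deployment domains, the function name replace_domain and the choice of | as the sed delimiter are assumptions for the example.

```bash
#!/usr/bin/env bash
domain_match_prefix="//"
domain_subs_prefix="//"
source_domain="dev.example.test"   # hypothetical source domain
deploy_domain="www.example.com"    # hypothetical deployment domain

# Hypothetical helper: rewrite the source domain to the deployment domain on stdin
replace_domain() {
  local escaped
  # Escape dots so that sed treats them literally rather than as wildcards
  escaped="$(printf '%s' "$source_domain" | sed 's/\./\\./g')"
  sed "s|${domain_match_prefix}${escaped}|${domain_subs_prefix}${deploy_domain}|g"
}

echo '<a href="https://dev.example.test/about/">About</a>' | replace_domain
```

Matching on the // prefix restricts replacements to occurrences that follow a URL scheme (or sit in a protocol-relative URL), so the domain appearing in ordinary page text is left alone.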
rsync_options
Default: (-a -z -h)
Core rsync options (excludes the output level): -a, archive mode, preserves permissions, ownership, modification times, etc.; -z compresses data during transfer; -h outputs numbers in human-readable format.

It’s recommended that other options are left as they are.

This page was published on 1 November 2022 and last updated on 13 May 2024.