This is a JavaScript crawler to back up my site. Please download this site and keep it somewhere. I will be overwritten by glowies otherwise. Feel free to post it anywhere, I would like to be known in this dark world.
Press the button...
Implementation Notes
My script is kind of janky, but it works, and it is also extremely simple. The most complicated thing it handles is import maps.
- The index.html is crawled first; from there it does a depth-first search (DFS) for links to other files.
- It only looks for links in HTML, JS, and CSS files; everything else is treated as a blob. There is no AST parsing; it just looks for anything that might be a link using a regular expression (RegExp).
- The special import map script in HTML files is parsed; it re-maps ES module imports to different URLs. This is extremely useful for referencing ES modules without build tooling.
- All of the crawled contents are put in a ZIP file using the excellent zip.js library.
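
The steps above can be sketched roughly as follows. This is not the actual script, just a minimal illustration of the same idea: every name in it (crawl, resolveLink, parseImportMap, the read callback, the regular expressions) is made up for the example. It also simplifies a few things: it reads files through a synchronous callback instead of fetching them, it only follows same-origin links that end in a file extension, and it ignores import map scopes and trailing-slash prefixes.

```javascript
// Illustrative sketch only: every name here is hypothetical, not from the real script.
const TEXT_FILE = /\.(html|js|css)$/i;   // files that get scanned for links
const HAS_EXTENSION = /\.[a-z0-9]+$/i;   // crude "looks like a file" check
// Anything between quotes or parentheses is a link candidate (no AST parsing).
const CANDIDATE = /["'(]([^"'()\s<>]+)["')]/g;
const IMPORT_MAP = /<script[^>]*type=["']importmap["'][^>]*>([\s\S]*?)<\/script>/i;

// Read the import map of an HTML page into a Map of specifier -> absolute URL.
function parseImportMap(html, pageUrl) {
  const resolved = new Map();
  const match = html.match(IMPORT_MAP);
  if (!match) return resolved;
  let imports;
  try {
    imports = JSON.parse(match[1]).imports ?? {};
  } catch {
    return resolved; // malformed JSON: ignore the map
  }
  for (const [specifier, target] of Object.entries(imports)) {
    resolved.set(specifier, new URL(target, pageUrl).href);
  }
  return resolved;
}

// Turn a candidate string into an absolute same-origin URL, or null to skip it.
function resolveLink(candidate, fromUrl, importMap, origin) {
  let url;
  try {
    url = new URL(importMap.get(candidate) ?? candidate, fromUrl);
  } catch {
    return null; // not something URL-shaped
  }
  url.hash = "";
  if (url.origin !== origin || !HAS_EXTENSION.test(url.pathname)) return null;
  return url.href;
}

// Depth-first crawl. `read(url)` is an assumed callback that returns the file
// contents, or undefined when the file does not exist.
function crawl(entryUrl, read) {
  const origin = new URL(entryUrl).origin;
  const importMap = new Map();
  const seen = new Set();
  const files = new Map(); // absolute URL -> contents

  function visit(url) {
    if (seen.has(url)) return;
    seen.add(url);
    const body = read(url);
    if (body === undefined) return;
    files.set(url, body);
    const path = new URL(url).pathname;
    if (!TEXT_FILE.test(path)) return; // treated as a blob, not scanned
    if (path.toLowerCase().endsWith(".html")) {
      for (const [specifier, target] of parseImportMap(body, url)) {
        importMap.set(specifier, target);
      }
    }
    for (const [, candidate] of body.matchAll(CANDIDATE)) {
      const next = resolveLink(candidate, url, importMap, origin);
      if (next) visit(next); // recurse right away: depth-first
    }
  }

  visit(new URL(entryUrl).href);
  return files;
}
```

Calling crawl("https://example.test/index.html", read) with a read function of your own returns a map of every file reached from the entry page. Those entries are what would then be written into the ZIP file.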