Check for Broken Links in Jekyll
You know the frustrating experience: There you are, reading an interesting article, you click on that promising link… and all you get is a 404 page.
If you are lucky, that is.
Let’s try to do better than that. Fixing dead links will not only delight your esteemed readers but also make search engine crawlers happy. Here’s a simple way to check for broken links in your Jekyll-based website.
The html-proofer Ruby gem is a tool to check the validity of your HTML output. Among other things, it checks for broken links, both internal and external.
Installation is quick and easy:
- Add gem "html-proofer" to your Gemfile
- Run bundle install
Run the tool on your Jekyll build directory:
bundle exec htmlproofer --assume_extension '.html' ./_site
The --assume_extension option tells the tool to append the given extension to file paths that lack one, so that extension-less URLs, as supported by Jekyll 3 and GitHub Pages, resolve to the generated .html files.
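To make the idea concrete, here is a minimal Ruby sketch of such a lookup. It is not html-proofer's actual implementation; the helper name resolve_internal_link and the order in which candidates are tried are assumptions chosen purely for illustration.

```ruby
require 'tmpdir'

# Hypothetical helper (not part of html-proofer): maps an internal link
# to a file in the build directory, or returns nil if nothing matches.
def resolve_internal_link(site_dir, href, assumed_extension = '.html')
  # Drop any query string or fragment, then the leading slash.
  path = href.sub(/[?#].*\z/, '').sub(%r{\A/}, '')
  candidate = File.join(site_dir, path)

  # Try the literal path, then a directory index, and finally the
  # path with the assumed extension appended.
  [
    candidate,
    File.join(candidate, 'index.html'),
    candidate + assumed_extension
  ].find { |file| File.file?(file) }
end

# Small demonstration against a throwaway build directory.
Dir.mktmpdir do |site|
  File.write(File.join(site, 'about.html'), '<h1>About</h1>')

  p resolve_internal_link(site, '/about')    # path of about.html
  p resolve_internal_link(site, '/missing')  # prints nil
end
```

Without the assumed extension, the link /about would count as broken here, even though the web server would happily serve about.html for it.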
See the HTMLProofer README for more details and options. In my case, I added the --http_status_ignore=999 option to ignore some external links that return status 999 when the request does not come from a regular browser.
You can use the Ruby-based task automation tool Rake to further automate the process. Add gem "rake" to your Gemfile and run bundle install again to install it. Then create a Rakefile with the following contents:
require 'html-proofer'

# Build the site, then check the generated HTML for broken links.
task :test do
  sh "bundle exec jekyll build"
  options = { :assume_extension => '.html' }
  HTMLProofer.check_directory('_site/', options).run
end
Now you can simply run bundle exec rake test to check your HTML files.
Depending on your setup, you could also integrate this into your build pipeline to run the test automatically, e.g., when you push new changes or regularly during a nightly build.
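For such a pipeline, it is convenient if a plain bundle exec rake runs the check. The following Rakefile variant is a sketch under a few assumptions: the constant name PROOFER_OPTIONS is made up for this example, the :http_status_ignore key mirrors the command-line flag mentioned above and may be named differently in other html-proofer releases, and loading the gem inside the task body is my own choice so that the Rakefile can be read even where the gem is not installed.

```ruby
# Redundant inside a real Rakefile, but it makes this file loadable
# as a plain Ruby script as well.
require 'rake'

# Hypothetical constant collecting the checker options in one place.
# The :http_status_ignore key is an assumption based on the CLI flag.
PROOFER_OPTIONS = {
  :assume_extension => '.html',
  :http_status_ignore => [999]
}

# Build the site, then check the generated HTML for broken links.
task :test do
  require 'html-proofer' # loaded lazily, only when the task runs
  sh "bundle exec jekyll build"
  HTMLProofer.check_directory('_site/', PROOFER_OPTIONS).run
end

# Make the link check the default, so "bundle exec rake" is enough.
task :default => :test
```

The pipeline step then only needs to call bundle exec rake. If the build command fails, Rake exits with a non-zero status, which most CI systems treat as a failed job.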