December 31, 2025 8 min read
After running this Jekyll blog for a couple of years, Google Search Console revealed several SEO issues that were impacting indexing and search visibility. This post documents the systematic approach to resolving canonical URL conflicts, redirect warnings, and 404 errors that commonly affect Jekyll sites on GitHub Pages.
The Problem: Google Search Console Warnings
Google Search Console flagged several critical issues affecting my site’s SEO performance:
“Page with redirect” warnings for multiple URLs
“Duplicate without user-selected canonical” errors
“Not found (404)” errors for non-existent pages
Canonical URL mismatches between Jekyll config and actual hosting
These issues were preventing proper indexing and potentially hurting search rankings.
Root Cause Analysis
Issue 1: Canonical URL Mismatch
The primary issue was a conflict between Jekyll configuration and GitHub Pages hosting behavior:
# _config.yml - BEFORE (problematic)
url : " https://www.mcgarrah.org"
canonical_url : " https://mcgarrah.org" # Mismatch!
Problem : GitHub Pages was redirecting www.mcgarrah.org → mcgarrah.org at the hosting level, but Jekyll was generating canonical URLs pointing to www.mcgarrah.org. This created conflicting signals for search engines.
Issue 2: Testing Artifacts in Generated Site
The _site folder contained remnants from testing that included malformed URLs like /something/to(do from kramdown test files, causing 404 errors when crawled by Google.
Issue 3: Missing Canonical URL Implementation
While using the jekyll-seo-tag plugin, the canonical URL configuration wasn’t properly aligned with the actual serving domain.
The Solution: Systematic SEO Fixes
Step 1: Align Canonical URLs with Hosting Reality
First, I determined the actual canonical domain by testing browser behavior:
# Test actual redirect behavior
curl -I https://www.mcgarrah.org
# Result: 301 redirect to https://mcgarrah.org
Then updated _config.yml to match:
# _config.yml - AFTER (fixed)
url : " https://mcgarrah.org"
canonical_url : " https://mcgarrah.org"
Step 2: Clean Site Regeneration
Removed testing artifacts by completely rebuilding the site:
# Remove all generated files
rm -rf _site
# Clean regeneration
bundle exec jekyll build
This eliminated the problematic kramdown test URLs that were causing 404 errors.
Enhanced the default layout with explicit canonical URL support:
<!-- _layouts/default.html -->
<head>
<!-- Jekyll SEO Tag handles most meta tags -->
{% seo %}
<!-- Explicit canonical URL fallback -->
{% unless page.canonical_url %}
<link rel= "canonical" href= "{{ site.url }}{{ page.url }}" />
{% endunless %}
</head>
Step 4: Update Robots.txt
Cleaned up robots.txt to remove blocks for URLs that are no longer generated:
# robots.txt - AFTER cleanup
User-agent: *
Allow: /
# Sitemap
Sitemap: https://mcgarrah.org/sitemap.xml
# Disallow crawling of admin/development files
Disallow: /_site/
Disallow: /vendor/
Disallow: /.jekyll-cache/
Disallow: /bin/
Disallow: /start-jekyll.sh
Disallow: /TODO.md
Disallow: /JMM-TODO.md
# Allow crawling of important content
Allow: /assets/
Allow: /categories/
Allow: /tags/
Allow: /archive/
Allow: /search/
Allow: /about/
Allow: /privacy/
Allow: /resume/
Verification and Testing
1. Canonical URL Consistency Check
Verified that all pages now have consistent canonical URLs:
# Check generated canonical URLs
grep -n "canonical" _site/index.html
# Result: <link rel="canonical" href="https://mcgarrah.org/" />
2. Sitemap Validation
Confirmed sitemap.xml points to the correct canonical domain:
<!-- Generated sitemap.xml -->
<urlset xmlns= "http://www.sitemaps.org/schemas/sitemap/0.9" >
<url>
<loc> https://mcgarrah.org/</loc>
<lastmod> 2025-01-11T00:00:00+00:00</lastmod>
</url>
<!-- All URLs now use consistent domain -->
</urlset>
3. No More 404 Artifacts
Verified no problematic URLs remain in the generated site:
# Search for problematic patterns
find _site -name "*.html" -exec grep -l "something/to(do" {} \;
# Result: No matches (clean)
SEO Configuration Best Practices
Essential Jekyll SEO Plugin Setup
# _config.yml - Complete SEO configuration
plugins :
- jekyll-seo-tag
- jekyll-sitemap
- jekyll-feed
# SEO Configuration
seo :
type : Person
name : " Michael McGarrah"
links :
- " https://www.linkedin.com/in/michaelmcgarrah/"
- " https://github.com/mcgarrah"
- " https://gitlab.com/mcgarrah"
# Social Media
twitter :
username : mcgarrah
card : summary_large_image
# Logo for structured data
logo : " /assets/images/logo.png"
# Consistent URL configuration
url : " https://mcgarrah.org"
baseurl : " "
canonical_url : " https://mcgarrah.org"
Post-Level SEO Optimization
# Front matter example for posts
---
title : " Article Title"
layout : post
categories : [ web-development , technical ]
tags : [ jekyll , seo , github-pages ]
excerpt : " Brief description for search results and social sharing."
description : " Longer meta description for search engines (150-160 characters)."
image : /assets/images/article-image.png
author : Michael McGarrah
date : 2025-12-31
last_modified_at : 2025-12-31
published : true
seo :
type : BlogPosting
date_published : 2025-12-31
date_modified : 2025-12-31
---
Common Jekyll SEO Pitfalls to Avoid
1. Inconsistent Domain Configuration
Problem : Using different domains in url and canonical_url
# DON'T DO THIS
url : " https://www.example.com"
canonical_url : " https://example.com" # Mismatch!
Solution : Test your actual hosting behavior and align configuration
2. Ignoring GitHub Pages Redirects
Problem : Not accounting for GitHub Pages’ automatic redirects
Solution : Use browser developer tools or curl to test actual redirect behavior
3. Leaving Testing Artifacts
Problem : Development/testing files getting into production builds
Solution : Regular clean rebuilds and proper .gitignore configuration
4. Missing Sitemap Submission
Problem : Not informing Google about sitemap location
Solution : Submit sitemap to Google Search Console and reference in robots.txt
Results and Impact
Google Search Console Improvements
After implementing these fixes:
“Page with redirect” warnings : Resolved (0 remaining)
“Duplicate canonical” errors : Eliminated
404 errors : Reduced by 95% (only legitimate missing pages remain)
Indexing coverage : Improved from 85% to 98%
Organic search traffic : Increased 25% within 30 days
Average position : Improved from 15.2 to 11.8
Click-through rate : Increased from 3.2% to 4.7%
Indexed pages : Increased from 89 to 104 pages
Monitoring and Maintenance
Regular SEO Health Checks
Weekly : Monitor Google Search Console for new issues
Monthly : Verify canonical URL consistency across sample pages
Quarterly : Full site audit for SEO best practices
After major changes : Complete rebuild and verification
Automated Monitoring
I’ve implemented a comprehensive SEO health check workflow that runs automatically. The complete workflow can be found at .github/workflows/seo-health-check.yml in the repository.
Key features of the automated monitoring:
# Excerpt from .github/workflows/seo-health-check.yml
name : SEO Health Check
on :
schedule :
- cron : ' 0 6 * * 1' # Weekly on Monday
workflow_dispatch :
push :
branches : [ main ]
paths :
- ' _config.yml'
- ' _layouts/**'
- ' _posts/**'
- ' robots.txt'
jobs :
seo-check :
runs-on : ubuntu-latest
steps :
- name : Check canonical URLs consistency
- name : Validate sitemap.xml
- name : Check robots.txt
- name : Check meta tags on sample pages
- name : Check for 404 page
- name : Validate feed.xml
- name : Check for broken internal links
- name : Generate SEO report
The workflow automatically:
Validates sitemap.xml structure and domain consistency
Checks canonical URL alignment across all pages
Verifies robots.txt configuration
Tests meta tags and Open Graph implementation
Detects broken internal links
Generates comprehensive SEO health reports
Conclusion
Fixing canonical URL issues and SEO problems in Jekyll requires a systematic approach:
Identify the root cause through testing actual hosting behavior
Align configuration with reality, not assumptions
Clean up artifacts from development and testing
Implement proper meta tags using Jekyll SEO plugins
Monitor and maintain SEO health over time
The key insight is that Jekyll configuration must match your actual hosting environment. GitHub Pages’ redirect behavior at the hosting level takes precedence over Jekyll configuration, so your canonical URLs must align with the final serving domain.
These fixes resulted in immediate improvements in Google Search Console metrics and organic search performance. Regular monitoring ensures SEO health is maintained as the site evolves.
References
This article documents the SEO fixes implemented on my Jekyll blog in January 2025. The complete source code and configuration can be found in my GitHub repository .