The _data Folder in Action: Powering Dynamic Jekyll Content

Among Jekyll’s most underrated yet powerful features is the _data folder. For developers and bloggers alike, understanding how to harness structured data within Jekyll can transform a static site into something flexible, scalable, and surprisingly dynamic. In this guide, we’ll unpack the full potential of the _data folder, explore real-world use cases, walk through hands-on examples, and highlight common pitfalls and best practices.

Why the _data Folder Matters

When building a site with Jekyll, repetition is one of the biggest time-wasters. Think about maintaining multiple navigation menus, updating team profiles, or keeping track of external resources. Without a central source of truth, changes quickly become messy. The _data folder solves this by letting you store information in a structured format (YAML, JSON, or CSV) and reuse it throughout your layouts, posts, and includes. This is especially useful for GitHub Pages projects where maintainability matters.

Understanding Structured Data in Jekyll

Structured data in Jekyll is nothing more than key-value pairs and lists stored in files. These can be:

  • YAML – human-readable, most common for Jekyll sites.
  • JSON – good for API-like data structures or when interoperating with other tools.
  • CSV – useful for tabular data like product listings or event calendars.

Jekyll automatically loads every supported file in _data into the global site.data object, keyed by file name without the extension (for example, a file named authors.yml becomes site.data.authors). This means you can call {{ site.data.filename.key }} anywhere in your templates.
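As a minimal sketch of that lookup, assume a hypothetical data file _data/site_info.yml containing a single tagline key (neither the file nor the key comes from a real project). Dot notation and bracket notation reach the same value:

```liquid
{% comment %}
  Assumed file: _data/site_info.yml
    tagline: "Notes on static sites"
{% endcomment %}

<p>{{ site.data.site_info.tagline }}</p>

{% comment %} Bracket notation is handy when the key is stored in a variable. {% endcomment %}
{% assign key = "tagline" %}
<p>{{ site.data.site_info[key] }}</p>
```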

Basic Usage of the _data Folder

Let’s walk through a simple example:

_data/
  authors.yml

Inside authors.yml:

john:
  name: John Doe
  bio: Developer and open-source enthusiast
  twitter: johndoe

jane:
  name: Jane Smith
  bio: Technical writer and blogger
  twitter: janesmith

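You can now display this anywhere in your templates. Because authors.yml is a mapping rather than a list, each loop item is a [key, value] pair, so author[1] holds the author's record: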

{% for author in site.data.authors %}
  <div class="author">
    <h3>{{ author[1].name }}</h3>
    <p>{{ author[1].bio }}</p>
  </div>
{% endfor %}

This ensures consistency across your site without hardcoding the same information repeatedly.
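When a template needs one specific author rather than the whole set, a keyed lookup is usually enough. The sketch below assumes each post declares an author key in its front matter (for example author: jane) that matches a top-level key in authors.yml, and that the twitter field holds a bare handle. Both are conventions assumed for illustration, not something Jekyll requires.

```liquid
{% comment %} page.author is assumed to hold a key such as "john" or "jane". {% endcomment %}
{% assign author = site.data.authors[page.author] %}
{% if author %}
  <p class="byline">
    Written by {{ author.name }}
    (<a href="https://twitter.com/{{ author.twitter }}">@{{ author.twitter }}</a>)
  </p>
{% endif %}
```

The if guard keeps the byline from rendering half-empty when a post names an author that is missing from the data file.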

Case Study: Centralizing Navigation

Imagine a blog with 50+ posts and multiple categories. If you hardcode your navigation menu in multiple layouts, every new page or update becomes a maintenance nightmare. By moving your menu to _data/navigation.yml, you gain flexibility:

main:
  - title: Home
    url: /
  - title: Blog
    url: /blog/
  - title: About
    url: /about/

Then in your layout:

<ul class="menu">
{% for item in site.data.navigation.main %}
  <li><a href="{{ item.url }}">{{ item.title }}</a></li>
{% endfor %}
</ul>

This way, updating navigation is as simple as editing one file, and the changes propagate site-wide on the next build.
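Because the menu now lives in data, the loop can also decorate it. The variant below is a sketch that marks the current page with an active class (the class name is an assumption) and passes each URL through Jekyll's relative_url filter, which prepends baseurl for sites served from a subpath. The comparison assumes page.url matches the url values in navigation.yml exactly, including trailing slashes.

```liquid
<ul class="menu">
{% for item in site.data.navigation.main %}
  <li{% if page.url == item.url %} class="active"{% endif %}>
    <a href="{{ item.url | relative_url }}">{{ item.title }}</a>
  </li>
{% endfor %}
</ul>
```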

Building Dynamic Content with YAML and JSON

The _data folder is particularly effective for repeating elements. Common use cases include:

  • Team pages – store bios and links in YAML, render them dynamically.
  • Resource libraries – manage toolkits, references, or downloadable files.
  • Event calendars – keep dates in JSON or CSV, loop through them in templates.
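CSV files behave slightly differently from YAML mappings: Jekyll treats the first row as column headers and exposes each remaining row as an object keyed by those headers. A minimal sketch, assuming a hypothetical _data/products.csv with name and price columns:

```liquid
{% comment %}
  Assumed file: _data/products.csv
    name,price
    Notebook,4.50
    Pen,1.20
{% endcomment %}

<table>
  <tr><th>Product</th><th>Price</th></tr>
  {% for product in site.data.products %}
  <tr><td>{{ product.name }}</td><td>{{ product.price }}</td></tr>
  {% endfor %}
</table>
```

Since a CSV file is a list of rows rather than a mapping, each loop item is the row itself, so there is no need for the [1] index used with authors.yml.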

Example for JSON:

{
  "2025-01-15": {
    "title": "Jekyll Workshop",
    "location": "Online"
  },
  "2025-02-20": {
    "title": "GitHub Pages Meetup",
    "location": "Berlin"
  }
}

Looping through JSON data is just as straightforward as YAML.
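As a sketch, assume the JSON above is saved as _data/events.json (the file name is an assumption). The top-level object is a mapping, so each loop item is a [key, value] pair: event[0] is the date string and event[1] holds the event's details.

```liquid
<ul class="events">
{% for event in site.data.events %}
  {% assign event_date = event[0] %}
  {% assign details = event[1] %}
  <li>
    {{ event_date | date: "%B %-d, %Y" }}:
    {{ details.title }} ({{ details.location }})
  </li>
{% endfor %}
</ul>
```

If you would rather avoid the index notation, restructuring the file as an array of objects with an explicit date field lets you write event.date and event.title directly.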

Advanced Patterns for Data

Beyond basics, you can create more advanced structures:

  • Hierarchical data – nest categories and subcategories for product catalogs.
  • Localization – store translations in _data/i18n and render content based on language keys.
  • External data pipelines – generate JSON or YAML files with scripts, then feed them into Jekyll.
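The hierarchical case can be sketched with nested loops. The example assumes a hypothetical _data/catalog.yml in which each category carries a subcategories list; the file name and field names are illustrative only.

```liquid
{% comment %}
  Assumed file: _data/catalog.yml
    - name: Stationery
      subcategories:
        - name: Notebooks
        - name: Pens
{% endcomment %}

<ul>
{% for category in site.data.catalog %}
  <li>{{ category.name }}
    <ul>
    {% for sub in category.subcategories %}
      <li>{{ sub.name }}</li>
    {% endfor %}
    </ul>
  </li>
{% endfor %}
</ul>
```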

For example, in multilingual setups, you might have:

_data/i18n/
  en.yml
  fr.yml

With en.yml:

welcome: "Welcome to our site"
about: "About us"

These strings can then be referenced in layouts, either with Liquid conditionals or with a bracket lookup keyed by the page's language.
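A minimal sketch of the lookup approach, assuming each page sets a lang value such as en or fr in its front matter (a convention you would define yourself, not a built-in Jekyll feature) and that fr.yml uses the same keys as en.yml:

```liquid
{% comment %} Fall back to English when a page does not declare a language. {% endcomment %}
{% assign lang = page.lang | default: "en" %}
{% assign t = site.data.i18n[lang] %}

<h1>{{ t.welcome }}</h1>
<a href="{{ '/about/' | relative_url }}">{{ t.about }}</a>
```

Files inside a subfolder of _data are namespaced by the folder name, which is why en.yml is reached through site.data.i18n. A lookup like this avoids an if/elsif branch per language, so adding a new language only means adding one more file.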

Common Mistakes to Avoid

  • Placing _data outside the site's source root – Jekyll won’t recognize it unless you point data_dir at the new location in _config.yml.
  • Using invalid YAML formatting – a single missing space can break your build.
  • Hardcoding values when you should centralize them – losing the benefit of DRY (Don’t Repeat Yourself).

FAQ About the _data Folder

Can I use _data in GitHub Pages without extra configuration?

Yes. GitHub Pages supports YAML, JSON, and CSV in _data by default.

Is there a size limit for data files?

While no strict limit is documented, keeping files under a few megabytes is recommended for performance reasons.

Can _data pull from external APIs?

Not directly. Jekyll does not fetch remote data on its own during a build, but you can preprocess data into _data via scripts or GitHub Actions before the build runs.

Summary and Next Steps

The _data folder is a game-changer for anyone serious about managing a Jekyll site efficiently. From centralizing navigation to powering multilingual experiences, it allows you to scale your site without scaling your headaches. In future articles, we’ll dive deeper into integrating _data with collections, advanced filtering, and dynamic layouts that blur the line between static and dynamic web design.

If you haven’t already, experiment with moving one hardcoded component into _data. You’ll quickly see the benefits of centralized, reusable content in action.