Getting From URL to Template

So you load a page on a WordPress site. How did it know to load the page template? How did it know it was a page and not a category listing? Is it hardcoded somewhere?

To begin with, let’s list what happens, then we’ll explain each step in basic terms:

  1. .htaccess rules pass the URL to the root index.php
  2. Theme and plugins are loaded
  3. WordPress Core matches the URL to a regular expression
  4. Regular expression pulls out fields and maps them on to query variables
  5. Redirection and Canonical URL Checks
  6. Query variables are passed into main WP_Query object
  7. The Theme Template is loaded
  8. Loaded Template runs the main loop

Before I continue: the reason people recommend against query_posts is that it throws away the main query built in step 6 and runs a second one from inside the template, roughly doubling the database work, and it doesn’t do it in a nice way, causing all sorts of problems. There are ways of getting around this, but the TL;DR is: never use query_posts.

1. .htaccess rules

Having pointed your browser at http://www.tomjn.com/hello/, you expect to see a page, probably some kind of greeting. So what happens first?

Well, the first thing that happens is that Apache sees your machine sending a request. It looks up your site’s folder on the server and sees a handy .htaccess file. In this file is probably something like this:

# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress

Some of you may know what this means, but basically it’s telling Apache to hand every request that doesn’t match an existing file or directory to index.php. This is not the index.php in your theme; it is the index.php provided by WordPress, the one that sits next to the wp-admin, wp-content and wp-includes folders. There are similar files and rules for IIS and nginx.

From there, WordPress takes a quick look at some PHP global variables (the $_SERVER superglobal) and extracts the URL that was requested.
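As a rough illustration of that last step, here is a minimal sketch in plain PHP. It is not the code WordPress Core runs; request_path() is a hypothetical helper, and it assumes the requested URL arrives in the REQUEST_URI entry of a $_SERVER-style array.

```php
<?php
// Hypothetical helper, not a WordPress function: derive the requested
// path from a $_SERVER-style array.
function request_path(array $server): string
{
    $uri = $server['REQUEST_URI'] ?? '/';

    // Keep only the path component, dropping any query string.
    $path = parse_url($uri, PHP_URL_PATH);
    if (!is_string($path)) {
        $path = '/';
    }

    // Trim the slashes so "/hello/" becomes "hello".
    return trim($path, '/');
}

echo request_path(['REQUEST_URI' => '/hello/?preview=1']), "\n"; // prints "hello"
```

The trimmed path is the kind of string the rewrite rules in step 3 are matched against.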

2. Theme and Plugins are Loaded

At this point, WordPress makes some internal calls and sets some things up. It looks inside the mu-plugins folder and loads what it can find, tests it has a database connection, etc.

Then, it loads a value from the options table, telling it which theme it needs to load, and which plugins are active. It loads each one by one, and loads the functions.php of the active theme.

Note that at this stage, no content has been requested from the database, and no templates have been loaded. Nothing has been sent to the browser yet.

Warning: If you’ve loaded a page with SHORT_INIT defined as true, any custom post types registered in your theme’s functions.php will be ignored. This is another reason why you should put post types and taxonomies in a dedicated plugin and not inside a theme.

3. WordPress Core matches the URL to a regular expression

Behind pretty permalinks is a long list of regular expressions. These are similar to search queries with wildcards (*), but on steroids. A lot of programmers struggle with regular expressions, but it’s useful to know what they do, even if you don’t understand them.

Starting from the beginning, WordPress checks each expression until it finds the first match. There can be a lot of rules, and it isn’t always clear which ones apply or how they translate. To help with this, I recommend the Monkeyman Rewrite Analyzer plugin. It will show you all rewrite rules, and lets you type in URLs to see which rules match up and at which priority.

A good article for adding, modifying, and removing rewrite rules and pretty permalinks can be found here
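To make the “first match wins” behaviour concrete, here is a small sketch in plain PHP. The three rules are a heavily trimmed, made-up list in the style of WordPress rewrite rules, and first_matching_rule() is a hypothetical helper, not a Core function.

```php
<?php
// Illustrative rule list: specific rules first, the catch-all page rule last.
$rules = [
    'category/(.+?)/?$' => 'index.php?category_name=$matches[1]',
    'search/(.+)/?$'    => 'index.php?s=$matches[1]',
    '(.?.+?)/?$'        => 'index.php?pagename=$matches[1]',
];

// Walk the rules in order and stop at the first pattern that matches.
function first_matching_rule(array $rules, string $path): ?array
{
    foreach ($rules as $pattern => $target) {
        if (preg_match('#^' . $pattern . '#', $path, $matches)) {
            return ['pattern' => $pattern, 'target' => $target, 'matches' => $matches];
        }
    }
    return null; // nothing matched
}

$hit = first_matching_rule($rules, 'hello');
echo $hit['target'], "\n"; // prints "index.php?pagename=$matches[1]"
```

Because the list is checked in order, a URL like category/news is claimed by the category rule even though the catch-all page rule would also match it.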

4. Regular expression pulls out fields and maps them on to query variables

These regular expressions extract data from the URL, which is then mapped on to query variables. So what are these query variables?

Remember when you made a call to WP_Query/query_posts/get_posts, and had to pass in some arguments? Those are the query variables.

Query variables define exactly what it is that a query to find posts should be looking for. They’re like an order at a restaurant for food, only we’re ordering posts from the database, not steak from the kitchen.

For example, the ‘s’ query variable is a search term, so s=hello would be a search for “hello”. You can also pass query variables directly in the URL; this is how page loading works when pretty permalinks are turned off.

You can see a list of all the valid parameters/query vars on the WP_Query codex page. You can also add custom query variables of your own, but that’s a subject for another post.
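Here is a sketch of how a matched rule could be turned into query variables. It assumes rule targets written in the index.php?name=$matches[1] style; query_vars_from_match() is a hypothetical helper in plain PHP, not the Core implementation.

```php
<?php
// Hypothetical helper: fill in the $matches[n] placeholders of a rule
// target, then parse the result into an array of query variables.
function query_vars_from_match(string $target, array $matches): array
{
    $filled = preg_replace_callback(
        '/\$matches\[(\d+)\]/',
        function (array $m) use ($matches): string {
            // Groups that did not take part in the match become empty strings.
            return rawurlencode($matches[(int) $m[1]] ?? '');
        },
        $target
    );

    // Keep only the part after "index.php?" and parse it like a query string.
    $query = (string) parse_url($filled, PHP_URL_QUERY);
    parse_str($query, $vars);

    return $vars;
}

preg_match('#^(.?.+?)(?:/([0-9]+))?/?$#', 'hello', $matches);
$vars = query_vars_from_match('index.php?pagename=$matches[1]&page=$matches[2]', $matches);
// $vars is now ['pagename' => 'hello', 'page' => '']
```

The resulting array has the same shape as the arguments you would hand to WP_Query yourself.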

5. Redirection and Canonical URL Checks

Some search engines consider example.com and www.example.com to be different pages. To correct this, WordPress makes a call to redirect_canonical to check if the requested URL is the canonical, authoritative URL. If it isn’t, the browser is redirected to the correct place.

You can find the implementation of that function here, but be warned, it is long and complex.

Sidenote: redirect_canonical doesn’t always do what it promises, and people have run into unexpected issues in the past.

“redirect_canonical() is larger than I remember…”

Rarst

“It grows automagically with each release.”

Toscho

6. Query variables are passed into main WP_Query object

It’s at this point that 2 things happen:

  • The query variables are put in a nice package and passed through a filter
  • The newly filtered variables are passed into a WP_Query object

The hook in question is pre_get_posts, the de facto standard way of modifying what is loaded on a page. (Strictly speaking, WordPress fires it as an action that receives the query object, but it is used much like a filter.) Using it lets you change values such as how many posts are shown per page, and lets you add extra parameters (e.g. only showing a certain author’s posts on the homepage, or a certain category).
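A minimal sketch of a pre_get_posts callback, as it might appear in a plugin or a theme’s functions.php. The function name and the values (5 posts per page, author ID 1) are assumptions chosen purely for illustration.

```php
<?php
// Illustrative callback: adjust the main query on the blog homepage only.
function example_tweak_home_query($query)
{
    // Leave secondary queries (widgets, menus, custom loops) alone.
    if (!$query->is_main_query()) {
        return;
    }

    if ($query->is_home()) {
        $query->set('posts_per_page', 5); // assumed value
        $query->set('author', 1);         // assumed author ID
    }
}

// The guard only exists so this file can also be loaded outside WordPress.
if (function_exists('add_action')) {
    add_action('pre_get_posts', 'example_tweak_home_query');
}
```

In real code it is common to also return early when is_admin() is true, so that dashboard screens are left untouched.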

After it’s passed through this filter, a new query object is created, and assigned to $wp_query. This is the main query that powers the main loop, complete with dedicated wrappers for easy usage such as the_post and have_posts. For example:

function have_posts() {
    // The wrapper simply forwards the call to the main query object.
    global $wp_query;
    return $wp_query->have_posts();
}

All post loops and queries are WP_Query underneath, so skip the monkey and go straight to the organ grinder: WP_Query.

For more information about how to retrieve posts, refer to this presentation by Andrew Nacin, an experienced Core developer. It will quickly teach you the proper way of querying posts, and why.

7. The Theme Template is loaded

Now that we have a query, we can do a post loop and display all the posts requested. Usually, only 1 post was requested, e.g. a page, so the loop will only go round once. It’s time to load the template!

But how does it know which template to load?

Earlier, when building the list of query variables, flags were set, and these are used to determine what kind of query was being made. For example, if an ‘s’ parameter was passed in, then calling is_search() will return true, and search.php will be loaded. If search.php is not present, WordPress falls back through the template hierarchy until it reaches index.php. This is why index.php is necessary for a theme to work, as it is the last fallback.
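The fallback idea can be sketched in a few lines of plain PHP. This is a deliberately tiny model: choose_template() is a hypothetical helper, the flags array stands in for conditionals like is_search(), and the real hierarchy has many more candidates per query type.

```php
<?php
// Hypothetical model of the fallback: build a candidate list from the
// query flags, then use the first file the theme actually contains.
function choose_template(array $flags, array $theme_files): string
{
    $candidates = [];
    if (!empty($flags['is_search'])) {
        $candidates[] = 'search.php';
    }
    if (!empty($flags['is_page'])) {
        $candidates[] = 'page.php';
    }
    $candidates[] = 'index.php'; // the last fallback for every query type

    foreach ($candidates as $file) {
        if (in_array($file, $theme_files, true)) {
            return $file;
        }
    }

    throw new RuntimeException('A theme needs an index.php to fall back to.');
}

echo choose_template(['is_search' => true], ['index.php', 'page.php']), "\n"; // prints "index.php"
```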

Below you can see the template hierarchy in a flow chart, showing how WordPress decides which template to use, based on the main query it was given.

Template Hierarchy

Parent & Child Themes

To account for parent and child themes, the function get_template_part is used. It’s recommended that you use this function instead of require or include for theme files that display HTML.

When called, the template hierarchy is run through, but for each file, a check is made on the child, then the parent theme.

Also, when loading functions.php, the child theme’s functions.php is loaded first.
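The lookup order can be modelled like this. It is a sketch under stated assumptions: locate_in_themes() is a hypothetical helper, and plain arrays of file names stand in for the child and parent theme folders.

```php
<?php
// Hypothetical model: for each candidate name, check the child theme
// first, then the parent theme, before moving on to the next name.
function locate_in_themes(array $names, array $child_files, array $parent_files): ?string
{
    foreach ($names as $name) {
        if (in_array($name, $child_files, true)) {
            return 'child/' . $name;
        }
        if (in_array($name, $parent_files, true)) {
            return 'parent/' . $name;
        }
    }
    return null; // none of the candidates exist in either theme
}

echo locate_in_themes(
    ['content-page.php', 'content.php'],
    ['content.php'],                     // files in the child theme
    ['content-page.php', 'content.php']  // files in the parent theme
), "\n"; // prints "parent/content-page.php"
```

Note that the order of candidate names wins over the order of themes: a more specific file in the parent is chosen ahead of a more generic file in the child.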

But What about Page Templates?

Where do they fit in? If the current query is for a single post, and that post is of type page, WordPress retrieves the post meta by the name _wp_page_template. You won’t see this in your custom fields box because of the _ at the start.

The _wp_page_template field will contain the filename of a page template to load, and if the file is present in the theme, it will bypass the template hierarchy and load that instead. Any file in the theme can be a page template, though this does mean it’s possible for files such as index.php or header.php to be page templates, and this is strongly discouraged.
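As a sketch of that check in plain PHP: resolve_page_template() is a hypothetical helper, the theme is modelled as an array of file names, and it assumes the value 'default' (or an empty value) means that no page template was chosen.

```php
<?php
// Hypothetical helper: decide whether the stored _wp_page_template value
// should bypass the normal template hierarchy.
function resolve_page_template(string $meta_value, array $theme_files): ?string
{
    // Assumption: an empty value or 'default' means "no page template".
    if ($meta_value === '' || $meta_value === 'default') {
        return null;
    }

    // Only honour the stored file name if the theme still contains that file.
    return in_array($meta_value, $theme_files, true) ? $meta_value : null;
}

$theme_files = ['index.php', 'page.php', 'template-wide.php'];
$template    = resolve_page_template('template-wide.php', $theme_files);
// $template is 'template-wide.php'; a null result means "use the hierarchy".
```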

Using similar checks inside plugins, you can hijack the template loading process to add custom templates to other kinds of posts. There are many plugins that do this on wordpress.org, and many other ways of subtly modifying the template hierarchy, but that is a topic for another post.

8. The Loaded Template runs the main loop

At this point WordPress has loaded your template, and you are in full control of the HTML. This should be familiar territory.

There should be a post loop that displays each post (or the post if it’s a blog post/page), and some other basic templating calls, such as the_title() or the_content() inside that loop.

The Whole Process

We’ve come full circle from a URL being passed to WordPress, down to HTML being sent to the browser. After this point little happens that isn’t caused by the template itself, and a lot of that is attached to the header and footer hooks fired by wp_head and wp_footer.

For those wanting a more technical bird’s-eye view, this diagram by Rarst showing Core load is a good starting point:

WordPress Core Load
