To improve your Search Engine Optimization (SEO), you might need to add a sitemap or robots.txt
file to your Next.js site.
A sitemap defines the relationship between pages of your site. Search engines utilize this file to more accurately index your site. You can also provide additional information such as last updated time, how frequently the page changes, and more.
A robots.txt file tells search engines which pages or files the crawler can or can't request from your site.
Static Sitemaps
If your site does not update frequently, you might currently have a static sitemap.
This is a basic .xml
file defining the content of your site. Here's a simple example:
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://hernanhumana.dev</loc>
</url>
<url>
<loc>https://hernanhumana.dev/blog</loc>
</url>
</urlset>
As your site scales, you will probably want to create your sitemap dynamically.
Dynamic Sitemaps
If your site frequently changes, you should dynamically create a sitemap.
Let's first look at an example where your site content is file-based (e.g., contained inside the /pages
directory).
First, let's add globby
so we can fetch a list of routes.
$ yarn add --dev globby
Note:
globby
might not work on Windows.
Next, we can create a Node script at scripts/generate-sitemap.mjs
.
This file will dynamically build a sitemap based on your /pages
directory.
import { writeFileSync } from 'fs';
import { globby } from 'globby';
import prettier from 'prettier';
async function generate() {
const prettierConfig = await prettier.resolveConfig('./.prettierrc.js');
const pages = await globby([
'pages/*.js',
'data/**/*.mdx',
'!data/*.mdx',
'!pages/_*.js',
'!pages/api',
'!pages/404.js',
]);
const sitemap = `
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmln="http://www.sitemaps.org/schemas/sitemap/0.9">
${pages
.map((page) => {
const path = page
.replace('pages', '')
.replace('data', '')
.replace('.js', '')
.replace('.mdx', '');
const route = path === '/index' ? '' : path;
return `
<url>
<loc>${`https://hernanhumana.dev${route}`}</loc>
</url>
`;
})
.join('')}
</urlset>
`;
const formatted = prettier.format(sitemap, {
...prettierConfig,
parser: 'html',
});
// eslint-disable-next-line no-sync
writeFileSync('public/sitemap.xml', formatted);
}
generate();
Finally, add postbuild
script in your package.json
to run this script after next build
completes.
Your generated file gets created at public/sitemap.xml
which is then accessible from the root of your site.
{
"scripts": {
"build": "next build",
"postbuild": "node ./scripts/generate-sitemap.mjs",
"dev": "next dev",
"start": "next start",
"lint": "next lint"
}
}
External Content
If you have externally hosted data (e.g., a CMS), you'll need to make an API request before you can create your sitemap. This implementation will vary depending on your data source, but the idea is the same. To demonstrate, I've created an example using placeholder data.
First, create a new file at pages/sitemap.xml.js
.
import React from 'react';
class Sitemap extends React.Component {
static async getInitialProps({ res }) {
const request = await fetch(EXTERNAL_DATA_URL);
const posts = await request.json();
res.setHeader('Content-Type', 'text/xml');
res.write(createSitemap(posts));
res.end();
}
}
export default Sitemap;
When the route /sitemap.xml
is initially loaded, we will fetch posts from an external data source
and then write an XML file as the response.
import React from 'react';
const EXTERNAL_DATA_URL = 'https://jsonplaceholder.typicode.com/posts';
const createSitemap = (posts) => `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmln="http://www.sitemaps.org/schemas/sitemap/0.9">
${posts
.map(({ id }) => {
return `
<url>
<loc>${`${EXTERNAL_DATA_URL}/${id}`}</loc>
</url>
`;
})
.join('')}
</urlset>
`;
class Sitemap extends React.Component {
static async getInitialProps({ res }) {
const request = await fetch(EXTERNAL_DATA_URL);
const posts = await request.json();
res.setHeader('Content-Type', 'text/xml');
res.write(createSitemap(posts));
res.end();
}
}
export default Sitemap;
Here's a condensed example of the output.
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://jsonplaceholder.typicode.com/posts/1</loc>
</url>
<url>
<loc>https://jsonplaceholder.typicode.com/posts/2</loc>
</url>
</urlset>
robots.txt
Finally, we can create a static file at public/robots.txt
to define which
files can be crawled and where the sitemap is located.
User-agent: *
Sitemap: https://hernanhumana.dev/sitemap.xml