Control the Sitemap
Are you in Control of Your Sitemap?
Search engines use the sitemap.xml file to understand the structure of the website they are crawling. Information in the sitemap is sent by the crawler back to the search engine provider where that information is used to index the pages of the website just crawled.
A webmaster knows all too well that the sitemap.xml is a file used by the search engines, but the question is: “Are you in control of the sitemap?”
But let us first look at whether the sitemap.xml is important at all.
How important is the Sitemap.xml for SEO?
When you search the internet for the importance of the sitemap.xml, then it seems to be a very important file that you cannot be without. This is all great, but I would rather go to the source for the real answer, and there the search engine providers tell a different story.
When you read what Google has to say to webmasters about the sitemap.xml, then it is clear that when
- a website has fewer than 500 pages, a sitemap.xml is not necessary; however, if your website is large, then a sitemap.xml is useful.
- the internal linking is optimal and all pages can be found from the home page, a sitemap.xml is probably also not needed; however, if the website has crappy internal links, the sitemap.xml shows the crawler which pages are available on the site, and then it becomes necessary.
- there are not many rich media pages, for example pages with a lot of videos or images, a sitemap.xml is not necessary; however, when you have a site with many videos and images, a sitemap.xml will help the Google spider.
Bing writes “Sitemaps are an excellent way to tell Bing about URLs on your site that would be otherwise hard to discover by our web crawler.”
Excellent is not the same as “important” or necessary.
Now we can moan about
- important and not important
- more than 500 pages or less than 500 pages
- good proper infrastructure vs “bad” internal links and structure
- a new site with 0 backlinks or a small site with a million backlinks
It is just good practice to have a sitemap, and the only question is basically how you, as a webmaster, want to be in control of that sitemap.
Create a Sitemap.xml – CMS
Several CMS solutions have the option to install SEO plugins/extensions that often include the creation of the sitemap.xml. Every time a new page is created, the sitemap.xml is updated and the search engines are “pinged” to let them know that the sitemap.xml has changed.
The question that you need to ask yourself is: what is the resource constraint on my system when the sitemap is generated every time a page is added?
There are great solutions for example with WordPress Plugins but
- it is another plugin
- and what will happen if my site is over 10,000 pages and pages are added on the spot? Does this mean that the sitemap is updated continuously?
Although this https://www.seotoolsforwebmasters.com/ is a WordPress site, and I used Joomla and Mobirise before as well, I was never interested in having “internal” software create the sitemap. That has everything to do with not being in control, and as a webmaster you want to be in control of “all” assets of your website, don’t you?
Create a Sitemap.xml – online
The sitemap can also be created with an online website that crawls your site and creates the XML, which you download to your home computer and then upload to your website. This is of course labour intensive and best avoided, unless you have a static site that really never changes or a small website of less than 500 pages.
There are interesting PRO solutions with features similar to the one I will describe below but, again, you need to download and upload or set up an FTP session, which can be seen as opening your website to others.
The biggest problem with all these solutions is that as a webmaster
- you are NOT in control of the data in the file,
- you cannot schedule the creation of the file.
Create the Sitemap.xml – Manually
If you want to be in total control of the sitemap.xml then you have to create it manually, but I do not see many people doing that. In case you do, just learn how to create a sitemap.xml. The format of the sitemap.xml can be found at https://www.sitemaps.org/protocol.html
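To give an idea of what that format looks like in practice, here is a minimal sketch in Python that builds a sitemap.xml following the sitemaps.org protocol. The function name build_sitemap, the example.com URLs and the dates are assumptions made up for illustration; only the element names (urlset, url, loc, lastmod, changefreq, priority) and the namespace come from the protocol itself.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
ET.register_namespace("", SITEMAP_NS)

# Optional child elements of <url>; <loc> is the only required one.
URL_TAGS = ("loc", "lastmod", "changefreq", "priority")


def build_sitemap(pages):
    """Return sitemap XML as bytes.

    pages is a list of dicts with a required 'loc' key and the optional
    keys 'lastmod', 'changefreq' and 'priority' (hypothetical input shape).
    """
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page in pages:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        for tag in URL_TAGS:
            if tag in page:
                ET.SubElement(url, f"{{{SITEMAP_NS}}}{tag}").text = str(page[tag])
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    # Example pages are placeholders, not real URLs of any site.
    example_pages = [
        {"loc": "https://www.example.com/", "lastmod": "2024-01-15", "changefreq": "weekly"},
        {"loc": "https://www.example.com/about"},
    ]
    print(build_sitemap(example_pages).decode("utf-8"))
```

Writing the result to sitemap.xml in the root of the site is then a matter of saving those bytes to a file. For a handful of pages this is manageable; for anything larger you will quickly want the list of pages to come from a crawl.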
Create the Sitemap.XML – Locally
Fortunately, there are solutions out there that are near perfect. One of them is from xml-sitemaps.com, an unlimited sitemap generator that I have been using for many years.
They have a free online solution with a maximum of 500 URLs and a PRO online version with a monthly payment, which might be interesting if you do not have access to, for example, the crontab.
The solution that is “near” perfect, however, is the one you install yourself. It is described here in more detail, including the options to configure many parameters.
The configuration of the XML-Sitemaps is in full swing once you go through all the options available to you.
Highlighted are a few important parameters to look at and make your decision before saving the sitemap configuration.
This is pretty much already filled in with the URL of your site, the physical place of your website and the URL to the sitemap.xml which is basically always placed in the root of the site.
There is not much to configure here, and I would recommend leaving it as-is unless you have a very good reason not to place the sitemap.xml in the root. Please let me know in the comments if you do so.
We always talk about the sitemap.xml but there are more files that are read by the search engines that can be created.
You see that you can create files like the .txt but also the ROR (https://rorweb.com/) file next to files for images and a mobile sitemap.
You will need to purchase the “add-ons” if video, RSS and news are of interest for your or your client’s site.
You will know how often you change anything on your site or which date you would like to give to a page.
Change the frequency
I will not go into much detail on every option because XML-Sitemap has already done that in their excellent documentation (https://www.xml-sitemaps.com/documentation-xml-sitemap-generator.html). If there are items you do not really understand, then it might be interesting to learn a little more about the technical aspects of mastering a website.
Set a user name and password where the password can be over 100 characters long. This might be too long to remember but there are many password managers that solve that problem.
The main point here is the number of URLs in a file before a second XML file is created or, better, an index file that points to a number of XML files.
Compressing the sitemap files is a great thing if you have to watch your bandwidth, but do not forget that the maximum file size, uncompressed, is 50MB.
The store external link list option is an interesting one, especially if you have lost track of all external links on your site.
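The splitting and compressing described above can be sketched as follows. This is not how the XML-Sitemap generator does it internally; it is a small illustration of the idea under my own assumptions. The file names sitemap1.xml.gz and sitemap_index.xml, the function build_sitemap_files and its parameters are made up for the example. The limit of 50,000 URLs per file comes from the sitemaps.org protocol, next to the 50MB uncompressed limit.

```python
import gzip
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
ET.register_namespace("", SITEMAP_NS)

MAX_URLS_PER_FILE = 50_000          # protocol limit per sitemap file
MAX_BYTES_PER_FILE = 50 * 1024 * 1024  # protocol limit, uncompressed


def _xml(root_tag, item_tag, locations):
    """Build <root_tag> holding one <item_tag><loc>…</loc></item_tag> per location."""
    root = ET.Element(f"{{{SITEMAP_NS}}}{root_tag}")
    for loc in locations:
        item = ET.SubElement(root, f"{{{SITEMAP_NS}}}{item_tag}")
        ET.SubElement(item, f"{{{SITEMAP_NS}}}loc").text = loc
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


def build_sitemap_files(urls, base_url, max_urls=MAX_URLS_PER_FILE, compress=True):
    """Return {file name: bytes} for the part files plus one index file.

    File names are hypothetical; base_url is where the files will be served.
    """
    base_url = base_url.rstrip("/")
    files = {}
    part_locations = []
    for start in range(0, len(urls), max_urls):
        number = start // max_urls + 1
        data = _xml("urlset", "url", urls[start:start + max_urls])
        if len(data) > MAX_BYTES_PER_FILE:
            raise ValueError(f"part {number} exceeds 50MB uncompressed; lower max_urls")
        name = f"sitemap{number}.xml"
        if compress:
            data = gzip.compress(data)
            name += ".gz"
        files[name] = data
        part_locations.append(f"{base_url}/{name}")
    # The index file lists the location of every part file.
    files["sitemap_index.xml"] = _xml("sitemapindex", "sitemap", part_locations)
    return files


if __name__ == "__main__":
    demo_urls = [f"https://www.example.com/page-{n}" for n in range(5)]
    demo = build_sitemap_files(demo_urls, "https://www.example.com", max_urls=2)
    for file_name, content in sorted(demo.items()):
        print(file_name, len(content), "bytes")
```

The size check is done before compressing because the 50MB limit applies to the uncompressed file. You then submit only the index file to the search engines, and they find the part files from there.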
To view the external links, you will need to select and save the updated configuration. The menu will show an External Links item.
In the same menu, you see a broken links button, which is absolutely fantastic. After each run, you will find the broken links here together with the page where each link is placed. Nothing is as irritating as following a link that does not exist.
Sitemap-XML Crawler Rules
You will be able to control each and every page of your site. This obviously can be done within each CMS as well; nevertheless, being in control through XML-sitemap is a great asset, especially if you have some landing pages outside of your “regular” website that you do not want to be indexed.
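The idea behind such crawler rules can be sketched in a few lines of Python. This is only an illustration of excluding pages by a part of their path, not the way the generator implements its rules; the function name filter_urls and the /landing/ path are assumptions for the example.

```python
from urllib.parse import urlsplit


def filter_urls(urls, exclude_fragments):
    """Keep only the URLs whose path contains none of the given fragments.

    exclude_fragments is a hypothetical rule list such as ["/landing/"].
    """
    kept = []
    for url in urls:
        path = urlsplit(url).path
        if any(fragment in path for fragment in exclude_fragments):
            continue  # excluded: this page stays out of the sitemap
        kept.append(url)
    return kept


if __name__ == "__main__":
    crawled = [
        "https://www.example.com/",
        "https://www.example.com/landing/spring-offer",
        "https://www.example.com/blog/first-post",
    ]
    print(filter_urls(crawled, ["/landing/"]))
```

Keep in mind that leaving a page out of the sitemap only means you do not advertise it to the search engines; it is not the same as blocking the page from being indexed.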
Highlight Exclusion Preset
In case your site runs on one of the software packages listed in the pull-down menu, select it, as it will help sitemap-xml to create the “perfect” sitemap file for the search engine spider.
Sitemap xml Advanced
Often you see that blogs, news and shops are on a subdomain like blog.domain.extension or shop.domain.extension, which is all great and completely okay with the search engines. However, to make it easier for the crawler, just mention it here and your subdomain will be crawled as well.
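What “crawling the subdomain as well” comes down to is a decision per link: does this URL still belong to my site? A minimal sketch of that decision in Python is shown here. It is my own illustration, not the generator's code; the function in_crawl_scope and its parameters are hypothetical names.

```python
from urllib.parse import urlsplit


def in_crawl_scope(url, root_domain, include_subdomains=True):
    """Return True when url belongs to root_domain (or, optionally, a subdomain)."""
    host = (urlsplit(url).hostname or "").lower()
    root = root_domain.lower()
    # The bare domain and its www variant are always part of the site.
    if host in (root, "www." + root):
        return True
    # Matching on ".domain" avoids treating notexample.com as example.com.
    return include_subdomains and host.endswith("." + root)


if __name__ == "__main__":
    for link in (
        "https://www.example.com/contact",
        "https://blog.example.com/first-post",
        "https://notexample.com/",
    ):
        print(link, in_crawl_scope(link, "example.com"))
```

With include_subdomains switched off, blog.example.com would be treated as an external site and would not end up in the sitemap.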
Start the creation of the Sitemap
The creation of the sitemap can be started manually, and if you really have a large site then it is best to run it in the background.
However, if you have access to the crontab of your website then definitely set up a job to generate the sitemap.
Using the crontab lets you decide when to run the XML Sitemap generator. It would go too far to explain all the crontab options here; just ask your system administrator.
Once the sitemap is created it is just a matter of adding the sitemap to Google Search Console, Bing Webmaster or any other search engine provider where you can configure a link to your sitemaps.
I have been using the XML-Sitemap generator for years, and it still gives me great satisfaction to set up a cronjob for a new site and see the email come in saying that the job was successful.
If you want to be in control of the sitemap.xml then the sitemap generator software is what you need to have.
There is no doubt in my mind that you would love it too. For that reason, and because of all the other benefits, I recommend downloading XML-Sitemap and installing it on your own webserver with a crontab job that generates the sitemap file on the date and time you want.