Incorrect Global Canonicals

Recently in Sydney I was asked by one of the attendees to take a look at their HREFLang implementation. They had spent hours looking at it, but Google still gave errors saying they did not have alternates set. As we have identified before, one of the reasons the search engine cannot find your alternates is that you have set a global canonical on the local pages.

In this example, if we review the source of the Singapore page, https://www.examplesite.com/sg/, we see the HREFLang element is correct, but look just before it and you will see that the Singapore page has a canonical set to the global site. As such, search engines will not index the /sg pages, resulting in an hreflang alternate "not found" error.

It is critical to test that the canonical on each page is correct and is not another language page or a global page.
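As a rough illustration of that test, the sketch below parses a page's source and flags a canonical that points somewhere other than the page itself. The class and function names, the sample markup, and the "en-sg" code are assumptions made up for this example; they are not taken from the attendee's site or from the HREF Testing Tool.

```python
from html.parser import HTMLParser


class HeadLinkParser(HTMLParser):
    """Collects the canonical URL and hreflang alternates from <link> tags."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.alternates = {}  # hreflang code -> href

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = {name: (value or "") for name, value in attrs}
        rels = attr.get("rel", "").lower().split()
        if "canonical" in rels:
            self.canonical = attr.get("href")
        elif "alternate" in rels and attr.get("hreflang"):
            self.alternates[attr["hreflang"].lower()] = attr.get("href")


def canonical_problem(page_url, html):
    """Return a description of the canonical problem, or None if it looks fine."""
    parser = HeadLinkParser()
    parser.feed(html)
    if parser.canonical is None:
        return None  # no canonical set, so nothing conflicts with hreflang
    if parser.canonical.rstrip("/") == page_url.rstrip("/"):
        return None  # self-referencing canonical, which is what we want
    return f"{page_url} has canonical {parser.canonical}"


# Hypothetical markup modelled on the Singapore example above.
sample_html = """
<head>
<link rel="canonical" href="https://www.examplesite.com/" />
<link rel="alternate" hreflang="en-sg" href="https://www.examplesite.com/sg/" />
<link rel="alternate" hreflang="x-default" href="https://www.examplesite.com/" />
</head>
"""

print(canonical_problem("https://www.examplesite.com/sg/", sample_html))
```

The comparison only ignores a trailing slash. A real check would probably also need to normalize protocol, host case and query strings before deciding two URLs differ.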

Note:  We have added canonical checking to the HREF Testing Tool. 

 

HREFLang Error Pointing to Same URL for Each Reference

We are starting to see more and more of these errors, where the site uses the same alternate URL for each of its entries. The image below will help make the problem clearer. We believe one of the free tools is causing the problem, but we are not sure which one. While developing the screen captures for using Screaming Frog to test HREFLang mark-up, it appeared the pages for each country were correct. They all had 24 sets of markup. Once I put the URLs into HREFLang Checker, we found that while there is HREFLang markup on each page, all of the entries point to the same URL.

When we look at the source code, we see they have used the same URL for every entry, essentially telling search engines that the /jp version of the page is the page for all 24 countries.

Going to the French-language France page, we find the same thing: the /fr page has been used as the reference for all entries.

Even stranger, when we check one more and go to the German page, we find that it is correct. Each country is referenced correctly.

This is why we suggest taking a sampling of the pages and testing them to make sure there are no other errors that Screaming Frog or other tools are not catching.
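A sampled page can be screened for this specific error with a few lines of code. The sketch below assumes the hreflang entries have already been extracted into a dictionary of code to URL. The function name, the language codes and the URLs are hypothetical and only mirror the /jp and /de pages described above, trimmed to three entries instead of 24.

```python
def repeated_alternate_url(alternates):
    """Return the shared URL if every non-x-default entry points to it, else None."""
    urls = [
        href.rstrip("/")
        for code, href in alternates.items()
        if code.lower() != "x-default"  # x-default may legitimately repeat a URL
    ]
    if len(urls) > 1 and len(set(urls)) == 1:
        return urls[0]
    return None


# Hypothetical entries from a broken page: every country points at /jp.
jp_page = {
    "ja-jp": "https://www.examplesite.com/jp/",
    "fr-fr": "https://www.examplesite.com/jp/",
    "de-de": "https://www.examplesite.com/jp/",
}

# Hypothetical entries from a correct page: each country has its own URL.
de_page = {
    "ja-jp": "https://www.examplesite.com/jp/",
    "fr-fr": "https://www.examplesite.com/fr/",
    "de-de": "https://www.examplesite.com/de/",
}

print(repeated_alternate_url(jp_page))
print(repeated_alternate_url(de_page))
```

Counting the entries alone would pass both pages, since both have the same number of sets of markup. Comparing the distinct URLs is what separates them.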

 

Are HREFLang XML Site Maps Effective?

Over the past few days on Twitter there have been some comments on the lack of effectiveness of HREFLanguage XML site maps. In my experience, having used them on over 500 different projects, they are extremely effective when they are used correctly.

I have written previously on some of our success stories, including a client that generated $8 million in incremental revenue, another that received a 58% increase in local market traffic, and this newest case, where the average traffic to South American sites increased over 200% after fixing and implementing HREFLanguage XML site maps.

We have already noted that the #1 reason they fail and create No Return Tag Errors is that the XML files are not fully indexed due to errors. I mentioned on Twitter that the majority of the sites I have worked on had significant errors in their XML files, and people could not believe that International SEO agencies could or would make such a simple mistake.

The following are some actual projects where we imported final and live HREFLanguage XML site maps that were full of URLs that did not produce 200 status codes when requested. Note that all of these projects were or are managed by some of the top global agencies in the world.

This first example is one of the most recent. They came to us last month and asked me if I could take a look, since their agency could not find the reason they had nearly 100k No Return errors. We imported all of the current HREFLanguage XML site maps into HREF Builder and immediately started to see problems. These problems were also validated once we were given access to Google Search Console.

The red numbers indicate the number of pages in the current XML site maps that did not generate a 200 header status and/or had a redirect or a robots exclusion. 

Nearly every country has multiple errors, with most clearly above the 1% that search engines have said is an acceptable error rate. Clicking into their global site, we can see how the 1,018 errors were distributed across the various types of problems.
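The arithmetic behind that kind of report is simple. The sketch below assumes each URL in a site map has already been requested and its status code recorded. The function names and the sample data are invented for illustration and are not the client's figures or HREF Builder's actual logic.

```python
from collections import Counter

ACCEPTABLE_ERROR_RATE = 0.01  # the 1% figure cited in the text


def error_rate(statuses):
    """Share of site map URLs whose status is anything other than 200."""
    if not statuses:
        return 0.0
    bad = sum(1 for code in statuses.values() if code != 200)
    return bad / len(statuses)


def exceeds_threshold(statuses, threshold=ACCEPTABLE_ERROR_RATE):
    """True when the site map's error rate is above the acceptable rate."""
    return error_rate(statuses) > threshold


def error_breakdown(statuses):
    """Count the non-200 responses by status code."""
    return Counter(code for code in statuses.values() if code != 200)


# Hypothetical results for one country's site map: 100 URLs, 3 of them bad.
sample = {f"https://www.examplesite.com/sg/page-{i}": 200 for i in range(97)}
sample["https://www.examplesite.com/sg/old-a"] = 301
sample["https://www.examplesite.com/sg/old-b"] = 301
sample["https://www.examplesite.com/sg/missing"] = 404

print(f"{error_rate(sample):.1%}", exceeds_threshold(sample))
print(error_breakdown(sample))
```

This only looks at status codes. Robots exclusions and canonical conflicts would need their own checks before a URL counts as clean.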

They had nearly every type of header error you can imagine, with the majority being 301 redirects. The agency told us that Google would follow the redirects, so there was no need to remove them - WRONG. In another language, we had a case where 214 pages had a 301 redirect to pages that had a canonical tag back to the original page that redirected, sending search engines into a loop.
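The redirect and canonical loop can be detected mechanically once the redirects and canonicals have been collected. The sketch below assumes two lookup dictionaries gathered during a crawl. The function name, the hop limit and the URLs are hypothetical.

```python
def find_loop(start, redirects, canonicals, max_hops=10):
    """Follow redirects, then canonicals, from start.

    Returns the path as a list if a URL is revisited, otherwise None.
    """
    path = [start]
    current = start
    for _ in range(max_hops):
        nxt = redirects.get(current)
        if nxt is None:
            nxt = canonicals.get(current)
        if nxt is None or nxt == current:
            return None  # chain ends on a stable URL, so there is no loop
        if nxt in path:
            return path + [nxt]
        path.append(nxt)
        current = nxt
    return None  # gave up after max_hops without finding a loop


# Hypothetical data: /old 301s to /new, and /new has a canonical back to /old.
redirects = {"https://www.examplesite.com/old": "https://www.examplesite.com/new"}
canonicals = {"https://www.examplesite.com/new": "https://www.examplesite.com/old"}

print(find_loop("https://www.examplesite.com/old", redirects, canonicals))
```

A page with a self-referencing canonical returns None, because the chain stops as soon as the next URL equals the current one.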

Google clearly showed these as a problem in Search Console flagging each XML site map with hundreds of errors.   Once these were removed from the XML site maps, Google quickly reindexed the pages and nearly all of the HREFLanguage errors went away.  

Another example is from a Fortune 100 company that used an agency that told them they spent over 60 hours developing the HREFLang XML files manually in Excel. They have a site targeting over 100 countries, but we only show a sampling of the problems. Out of 2 million pages, nearly 300k had errors.

Imagine the wasted effort for the search engines, hitting nearly 300,000 pages that did not give them a positive result. This is why, when we checked the server logs, a number of these XML site maps had not been requested in a few months.

Here is another case, for a midsized global site where the agency used one of the free "upload your CSV" tools to build an HREFLang XML site map. They did not check that the pages load, that they do not redirect, or that they do not have a canonical to another language site, or check for quality of any kind for that matter. Same process: we imported them into HREF Builder, which checked each page and generated this report.

Every country XML had more than the 1% error rate that search engines have told us is acceptable.

It should be unacceptable to site owners to have this level of junk submitted to the search engines, wasting requests on pages that are broken or redirect. Some consultants are advocating the use of in-page HREFLang tags. This is a great option for sites that only have a few country versions and can implement the code to do it. Unfortunately, for many companies with a large global footprint, in-page is not practical. The main reason to use XML site maps vs. in-page coding is the amount of code added to each page. In all of these cases it would be from 50 to 100 extra lines of code on every page. It would also be a large development process to parse pages that did not exist in the local markets, etc.
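For readers who have not seen the XML approach, here is a minimal sketch of how one set of alternates becomes site map entries, using the standard sitemap and xhtml namespaces. The function name, language codes and URLs are assumptions for illustration. This is not how HREF Builder works internally, and it does none of the status or canonical checking discussed above.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_hreflang_sitemap(alternates):
    """Build a <urlset> in which every URL lists the full set of alternates."""
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("xhtml", XHTML_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    # dict.fromkeys removes duplicate URLs while keeping their order.
    for url in dict.fromkeys(alternates.values()):
        url_el = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url_el, f"{{{SITEMAP_NS}}}loc").text = url
        for code, href in alternates.items():
            ET.SubElement(
                url_el,
                f"{{{XHTML_NS}}}link",
                {"rel": "alternate", "hreflang": code, "href": href},
            )
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical alternates for one page in three markets.
sample_alternates = {
    "ja-jp": "https://www.examplesite.com/jp/",
    "fr-fr": "https://www.examplesite.com/fr/",
    "de-de": "https://www.examplesite.com/de/",
}

print(build_hreflang_sitemap(sample_alternates))
```

Each URL carries the whole set of alternates, including itself, so the markup that would otherwise sit in every page's head lives in the site map instead.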

XML site maps for HREFLanguage management are a very powerful tool, but they must be managed effectively. This is exactly why I created this tool and why a number of large companies are coming to us to have us manage this mission-critical process for them.