Using Screaming Frog to Find In-page HREFLang Mark-up

Our HREFLang Checker is a great tool if you are checking a single page and its alternates. However, if you need to check your entire site and identify pages with hreflang mark-up, there are not many options. With the new release of Screaming Frog 6.0 it is amazingly easy to do. You set up the extraction once and Screaming Frog automatically adds a column for each element it encounters, which is a huge time saver over the previous version.

Earlier editions of Screaming Frog could check some of your pages, but it was a pain to set up and you could only extract up to 10 elements. If you have not upgraded yet and want to do it that way, you can set it up using an excellent step-by-step guide from Justin McKinney at WPromote. You can also generate a similar report with DeepCrawl, which I will profile shortly.

Setting Up Screaming Frog to Find HREFLang Mark-up

Start by opening Screaming Frog and clicking the Configuration option in the top navigation, then click Custom > Extraction to bring up the Custom Extraction module.

Choosing "Extraction" opens a new screen. Once it is open, enable the following settings:

Step 1 - Replace Extractor 1's title with one of your own. I used "HREFLang" but you can name it anything you want.

Step 2 - Choose "XPath"

Step 3 - Copy or type (//*[@hreflang])/@hreflang into the box. If you added it correctly you will get a green check mark. If not, you will get a red X.

Step 4 - Click OK.   
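
To make it clearer what the XPath in Step 3 collects, here is a minimal Python sketch that gathers the hreflang attribute of every element on a page that has one. It is only an approximation of what the extractor does, not Screaming Frog's own code. It uses the standard library HTMLParser rather than a real XPath engine, and the sample HTML and the names HreflangCollector and extract_hreflang are assumptions made up for this illustration.

```python
from html.parser import HTMLParser


class HreflangCollector(HTMLParser):
    """Collect the value of every hreflang attribute, in document order."""

    def __init__(self):
        super().__init__()
        self.values = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; attribute names arrive lowercased.
        for name, value in attrs:
            if name == "hreflang" and value is not None:
                self.values.append(value)


def extract_hreflang(html):
    """Return all hreflang values found in the given HTML string."""
    parser = HreflangCollector()
    parser.feed(html)
    parser.close()
    return parser.values


# Hypothetical page head, for illustration only.
sample_html = """
<html><head>
<link rel="alternate" hreflang="en" href="https://example.com/" />
<link rel="alternate" hreflang="en-GB" href="https://example.com/uk/" />
<link rel="alternate" hreflang="ja" href="https://example.com/jp/" />
</head><body></body></html>
"""

print(extract_hreflang(sample_html))  # ['en', 'en-GB', 'ja']
```

Each value in the returned list corresponds to one of the numbered columns Screaming Frog creates for the page.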

Set the rest of your crawl configuration and run the crawl. Once it is complete, you can view your results by clicking "Custom" in the subnavigation area. As noted, the additional columns (HrefLang2, HrefLang3, etc.) are added for you. The previous version had a limit of 10, but in this case I was able to get all 24 different elements.

As with most reports in Screaming Frog, you can export the results. Click "Export" and choose either Excel or CSV as the export format.
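
Once you have a CSV export, a short script can group the hreflang values by page. The sketch below is only a starting point: the column names ("Address" for the URL and a "HREFLang" prefix for the extracted columns) are assumptions based on the extractor name used above, so check them against the header row of your own export. The sample data and the function name hreflang_by_page are made up for illustration.

```python
import csv
import io


def hreflang_by_page(csv_text, url_column="Address", prefix="HREFLang"):
    """Map each URL to the non-empty values of its hreflang columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    result = {}
    for row in reader:
        # Keep only columns whose header starts with the extractor name
        # and whose cell is not empty.
        values = [
            value
            for header, value in row.items()
            if header and header.startswith(prefix) and value
        ]
        result[row[url_column]] = values
    return result


# Hypothetical export contents, not real crawl data.
sample_export = (
    "Address,HREFLang 1,HREFLang 2,HREFLang 3\n"
    "https://example.com/,en,en-UK,jp\n"
    "https://example.com/about,en,,\n"
)

for url, values in hreflang_by_page(sample_export).items():
    print(url, values)
```

For a real file, read its contents with open(path, newline="", encoding="utf-8") and pass the text to the function.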

Bonus Steps: 

While this is a fast and easy way to gather all of your HREFLang elements in a single export, it does not check them for accuracy. As you can see in some of the rows above, there are errors. We suggest you take a random sample of the pages and put them into our HREFLang Validation Tool, which tests 30 different variables.

Some of the errors that we see in this result are: 

1. All pages seem to have an "en" setting but none of the other languages, and those pages do not reciprocate with alternate tags.

2. The country/region code for the UK is incorrect; it should be GB.

3. The language code for Japanese should be "ja" and not "jp", which is the country code.

4. A big error that Screaming Frog did not detect, nor is it designed to detect, is that the site is using the same URL for all HREFLang entries on a page. While it has 24 HREFLang rows, all of them use the same URL. It can be OK to use the same URL for several regional entries, but not for every country and every language as in this case. This would tell the search engine that the Spanish-language Latin America page is the page for Japan, which is not correct.
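
Errors 2, 3 and 4 are mechanical enough to spot-check with a script. The following is a rough sketch under stated assumptions, not a substitute for a full validator. The two code sets are deliberately tiny illustrative subsets of the ISO 639-1 language list and the ISO 3166-1 alpha-2 region list, the function names are hypothetical, and script subtags such as "zh-Hans" are not handled.

```python
# Tiny illustrative subsets (assumption): a real check would load the
# complete ISO 639-1 and ISO 3166-1 alpha-2 lists.
KNOWN_LANGUAGES = {"en", "es", "fr", "de", "ja", "pt"}
KNOWN_REGIONS = {"US", "GB", "JP", "ES", "MX", "FR", "DE", "BR"}


def check_hreflang(value):
    """Return a list of problems for a 'language' or 'language-region' value."""
    if value.lower() == "x-default":
        return []
    language, _, region = value.partition("-")
    problems = []
    if language.lower() not in KNOWN_LANGUAGES:
        problems.append("unknown language code: " + language)
    if region and region.upper() not in KNOWN_REGIONS:
        problems.append("unknown region code: " + region)
    return problems


def all_same_url(entries):
    """True when a page has several (hreflang, href) pairs sharing one URL."""
    urls = {href for _, href in entries}
    return len(entries) > 1 and len(urls) == 1


for value in ["en", "en-UK", "jp", "ja-JP"]:
    print(value, check_hreflang(value))

# Hypothetical entries from one page, all pointing at the same URL.
page_entries = [
    ("en", "https://example.com/"),
    ("es-MX", "https://example.com/"),
    ("ja", "https://example.com/"),
]
print(all_same_url(page_entries))  # True
```

Checking error 1, missing reciprocal tags, needs the hreflang entries of every alternate page as well, so it is left out of this sketch.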

Note: While there is no limit to the number of HREFLang elements you can add to a page, we suggest keeping it under 5. If you start adding too many, the code adds up. For more information read our Help Guide on Too Many HREFLang Elements.
