Cloaking is a technique used in search engine optimization (SEO) to deliver content to a search engine spider that differs from the content delivered to the user’s browser. It is achieved by serving content that depends on the IP address or the User-Agent HTTP header of the client requesting the page.
A Little More on What is Cloaking
When a requester is identified as a search engine spider, a server-side script serves a different version of the web page: one containing content that is missing from the visible page, or that is present but cannot be searched. Cloaking can be used to trick search engines into showing a page that would not otherwise have been displayed.
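The selection step described above can be sketched as follows. This is a minimal illustration of the decision a server-side script makes, useful mainly for understanding what cloaking detectors look for. The crawler tokens, the IP range, and the function names are hypothetical placeholders, not values used by any real search engine.

```python
from ipaddress import ip_address, ip_network

# Hypothetical values for illustration only; real crawler tokens and ranges differ.
SPIDER_UA_TOKENS = ("examplebot", "samplespider")
SPIDER_NETWORKS = (ip_network("192.0.2.0/24"),)  # reserved documentation range


def looks_like_spider(user_agent: str, remote_ip: str) -> bool:
    """Return True if the User-Agent or the IP address matches a known spider."""
    ua = user_agent.lower()
    if any(token in ua for token in SPIDER_UA_TOKENS):
        return True
    addr = ip_address(remote_ip)
    return any(addr in net for net in SPIDER_NETWORKS)


def select_variant(user_agent: str, remote_ip: str) -> str:
    """Name the page variant a cloaking script would serve for this request."""
    if looks_like_spider(user_agent, remote_ip):
        return "spider_copy"
    return "visible_page"
```

Because either signal alone triggers the spider copy, a detector that only changes its User-Agent string may still be served the visible page if its IP address is not on the list, which is one reason IP-based cloaking is harder to spot.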
Cloaking can also serve a legitimate purpose: informing search engines of content they would otherwise fail to find because it is embedded in non-textual containers or in certain Adobe Flash components. However, better accessibility methods such as progressive enhancement have since been developed, and these have made cloaking unnecessary for regular SEO.
Cloaking is most often used as a web spam technique to coax search engines into giving a site a higher ranking. By the same method, it can be used to deceive users into visiting sites that differ from how they are described in the search results, for example by delivering pornographic content behind non-pornographic search results.
Cloaking is a form of the doorway page technique. A similar technique is used on the DMOZ web directory, although it differs from search engine cloaking in several ways:
- It is intended to fool human editors rather than computer search engine spiders.
- The cloaking decision is usually based on the HTTP referrer, the visitor’s IP address, or the user agent, but more advanced techniques can rely on analysis of the client’s behavior over several page requests. When a user clicks a link on a page, the browser sends the URL of that page as the referrer in its request for the linked page. Some cloakers serve the fake page to anyone arriving from a web directory website, because directory editors normally examine sites by clicking on links that appear on a directory web page. Others withhold the fake page from visitors arriving from a major search engine, which makes cloaking harder to detect without costing many visitors, because most people find websites through a search engine.
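The two referrer-based strategies in the list above can be sketched like this. The host names and function names are hypothetical and chosen only for illustration; the sketch shows the decision logic, not any particular site's implementation.

```python
from urllib.parse import urlsplit

# Hypothetical host names used only for this illustration.
DIRECTORY_HOSTS = {"directory.example"}
SEARCH_ENGINE_HOSTS = {"search.example"}


def referrer_host(referrer: str) -> str:
    """Extract the lower-cased host from a Referer value ('' if absent)."""
    return (urlsplit(referrer).hostname or "").lower()


def fake_for_directory_visitors(referrer: str) -> bool:
    """Strategy 1: serve the fake page only to visitors arriving from a directory."""
    return referrer_host(referrer) in DIRECTORY_HOSTS


def fake_unless_from_search(referrer: str) -> bool:
    """Strategy 2: serve the fake page to everyone except search engine visitors."""
    return referrer_host(referrer) not in SEARCH_ENGINE_HOSTS
```

Under the second strategy, a request with no referrer at all (for example, a directory editor typing the URL directly) still receives the fake page, while the bulk of ordinary traffic arriving from search results does not.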
When a site lacks the natural popularity to rank well because it has no compelling content, webmasters sometimes develop pages specifically for the search engines. The result is pages packed with keywords and other features that may be search engine friendly but that make the pages hard for real visitors to read.
For this reason, black hat SEO practitioners consider cloaking an essential technique, since it allows webmasters to target human visitors and search engine spiders separately. Cloaking thus preserves a good user experience while still satisfying the keyword concentration needed to rank in a search engine.
In mosaic cloaking, dynamic pages are built as tiles of content and only some sections of a page are changed. This reduces the contrast between the cloaked page and the friendly page, and at the same time increases the capacity to deliver targeted content to different spiders and human visitors.
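A small sketch can make the "reduced contrast" idea concrete. It assumes a page is a mapping from tile names to content and measures contrast as the fraction of tiles that differ; the tile names, the sample text, and this contrast measure are illustrative assumptions rather than part of any documented system.

```python
# Hypothetical tiles of the page shown to human visitors.
FRIENDLY_TILES = {
    "header": "Welcome to the shop",
    "body": "Browse our catalogue of garden tools",
    "footer": "Contact us",
}

# Mosaic cloaking: only one tile is swapped for spiders.
MOSAIC_TILES = {**FRIENDLY_TILES, "body": "garden tools cheap garden tools best"}

# Whole-page cloaking: every tile is replaced.
FULL_CLOAK_TILES = {
    "header": "garden tools",
    "body": "garden tools cheap garden tools best",
    "footer": "buy garden tools",
}


def changed_fraction(reference: dict, candidate: dict) -> float:
    """Fraction of tile slots whose content differs from the reference page."""
    changed = sum(reference[name] != candidate[name] for name in reference)
    return changed / len(reference)


mosaic_contrast = changed_fraction(FRIENDLY_TILES, MOSAIC_TILES)    # one of three tiles
full_contrast = changed_fraction(FRIENDLY_TILES, FULL_CLOAK_TILES)  # all three tiles
```

The mosaic page differs from the friendly page in a single tile, so a comparison of the two versions sees far less difference than it would with a fully replaced page.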
Cloaking Versus IP Delivery
IP delivery is a mild variant of cloaking in which different content is provided based on the requester’s IP address. In cloaking, search engines and people never see each other’s pages, while in IP delivery both search engines and people can see the same pages. This technique is popular with graphics-heavy sites that have little textual content for spiders to analyze.
IP delivery is also used to determine the location of a requester and then deliver content written specifically for that country. This is not cloaking in the real sense. IP delivery is, however, a crude and unreliable way of choosing the language in which to provide content, because many countries are multilingual and the requester may be a foreigner. A more reliable method is content negotiation: inspecting the Accept-Language HTTP header sent by the client.
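As a minimal sketch of the content negotiation mentioned above, the following parses an Accept-Language value and picks the best supported language. The function names and the default language are assumptions made for this example, and the matching is deliberately simple (exact tag, then primary subtag); supported tags are expected in lower case.

```python
from __future__ import annotations


def parse_accept_language(header: str) -> list[tuple[str, float]]:
    """Parse an Accept-Language value into (tag, q) pairs, highest q first."""
    prefs = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, *params = [piece.strip() for piece in part.split(";")]
        q = 1.0  # a missing q parameter means full weight
        for param in params:
            if param.lower().startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        prefs.append((tag.lower(), q))
    # sorted() is stable, so tags with equal weight keep their header order.
    return sorted(prefs, key=lambda item: item[1], reverse=True)


def choose_language(header: str, supported: set[str], default: str = "en") -> str:
    """Return the best supported language for the header, or the default."""
    for tag, q in parse_accept_language(header):
        if q <= 0:
            continue
        if tag == "*":
            return default
        if tag in supported:
            return tag
        primary = tag.split("-")[0]  # e.g. "de-at" falls back to "de"
        if primary in supported:
            return primary
    return default
```

For example, `choose_language("fr-CH, fr;q=0.9, en;q=0.8", {"en", "de"})` skips the unsupported French tags and settles on `"en"`, regardless of which country the request came from.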
Many sites have adopted IP delivery to personalize content for their regular customers. Some top sites, such as Amazon, are active users of IP delivery, and none of them have been banned from search engines, since their intent is not deceptive.