SEO is often a long and convoluted process which takes time, dedication and expertise.
Therefore, anything that can simplify or speed up aspects of a campaign is a welcome addition: cue plugins! If your website is hosted on a WordPress CMS, then you are fortunate enough to have access to a plethora of handy plugins.
But with over 54,000 plugins to choose from, selecting the most appropriate and effective plugins for your SEO needs can seem a bit overwhelming. Luckily for you, in this article we have picked out the 14 WordPress plugins which we believe are essential for improving SEO.
From general all-rounder SEO plugins to those engineered for more specific tasks, our recommendations will cover a range of SEO needs.
1. Yoast SEO
We’ll start with the all-rounder SEO plugins and arguably the most well known of the lot. Yoast SEO is an exceedingly popular plugin for fulfilling SEO needs, and is suitable for total newbies and seasoned pros alike.
Yoast is easy to use and intuitive, making it possible for anyone to work with. Its traffic light system provides detailed guidelines, which is perfect for those who are new to SEO.
Its only main downside is that sometimes the plugin does not analyze the entire page – if, for example, part of the content for a particular page sits elsewhere within the CMS. This is not an issue in itself, but it can cause clients to have a temporary meltdown when they see a sea of red alerts.
Nevertheless, with its powerful content analysis abilities, Yoast is my personal recommendation for an all-rounder SEO plugin.
2. All in One SEO Pack
A popular alternative to Yoast, the All in One SEO Pack plugin was actually created prior to Yoast, and therefore built up a dedicated fanbase in the initial years. It provides much the same tools and features as Yoast, but the general consensus is that All in One SEO Pack is slightly less user-friendly than its main rival.
Ultimately it comes down to personal preference. Many people are fans of the traffic light system deployed by Yoast, as it provides a whole host of handy, actionable tips for you to implement.
Alternatively those red, orange and green lights may drive you up the wall, in which case you may want to consider the All in One SEO Pack, or even our third suggestion…
3. The SEO Framework
The SEO Framework is an alternative to Yoast and All in One SEO Pack. It provides an automated and fast SEO solution, plus it is unbranded, meaning a clean interface. Many users find it less bloated than Yoast, therefore arguably more efficient for the user.
This particular plugin is not quite as accessible as Yoast, but it is by no means difficult to use. That said, it may pay to have at least a basic grounding in SEO before using it.
In short, stick to Yoast if you’re a complete beginner, but if you’re confident with the basics, The SEO Framework is a viable alternative.
4. Redirection
Setting up redirects for broken links or deleted pages is key to maintaining a solid user experience.
Redirection is a useful plugin that allows you to handle all kinds of redirects from one easily accessible place. Quickly create new redirects, manage your current redirects and tidy up any loose ends.
5. Smush Image Compression and Optimization
Page speed is probably one of the biggest headaches when optimizing a website and image size is the most common reason for slow loading times. However, if you have a website that is extremely image-heavy, then the thought of manually compressing each image is probably enough to make you want to quit your job.
Well, don’t do that. Because a plugin called Smush exists and aside from being the best-named plugin on this list, it’s also incredibly useful. Smush Image Compression and Optimization compresses all images on your site at the touch of a button.
You also have the option to resize them while you’re at it. It’s perfect for increasing that pesky page speed with minimal effort.
6. W3 Total Cache
Continuing the theme of page speed, W3 Total Cache reduces load time, increases download speed and improves conversion rates. As a result, it helps to improve the overall performance and speed of your site.
With the stamp of approval from numerous high profile websites and industry leaders, what more do you need to know? It’s a no-brainer.
7. WP-Optimize
Another tool for helping you with a website spring clean. WP-Optimize cleans up your WordPress database, making space and improving its overall efficiency. A more efficient database means a better performing and faster site.
8. Google XML Sitemaps
An XML sitemap helps the search engines to crawl your site. Anything that makes life easier for the search engines is generally worth doing. The Google XML Sitemaps plugin allows you to create a sitemap quickly and easily, without the need for using a third-party tool.
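To make it clearer what such a plugin produces, here is a minimal sketch of a sitemap built by hand in Python. The URLs and dates are made-up examples, and the function name is our own; the plugin's actual output will contain more detail.

```python
from xml.etree import ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap XML string for a list of (loc, lastmod) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in urls:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages on an example domain.
sitemap_xml = build_sitemap([
    ("https://example.com/", "2017-11-01"),
    ("https://example.com/blog/", "2017-11-15"),
])
print(sitemap_xml)
```

Each page gets one url entry with its address and the date it last changed, which is the information search engines use to decide what to crawl.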
9. Broken Link Checker
It’s in the name – the Broken Link Checker identifies any broken links on your website.
Broken links can undermine your overall link structure and provide a frustrating user experience. This plugin allows you to uncover any broken links and take the necessary steps to fix them.
10. All In One WP Security & Firewall
Search engines take site security very seriously. The worst kind of user experience would be one which ends in a virus or other form of malware. If your site gets hacked, then it could have a highly detrimental effect on your rankings.
The aim of the All In One WP Security & Firewall is to prevent this happening and keep your website as safe as Fort Knox. On a related note, make sure you invest in an SSL certificate while you’re at it.
11. Yet Another Related Posts Plugin (YARPP)
Search engines take into consideration how quickly users leave your website. It’s therefore worth spending time trying to lower your overall bounce rate. As part of this, you should ensure clear calls to action and provide somewhere for users to naturally go once they are finished with a particular page or post.
For blog content, related posts are perfect for retaining users. Ideally, this functionality would be integrated into the design of your blog.
However, if not, then the Yet Another Related Posts Plugin (YARPP) is a handy alternative and uses an algorithm to determine the most effective related posts. Great for user retention and lowering that bounce rate.
12. All In One Schema.Org Rich Snippets
Schema markup can bring some serious SEO points your way. Although there is no evidence that the markup itself is a ranking factor, just the appeal of having a more enticing appearance in the SERPs through rich snippets should be enough.
It is normally best to implement schema manually, without the use of a plugin. However, the All in One Schema.Org Rich Snippets plugin can help bridge the gap in the meantime until you get around to it.
I know it’s one of those tasks that gets pushed to the bottom of the to-do list, but there’s no time like the present!
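If you do go the manual route, the markup is usually a small JSON-LD block placed in the page. The sketch below generates one for a blog post; the helper name and all of the sample values are our own illustrations, and the properties shown are only a small subset of what schema.org defines for an Article.

```python
import json


def article_json_ld(headline, author_name, date_published, image_url):
    """Build a JSON-LD script tag describing a schema.org Article."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name},
        "datePublished": date_published,
        "image": image_url,
    }
    body = json.dumps(data, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical post details, for illustration only.
snippet = article_json_ld(
    headline="14 WordPress plugins for SEO",
    author_name="Jane Doe",
    date_published="2017-11-20",
    image_url="https://example.com/images/plugins.jpg",
)
print(snippet)
```

The resulting tag can be pasted into the page template, and it is worth running the page through a structured data testing tool afterwards.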
13. Akismet
Akismet is the most popular plugin to deal with spam comments and is useful in preventing those weird and wonderful comments. From an SEO perspective, it stops potentially harmful links appearing in the comments on your site.
Too many of these links could associate you with bad neighborhoods, as although the links are in the comments, they are still coming from your website. They are therefore best avoided at all costs.
14. AMP for WP
With the news of Google’s mobile-first index and the knowledge that over half of website traffic is via mobile, you need to implement Accelerated Mobile Pages (AMP for short) if you haven’t already. The AMP project is an open-source initiative that provides an easy way to generate web pages which will load quickly and smoothly on mobile.
The AMP for WP plugin automatically adds Accelerated Mobile Pages functionality to your site, therefore making it faster for mobile users. It’ll save you a lot of hassle.
This is by no means a comprehensive list of plugins to improve SEO, but these are our personal recommendations. It’s important to remember not to go overboard with plugins. Avoid cutting corners by choosing to use a plugin over achieving better results manually.
Nevertheless, installing and activating a selection of carefully chosen plugins can make the SEO process more accessible, faster and more effective.
Just how big is YouTube these days? According to a really cool infographic that was released earlier in 2017, there are some pretty incredible statistics:
- YouTube is available and used in 88 countries around the world
- It is the second largest social media platform with over 1.5 billion monthly users, second only to Facebook (2 billion) and more than twice the number of Instagram (700 million)
- 500 hours of video are uploaded to YouTube every minute
- Mobile viewing makes up half of the site’s streaming.
In other words, YouTube is HUGE. Not only has it been steadily growing since its initial launch in 2005, it has become the single biggest and most important video service on the web. While there are others that have come in its wake, none have reached the same level of popularity.
With that in mind, it is no wonder that so many people are looking to boost the effectiveness of their content on the platform. However, with so much use comes other struggles, like being seen in the crowd. If 720,000 hours are uploaded a day, you have to do everything possible to stand out and be noticed.
Find the sweet spot with your video title length
There are several things to consider when coming up with the video title:
- How engaging and catchy it is for the eye
- How many important keywords you use within your title (those keywords are going to help you rank that video in both YouTube and Google search)
- Which part of the title is immediately visible when people search YouTube or see your video thumbnail in YouTube-generated related videos.
Taking all of the above into account, the sweet spot for your video title is going to be around 100 characters. That is enough to give a unique, descriptive title while still showing in search without a cut off.
Make sure that the title not only describes what is happening in the video and contains key phrases you have already researched, but is also attention grabbing enough that people will want to click on it.
When crafting a video title, consider including the following:
- Important names and entities (your interviewee, event name, branded hashtag, featured brand name, etc.)
- Location (especially if you are targeting a specific locale)
- The important keyword you’d like the video to show up for.
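The title checks above are easy to automate before you hit publish. Here is a rough sketch: the function name is our own, the 100-character default simply mirrors the figure used in this article, and the sample title and keyword are made up.

```python
def check_title(title, keyword, max_length=100):
    """Return a list of human-readable problems found in a video title."""
    problems = []
    if len(title) > max_length:
        problems.append(
            f"title is {len(title)} characters; aim for {max_length} or fewer"
        )
    # Case-insensitive check that the target keyword appears in the title.
    if keyword.lower() not in title.lower():
        problems.append(f"keyword '{keyword}' is missing from the title")
    return problems


# Hypothetical title and target keyword.
title = "How to Optimize YouTube Videos for Search: Title and Description Tips"
problems = check_title(title, "optimize youtube videos")
print(problems or "title looks fine")
```

An empty list means the title passes both checks; it says nothing about how catchy the title is, which still needs a human eye.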
To distinguish that important keyword, use a keyword clustering technique that allows you to see the core phrases behind obscure keyword variations. My own trick is to use Serpstat’s clustering feature, which allows you to group keywords by how many identical URLs rank in Google for each specific query.
You can read more on how Serpstat clustering feature works in this guide.
You may also want to match each keyword group to the appropriate keyword intent to make sure your future video content will cover the immediate need and prompt engagement.
Make your descriptions longer
Video and channel descriptions are another valuable resource for drawing traffic to all of your content. YouTube allows up to 5,000 characters, which is between 500 and 700 words.
The rule of thumb is obvious: The more original content you have below your video, the easier it is for search engines to understand what your video is about and what search queries to rank it for.
Not every description needs to be that long, but aiming for around 2,000 characters for videos and 3,000 for channels is a good place to start because it gives you the space necessary to optimize your keyword use and give some context to viewers. More is fine, but make sure you aren’t filling it with a lot of pointless fluff.
Make the first 150 characters of a description count
Of the words you write, the first 150 characters are the most important. That is because YouTube cuts the description off with a (More) tag after that point, so the viewer has to specifically opt in to reading the rest. Not all of them are going to do that.
You should make sure those first characters tell the viewer what they really need to know in order to connect with what they are reading. From there you can focus more on keywords and the rest of the description, as it will still count the same towards searches.
It is also a great place to link out to other channels, your website, etc. Make sure your call to action (CTA) is in the first words, such as liking, subscribing, learning more, etc.
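A quick way to sanity check a description is to look at only the part viewers see before the cut-off. The sketch below does that; the function names are our own, the 150-character default comes from the figure above, and the sample text is invented.

```python
def visible_snippet(description, limit=150):
    """Return the part of the description shown before the (More) cut-off."""
    return description[:limit]


def cta_is_visible(description, cta, limit=150):
    """Check whether a call to action appears within the visible snippet."""
    return cta.lower() in visible_snippet(description, limit).lower()


# Hypothetical description with the call to action up front.
description = (
    "Subscribe for weekly SEO tips! In this video we walk through titles, "
    "descriptions, thumbnails and playlists, and show how each one affects "
    "how your videos are found in YouTube and Google search."
)
print(visible_snippet(description))
print(cta_is_visible(description, "subscribe"))
```

If the second line prints False for your own description, move the call to action closer to the start.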
Have a good, high-resolution thumbnail
Thumbnails are pretty standard for monetized video channels at this point. You have probably noticed that they follow a certain pattern: silly face, bright colors, something odd in the background, over the top. Sure, it seems annoying. But they follow the formula because the formula works.
Now, you don’t have to do the same thing. You just want to make sure that you have an eye catching, visually stimulating thumbnail in the recommended 1280 x 720 size. There are a few generators out there to help you make one, but my thumbnail maker of choice right now is Adobe Spark.
Keep in mind that you want a standard format across all of your thumbnails. For instance, if you show your face on one then you should show it on all of them. If you use some kind of animation or logo, use that every time.
You want to be immediately recognizable to anyone who follows your channel right from the suggested videos sidebar, or the search results. If you have old videos, go back and upload thumbnails to each one to start getting some better click results.
Furthermore, make sure your thumbnails are readable: Viewers should be able to easily see what it is about at a glance when seeing it in the right-hand column of the suggested videos or on a small mobile device.
Utilize playlists – I mean it!
Playlists are incredibly helpful. First of all, they help you group together certain videos right on your channel. So let’s say you did a series on how to increase your YouTube views and it was split into ten videos. You would create a playlist on your channel titled “Super YouTube Tips” so that people could find them all in one place. But that has an additional benefit.
Search leans towards introducing playlists right at the top of the results page. It also allows people to specifically search for playlists. That is great because it can introduce viewers to multiple videos instead of one and many will choose to pop on a playlist and watch straight through everything there.
If you do a creative series with a continued plot you will find this is a huge help and makes it a million times easier to sort it out, even if YouTube screws with the order on your channel (an issue more than one content creator has had in the past, take it from me).
To sum that up, YouTube playlists help you:
- Increase your chances to rank your video content for a wider variety of phrases (which is also helpful for brand-focused results)
- Improve engagement rate with your videos by giving your audience collections of videos so that they can sit back and watch endlessly. And we know that engagement is the crucial ranking factor when it comes to YouTube rankings.
To illustrate the point, here’s a quick example of how we were able to grab two spots for our show name with the playlist:
Bonus tip: Feature your videos on your site
Finally, an obvious but often missed tactic is to increase your YouTube channel performance by prominently displaying your videos on your site. It’s simple: the more people watch your videos (especially if they watch more of each of your videos), the more exposure YouTube offers to your content through suggesting your videos as related.
One of the most effective ways to generate more views for your channel is to promote your videos outside of YouTube, i.e. use your blog and social media channels. There’s a variety of WordPress themes that aim at doing exactly that: promote your YouTube channel prominently on-site.
Furthermore, promote your videos on social media as much as it makes sense for your audience to build additional exposure, links, and re-shares.
Do you have any tips for optimizing YouTube? Let us know in the comments!
The social network is asking experts to help it learn to be a less toxic place online.
With the updates to Creative Hub and the Split Test feature, one can now institute a basic ad creation and testing process directly within Facebook.
Read more at PPCHero.com
According to Hochman Consultants (2017), the average cost of pay-per-click (PPC) advertising is increasing – with the average cost-per-click in 2016 being nearly double that of 2013.
When you consider the fact that Google processes over 2.3 million searches per minute (Business Insider, 2016), this is hardly surprising.
But what can marketers do to ensure that they can attract customers on this increasingly competitive channel, while avoiding these burgeoning costs?
In my previous two articles, I looked at how to stop Google AdWords campaigns from failing by using a Customer Data Platform (CDP) to gain a holistic overview of customer behavior, and how data-driven attribution with a CDP can supercharge your paid search.
In this article, I’ll outline five ways that a Customer Data Platform can improve your AdWords performance and ROI by keeping costs down and attracting new business.
Content produced in partnership with Fospha.
1. Data accuracy
Many businesses continue to struggle with optimizing their keyword bids. The simple reason for this is the fact that, regardless of how modern and advanced your bid management platform is, inputting inaccurate data can hinder success – and be costly to your business.
A Customer Data Platform gathers, integrates and centralizes customer data from various sources to give marketers more control of, and visibility over, their data. This data-driven approach stitches together the customer journey, and uses attribution to accurately assign credit to various marketing channels based on their importance in the path to conversion.
Without this true view of their data, businesses are missing the accurate value of their different channels. They also risk making poor decisions about which marketing channels are beneficial, and which are not, which might result in budget being taken away from a channel which has a huge role in the path to conversion.
With more accurate data, Customer Data Platforms are able to highlight the true value of keywords – allowing your business to pinpoint high and low performing keywords and campaigns, and optimize their spend on paid search.
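To give a feel for what assigning credit across channels means, here is a deliberately simple sketch. It uses a linear model, where every touchpoint on a converting path gets an equal share; this is an assumption chosen for illustration and is not how Fospha or any specific Customer Data Platform is confirmed to work. The channel names and values are invented.

```python
from collections import defaultdict


def linear_attribution(paths):
    """Split each conversion's value equally across the channels on its path.

    paths: list of (channels_in_order, conversion_value) tuples.
    """
    credit = defaultdict(float)
    for channels, value in paths:
        if not channels:
            continue  # nothing to credit for an empty path
        share = value / len(channels)
        for channel in channels:
            credit[channel] += share
    return dict(credit)


# Hypothetical customer journeys and the revenue each one produced.
paths = [
    (["paid_search", "email", "direct"], 90.0),
    (["social", "paid_search"], 40.0),
    (["direct"], 20.0),
]
credit = linear_attribution(paths)
print(credit)
```

A last-click view of the same data would give paid search no credit at all, which is exactly the kind of blind spot described above. Data-driven models weight the touchpoints unequally, but the bookkeeping is the same.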
For instance, with a more accurate data source, Fospha were able to help a client identify that 50% of their keywords weren’t contributing to any conversions. Check out the full case study here.
Manual bid management can be a laborious task, but with the help of a bid management platform to automate the process, this becomes a quick, effortless and efficient process. The next step lies in super-charging the capabilities of this platform. And the answer lies in an accurate data source.
Combining the power of the Customer Data Platform to discover high and low performing keywords across all channels through this data, with the automation of a bid management platform, enables spend on poorly performing keywords to be quickly reallocated – resulting in an improvement in ROI.
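As a rough sketch of what reallocating spend could look like, the snippet below frees up the budget of keywords with no conversions and shares it among the rest in proportion to their conversions. The rule, the function name and the figures are all assumptions made for illustration; a real bid management platform applies far more nuanced logic.

```python
def reallocate_budget(keywords):
    """Move spend from keywords with zero conversions to converting ones.

    keywords: dict mapping keyword -> {"spend": float, "conversions": int}.
    Returns a dict mapping keyword -> new budget.
    """
    freed = sum(k["spend"] for k in keywords.values() if k["conversions"] == 0)
    total_conversions = sum(k["conversions"] for k in keywords.values())
    new_budget = {}
    for name, k in keywords.items():
        if k["conversions"] == 0:
            new_budget[name] = 0.0
        else:
            # Share the freed budget in proportion to conversions.
            bonus = freed * k["conversions"] / total_conversions
            new_budget[name] = k["spend"] + bonus
    return new_budget


# Hypothetical keyword performance data.
keywords = {
    "cheap widgets": {"spend": 100.0, "conversions": 0},
    "buy widgets online": {"spend": 50.0, "conversions": 3},
    "widget reviews": {"spend": 50.0, "conversions": 1},
}
budget = reallocate_budget(keywords)
print(budget)
```

The total budget stays the same; only its distribution changes.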
3. Real-time access
Unlike most other Customer Data Platforms, Fospha facilitates real-time interactions for bidding, helping to reduce the number of clicks wasted on the wrong audiences. A Customer Data Platform integrates seamlessly with bid management platforms like Kenshoo and Marin to support these real-time interactions, such as bidding on ad clicks.
Real-time access through a Customer Data Platform also enables marketers to automate their bid management through advanced machine learning.
Marketers are becoming increasingly aware of the importance of moving away from keyword-based marketing, and towards audience-based marketing. However, they can go one step further – making a move towards people-based marketing.
This is no less of a necessity with your bidding strategies. Understanding your audience is crucial, and by utilizing a data-driven attribution model, a Customer Data Platform provides you with a granular understanding of a single customer. From here, you are able to use your data to optimize your targeting and increase conversions by offering more relevant content to your customers.
In addition to this, keyword performance is largely dependent on types of devices used. It is important to boost keywords that do better on mobile and to suppress those that do not. Marin found that by adjusting bids for mobile, their clients enjoyed 10% higher CTR and 2.5% lower CPC than those that failed to do so.
A Customer Data Platform is able to detect these optimized conditions and adjust your bid management strategy accordingly.
5. Bidding strategies
Defining your bidding strategy can drastically improve the performance of your paid search campaigns. However, in order to reach a truly optimized level, different keywords, audiences and goals will require different bidding strategies.
A Customer Data Platform gives you a granular view of all your marketing channels to ensure the strategy deployed is custom to each specific need.
Views expressed in this article do not necessarily reflect the opinions of Search Engine Watch.
You might be thinking, what’s the fuss about website speed? What is important about the average page loading speed?
According to Aberdeen Group, a 1-second delay in page load time yields:
- 11% fewer page views
- A 16% decrease in customer satisfaction
- A 7% loss in conversions
Amazon reported a 1% increase in revenue for every 100 milliseconds of improvement to its website speed, while Walmart found that every 1-second improvement in page load time resulted in a 2% increase in conversions.
The speed of your website additionally impacts your organic search rankings because Google, since 2010, has included site speed as a signal in its ranking algorithm.
In short, your page loading speed is very important and influences how well you rank in SERPs.
Below are 8 simple but highly effective ways to enhance your web speed immediately by modifying your website design.
#1 Optimize images and lazy load everything
Some of the most common bandwidth hogs on the web are images. According to HTTP Archive, images now account for 63% of page weight.
“As of May , the average web page surpassed the 2 MB mark. This was almost twice the size of the average page, just three years ago.”
The graph shows a breakdown of what consumes kilobytes the most. Practically all asset types are growing, with the most notable one being images.
When creating content, some people make use of large images and then use CSS to scale them down. However, unknown to them, the browser still loads them at full size.
For instance, if you scale down an image of 800 x 800 to 80 x 80, the browser still loads roughly 100 times as many pixels as it displays (ten times in each dimension).
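The arithmetic is worth spelling out, because the saving applies to both width and height, so the difference in pixel count is the square of the linear scale factor. A minimal sketch (actual file sizes also depend on the image format and compression):

```python
def pixel_count(width, height):
    """Number of pixels in an image of the given dimensions."""
    return width * height


full_size = pixel_count(800, 800)   # pixels in the file the browser downloads
displayed = pixel_count(80, 80)     # pixels actually shown after CSS scaling
ratio = full_size / displayed
print(f"The browser downloads {ratio:.0f}x more pixels than it displays")
```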
The best way to optimize your images is to compress them into smaller sizes while retaining quality. Using tools such as TinyJPG and Compressor.io, or CMS plugins such as WP Smush It (WordPress) and Imgen (Joomla) for compressing images will guarantee your website loads faster, resulting in better experiences for web visitors and increased conversion rates.
The benefits of Lazy Loading according to Stackpath include:
- Lazy loading strikes a balance between optimizing content delivery and streamlining the end user’s experience.
- Users connect to content faster, since only a portion of the website needs to be downloaded when a user first opens it.
- Businesses see higher customer retention as content is continuously fed to the user, reducing the chances the user will leave the website.
- Businesses see lower resource costs since the content is only delivered when the user needs it, rather than all at once.
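One way to get lazy loading for images in current browsers is the native loading="lazy" attribute on img tags. The sketch below adds it to tags that do not already set a loading attribute. The function name is our own, and the regular expression is a simplification that will not cope with every kind of HTML, so treat it as an illustration rather than a production tool.

```python
import re

# Match <img ...> tags that do not already contain a loading= attribute.
IMG_TAG = re.compile(r"<img\b(?![^>]*\bloading=)([^>]*)>", re.IGNORECASE)


def add_lazy_loading(html):
    """Add loading="lazy" to img tags that have no loading attribute."""
    return IMG_TAG.sub(r'<img loading="lazy"\1>', html)


# Hypothetical page fragment.
page = '<p>Intro</p><img src="a.jpg" alt="A"><img src="b.jpg" loading="eager">'
lazy_page = add_lazy_loading(page)
print(lazy_page)
```

Images that should appear immediately, such as a hero image at the top of the page, are better left out of lazy loading.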
#2 Make use of browser caching
Setting up browser caching allows you to temporarily store data on a site visitor’s computer. This ensures that they don’t have to wait for your web pages (logo, CSS file, and other resources) to load every time they visit your website.
Your server-side cache and their browsing configuration determine how long you store the data. Setting up a browser caching on your server can be done by contacting your hosting company or by checking out the following resources:
Leveraging browser caching means specifying how long web browsers keep CSS, JS, and images stored. This allows your web pages to load much faster for repeat visitors, resulting in a smoother experience while navigating and better rankings in SERPs.
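In practice, "how long" is communicated through the Cache-Control response header. Here is a minimal sketch of choosing a header value by file type; the lifetimes are illustrative assumptions rather than recommendations, and the function name is our own. On a real site this is normally set in the web server or hosting configuration.

```python
import os

# Hypothetical cache lifetimes in seconds, keyed by file extension.
ONE_WEEK = 7 * 24 * 3600
THIRTY_DAYS = 30 * 24 * 3600
CACHE_LIFETIMES = {
    ".css": ONE_WEEK,
    ".js": ONE_WEEK,
    ".png": THIRTY_DAYS,
    ".jpg": THIRTY_DAYS,
}


def cache_control_header(path):
    """Return a Cache-Control header value for the requested file path."""
    extension = os.path.splitext(path)[1].lower()
    max_age = CACHE_LIFETIMES.get(extension)
    if max_age is None:
        # Anything else (such as HTML) is revalidated on every visit.
        return "no-cache"
    return f"public, max-age={max_age}"


print(cache_control_header("/static/site.css"))
print(cache_control_header("/images/logo.PNG"))
print(cache_control_header("/index.html"))
```

Static assets that rarely change get long lifetimes, while pages that change often are checked on each visit.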
Also, installing a caching plugin can have a huge effect on your page loading times. Caching plugins generate a static copy of your content and deliver it to site visitors, which can lessen your page loading time drastically.
Caching plugins could help you see around a tenfold improvement in your overall website performance. One example of a caching plugin is W3 Total Cache.
#3 Compress web content
Google defines compression as the process of encoding information using fewer bits. Though the latest web browsers support content compression, many websites still do not deliver compressed content.
Visitors to these websites experience slow interaction with web pages. Major reasons for this include old or buggy browsers, web proxies, misconfigured host servers, and antivirus software.
Uncompressed content makes downloads and page loads very slow for users who have limited bandwidth.
For effective compression tactics to deliver efficient website content, Google recommends the following:
2) Make use of consistent code in CSS and HTML with the following method:
- Use consistent casing – mostly lowercase.
- Ensure consistent quoting of HTML tag attributes.
- Indicate HTML attributes in the same order.
- Indicate CSS key-value pairs in the same order by alphabetizing them.
Tools such as Adobe Dreamweaver and MAMP can help here: the former for creating and editing CSS, the latter for test running websites locally on your PC.
3) Enable Gzip compression
Gzip finds repeated strings and code patterns and temporarily replaces them with shorter references. According to Google:
“Enabling Gzip compression can reduce the size of the transferred response by up to 90%, which can significantly reduce the amount of time to download the resource, reduce data usage for the client, and improve the time to first render of your pages”
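You can see the effect for yourself with Python's built-in gzip module. The sketch below compresses a deliberately repetitive, made-up chunk of HTML; real pages are less repetitive, so the saving on your own site will differ.

```python
import gzip

# Hypothetical, highly repetitive markup similar to a long navigation list.
html = ('<li class="item"><a href="/page">Link</a></li>\n' * 200).encode("utf-8")

compressed = gzip.compress(html)
saving = 1 - len(compressed) / len(html)

print(f"original:   {len(html)} bytes")
print(f"compressed: {len(compressed)} bytes")
print(f"saving:     {saving:.0%}")
```

On a live site the web server does this on the fly, so enabling it is a configuration change rather than something you code yourself.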
#4 Optimize CSS code and delivery
The introduction of CSS was a key improvement and has had almost no shortcomings. However, it is essential to consider the impact CSS can have on page speed, particularly when it comes to the rendering of a web page.
When CSS is delivered inefficiently, the browser must download and process the styling data before it can finish rendering your web pages for your visitors, which causes a delay. This is why it’s vital to optimize CSS delivery and to identify the pitfalls that can slow down your web pages.
CSS can be used in several ways by a web page and still work. Given that there are various ways to use it, there are several different CSS setups. Regardless of how you set up on your web pages, your CSS should be aiding your web pages to render faster and not slowing them down.
#5 Use a very fast hosting company
Some time back, I changed my hosting company and the average speed of my sites increased dramatically without changing anything else. That was when I realized that hosting companies are not all equal when it comes to the loading speed of websites.
When navigating a website or opening a web page, you are opening files from a remote computer, which is the web server of the hosting company. The faster the remote computer, the faster your web pages can be opened.
There are various hosting companies out there for your use. Just ensure that you carry out proper research and read enough reviews of each before deciding to host your site with any of them.
When researching for a reliable web hosting company, the most important factors you should look out for include speed/load time, uptime/reliability, customer support and price/value.
#6 Deactivate plugins you don’t use
Plugins are often the major reason a WordPress site slows down. Deactivate and delete plugins that you no longer use or that aren’t essential.
You can identify plugins that harm your speed by selectively disabling plugins and measuring server performance.
Additionally, to speed up the experience of your WordPress site on mobile, check out our guide to Implementing Google AMP on your WordPress site as quickly as possible.
As an online marketer, you can’t allow this to bog you down. You must regularly strive to scale your web design to improve your page loading speed.
A little continuous attention to your website loading speed will go a long way. Remember: as little as a one-second delay in your site loading speed is all it takes to lose a lead.
Google is making a number of ad partner announcements around its Accelerated Mobile Pages (AMP) format today. The idea behind AMP is to speed up the mobile web experience for users and it’s no secret that ads play a major role in making regular mobile (and desktop) pages load slowly. With AMP ads, Google and its partners aim to make ads load fast again. The new partners Google is… Read More
Although half of surveyed business leaders say CX is a top-two differentiator for their business, just half of them said they perform well in it. [1]
[1] Source: Harvard Business Review Analytic Services, “Marketing in the Driver’s Seat: Using Analytics to Create Customer Value,” 2015.
Posted by Karen Budell, Content Marketing Manager, Google Analytics 360 Suite
Performing A/B tests on your landing pages can pinpoint exactly what is working and what is failing on your site. In this new live webinar, Unbounce’s Duane Brown and Hanapin’s Samantha Kerr and Kate Wilcox will share their expert CRO advice on how A/B tests have shaped their clients’ success.
Read more at PPCHero.com
One of the limitations of information on the Web is that it is organized differently at each site on the Web. As a newly granted Google patent notes, there is no official catalog of information available on the internet, and each site has its own organizational system. Search engines exist to index information, but they have issues, as described in this new patent that make finding information challenging.
Limitations on Conventional Keyword-Based Search Engines
The patent granted to Google, in September of 2016, discusses a way to organize information on the Web in a manner which can help to better organize and index that information, using context vectors to better understand how words are being used. The patent describes limitations of search engines that are based upon indexing content using keywords, such as:
- A search engine working with conventional keyword searching will return all instances of the keyword being searched for, regardless of how that word is used on a site. This can be a lot of results.
- Conventional search engines may return only the home page of a site that contains the keyword. Finding where the keyword is used on the site could be difficult.
- Often a conventional search engine will return a list of URLs in response to a keyword search that may be difficult to modify or search further in a meaningful manner.
- Information obtained through a search can become dated quickly. Such information may need to be rechecked.
The patent tells us about those limitations and also points out some of the limitations of directories that could also be used to help find information. It then goes on to provide a possible solution to this problem, with a “data extraction tool” capable of providing many of the benefits of both search engines and directories, without the drawbacks that this patent points out.
Is This the Google Search Engine with RankBrain Inside?
A search engine based on a data extraction tool like the one described in the patent would be an improvement over most search engines. Is this Google’s search engine with RankBrain applied to it? It’s possible that it is, though the patent doesn’t use the word RankBrain.
The Bloomberg introduction to RankBrain, Google Turning Its Lucrative Web Search Over to AI Machines, provides information about the algorithm used in RankBrain, and it tells us:
RankBrain uses artificial intelligence to embed vast amounts of written language into mathematical entities — called vectors — that the computer can understand.
This new patent refers to what it calls Context Vectors to index content about words found on the Web. To put it clearly, the patent tells us:
In view of the foregoing, in accordance with the invention as embodied and broadly described herein, a method and apparatus are disclosed in one embodiment of the present invention for determining contexts of information analyzed. Contexts may be determined for words, expressions, and other combinations of words in bodies of knowledge such as encyclopedias. Analysis of use provides a division of the universe of communication or information into domains and selects words or expressions unique to those domains of subject matter as an aid in classifying information. A vocabulary list is created with a macro-context (context vector) for each, dependent upon the number of occurrences of unique terms from a domain, over each of the domains. This system may be used to find information or classify information by subsequent inputs of text, in calculation of macro-contexts, with ultimate determination of lists of micro-contexts including terms closely aligned with the subject matter.
When a searcher submits a query to a search engine, we are told that the search engine may try to give it context based upon “other queries from the same user, the query associated with other information or query results from the same user, or other inputs related to that user.”
The patent is:
User-context-based search engine
Inventors: David C. Taylor
Application Date: 09/04/2012
Grant Number: 09449105
Grant Date: 09/20/2016
A method and apparatus for determining contexts of information analyzed. Contexts may be determined for words, expressions, and other combinations of words in bodies of knowledge such as encyclopedias. Analysis of use provides a division of the universe of communication or information into domains and selects words or expressions unique to those domains of subject matter as an aid in classifying information. A vocabulary list is created with a macro-context (context vector) for each, dependent upon the number of occurrences of unique terms from a domain, over each of the domains. This system may be used to find information or classify information by subsequent inputs of text, in calculation of macro-contexts, with ultimate determination of lists of micro-contexts including terms closely aligned with the subject matter.
When RankBrain was first announced, I found a patent co-invented by one of the members of the team working on it. That patent described how Google might provide substitutions for some query terms, based upon an understanding of the context of those terms and the other words used in a query. I wrote about it in the post, Investigating Google RankBrain and Query Term Substitutions. I think reading both that patent and the one this post is about can be helpful in understanding some of the ideas behind a process such as RankBrain.
This patent provides a lot of insight into the importance of context, and into how helpful context can be to a system that is trying to extract data from a source and index it in a way that makes it easier to locate. I liked this passage in particular:
Interestingly, some words in the English language, and other languages pertain to many different areas of subject matter. Thus, one may think of the universe of communication as containing numerous domains of subject matter. For example, the various domains in FIG. 2 refer to centers of meaning or subject matter areas. These domains are represented as somewhat indistinct clouds, in that they may accumulate a vocabulary of communication elements about them that pertain to them or that relate to them. Nevertheless, some of those same communication elements may also have application elsewhere. For example, a horse to a rancher is an animal. A horse to a carpenter is an implement of work. A horse to a gymnast is an implement on which to perform certain exercises. Thus, the communication element that we call “horse” belongs to, or pertains to, multiple domains.
A search engine that can identify the domains or contexts that a word might fit within may be able to better index such words, as described in this patent:
In an apparatus and method in accordance with the invention, a search engine process is developed that provides a deterministic method for establishing context for the communication elements submitted in a query. Thus, it is possible for a search engine now to determine to which domain or domains a communication element is “attracted.” Since few things are absolute, domains may actually overlap or be very close such that they may share certain communication elements. That is, communication elements do not “belong” to any domain, they are attracted to or have an affinity for various domains, and may have differing degrees of affinity for differing domains. One may think of this affinity as perhaps a goodness of fit or a best alignment or quality alignment with the subject matter of a particular domain.
Contextually Rewarding Search Results
The patent tells us that a search engine that works well is one that provides a searcher with information that is “comparatively close related” to the query: first, information that is exactly what was sought, and then information that is close to what was sought and still useful. What would be “contextually unrewarding,” it tells us, is information that uses the query word in a completely different and useless context.
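To make the idea of contextually rewarding results concrete, here is a minimal Python sketch that orders candidate pages by how closely their context vectors align with the context vector of a query. Everything in it is an assumption made for illustration: the three domains, the sample vectors, the function names, and especially the use of cosine similarity, which is a common way to compare vectors and is not something this post's excerpts attribute to the patent.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank_by_context(query_vector, documents):
    """Return (score, name) pairs, best contextual match first."""
    scored = [(cosine(query_vector, vec), name) for name, vec in documents.items()]
    return sorted(scored, reverse=True)

# Hypothetical vectors; the axes are (ranching, carpentry, gymnastics).
query_vector = (0, 5, 1)  # a query about a carpenter's "horse"
documents = {
    "building a sawhorse": (0, 8, 0),    # what was sought
    "workshop safety tips": (1, 4, 0),   # close and still useful
    "pommel horse scoring": (0, 0, 9),   # same word, unrewarding context
}

ranking = rank_by_context(query_vector, documents)
for score, name in ranking:
    print(f"{score:.2f}  {name}")
```

In this toy data the gymnastics page shares the word "horse" with the query but lands at the bottom, which is the kind of result the patent would call contextually unrewarding.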
Words might be related to a wide range of particular fields or subject matter domains. The patent describes how these might be used:
Typically, a domain list of about 40 to 50 terms have been found to be effective. Some domain lists have been operated successfully in an apparatus and method in accordance with the invention with as few as 10 terms. Some domain lists may contain a few hundreds of individual terms. For example, some domains may justify about 300 terms. Although the method is deterministic, rather than statistical, it is helpful to have about 40 to 50 terms in the domain list in order to improve the efficiency of the calculations and determinations of the method.
The domain lists have utility in quickly identifying the particular domain to which their members pertain. This results from the lack of commonality of the terms and the lack of ambiguity as to domains to which they may have utility. By the same token, a list as small as the domain lists are necessarily limited when considering the overall vocabulary of communication elements available in any language. Thus, the terms in domain lists do not necessarily arise with the frequency that is most useful for rapid searching. That is, a word that is unique to a particular subject matter domain, but infrequently used, may not arise in very many queries submitted to a search engine.
A process for creating a vocabulary list of a substantial universe or a substantial portion of a universe of communication elements may be performed by identifying a body or corpus of information organized by topical entries. Thereafter, the text of each of those entries identified may be subjected to a counting process in which occurrences of terms from the domain list occur within each of the topical entries. Ultimately, a calculation of a macro context may be made for each of the topical entries. This calculation is based on the domain lists, and the domains represented thereby.
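The counting process described in that passage can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patent's implementation: the three domains, their very short term lists (the patent suggests about 40 to 50 terms each), the two sample topical entries, and the function names macro_context and build_vocabulary are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical domain lists of terms that are unambiguous for each domain.
DOMAIN_LISTS = {
    "ranching": {"saddle", "pasture", "hay", "stable", "bridle"},
    "carpentry": {"saw", "lumber", "plank", "nail", "workbench"},
    "gymnastics": {"vault", "pommel", "routine", "dismount", "judges"},
}

def macro_context(text, domain_lists=DOMAIN_LISTS):
    """Count how often each domain's terms occur in the text, one count per domain."""
    words = Counter(re.findall(r"[a-z]+", text.lower()))
    return {
        domain: sum(words[term] for term in terms)
        for domain, terms in domain_lists.items()
    }

def build_vocabulary(corpus, domain_lists=DOMAIN_LISTS):
    """Map every topical entry in the corpus to its macro-context (context vector)."""
    return {entry: macro_context(text, domain_lists) for entry, text in corpus.items()}

# Hypothetical topical entries, standing in for encyclopedia articles.
corpus = {
    "sawhorse": "A sawhorse supports a plank or lumber while you saw it on the workbench.",
    "pommel horse": "Gymnasts perform a routine on the pommel horse and finish with a dismount.",
}

vocabulary = build_vocabulary(corpus)
print(vocabulary["sawhorse"])
print(vocabulary["pommel horse"])
```

Each topical entry ends up with one number per domain, so two entries that both mention a "horse" can still point at different domains.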
This is where this patent enters into the world of the Semantic Web. The places where different subject matter domains may be identified for different words could be in knowledge bases or online encyclopedias. Such collections of what is referred to as public knowledge might be called a “corpus”. This kind of corpus of information could be used to create a context vector used to index different meanings of words.
When a different meaning is found, it might then be counted from that information corpus. The patent tells us that terms found in such a place could be “individual words, terms, expressions, phrases, and so forth.”
The patent attempts to put this into context for us with this statement:
One may think of a topical entry as a vocabulary term. That is, every topical entry is a vocabulary word, expression, place, person, etc. that will be added to the overall vocabulary. That is, for example, the universe may be divided into about 100 to 120 domains for convenient navigation. Likewise, the domain lists may themselves contain from about 10 to about 300 select terms each. By contrast, the topical entries that may be included in the build of a vocabulary list may include the number of terms one would find in a dictionary such as 300 to 800,000. Less may be used, and conceivably more. Nevertheless, unabridged dictionaries and encyclopedias typically have on this order of numbers of entries.
Contexts as Vectors
When RankBrain first came out, there was a post published that looked at some information that might make it a little more understandable; it included some information about Geoffrey Hinton’s Thought Vectors, and there’s more about those in this post from Jennifer Slegg: RankBrain: Everything We Know About Google’s AI Algorithm.
There is a closely related Google Open Source Blog post on word vectors, titled Learning the meaning behind words, written by Tomas Mikolov, Ilya Sutskever, and Quoc Le. Ilya Sutskever was a student of Geoffrey Hinton. Tomas Mikolov worked on a number of papers about word vectors while with the Google Brain team, including Efficient Estimation of Word Representations in Vector Space.
The patent spends a fair amount of time describing what it considers context vectors to be: the different domains which a word might fall into, and the number of occurrences or weights for that word within those domains. It’s worth drilling down into the patent and reading about how a search engine might label terms with context vectors.
When a searcher enters a query into a search engine, the query may be classified within contexts, to help in selecting information in response to that query.
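As a rough illustration of classifying a query within contexts, the Python sketch below adds up the context vectors of the words in a query and reports the domains with the strongest combined affinity. The vocabulary, its weights, and the function names are hypothetical, and simple summing is an assumption chosen for brevity; the patent describes its own calculations in more detail.

```python
from collections import Counter

# Hypothetical context vectors: each word's affinity for each domain.
VOCABULARY = {
    "horse": {"ranching": 6, "carpentry": 3, "gymnastics": 4},
    "saddle": {"ranching": 9},
    "plank": {"carpentry": 7},
    "vault": {"gymnastics": 8, "carpentry": 1},
}

def query_context(query, vocabulary=VOCABULARY):
    """Sum the context vectors of every known word in the query."""
    total = Counter()
    for word in query.lower().split():
        total.update(vocabulary.get(word, {}))  # unknown words contribute nothing
    return total

def best_domains(query, top=2):
    """Return the domains the query is most strongly attracted to."""
    return [domain for domain, _ in query_context(query).most_common(top)]

print(best_domains("horse saddle"))
print(best_domains("horse plank"))
```

On its own, "horse" leans toward several domains at once; a second query word is what tips the balance toward ranching or carpentry.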
Using a Browser Helper Object
The patent describes how it might identify different domains that might be associated with specific terms. It tells us that this might be done:
By compiling a list of domain-specific questions, it is possible to (1) specify differences between very similar domains with great precision, and (2) create a rapid way to prototype a domain that does not require many hours of an expert’s time, and can be expanded by relatively inexperienced people.
The patent also describes the use of a BHO (Browser Helper Object) in this manner:
Another slightly more complex implementation is something like a Browser Helper Object (BHO) that runs on the user’s machine and watches/categorizes all surfing activity. With this system, even non-participating sites can contribute to the picture of the user, and any clicking the user does to ad sites served by certified clicks will pick up a much more comprehensive picture.
The patent provides more details on how this context-vector-based system might work, and how data might be extracted from web pages. It is highly recommended reading if you want to get a better sense of how a context-based system might be used to index the web and make specific information easier to find, improving upon most conventional keyword-based search engines.
Copyright © 2016 SEO by the Sea.