CBPO

Tag: Evolution

Evolution of Google’s News Ranking Algorithm

November 1, 2019 No Comments

Image: Photo by Nathan Dumlao on Unsplash

Did the Algorithm Behind How News Articles Rank at Google Just Change?

A Google patent about how news articles are ranked was updated this week, and the new version suggests how entities in those articles can have an impact on ranking.

How Have News Articles Been Ranked at Google?

This patent was originally filed in 2003.

The beta version of Google News was first launched by Google in 2002, so this was one of the early patents that described how Google ranked news articles.

One of the inventors of the original patent was Krishna A. Bharat, known as a founder of Google News.

The newest version (a continuation patent) was just granted and is the sixth version of the patent. It can be found at:

Systems and methods for improving the ranking of news articles
Inventors: Michael Curtiss, Krishna A. Bharat, and Michael Schmitt
Assignee: Google LLC
US Patent: 10,459,926
Granted: October 29, 2019
Filed: April 27, 2015

This version of the patent provides a history of the five earlier versions, including when they were filed and what their patent numbers are:

This application is a

(1) continuation of U.S. patent application Ser. No. 14/140,108, filed on Dec. 24, 2013, which is a

(2) continuation of U.S. patent Ser. No. 13/616,659, filed on Sep. 14, 2012 (now U.S. Pat. No. 8,645,368), which is a

(3) continuation of U.S. patent application Ser. No. 13/404,827, filed Feb. 24, 2012, (now U.S. Pat. No. 8,332,382), which is a

(4) continuation of U.S. patent application Ser. No. 12/501,256, filed on Jul. 10, 2009, (now U.S. Pat. No. 8,126,876), which is a

(5) continuation of U.S. patent application Ser. No. 10/662,931, filed Sep. 16, 2003, (now U.S. Pat. No. 7,577,655),

the disclosures of which are hereby incorporated by reference herein.

What a Continuation Patent Is

Continuation patents take the filing date of the patent they are continuing (or of the ones those patents are continuing) and are intended to show how the process described by the patents has changed. The processes are set out in the claims sections of the patents, which are the parts that the patent examiner reviews when deciding whether or not to grant the new patents.

Often, looking at the very first claim of each patent can help identify important aspects that have changed from one version of a patent to another. It is somewhat rare (in my experience) to see a patent that has been updated six times, as this one has. I recently wrote about Google’s Universal Search Interface patent, which was updated a fourth time, in Google’s New Universal Search Results.

What Caused A Recent Rankings Change at the New York Times?

A post on Twitter this week suggested that The New York Times may have been negatively impacted by a new algorithm called BERT that was just released at Google, which was announced in Understanding searches better than ever before.

That tweet tells us that BERT may have had an impact, or that a move to Mobile-First Indexing may have caused a loss of rankings at the newspaper’s site. But seeing that tweet, and seeing that there was a new version of this patent, made me curious about what the patent contained and what changes it may have brought about.

The Changing Claims from the Ranking of News Articles Patents

But it’s possible that other changes at Google could also have an impact on rankings at news sites. One way to tell how Google has changed how it ranks articles is to look at how the patent covering the ranking of news articles has changed over time.

Compare how the first four claims from this patent have changed over time.

The latest first claim in this patent introduces some new things to look at:

What is claimed is:

1. A method for ranking results, comprising: receiving a list of objects; identifying a first object in the list and a first source with which the first object is associated; identifying a second object in the list and a second source with which the second object is associated; determining a quantity of named entities that (i) occur in the first object that is associated with the first source, and (ii) do not occur in objects that are identified as sharing a same cluster with the first object but that are associated with one or more sources other than the first source; computing, based at least on the quantity of named entities that (i) occur in the first object that is associated with the first source, and (ii) do not occur in objects that are identified as sharing a same cluster with the first object but that are associated with one or more sources other than the first source, a first quality value of the first source using a first metric, wherein a named entity corresponds to a person, place, or organization; computing a second quality value of the second source using a second metric that is different from the first metric; and ranking the list of objects based on the first quality value and the second quality value.

2. The method of claim 1 wherein the identifying the first source with which the first object is associated includes: identifying the first source based on a uniform resource locator (URL) associated with the first object.

3. The method of claim 1 wherein the first source is a news source.

4. The method of claim 1 wherein computing the first quality value of the first source is further based on: one or more of a number of articles produced by the first source during a first time period, an average length of an article produced by the first source, an amount of important coverage that the first source produces in a second time period, a breaking news score, network traffic to the first source, a human opinion of the first source, circulation statistics of the first source, a size of a staff associated with the first source, a number of bureaus associated with the first source, a breadth of coverage by the first source, a number of different countries from which traffic to the first source originates, and a writing style used by the first source.

From the version of the patent that was filed on Sep. 14, 2012 (now U.S. Pat. No. 8,645,368):

What is claimed is:

1. A method comprising: determining, using one or more processors and based on receiving a search query, articles and respective scores; identifying, using one or more processors, for an article of the articles, a source with which the article is associated; determining, using one or more processors, a score for the source, the score for the source being based on: a metric that represents an evaluation, by one or more users, of the source, and an amount of traffic associated with the source; and adjusting, using one or more processors, the score of the article based on the score for the source.

2. The method of claim 1, where identifying the source includes identifying the source based on an address associated with the article.

3. The method of claim 1, where determining the score includes accessing a memory to determine the score for the source.

4. The method of claim 1, where the score for the source is further based on a length of time between an occurrence of an event and publication, by the source, of an article associated with the event.

From the version of the patent that was filed on Feb. 24, 2012 (now U.S. Pat. No. 8,332,382):

What is claimed is:

1. A computer-implemented method comprising: obtaining, in response to receiving a search query, articles and respective scores; identifying, using one or more processors, for an article of the articles, a source with which the article is associated; determining, using one or more processors, a score for the source, based on polling one or more users to request the one or more users to provide a metric that represents an evaluation of a source and based on a length of time between an occurrence of an event and publication, by the source, of another article associated with the event; and adjusting, using one or more processors, the score of the article based on the score for the source.

2. The method of claim 1, where identifying the source includes identifying the source based on an address associated with the article.

3. The method of claim 1, where adjusting the score of the article includes: determining, using the score for the source, a new score for the article associated with the source; and adjusting the score of the article based on the determined new score.

4. The method of claim 1, where the score for the source is further based on a usage pattern indicating traffic associated with the source.

From the version of the patent that was filed on Jul. 10, 2009 (now U.S. Pat. No. 8,126,876):

What is claimed is:

1. A method, performed by one or more server devices, the method comprising: receiving, at one or more processors of the one or more server devices, a search query, from a client device; generating, by one or more processors of the one or more server devices and in response to receiving the search query, a list of references to news articles; identifying, by one or more processors of the one or more server devices and for each reference in the list of references, a news source with which each reference is associated; determining, by one or more processors of the one or more server devices and for each identified news source, whether a news source rank exists; determining, by one or more processors of the one or more server devices and for each reference with an existing corresponding news source rank, a new score by combining the news source rank and a score corresponding to a previous ranking of the reference; and ranking, by one or more processors of the one or more server devices, the references in the list of references based, at least in part, on the new scores.

2. The method of claim 1, where determining whether each news source rank exists includes accessing a database to locate the news source rank.

3. The method of claim 1, further comprising: providing the ranked list of references to the client device.

4. The method of claim 1, where determining the new score comprises: determining, for each reference with an existing corresponding news source rank, a weighted sum of the news source rank and the score corresponding to the previous ranking of the reference.
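The weighted sum in claim 4 of the 2009 version can be sketched in a few lines. This is only an illustration of the idea, not Google's implementation: the `weight` value, the function names, and the choice to leave a reference's score unchanged when no news source rank exists are all my assumptions, since the claims do not specify them.

```python
from typing import Dict, List, Optional, Tuple


def combine_scores(prev_score: float, source_rank: Optional[float],
                   weight: float = 0.3) -> float:
    """Weighted sum of a reference's previous score and its source's rank.

    `weight` is a hypothetical value; the patent does not publish one.
    """
    if source_rank is None:
        # Assumption: with no news source rank, the previous score is kept.
        return prev_score
    return (1 - weight) * prev_score + weight * source_rank


def rerank(references: List[Tuple[str, str, float]],
           source_ranks: Dict[str, float],
           weight: float = 0.3) -> List[str]:
    """Re-rank (url, source, previous_score) references by their new scores."""
    scored = [
        (combine_scores(score, source_ranks.get(source), weight), url)
        for url, source, score in references
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [url for _, url in scored]
```

With a weight of 0.5, a reference that previously scored 0.5 but comes from a source ranked 1.0 would move ahead of a reference that scored 0.6 from a source ranked 0.0.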

And the very first version of the patent, filed on Sep. 16, 2003 (now U.S. Pat. No. 7,577,655):

What is claimed is:

1. A method comprising: determining, by a processor, one or more metric values for a news source based at least in part on at least one of a number of articles produced by the news source during a first time period, an average length of an article produced by the news source, an amount of coverage that the news source produces in a second time period, a breaking news score, an amount of network traffic to the news source, a human opinion of the news source, circulation statistics of the news source, a size of a staff associated with the news source, a number of bureaus associated with the news source, a number of original named entities in a group of articles associated with the news source, a breadth of coverage by the news source, a number of different countries from which network traffic to the news source originates, or a writing style used by the news source determining, by the processor, an importance metric value representing the amount of coverage that the news source produces in a second time period, where the determining an importance metric includes: determining, by the processor, for each article produced by the news source during the second time period, a number of other non-duplicate articles on a same subject produced by other news sources to produce an importance value for the article, and adding, by the processor, the importance values to obtain the importance metric value; generating, by the processor, a quality value for the news source based at least in part on the determined one or more metric values; and using, by the processor, the quality value to rank an object associated with the news source.

2. The method of claim 1 where the determining includes: determining, by the processor, a plurality of metric values for the news source.

3. The method of claim 2 where the generating includes: multiplying, by the processor, each metric value in the plurality of metric values by a factor to create a plurality of adjusted metric values, and adding, by the processor, the plurality of adjusted metric values to obtain the quality value.

4. The method of claim 3 where the plurality of metric values includes a predetermined number of highest metric values for the news source.
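To make the 2003 claims more concrete, here is a minimal sketch of the two calculations they describe: the importance metric from claim 1 and the weighted quality value from claims 2 through 4. The data shapes, the function names, and the example factors are my own assumptions, and the sketch assumes the metric values have already been normalized to a comparable scale and that duplicate articles were removed upstream.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Tuple


def importance_metric(articles: Iterable[Tuple[str, str]], source: str) -> int:
    """Sum, over the source's articles, of the number of articles on the
    same subject produced by other sources.

    `articles` holds (source, subject) pairs, assumed already de-duplicated.
    """
    articles = list(articles)
    total_per_subject = Counter(subject for _, subject in articles)
    own_per_subject = Counter(subject for s, subject in articles if s == source)
    return sum(
        own * (total_per_subject[subject] - own)
        for subject, own in own_per_subject.items()
    )


def quality_value(metrics: Dict[str, float], factors: Dict[str, float],
                  top_k: Optional[int] = None) -> float:
    """Multiply each metric value by its factor and add the results.

    If `top_k` is given, only the highest `top_k` metric values are used.
    """
    chosen = sorted(metrics.items(), key=lambda kv: kv[1], reverse=True)
    if top_k is not None:
        chosen = chosen[:top_k]
    return sum(factors[name] * value for name, value in chosen)
```

A source that writes about subjects many other sources also cover earns a higher importance value, which is one reason a well-known agency would tend to outrank a lesser-known one under this version.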

How the News Ranking Claims Differ

An analysis of how the claims of “Systems and methods for improving the ranking of news articles” have changed over time should reflect how Google has changed the way it implements that patent.

We can see in the claims for the very first patent (filed in 2003) that Google was looking at metric values for different news sources to rank the content that those sources were creating. The very long first claim from that version of the patent lists a number of metrics to use to rank news sources, and that ranking influenced the ranking of news articles. So a story from a very well-known news agency would have a tendency to rank higher than a story from a lesser-known agency.

The version of the patent filed in 2009 still focuses upon news sources (and a “news source rank”), along with references to the news articles generated by those news sources.

The version of the patent filed in February 2012 again tells us about a score for a news article that is influenced by a score for a news source, but it doesn’t include the many metrics that the 2003 version of the patent does.

The version of the patent filed in September 2012 holds on to the score for the source, but tells us that the score is based on a metric that represents an evaluation of the source by one or more users and on the amount of traffic associated with the source, and that the score for the article is adjusted based upon the score for the source.

The most recent published version of this patent, filed in April 2015 and granted in October 2019, introduces some changes in how news articles may be ranked by Google. It tells us about how articles are placed in clusters (which isn’t new in itself), and how a source’s articles may rank higher than other articles by covering more named entities (people, places, or organizations) that aren’t covered by articles from other sources in the same cluster.
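The entity count at the heart of the newest first claim can be sketched as a simple set difference. This is an illustration under my own assumptions: the dictionary layout and function name are hypothetical, and the named entities are assumed to have been extracted from each article by some earlier step that the claim does not describe.

```python
from typing import Dict, List


def original_entity_count(article: Dict, cluster: List[Dict]) -> int:
    """Count named entities that occur in `article` but not in any article
    in the same cluster that comes from a different source.

    Each article is a dict with a 'source' string and an 'entities' set
    (hypothetical layout; entity extraction is assumed to happen upstream).
    """
    seen_elsewhere = set()
    for other in cluster:
        if other["source"] != article["source"]:
            seen_elsewhere |= other["entities"]
    return len(article["entities"] - seen_elsewhere)
```

Note that articles from the same source are ignored when building the comparison set, which matches the claim language about sources “other than the first source.” How that count is turned into a quality value for the source is not spelled out in the claim, so it is left out here.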



The post Evolution of Google’s News Ranking Algorithm appeared first on SEO by the Sea ⚓.


SEO by the Sea ⚓


The evolution of Google’s rel “no follow”

October 29, 2019 No Comments

Google updated the nofollow attribute on Tuesday, 10th September 2019. The attribute was introduced to help fight comment spam and had remained unchanged for nearly 15 years, but Google says it has had to make this change as the web evolves.

Google also announced two new link attributes, which sit alongside nofollow, to help website owners and webmasters clearly call out what type of link is being used:

rel="sponsored": Use the sponsored attribute to identify links on your site that were created as part of advertisements, sponsorships or other compensation agreements.

rel="ugc": UGC stands for User Generated Content, and the ugc attribute value is recommended for links within user-generated content, such as comments and forum posts.

rel="nofollow": Use this attribute for cases where you want to link to a page but don’t want to imply any type of endorsement, including passing along ranking credit to another page.
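If you want to see how the links on one of your own pages are currently marked up, a rough audit can be done with Python's standard library. This is a minimal sketch rather than a full crawler: the class and function names are mine, and the "follow" label for links that carry none of the three values is just a convenience for this example. A single link can carry more than one value (for example rel="ugc nofollow"), so it may be counted under more than one label.

```python
from collections import Counter
from html.parser import HTMLParser

LINK_HINTS = {"sponsored", "ugc", "nofollow"}


class RelCounter(HTMLParser):
    """Counts <a href> links by the rel values they carry."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        if "href" not in attr_map:
            return
        # rel may be missing or empty; it can also hold several tokens.
        rel_tokens = set((attr_map.get("rel") or "").lower().split())
        hints = rel_tokens & LINK_HINTS
        if not hints:
            self.counts["follow"] += 1
        for hint in hints:
            self.counts[hint] += 1


def count_link_types(html: str) -> dict:
    """Return a dict of link-type counts for an HTML string."""
    parser = RelCounter()
    parser.feed(html)
    return dict(parser.counts)
```

Running `count_link_types` over a page's HTML gives a quick picture of how many links are marked as sponsored, ugc, or nofollow, and how many carry no hint at all.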

March 1st, 2020 changes

All of the link attributes already serve as hints for ranking purposes, and from the 1st of March 2020 nofollow will also be treated as a hint for crawling and indexing. Anyone who was relying on rel=nofollow to try to block a page from being indexed should look at using other methods to block pages from being crawled or indexed.

John Mueller mentioned the use of rel=sponsored in one of the recent Google Hangouts.

Source: YouTube

The question he was asked

“Our website has a growing commerce strategy and some members of our team believe that affiliate links are detrimental to our website ranking for other terms do we need to nofollow all affiliate links? If we don’t will this hurt our organic traffic?”

John Mueller’s answer

“So this is something that, I think, comes up every now and then. From our point of view, affiliate links are links that are placed with a kind of commercial background there, in that you are obviously trying to earn some money by having these affiliate links and pointing to a distributor that you trust and have some kind of arrangement with.

From our point of view that is perfectly fine, that’s a way of monetizing your website, you’re welcome to do that.

We do kind of expect that these types of links are marked appropriately so that we understand these are affiliate links. One way to do that is to use just a nofollow.

A newer way to let us know about this kind of situation is to use the sponsored rel link attribute. That link attribute specifically tells us this is something to do with an advertising relationship, and we treat that the same as a nofollow.

A lot of the affiliate links out there follow really clear patterns and we can recognize those, so we try to take care of those on our side when we can, but to be safe we recommend just using a nofollow or rel sponsored link attribute. But in general this isn’t something that would really harm your website if you don’t do it, it’s something that makes it a little clearer for us what these links are for. And if we see, for example, a website is engaging in large-scale link selling, then that’s something where we might take manual action, but for the most part, if our algorithms just recognize these are links we don’t want to count, then we just won’t count them.”

How quickly are website owners acting on this?

This was only announced by Google in September, and website owners have until March to make the required change, but data from Semrush shows that website owners are already starting to change over to the new rel link attributes.

The data shows that out of one million domains, only 27,763 have at least one UGC link. The interesting fact is that each of those 27,763 domains has, on average, 20,904,603 follow backlinks, 6,373,970 nofollow backlinks, 22.8 UGC backlinks, and 55.5 sponsored backlinks.

Source: Semrush.com

These are still very early days, but we can see that there is change, and I would expect it to grow significantly into next year.

Conclusion

I believe that Google is going to use the data from these link attributes to catch out website owners that continue to sell links and mark them up incorrectly in order to pass any sort of SEO value to another website under any sort of agreement, paid or otherwise.

Paul Lovell is an SEO Consultant and Founder at Always Evolving SEO. He can be found on Twitter @_PaulLovell.

The post The evolution of Google’s rel “no follow” appeared first on Search Engine Watch.

Search Engine Watch


Product Hunt Radio: The evolution of Y Combinator, and counter-intuitive advice for founders

October 3, 2018 No Comments

In this episode of Product Hunt Radio, I’m visiting Y Combinator’s San Francisco headquarters to talk to two of the people who are integral to Y Combinator — Kat Manalac and Michael Seibel.

Michael is CEO of Y Combinator’s accelerator program. He has been through YC himself a couple of times — first in 2007, as co-founder and CEO of Justin.tv — and again in 2012 as co-founder and CEO of Socialcam. Justin.tv later became Twitch and sold to Amazon, and Socialcam was sold to Autodesk.

Kat is a partner at Y Combinator and one of the people who convinced us to apply to join the program back in 2014. She has been at YC for five years, where she focuses on founder outreach, helping companies perfect their pitches, and much more. Prior to joining YC, she was chief of staff to Reddit founder Alexis Ohanian and also worked on brand and strategy at WIRED.

In this episode we talk about:

  • The evolution of Y Combinator — it’s changed a ton since Product Hunt went through the program four years ago. They’ve been working on several programs for founders — things that Michael wishes existed when he went through the program.
  • Michael and Kat’s advice for founders, including counter-intuitive tips they’ve learned after working with literally thousands of startups.
  • A key mistake that trips up new founders when pitching their company, as well as advice for founders seeking a technical co-founder.
  • How YC has scaled the organization as a 50-person company with its 4,000 (and growing) alumni.

Of course, we also chat about some of their favorite products, including a virtual assistant that will do anything, a $1,500 smart mirror that will get you fit, and a beverage that will get you high.

We’ll be back next week so be sure to subscribe on Apple Podcasts, Google Podcasts, Spotify, Breaker, Overcast, or wherever you listen to your favorite podcasts.


Startups – TechCrunch


The Evolution of Voice Search (and what it means for PPC)

February 15, 2017 No Comments

This post is part of the Hero Conf Los Angeles Speaker Blog Series. Purna Virji will join 50+ PPC experts sharing their paid search and social expertise at the World’s Largest All-PPC Event, April 18-20 in Los Angeles, CA. Like what you read? Find out more about Hero Conf.   Speculating about the future has always […]

Read more at PPCHero.com
PPC Hero