CBPO

Tag: News

Evolution of Google’s News Ranking Algorithm

November 1, 2019 No Comments

Image: Photo by Nathan Dumlao on Unsplash

Did the Algorithm Behind How News Articles Rank at Google Just Change?

A Google patent covering how news articles are ranked was updated this week, and the new version suggests that the entities mentioned in those articles can have an impact on ranking.

How Have News Articles Been Ranked at Google?

This patent was originally filed in 2003.

The beta version of Google News was first launched by Google in 2002, so this was one of the early patents that described how Google ranked news articles.

One of the inventors of the original patent was Krishna A. Bharat, known as a founder of Google News.

The newest version (a continuation patent) was just granted and is the sixth version of the patent. It can be found at:

Systems and methods for improving the ranking of news articles
Inventors: Michael Curtiss, Krishna A. Bharat, and Michael Schmitt
Assignee: Google LLC
US Patent: 10,459,926
Granted: October 29, 2019
Filed: April 27, 2015

This version of the patent provides a history of the five earlier versions, including when they were filed and what their patent numbers are:

This application is a

(1) continuation of U.S. patent application Ser. No. 14/140,108, filed on Dec. 24, 2013, which is a

(2) continuation of U.S. patent Ser. No. 13/616,659, filed on Sep. 14, 2012 (now U.S. Pat. No. 8,645,368), which is a

(3) continuation of U.S. patent application Ser. No. 13/404,827, filed Feb. 24, 2012, (now U.S. Pat. No. 8,332,382), which is a

(4) continuation of U.S. patent application Ser. No. 12/501,256, filed on Jul. 10, 2009, (now U.S. Pat. No. 8,126,876), which is a

(5) continuation of U.S. patent application Ser. No. 10/662,931, filed Sep. 16, 2003, (now U.S. Pat. No. 7,577,655),

the disclosures of which are hereby incorporated by reference herein.

What a Continuation Patent Is

Continuation patents take the filing date of the patent they are continuing (or of the patents those earlier patents are continuing) and are intended to show how the process described by the patents has changed. The processes are set out in the claims sections of the patents, which are the parts that the patent examiner reviews when deciding whether or not to grant the new patents.

Often, looking at the very first claim of each patent can help identify important aspects that have changed from one version of a patent to another. It is somewhat rare (in my experience) to see a patent that has been updated 6 times as this one has. I recently wrote about Google’s Universal Search Interface patent, which was updated a fourth time, in Google’s New Universal Search Results.

What Caused A Recent Rankings Change at the New York Times?

A post on Twitter this week suggested that The New York Times may have been negatively impacted by a new algorithm called BERT that was just released at Google, which was announced in Understanding searches better than ever before.

That tweet tells us that either BERT or a move to mobile-first indexing may have caused a loss of rankings at the newspaper’s site. But seeing that tweet, and seeing that there was a new version of this patent, made me curious to see what it contained and what changes it may have brought about.

The Changing Claims from the Ranking of News Articles Patents

But it’s possible that other changes at Google could also have an impact on rankings at news sites. One way to tell how Google has changed how it ranks articles is to look at how the patent covering the ranking of news articles has changed over time.

Compare how the first four claims from this patent have changed over time.

The latest first claim in this patent introduces some new things to look at:

What is claimed is:

1. A method for ranking results, comprising: receiving a list of objects; identifying a first object in the list and a first source with which the first object is associated; identifying a second object in the list and a second source with which the second object is associated; determining a quantity of named entities that (i) occur in the first object that is associated with the first source, and (ii) do not occur in objects that are identified as sharing a same cluster with the first object but that are associated with one or more sources other than the first source; computing, based at least on the quantity of named entities that (i) occur in the first object that is associated with the first source, and (ii) do not occur in objects that are identified as sharing a same cluster with the first object but that are associated with one or more sources other than the first source, a first quality value of the first source using a first metric, wherein a named entity corresponds to a person, place, or organization; computing a second quality value of the second source using a second metric that is different from the first metric; and ranking the list of objects based on the first quality value and the second quality value.

2. The method of claim 1 wherein the identifying the first source with which the first object is associated includes: identifying the first source based on a uniform resource locator (URL) associated with the first object.

3. The method of claim 1 wherein the first source is a news source.

4. The method of claim 1 wherein computing the first quality value of the first source is further based on: one or more of a number of articles produced by the first source during a first time period, an average length of an article produced by the first source, an amount of important coverage that the first source produces in a second time period, a breaking news score, network traffic to the first source, a human opinion of the first source, circulation statistics of the first source, a size of a staff associated with the first source, a number of bureaus associated with the first source, a breadth of coverage by the first source, a number of different countries from which traffic to the first source originates, and a writing style used by the first source.
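
The named-entity count in claim 1 can be illustrated with a minimal sketch. It assumes the articles have already been clustered and that named entities (people, places, organizations) have already been extracted for each one; the patent claim does not say how either step is done. The function name, the example sources, and the choice to rank directly by the count are all hypothetical, since the claim only says a source quality value is computed "based at least on" this quantity.

```python
def unique_entity_count(article, cluster):
    """Count named entities in `article` that no other source in the cluster mentions."""
    seen_elsewhere = set()
    for other in cluster:
        # Only articles from *other* sources count against originality.
        if other["source"] != article["source"]:
            seen_elsewhere |= other["entities"]
    return len(article["entities"] - seen_elsewhere)


# Hypothetical cluster: three articles covering the same story.
cluster = [
    {"source": "paper-a.example",
     "entities": {"Jane Doe", "Springfield", "Acme Corp", "John Roe"}},
    {"source": "paper-b.example",
     "entities": {"Jane Doe", "Springfield"}},
    {"source": "paper-c.example",
     "entities": {"Jane Doe", "Acme Corp"}},
]

counts = {a["source"]: unique_entity_count(a, cluster) for a in cluster}

# Assumption: use the count directly as the ranking signal.
ranked = sorted(cluster, key=lambda a: counts[a["source"]], reverse=True)

print(counts)
print([a["source"] for a in ranked])
```

In this toy cluster only the first article names an entity ("John Roe") that the other sources do not, so it is the only one with a non-zero count.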

From the version of the patent that was filed on Sep. 14, 2012 (now U.S. Pat. No. 8,645,368):

What is claimed is:

1. A method comprising: determining, using one or more processors and based on receiving a search query, articles and respective scores; identifying, using one or more processors, for an article of the articles, a source with which the article is associated; determining, using one or more processors, a score for the source, the score for the source being based on: a metric that represents an evaluation, by one or more users, of the source, and an amount of traffic associated with the source; and adjusting, using one or more processors, the score of the article based on the score for the source.

2. The method of claim 1, where identifying the source includes identifying the source based on an address associated with the article.

3. The method of claim 1, where determining the score includes accessing a memory to determine the score for the source.

4. The method of claim 1, where the score for the source is further based on a length of time between an occurrence of an event and publication, by the source, of an article associated with the event.
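
As a rough illustration of these claims, the sketch below identifies a source from an article's address, builds a source score from a user evaluation and a traffic figure, and adjusts the article's score with it. The claims do not say how the two signals are combined or how the adjustment is applied, so the equal-weight blend, the multiplicative adjustment, the 0 to 1 scaling, and every name and number here are assumptions made for the example.

```python
from urllib.parse import urlparse


def source_of(url):
    """Treat the host name of the article's URL as its source (an assumption)."""
    return urlparse(url).hostname


def source_score(user_rating, traffic, rating_weight=0.5):
    """Blend a user evaluation and a traffic signal, both assumed scaled to 0..1."""
    return rating_weight * user_rating + (1 - rating_weight) * traffic


def adjust(article_score, src_score):
    """Hypothetical adjustment: boost the article score in proportion to the source score."""
    return article_score * (1 + src_score)


# Hypothetical per-source signals: (user evaluation, traffic).
SOURCE_SIGNALS = {
    "news-a.example": (0.9, 0.6),
    "news-b.example": (0.4, 0.2),
}

# Hypothetical search results: (article URL, query-based score).
articles = [
    ("https://news-a.example/story-1", 0.6),
    ("https://news-b.example/story-2", 0.7),
]

adjusted = {}
for url, score in articles:
    rating, traffic = SOURCE_SIGNALS[source_of(url)]
    adjusted[url] = adjust(score, source_score(rating, traffic))

for url, score in sorted(adjusted.items(), key=lambda kv: kv[1], reverse=True):
    print(url, round(score, 2))
```

With these made-up numbers, the article from the better-rated, higher-traffic source overtakes an article that started with a higher query-based score.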

From the version of the patent filed on Feb. 24, 2012 (now U.S. Pat. No. 8,332,382):

What is claimed is:

1. A computer-implemented method comprising: obtaining, in response to receiving a search query, articles and respective scores; identifying, using one or more processors, for an article of the articles, a source with which the article is associated; determining, using one or more processors, a score for the source, based on polling one or more users to request the one or more users to provide a metric that represents an evaluation of a source and based on a length of time between an occurrence of an event and publication, by the source, of another article associated with the event; and adjusting, using one or more processors, the score of the article based on the score for the source.

2. The method of claim 1, where identifying the source includes identifying the source based on an address associated with the article.

3. The method of claim 1, where adjusting the score of the article includes: determining, using the score for the source, a new score for the article associated with the source; and adjusting the score of the article based on the determined new score.

4. The method of claim 1, where the score for the source is further based on a usage pattern indicating traffic associated with the source.

From the version of the patent that was filed on July 10, 2009 (now U.S. Pat. No. 8,126,876):

What is claimed is:

1. A method, performed by one or more server devices, the method comprising: receiving, at one or more processors of the one or more server devices, a search query, from a client device; generating, by one or more processors of the one or more server devices and in response to receiving the search query, a list of references to news articles; identifying, by one or more processors of the one or more server devices and for each reference in the list of references, a news source with which each reference is associated; determining, by one or more processors of the one or more server devices and for each identified news source, whether a news source rank exists; determining, by one or more processors of the one or more server devices and for each reference with an existing corresponding news source rank, a new score by combining the news source rank and a score corresponding to a previous ranking of the reference; and ranking, by one or more processors of the one or more server devices, the references in the list of references based, at least in part, on the new scores.

2. The method of claim 1, where determining whether each news source rank exists includes accessing a database to locate the news source rank.

3. The method of claim 1, further comprising: providing the ranked list of references to the client device.

4. The method of claim 1, where determining the new score comprises: determining, for each reference with an existing corresponding news source rank, a weighted sum of the news source rank and the score corresponding to the previous ranking of the reference.
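
The weighted sum in claim 4 of this version can be sketched as follows. The weight of 0.3, the 0 to 1 score scale, and the decision to leave a reference's score unchanged when no news source rank is stored are assumptions made for illustration; the claims only say that a new score is determined for references whose source has an existing rank.

```python
def rerank(references, source_ranks, weight=0.3):
    """Combine each reference's prior score with its source rank, then sort."""
    rescored = []
    for ref in references:
        rank = source_ranks.get(ref["source"])
        if rank is None:
            # No stored rank for this source: keep the prior score (assumption).
            new_score = ref["score"]
        else:
            # Weighted sum of the news source rank and the prior score.
            new_score = weight * rank + (1 - weight) * ref["score"]
        rescored.append({**ref, "new_score": new_score})
    return sorted(rescored, key=lambda r: r["new_score"], reverse=True)


# Hypothetical stored source ranks and search results.
source_ranks = {"news-a.example": 0.9, "news-b.example": 0.2}
references = [
    {"id": "r1", "source": "news-a.example", "score": 0.50},
    {"id": "r2", "source": "news-b.example", "score": 0.60},
    {"id": "r3", "source": "news-c.example", "score": 0.55},
]

ranked = rerank(references, source_ranks)
print([(r["id"], round(r["new_score"], 2)) for r in ranked])
```

Here the reference from the highly ranked source moves from last to first, which is the behavior the claim describes: the source rank shifts, but does not replace, the earlier ranking.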

And the very first version of the patent, filed on September 16, 2003 (now U.S. Pat. No. 7,577,655):

What is claimed is:

1. A method comprising: determining, by a processor, one or more metric values for a news source based at least in part on at least one of a number of articles produced by the news source during a first time period, an average length of an article produced by the news source, an amount of coverage that the news source produces in a second time period, a breaking news score, an amount of network traffic to the news source, a human opinion of the news source, circulation statistics of the news source, a size of a staff associated with the news source, a number of bureaus associated with the news source, a number of original named entities in a group of articles associated with the news source, a breadth of coverage by the news source, a number of different countries from which network traffic to the news source originates, or a writing style used by the news source determining, by the processor, an importance metric value representing the amount of coverage that the news source produces in a second time period, where the determining an importance metric includes: determining, by the processor, for each article produced by the news source during the second time period, a number of other non-duplicate articles on a same subject produced by other news sources to produce an importance value for the article, and adding, by the processor, the importance values to obtain the importance metric value; generating, by the processor, a quality value for the news source based at least in part on the determined one or more metric values; and using, by the processor, the quality value to rank an object associated with the news source.

2. The method of claim 1 where the determining includes: determining, by the processor, a plurality of metric values for the news source.

3. The method of claim 2 where the generating includes: multiplying, by the processor, each metric value in the plurality of metric values by a factor to create a plurality of adjusted metric values, and adding, by the processor, the plurality of adjusted metric values to obtain the quality value.

4. The method of claim 3 where the plurality of metric values includes a predetermined number of highest metric values for the news source.
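
Two calculations in these 2003 claims lend themselves to a short sketch: the importance metric (for each of a source's articles, count the articles on the same subject from other sources, then add those counts up) and the quality value (multiply each metric by a factor and add the results, optionally using only a set number of the highest metric values). The data layout, metric names, and factor values below are hypothetical, and the sketch assumes duplicate articles have already been removed and subjects already assigned.

```python
def importance_metric(source, articles):
    """Sum, over the source's articles, the number of same-subject articles from other sources."""
    total = 0
    for src, subject in articles:
        if src != source:
            continue
        total += sum(
            1
            for other_src, other_subject in articles
            if other_src != source and other_subject == subject
        )
    return total


def quality_value(metrics, factors, top_n=None):
    """Weighted sum of metric values, optionally limited to the top_n highest values."""
    chosen = sorted(metrics.items(), key=lambda kv: kv[1], reverse=True)
    if top_n is not None:
        chosen = chosen[:top_n]
    return sum(value * factors[name] for name, value in chosen)


# Hypothetical non-duplicate articles as (source, subject) pairs.
articles = [
    ("a", "election"), ("b", "election"), ("c", "election"),
    ("a", "flood"), ("b", "flood"),
    ("c", "sports"),
]

importance_a = importance_metric("a", articles)
importance_c = importance_metric("c", articles)

# Hypothetical metric values and factors for source "a".
metrics_a = {"importance": importance_a, "avg_length": 2, "breaking_news": 1}
factors = {"importance": 2, "avg_length": 1, "breaking_news": 3}

print(importance_a, importance_c)
print(quality_value(metrics_a, factors))
print(quality_value(metrics_a, factors, top_n=2))
```

A source that covers stories many other outlets also cover gets a higher importance value than one whose stories nobody else picks up, which is why source "c" scores lower here despite publishing the same number of articles as source "a".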

How the News Ranking Claims Differ

An analysis of changes over time to the patent for “Systems and methods for improving the ranking of news articles” should reflect how Google has changed how it implements that patent.

We can see in the claims for the very first patent (filed in 2003) that Google was looking at metric values for different news sources to rank the content that those sources were creating. That very long first claim from that version of the patent lists a number of metrics to use to rank news sources, and that ranking influenced the ranking of news articles. So a story from a very well-known news agency would have a tendency to rank higher than a story from a lesser-known agency.

The version of the patent filed in 2009 still focuses upon news sources (and a “news source rank”), along with references to the news articles generated by those news sources.

The version of the patent filed in February 2012 again tells us about a score for a news article that is influenced by a score for a news source, but it doesn’t include the many metrics that the 2003 version of the patent does.

The version of the patent filed in September 2012 holds on to the score for the source, but tells us that the score is based on a metric that represents an evaluation of the source by one or more users and on the amount of traffic associated with the source, and that the score of the article is adjusted based on the score for the source.

The most recently published version of this patent, filed in April 2015 and granted in October 2019, introduces some changes in how news articles may be ranked by Google. It tells us about how articles covering different topics are placed in clusters (which isn’t new in itself), and how those articles may rank higher than other articles by covering more entities that aren’t covered by articles from other sources in the same cluster.



The post Evolution of Google’s News Ranking Algorithm appeared first on SEO by the Sea ⚓.


SEO by the Sea ⚓


Daily Crunch: Facebook launches its News section

October 28, 2019 No Comments

The Daily Crunch is TechCrunch’s roundup of our biggest and most important stories. If you’d like to get this delivered to your inbox every day at around 9am Pacific, you can subscribe here.

1. Facebook starts testing News, its new section for journalism

Facebook’s news section, which was previously reported to be imminent, is here: The company is rolling out Facebook News in a limited test in the U.S. as a home screen tab and bookmark in the main Facebook app.

Should publishers trust Facebook? Well, Josh Constine argues that none of them have learned the right lessons from the last 10 years.

2. Pixelbook Go review: a Chromebook in search of meaning

The Go is clearly Google’s attempt to lead the way for manufacturers looking to explore Chromebook life outside the classroom. It has some nice hardware perks, but it’s not the revolution or revelation ChromeOS needs.

3. SpaceX wants to land Starship on the Moon before 2022, then do cargo runs for 2024 human landing

SpaceX president and COO Gwynne Shotwell shed a little more light on her company’s current thinking with regards to the mission timelines for its forthcoming Starship spacefaring vehicle.

4. After its first earnings miss in two years, Amazon shares get walloped in after-hours trading

Amazon shares fell by nearly 7% in after-hours trading on Thursday after the company reported its first earnings miss in two years.

5. Lawmakers ask US intelligence chief to investigate if TikTok is a national security threat

In a letter by Sens. Charles Schumer (D-NY) and Tom Cotton (R-AR), the lawmakers asked the acting director of national intelligence Joseph Maguire if the app maker could be compelled to turn Americans’ data over to Chinese authorities.

6. The SaaS gold rush will become the ‘Hunger Games’

Enterprise software investor Rory O’Driscoll says that while the cloud is obviously here to stay, the next five years in cloud investing will neither be the same nor as easy as the last 10. (Extra Crunch membership required.)

7. Learn how to raise your first euros at TechCrunch Disrupt Berlin

Startup funding experts — including Forward Partners managing partner Nic Brisbourne, Target Global partner Malin Holmberg and DocSend co-founder and chief executive officer Russ Heddleston — will sit down together on the Extra Crunch Stage at TechCrunch Disrupt Berlin.


Social – TechCrunch


Mark Zuckerberg makes the case for Facebook News

October 26, 2019 No Comments

While Facebook CEO Mark Zuckerberg seemed cheerful and even jokey when he took the stage today in front of journalists and media executives (at one point, he described the event as “by far the best thing” he’d done this week), he acknowledged that there are reasons for the news industry to be skeptical.

Facebook, after all, has been one of the main forces creating a difficult economic reality for the industry over the past decade. And there are plenty of people (including our own Josh Constine) who think it would be foolish for publishers to trust the company again.

For one thing, there’s the question of how Facebook’s algorithm prioritizes different types of content, and how changes to the algorithm can be enormously damaging to publishers.

“We can do a better job of working with partners to have more transparency and also lead time about what we see in the pipeline,” Zuckerberg said, adding, “I think stability is a big theme.” So Facebook might be trying something out as an “experiment,” but “if it kind of just causes a spike, it can be hard for your business to plan for that.”

At the same time, Zuckerberg argued that Facebook’s algorithms are “one of the least understood things about what we do.” Specifically, he noted that many people accuse the company of simply optimizing the feed to keep users on the service for as long as possible.

“That’s actually not true,” he said. “For many years now, I’ve prohibited any of our feed teams … from optimizing the systems to encourage the maximum amount of time to be spent. We actually optimize the system for facilitating as many meaningful interactions as possible.”

For example, he said that when Facebook changed the algorithm to prioritize friends and family content over other types of content (like news), it effectively eliminated 50 million hours of viral video viewing each day. After the company reported its subsequent earnings, Facebook had the biggest drop in market capitalization in U.S. history.

Zuckerberg was onstage in New York with News Corp CEO Robert Thomson to discuss the launch of Facebook News, a new tab within the larger Facebook product that’s focused entirely on news. Thomson began the conversation with a simple question: “What took you so long?”

The Facebook CEO took this in stride, responding that the question was “one of the nicest things he could have said — that actually means he thinks we did something good.”

Zuckerberg went on to suggest that the company has had a long interest in supporting journalism (“I just think that every internet platform has a responsibility to try to fund and form partnerships to help news”), but that its efforts were initially focused on the News Feed, where the “fundamental architecture” made it hard to find much room for news stories — particularly when most users are more interested in that content from friends and family.

So Facebook News could serve as a more natural home for this news (to be clear, the company says news content will continue to appear in the main feed as well). Zuckerberg also said that since past experiments have created such “thrash in the ecosystem,” Facebook wanted to make sure it got this right before launching it.

In particular, he said the company needed to show that tabs within Facebook, like Facebook Marketplace and Facebook Watch, could attract a meaningful audience. Zuckerberg acknowledged that the majority of Facebook users aren’t interested in these other tabs, but when you’ve got such an enormous user base, even a small percentage can be meaningful.

“I think we can probably get to maybe 20 or 30 million people [visiting Facebook News] over a few years,” he said. “That by itself would be very meaningful.”

Facebook is also paying some of the publishers who are participating in Facebook News. Zuckerberg described this as “the first time we’re forming long-term, stable relationships and partnerships with a lot of publishers.”

Several journalists asked for more details about how Facebook decided which publishers to pay, and how much to pay them. Zuckerberg said it’s based on a number of factors, like ensuring a wide range of content in Facebook News, including from publishers who hadn’t been publishing much on the site previously. The company also had to compensate publishers who are taking some of their content out from behind their paywalls.

“This is not an exact formula — maybe we’ll get to that over time — but it’s all within a band,” he said.

Zuckerberg was also asked about how Facebook will deal with accuracy and quality, particularly given the recent controversy over its unwillingness to fact check political ads.

He sidestepped the political ads question, arguing that it’s unrelated to the day’s topics, then said, “This is a different kind of thing.” In other words, he argued that the company has much more leeway here to determine what is and isn’t included — both by requiring any participating publishers to abide by Facebook’s publisher guidelines, and by hiring a team of journalists to curate the headlines that show up in the Top Stories section.

“People have a different expectation in a space dedicated to high-quality news than they do in a space where the goal is to make sure everyone can have a voice and can share their opinion,” he said.

As for whether Facebook News will include negative stories about Facebook, Zuckerberg seemed delighted to learn that Bloomberg (mostly) doesn’t cover Bloomberg.

“I didn’t know that was a thing a person could do,” he joked. More seriously, he said, “For better or worse, we’re a prominent part of a lot of the news cycles. I don’t think it would be reasonable to try to have a news tab that didn’t cover the stuff that Facebook is doing. In order to make this a trusted source over time, they have to be covered objectively.”


Social – TechCrunch


Capitalism Burns the Amazon, Lawsuits Burn YouTube, and More News

August 29, 2019 No Comments

Catch up on the most important news from today in two minutes or less.
Feed: All Latest


The Poem on the Statue of Liberty Tops This Week’s Internet News Roundup

August 19, 2019 No Comments

Ken Cuccinelli, the acting director of US Citizenship and Immigration Services, thinks the poem on the Statue of Liberty could use a rewrite. Yes, really.
Feed: All Latest



Star Wars News: The End of ‘Rise of Skywalker’ Will Melt Your Mind

July 29, 2019 No Comments

Just ask Kevin Smith. Plus: Marvel’s Kylo Ren origin story, use the Force—in VR, a movie-authentic Boba Fett helmet from Hasbro, and more.
Feed: All Latest


When National News Impacts Your Client

July 25, 2019 No Comments

Staying up-to-date on the latest events that may impact our clients in a negative or positive way.

Read more at PPCHero.com
PPC Hero


Google’s How News Works, aimed at clarifying news transparency

June 11, 2019 No Comments

In May, Google announced the launch of a new website aimed at explaining how they serve and address news across Google properties and platforms.

The site, How News Works, states Google’s mission as it relates to disseminating news in a non-biased manner. The site aggregates a variety of information about how Google crawls, indexes, and ranks news stories as well as how news can be personalized for the end user.

How News Works provides links to various resources within the Google news ecosystem all in one place and is part of The Google News Initiative.

What is The Google News Initiative?

The Google News Initiative (GNI) is Google’s effort to work with news industry professionals to “help journalism thrive in the digital age.” The GNI is driven and summarized by the GNI website which provides information about a variety of initiatives and approaches within Google including:

  • How to work with Google (e.g., partnership opportunities, training tools, funding opportunities)
  • A list of current partnerships and case studies
  • A collection of programs and funding opportunities for journalists and news organizations
  • A catalog of Google products relevant to journalists

Google attempts to work with the news industry in a variety of ways. For example, it provides funding opportunities to help journalists from around the world.

Google is now accepting applications (through mid-July) from North American and Latin American applicants to help fund projects that “drive digital innovation and develop new business models.” Applicants who meet Google’s specified criteria (and are selected) will be awarded up to $300,000 in funding (for U.S. applicants) or $250,000 (for Latin American applicants) with an additional award of up to 70% of the total project cost.

The GNI website also provides users with a variety of training resources and tools. Journalists can learn how to partner with Google to test and deploy new technologies, as the Washington Post did by participating in Google’s AMP (Accelerated Mobile Pages) program.

AMP is an open source initiative that Google launched in February 2016 with the goal of making mobile web pages faster.

AMP mirrors content on traditional web pages, but uses AMP HTML, an open source format architected in an ultra-light way to reduce latency for readers.

News transparency and accountability

The GNI’s How News Works website reinforces Google’s mission to “elevate trustworthy information.” The site explains how the news algorithm works and links to Google’s news content policies.

The content policy covers Google’s approach to accountability and transparency, its requirements for paid or promotional material, copyright, restricted content, privacy/personalization and more.

This new GNI resource, a subsection of the main GNI website, acts as a starting point for journalists and news organizations to delve into Google’s vast news infrastructure including video news on YouTube.

Since it can be difficult to ascertain if news is trustworthy and accurate, this latest initiative by Google is one way that journalists (and the general public) can gain an understanding of how news is elevated and indexed on Google properties.

The post Google’s How News Works, aimed at clarifying news transparency appeared first on Search Engine Watch.

Search Engine Watch