
Related Questions are Joined by ‘People Also Search For’ Refinements; Now Using a Question Graph

February 22, 2018

meyer lemon tree related questions

I recently bought a lemon tree and wanted to learn how to care for it. I started asking about it at Google, which showed me other questions and answers related to caring for a lemon tree. As I clicked on some of those, more were revealed, giving me additional helpful information.

Last March, I wrote a post about Related Questions at Google, Google’s Related Questions Patent or ‘People Also Ask’ Questions.

As Barry Schwartz noted recently at Search Engine Land, Google is now also showing alternative query refinements as ‘People Also Search For’ listings, in the post, Google launches new look for ‘people also search for’ search refinements. That was enough to make me check whether Google had updated the original “Related Questions” patent. It had: a continuation patent was granted in June of last year, with the same name but updated claims.

The older version of the patent can be found at Generating related questions for search queries

The patent doesn’t say anything about the change in wording from “Related Questions.” Some “People Also Search For” results don’t necessarily take the form of questions, either (so “People Also Ask” may remain very appropriate, and continue to be something we see in the future). But the claims of the new patent contain phrases and language that weren’t in the old one. The new patent is at:

Generating related questions for search queries
Inventors: Yossi Matias, Dvir Keysar, Gal Chechik, Ziv Bar-Yossef, and Tomer Shmiel
Assignee: Google Inc.
US Patent: 9,679,027
Granted: June 13, 2017
Filed: December 14, 2015

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for identifying related questions for a search query is described. One of the methods includes receiving a search query from a user device; obtaining a plurality of search results for the search query provided by a search engine, wherein each of the search results identifies a respective search result resource; determining one or more respective topic sets for each search result resource, wherein the topic sets for the search result resource are selected from previously submitted search queries that have resulted in users selecting search results identifying the search result resource; selecting related questions from a question database using the topic sets; and transmitting data identifying the related questions to the user device as part of a response to the search query.

The first claim brings a new concept into the world of related questions and answers, which I will highlight here:

1. A method performed by one or more computers, the method comprising: generating a question graph that includes a respective node for each of a plurality of questions; connecting, with links in the question graph, nodes for questions that are equivalent, comprising: identifying selected resources for each of the plurality of questions based on user selections of search results in response to previous submissions of the question as a search query to a search engine; identifying pairs of questions from the plurality of questions, wherein the questions in each identified pair of questions have at least a first threshold number of common identified selected resources; and for each identified pair, connecting the nodes for the questions in the identified pair with a link in the question graph; receiving a new search query from a user device; obtaining an initial ranking of questions that are related to the new search query; generating a modified ranking of questions that are related to the new search query, comprising, for each question in the initial ranking: determining whether the question is equivalent to any higher-ranked questions in the initial ranking by determining whether a node for the question is connected by a link to any of the nodes for any of the higher-ranked questions in the question graph; and when the question is equivalent to any of the higher-ranked questions, removing the question from the modified ranking; selecting one or more questions from the modified ranking; and transmitting data identifying the selected questions to the user device as part of a response to the new search query.

A question graph would be a semantic approach towards asking and answering questions that are related to each other in meaningful ways.
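To make the claim language a bit more concrete, here is a minimal sketch (my own illustration, not Google’s code) of how such a question graph might work: questions whose clicked search results share enough common resources get linked as equivalent, and an initial ranking of related questions is then trimmed so that only one question from each equivalent group survives. The click data and threshold are made up.

```python
# Sketch of the question-graph idea from the first claim (illustrative only).
from itertools import combinations

# Hypothetical click data: question -> resources users selected when that
# question was submitted as a search query.
selected_resources = {
    "how do I care for a meyer lemon tree": {"site-a.com/lemons", "site-b.com/citrus", "site-c.com/trees"},
    "meyer lemon tree care": {"site-a.com/lemons", "site-b.com/citrus", "site-d.com/gardening"},
    "how often should I water a lemon tree": {"site-e.com/watering", "site-c.com/trees"},
}

EQUIVALENCE_THRESHOLD = 2  # minimum number of shared selected resources

def build_question_graph(resources):
    """Link pairs of questions whose selected resources overlap enough."""
    links = {q: set() for q in resources}
    for q1, q2 in combinations(resources, 2):
        if len(resources[q1] & resources[q2]) >= EQUIVALENCE_THRESHOLD:
            links[q1].add(q2)
            links[q2].add(q1)
    return links

def dedupe_ranking(initial_ranking, graph):
    """Drop any question linked to a higher-ranked (earlier) question."""
    kept = []
    for question in initial_ranking:
        if not any(higher in graph.get(question, set()) for higher in kept):
            kept.append(question)
    return kept

graph = build_question_graph(selected_resources)
ranking = [
    "how do I care for a meyer lemon tree",
    "meyer lemon tree care",                  # equivalent to the first; removed
    "how often should I water a lemon tree",
]
print(dedupe_ranking(ranking, graph))
```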

In addition to the “question graph” mentioned in that first claim, we are also told that Google keeps an eye on how often people select these related questions, and how often they click on and read them.

The descriptions and images in the patent are from the original version, so none of them show what a question graph might look like. For a while, Facebook offered Graph Search, a feature you could use to search Facebook that surfaced questions related to each other. I found a screenshot that shows some of those off, and such related questions could be thought of as coming from a question graph of related questions. It isn’t quite the same thing as what Google is doing with related questions, but the idea of showing questions that may be related to an initial query, and keeping an eye on whether people spend time looking at them, makes sense. I’ve been seeing a lot of related questions in search results and have been using them. Here are the Facebook Graph Search questions:

Facebook Graph Search Related questions

As you can see, those questions share some facts, and are considered related to each other because they do. That makes them similar to the related questions found through a question graph, which might mean they could be of interest to a searcher who asked the first query. It is interesting that the new patent claims look at whether the related questions being shown are clicked upon, which tells Google whether searchers have any interest in continuing to see related questions. I’ve been finding them easy to click on and interesting.

Are you working questions and answers into your content?




Google’s Mobile Location History

January 30, 2018

Google Location History

If you use Google Maps to navigate from place to place, or if you have agreed to be a local guide for Google Maps, there is a chance that you have seen Google Mobile Location history information. There is a Google Account Help page about how to Manage or delete your Location History. The location history page starts off by telling us:

Your Location History helps you get better results and recommendations on Google products. For example, you can see recommendations based on places you’ve visited with signed-in devices or traffic predictions for your daily commute.

You may see this history as your timeline, and there is a Google Help page to View or edit your timeline. This page starts out by telling us:

Your timeline in Google Maps helps you find the places you’ve been and the routes you’ve traveled. Your timeline is private, so only you can see it.

Mobile Location history has been around for a while, and I’ve seen it mentioned in a few Google patents. It may be referred to as a “Mobile location history” because it appears to contain information collected by your mobile device. Here are three posts I’ve written about patents that mention location history and describe processes that depend upon Mobile Location history.

An interesting article that hints at some possible aspects of location history just came out on January 24th, in the post, If you’re using an Android phone, Google may be tracking every move you make.

The timing of the article about location history is interesting given that Google was granted a patent on user location histories the day before that article was published. It focuses upon telling us how location history works:

The present disclosure relates generally to systems and methods for generating a user location history. In particular, the present disclosure is directed to systems and methods for analyzing raw location reports received from one or more devices associated with a user to identify one or more real-world location entities visited by the user.

Techniques that could be used to attempt to determine a location associated with a device can include GPS, IP addresses, cell-phone triangulation, proximity to Wi-Fi access points, and maybe even power-line mapping using device magnetometers.

The patent has an interesting way of looking at location history, which sounds reasonable. I don’t know the latitudes and longitudes of places I visit:

Thus, human perceptions of location history are generally based on time spent at particular locations associated with human experiences and a sense of place, rather than a stream of latitudes and longitudes collected periodically. Therefore, one challenge in creating and maintaining a user location history that is accessible for enhancing one or more services (e.g. search, social, or an API) is to correctly identify particular location entities visited by a user based on raw location reports.

The location history process looks like it involves collecting data from mobile devices in a way that allows Google to gather information about places visited, with scores for each of those locations. I have had Google Maps ask me to verify some of the places that I have visited, as if the score it had for those places was not sufficient (not a high enough level of confidence) for it to believe that I had actually been there.

The location history patent is:

Systems and methods for generating a user location history
Inventors: Daniel Mark Wyatt, Renaud Bourassa-Denis, Alexander Fabrikant, Tanmay Sanjay Khirwadkar, Prathab Murugesan, Galen Pickard, Jesse Rosenstock, Rob Schonberger, and Anna Teytelman
Assignee: Google LLC
US Patent: 9,877,162
Granted: January 23, 2018
Filed: October 11, 2016

Abstract

Systems and methods for generating a user location history are provided. One example method includes obtaining a plurality of location reports from one or more devices associated with the user. The method includes clustering the plurality of location reports to form a plurality of segments. The method includes identifying a plurality of location entities for each of the plurality of segments. The method includes determining, for each of the plurality of segments, one or more feature values associated with each of the location entities identified for such segment. The method includes determining, for each of the plurality of segments, a score for each of the plurality of location entities based at least in part on a scoring formula. The method includes selecting one of plurality of locations entities for each of the plurality of segments.

Why generate a location history?

A couple of reasons stand out in the patent’s extended description.

1) The generated user location history can be stored and then later accessed to provide personalized location-influenced search results.
2) As another example, a system implementing the present disclosure can provide the location history to the user via an interactive user interface that allows the user to view, edit, and otherwise interact with a graphical representation of her mobile location history.

I like the interactive user interface that shows times and distances traveled.

This statement from the patent was interesting, too:

According to another aspect of the present disclosure, a plurality of location entities can be identified for each of the plurality of segments. As an example, map data can be analyzed to identify all location entities that are within a threshold distance from a segment location associated with the segment. Thus, for example, all businesses or other points of interest within 1000 feet of the mean location of all location reports included in a segment can be identified.

Google may track information about locations that appear in that history, such as popularity features, which may include, “a number of social media mentions associated with the location entity being valued; a number of check-ins associated with the location entity being valued; a number of requests for directions to the location entity being valued; and/or and a global popularity rank associated with the location entity being valued.”

Personalization features may also be collected, which describe previous interactions between the user and the location entity, such as those listed below (a small scoring sketch follows the list):

1) a number of instances in which the user performed a map click with respect to the location entity being valued;
2) a number of instances in which the user requested directions to the location entity being valued;
3) a number of instances in which the user has checked-in to the location entity being valued;
4) a number of instances in which the user has transacted with the location entity as evidenced by data obtained from a mobile payment system or virtual wallet;
5) a number of instances in which the user has performed a web search query with respect to the location entity being valued.
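Here is the small scoring sketch promised above. It is only an illustration of the general idea under my own assumptions: for one segment of clustered location reports, nearby candidate location entities are scored by combining a distance term with popularity and personalization features, and the highest-scoring entity is selected. The feature names, weights, and formula are not the patent’s.

```python
# Illustrative scoring of candidate location entities for one segment.
candidate_entities = [
    {
        "name": "Thai Basil Restaurant",
        "distance_feet": 120,          # distance from the segment's mean location
        "directions_requests": 3,      # personalization: times the user asked for directions
        "map_clicks": 5,               # personalization: times the user clicked it on a map
        "global_popularity": 0.7,      # popularity: normalized global rank
    },
    {
        "name": "Corner Hardware Store",
        "distance_feet": 450,
        "directions_requests": 0,
        "map_clicks": 1,
        "global_popularity": 0.4,
    },
]

def score_entity(entity, max_distance_feet=1000.0):
    """Combine a distance term with popularity and personalization features."""
    proximity = max(0.0, 1.0 - entity["distance_feet"] / max_distance_feet)
    personalization = entity["directions_requests"] * 0.5 + entity["map_clicks"] * 0.2
    popularity = entity["global_popularity"]
    return 2.0 * proximity + 1.0 * personalization + 1.0 * popularity  # illustrative weights

best = max(candidate_entities, key=score_entity)
print(best["name"], round(score_entity(best), 2))
```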

Other benefits of location history

This next potential feature, querying location history, was one that I tested to see if it was working. It didn’t seem to be active at this point:

For example, a user may enter a search query that references the user’s historical location (e.g. “Thai restaurant I ate at last Thursday”). When it is recognized that the search query references the user’s location history, then the user’s location history can be analyzed in light of the search query. Thus, for example, the user location history can be analyzed to identify any Thai restaurants visited on a certain date and then provide such restaurants as results in response to the search query.
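If that feature were active, answering such a query might look something like this sketch, which simply filters a stored location history by entity category and by the date the query resolves to. The history records and category labels are made-up examples, not an actual Google API.

```python
# Illustrative filtering of a location history by category and date.
from datetime import date

location_history = [
    {"name": "Thai Basil Restaurant", "category": "thai restaurant", "visited": date(2018, 1, 18)},
    {"name": "Solana Beach",          "category": "beach",           "visited": date(2018, 1, 20)},
    {"name": "Corner Hardware Store", "category": "hardware store",  "visited": date(2018, 1, 18)},
]

def places_visited(history, category, on_date):
    """Return visited entities matching a category on a particular date."""
    return [
        entry["name"]
        for entry in history
        if entry["category"] == category and entry["visited"] == on_date
    ]

# "last Thursday" would first be resolved to a concrete date by the query parser.
print(places_visited(location_history, "thai restaurant", date(2018, 1, 18)))
```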

The patent refers to a graphical representation of mobile location history, which is available:

As an example, in some implementations, a user reviewing a graphical representation of her location history can indicate that one of the location entities included in her location history is erroneous (e.g. that she did not visit such location). In response, the user can be presented with one or more of the location entities that were identified for the segment for which the incorrect location entity was selected and can be given an opportunity to select a replacement location.

Location History Timeline Interface
A Location History Timeline Interface

In addition to the timeline interface, you can also see a map of places you may have visited:

Timeline with Map Interface
Map Interface

You can see in the screenshot of my timeline that I took a photo of a kumquat tree I bought yesterday. The timeline gives me a chance to see the photos I took and to edit them, if I would like. The patent tells us this about the user interface:

In other implementations, opportunities to perform other edits, such as deleting, annotating, uploading photographs, providing reviews, etc., can be provided in the interactive user interface. In such fashion, the user can be provided with an interactive tool to explore, control, share, and contribute to her location history.

The patent tells us that it tracks activities that you may have engaged in at specific locations:

In further embodiments of the present disclosure, a location entity can be associated with a user action within the context of a location history. For example, the user action can be making a purchase (e.g. with a digital wallet) or taking a photograph. In particular, in some embodiments, the user action or an item of content generated by the user action (e.g. the photograph or receipt) can be analyzed to assist in identifying the location entity associated with such user action. For example, the analysis of the user action or item of content can contribute to the score determined for each location entity identified for a segment.

I have had the Google Maps application ask me if I would like to contribute photos that I have taken at specific locations, such as at the sunset at Solana Beach. I haven’t used a digital wallet, so I don’t know if that is potentially part of my location history.

The patent describes the timeline feature and the Map feature that I included screenshots from above.

Interestingly, the patent tells us that location entities may be referred to by the common names of the places, and it calls those “Semantic Identifiers”:

Each location entity can be designated by a semantic identifier (e.g. the common “name” of restaurant, store, monument, etc.), as distinguished from a coordinate-based or location-based identifier. However, in addition to a name, the data associated with a particular location entity can further include the location of the location entity, such as longitude, latitude, and altitude coordinates associated with the location entity.
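As a small illustration of that distinction, a location entity record might carry both the semantic identifier and coordinate data; the field names and values below are assumptions for the example, not the patent’s data model.

```python
# Illustrative location entity keyed by a semantic identifier (its common name).
from dataclasses import dataclass

@dataclass
class LocationEntity:
    semantic_identifier: str   # the common "name" of the place
    latitude: float
    longitude: float
    altitude_meters: float

# Approximate, made-up coordinates for illustration.
fletcher_cove = LocationEntity(
    semantic_identifier="Fletcher Cove Beach Park",
    latitude=32.99,
    longitude=-117.27,
    altitude_meters=10.0,
)
print(fletcher_cove.semantic_identifier)
```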

It’s looking like location history could get smarter:

As an example, an interaction evidenced by search data can include a search query inputted by a user that references a particular location entity. As another example, an interaction evidenced by map data 218 can include a request for directions to a particular location entity or a selection of an icon representing the particular location entity within a mapping application. As yet another example, an interaction evidenced by email data 220 can include flight or hotel reservations to a particular city or lodging or reservations for dinner at a particular restaurant. As another example, an interaction evidenced by social media data 222 can include a check-in, a like, a comment, a follow, a review, or other social media action performed by the user with respect to a particular location entity.

Tracking these interactions is being done under the name “user/location entity interaction extraction,” and it may calculate statistics about such interactions:

Thus, user/location entity interaction extraction module 212 can analyze available data to extract interactions between a user and a location entity. Further, interaction extraction module 212 can maintain statistics regarding aggregate interactions for a location entity with respect to all users for which data is available.

It appears that to get the benefit of being able to access information such as this, you would need to give Google the ability to collect such data.

The patent provides more details about location history, popularity and other features, and even a little more about personalization. Many aspects of location history have been implemented, while some look like they have yet to be developed. As the three posts I have written about patents that rely on information from location history suggest, location history may well be used in other processes at Google.

How do you feel about mobile location history from Google?




Google Targeted Advertising, Part 1

January 28, 2018

Google Targeted Advertisements

One of the inventors of the newly granted patent I am writing about, Ross Koningstein, was behind one of the most visited Google patents I’ve written about, which I posted about under the title The Google Rank-Modifying Spammers Patent. That patent described a social engineering approach to stop site owners from using spammy tactics to raise the rankings of pages.

This new patent is about targeted advertising in Google’s paid search, which I haven’t written too much about here. I did write one post about paid search, which I called Google’s Second Most Important Algorithm? Before Google’s Panda, there was Phil. I started that post with a quote from Steven Levy, the author of the book In the Plex, which goes like this:

They named the project Phil because it sounded friendly. (For those who required an acronym, they had one handy: Probabilistic Hierarchical Inferential Learner.) That was bad news for a Google Engineer named Phil who kept getting emails about the system. He begged Harik to change the name, but Phil it was.

What this showed us was that Google did not build paid search with the AdSense algorithm from Applied Semantics, the company they acquired in 2003. But it has been interesting to see Google achieve so much with a business model that relies upon advertising, because they seemed so dead set against advertising when they first started the search engine. For instance, an early paper about the search engine they developed has an appendix about advertising.

If you read through The Anatomy of a Large-Scale Hypertextual Web Search Engine, you learn a lot about how the search engine was intended to work. But the section about advertising is really interesting. There, they tell us:

Currently, the predominant business model for commercial search engines is advertising. The goals of the advertising business model do not always correspond to providing quality search to users. For example, in our prototype search engine, one of the top results for cellular phone is “The Effect of Cellular Phone Use Upon Driver Attention”, a study which explains in great detail the distractions and risk associated with conversing on a cell phone while driving. This search result came up first because of its high importance as judged by the PageRank algorithm, an approximation of citation importance on the web [Page, 98]. It is clear that a search engine which was taking money for showing cellular phone ads would have difficulty justifying the page that our system returned to its paying advertisers. For this type of reason and historical experience with other media [Bagdikian 83], we expect that advertising funded search engines will be inherently biased towards the advertisers and away from the needs of the consumers.

So, when Google was granted a patent on December 26, 2017, that provides more depth on how targeted advertising might work at Google, it made for interesting reading. This is a continuation patent, which means the description ideally should be approximately the same as the original patent, while the claims are updated to reflect how the search engine might be using the processes described in a newer manner. The older version of the patent was filed on December 30, 2004, but it wasn’t granted under those earlier claims. It may be possible to dig up the earlier claims, but it is interesting to look at the description that accompanies the newest version of the patent to get a sense of how it works. Here is a link to the newest version of the patent, with claims that were updated in 2015:

Associating features with entities, such as categories of web page documents, and/or weighting such features
Inventors: Ross Koningstein, Stephen Lawrence, and Valentin Spitkovsky
Assignee: Google Inc.
US Patent: 9,852,225
Granted: December 26, 2017
Filed: April 23, 2015

Abstract

Features that may be used to represent relevance information (e.g., properties, characteristics, etc.) of an entity, such as a document or concept for example, may be associated with the document by accepting an identifier that identifies a document; obtaining search query information (and/or other serving parameter information) related to the document using the document identifier, determining features using the obtained query information (and/or other serving parameter information), and associating the features determined with the document. Weights of such features may be similarly determined. The weights may be determined using scores. The scores may be a function of one or more of whether the document was selected, a user dwell time on a selected document, whether or not a conversion occurred with respect to the document, etc. The document may be a Web page. The features may be n-grams. The relevance information of the document may be used to target the serving of advertisements with the document.
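To give a sense of what the abstract describes, here is a hedged sketch under my own assumptions: queries associated with a document are broken into n-gram features, and each feature is weighted by a score reflecting whether the result was selected, how long the user dwelled, and whether a conversion occurred. The scoring function and numbers are illustrative, not Google’s.

```python
# Illustrative n-gram feature weighting for one web page.
from collections import defaultdict

# Hypothetical query-log entries for one document.
query_log = [
    {"query": "meyer lemon tree care", "selected": True,  "dwell_seconds": 95, "converted": True},
    {"query": "lemon tree fertilizer", "selected": True,  "dwell_seconds": 20, "converted": False},
    {"query": "lemon tree care",       "selected": False, "dwell_seconds": 0,  "converted": False},
]

def ngrams(text, n=2):
    words = text.split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

def query_score(entry):
    """Higher scores for selected results, longer dwell times, and conversions."""
    score = 0.0
    if entry["selected"]:
        score += 1.0
    score += min(entry["dwell_seconds"], 120) / 120.0   # cap the dwell contribution
    if entry["converted"]:
        score += 2.0
    return score

feature_weights = defaultdict(float)
for entry in query_log:
    for gram in ngrams(entry["query"]):
        feature_weights[gram] += query_score(entry)

# The highest-weighted features describe what the page is relevant for, and
# could be used to target the serving of advertisements with it.
for gram, weight in sorted(feature_weights.items(), key=lambda kv: -kv[1]):
    print(f"{gram}: {weight:.2f}")
```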

I will continue with details about how this patent describes how they might target advertising at Google in a part 2 of this post.




Does Google Use Latent Semantic Indexing?

January 26, 2018
Railroad Turntable Sign
Technology evolves and changes over time.

In the Virginia town where I used to live, there was a park built along a railroad track that had been turned into a walking path. At one place near that track was a historic turntable, where cargo trains could be unloaded so that cars could be added to later trains or to trains headed in the opposite direction. That technology is no longer used, but it is an example of how technology changes and evolves over time.

There are people who write about SEO who have insisted that Google uses a technology called Latent Semantic Indexing to index content on the Web, but make those claims without any proof to back them up. I thought it might be helpful to explore that technology and its sources in more detail. It is a technology that was invented before the Web was around, to index the contents of document collections that don’t change much. LSI might be like the railroad turntables that used to be used on railroad lines.

There is also a website which offers “LSI keywords” to searchers, but it doesn’t provide any information about how it generates those keywords or uses LSI technology to generate them, nor any proof that they make a difference in how a search engine such as Google might index content that contains them. How is using “LSI keywords” different from the keyword stuffing that Google tells us not to do? Google tells us that we should:

Focus on creating useful, information-rich content that uses keywords appropriately and in context.

Where does LSI come from?

One of Microsoft’s researchers and search engineers, Susan Dumais, was an inventor behind a technology referred to as Latent Semantic Indexing, which she worked on developing at Bell Labs. Her home page links to many of the technologies that she has worked on while performing research at Microsoft; they are very informative and provide many insights into how search engines perform different tasks. Spending time with them is highly recommended.

Before joining Microsoft, she performed earlier research at Bell Labs, including writing about Indexing by Latent Semantic Analysis. She was also granted a patent as a co-inventor on the process. Note that this patent was filed in September of 1988 and granted in June of 1989; the World Wide Web didn’t go live until August 1991. The LSI patent is:

Computer information retrieval using latent semantic structure
Inventors: Scott C. Deerwester, Susan T. Dumais, George W. Furnas, Richard A. Harshman, Thomas K. Landauer, Karen E. Lochbaum, and Lynn A. Streeter
Assigned to: Bell Communications Research, Inc.
US Patent: 4,839,853
Granted: June 13, 1989
Filed: September 15, 1988

Abstract

A methodology for retrieving textual data objects is disclosed. The information is treated in the statistical domain by presuming that there is an underlying, latent semantic structure in the usage of words in the data objects. Estimates to this latent structure are utilized to represent and retrieve objects. A user query is recouched in the new statistical domain and then processed in the computer system to extract the underlying meaning to respond to the query.

The problem that LSI was intended to solve:

Because human word use is characterized by extensive synonymy and polysemy, straightforward term-matching schemes have serious shortcomings–relevant materials will be missed because different people describe the same topic using different words and, because the same word can have different meanings, irrelevant material will be retrieved. The basic problem may be simply summarized by stating that people want to access information based on meaning, but the words they select do not adequately express intended meaning. Previous attempts to improve standard word searching and overcome the diversity in human word usage have involved: restricting the allowable vocabulary and training intermediaries to generate indexing and search keys; hand-crafting thesauri to provide synonyms; or constructing explicit models of the relevant domain knowledge. Not only are these methods expert-labor intensive, but they are often not very successful.

The summary section of the patent tells us that there is a potential solution to this problem. Keep in mind that this was developed before the world wide web grew to become the very large source of information that it is, today:

These shortcomings, as well as other deficiencies and limitations of information retrieval, are obviated, in accordance with the present invention, by automatically constructing a semantic space for retrieval. This is effected by treating the unreliability of observed word-to-text object association data as a statistical problem. The basic postulate is that there is an underlying latent semantic structure in word usage data that is partially hidden or obscured by the variability of word choice. A statistical approach is utilized to estimate this latent structure and uncover the latent meaning. Words, the text objects and, later, user queries are processed to extract this underlying meaning and the new, latent semantic structure domain is then used to represent and retrieve information.

To illustrate how LSI works, the patent provides a simple example, using a set of 9 documents (much smaller than the web as it exists today). The example includes documents that are about human/computer interaction topics. It really doesn’t discuss how a process such as this could handle something the size of the Web because nothing that size had quite existed yet at that point in time. The Web contains a lot of information and goes through changes frequently, so an approach that was created to index a known document collection might not be ideal. The patent tells us that an analysis of terms needs to take place, “each time there is a significant update in the storage files.”
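For readers who want to see the mechanics, LSI estimates that latent structure with a singular value decomposition of a term-document matrix. Here is a minimal sketch on a handful of made-up documents, very much in the spirit of the patent’s small example rather than anything web-scale; the documents and the choice of two dimensions are illustrative.

```python
# Minimal latent semantic indexing sketch on a tiny, static document collection.
import numpy as np

docs = [
    "human computer interaction interface",
    "user interface design for computer systems",
    "graph theory and minors in trees",
    "paths and trees in graph theory",
]

vocab = sorted({word for doc in docs for word in doc.split()})
term_doc = np.array([[doc.split().count(word) for doc in docs] for word in vocab], dtype=float)

# Truncated SVD keeps only the k largest singular values/vectors.
k = 2
U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
doc_vectors = (np.diag(s[:k]) @ Vt[:k]).T   # each row is a document in the latent space

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# The two interface-related documents end up closer to each other in the reduced
# "semantic" space than to the graph-theory documents.
print(cosine(doc_vectors[0], doc_vectors[1]), cosine(doc_vectors[0], doc_vectors[2]))
```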

There has been a lot of research and development of technology that can be applied to a set of documents the size of the Web. We learned from Google that they are using a word vector approach developed by the Google Brain team, which was described in a patent granted in 2017. I wrote about that patent, and linked to the resources it used, in the post Citations behind the Google Brain Word Vector Approach. If you want to get a sense of the technologies Google may be using to index content and understand the words in that content, things have advanced a lot since the days just before the Web started. There are links to papers cited by the inventors of that patent within it, and some of those may be related in some ways to Latent Semantic Indexing, which could be called their ancestor.

The LSI technology that was invented in 1988 contains some interesting approaches, and if you want to learn a lot more about it, this paper is really insightful: A Solution to Plato’s Problem: The Latent Semantic Analysis Theory of Acquisition, Induction and Representation of Knowledge.

There are mentions of Latent Semantic Indexing in patents from Google, where it is used as an example indexing method:

Text classification techniques can be used to classify text into one or more subject matter categories. Text classification/categorization is a research area in information science that is concerned with assigning text to one or more categories based on its contents. Typical text classification techniques are based on naive Bayes classifiers, tf-idf, latent semantic indexing, support vector machines and artificial neural networks, for example.

~ Classifying text into hierarchical categories




Google Giving Less Weight to Reviews of Places You Stop Visiting?

December 19, 2017

Google-Timeline-Reviews

I don’t consider myself paranoid, but after reading a lot of Google patents, I’ve been thinking of my phone as my Android tracking device. It’s looking like Google thinks of phones similarly, paying a lot of attention to things such as a person’s location history. After reading a recent patent, I’m fine with Google continuing to look at my location history, and at reviews that I might write, even though there may not be any financial benefit to me. When I write a review of a business at Google, it’s normally because I’ve either really liked that place or disliked it, and wanted to share my thoughts about it with others.

A Google patent application, filed and published by the search engine but not yet granted, is about reviews of businesses.

It tells us about how reviews can benefit businesses:

Furthermore, once a review platform has accumulated a significant number of reviews it can be a useful resource for users to identify new entities or locales to visit or experience. For example, a user can visit the review platform to search for a restaurant at which to eat, a store at which to shop, or a place to have drinks with friends. The review platform can provide search results based on location, quality according to the reviews, pricing, and/or keywords included in textual reviews.

But, there are problems with reviews that this patent sets out to address and assist with:

However, one problem associated with review platforms is collecting a significant number of reviews. For example, a large majority of people do not take the time to visit the review platform and contribute a review for each point of interest they visit throughout a day.

Furthermore, even after a review is contributed by a user, the user’s opinion of the point of interest may change, rendering the contributed review outdated and inaccurate. For example, a restaurant for which the user previously provided a positive review may come under new ownership or experience a change in kitchen staff that causes the quality of the restaurant to decrease. As such, the user may cease visiting the restaurant or otherwise decrease a frequency of visits. However, the user may not take the time to return to the review platform and update their review.

The patent does have a solution to reviews that don’t get made or updated – if a person stops going to a place that they have reviewed in the past, the review that they submitted may be treated as a diminished review:

Thus, a location history associated with a user can provide one or more signals that indicate an implied review of points of interest. Therefore, systems and methods for using user location information to provide reviews are needed. In particular, systems and methods for providing a diminished review for a point of interest when a frequency of visits by one or more users to the point of interest decreases are desirable.

The pending patent application is at:

User Location History Implies Diminished Review
Inventors: Daniel Victor Klein and Dean Kenneth Jackson
US Patent Application 20170358015
Published: December 14, 2017
Filed: April 7, 2014

Abstract

Systems and methods for providing reviews are provided. One example system includes one or more computing devices. The system includes one or more non-transitory computer-readable media storing instructions that, when executed by the one or more computing devices, cause the one or more computing devices to perform operations. The operations include identifying, based on a location history associated with a user, a first signal. The first signal comprises a frequency of visits by the user to a first point of interest over a first time period. The operations include identifying, based on the location history associated with the user, a change in the first signal after the first time period. The operations include providing a diminished review for the user with respect to the first point of interest when the identified change comprises a decrease in the frequency of visits by the user to the first point of interest.

Some highlights from the patent description (a small sketch of the diminished-review idea follows this list):

1. Location updates can be received from more than one mobile device associated with a user, to create a location history over time.

2. Points of interest can be tracked and cover a really wide range of place types; or a point of interest such as a shopping mall may be treated as a single point of interest.

3. A person may control what information is collected about their location, and may be given a chance to modify or update that information.

4. Not visiting a particular place may lead to an assumption that a “user’s opinion of the point of interest has diminished or otherwise changed.”

5. A Diminished review might be a negative review or a lowering of a review score.

6. A reviewer may also be asked to “confirm or edit/elaborate on the previously contributed review,” if they don’t return to a place they have reviewed in a while.

7. User-contributed reviews could be said to have a decay period, in which their influence on search or rating systems wanes.

8. Other factors besides a change of opinion about a place may be considered, such as a change of residence or workplace to a new location, or an overall change in visitation patterns for all points of interest. These types of changes may not lead to a diminished review.

9. Aggregated frequencies of visits from many people may be considered; if many people still continue to visit a place, then a change by one person may not be used to reduce its overall score. If visits by many people show a decrease, then an assumption that something has changed with the point of interest could affect the overall score.
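Here is the small sketch promised above of how a diminished review might work, under my own assumptions: if a user’s visits to a reviewed place drop off while their overall visiting behavior stays steady, the weight of their earlier review is reduced; if their behavior changed everywhere (say, they moved), the review is left alone. The thresholds and decay factor are illustrative, not from the patent application.

```python
# Illustrative diminished-review weighting based on visit frequency changes.
def diminished_review_weight(
    review_score,
    visits_before,            # visits to the reviewed place during the first time period
    visits_after,             # visits to the reviewed place during the later time period
    overall_visits_before,    # visits to ALL points of interest, first period
    overall_visits_after,     # visits to ALL points of interest, later period
    decay=0.5,
):
    """Return an adjusted weight/score for a previously contributed review."""
    # If the user's visiting behavior dropped everywhere, don't infer a changed opinion.
    if overall_visits_before and overall_visits_after / overall_visits_before < 0.5:
        return review_score
    # If visits to this specific place decreased, diminish the review's influence.
    if visits_after < visits_before:
        return review_score * decay
    return review_score

print(diminished_review_weight(5.0, visits_before=6, visits_after=1,
                               overall_visits_before=40, overall_visits_after=38))
```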




Does Tomorrow Deliver Topical Search Results at Google?

November 14, 2017
The Oldest Pepper Tree in California

At one point in time, search engines such as Google learned about topics on the Web from sources such as Yahoo! and the Open Directory Project, which provided categories of sites, within directories that people could skim through to find something that they might be interested in.

Those listings of categories included hierarchical topics and subtopics, but they were managed by human beings, and both directories have closed down.

In addition to learning about categories and topics from such places, search engines used to use such sources to do focused crawls of the web, to make sure that they were indexing as wide a range of topics as they could.

It’s possible that we are seeing those sites replaced by sources such as Wikipedia and Wikidata and Google’s Knowledge Graph and the Microsoft Concept Graph.

Last year, I wrote a post called, Google Patents Context Vectors to Improve Search. It focused upon a Google patent titled User-context-based search engine.

In that patent we learned that Google was using information from knowledge bases (sources such as Yahoo Finance, IMDB, Wikipedia, and other data-rich and well organized places) to learn about words that may have more than one meaning.

An example from that patent was that the word “horse” has different meanings in different contexts.

To an equestrian, a horse is an animal. To a carpenter, a horse is a work tool. To a gymnast, a horse is a piece of equipment they perform maneuvers upon during competitions with other gymnasts.

A context vector takes these different meanings from knowledge bases, along with the number of times each is mentioned in those places, to catalogue how often the word is used in which context.
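As a toy illustration of that idea, a context vector for “horse” might just be a set of counts of how often the word shows up under each sense across knowledge-base sources; the sources, senses, and counts below are invented for the example.

```python
# Illustrative context vector: sense counts for the word "horse".
from collections import Counter

mentions = [
    ("animal", "wikipedia"), ("animal", "wikipedia"), ("animal", "imdb"),
    ("gymnastics equipment", "wikipedia"),
    ("carpentry tool", "wikipedia"), ("carpentry tool", "diy-site"),
    ("animal", "equestrian-site"), ("animal", "equestrian-site"),
]

context_vector = Counter(sense for sense, _source in mentions)
total = sum(context_vector.values())

# The relative weight of each sense suggests how a query containing "horse"
# is most likely meant when there is no other disambiguating context.
for sense, count in context_vector.most_common():
    print(f"{sense}: {count / total:.2f}")
```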

I thought knowing about context vectors was useful for doing keyword research, but I was excited to see another patent from Google appear where the word “context” plays a featured role. When you search for something such as a “horse”, the search results you receive are going to be mixed with horses of different types, depending upon the meaning. As this new patent tells us about such search results:

The ranked list of search results may include search results associated with a topic that the user does not find useful and/or did not intend to be included within the ranked list of search results.

If I were searching for a horse of the animal type, I might include another word in my query that identified the context of my search better. The inventors of this new patent seem to have a similar idea. The patent mentions:

In yet another possible implementation, a system may include one or more server devices to receive a search query and context information associated with a document identified by the client; obtain search results based on the search query, the search results identifying documents relevant to the search query; analyze the context information to identify content; and generate a group of first scores for a hierarchy of topics, each first score, of the group of first scores, corresponding to a respective measure of relevance of each topic, of the hierarchy of topics, to the content.
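Here is a rough sketch of that flow, simplified and under my own assumptions: topics in a small hierarchy are scored against context terms taken from the document the searcher was viewing, the best topic is selected, and search results associated with that topic are preferred. The topics, term lists, and results are illustrative, not from the patent.

```python
# Illustrative context-based filtering of search results by topic.
topic_terms = {
    "music/rock": {"band", "guitar", "album", "concert"},
    "geology/rocks": {"mineral", "sediment", "igneous", "formation"},
}

search_results = [
    {"url": "example.com/best-rock-bands", "topic": "music/rock"},
    {"url": "example.com/igneous-rock-types", "topic": "geology/rocks"},
]

def topic_scores(context_terms):
    """First scores: relevance of each topic to the context content."""
    return {topic: len(terms & context_terms) for topic, terms in topic_terms.items()}

def filter_results(results, context_terms):
    scores = topic_scores(context_terms)
    best_topic = max(scores, key=scores.get)
    # Second scores: here, simply whether a result is associated with the selected topic.
    return best_topic, [r["url"] for r in results if r["topic"] == best_topic]

# Context taken from a page about mineral formations:
print(filter_results(search_results, {"mineral", "formation", "strata"}))
```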

From the pictures that accompany the patent, it looks like this context information takes the form of headings that appear above groups of search results, identifying the context those results fit within. Here’s a drawing from the patent showing off topical search results (rock/music and geology/rocks):

Search Results in Context
Different types of ‘rock’ on a search for ‘rock’ at Google

This patent does remind me of the context vector patent, and the two processes in these two patents look like they could work together. This patent is:

Context-based filtering of search results
Inventors: Sarveshwar Duddu, Kuntal Loya, Minh Tue Vo Thanh and Thorsten Brants
Assignee: Google Inc.
US Patent: 9,779,139
Granted: October 3, 2017
Filed: March 15, 2016

Abstract

A server is configured to receive, from a client, a query and context information associated with a document; obtain search results, based on the query, that identify documents relevant to the query; analyze the context information to identify content; generate first scores for a hierarchy of topics, that correspond to measures of relevance of the topics to the content; select a topic that is most relevant to the context information when the topic is associated with a greatest first score; generate second scores for the search results that correspond to measures of relevance, of the search results, to the topic; select one or more of the search results as being most relevant to the topic when the search results are associated with one or more greatest second scores; generate a search result document that includes the selected search results; and send, to a client, the search result document.

It will be exciting to see topical search results start appearing at Google.




Semantic Keyword Research and Topic Models

November 14, 2017

Seeing Meaning

I went to the Pubcon 2017 Conference this week in Las Vegas, Nevada, and gave a presentation about Semantic Search topics based upon white papers and patents from Google. My focus was on things such as Context Vectors and Phrase-Based Indexing.

I promised in social media that I would post the presentation on my blog so that I could answer questions if anyone had any.

I’ve been doing keyword research like this for years: I look at other pages that rank well for the keyword terms I want to use, identify phrases and terms that tend to appear on those pages, and include them on the pages I am trying to optimize. It made a lot of sense to start doing that after reading about phrase-based indexing in 2005 and later.

Some of the terms I see when I search for Semantic Keyword Research include things such as “improve your rankings,” “conducting keyword research,” and “smarter content.” I’m also seeing phrases I’m not a fan of, such as “LSI keywords,” which has about as much scientific credibility as keyword density, which is to say next to none. Researchers from Bell Labs wrote a white paper about Latent Semantic Indexing in 1990; it was something used with small (less than 10,000 documents) and static collections of documents (the web is constantly changing and hasn’t been that small for a long time).

There are many people who call themselves SEOs who tout LSI keywords as keywords based upon having related meanings to other words; unfortunately, that has nothing to do with the LSI that was developed in 1990.

If you are going to present research or theories about things such as LSI, it really pays to do a little research first. Here’s my presentation. It includes links to the patents and white papers that the ideas within it are based upon. I do look forward to questions.




Using Ngram Phrase Models to Generate Site Quality Scores

September 30, 2017
Scrabble-phrases
Source: https://commons.wikimedia.org/wiki/File:Scrabble_game_in_progress.jpg
Photographer: McGeddon
Creative Commons License: Attribution 2.0 Generic

Navneet Panda, whom the Google Panda update is named after, has co-invented a new patent that focuses on site quality scores. It’s worth studying to understand how it determines the quality of sites.

Back in 2013, I wrote the post Google Scoring Gibberish Content to Demote Pages in Rankings, about Google using ngrams from sites and building language models from them to determine if those sites were filled with gibberish, or spammy content. I was reminded of that post when I read this patent.

Rather than explaining what ngrams are in this post (which I did in the gibberish post), I’m going to point to an example of ngrams at the Google n-gram viewer, which shows Google indexing phrases in scanned books. This article published by the Wired site also focused upon ngrams: The Pitfalls of Using Google Ngram to Study Language.

An ngram phrase could be a 2-gram, 3-gram, 4-gram, or 5-gram phrase, where pages are broken down into two-word, three-word, four-word, or five-word phrases. If a body of pages is broken down into ngrams, those ngrams could be used to create language models or phrase models to compare with other pages.

Language models, like the ones that Google used to create gibberish scores for sites could also be used to determine the quality of sites, if example sites were used to generate those language models. That seems to be the idea behind the new patent granted this week. The summary section of the patent tells us about this use of the process it describes and protects:

In general, one innovative aspect of the subject matter described in this specification can be embodied in methods that include the actions of obtaining baseline site quality scores for a plurality of previously-stored sites; generating a phrase model for a plurality of sites including the plurality of previously-scored sites, wherein the phrase model defines a mapping from phrase-specific relative frequency measures to phrase-specific baseline site quality scores; for a new site, the new site not being one of the plurality of previously-scored sites, obtaining a relative frequency measure for each of a plurality of phrases in the new site; determining an aggregate site quality score for the new site from the phrase model using the relative frequency measures of the plurality of phrases in the new site; and determining a predicted site quality score for the new site from the aggregate site quality score.

The newly granted patent from Google is:

Predicting site quality
Inventors: Navneet Panda and Yun Zhou
Assignee: Google
US Patent: 9,767,157
Granted: September 19, 2017
Filed: March 15, 2013

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for predicating a measure of quality for a site, e.g., a web site. In some implementations, the methods include obtaining baseline site quality scores for multiple previously scored sites; generating a phrase model for multiple sites including the previously scored sites, wherein the phrase model defines a mapping from phrase specific relative frequency measures to phrase specific baseline site quality scores; for a new site that is not one of the previously scored sites, obtaining a relative frequency measure for each of a plurality of phrases in the new site; determining an aggregate site quality score for the new site from the phrase model using the relative frequency measures of phrases in the new site; and determining a predicted site quality score for the new site from the aggregate site quality score.

In addition to generating ngrams from the text on sites, some implementations of this patent will also generate ngrams from the anchor text of links pointing to pages of those sites. Building a phrase model involves calculating the frequency of n-grams on a site “based on the count of pages divided by the number of pages on the site.”
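Here is a simplified sketch of that phrase-model idea, using my own approximation of the claim language: previously scored sites teach an average baseline quality score for each n-gram, and a new site’s prediction is a frequency-weighted combination of the scores of the n-grams it contains. The example sites and scores are made up.

```python
# Illustrative phrase model mapping n-gram frequencies to site quality scores.
from collections import defaultdict

def site_ngram_frequencies(pages, n=2):
    """Relative frequency of each n-gram: count of pages containing it / page count."""
    counts = defaultdict(int)
    for page in pages:
        words = page.split()
        grams = {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}
        for gram in grams:
            counts[gram] += 1
    return {gram: c / len(pages) for gram, c in counts.items()}

# Previously scored sites: (pages, baseline site quality score)
scored_sites = [
    (["useful gardening advice here", "detailed gardening advice and photos"], 0.9),
    (["buy cheap pills now", "cheap pills buy now fast"], 0.2),
]

# Phrase model: average baseline score of the sites an n-gram appears on.
phrase_totals, phrase_counts = defaultdict(float), defaultdict(int)
for pages, score in scored_sites:
    for gram in site_ngram_frequencies(pages):
        phrase_totals[gram] += score
        phrase_counts[gram] += 1
phrase_model = {g: phrase_totals[g] / phrase_counts[g] for g in phrase_totals}

def predict_site_quality(pages):
    """Aggregate the phrase scores of a new site, weighted by relative frequency."""
    freqs = site_ngram_frequencies(pages)
    known = {g: f for g, f in freqs.items() if g in phrase_model}
    if not known:
        return None
    weight = sum(known.values())
    return sum(phrase_model[g] * f for g, f in known.items()) / weight

print(predict_site_quality(["gardening advice and cheap pills", "gardening advice here"]))
```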

The patent tells us that site quality scores can impact rankings of pages from those sites, according to the patent:

Obtain baseline site quality scores for a number of previously-scored sites. The baseline site quality scores are scores used by the system, e.g., by a ranking engine of the system, as signals, among other signals, to rank search results. In some implementations, the baseline scores are determined by a backend process that may be expensive in terms of time or computing resources, or by a process that may not be applicable to all sites. For these or other reasons, baseline site quality scores are not available for all sites.




Google’s Project Jacquard: Textile-Based Device Controls

September 16, 2017

Textile Devices with Controls Built into them

I remember my father building some innovative plastics blow molding machines where he added a central processing control device to the machines so that all adjustable controls could be changed from one place. He would have loved seeing what is going on at Google these days, and the hardware that they are working on developing, which focuses on building controls into textiles and plastics.

This is outside of Google’s search efforts, but it is interesting to see what else they may get involved in, since that is beginning to cover a wider and wider range of things, from self-driving cars to glucose-analyzing contact lenses.

This morning I tweeted an article I saw in The Sun, from the UK, that was kind of interesting: Seating Plan: Google’s creating touch-sensitive car seats that will switch on air con, sat-nav and change music with a BUM WIGGLE.

It had me curious if I could find patents related to Google’s Project Jacquard, so I went to the USPTO website, and searched, and a couple came up.

Attaching Electronic Components to Interactive Textiles
Inventors: Karen Elizabeth Robinson, Nan-Wei Gong, Mustafa Emre Karagozler, Ivan Poupyrev
Assignee: Google
US Patent Application: 20170232538
Published: August 17, 2017
Filed: May 3, 2017

Abstract

This document describes techniques and apparatuses for attaching electronic components to interactive textiles. In various implementations, an interactive textile that includes conductive thread woven into the interactive textile is received. The conductive thread includes a conductive wire (e.g., a copper wire) that that is twisted, braided, or wrapped with one or more flexible threads (e.g., polyester or cotton threads). A fabric stripping process is applied to the interactive textile to strip away fabric of the interactive textile and the flexible threads to expose the conductive wire in a window of the interactive textile. After exposing the conductive wires in the window of the interactive textile, an electronic component (e.g., a flexible circuit board) is attached to the exposed conductive wire of the conductive thread in the window of the interactive textile.

Interactive Textiles
Inventors: Ivan Poupyrev
Assignee: Google Inc.
US Patent Application: 20170115777
Published: April 27, 2017
Filed: January 4, 2017

Abstract

This document describes interactive textiles. An interactive textile includes a grid of conductive thread woven into the interactive textile to form a capacitive touch sensor that is configured to detect touch input. The interactive textile can process the touch-input to generate touch data that is useable to control various remote devices. For example, the interactive textiles may aid users in controlling volume on a stereo, pausing a movie playing on a television, or selecting a web page on a desktop computer. Due to the flexibility of textiles, the interactive textile may be easily integrated within flexible objects, such as clothing, handbags, fabric casings, hats, and so forth. In one or more implementations, the interactive textiles may be integrated within various hard objects, such as by injection molding the interactive textile into a plastic cup, a hard casing of a smart phone, and so forth.
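As a toy illustration of turning touch input from such a grid into gestures like the ones shown in the patent drawings, here is a small sketch; the event format, grid coordinates, and thresholds are assumptions for the example, not anything from the patent.

```python
# Illustrative gesture classification from touch points on a conductive-thread grid.
def classify_gesture(events):
    """events: list of (timestamp_seconds, row, col) touch points on the grid."""
    if not events:
        return "none"
    duration = events[-1][0] - events[0][0]
    rows = [row for _, row, _ in events]
    vertical_travel = rows[0] - rows[-1]          # positive = finger moved up the grid
    gaps = [b[0] - a[0] for a, b in zip(events, events[1:])]

    if vertical_travel >= 3 and duration < 0.5:
        return "swipe up"
    if any(gap > 0.15 for gap in gaps) and duration < 0.6:
        return "double tap"
    if duration < 0.2:
        return "tap"
    return "hold"

print(classify_gesture([(0.00, 5, 2), (0.05, 4, 2), (0.10, 2, 2), (0.15, 1, 2)]))  # swipe up
print(classify_gesture([(0.00, 3, 3), (0.05, 3, 3), (0.30, 3, 3), (0.35, 3, 3)]))  # double tap
```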

The drawings that accompanied this patent were interesting because they showed off how gestures used on controls might be used:

Controls in action

textile controller
Here is a look at the textile controller.
double tap
A double tap on the controller is possible.
two finger touch
A two finger touch on the controller is also possible.
swipe up
You can swipe up on textile controllers.
extruder
An Extruder showing plastics materials being heated up to send to a mold
molded devices
The patent shows off plastic molder devices with controls built into them.

My father would have gotten a kick out of seeing a plastics extruder in a Google Patent (I know I did.)

It will be interesting to see textile and plastics controls come out as described in these patents.



The post Google’s Project Jacquard: Textile-Based Device Controls appeared first on SEO by the Sea ⚓.


SEO by the Sea ⚓


Citations behind the Google Brain Word Vector Approach

September 2, 2017 No Comments

Cardiff-Tidal-pools

In October of 2015, a new algorithm was announced by members of the Google Brain team, described in this post from Search Engine Land: Meet RankBrain: The Artificial Intelligence That’s Now Processing Google Search Results. One of the Google Brain team members who gave Bloomberg News a long interview on RankBrain, Gregory S. Corrado, was a co-inventor on a patent that was granted this August, along with other members of the Google Brain team.

In the SEM Post article, RankBrain: Everything We Know About Google’s AI Algorithm, we are told that RankBrain uses concepts from Geoffrey Hinton involving Thought Vectors. The summary in the description from the patent tells us how a word vector approach might be used in such a system:

Particular embodiments of the subject matter described in this specification can be implemented so as to realize one or more of the following advantages. Unknown words in sequences of words can be effectively predicted if the surrounding words are known. Words surrounding a known word in a sequence of words can be effectively predicted. Numerical representations of words in a vocabulary of words can be easily and effectively generated. The numerical representations can reveal semantic and syntactic similarities and relationships between the words that they represent.

By using a word prediction system having a two-layer architecture and by parallelizing the training process, the word prediction system can be effectively trained on very large word corpuses, e.g., corpuses that contain on the order of 200 billion words, resulting in higher quality numeric representations than those that are obtained by training systems on relatively smaller word corpuses. Further, words can be represented in very high-dimensional spaces, e.g., spaces that have on the order of 1000 dimensions, resulting in higher quality representations than when words are represented in relatively lower-dimensional spaces. Additionally, the time required to train the word prediction system can be greatly reduced.
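
As a rough illustration of how these numeric representations can reveal which words are similar, here is a tiny sketch using hand-made vectors. The five words, their three-dimensional vectors, and the related_terms helper are all invented for illustration; a real system would learn vectors with on the order of 1000 dimensions from billions of words, as the passage above notes.

    import numpy as np

    # Hand-made, toy 3-dimensional "word vectors" (illustrative only).
    vectors = {
        "laptop":   np.array([0.9, 0.1, 0.0]),
        "notebook": np.array([0.8, 0.2, 0.1]),
        "computer": np.array([0.7, 0.3, 0.0]),
        "battery":  np.array([0.3, 0.8, 0.1]),
        "recipe":   np.array([0.0, 0.1, 0.9]),
    }

    def cosine(a, b):
        # Cosine similarity: 1.0 means the two vectors point in the same direction.
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    def related_terms(word, k=2):
        # Rank every other word by how close its vector is to the query word's vector.
        q = vectors[word]
        scored = [(w, cosine(q, v)) for w, v in vectors.items() if w != word]
        return sorted(scored, key=lambda t: t[1], reverse=True)[:k]

    print(related_terms("laptop"))  # "notebook" and "computer" score closest here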

So, an incomplete or ambiguous query could use the words it does contain to predict missing words that may be related. Those predicted words could then be used to return search results that the original words might have difficulty returning on their own. The patent that describes this prediction process is:

Computing numeric representations of words in a high-dimensional space

Inventors: Tomas Mikolov, Kai Chen, Gregory S. Corrado and Jeffrey A. Dean
Assignee: Google Inc.
US Patent: 9,740,680
Granted: August 22, 2017
Filed: May 18, 2015

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for computing numeric representations of words. One of the methods includes obtaining a set of training data, wherein the set of training data comprises sequences of words; training a classifier and an embedding function on the set of training data, wherein training the embedding function comprises obtaining trained values of the embedding function parameters; processing each word in the vocabulary using the embedding function in accordance with the trained values of the embedding function parameters to generate a respective numerical representation of each word in the vocabulary in the high-dimensional space; and associating each word in the vocabulary with the respective numeric representation of the word in the high-dimensional space.
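
To make the abstract a little more concrete, here is a small, self-contained sketch of the general idea of training an embedding matrix together with a softmax classifier so that a word’s vector helps predict the words around it. It is only a toy: the corpus, window size, dimensions, and learning rate are arbitrary choices of mine, and the patent’s actual system is a parallelized two-layer architecture trained on corpuses of billions of words.

    import numpy as np

    # Toy corpus; real training data is on the order of billions of words.
    corpus = "the cat sat on the mat the dog sat on the rug".split()
    vocab = sorted(set(corpus))
    word_to_id = {w: i for i, w in enumerate(vocab)}
    V, D = len(vocab), 8          # vocabulary size and embedding dimension

    rng = np.random.default_rng(0)
    W_in = rng.normal(scale=0.1, size=(V, D))    # embedding function parameters
    W_out = rng.normal(scale=0.1, size=(D, V))   # classifier parameters

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    # Build (center word, context word) training pairs with a window of 2.
    window = 2
    pairs = []
    for i, w in enumerate(corpus):
        for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
            if j != i:
                pairs.append((word_to_id[w], word_to_id[corpus[j]]))

    # Plain stochastic gradient descent on the softmax classification loss.
    lr = 0.05
    for epoch in range(200):
        for center, context in pairs:
            h = W_in[center]                        # embedding of the center word
            p = softmax(h @ W_out)                  # predicted distribution over context words
            grad_out = np.outer(h, p)               # gradient of the loss w.r.t. W_out
            grad_out[:, context] -= h
            grad_h = W_out @ p - W_out[:, context]  # gradient w.r.t. the embedding
            W_out -= lr * grad_out
            W_in[center] -= lr * grad_h

    # After training, each row of W_in is a word's numeric representation.
    print(vocab)
    print(W_in[word_to_id["cat"]])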

One of the things that I found really interesting about this patent was that it includes a number of citations from the applicants. They looked worth reading, and many of them were co-authored by inventors of this patent, by people who are well-known in the field of artificial intelligence, or by people from Google. When I saw them, I started hunting for copies on the Web and was able to find them. I will be reading through them, and I thought it would be helpful to share those references, which was the idea behind this post. It may be helpful to read as many of these as possible before tackling this patent. If anything stands out in any way to you, let us know what you’ve found interesting.

Bengio and LeCun, “Scaling learning algorithms towards AI,” Large-Scale Kernel Machines, MIT Press, 41 pages, 2007.

Bengio et al., “A neural probabilistic language model,” Journal of Machine Learning Research, 3:1137-1155, 2003.

Brants et al., “Large language models in machine translation,” Proceedings of the Joint Conference on Empirical Methods in Natural Language Processing and Computational Language Learning, 10 pages, 2007.

Collobert and Weston, “A Unified Architecture for Natural Language Processing: Deep Neural Networks with Multitask Learning,” International Conference on Machine Learning, ICML, 8 pages, 2008.

Collobert et al., “Natural Language Processing (Almost) from Scratch,” Journal of Machine Learning Research, 12:2493-2537, 2011.

Dean et al., “Large Scale Distributed Deep Networks,” Neural Information Processing Systems Conference, 9 pages, 2012.

Elman, “Finding Structure in Time,” Cognitive Science, 14, 179-211, 1990.

Huang et al., “Improving Word Representations via Global Context and Multiple Word Prototypes,” Proc. Association for Computational Linguistics, 10 pages, 2012.

Mikolov and Zweig, “Linguistic Regularities in Continuous Space Word Representations,” submitted to NAACL HLT, 6 pages, 2012.

Mikolov et al., “Empirical Evaluation and Combination of Advanced Language Modeling Techniques,” Proceedings of Interspeech, 4 pages, 2011.

Mikolov et al., “Extensions of recurrent neural network language model,” IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 5528-5531, May 22-27, 2011.

Mikolov et al., “Neural network based language models for highly inflective languages,” Proc. ICASSP, 4 pages, 2009.

Mikolov et al., “Recurrent neural network based language model,” Proceedings of Interspeech, 4 pages, 2010.

Mikolov et al., “Strategies for Training Large Scale Neural Network Language Models,” Proc. Automatic Speech Recognition and Understanding, 6 pages, 2011.

Mikolov, “RNNLM Toolkit,” Faculty of Information Technology (FIT) of Brno University of Technology [online], 2010-2012 [retrieved on Jun. 16, 2014]. Retrieved from the Internet: http://www.fit.vutbr.cz/~imikolov/rnnlm/, 3 pages.

Mikolov, “Statistical Language Models based on Neural Networks,” PhD thesis, Brno University of Technology, 133 pages, 2012.

Mnih and Hinton, “A Scalable Hierarchical Distributed Language Model,” Advances in Neural Information Processing Systems 21, MIT Press, 8 pages, 2009.

Morin and Bengio, “Hierarchical Probabilistic Neural Network Language Model,” AISTATS, 7 pages, 2005.

Rumelhart et al., “Learning representations by back-propagating errors,” Nature, 323:533-536, 1986.

Turian et al., “MetaOptimize / projects / wordreprs,” Metaoptimize.com [online], captured on Mar. 7, 2012. Retrieved from the Internet using the Wayback Machine: http://web.archive.org/web/20120307230641/http://metaoptimize.com/projects/wordreprs, 2 pages.

Turian et al., “Word Representations: A Simple and General Method for Semi-Supervised Learning,” Proc. Association for Computational Linguistics, 384-394, 2010.

Turney, “Measuring Semantic Similarity by Latent Relational Analysis,” Proc. International Joint Conference on Artificial Intelligence, 6 pages, 2005.

Zweig and Burges, “The Microsoft Research Sentence Completion Challenge,” Microsoft Research Technical Report MSR-TR-2011-129, 7 pages, Feb. 20, 2011.



The post Citations behind the Google Brain Word Vector Approach appeared first on SEO by the Sea ⚓.

