CBPO

Tag: Scores

Etleap scores $1.5 million seed to transform how we ingest data

April 24, 2018

Etleap is a play on words for a common set of data practices: extract, transform and load. The startup is trying to place these activities in a modern context, automating what it can and in general speeding up what has been a tedious and highly technical practice. Today, the company announced a $1.5 million seed round.

Investors include First Round Capital, SV Angel, Liquid2, BoxGroup and other unnamed investors. The startup launched five years ago as a Y Combinator company. It spent a good 2.5 years building out the product, says CEO and founder Christian Romming. They haven’t required additional funding up until now because they have been working with actual customers. Those include Okta, PagerDuty and Mode, among others.

Romming started out at ad tech startup VigLink, and while there he encountered a problem that was hard to solve. “Our analysts and scientists were frustrated. Integration of the data sources wasn’t always a priority and when something broke, they couldn’t get it fixed until a developer looked at it.” That lack of control slowed things down and made it hard to keep the data warehouse up-to-date.

He saw an opportunity in solving that problem and started Etleap. While there were (and continue to be) legacy solutions like Informatica, Talend and Microsoft SQL Server Integration Services, he said when he studied these at a deeply technical level, he found they required a great deal of help to implement. He wanted to simplify ETL as much as possible, putting data integration into the hands of much less technical end users, rather than relying on IT and consultants.

One of the problems with traditional ETL is that the data analysts who make use of the data tend to get involved very late, after the tools have already been chosen, and Romming says his company wants to change that. “They get to consume whatever IT has created for them. You end up with a bread line where analysts are at the mercy of IT to get their jobs done. That’s one of the things we are trying to solve. We don’t think there should be any engineering at all to set up ETL pipeline,” he said.

Etleap is delivered as managed SaaS or you can run it within your company’s AWS accounts. Regardless of the method, it handles all of the managing, monitoring and operations for the customer.

Romming emphasizes that the product is really built for cloud data warehouses. For now, they are concentrating on the AWS ecosystem, but have plans to expand beyond that down the road. “We want to help more enterprise companies make better use of their data, while modernizing data warehousing infrastructure and making use of cloud data warehouses,” he explained.

The company currently has 15 employees, but Romming plans to at least double that in the next 12-18 months, mostly increasing the engineering team to help further build out the product and create more connectors.


Enterprise – TechCrunch


Using Ngram Phrase Models to Generate Site Quality Scores

September 30, 2017
Scrabble-phrases
Source: https://commons.wikimedia.org/wiki/File:Scrabble_game_in_progress.jpg
Photographer: McGeddon
Creative Commons License: Attribution 2.0 Generic

Navneet Panda, whom the Google Panda update is named after, has co-invented a new patent that focuses on site quality scores. It’s worth studying to understand how it determines the quality of sites.

Back in 2013, I wrote the post Google Scoring Gibberish Content to Demote Pages in Rankings, about Google using ngrams from sites and building language models from them to determine if those sites were filled with gibberish, or spammy content. I was reminded of that post when I read this patent.

Rather than explaining what ngrams are in this post (which I did in the gibberish post), I’m going to point to an example of ngrams at the Google n-gram viewer, which shows Google indexing phrases in scanned books. This article published by the Wired site also focused upon ngrams: The Pitfalls of Using Google Ngram to Study Language.

An ngram phrase could be a 2-gram, a 3-gram, a 4-gram, or a 5-gram phrase, where pages are broken down into two-word phrases, three-word phrases, four-word phrases, or five-word phrases. If a body of pages is broken down into ngrams, they could be used to create language models or phrase models to compare to other pages.
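To make that concrete, here is a minimal sketch of breaking a page's text into ngrams. It is only an illustration: the function name `ngrams` and the simple lowercase, split-on-whitespace tokenization are my assumptions, not anything described in the patent.

```python
def ngrams(text, n):
    """Return every run of n consecutive words in text, as tuples."""
    # Assumption: words are whatever remains after lowercasing and
    # splitting on whitespace; a real system would tokenize more carefully.
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


page = "the quick brown fox jumps"
two_grams = ngrams(page, 2)    # two-word phrases
three_grams = ngrams(page, 3)  # three-word phrases
```

A five-word page yields four 2-grams and three 3-grams, so longer phrases are always a little scarcer than shorter ones on the same page.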

Language models, like the ones that Google used to create gibberish scores for sites, could also be used to determine the quality of sites, if example sites were used to generate those language models. That seems to be the idea behind the new patent granted this week. The summary section of the patent tells us about this use of the process it describes and protects:

In general, one innovative aspect of the subject matter described in this specification can be embodied in methods that include the actions of obtaining baseline site quality scores for a plurality of previously-scored sites; generating a phrase model for a plurality of sites including the plurality of previously-scored sites, wherein the phrase model defines a mapping from phrase-specific relative frequency measures to phrase-specific baseline site quality scores; for a new site, the new site not being one of the plurality of previously-scored sites, obtaining a relative frequency measure for each of a plurality of phrases in the new site; determining an aggregate site quality score for the new site from the phrase model using the relative frequency measures of the plurality of phrases in the new site; and determining a predicted site quality score for the new site from the aggregate site quality score.

The newly granted patent from Google is:

Predicting site quality
Inventors: Navneet Panda and Yun Zhou
Assignee: Google
US Patent: 9,767,157
Granted: September 19, 2017
Filed: March 15, 2013

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for predicating a measure of quality for a site, e.g., a web site. In some implementations, the methods include obtaining baseline site quality scores for multiple previously scored sites; generating a phrase model for multiple sites including the previously scored sites, wherein the phrase model defines a mapping from phrase specific relative frequency measures to phrase specific baseline site quality scores; for a new site that is not one of the previously scored sites, obtaining a relative frequency measure for each of a plurality of phrases in the new site; determining an aggregate site quality score for the new site from the phrase model using the relative frequency measures of phrases in the new site; and determining a predicted site quality score for the new site from the aggregate site quality score.
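The steps in the abstract can be sketched roughly as follows. This is a toy illustration under my own assumptions, not the patent's implementation: the bucketing of relative frequencies into ten bins, the use of a plain average, and every name here (`bucket`, `build_phrase_model`, `aggregate_score`, the example sites and scores) are hypothetical. The final step, turning the aggregate score into a predicted score, isn't spelled out in the excerpts above, so the sketch stops at the aggregate.

```python
from collections import defaultdict


def bucket(freq, n_buckets=10):
    """Map a relative frequency in [0, 1] to a coarse bucket index."""
    return min(int(freq * n_buckets), n_buckets - 1)


def build_phrase_model(site_freqs, baseline_scores, n_buckets=10):
    """Map (phrase, frequency bucket) to a phrase-specific baseline score.

    site_freqs:      {site: {phrase: relative frequency}}
    baseline_scores: {site: baseline quality score} for scored sites only
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for site, freqs in site_freqs.items():
        if site not in baseline_scores:
            continue  # only previously scored sites contribute a score
        for phrase, freq in freqs.items():
            key = (phrase, bucket(freq, n_buckets))
            sums[key] += baseline_scores[site]
            counts[key] += 1
    # Assumption: the phrase-specific score is the mean baseline score
    # of the scored sites that fall in the same frequency bucket.
    return {key: sums[key] / counts[key] for key in sums}


def aggregate_score(model, new_site_freqs, n_buckets=10):
    """Average the model's scores over the new site's known phrases."""
    scores = []
    for phrase, freq in new_site_freqs.items():
        key = (phrase, bucket(freq, n_buckets))
        if key in model:
            scores.append(model[key])
    return sum(scores) / len(scores) if scores else None


# Made-up example data: three scored sites and one new site.
site_freqs = {
    "a.example": {("cheap", "pills"): 0.85},
    "b.example": {("cheap", "pills"): 0.05, ("peer", "reviewed"): 0.45},
    "c.example": {("cheap", "pills"): 0.82},
}
baseline_scores = {"a.example": 0.2, "b.example": 0.9, "c.example": 0.4}

model = build_phrase_model(site_freqs, baseline_scores)
new_site = {("cheap", "pills"): 0.88, ("peer", "reviewed"): 0.41}
aggregate = aggregate_score(model, new_site)
```

The point of the sketch is the shape of the data flow: scores known for some sites are attached to phrases, and a site that has never been scored borrows those phrase-level scores.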

In addition to generating ngrams from the text on sites, some implementations of this patent will also generate ngrams from the anchor text of links pointing to pages of those sites. Building a phrase model involves calculating the frequency of n-grams on a site “based on the count of pages divided by the number of pages on the site.”
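One way to read that quoted formula is as the share of a site's pages that contain a given phrase. The sketch below takes that reading as an assumption; the names `page_ngrams` and `relative_frequency` and the sample pages are hypothetical.

```python
def page_ngrams(text, n):
    """Return the set of n-word phrases (as tuples) on one page."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def relative_frequency(pages, phrase):
    """Pages containing the phrase, divided by all pages on the site."""
    if not pages:
        return 0.0
    n = len(phrase)
    hits = sum(1 for page in pages if phrase in page_ngrams(page, n))
    return hits / len(pages)


# Made-up three-page site.
pages = [
    "buy cheap pills now",
    "cheap pills for sale",
    "about our company",
]
freq = relative_frequency(pages, ("cheap", "pills"))  # 2 of 3 pages
```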

The patent tells us that site quality scores can impact rankings of pages from those sites:

Obtain baseline site quality scores for a number of previously-scored sites. The baseline site quality scores are scores used by the system, e.g., by a ranking engine of the system, as signals, among other signals, to rank search results. In some implementations, the baseline scores are determined by a backend process that may be expensive in terms of time or computing resources, or by a process that may not be applicable to all sites. For these or other reasons, baseline site quality scores are not available for all sites.


Copyright © 2017 SEO by the Sea ⚓. This Feed is for personal non-commercial use only. If you are not reading this material in your news aggregator, the site you are looking at may be guilty of copyright infringement. Please contact SEO by the Sea, so we can take appropriate action immediately.
Plugin by Taragana

The post Using Ngram Phrase Models to Generate Site Quality Scores appeared first on SEO by the Sea ⚓.


SEO by the Sea ⚓


Apttus scores $55M as it closes in on an IPO

September 13, 2017

Apttus, the unicorn quote-to-cash vendor built on the Salesforce platform, announced a $55 million round, which is likely its final private investment on the way to an IPO. While CEO Kirk Krappe wouldn’t definitively confirm the company was going public, he did say that today’s round was about gaining the confidence of future investors. “We decided we needed a certain amount…
Enterprise – TechCrunch


Will Google Start Giving People Social Media Influencer Scores?

April 29, 2017

Social Media Influencer Scores

A patent granted to Google this week tells us about social media influencer scores developed at Google that sound very much like the scores at Klout. In the references section of the patent, Klout is referred to a couple of times as well, with a link to the Wikipedia Page about Klout, and the Klout FAQ page. We aren’t given a name for these influencer scores in Google’s patent, but it does talk about topic-based influencer scores and advertisers.

Many patents are published that might give the inventors behind those patents a right to the technology described in them, but often the decision to move ahead with the processes described in those patents might be based upon business-based matters, such as whether or not there might be value in pursuing the patent. When I read this patent, I was reminded of an earlier patent from Google from a couple of years ago that described an advertising model that used social media influencers and their interests called AdHeat. That patent was AdHeat Advertisement Model for Social Network. A whitepaper that gives us a little more in-depth information about that process was AdHeat: An Influence-based Diffusion Model for Propagating Hints to Match Ads. One of the authors/inventors, Edward Chang, left Google after the paper came out to join HTC as their Vice President of Research and Innovation.

This new patent was originally filed on May 29, 2012. Edward Chang left Google for HTC in July, 2012. I don’t know if those events are related, but the idea of using social media influencers in advertising is an interesting one. The patent doesn’t pinpoint specific social media platforms that would be used the way that Klout does. Interestingly, Klout does use Google+ as one of the social media networks that they use to generate Klout Scores.

I like seeing what Google patents say about things on the Web. Their introduction to social media and to influencer scores was interesting:

Social media is pervasive in today’s society. Friends keep in contact throughout the day on social networks. Fans can follow their favorite celebrities and interact on blogs, micro-blogs, and the like. Such media are referred to as “social media,” which can be considered media primarily, but not exclusively, for social interaction, and which can use highly accessible and scalable communication techniques. Brands and products mentioned on such sites can reflect customers’ interests and feedback.

Some technologies have been developed to analyze social media. For example, some systems allow users to discover their “influence scores” on various social media. An influence score is a metric to measure a user’s impact in social media.

The patent tells us about the role of the process it defines:

…one aspect of the subject matter described in this specification can be embodied in methods that include the actions of identifying a user in a community; determining an influence score to be associated with the user in the community for a particular topic including determining a reach of one or more communications that relate to the particular topic that have been distributed from the user in the community; evaluating the reach as compared to one or more other users in the community for the particular topic; and storing the influence score in association with the user.

This new patent tells us about:

  1. Identifying a user in a community;
  2. Determining an influence score to be associated with the user in the community for a particular topic;
  3. Determining a reach of communications that relate to the particular topic distributed from the user to other users in the community;
  4. Evaluating that reach and comparing it to the reach of communications from other users in the community for the particular topic; and
  5. Storing the influence score in association with the user.

The patent also tells us that the following are advantages to be gained from the use of the process described in the patent:

  (1) The subject matter can be used to attribute viral growth to certain individuals or selected group.
  (2) Such attribution can be used for targeted advertising to the selected group or even to the individuals or other individuals that are influenced by the individual or group.
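The numbered steps above could be sketched along these lines. The patent excerpts here don't say how reach is measured or how the comparison is made, so this is only a guess at one simple possibility: reach is counted as distinct recipients, and a user's score is their reach divided by the largest reach in the community for that topic. The message format and the names `topic_reach` and `influence_scores` are hypothetical.

```python
def topic_reach(messages, user, topic):
    """Count distinct other users who received user's messages on topic."""
    recipients = set()
    for message in messages:
        if message["sender"] == user and message["topic"] == topic:
            recipients.update(message["recipients"])
    recipients.discard(user)  # a user does not count toward their own reach
    return len(recipients)


def influence_scores(messages, community, topic):
    """Score each user by reach relative to the best-reaching user."""
    reach = {user: topic_reach(messages, user, topic) for user in community}
    top = max(reach.values(), default=0)
    # Assumption: the comparison with other users is a simple ratio
    # against the largest reach in the community for this topic.
    return {user: (r / top if top else 0.0) for user, r in reach.items()}


# Made-up community and messages.
community = ["alice", "bob", "carol"]
messages = [
    {"sender": "alice", "topic": "economics", "recipients": ["bob", "carol"]},
    {"sender": "bob", "topic": "economics", "recipients": ["alice"]},
    {"sender": "alice", "topic": "fashion", "recipients": []},
]

# "Storing the influence score in association with the user" is just a
# dictionary keyed by topic and user in this sketch.
stored = {topic: influence_scores(messages, community, topic)
          for topic in ("economics", "fashion")}
```

Scoring per topic rather than once per user matches the patent's emphasis on topic-based influence: the same person can score highly on one topic and not at all on another.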

The patent is:

Determining influence in a social community
Inventors: Emily K. Moxley, Vinod Anupam, Hobart Sze, Dani Suleman, Khanh B. Nguyen
Assignee: Google Inc.
US Patent: 9,632,972
Granted: April 25, 2017
Filed: May 29, 2012

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining influence in a social community. In one aspect, a method includes identifying a user in a community; determining an influence score to be associated with the user in the community for a particular topic, including: determining a reach of one or more communications that relate to the particular topic that have been distributed from the user to other users in the community, and evaluating the reach as compared to the reach of one or more communications distributed from other users in the community for the particular topic; and storing the influence score in association with the user.

The patent is worth reading in full, and it contains some interesting insights, including some hints regarding whether Google might engage in this type of social media advertising (see the screenshot from the patent that starts this post, showing influencers and topic scores for them, which is described in a little more detail in the patent).

I also liked this quote from the patent, and wanted to make sure that I shared it, because it raises a good point:

Every community has individuals who influence that community. From a prominent economist’s advice on economics to a celebrity buying the latest designer bag, thousands of people pay attention to what influential individuals are doing within their field. However, less attention is paid when an influential individual opines on a topic outside their field. For example, the thousands of individuals that pay attention to the economists on economics would be unlikely to pay attention to the economist’s latest jacket purchase.

These social media influencer scores do seem very similar to what Klout is doing. Would Google venture into such territory?



The post Will Google Start Giving People Social Media Influencer Scores? appeared first on SEO by the Sea ⚓.


SEO by the Sea ⚓