Smarter Marketing with AddThis: Audience Intelligence Platform Uses Machine Learning to Classify Data from 15M Domains

AddThis Website Marketing Tools

TL;DR: AddThis, a well-known pioneer of “Social Sharing” technology, has engineered a suite of marketing tools to help site owners target their audience and develop authentic, lasting relationships. Combining viewer interest data from 15 million domains with cutting-edge, in-house innovations, AddThis scrapes and continuously processes large amounts of site visitor data and offers that intel back to customers. Website owners can then create personalized media and content recommendations for smarter marketing.

In 2006, AddThis made an unforgettable splash in the social media space with its “Social Buttons” for sharing content. Today, those buttons appear on 15 million domains, and AddThis’s incredible team of engineers has developed the technology to scrape those webpages, classify the data, and relay that user intel back to website owners. We had the chance to chat with AddThis’s VP and GM of Publisher Products, Charlie Reverte, to discuss the product and why AddThis is, yet again, making an impact in the online world.

Audience Intelligence Platform: Personalized Content Recommendation

By taking data from all the sites — 15 million domains and counting — equipped with their flagship “Social Buttons,” AddThis allows site owners to create custom user experiences for site visitors and feature content that matches their individual user interests.

AddThis audience discovery and content recommendation

AddThis discovers web users’ interests, then recommends content to site owners via audience-targeting tools.

The Advantage for Site Owners & Developers: Conversions & Monetization

Conversions, and especially the monetization side of conversions, are often overlooked by developers; however, Charlie explained how AddThis addresses conversions in a way that’s truly advantageous to developers in particular.

“I think developers really have that extra advantage, because they have the chance to treat their visitors the way they should be treated (like they would in person),” he said. “We can provide data to you and you can make that light up; you can make that live.”

He gave the example of those (obnoxious) e-newsletter sign-up prompts that pop up for first-time visitors on certain websites. Charlie likened this to “trying to close before you even know the person.” Imagine: you’ve never seen the site before, and you don’t know if you can trust it or if it even contains whatever you’re looking for. NO, I’m not giving you my e-mail address, thanks.

“We think you should go on a date first,” Charlie said jokingly (but not really), “and really nurture the relationship from the first pageview.”

Steps to Implementation: 1 Line of Code & Full Dashboard Control

Step One: “One line of JavaScript, and you’re done,” Charlie said.

Everything else can be managed through the dashboard. “We have the big, really powerful JavaScript API where you can do everything on code,” he said; however, with release cycles and testing and such, they quickly saw the benefits of a non-code-centric UI. “If you can take things out of code and have it be controlled in the dashboard, it’s much more powerful,” he said.

“If you want to get your hands dirty, you can control all the CSS and HTML through the dashboard,” Charlie explained further. You can also add recommendation widgets, update the conversion modals, and control the audience targeting rules based on interest or based on how long the user has been on your site — and you can do it all on the fly. “The conversion modals are the ones that people want to update the most often,” Charlie said, “and those are the ones for which we’ve really built-out the control capabilities within the dashboard.”

AddThis audience targeting technology

AddThis’s Custom Message Editor: showing an overlay tool, with email capture goal and interest-targeting rules

AddThis offers several different popup types for you to control, but none of them mess with the actual URL of the page (as this makes Google mad). Popups may appear as you exit a site; some may remain consistently at the top of the page. “We’ll show them the right popup, the right modal, to skew them where you want them to go based upon your conversion flow,” Charlie said.

“All this wasn’t true a year ago,” Charlie admitted. “We’ve been working hard.”

New Feature: You can now write code against the AddThis API and deploy it via the dashboard. (With a JSON specification, you can make a really specific, custom rule.)

Going Further: AddThis Gets to Know 2 Billion Users

AddThis may not know your name or your address or your zodiac sign, but it does know if you’re into sports, shopping, or romantic comedies. Their technology discovers your interests based on the sites that you visit. This information can be used to help site owners understand what you, as the viewer, are hoping to get out of visiting their site.

Using Social Buttons to Track Where Users Go On the Web

With “Social Buttons” on 15 million domains, AddThis collects a tremendous amount of site visitor data, which they use to help website owners create custom user experiences for their audiences. Knowing what users are searching for, sharing, and looking at on the Internet, and being able to understand that information at scale, allows website owners to capitalize on viewers’ desires. Just as those clickable buttons made sharing on social media simple, personalized marketing and content make websites more attractive to viewers, which directly impacts conversions (and revenue!).

15M Domains Classified with Machine Learning

The idea of taking a lot of data and making smart decisions is becoming increasingly popular and necessary for businesses of all scales. Being able to absorb and understand a wealth of information about your customers is invaluable, because that data can be leveraged to meet the needs of your conversion funnel. AddThis took this goal to heart by building their own in-house data processor (Hydra), which is no small feat. Then, they use a machine learning classifier to actually comprehend the content and sort, group, and classify users’ interests accordingly.

Knowing what your audience likes is the first step to delivering on their expectations, but how does AddThis get to know over two billion viewers? Charlie explained that when you visit a site equipped with AddThis (JavaScript on page), you are given an anonymous identifier, which AddThis uses to discover your interests. AddThis processes over three billion requests every day, offering user interest data to their customers to use for audience targeting, content recommendation, and ultimately, enhancing engagement.
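The article does not say how AddThis generates or stores its anonymous identifiers, so the following is only a minimal sketch of the general idea: a random ID with no personal details attached, plus a running tally of the interest categories of the pages that ID has visited. The names `new_anonymous_id` and `InterestProfiles` are hypothetical.

```python
import uuid
from collections import Counter, defaultdict


def new_anonymous_id():
    """Return a random identifier that carries no personal information."""
    return uuid.uuid4().hex


class InterestProfiles:
    """Tally, per anonymous visitor, the interest categories of visited pages."""

    def __init__(self):
        # visitor id -> Counter of interest category -> number of visits
        self._profiles = defaultdict(Counter)

    def record_visit(self, visitor_id, page_interest):
        """Record that this visitor viewed a page classified under page_interest."""
        self._profiles[visitor_id][page_interest] += 1

    def top_interests(self, visitor_id, n=3):
        """Return up to n interest categories, most visited first."""
        return [name for name, _ in self._profiles[visitor_id].most_common(n)]
```

A site could call `record_visit` on each pageview and later ask `top_interests` for the same ID to decide what to recommend.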

The Result: Individual User Interests Discovered & Content Recommended

With all this data and user info scraped from the Web, AddThis can then provide same-site content recommendation. As the site owner, you get full control over the content promoted; however, “by default, we don’t show anyone else’s content,” Charlie explained. “We knew from the beginning that we didn’t want to be taking away traffic from the website,” he said.

AddThis gives site owners the power to prompt different modals, welcome bars, or calls-to-action for different users. You can even send users to different landing pages. “We want to make them convert on your website,” Charlie said. “We want to make them reach your goals and stick around, and we think we can do that better than anybody else, because we have all this data.”

Content Personalization Type 1: Automatic

AddThis offers two content-sharing personalization options, the first of which autonomously runs recommendations for site owners. “It’s really smart and hands-off,” Charlie said.

Content Personalization Type 2: Rule-Based

The second personalization option allows you to define how you want to think about your audience and user personas. “We allow you to come into our system and create ‘rules,’ where you can optimize your audience by all this awesome main data we bring you,” Charlie said. That’s where assigning specific modals, welcome messages, and prompts to targeted groups comes in.
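To make the idea of a targeting rule concrete, here is a small sketch of rules that match on a visitor’s interest and on how long they have been on the site. The JSON field names (`interest`, `min_seconds_on_site`, `show`) and the function names are assumptions made for illustration; they are not the AddThis rule format, which the article does not show.

```python
import json

# Hypothetical rule specification. The field names are illustrative only and
# are not taken from the AddThis API or dashboard.
RULES_JSON = """
[
  {"name": "sports-newsletter", "interest": "sports",
   "min_seconds_on_site": 30, "show": "email_capture_modal"},
  {"name": "shopping-welcome", "interest": "shopping",
   "min_seconds_on_site": 0, "show": "welcome_bar"}
]
"""


def load_rules(text):
    """Parse a JSON list of rule objects."""
    return json.loads(text)


def pick_message(rules, visitor_interests, seconds_on_site):
    """Return the 'show' value of the first rule the visitor matches, else None."""
    for rule in rules:
        interested = rule["interest"] in visitor_interests
        engaged = seconds_on_site >= rule["min_seconds_on_site"]
        if interested and engaged:
            return rule["show"]
    return None
```

In this sketch, a sports fan who has only just arrived gets no email prompt, which mirrors Charlie’s “go on a date first” advice: the capture modal waits until the visitor has spent some time on the site.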

The Technology: Hydra, Machine Learning Classifier, & the Hardware

As a founding engineer of AddThis, Charlie was well-prepared to blow us away with the intricate build-out of the tools AddThis uses to process and classify data, and the servers supporting it all.

AddThis In-House Datacenter

AddThis built their own bare-metal server network, which processes over three billion requests every day.

“We knew we wanted to up our processing game, so the first thing we did was build our own in-house processing center,” Charlie said. Enter Hydra (AKA, AddThis’s “secret sauce”). “We really wanted to leverage that ability to get direct access to the hardware.”

Hydra’s Inspiration: Former CTO Recognized 2 “Mega Trends”

Fewer than 50 people were aboard the AddThis team when they decided to take on building Hydra, but really the core team that contributed to its construction consisted of fewer than five people. “A lot of people have worked on it over the years,” Charlie said. “Today, we have about a dozen people that contribute to the core system at our company.”

AddThis’s then-CTO, Stewart Allen, spearheaded the initial Hydra project. Stewart was also the former CTO of a company called webMethods, which was a player in the initial Dot-Com Boom, so Stewart was very familiar with the market. Charlie said Stewart “knew the lay of the land and he saw a couple of mega trends going on”:

1. Horizontal Scalability Problem: tons of data to process across tons of machines

2. You can’t know now what you’ll want to know later: need aggregations for questions

“We don’t try to just build stuff in-house willy nilly. We try to really make it count when we have to,” Charlie said. The team began working on Hydra a few years before Apache’s Hadoop came out on the market; however, Charlie clarified that “Hydra solves a different problem than Hadoop.”

He explained that Hadoop and MapReduce systems in general are great for taking a humongous data set and processing it with a single question, but what if you need to further develop that analysis later?

Before Hydra, programmers would be sent back to the drawing board to re-crunch data whenever new questions arose, rather than being able to develop on top of existing data that had already been processed.

The Hydra Difference

First of all, Hydra is optimized to continuously update data, so you don’t have to go back and completely re-crunch numbers every time you have a new question to process.

Secondly, Hydra organizes the data it processes into trees. These trees serve almost as a database index, which allows you to run a wide range of flexible queries systematically and at scale. “You don’t even have to know what question you’re asking ahead of time,” Charlie said. “You can create a Hydra job that then basically summarizes your data into the core dimensions and you can still ask really flexible questions on it.”
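The two ideas above, continuous updates and tree-shaped summaries that can answer questions nobody planned for, can be illustrated with a toy aggregation tree. This is not Hydra’s data format or API; the class names and the choice of dimensions are assumptions for the sake of a short, runnable sketch.

```python
class Node:
    """One tree node: a hit count plus child nodes keyed by dimension value."""

    def __init__(self):
        self.count = 0
        self.children = {}


class AggregationTree:
    """Summarize events into a tree with one level per dimension."""

    def __init__(self, dimensions):
        self.dimensions = list(dimensions)
        self.root = Node()

    def add(self, event):
        """Fold one event into the tree, incrementing counts along its path."""
        node = self.root
        node.count += 1
        for dim in self.dimensions:
            node = node.children.setdefault(event[dim], Node())
            node.count += 1

    def count(self, *path):
        """Count events matching a path; None is a wildcard for that level."""
        nodes = [self.root]
        for value in path:
            if value is None:
                nodes = [c for n in nodes for c in n.children.values()]
            else:
                nodes = [n.children[value] for n in nodes if value in n.children]
        return sum(n.count for n in nodes)
```

New events only touch the nodes on their own path, so the summary stays current without reprocessing old data. A later question such as “how many sports events on any date and any domain?” becomes `tree.count(None, None, "sports")` against the existing tree.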

Open-Sourcing Hydra

“We’ve deployed Hydra across several hundred machines across our datacenter and we’ve been using it for several years,” Charlie told us. Last year, Hydra processed over one trillion events.

AddThis team member continuing research and technological discovery with hackathons

AddThis is very proud of their contributions made to the open-source community, including Hydra and stream-lib.

“We were so excited about it, we wanted to share it with the world,” Charlie said of the decision to open-source. “If Hydra’s a community, it’s just going to get better and better.”

Stream-Summarizing Bonus: stream-lib

Charlie also told us about a really important part of Hydra: stream-lib, which allows AddThis to process huge volumes of data on small machines with little memory. “We open-sourced stream-lib before we open-sourced Hydra,” he said.

This subset of Hydra has become really popular, as it is used by major players in the web sphere, including Twitter and Cassandra. It’s no surprise that stream-lib is one of the “most prided contributions” AddThis has made to the open-source community, according to Charlie.
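For a feel of how a stream can be summarized in a small, fixed amount of memory, here is a minimal Count-Min sketch, one well-known technique in this family. stream-lib itself is a Java library, and the Python below neither uses nor mirrors its API; the class, its parameters, and the hashing scheme are assumptions chosen for illustration.

```python
import hashlib


class CountMinSketch:
    """Approximate per-item counts for a stream using a fixed-size table."""

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        # Memory use is width * depth counters, however long the stream is.
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        """Hash the item to a column, using the row number as a salt."""
        digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def add(self, item, count=1):
        """Record count occurrences of item."""
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        """Return an estimate that is never below the item's true count."""
        return min(
            self.table[row][self._index(item, row)] for row in range(self.depth)
        )
```

The trade-off is accuracy for memory: hash collisions can push an estimate above the true count, but never below it, and a wider table makes collisions rarer.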

Check out GitHub/AddThis: Hydra and stream-lib are updated frequently, with a ton of traffic.

Machine Learning Classification

From a technical standpoint, scraping all webpages that have AddThis for site information and audience discovery may sound simple, but we imagined the implementation would be significantly more complex. Scraping mass quantities of data is one thing, but actually understanding all of that information and doing something actionable with it is an entirely different problem (and a common frustration in the development world).

“Especially doing things at scale,” Charlie pointed out, “trying to do something across a whole cluster of machines, it’s really a bear.”

AddThis scrapes all webpages that have AddThis on them and runs the content through a machine learning classifier. This allows them to see and process 50 million new webpages per day.

AddThis machine learning classifier discovers user interests

AddThis allows you to create rules based on the user interests discovered by the machine learning classifier.

“We use that activity to understand what the preferences of visitors are, and that’s what powers our whole data platform,” Charlie said. “Now we can personalize everything about a website’s conversions and conversion funnel and use the same audience data to power personalization within online advertising,” he explained.
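The article does not say which model the AddThis classifier uses. As a stand-in, the sketch below uses a tiny multinomial naive Bayes text classifier with add-one smoothing to show what “classifying a page into an interest category” can look like. The class name, the whitespace tokenizer, and the training examples are all assumptions.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesTextClassifier:
    """Multinomial naive Bayes over whitespace-separated, lowercased words."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> training pages seen
        self.vocabulary = set()

    def train(self, text, label):
        """Add one labeled page to the training statistics."""
        words = text.lower().split()
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocabulary.update(words)

    def classify(self, text):
        """Return the label with the highest log-probability for this text."""
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Start from the log prior for the label.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in words:
                # Add-one (Laplace) smoothing so unseen words do not zero out.
                count = self.word_counts[label][word] + 1
                score += math.log(count / (total_words + len(self.vocabulary)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

After training on a few pages labeled “sports” and “shopping”, calling `classify` on new page text returns whichever interest label fits best, and that label is the kind of signal the targeting rules described earlier could match on.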

The Hardware: From Dedicated to Colocated Servers

Fun Fact: AddThis doesn’t use any cloud hosting or Amazon Web Services.

AddThis began on dedicated servers with Peer1 Hosting, but once they grew beyond 20 or 30 servers, they switched over to their own bare-metal platform on colocated servers. “Both East Coast and West Coast,” Charlie added, “so we have redundancy.” By using their own server network, AddThis is able to run their entire platform (3B+ requests to their network per day) for about one-fifth the cost of Amazon.

Final Gist

“Everybody knows us for the sharing buttons,” Charlie said, “and that’s sort of the rocket ship we rode to being on 15 million domains.” Today, AddThis reaches two billion people across the globe. While plenty of players may do analytics really well or data processing really well, AddThis is the only one to offer their own scraped, processed, and classified data to their customers.

Looking towards the future, AddThis aims to maintain and grow its legacy as a leading personalization network. “We want to continue to be the world’s largest personalization network,” Charlie said. “Data fuels everything with online advertising — the whole ecosystem lives and breathes on data.”

AddThis anticipates the Web as a whole following suit. With personalization networks powering the Web and offering user-controlled data to improve the site experience, AddThis is in the perfect position to dominate the (personalized) future of the World Wide Web.


Addendum: Charlie’s Background & His Road to AddThis

AddThis's VP/GM of Publisher Products walked us through their tech stack

AddThis VP and GM of Publisher Products: Charlie Reverte

In college, Charlie was very interested in FPGAs. “I LOVED the hardware side,” he said, noting that with design, machining, et cetera, it can be months before plans actually become reality.

In grad school, Charlie befriended the would-be founder of AddThis while working in a medical robotics lab. The heads-up display system they designed for knee replacements would allow orthopedic surgeons to see the bones on top of the skin before cutting, “like having x-ray vision,” Charlie said.

After their work “basically rotted on the shelf,” to use his words, Charlie came to a conclusion: “If I want to impact the world, it can’t just be with ideas.” So he made the jump from robotics to the Web, soon joining AddThis’s founding team.

Looking back on his jump from hardware to software, Charlie said, “with software, the turnaround time is seconds.” He describes his redirected career field as, “the ultimate space where you can have ideas and bring them to life in the shortest time possible.”

Photo Sources: addthis.com; bizjournals.com

Ryan Frankel
