61.5% of Web Traffic Is Not Human

ra2studio/Shutterstock.com

And here's how to build your own little traffic bot, even though you shouldn't.

It happened last year for the first time: Bot traffic eclipsed human traffic, according to the bot-trackers at Incapsula.

This year, Incapsula says 61.5 percent of traffic on the Web is non-human. 

Now, you might think this portends the arrival of "The Internet of Things"—that ever-promised network that will connect your fridge and car to your smartphone. But it does not.

This non-human traffic is search bots, scrapers, hacking tools, and other human impersonators, little pieces of code skittering across the web. You might describe this phenomenon as The Internet of Thingies. 

Because bots are not difficult to build. In fact, it's so simple that a journalist (who has not learned to code) can do it.

I do it with a ($300) program called UBot Studio, which is an infrastructural piece of the botting world. It lets people like me program and execute simple scripts in browsers without (really) knowing any code.

Do you need 100 Hotmail accounts? I got you.

Perhaps you'd like some set of links autotweeted? I'm there.

You want to scrape a few numbers from a government website or an online store? Easy. It'd take 10 minutes. 

Or — and this is the one that gets to me — perhaps you want to generate an extra 100,000 pageviews for some website? So simple. A programmer friend of mine put it like this, "The basics of sending fake traffic are trivial."

I'm going to tell you how here, even though I think executing such a script is highly unethical, probably fraud, and something you should not do. I'm telling you about it here because people need to understand how jawdroppingly easy it really is.

So, the goal is mimicking humans. Which means that you can't just send 100,000 visits to the same page. That'd be very suspicious. 

So you want to spread the traffic out over a bunch of target pages. But which ones? You don't want pages that no one ever visits. But you also don't want to send traffic to pages that people are paying close attention to, which tend to be the most recent ones. So, you want popular pages but not the most popular or recent pages.

Luckily, Google tends to index the popular, recentish stories more highly. And included with UBot are two little bots that can work in tandem. The first scrapes Google's suggestions searches. So it starts with the most popular A searches (Amazon, Apple, America's Cup) then the most popular B searches, etc. Another little bot scrapes the URLs from Google search results. 

So the first step in the script would be to use the most popular search suggestions to find popularish stories on the domain (say, theatlantic.com) and save all those domains.

The first search would be "amazon site:theatlantic.com." The top 20 URLs, all of which would be Atlantic stories, would get copied into a file. Then the bot would search "apple site:theatlantic.com" and paste another 20 in. And so on and so forth until you've got 1,000. 

Now, all you've got to do is have the bot visit each story, wait for the page to load, and go on to the next URL. Just for good measure, perhaps you'd have the browser "focus" on the ads on the page to increase the site's engagement metrics.

Loop your program 100 times and you're done. And you could do the same thing whenever you wanted to. 

Of course, the bot described here would be very easy to catch. If anyone looked, you'd need to be fancier to evade detection. For example, when a browser connects to a website, it sends a little token that says, "This is who I am!" And it lists the browser and the operating system, etc. Mine, for example, is, "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.63 Safari/537.36" 

If we ran the script like this, an identical 100,000 user agents would show up in the site's logs, which might be suspicious. 

But the user agent-website relationship is trust-based. Any browser can say, "I'm Chrome running on a Mac." And, in fact, there are pieces of software out there that will generate "realistic" user agent messages, which Ubot helpfully lets you plug in.

The hardest part would be obscuring that the IP addresses of the visits. Because if 100,000 visits came from a single computer, that would be a dead giveaway it was a bot. So, you could rent a botnet — a bunch of computers that have been hacked to do the bidding of (generally) bad people.

Or you could ask some "friends" to help out via a service like JingLing, which lets people use other people on the network to send traffic to webpages from different IP addresses. You scratch my back; I'll scratch yours! 

But, if the botting process is done subtly, no one might think to check what was going on. Because from a publisher's perspective, how much do you really want to know? 

In the example I gave, no page has gotten more than 100 views, but you've added 100,000 views to the site as a whole. It would just seem as if there was more traffic, but it'd all be down at the bottom of the traffic reports where most people have no reason to look.

And indeed, some reports have come out showing that people don't check. One traffic buyer told Digiday, "We worked with a major supply-side platform partner that was just wink wink, nudge nudge about it. They asked us to explain why almost all of our traffic came from one operating system and the majority had all the same user-agent string."

That is to say, someone involved in the traffic supply chain was no more sophisticated than a journalist with 10 hours of training using a publicly available piece of software. 

The point is: It's so easy to build bots that do various things that they are overrunning the human traffic on the web.

Now, to understand the human web, we have to reckon with the logic of the non-human web. It is, in part, shady traffic that allows ad networks and exchanges to flourish. And these automated ad buying platforms — while they do a lot of good, no doubt about it — also put pressure on other publishers to sell ads more cheaply. When they do that, there's less money for content, and the content quality suffers.

The ease of building bots, in other words, hurts what you read each and every day on the Internet. And it's all happening deep beneath the shiny web we know and (sometimes) love.

(Image via ra2studio/Shutterstock.com)

X
This website uses cookies to enhance user experience and to analyze performance and traffic on our website. We also share information about your use of our site with our social media, advertising and analytics partners. Learn More / Do Not Sell My Personal Information
Accept Cookies
X
Cookie Preferences Cookie List

Do Not Sell My Personal Information

When you visit our website, we store cookies on your browser to collect information. The information collected might relate to you, your preferences or your device, and is mostly used to make the site work as you expect it to and to provide a more personalized web experience. However, you can choose not to allow certain types of cookies, which may impact your experience of the site and the services we are able to offer. Click on the different category headings to find out more and change our default settings according to your preference. You cannot opt-out of our First Party Strictly Necessary Cookies as they are deployed in order to ensure the proper functioning of our website (such as prompting the cookie banner and remembering your settings, to log into your account, to redirect you when you log out, etc.). For more information about the First and Third Party Cookies used please follow this link.

Allow All Cookies

Manage Consent Preferences

Strictly Necessary Cookies - Always Active

We do not allow you to opt-out of our certain cookies, as they are necessary to ensure the proper functioning of our website (such as prompting our cookie banner and remembering your privacy choices) and/or to monitor site performance. These cookies are not used in a way that constitutes a “sale” of your data under the CCPA. You can set your browser to block or alert you about these cookies, but some parts of the site will not work as intended if you do so. You can usually find these settings in the Options or Preferences menu of your browser. Visit www.allaboutcookies.org to learn more.

Sale of Personal Data, Targeting & Social Media Cookies

Under the California Consumer Privacy Act, you have the right to opt-out of the sale of your personal information to third parties. These cookies collect information for analytics and to personalize your experience with targeted ads. You may exercise your right to opt out of the sale of personal information by using this toggle switch. If you opt out we will not be able to offer you personalised ads and will not hand over your personal information to any third parties. Additionally, you may contact our legal department for further clarification about your rights as a California consumer by using this Exercise My Rights link

If you have enabled privacy controls on your browser (such as a plugin), we have to take that as a valid request to opt-out. Therefore we would not be able to track your activity through the web. This may affect our ability to personalize ads according to your preferences.

Targeting cookies may be set through our site by our advertising partners. They may be used by those companies to build a profile of your interests and show you relevant adverts on other sites. They do not store directly personal information, but are based on uniquely identifying your browser and internet device. If you do not allow these cookies, you will experience less targeted advertising.

Social media cookies are set by a range of social media services that we have added to the site to enable you to share our content with your friends and networks. They are capable of tracking your browser across other sites and building up a profile of your interests. This may impact the content and messages you see on other websites you visit. If you do not allow these cookies you may not be able to use or see these sharing tools.

If you want to opt out of all of our lead reports and lists, please submit a privacy request at our Do Not Sell page.

Save Settings
Cookie Preferences Cookie List

Cookie List

A cookie is a small piece of data (text file) that a website – when visited by a user – asks your browser to store on your device in order to remember information about you, such as your language preference or login information. Those cookies are set by us and called first-party cookies. We also use third-party cookies – which are cookies from a domain different than the domain of the website you are visiting – for our advertising and marketing efforts. More specifically, we use cookies and other tracking technologies for the following purposes:

Strictly Necessary Cookies

We do not allow you to opt-out of our certain cookies, as they are necessary to ensure the proper functioning of our website (such as prompting our cookie banner and remembering your privacy choices) and/or to monitor site performance. These cookies are not used in a way that constitutes a “sale” of your data under the CCPA. You can set your browser to block or alert you about these cookies, but some parts of the site will not work as intended if you do so. You can usually find these settings in the Options or Preferences menu of your browser. Visit www.allaboutcookies.org to learn more.

Functional Cookies

We do not allow you to opt-out of our certain cookies, as they are necessary to ensure the proper functioning of our website (such as prompting our cookie banner and remembering your privacy choices) and/or to monitor site performance. These cookies are not used in a way that constitutes a “sale” of your data under the CCPA. You can set your browser to block or alert you about these cookies, but some parts of the site will not work as intended if you do so. You can usually find these settings in the Options or Preferences menu of your browser. Visit www.allaboutcookies.org to learn more.

Performance Cookies

We do not allow you to opt-out of our certain cookies, as they are necessary to ensure the proper functioning of our website (such as prompting our cookie banner and remembering your privacy choices) and/or to monitor site performance. These cookies are not used in a way that constitutes a “sale” of your data under the CCPA. You can set your browser to block or alert you about these cookies, but some parts of the site will not work as intended if you do so. You can usually find these settings in the Options or Preferences menu of your browser. Visit www.allaboutcookies.org to learn more.

Sale of Personal Data

We also use cookies to personalize your experience on our websites, including by determining the most relevant content and advertisements to show you, and to monitor site traffic and performance, so that we may improve our websites and your experience. You may opt out of our use of such cookies (and the associated “sale” of your Personal Information) by using this toggle switch. You will still see some advertising, regardless of your selection. Because we do not track you across different devices, browsers and GEMG properties, your selection will take effect only on this browser, this device and this website.

Social Media Cookies

We also use cookies to personalize your experience on our websites, including by determining the most relevant content and advertisements to show you, and to monitor site traffic and performance, so that we may improve our websites and your experience. You may opt out of our use of such cookies (and the associated “sale” of your Personal Information) by using this toggle switch. You will still see some advertising, regardless of your selection. Because we do not track you across different devices, browsers and GEMG properties, your selection will take effect only on this browser, this device and this website.

Targeting Cookies

We also use cookies to personalize your experience on our websites, including by determining the most relevant content and advertisements to show you, and to monitor site traffic and performance, so that we may improve our websites and your experience. You may opt out of our use of such cookies (and the associated “sale” of your Personal Information) by using this toggle switch. You will still see some advertising, regardless of your selection. Because we do not track you across different devices, browsers and GEMG properties, your selection will take effect only on this browser, this device and this website.