Human vs. Machine Pt. 1: Algorithms and Spun Content

It may be a time-honored way to game ranking signals, but content spinning has long since outlived its usefulness.

In Thursday’s post, I mentioned that content creation is a balancing act between satisfying the algorithms by which search engines rank content and writing for human eyes. This week, I want to do a deeper dive into the concept to illustrate why it is so important to achieve this balance, and why sometimes, seemingly paradoxically, the machine will lose.

Some of the most popular and risky ways to game SERP rankings include:

  • “Spun” or computer-generated content
  • Keyword stuffing
  • Artificial linkbuilding
  • Link spamming
  • Whited-out text

Today, we’re going to talk about spun content.

Back in the bad old days when I started writing optimized marketing content, content spinning was an accepted practice. The idea was to take a base article and generate multiple other iterations which sounded and looked different enough from each other to be publishable. This meant, with some editing and tweaking, someone could hypothetically write one piece of content and get paid for it dozens of times over! (And in the bad old days, say 2010-2013, they DID. This practice always struck me as a little shady, and when I was no longer working for agencies which required it, I did away with it altogether in my own work in favor of creating bespoke content.)

Spun content is easy to pick out because it has an artificial feel, even when it’s been edited and smoothed over for better readability. It’s the Uncanny Valley of text-based content. Even if readers don’t consciously recognize it, the content’s slight “offness” can leave them feeling unsettled, uneasy or anxious without being able to put their finger on why. When website visitors are put off by the quality of the content, they don’t stay. If visitors don’t stay, your site and your business lose potential revenue!

Does this doll creep you out? If so, you have some idea of what spun content feels like to a reader. If not…well, nothing personal, but maybe we shouldn’t make any dinner plans, okay?

“But so what? If we’re getting eyes on our content, that’s a win, right?”

Sorry, but WRONG.

Over time, human readers and curators will review the content on your site. If it comes off as so much gobbledygook, which “spun” content so frequently does, it will be reported. Rack up enough reports, or even a single report from a superuser, and you could find that content or your entire site suppressed or banned from search engine indexing. When that happens, you lose all your gains and may have to start from zero with a brand-new website. And if you think that sounds like a fun time, consider this: you may also need to use a different computer and/or router to set up the replacement site. Otherwise, fingerprints such as that pesky IP address may set off alarms and get your site banned AGAIN the second you take it live.

Avoiding tripping alarms is a lot easier when you just don’t do the things which activate them in the first place.

Practices such as spinning have largely fallen out of favor except among “black-hat” SEO marketers, who promise quick results and nothing more. White-hat operators strive to create content which gains traffic and value organically. Yes, it takes more time, but it also won’t get your site banned!

Every search engine’s ranking algorithms work in more or less the same manner. They evaluate and grade content based on a number of “signals,” or indicators of relative quality.

Well-crafted, human-created content which is purpose-designed to rank prioritizes signals which satisfy the algorithms without compromising the reader experience.

Spoiler: Happy readers spend money!

Writing to satisfy an algorithm can be tricky. The following paragraph is a good example of how writing for human eyes and writing for a machine can clash. I’m going to pick on Google here, because it’s the 800-pound gorilla of the SEO world. I know literally NO ONE who worries or even cares about how Bing or Yahoo ranks their content, as long as they reach Google’s Page One results.

Google is very circumspect about how Penguin, Panda and its other algorithms evaluate website content. This is not a bug. It’s a feature of the algorithms’ design. By keeping what signals are used to assess content and how these signals are scored a secret, Google tries to prevent people from gaming the system. People often try to inflate their ranking by emphasizing signals which look great to a computer, but read as gibberish to human eyes. Most of what we know about these algorithms’ function has been discovered “in the wild” through trial and error, including real-time monitoring of page rankings across SERPs as content is posted and refined.

“I’m watching you, human.” –Google Penguin algorithm

Just for a goof and to illustrate the problem with spinning, I took this paragraph and ran it through a free online tool called Spinbot. The text in the top box is the original. The text in the bottom box was the result of the machine rewrite. As you can see, the results…well, they’re something!

Result screen edited for clarity and ease of reading. The original website will look different.

For accessibility purposes, here’s the bottom text as rendered by the machine:

Google is extremely careful about how Penguin, Panda and its different calculations assess site content. This isn’t a bug. It’s an element of the calculations’ plan. By keeping what signs are utilized to evaluate substance and how these signs are scored a mystery, Google attempts to keep individuals from gaming the framework. Individuals regularly attempt to blow up their positioning by stressing signals which look incredible to a PC, yet read as jabber to natural eyes. The majority of what we think about these calculations’ capacity has been found “in the wild” through experimentation, including ongoing observing of page rankings across SERPs as substance is posted and refined.

Mean what you is content my good not? Best the I writer am. Human beat no me can! Error! Error! EX-TERM-IN-ATE…!!!

When we compare the two, the bottom iteration is mostly grammatically sound. It’s TECHNICALLY correct in the sense that the reader can still (sort of?) understand what the writer is trying to say. But in terms of readability or actual sense, it’s abysmal. To illustrate this, I ran both iterations against an online readability checker. Here’s how my original content scored, as indicated by ReadabilityFormulas.com.

If I take this assessment at face value, this paragraph needs to be rewritten so it scores at least a 60 on the Flesch Reading Ease scale (higher score = easier comprehension). However, I’m not going to do that right now for three reasons. First, it’s unnecessary. I have faith that anyone savvy enough to follow this article doesn’t need it broken down to Dick and Jane levels. Second, in this case, it’s a pointlessly tedious and difficult exercise which doesn’t add sufficient value to be worth the time investment. Third, while I believe readability scores do serve a purpose in some cases, they are generally a frivolous concern when weighed against other signals which have more impact.
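For the curious, the 0-to-100 score where higher means easier is generally known as the Flesch Reading Ease score, and the formula behind it is simple enough to sketch in a few lines of Python. Treat this as a rough illustration only: the syllable counter is a crude vowel-group heuristic of my own, the function names are made up for this example, and a real checker such as ReadabilityFormulas.com will almost certainly count syllables and sentences differently and so return different numbers.

```python
import re

def count_syllables(word):
    """Crude estimate: count runs of vowels, with a minimum of one."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    # Treat a trailing "e" as silent ("score"), but keep "-le" endings ("table").
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)

def flesch_reading_ease(text):
    """Flesch Reading Ease: higher scores mean easier reading."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text needs at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word

simple = "The cat sat on the mat. The dog ran."
dense = "Google is very circumspect about how its algorithms evaluate website content."
print(round(flesch_reading_ease(simple), 1))
print(round(flesch_reading_ease(dense), 1))
```

Notice that the formula only looks at sentence length and syllable counts. It has no idea whether the words make sense together, which is exactly why a score alone can’t tell you whether a human will understand the text.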

Readability does not always equal intelligibility.

This is one area where I’m perfectly prepared to ignore the machine in favor of content designed for human minds.

This said, I assess work undertaken for a client differently than work I do on my own, depending on what the intended results and audience are. Some situations demand a different approach, such as documents which must be prepared in compliance with state and federal laws specifying a given minimum reading score. This is why humanized content requires human eyes, if you’ll pardon the pun!

And here are the results from the spun copy:

As you can see, the spun content is even more difficult to parse. While it technically meets the definition of “original” content, it falls down badly in terms of readability and in some spots borders on outright nonsense. It also feels artificial, or at best like it was run through Google Translate by a non-native English speaker.

Another thing you may have noticed about the spun content is that some of the keywords have changed. One that jumped out at me is the way “algorithms” has been replaced with “calculations.” This wouldn’t be a problem if I wanted to rank for “Google calculations” rather than “Google algorithms.” The issue is, “Google calculations” means something very different than “Google algorithms,” so people seeking information about how Panda and Penguin rank their content will pass over anything about “calculations” unless there’s a very clear linkage created. Even then, that linkage is likely to be so thin it probably won’t get to Page One, never mind “above the fold,” which is where 95% of all productive business clicks occur!
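To see how this kind of keyword drift happens, here is a toy spinner in Python. I have no idea how Spinbot actually works under the hood, so this is only a guess at the general technique: swap words for thesaurus entries without any regard for context. The `SYNONYMS` table and the function names are invented for this illustration.

```python
import re

# Hypothetical synonym table; a real spinning tool would use a far larger thesaurus.
SYNONYMS = {
    "algorithms": "calculations",
    "content": "substance",
    "signals": "signs",
    "system": "framework",
}

def spin(text):
    """Swap every known word for its listed synonym, ignoring context."""
    def swap(match):
        word = match.group(0)
        replacement = SYNONYMS.get(word.lower())
        if replacement is None:
            return word  # unknown words pass through untouched
        # Keep a leading capital if the original word had one.
        return replacement.capitalize() if word[0].isupper() else replacement
    return re.sub(r"[A-Za-z]+", swap, text)

def keyword_count(text, keyword):
    """Count whole-word, case-insensitive occurrences of a keyword."""
    pattern = rf"\b{re.escape(keyword)}\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))

original = "Google is very circumspect about how its algorithms evaluate website content."
spun = spin(original)

print(spun)
print(keyword_count(original, "algorithms"), keyword_count(spun, "algorithms"))
```

The spun sentence is still grammatical, but the keyword I actually wanted to rank for has vanished from it entirely.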

As algorithms have become more sophisticated, they have become quite adept at sniffing out spun content. They do this by indexing pages and comparing the content on them across the search engine. When they find content which is sufficiently similar (and no one is entirely sure what the threshold for “sufficient” is outside Google itself), they evaluate which content was posted first. The one which has the longer historical presence wins in this case. This makes content spinning at best of questionable value and at worst a disastrous practice in about 99.9 out of 100 cases.
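Since nobody outside Google knows the real method, the best I can offer is a sketch of one textbook way to do this kind of comparison: break each page into overlapping three-word “shingles,” measure how much two pages overlap (Jaccard similarity), and if they clear a threshold, credit whichever page was seen first. Everything here is an assumption made for illustration, including the 0.3 threshold, the function names and the example.com URLs.

```python
import re
from datetime import date

def shingles(text, k=3):
    """Return the set of overlapping k-word sequences in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return {tuple(words[i:i + k]) for i in range(max(len(words) - k + 1, 1))}

def jaccard(a, b):
    """Shared shingles divided by all distinct shingles across both sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def likely_original(page_a, page_b, threshold=0.3):
    """Each page is (url, first_seen, text). Returns the earlier page's URL
    if the two texts are similar enough, otherwise None."""
    similarity = jaccard(shingles(page_a[2]), shingles(page_b[2]))
    if similarity < threshold:
        return None  # not similar enough to treat as duplicates
    return min(page_a, page_b, key=lambda page: page[1])[0]

first = ("example.com/original", date(2012, 3, 1),
         "Google tries to prevent people from gaming the system")
copy = ("example.com/spun", date(2013, 6, 15),
        "Google tries to keep people from gaming the system")

print(jaccard(shingles(first[2]), shingles(copy[2])))
print(likely_original(first, copy))
```

One swapped word is already enough to knock out three of the seven shingles in this tiny example, so heavy synonym swapping would drag the overlap score down much further. Presumably the real systems use something more robust than this, but the “earliest version wins” logic is the part that makes spinning such a poor bet.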

No one knows all the secrets of The Matrix. Except Google. Google knows…

If you absolutely MUST use spun content for some arcane reason, you need an experienced content creation professional who can minimize the risks for you.

Generally speaking, you and your intended audience will be happier with, and more secure in, the quality and value of the content you’re delivering if you just hire a professional to craft bespoke content for you from the outset. It may cost a little more on the front end, but the time, money and hassle you’ll save yourself in the long run will be well worth the investment.

Be sure to tune in tomorrow when we talk about the most popular form of algorithm gaming still going on: keyword stuffing! Also, let me know your thoughts on spun content. Have you had good OR bad prior experiences with it? What happened when you encountered or used it? Let’s talk!

Published by Jericho Wayne

Jericho Wayne is a full-time Internet marketing and content creation consultant, and a published author of erotic romance and urban fantasy.
