25 Killer Combos for Google’s Site: Operator


Posted by Dr. Pete

There’s an app for everything – the problem is that we’re so busy chasing the newest shiny toy that we rarely stop to learn to use simple tools well. As a technical SEO, one of the tools I seem to never stop finding new uses for is the site: operator. I recently devoted a few slides to it in my BlueGlassX presentation, but I realized that those 5 minutes were just a tiny slice of all of the uses I’ve found over the years.

People often complain that site:, by itself, is inaccurate (I’ll talk about that more at the end of the post), but the magic is in the combination of site: with other query operators. So, I’ve come up with two dozen killer combos that can help you dive deep into any site.

1. site:example.com

Ok, this one’s not really a combination, but let’s start with the basics. Paired with a root domain or sub-domain, the [site:] operator returns an estimated count of the number of indexed pages for that domain. The “estimated” part is important, but we’ll get to that later. For a big picture, I generally stick to the root domain (leave out the “www”, etc.).

Each combo in this post will have a clickable example (see below). I'm picking on Amazon.com in my examples, because they're big enough for all of these combos to come into play:

You’ll end up with two bits of information: (1) the actual list of pages in the index, and (2) the count of those pages (circled in purple below):

Screenshot - site:amazon.com

I think we can all agree that 273,000,000 results is a whole lot more than most of us would want to sort through. Even if we wanted to do that much clicking, Google would stop us after 100 pages. So, how can we get more sophisticated and drill down into the Google index?

2. site:example.com/folder

The simplest way to dive deeper into this mess is to provide a sub-folder (like “/blog”) – just append it to the end of the root domain. Don’t let the simplicity of this combo fool you – if you know a site’s basic architecture, you can use it to drill down into the index quickly and spot crawl problems.

3. site:sub.example.com

You can also drill down into specific sub-domains. Just use the full sub-domain in the query. I generally start with #1 to sweep up all sub-domains, but #3 can be very useful for situations like tracking down a development or staging sub-domain that may have been accidentally crawled.

4. site:example.com inurl:www

The "inurl:" operator searches for specific text in the indexed URLs. You can pair “site:” with “inurl:” to find the sub-domain in the full URL. Why would you use this instead of #3? On the one hand, "inurl:" will look for the text anywhere in the URL, including the folder and page/file names. For tracking sub-domains this may not be desirable. However, "inurl:" is much more flexible than putting the sub-domain directly into the main query. You'll see why in examples #5 and #6.

5. site:example.com -inurl:www

Adding [-] to most operators tells Google to search for anything but that particular text. In this case, by separating out "inurl:www", you can change it to "-inurl:www" and find any indexed URLs that are not on the "www" sub-domain. If "www" is your canonical sub-domain, this can be very useful for finding non-canonical URLs that Google may have crawled.

6. site:example.com -inurl:www -inurl:dev -inurl:shop

I'm not going to list every possible combination of Google operators, but keep in mind that you can chain most operators. Let's say you suspect there are some stray sub-domains, but you aren't sure what they are. You are, however, aware of "www.", "dev." and "shop.". You can chain multiple "-inurl:" operators to remove all of these known sub-domains from the query, leaving you with a list of any stragglers.
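If you find yourself building these chains by hand for a lot of domains, a few lines of scripting can save some typing. Here's a rough, purely illustrative sketch (the helper name is my own invention) that assembles the combo above so you can paste it straight into Google:

    # Illustrative helper only: build the "site: minus known sub-domains" combo.
    def site_query(domain, exclude_subdomains=()):
        parts = ["site:" + domain]
        parts += ["-inurl:" + sub for sub in exclude_subdomains]
        return " ".join(parts)

    print(site_query("example.com", exclude_subdomains=("www", "dev", "shop")))
    # -> site:example.com -inurl:www -inurl:dev -inurl:shop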

7. site:example.com inurl:https

You can't put a protocol directly into "site:" (e.g. "https:", "ftp:", etc.). Fortunately, you can put "https" into an "inurl:" operator, allowing you to see any secure pages that Google has indexed. As with all "inurl:" queries, this will find "https" anywhere in the URL, but it's relatively rare to see it somewhere other than the protocol.

8. site:example.com inurl:param

URL parameters can be a Panda's dream. If you're worried about something like search sorts, filters, or pagination, and your site uses URL parameters to create those pages, then you can use "inurl:" plus the parameter name to track them down. Again, keep in mind that Google will look for that name anywhere in the URL, which can occasionally cause headaches.

Pro Tip: Try out the example above, and you'll notice that "inurl:ref" returns any URL with "ref" in it, not just traditional URL parameters. Be careful when searching for a parameter that is also a common word.

9. site:example.com -inurl:param

Maybe you want to know how many search pages are being indexed without sorts or how many product pages Google is tracking with no size or color selection – just add [-] to your "inurl:" statement to exclude that parameter. Keep in mind that you can combine "inurl:" with "-inurl:", specifically including some parameters and excluding others. For complex, e-commerce sites, these two combos alone can have dozens of uses.
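If your platform uses a long list of parameters, you can also generate the audit queries from combos #8 and #9 in bulk. A hypothetical sketch (the parameter names are just examples):

    # Purely illustrative: emit the inurl:/-inurl: audit queries for each parameter.
    def param_audit_queries(domain, params):
        for p in params:
            yield f"site:{domain} inurl:{p}"    # pages indexed WITH the parameter
            yield f"site:{domain} -inurl:{p}"   # pages indexed WITHOUT it

    for q in param_audit_queries("example.com", ["sort", "filter", "sessionid"]):
        print(q)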

10. site:example.com text goes here

Of course, you can always combine the "site:" operator with a plain old text query. This will search the contents of the entire page within the given site. Like standard queries, this is essentially a logical [AND], but it's a bit of a loose [AND] – Google will try to match all terms, but those terms may be separated on the page or you may get back results that only include some of the terms. You'll see that the example below matches the phrase "free Kindle books" but also phrases like "free books on Kindle".

11. site:example.com “text goes here”

If you want to search for an exact-match phrase, put it in quotes. This simple combination can be extremely useful for tracking down duplicate and near-duplicate copy on your site. If you're worried about one of your product descriptions being repeated across dozens of pages, for example, pull out a few unique terms and put them in quotes.

12. site:example.com/folder “text goes here”

This is just a reminder that you can combine text (with or without quotes) with almost any of the combinations previously discussed. Narrow your query to just your blog or your store pages, for example, to really target your search for duplicates.

13. site:example.com this OR that

If you specifically want a logical [OR], Google does support use of "or" in queries. In this case, you'd get back any pages indexed on the domain that contained either "this" or "that" (or both, as with any logical [OR]). This can be very useful if you've forgotten exactly which term you used or are searching for a family of keywords.

Edit: Hat Tip to TracyMu in the comments – this is one case where capitalization matters. Either use "OR" in all-caps or the pipe "|" symbol. If you use lower-case "or", Google could interpret it as part of a phrase.

14. site:example.com “top * ways”

The asterisk [*] can be used as a wildcard in Google queries to replace unknown text. Let's say you want to find all of the "Top X" posts on your blog. You could use "site:" to target your blog folder and then "Top *" to query only those posts.

Pro Tip: The wildcard [*] operator will match one or multiple words. So, "top * books" can match "Top 40 Books" or "Top Career Management Books". Try the sample query above for more examples.

15. site:example.com “top 7..10 ways”

If you have a specific range of numbers in mind, you can use "X..Y" to return anything in the range from X to Y. While the example above is probably a bit silly, you can use ranges across any kind of on-page data, from product IDs to prices.

16. site:example.com ~word

The tilde [~] operator tells Google to find words related to the word in question. Let's say you wanted to find all of the posts on your blog related to the concept of consulting – just add "~consulting" to the query, and you'll get the wider set of terms that Google thinks are relevant.

17. site:example.com ~word -word

By using [-] to exclude the specific word, you can tell Google to find any pages related to the concept that don't specifically target that term. This can be useful when you're trying to assess your keyword targeting or create new content based on keyword research.

18. site:example.com intitle:”text goes here”

The "intitle:" operator only matches text that appears in the <TITLE></TITLE> tag. One of the first spot-checks I do on any technical SEO audit is to use this tactic with the home-page title (or a unique phrase from it). It can be incredibly useful for quickly finding major duplicate content problems.

19. site:example.com intitle:”text * here”

You can use almost any of the variations mentioned in (12)-(17) with "intitle:" – I won't list them all, but don't be afraid to get creative. Here's an example that uses the wildcard search in #14, but targets it specifically to page titles.

Pro Tip: Remember to use quotes around the phrase after "intitle:", or Google will view the query as a one-word title search plus straight text. For example, "intitle:text goes here" will look for "text" in the title plus "goes" and "here" anywhere on the page.

20. intitle:”text goes here”

This one's not really a "site:" combo, but it's so useful that I had to include it. Are you suspicious that other sites may be copying your content? Just put any unique phrase in quotes after "intitle:" and you can find copies across the entire web. This is the fastest and cheapest way I've found to find people who have stolen your content. It's also a good way to make sure your article titles are unique.

21. “text goes here” -site:example.com

If you want to get a bit more sophisticated, you can use "-site:" and exclude mentions of copy on any domain (including your own). This can be used with straight text or with "intitle:" (like in #20). Including your own site can be useful, just to get a sense of where your ranking ability stacks up, but subtracting out your site allows you to see only the copies.
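If you have a long list of article titles to check, you can generate the queries from #20 and #21 in one go. A quick, purely illustrative sketch (the domain and title are placeholders):

    # Illustrative only: build the plagiarism spot-check queries for each title.
    def copy_check_queries(titles, own_domain):
        for title in titles:
            phrase = '"' + title + '"'
            yield "intitle:" + phrase                 # combo #20: copies anywhere on the web
            yield phrase + " -site:" + own_domain     # combo #21: copies that aren't on your own site

    for query in copy_check_queries(["how to value a used car"], "example.com"):
        print(query)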

22. site:example.com intext:”text goes here”

The "intext:" operator looks for keywords in the body of the document, but doesn't search the <TITLE> tag. The text could appear in the title, but Google won't look for it there. Oddly, "intext:" will match keywords in the URL (seems like a glitch to me, but I don't make the rules).

23. site:example.com ”text goes here” -intitle:"text goes here"

You might think that #22 and #23 are the same, but there's a subtle difference. If you use "intext:", Google will ignore the <TITLE> tag, but it won't specifically remove anything with "text goes here" in the title. If you specifically want to remove any title mentions in your results, then use "-intitle:".

24. site:example.com filetype:pdf

One of the drawbacks of "inurl:" is that it will match any string in the URL. So, for example, searching on "inurl:pdf", could return a page called "/guide-to-creating-a-great-pdf". By using "filetype:", you can specify that Google only search on the file extension. Google can detect some filetypes (like PDFs) even without a ".pdf" extension, but others (like "html") seem to require a file extension in the indexed document.

25. site:.edu “text goes here”

Finally, you can target just the Top-Level Domain (TLD), by leaving out the root domain. This is more useful for link-building and competitive research than on-page SEO, but it's definitely worth mentioning. One of our community members, Himanshu, has an excellent post on his own blog about using advanced query operators for link-building.

Why No Allintitle: & Allinurl:?

Experienced SEOs may be wondering why I left out the operators "allintitle:" and "allinurl:" – the short answer is that I've found them increasingly unreliable over the past couple of years. Using "intitle:" or "inurl:" with your keywords in quotes is generally more predictable and just as effective, in my opinion.


Putting It All to Work

I want to give you a quick case study to show that these combos aren't just parlor tricks. I once worked with a fairly large site that we thought was hit by Panda. It was an e-commerce site that allowed members to spin off their own stores (think Etsy, but in a much different industry). I discovered something very interesting just by using "site:" combos (all URLs are fictional, to protect the client):

(1) site:example.com = 11M

First, I found that the site had a very large number (11 million) of indexed pages, especially relative to its overall authority. So, I quickly looked at the site architecture and found a number of sub-folders. One of them was the "/stores" sub-folder, which contained all of the member-created stores:

(2) site:example.com/stores = 8.4M

Over 8 million pages in Google's index were coming just from those customer stores, many of which were empty. I was clearly on the right track. Finally, simply by browsing a few of those stores, I noticed that every member-created store had its own internal search filters, all of which used the "?filter" parameter in the URL. So, I narrowed it down a bit more:

(3) site:example.com/stores inurl:filter = 6.7M

Over 60% of the indexed pages for this site were coming from search filters on user-generated content. Obviously, this was just the beginning of my work, but I found a critical issue on a very large site in less than 30 minutes, just by using a few simple query operator combos. It didn't take an 8-hour desktop crawl or millions of rows of Excel data – I just had to use some logic and ask the right questions.
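For what it's worth, the arithmetic behind that 60% figure is just the ratio of the counts at each step (rounded, fictional numbers as above):

    # Relative shares from the (fictional, rounded) case-study counts.
    total   = 11_000_000   # (1) site:example.com
    stores  = 8_400_000    # (2) site:example.com/stores
    filters = 6_700_000    # (3) site:example.com/stores inurl:filter

    print(f"stores as a share of the index:  {stores / total:.0%}")   # ~76%
    print(f"filters as a share of the index: {filters / total:.0%}")  # ~61%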


How Accurate Is Site:?

Historically, some SEOs have complained that the numbers you get from "site:" can vary wildly across time and data centers. Let's cut to the chase: they're absolutely right. You shouldn't take any single number you get back as absolute truth. I ran an experiment recently to put this to the test. Every 10 minutes for 24 hours, I automatically queried the following:

  1. site:seomoz.org
  2. site:seomoz.org/blog
  3. site:seomoz.org/blog intitle:spam

Even using a fixed IP address (single data center, presumably), the results varied quite a bit, especially for the broad queries. The range for each of the "site:" combos across 24 hours (144 measurements) was as follows:

  1. 67,700 – 114,000
  2. 8,590 – 8,620
  3. 40 – 40

Across two sets of IPs (unique C-blocks), the range was even larger (see the "/blog" data):

  1. 67,700 – 114,000
  2. 4,580 – 8,620
  3. 40 – 40

Does that mean that "site:" is useless? No, not at all. You just have to be careful. Sometimes, you don't even need the exact count – you're just interested in finding examples of URLs that match the pattern in question. Even if you need a count, the key is to drill down. The narrowest range in the experiment was completely consistent across 24 hours and both data centers. The more you drill down, the better off you are.

You can also use relative numbers. In my example above, it didn't really matter if the 11M total indexed page count was accurate. What mattered was that I was able to isolate a large section of the index based on one common piece of site architecture. Presumably, the margin of error for each of those measurements was similar – I was only interested in the relative percentages at each step. When in doubt, take more than one measurement.
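If you do take repeated measurements, a tiny script can reduce them to something usable. This is only a sketch (the readings are made up, and how you collect them is up to you – Google doesn't offer an official API for raw "site:" counts):

    import statistics

    def summarize(counts):
        """Collapse repeated site: counts into a range and a median."""
        return {"min": min(counts),
                "max": max(counts),
                "median": statistics.median(counts)}

    # e.g. a handful of readings for one query taken over a day (made-up numbers)
    print(summarize([8590, 8600, 8620, 8610, 8590]))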

Keep in mind that this problem isn't unique to the "site:" operator – all search result counts on Google are estimates, especially the larger numbers. Matt Cutts discussed this in a recent video, along with how you can use the page 2 count to sometimes reduce the margin of error:


The True Test of An SEO

If you run enough "site:" combos often enough, even by hand, you may eventually be greeted with this:

Google Captcha

If you managed to trigger a CAPTCHA without using automation, then congratulations, my friend! You're a real SEO now. Enjoy your new tools, and try not to hurt anyone.



SEOmoz Daily SEO Blog

Getting Site Architecture Right


There are many ways to organize pages on a site. Unfortunately, some common techniques of organizing information can also harm your SEO strategy.

Sites organized by a hierarchy determined without reference to SEO might not be ideal, because such an architecture is unlikely to emphasize links to the information searchers find most relevant. An example would be burying high-value keyword pages deep within a site's structure, as opposed to near the top, simply because those pages don't fit easily within a "home", "about us", "contact" hierarchy.

In this article, we’ll look at ways to align your site architecture with search visitor demand.

Start By Building A Lexicon

Optimal site architecture for SEO is architecture based around language visitors use. Begin with keyword research.

Before running a keyword mining tool, make a list of the top ten competitor sites that are currently ranking well in your niche and evaluate them in terms of language. What phrases are common? What questions are posed? What answers are given, and how are the answers phrased? What phrases/topics are given the most weighting? What phrases/topics are given the least weighting?

You’ll start to notice patterns, but for more detailed analysis, dump the phrases and concepts into a spreadsheet, which will help you determine frequency.
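If a spreadsheet feels clumsy, a short script can do the frequency count for you. A minimal sketch, assuming you've already collected the competitor page text into plain strings:

    from collections import Counter

    def phrase_frequency(pages, phrases):
        """Count how often each candidate phrase appears across the collected page text."""
        counts = Counter()
        for text in pages:
            lowered = text.lower()
            for phrase in phrases:
                counts[phrase] += lowered.count(phrase.lower())
        return counts.most_common()

    pages = ["Health benefits of oranges...", "Easy recipes using oranges and other citrus."]
    print(phrase_frequency(pages, ["health benefits", "recipes using oranges", "orange juice"]))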

Once you’ve discovered key concepts, phrases and themes, run them through a keyword research tool to find synonyms and the related concepts your competitors may have missed.

One useful, free tool that can group keyword concepts is the Google AdWords Editor. Use the grouper function – described in "How To Organize Keywords" – with the "generate common terms" option to automatically create keyword groupings.

Another is the Google Contextual Targeting Tool.

Look at your own site logs for past search activity. Trawl through related news sites, Facebook groups, industry publications and forums. Build up a lexicon of phrases that your target visitors use.

Then use visitor language as the basis of your site hierarchy.

Site Structure Based On Visitor Language

Group the main concepts and keywords into thematic units.

For example, a site about fruit might be broken down into key thematic units such as “apple”, “pear”, “orange”, “banana” and so on.

Link each thematic unit down to sub-themes, e.g. for "oranges", the next level could include links to pages such as "health benefits of oranges", "recipes using oranges", etc., depending on the specific terms you're targeting. In this way, you integrate keyword terms with your site architecture.
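As a purely illustrative sketch of that structure (slugs invented for the fruit example), the hierarchy might be modelled like this before it ever touches a CMS:

    # Map each thematic unit to its keyword-driven sub-theme pages.
    themes = {
        "oranges": ["health-benefits-of-oranges", "recipes-using-oranges"],
        "apples":  ["apple-varieties", "apple-nutrition"],
    }

    for theme, subpages in themes.items():
        print(f"/{theme}/")
        for slug in subpages:
            print(f"  /{theme}/{slug}/")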

Here’s an example in the wild:

The product listing by category navigation down the left-hand side is likely based on keywords. If we click on, say, the “Medical Liability Insurance” link, we see a group of keyword-loaded navigation links that relate specifically to that category.

Evidence Based Navigation

A site might be about "cape cod real estate". If I run this term through a keyword research tool, in this case Google's keyword tool, a few conceptual patterns present themselves: people search mainly by either geographic location (Edgartown, Provincetown, Chatham, etc.) or accommodation type (rentals, commercial, waterfront, etc.).

Makes sense, of course.

But notice what isn’t there?

For one thing, real estate searches by price. Yet, some real estate sites give away valuable navigation linkage to a price-based navigation hierarchy.

This is not to say a search function ordered by house value isn't important, but ordering site information by house value isn't necessarily a good basis for SEO-friendly site architecture. This functionality could be integrated into a search tool, instead.

A good idea, in terms of aligning site architecture with SEO imperatives, would be to organise such a site by geographic location and/or accommodation type, as this matches the interests of search visitors. The site is made more relevant to search visitors than would otherwise be the case.

Integrate Site Navigation Everywhere

Site navigation typically involves concepts such as “home”, “about”, “contact”, “products” i.e. a few high-level tabs or buttons that separate information by function.

There's nothing wrong with this approach, but the navigation concept for SEO purposes can be significantly widened by playing to the web's core strengths. Tim Berners-Lee placed links at the heart of the web, as links were the means to navigate from one related document to another. Links are still the web's most common navigational tool.

“Navigational” links should appear throughout your copy. If people are reading your copy, and the topic is not quite what they want, they will either click back, or – if you’ve been paying close attention to previous visitor behaviour – will click on a link within your copy to another area of your site.

The body text on every page on your site is an opportunity to integrate specific, keyword-loaded navigation. As a bonus, this may encourage higher levels of click-through (as opposed to click-back), pass link juice to sub-pages, and ensure no page on your site is orphaned.

Using Site Architecture To Defeat Panda & Penguin

These two animals have a world of connotations, many of them unpleasant.

The Panda update was partly focused on user experience. Google is likely using interaction metrics, and if Google isn't seeing what it deems to be positive visitor interaction, then your pages, or your site, will likely take a hit.

What metrics are Google likely to be looking at? Bounce backs, for one. This is why relevance is critical. The more you know about your customers, and the more relevant link options you can give them to click deeper into your site, rather than click-back to the search results, the more likely you are to avoid being Panda-ized.

If you’ve got pages in your hierarchy that users don’t consider to be particularly relevant, either beef them up or remove them.

The Penguin update was largely driven by anchor text. If you have lots of similar anchor text keywords pointing to one page, Penguin is likely to cause you grief. This can happen even if you're mixing up keywords, e.g. "cape cod houses", "cape cod real estate", "cape cod accommodation". That level of keyword diversity may have been acceptable in the past, but it's not now.

Make links specific, and point them to specific, unique pages. Get rid of duplicate or near-duplicate pages. Each page should be unique, not just in terms of keywords used, but in terms of concept.

In a post-Panda/Penguin world, webmasters must have razor-sharp focus on what information searchers find most relevant. Being close, but not quite what the visitor wanted, is an invitation for Google to sink you.

Build relevance into your information architecture.


SEO Book.com

How to Free Your E-Commerce Site from Google’s Panda


On Feb. 25, 2011, Google released Panda to wreak havoc on the web. While it may have been designed to take out content farms, it also took out scores of quality e-commerce sites. What do content farms and e-commerce sites have in common? Lots of pages. Many with zero or very few links. And on e-commerce sites with hundreds or thousands of products, the product pages may have a low quantity of content, making them appear as duplicate, low quality, or shallow to the Panda, thus a target for massive devaluation.

My e-commerce site was hit by Panda, causing a 60% drop in traffic overnight. But I was able to escape after many months of testing content and design changes. In this post, I’ll explain how we beat the Panda, and what you can do to get your site out if you’ve been hit.

The key to freeing your e-commerce site from Panda lies at the bottom of a post Google provided as guidance to Pandalized sites:

One other specific piece of guidance we’ve offered is that low-quality content on some parts of a website can impact the whole site’s rankings, and thus removing low quality pages, merging or improving the content of individual shallow pages into more useful pages, or moving low quality pages to a different domain could eventually help the rankings of your higher-quality content.

Panda doesn’t like what it thinks are “low quality” pages, and that includes “shallow pages”. Many larger e-commerce sites, and likely all of those that were hit by Panda, have a high number of product pages with either duplicate bits of descriptions or short descriptions, leading to the shallow pages label. In order to escape from the Panda devaluation, you’ll need to do something about that. Here are a few possible solutions:

Adding Content To Product Pages

If your site has a relatively small number of products, or if each product is unique enough to support entirely different descriptions and information, you may be able to thicken up the pages with unique, useful information. Product reviews can also serve the same purpose, but if your site is already hit by Panda you may not have the customers to leave enough reviews to make a difference. Additionally, some product types are such that customers are unlikely to leave reviews.

If you can add unique and useful information to each of your product pages, you should do so both to satisfy the Panda and your customers. It’s a win-win.

Using Variations To Decrease Product Pages

Some e-commerce sites have large numbers of products with slight variations. For example, if you’re selling t-shirts you may have one design in 5 different sizes and 10 different colors. If you’ve got 20 designs, you’ve got 1,000 unique products. However, it would be impossible to write 1,000 unique descriptions. At best, you’ll be able to write one for each design, or a total of 20. If your e-commerce site is set up so that each of the product variations has a single page, Panda isn’t going to like that. You’ve either got near 1,000 pages that look like duplicates, or you’ve got near 1,000 pages that look VERY shallow.

Many shopping carts allow for products to have variations, such that in the above situation you can have 20 product pages where a user can select size and color variations for each design. Switching to such a structure will probably cause the Panda to leave you alone and make shopping easier for your customers.
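The underlying data model is simple enough to sketch (names and numbers are illustrative, following the t-shirt example): each design keeps a single indexable URL, and the size/colour combinations become selectable options on that one page rather than separate pages.

    from itertools import product

    class ProductDesign:
        """One indexable URL per design; size/colour combos live on that single page."""
        def __init__(self, slug, sizes, colors):
            self.slug = slug
            self.variations = list(product(sizes, colors))

    design = ProductDesign("retro-logo-tee",
                           sizes=["S", "M", "L", "XL", "XXL"],
                           colors=["red", "blue", "black", "white", "grey",
                                   "green", "navy", "pink", "yellow", "purple"])
    print(f"/t-shirts/{design.slug}/ carries {len(design.variations)} variations on one page")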

Removing Poor Performing Products

If your products aren’t sufficiently unique to add substantial content to each one, and they also don’t lend themselves to consolidation through selectable variations, you might consider deleting any that haven’t sold well historically. Panda doesn’t like too many pages. So if you’ve got pages that have never produced income, it’s time to remove them from your site.

Getting Rid of All Product Pages

This is a bold step, but the one we were forced to take in order to recover. A great many of our products are very similar. They're variations of each other. But because of the limitations of our shopping cart, combined with shipping issues (each variation had different shipping costs that couldn't be programmed into the variations), it was the only viable choice we were left with.

In this option, you redesign your site so that products displayed on category pages are no longer clickable, removing links to all product pages. The information that was displayed on product pages gets moved to your category pages. Not only does this eliminate your product pages, which make up the vast majority of your site, but it also adds content to your category pages. Rather than having an “add to cart” or “buy now” button on the product page, it’s integrated into the category page right next to the product.

Making this move reduced our page count by nearly 90%. Our category pages became thicker, and we no longer had any shallow pages. A side benefit of this method is that customers have to make fewer clicks to purchase a product. And if your customers tend to purchase multiple products with each order, they avoid having to go from category page to product page, back to the category page, and into another product page. They can simply purchase a number of products with single clicks.

Noindexing Product Pages

If you do get rid of all links to your product pages but your cart is still generating them, you’ll want to add a “noindex, follow” tag to each of them. This can also be a solution for e-commerce sites where all traffic enters on category level pages rather than product pages. If you know your customers are searching for phrases that you target on your category pages, and not specifically searching for the products you sell, you can simply noindex all of your product pages with no loss in traffic.

If all of your products are in a specific folder, I’d recommend also disallowing that folder from Googlebot in your robots.txt file, and filing a removal request in Google Webmaster Tools, in order to make sure the pages are taken out of the index.
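Concretely, the pieces look something like this (the /products/ folder name is just an assumption about your cart's URL structure):

    # The meta robots tag to emit in the <head> of every product template...
    NOINDEX_TAG = '<meta name="robots" content="noindex, follow">'

    # ...and the matching robots.txt rule if all products sit under one folder.
    ROBOTS_TXT = ("User-agent: Googlebot\n"
                  "Disallow: /products/\n")

    print(NOINDEX_TAG)
    print(ROBOTS_TXT)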

Other Considerations: Pagination & Search Results Pages

In addition to issues with singular product pages, your e-commerce site may have duplicate content issues or a very large number of similar pages in the index due to your on-site search and sorting features. Googlebot can fill in your search form and index your search results pages, potentially leading to thousands of similar pages in the index. Make sure your search results pages have a meta robots "noindex, follow" tag or a rel="canonical" tag to take care of this. Similarly, if your product pages have a variety of sorting options (price, best selling, etc.), you should make sure the rel="canonical" tag points to the default page as the canonical version. Otherwise, each product page may exist in Google's index in each variation.
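One way to generate that canonical tag is to strip the sort parameters from the requested URL. A minimal sketch, assuming the parameter names below match whatever your cart actually uses:

    from urllib.parse import urlparse, parse_qsl, urlencode, urlunparse

    SORT_PARAMS = {"sort", "order", "view"}   # assumed names; substitute your own

    def canonical_link(url):
        """Drop sort/ordering parameters so every sorted variant points to the default page."""
        parts = urlparse(url)
        kept = [(k, v) for k, v in parse_qsl(parts.query) if k not in SORT_PARAMS]
        clean = urlunparse(parts._replace(query=urlencode(kept)))
        return f'<link rel="canonical" href="{clean}">'

    print(canonical_link("http://example.com/widgets?page=2&sort=price"))
    # -> <link rel="canonical" href="http://example.com/widgets?page=2">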


Maxmoritz, a long time member of our SEO Community, has been working in SEO full time since 2005. He runs a variety of sites, including Hungry Piranha, where he blogs regularly.


SEO Book.com

The ‘Scam’ Site That Never Launched


(A case study in being PRE negatively seo’ed)

Well, it has been a fun year in search. Having had various sites that I thought were quality completely burnt by Google since they started with the Penguins and Pandas and other penalties, I thought I would try something that I KNEW Google would love… something, dare I say, that would be "bulletproof." Something I could go to bed knowing would be there the next day in Google's loving arms. Something I could focus on and be proud of.

Enter www.buymycar.com, an idea I had wanted to do for some time, where people list a car and it gets sent to a network of dealers who bid on it from a secure area. A simple idea but FAR from simple to implement.

Notes I made prior to launch to please Google and to give it a fighting chance were:

  1. To have an actual service and not to be an affiliate. Google crushed my affiliate sites and we know they are not fond of them as they want to be the only affiliate I think.
  2. To make sure the content was of a high quality. I took this so seriously that we actually made a point of linking out to direct competition where it helped to do so. This was almost physically painful to do! But I thought I would start as I meant to go on. I remember paying the content guy that helps me triple his normal fee to go above and beyond normal research for the articles in our “sell my car” and “value my car” sections.
  3. To make the site socially likeable. I wanted something that people would share and as such to sacrifice profits in the short term to get it established.
  4. To give Google the things it loves on-site. Speed testing, webmaster tools error checking (even got a little well done from Google for having no errors, bless), user testing, sitemaps for big G to find our content more easily, fast hosting, letting it have full access with analytics…
  5. TO NEVER, EVER UNDER ANY CIRCUMSTANCES PAY FOR A LINK. Yes, I figured I would put all the investment into the site and content this time. If it went how I had hoped, perhaps I could find the holy grail where sites link to us willingly without a financial incentive! A grail I had been chasing for some years. Could people really link out without being paid? I had once heard a rumour it was possible and I wanted to investigate it…

Satisfied I had ticked all the boxes from hours of Matt Cutts videos and Google guideline documents, I went to work and stopped SEO on all my smaller sites that were out of favour. I was enjoying building what I had hoped would be a useful site and kicked myself for not having done so sooner. I also thanked Google mentally for being smart enough now to reward better sites.

Fast forward through 4 months of testing and re-testing and signing up car dealers across the country, and I decided to do a cursory check to see if anyone had liked what I was building and linked to it. I put my site into ahrefs.com and, to my surprise, 13,208 sites had!! What was also nice was that all of them had used the anchor text “Buy My Car Scam” and had been so kind as to give me worldwide exposure on .ru, .br and .fr sites in blog comments, amongst others.

In seriousness, this was absolutely devastating to see.

A worried competitor had obviously decided I was a threat and set out to nip my site in the bud with Google, attacking it before it had even fully started. The live launch date was scheduled for January 7th, 2013! I was aware of negative SEO from other sites I had lost, but not in advance of actually having any traffic or rankings. Now I had death by Google to look forward to before the site had any rankings, and, on top of that, my site was being cited as a scam across the Internet before it even launched!

My options were immediately as follows:

  1. Go back and nuke the likely candidates in Google who had sabotaged me. Not really an option as I think it is the lowest of the low.
  2. Start trying to contact 13,000+ link owners to ask for the links to be removed. With me already heavily invested in this project and a deadline to reach, this was not an option. Also, XRumer, ScrapeBox or other automated tools could send another 13,000 just as easily within hours for me to deal with.
  3. Disavow the links with Google. That is, download all the links, disavow them all, and hope that Google would show me mercy in the few months Matt Cutts said it takes to get them all processed.
  4. Give up the project. Radical as this may sound, it did go through my mind, as organic traffic was a big part of my business plan. Thankfully I was talked out of it, as it would be “letting them win.”

I opted for number 3, the disavow method, but wondered what would happen if I kept being sent tens of thousands more links, and how a new site can actually have any protection from this. To set a site back months in its early stages is devastating to a new online business. To be in a climate where it is done prior to launch is ridiculous.
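For anyone facing the same choice, the disavow file itself is just a plain-text file of domain: lines (plus # comments). A small sketch that turns an exported list of attacking domains into one (the domains are fictional):

    # Build a disavow.txt from an exported list of attacking domains.
    spam_domains = ["spam-comments.example.ru", "cheap-links.example.br", "blog-spam.example.fr"]

    with open("disavow.txt", "w") as f:
        f.write("# Links from a negative SEO attack, anchor text 'Buy My Car Scam'\n")
        for domain in sorted(set(spam_domains)):
            f.write("domain:" + domain + "\n")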

Had I fired back at future competitors, as many suggested I do, there would be a knock-on effect, and it makes me wonder whether, in the months to come, everyone will be doing this to each other as a matter of routine. Having been in SEO for years, I always knew it was possible to sabotage sites, but I never thought it would become so common, let alone happen before a site even ranked!


Robert Prime is a self employed web developer based in East Sussex, England. You can follow him on Twitter at @RobertPrime.


SEO Book.com