
[GUIDE] Scrapebox intermediate tut [GUIDE]

Discussion in 'Black Hat SEO Tools' started by Wolfpack, Sep 19, 2011.

  1. Wolfpack

    Wolfpack Junior Member

    Joined:
    Jul 13, 2011
    Messages:
    166
    Likes Received:
    346
    This is a followup to the (so far) pretty popular Scrapebox newbie tut, and is the more advanced version. This isn't so much a guide to specific methods of using Scrapebox as a guide to getting the most out of Scrapebox as a whole package; methods can be improved with a little common sense and these more advanced techniques. First of all, I'll start with something I mentioned briefly in the newbie tut but did not explain: how to create good comments. This guide is only classed as intermediate because I am not a long term user of Scrapebox - I am not an advanced user. But it does shine a light on a lot of my knowledge, which I think is quite extensive.

    Comments:
    The most important feature for making comments work for you is spintax. It lets us use one spun comment to produce hundreds or thousands of different comments. Spintax is written as follows:

    Code:
    I {like|love} your {site|website|blog}, what {font|background image|Wordpress theme} do you use?
    
    As you can see, certain words are surrounded by {curly brackets} and are separated|by|these|pipes. Scrapebox randomly chooses one of these words and uses it in the comment. This means one comment template will produce many different outputs; for example:

    Code:
    I like your blog, what background image do you use?
    
    could be a possible comment from the above spintax.

    One word in the curly brackets is randomly selected, and a comment is spun. This is the most vital part of making good, high success rate comments.
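    Scrapebox does the spinning for you, but if you want to test a spintax comment outside the tool, the idea is simple enough to sketch in a few lines of Python (this is just the concept, not Scrapebox's actual code):

    Code:
    import random
    import re

    def spin(template):
        """Replace each {a|b|c} group with one randomly chosen option."""
        brace_group = re.compile(r"\{([^{}]+)\}")
        while True:
            match = brace_group.search(template)
            if match is None:
                return template
            choice = random.choice(match.group(1).split("|"))
            template = template[:match.start()] + choice + template[match.end():]

    print(spin("I {like|love} your {site|website|blog}, "
               "what {font|background image|Wordpress theme} do you use?"))
    
    Because it always substitutes the innermost group first, this sketch also handles nested spintax like {good {morning|evening}|hello}. But there's more!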

    Scrapebox is not a stupid tool; it can and will work in real time with the page it is commenting on to create a unique comment. This can cleverly make the comment seem like it could only have been written by a person! A number of operators take information from the website and use it within the comment. It's easier to understand this by reading on than by explanation. Here's the list of the comment operators you can use with Scrapebox:

    • %BLOGTITLE% - Replaced with the page <title> of the blog you're commenting on.
    • %WEBSITE% - Replaced by one of the websites you have loaded in websites.txt (great for adding your links directly in your comments).
    • %BLOGURL% - Replaced by the URL (domain name) of the blog you're commenting on.
    • %NAME% - Replaced by a name from names.txt or, if you have anchors set up in websites.txt, an anchor from there.
    • %EMAIL% - Replaced by one of the emails in your emails.txt file.
    (source of this list - http://www.scrapeboxhelp.com/new-prespun-comment-list-available-you-probably-havent-seen-this-before)

    But these will not guarantee your comment looks genuine! They must be used effectively and creatively. Writing a comment with these operators like this:

    Code:
    I love %BLOGTITLE%, it's so great!
    
    Will not be effective! If the title of the page is something like "~~~ John Donahue's Magic blog! &&& Best blog in the Universe &&& - Blog Post #223 ~~~", the comment will seem strange and noticeably automated:

    Code:
    I love ~~~ John Donahue's Magic blog! &&& Best blog in the Universe &&& - Blog Post #223 ~~~, it's so great!
    
    Instead, use them in a way that sounds intelligent and contributes to the topic.

    For example, I could use it in the following way, which makes me sound much more human, and more like a genuine query than a spammy, useless post:

    Code:
    I noticed the title of your blog (%BLOGTITLE%) doesn't seem to be very well optimized for what the blog post is about. Hit me back on %EMAIL% if you need any help with your on page SEO! Would be glad to help such a great blogger :)
    
    As you can see, I have used the blog's title, which makes me seem like a real person, and I have offered my email. But those are not the most important things - I come across as very human in the post, and I offer a helping hand, which seems beneficial to the author. Even if the blog's title IS optimized for the post, the comment shows some uncertainty, so my claim could be taken as simple ignorance of SEO rather than automation. The comment uses page-specific information and seems human and genuine. These things are vital for successful comments; add spintax, and this comment would probably have a very high success rate on non-AA blogs, snagging us even more backlinks.
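    To make the mechanics concrete, here's a rough Python sketch of the token substitution Scrapebox performs at posting time - the title and email values below are made-up placeholders, and in practice you'd run the result through the spinner idea from earlier too:

    Code:
    def fill_tokens(comment, blog_title, email):
        """Stand-in for Scrapebox's %TOKEN% substitution at posting time."""
        return (comment.replace("%BLOGTITLE%", blog_title)
                       .replace("%EMAIL%", email))

    template = ("I noticed the title of your blog (%BLOGTITLE%) doesn't seem to be "
                "very well optimized for what the blog post is about. Hit me back "
                "on %EMAIL% if you need any help with your on page SEO!")
    print(fill_tokens(template, "John's Magic Blog - Post #223", "me@example.com"))
    
    Let's move on.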

    More on footprints:
    Footprints are the shizzle. I'm not gonna lie, this is where the magic starts: they are your initial way of filtering URLs, so they catch most of the shit you don't want.

    So it is important you know how to use them to your full advantage. One of the first things I realised when using footprints is that they run through Google and the other search engines, and can therefore use their commands. Here are some you should know how to use, plus some advanced ones:

    • filetype: - this allows you to filter by file type. The way I most use it is to find XML sitemaps: typing filetype:xml makes sure all of the returned results are .xml files. I prepend each of my AA blog sites with "site:", then use filetype:xml as my custom footprint to scrape any sitemaps of my AA blogs - their sitemaps will likely contain more AA blog pages. Talking of "site:"..
    • site: - this restricts results to the stated domain. For instance, "site:google.com" will only return pages from Google.com, eg. "google.com/maps" and "google.com/images".
    • inurl:, intitle: - self explanatory - inurl: returns only documents that have that text string in their URL (eg. inurl:"bunny-treats" could return "bunnies.net/bunny-treats" and "bunny-treats.com" as possible results). intitle: does the same, but for the title - who'd have thought, huh.
    • An extensive list is available here (http://www.googleguide.com/advanced_operators.html), but these operators should get you through most tasks.
    However, the fun doesn't stop there. There are many more operators to master! Use these to really fine-tune your searching capabilities:

    • + and - signs - a + makes sure a word must appear in the results exactly as typed (eg. +bunnies means documents must contain the word bunnies), and a - means pages must definitely not contain that word or phrase (eg. -"justin bieber" will return results which definitely do not include the exact phrase justin bieber - who wants him anyway).
    • AND and OR - search terms joined by AND must both appear (so +carpets AND +removal will return pages that contain the word carpets and the word removal). OR does.. well, yeah, you get it. They must be in CAPS LOCK.
    • .. - here's one many people do not know: the double dot. When placed between two numbers, any integer between them is matched, eg. the footprint "2..50 comments" will only return posts containing the phrase x comments, where x is between 2 and 50. HINT HINT there btw, it's useful ;)
    • ~ - the wavy thing produces synonyms. The search term ~courage will find not just websites containing the word courage, but also its synonyms (eg. valour, bravery, etc).
    If you use these intelligently, you can make footprints that are near masterpieces. For instance:

    Code:
    "2..50 comments" -"comments closed" "best buy" AND "coupons"
    
    This footprint finds posts that have between 2 and 50 comments (a low number of comments = low OBL = better link juice), that do not contain the phrase "comments closed" (ie. they are more likely to have open comments), and that contain both "best buy" and "coupons". So this is very targeted towards Best Buy coupon blogs with open comments and fewer than 50 existing comments. Impressive!

    This is not as far as footprints go, though. They have so many uses it's hard to list them all, but I do have a favourite that I'm surprised very few people mention, and it combines the power of relevancy with highly successful comments. By searching for websites about a certain topic you have written a comment for - let's say Warren Buffett - you can easily produce relevant comments. So, what defines a page as being about a certain topic or person? The title, of course, and I'm sure you can see where this is going:
    intitle:"Warren Buffett"
    And that's it. Honestly. Now save this, merge it with some business related keywords and any other buffer footprints you have (I would use "powered by Wordpress", "Powered by Blogengine" etc. as examples), et voila. You can tailor a comment around Warren Buffett; for example, "Warren Buffett is the most fantastic businessman of the last 100 years. Great article!". It's relevant and precisely tailored to the pages you have scraped. And that means high approval rates.

    How long did that take to think up? 5 seconds? The grand entrance of creative thinking, huh. If you need a footprint, this thread lists all the possible wildcards, parameters, etc. The only thing that prevents you from getting whatever you want with Scrapebox is lack of imagination. If you can't scrape it, you're not thinking hard enough.
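    By the way, if you ever want to script the keyword/footprint merge rather than do it in the UI, the cross-join is trivial. A quick Python sketch (the footprints, keywords and output file name are all just examples):

    Code:
    # Cross-merge buffer footprints with keywords - the same idea as
    # Scrapebox's merge feature. All values here are examples.
    footprints = ['intitle:"Warren Buffett" "powered by Wordpress"',
                  'intitle:"Warren Buffett" "Powered by Blogengine"']
    keywords = ["investing", "stock tips", "berkshire hathaway"]

    with open("merged_footprints.txt", "w") as out:
        for footprint in footprints:
            for keyword in keywords:
                out.write(f'{footprint} "{keyword}"\n')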

    Pinging and improving AA lists:
    Pinging is great because, let's face it, AA blogs are probably not good examples of well-kept websites. They are openly spammed, have thousands of OBL and usually very few IBL. So it's predictable that many are not indexed by Google or are not frequently maintained (in the SEO sense), and so their value will not be consistent. Scrapebox is here to help.

    Pinging basically tells search engine spiders that a site has been updated - a cry for the spiders to come creeping. Therefore, it's a good idea to do this after you have made the backlinks; there's not much point telling the search engine that the page has been updated before you've posted your comments. What would be the value of that to you?

    Anyway, it's very basic and very similar to link checking: click the "Ping mode" radio button, and ping. It can take a couple of days to show results though, as the search engines can take a while to crawl pages. This is also useful for pinging your own pages.
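    For the curious: under the hood, a blog ping is just a weblogUpdates.ping XML-RPC call, which Scrapebox fires off for you. Here's a minimal Python sketch of the call itself, with Ping-O-Matic as an example endpoint and made-up page details:

    Code:
    import xmlrpc.client

    # weblogUpdates.ping(name, url) tells the ping service a page has changed,
    # inviting the spiders to come creeping.
    server = xmlrpc.client.ServerProxy("http://rpc.pingomatic.com/")
    response = server.weblogUpdates.ping("Some blog post",
                                         "http://example.com/page-i-commented-on")
    print(response)  # usually a struct with 'flerror' and 'message' fields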

    Grooming your list:
    After you have finally got yourself a nice AA list, and you've posted and pinged for one website, you will more than likely keep the list handy for other uses. But remember, a lot can happen online in a short period of time, and websites that are there one day may be gone the next. Therefore, before you start a new campaign, or before you sell that 100K AA list you found a couple of months back, it's a good idea to check and groom the list. These grooming techniques are also good for keeping any AA list up to scratch.


    The first useful tool is the Alive checker addon. Does what it says on the tin, very easy to use.

    Now, you probably know this, but Google are very fucking smart, and they don't favour quick linkbuilding. They also do not seem to appreciate a shit torrent of spammed-to-hell blogs backlinking your page - it seems suspicious to them. So it's good to quality-test your backlinks before you blast websites that you value. The two best ways of doing this are to search for blogs that do not have many comments, and to only use high PR blogs.

    There are two ways to find blogs with a low number of comments. First, there's the footprint that uses the .. operator (eg. "2..50 comments"). Second, there's the blog size checker: going to Settings > Slow and Manual Blog Limits lets you cap the allowed page size of URLs. It may say it only applies to slow and manual posting, but I have definitely noticed a difference with this on fast posting too: the smaller the stated size, the fewer comments on the blogs I post to (but also the fewer blogs that are successfully posted to, as a side effect of filtering out spammed blogs).

    Finding high PR blogs works very much the way you'd expect. Before I post to the URLs, I check their PageRank, and I remove all of those that do not have a PR of 1 or higher. (HINT: a very nifty way of doing this is to sort the URLs by PR, then use a smart keyboard shortcut I was taught in school - pressing CTRL + SHIFT + END highlights from the current selection down to the bottom of the list. So select the first PR0 blog and press those hotkeys - you have now selected all the non-PR blogs; right-click and remove.) I then post, scrape other internal pages of these blogs, and usually drag some more PR pages out of that.
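    The same cut can be made outside the UI if you export your list together with its PageRank values. A sketch, assuming a tab-separated "url<TAB>PR" file - adjust the parsing to whatever your export actually looks like:

    Code:
    # Keep only URLs with PR >= 1 from an exported "url<TAB>pagerank" file.
    # The file names and format are assumptions, not Scrapebox's fixed output.
    keep = []
    with open("urls_with_pr.txt") as f:
        for line in f:
            parts = line.strip().rsplit("\t", 1)
            if len(parts) == 2 and parts[1].isdigit() and int(parts[1]) >= 1:
                keep.append(parts[0])

    with open("high_pr_urls.txt", "w") as f:
        f.write("\n".join(keep))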

    These two little helpers produce backlinks with high link juice. Now, many people will tell you that checking for nofollow and d0f0ll0w blogs is important, but I believe the opposite: it looks unnatural to have lots of d0f0ll0w backlinks, so I accept any type of backlink without checking. However, some wish to check, and there's an addon for it. Look into it, but do not blast only d0f0ll0w blogs, please. Google will notice, and they will penalise you for it.

    Other Scrapebox uses:
    This is really up to you. Scrapebox can be used for anything that involves scraping URLs, and addons have taken it a bit further (not too far off the course of what the program does, but to some extent). By no means will this be exhaustive - there is no way I could possibly list the endless uses of this tool; it's designed to be too versatile for that. To give you an idea, no less than 3 days ago I used the commenter to spin different descriptions for Fiverr gigs across the multiple accounts I have. It's all up to imagination.

    However, I will give you a couple of examples to show the extensive abilities of the tool. Do not take this as a thorough list, though; Google for ideas, search this thread, read read read to really get ideas that are unique for Scrapebox - everybody thinks in different ways, everybody comes up with different ideas. So steal them! ;)

    TDNAM and Fake PR checker:
    Not a new concept: the TDNAM checker checks expiring domains. These could be aged and with PR, and if you snatch them up (sometimes for very little), you have yourself a nice website. The Fake PR checker can verify that they are not faking the PR they claim, just as a security measure. Grabbing bargain PR'd domains is not something I expected the tool to be capable of.

    Whois addon:
    The Whois addon lets you pull the Whois information from harvested domains. Can you see where this is going? Oh yes, data mining webmasters. Finding their Whois details allows you to do much more blackhat things, such as attacking the competition with "cease and desist" type letters and claims (not something I promote btw, it's pretty dickish, but hey, it's done), or offering them services and advertising. Again, pretty neat! And because you scrape the domains, you can customise whose Whois you are data mining by targeting your audience with custom footprints.

    Data mining:
    This builds on the above in a more sophisticated way, using another tool. Let's take, for example, something somebody asked before the second edit of this post: how to data mine phone numbers. Well, it's pretty damn simple, to be frank. Let's say I want UK numbers starting with the 01895 dialling code (on the western edge of London); all UK numbers have 11 digits. Now, this takes a bit of thinking outside the box, but here's a little idea from me. From this thread you know what the .. (double dot) operator does: it matches integers between two values. See where this is going yet? Bear with me..

    So we know the numbers must start with "01895", followed by 6 more digits. Therefore the footprint I would use is "1895000000..1895999999" +london; this matches any value between 1,895,000,000 and 1,895,999,999, and since the leading zero gets dropped when the number is read as an integer, that covers every number written as 01895 followed by six digits. It would be better, however, to find a good number directory and scrape all of its pages, as phone numbers found in the wild can easily be random duplicates.

    The basic gist of all this is that it relies heavily on building a good custom footprint by finding something generic about the data you're trying to mine. Maybe all websites that sell Vaja iPad leather cases carry the slogan, which is (idk) "The best leather for your iPad" (I'm not going into the slogan business anytime soon), in which case use a custom footprint specifying that pages must contain that text string. You get the idea!

    Then, to data mine, we run the URLs we have through a data mining program (I will not specify one, as I don't want to advertise products I don't know about, but you can PirateBay yourself a (probably) good one for free). Et voila: data mining with help from Scrapebox.
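    To give you an idea of what such a program is doing, here's a bare-bones Python stand-in that pulls 01895 numbers out of your harvested pages. The file name is a placeholder, and a proper data mining tool will do this far more robustly:

    Code:
    import re
    import urllib.request

    # Match 01895 followed by six digits, allowing optional spaces or dashes.
    number_pattern = re.compile(r"01895[ \-]?\d{3}[ \-]?\d{3}")

    found = set()
    with open("harvested_urls.txt") as f:
        for url in (line.strip() for line in f if line.strip()):
            try:
                html = (urllib.request.urlopen(url, timeout=10)
                        .read().decode("utf-8", "ignore"))
            except Exception:
                continue  # dead or slow page - skip it
            found.update(number_pattern.findall(html))

    print("\n".join(sorted(found)))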

    Whew, that took a long time to write. CTRL + A.. delete.. no no, I think I'd break down. Anyway, I hope this has been useful and that you've got the main point of this thread: this is all intermediate knowledge, because the real advanced knowledge comes from using all of it together and creating advanced algorithms (algorithm = a series of procedures or processes) that harvest exactly what you want. That is the key to Scrapebox. Good luck, and if you find out more, please share - and think outside the box!

    ..Well, I thought I was done, but I read one last request for a little more on d0f0ll0w blogs and how to find them. I don't personally do this much, but I remember reading some things about it, and I'm providing this from research I'm doing right now. So..

    D0f0ll0w:
    I'm writing this mostly because I like typing 0s instead of os, and not for your benefit, please remember that. So expect overuse of the word d0f0ll0w.

    Scrapebox searches through the HTML as well as the page text when used in linkchecker mode, so it finds things we don't see. Check the page source of nofollow blogs, and you'll see many have this in their code:

    HTML:
    rel="nofollow"
    or
    rel='nofollow'
    So all we have to do is load that large list of harvested URLs into the link checker and check the web pages for those two strings. We remove the URLs that contain them, and voila, we have d0f0ll0w blogs. Some blogs use other ways of making their links nofollow, so this list will not be 100% d0f0ll0w.
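    If you want to sanity-check the idea outside Scrapebox, the test really is just a string search over the page source. A minimal Python sketch (harvested_urls.txt is a placeholder name, and remember some themes mark links nofollow in other ways):

    Code:
    import urllib.request

    NOFOLLOW_MARKERS = ('rel="nofollow"', "rel='nofollow'")

    def looks_dofollow(url):
        """True if the page source contains neither nofollow string."""
        html = (urllib.request.urlopen(url, timeout=10)
                .read().decode("utf-8", "ignore"))
        return not any(marker in html for marker in NOFOLLOW_MARKERS)

    with open("harvested_urls.txt") as f:
        for url in (line.strip() for line in f if line.strip()):
            try:
                if looks_dofollow(url):
                    print(url)
            except Exception:
                pass  # dead page - skip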

    Another key way to find d0f0ll0w blogs is with your first line of defence against shit you don't want: custom footprints. Many d0f0ll0w Wordpress blogs use plugins like KeywordLuv and CommentLuv, which leave behind traces of evidence. There are many threads about this throughout the forum, and SweetFunny herself has already offered her own ideas, but from what I've Googled, most suggest this as a footprint for KeywordLuv:

    Code:
    "This site uses KeywordLuv. Enter YourName@YourKeywords in the Name field to take advantage."
    
    There's your footprint! Have fun ladies and gents :)
     
    • Thanks Thanks x 107
    Last edited: Sep 19, 2011
  2. Yupefer

    Yupefer Regular Member

    Joined:
    Jul 13, 2011
    Messages:
    274
    Likes Received:
    54
    Occupation:
    Musician
    Location:
    Endless
    Spamcop.net? That's what the newbie tut link leads to
     
    • Thanks Thanks x 2
  3. Wolfpack

    Wolfpack Junior Member

    Joined:
    Jul 13, 2011
    Messages:
    166
    Likes Received:
    346
    No idea what happened there. Thank you for pointing that out!
     
    • Thanks Thanks x 1
  4. Yupefer

    Yupefer Regular Member

    Joined:
    Jul 13, 2011
    Messages:
    274
    Likes Received:
    54
    Occupation:
    Musician
    Location:
    Endless
    Lmao I thought I was mistaken!

    Newbie here, 'bout to check that out, then read this one. Thanks dude :p
     
    • Thanks Thanks x 1
  5. ronstylistic386

    ronstylistic386 Junior Member

    Joined:
    Mar 28, 2011
    Messages:
    198
    Likes Received:
    366
    Home Page:
    Nice share OP.. I will wait for the continuation.. :)

    thanks + rep for you..
     
    • Thanks Thanks x 2
  6. macdonjo3

    macdonjo3 Jr. VIP Jr. VIP Premium Member

    Joined:
    Nov 8, 2009
    Messages:
    5,562
    Likes Received:
    4,317
    Location:
    Toronto
    Home Page:
    You mean a beginner guide?

    Also, make sure to edit your post before it gets locked.

    Macdonjo3
     
    • Thanks Thanks x 1
  7. sanjeev2010

    sanjeev2010 Junior Member

    Joined:
    Oct 11, 2008
    Messages:
    199
    Likes Received:
    88
    Occupation:
    DMPK
    Location:
    New England
    Love to see more on this!
     
  8. sneakfreak28

    sneakfreak28 Junior Member

    Joined:
    Dec 30, 2009
    Messages:
    154
    Likes Received:
    12
    Great read for someone like me who just recently bought Scrapebox and was having a hard time with the other guides. Yours is truly a notch better than all of 'em.
     
    • Thanks Thanks x 1
  9. Bostoncab

    Bostoncab Elite Member

    Joined:
    Dec 31, 2009
    Messages:
    2,255
    Likes Received:
    514
    Occupation:
    pain in the ass cabbie
    Location:
    Boston,Ma.
    Home Page:
    any idea how to use scrapebox to scrape mobile phone numbers?
     
  10. loads16017

    loads16017 Power Member

    Joined:
    Aug 24, 2010
    Messages:
    574
    Likes Received:
    103
    Location:
    In everywhere
    Good work man :rofl:
     
  11. TheMatrix

    TheMatrix BANNED BANNED

    Joined:
    Dec 20, 2008
    Messages:
    3,444
    Likes Received:
    7,279
    I don't think you can do that with SB.

    However, you can scrape pages that contain those phone numbers, and then scrape the numbers out manually or with other software.
     
  12. jvinci

    jvinci Newbie

    Joined:
    Jul 27, 2011
    Messages:
    35
    Likes Received:
    17
    wow, thanks for posting this. definitely learned a lot!
     
  13. Wolfpack

    Wolfpack Junior Member

    Joined:
    Jul 13, 2011
    Messages:
    166
    Likes Received:
    346
    Haha, the cheek! If this were the kind of stuff a beginner could pick up and instantly use, I'd call it a beginner's guide, but I can't see how a beginner would make much use of this thread without knowing the basics of Scrapebox first - hence it not being a beginner's guide. I know it isn't mind-blowing so far, but it's not finished just yet, and it gets much more extensive than this; that's 2 topics covered of about 8 or 9, and there are a lot of areas in Scrapebox which can be dug into far more deeply. If, however, you do know a lot more about Scrapebox than what's provided in the two threads, please feel free to share what you know with us. ;)

    This is something I'm going to deal with in tonight's installment! It's not necessarily phone numbers that we'll be scraping, but rather generic pages that contain a lot of phone numbers - or, for that matter, generic pages containing anything you want to scrape. Scrapebox doesn't actually scrape the phone numbers itself; it only returns URLs. It's very easy to scrape whatever you want with Scrapebox, as long as you remember some pretty obvious key steps and implement them correctly. I will use this as an example to show you how to do something like this :)
     
  14. kaif0346

    kaif0346 Power Member

    Joined:
    Jul 13, 2011
    Messages:
    734
    Likes Received:
    93
    Occupation:
    free lancer, SEO, VA
    Location:
    battle field
    Home Page:
    I have blasted 10k links and 2700+ were successfully submitted, but they are not indexed yet and don't count as backlinks - I checked on iwebtool. What should I do??
     
  15. Drago05

    Drago05 Junior Member

    Joined:
    Oct 31, 2010
    Messages:
    151
    Likes Received:
    10
    Location:
    Europe
    I'd like to see some tutorials on how to find dofollow blogs with Scrapebox.
     
  16. Wolfpack

    Wolfpack Junior Member

    Joined:
    Jul 13, 2011
    Messages:
    166
    Likes Received:
    346
    The guide is finished! Please enjoy and respond critically on improvements; leave constructive comments and ideas. If you see spelling errors, let me know, this was written as I went along!
     
  17. KlaAz0r

    KlaAz0r BANNED BANNED

    Joined:
    Jan 11, 2011
    Messages:
    722
    Likes Received:
    853
    Very very good! This will help and save a lot of SB posts :)
     
  18. takeachance

    takeachance Power Member

    Joined:
    Jul 31, 2009
    Messages:
    557
    Likes Received:
    412
    Location:
    The UK of A
    Sheesh... all I see is nit-picking comments here. FFS, let's rep Wolfpack for this post, because it is, in essence, a superb guide for any SB user. If you read the text well, you will be well on your way to SB mastery - there are some real gems in here for even the most experienced SB'er.

    Thanks Wolfpack, great post, thanks & rep+ given.
     
    • Thanks Thanks x 1
  19. Wolfpack

    Wolfpack Junior Member

    Joined:
    Jul 13, 2011
    Messages:
    166
    Likes Received:
    346
    I'm not sure if you've linkchecked all of your submitted posts when you say 2700+ submitted. Are those 2700 successful posts, or 2700 posts that have been checked with the link checker tool also? Many successfully posted posts are not automatically approved, so your link may not be on the webpage; generally only a small fraction of posted entries are AA.

    Thank you, but I'm sure a lot of the comments I have gotten so far are just requests by people who are looking for specific information, and really really want it. I don't mind those comments because they give me new topics to add in to make the post more complete.

    Thank you so much for the +rep though, I've noticed it fly up! I've read a lot of posts of yours in the past too which have helped me, so I returned the favour also :)
     
  20. clist

    clist Registered Member

    Joined:
    Dec 10, 2009
    Messages:
    63
    Likes Received:
    51

    I was wondering something similar....I have checked and found a lot of links posted after my blasts.

    What would be the appropriate way to ping / get them indexed the fastest?

    B.T.W. @wolfpack Great Guide posts! A lot of what you've posted has confirmed what I already thought and I've learned a lot too!

    Thanks!