
How to scan or discover removed tumblr blogs

Discussion in 'Black Hat SEO' started by templaries, Apr 13, 2014.

  1. templaries

    templaries Newbie

    Joined:
    Feb 24, 2012
    Messages:
    15
    Likes Received:
    0
    Occupation:
    Developer
    Location:
    Spain
    Hello

    I am looking for a method to discover removed tumblr blogs. Here is a basic strategy; I would like to check whether it could be a waste of time.

    - List tumblr blogs for a keyword using Google or Bing.
    - For each tumblr blog:
      - Add to a list the URLs it links to on other tumblr blogs.

    - For each URL in that list:
      - Check whether the URL returns 404 or any other error code, or a message like Blogger's that says you can try to claim the URL again.

    Regards
     
  2. CanadianDollar

    CanadianDollar Registered Member

    Joined:
    Mar 28, 2014
    Messages:
    80
    Likes Received:
    26
    Occupation:
    SEOlogist.
    Location:
    Somewhere within the UK.
    My method -

    Use Ahrefs or Majestic to check for any linked pages on a tumblr subdomain.

    Then take ALL the subdomains you can find (they HAVE to be indexed and have links pointing to them for Ahrefs to find them), and run them all through a broken link checker.
     
  3. templaries

    templaries Newbie

    Joined:
    Feb 24, 2012
    Messages:
    15
    Likes Received:
    0
    Occupation:
    Developer
    Location:
    Spain
    Good idea. But I understand that this method needs a premium account for one of those services. On the other hand, how do you tell Ahrefs that? Do you start with a set of tumblr blogs?

    I would like to know whether there are tumblr blogs that contain links to removed tumblr blogs, like Blogger sites do in each profile.

    Sorry for my English.
     
    Last edited: Apr 13, 2014
  4. alvin.orrt

    alvin.orrt Regular Member

    Joined:
    Mar 13, 2011
    Messages:
    351
    Likes Received:
    62
    Location:
    Brussels, Belgium
    if you have scrapebox or gscraper or any scraper just do a search for:

    site:tumblr.com

    add a bunch of keywords and scrape

    then trim all urls, remove duplicates and run them through a live checker (you can use either the sb addon or xenu link sleuth)

    select all the dead ones, run the pr checker and there you go, you have yourself a bunch of high pr tumblrs
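
The "trim all urls, remove duplicates" step can be sketched in a few lines of Python for anyone not using a scraper tool. This is only an illustration: the function name `tumblr_roots` is made up here, and it assumes the scraped URLs are already sitting in a plain list of strings.

```python
from urllib.parse import urlparse


def tumblr_roots(urls):
    """Reduce scraped URLs to unique *.tumblr.com hostnames, in first-seen order."""
    seen = set()
    roots = []
    for url in urls:
        # urlparse only fills in the hostname when a scheme is present
        if "://" not in url:
            url = "http://" + url
        host = (urlparse(url).hostname or "").lower()
        # keep blog subdomains only; skip other sites and www.tumblr.com itself
        if not host.endswith(".tumblr.com") or host == "www.tumblr.com":
            continue
        if host not in seen:
            seen.add(host)
            roots.append(host)
    return roots
```

Feeding it `http://foo.tumblr.com/post/1` and `https://FOO.tumblr.com/` gives a single `foo.tumblr.com` entry, which is the list you would then pass to the live checker.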
     
    • Thanks Thanks x 5
  5. templaries

    templaries Newbie

    Joined:
    Feb 24, 2012
    Messages:
    15
    Likes Received:
    0
    Occupation:
    Developer
    Location:
    Spain
    So all the URLs are extracted from Google, which means proxies are needed. I think the point is to get a set of tumblr blogs as a seed and then scrape them for other tumblr blogs linked inside (in posts, in blogroll areas, etc.)

    I have not used Scrapebox yet; for the moment I write my own bots to do these jobs.
     
  6. xha44a

    xha44a Power Member

    Joined:
    Dec 2, 2012
    Messages:
    532
    Likes Received:
    444
    Hi,

    I do this with a 3 step process.

    1. Scrapebox. Find a dead tumblr blog. Find keywords on it. Create a footprint. That's really simple. Scrape all the blogs. Filter - remove duplicates, keep only URLs on a .tumblr.com domain, etc.
    2. Live check them all in Scrapebox. Once you've done a live check, the dead ones are *potentially* available.
    3. Run them through TumblingJazz to see if they are actually available to sign up.

    Pretty easy.
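
The gap between steps 2 and 3 is that a dead result only makes a blog a candidate. A minimal sketch of that triage, assuming you already have the HTTP status code for each blog from some live checker; the function name `triage` and the three labels are invented for this example, not something Scrapebox or TumblingJazz outputs.

```python
def triage(status_code):
    """Sort a live-check result into a bucket for the next step."""
    if status_code == 404:
        # dead, but only *potentially* available: still needs a signup check
        return "candidate"
    if 200 <= status_code < 400:
        # the blog answers or redirects, so the name is taken
        return "alive"
    # timeouts, 5xx, rate limiting and the like: check again later
    return "recheck"
```

Only the `candidate` bucket is worth sending on to the sign-up check.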
     
    • Thanks Thanks x 3
  7. templaries

    templaries Newbie

    Joined:
    Feb 24, 2012
    Messages:
    15
    Likes Received:
    0
    Occupation:
    Developer
    Location:
    Spain
    So Scrapebox takes links from Google and also from the tumblr blogs it is processing? I suppose Scrapebox could return thousands of tumblr blogs in that process.

    Does a tumblr blog return a 404 error code when it is available?

    Edit: Yes, they do. (black-hat-seo/646896-tumblr-returns-404-meanwhile-not-available-registration.html)
     
    Last edited: Apr 14, 2014
  8. prab1996

    prab1996 Elite Member

    Joined:
    Jan 8, 2013
    Messages:
    3,496
    Likes Received:
    2,028
    Occupation:
    your gf's <3 ♥♥♥♥
    Location:
    Prab1996.com
    you only need scrapebox.

    scrape shit out of tumblr > remove dupes > use alive check > sort out 404 urls > use vanity checker.

    -=-
     
    • Thanks Thanks x 1
  9. alvin.orrt

    alvin.orrt Regular Member

    Joined:
    Mar 13, 2011
    Messages:
    351
    Likes Received:
    62
    Location:
    Brussels, Belgium
    Yes, with scrapebox i usually get at least 100k tumblr urls which turn into 50k unique tumblr accounts. Most of them are still alive. Only close to 5% of them are dead. Of those 5%, less than half are available for registration. If you can build your own bot you'd better do it, as the vanity checker inside scrapebox is not that trustworthy.

    And yes, you need tons of proxies to scrape properly, i suggest adding a huge list of public proxies - that's what i'm doing and i'm reaching 180-200 urls / sec.
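
To put rough numbers on that funnel, here is a small sketch. The default percentages are just the figures quoted in this post (about half unique, about 5% dead, at most half of the dead ones registrable); they are one person's experience, not fixed rates, and `estimate_funnel` is a made-up helper name.

```python
def estimate_funnel(scraped_urls, unique_pct=50, dead_pct=5, available_pct=50):
    """Return (unique blogs, dead blogs, upper bound on registrable blogs)."""
    # integer percentages avoid floating point rounding surprises
    unique = scraped_urls * unique_pct // 100
    dead = unique * dead_pct // 100
    # "less than half" of the dead ones, so treat this as a ceiling
    available_cap = dead * available_pct // 100
    return unique, dead, available_cap


print(estimate_funnel(100_000))  # (50000, 2500, 1250)
```

So a 100k scrape would be expected to leave somewhere under 1,250 names to try, going by these figures.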
     
    • Thanks Thanks x 1
  10. templaries

    templaries Newbie

    Joined:
    Feb 24, 2012
    Messages:
    15
    Likes Received:
    0
    Occupation:
    Developer
    Location:
    Spain
    Thank you, that is the answer I was looking for. I am going to find many proxies then. It's scary that out there are many players with thousands of proxies fishing for tumblr blogs, hehe.
     
  11. strovolo

    strovolo Registered Member

    Joined:
    Aug 5, 2009
    Messages:
    90
    Likes Received:
    12
    Location:
    Sin City
    after spending a week with scrapebox and gscraper i gave up. i think there is software sold in the forum that does better at identifying the ones available for registration. 404 does not mean it is available!!!!
     
  12. prab1996

    prab1996 Elite Member

    Joined:
    Jan 8, 2013
    Messages:
    3,496
    Likes Received:
    2,028
    Occupation:
    your gf's <3 ♥♥♥♥
    Location:
    Prab1996.com
    that soft is like a kid next to sb (it just has too many bugs)

    i scrape 900k+ urls, then trim to root and then remove dupe domains.
    now use alive check (more than half of 404 urls are available to be registered)
    after some hours i have a list of 100+ available tumblrs.

    it's not that difficult, it all depends upon the way of scraping.
    -=-