
Expired Tumblr Scraping Issue in Scrapebox

Discussion in 'Black Hat SEO' started by SiddharthW, Apr 7, 2018.

  1. SiddharthW

    I recently started scraping some Tumblr blogs, but Scrapebox is giving a 404 error on sensitive-media pages. Any solution?
     
  2. loopline

    Can you give me some examples? I'm not sure I even understand the issue. Where are you getting the 404: when scraping in the main harvester, when running the Vanity Name Checker addon, or somewhere else?

    Perhaps you can attach a screenshot.
     
  3. SiddharthW

    I'm getting a 404 code in the Vanity Name Checker addon, but those aren't actually expired domains. I've configured Scrapebox so that Alive = 404 error. But when I start checking, the Vanity Name Checker addon gives a 404 error on this type of site: https://raravis-carlqvist.tumblr.com/
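    For reference, that Alive = 404 convention boils down to something like this rough Python sketch (run outside Scrapebox; the blog names and timeout here are just placeholders):

        import requests

        # Hypothetical blog names; swap in the list you are actually checking.
        blogs = ["raravis-carlqvist", "someexpiredblog123"]

        for name in blogs:
            url = f"https://{name}.tumblr.com/"
            try:
                # Don't follow redirects, so a safe-mode redirect can't mask the real status.
                resp = requests.get(url, timeout=10, allow_redirects=False)
            except requests.RequestException as exc:
                print(f"{url} -> error: {exc}")
                continue
            # Under the Alive = 404 convention above, a 404 is the interesting case:
            # the blog name appears expired/available.
            verdict = "expired/available" if resp.status_code == 404 else f"alive or other ({resp.status_code})"
            print(f"{url} -> {verdict}")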
     

    Attached Files: [screenshot not included]

  4. SiddharthW

    I meant the Alive Checker. I'm not using the Vanity Name Checker.
     
  5. loopline

    That's a bit of an anomaly. When I run it, it completes fine, so it's entirely possible it's your security software, etc.

    When I load that URL in a browser, it redirects to a "safe mode on" page. That safe-mode page doesn't contain data that makes Scrapebox think the name is available, nor data that tells Scrapebox it's taken. So the vanity checker, for me, just displays "completed".

    You could edit the Tumblr definition file so that hitting safe mode shows the name as taken.
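    If you want to sanity-check that redirect behavior outside Scrapebox first, a minimal Python sketch (using the requests library; this mirrors the safe-mode redirect described above, not Scrapebox's definition-file format) would be:

        import requests

        def redirects_to_safe_mode(blog_url: str) -> bool:
            """True if the blog answers with a redirect pointing at Tumblr's safe-mode page."""
            resp = requests.get(blog_url, timeout=10, allow_redirects=False)
            # Inspect the Location header instead of following the redirect.
            return resp.is_redirect and "safe-mode" in resp.headers.get("Location", "")

        # A blog that lands in safe mode could then be classified as taken rather than left ambiguous.
        print(redirects_to_safe_mode("https://raravis-carlqvist.tumblr.com/"))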

    But because it's safe mode, your security software may be intercepting the request and thus returning a 404. Or your ISP, or your router, or all sorts of things. That said, I doubt it's your ISP, unless your country has really strict filtering rules.

    Otherwise, whitelist Scrapebox in all your security software, then whitelist it in your router if possible. You will also want to whitelist the entire Scrapebox folder and then disable any real-time scanning.

    Make sure you whitelist Scrapebox rather than just disabling the security software.
     
    • Thanks x 1
  6. SiddharthW

    Can you check this with Alive Check v2.0.0.12?
     
  7. SiddharthW

    I've observed that the Vanity Name Checker works fine with sensitive sites, but the Alive Check gives a 404 error in this situation.
     
  8. SiddharthW

    Just checked: I can't use rotating proxies in the Vanity Name Checker. Why? Private proxies work, but rotating proxies don't.
     
  9. MatthewGraham

    Is it possible to spoof your user agent to Googlebot when using Scrapebox?

    Interesting fact about Tumblr: normally you can only see sensitive media when logged into a Tumblr account. But since that would prevent Google from crawling those pages correctly, Tumblr shows the pages normally to anyone whose user agent is set to (or spoofed as) Googlebot. If you can set that up to work with Scrapebox, there is a good chance it would solve your problem.
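    A quick way to test the idea before wiring it into Scrapebox is a small Python sketch like this (the user-agent string is Googlebot's published one; whether Tumblr honors a bare spoof, rather than also verifying the crawler by reverse DNS, is an assumption to check):

        import requests

        # Googlebot's published user-agent string.
        GOOGLEBOT_UA = ("Mozilla/5.0 (compatible; Googlebot/2.1; "
                        "+http://www.google.com/bot.html)")

        url = "https://raravis-carlqvist.tumblr.com/"
        resp = requests.get(url, headers={"User-Agent": GOOGLEBOT_UA},
                            timeout=10, allow_redirects=False)

        # If the spoof works, expect a 200 for the blog itself rather than
        # a 30x redirect to the safe-mode page.
        print(resp.status_code, resp.headers.get("Location", ""))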
     
    • Thanks x 1