Anyone else having problems scraping URLs from Google with ScrapeBox?

Discussion in 'Black Hat SEO Tools' started by XoRaK, Nov 10, 2013.

  1. XoRaK

    XoRaK Regular Member

    Joined:
    Oct 28, 2009
    Messages:
    303
    Likes Received:
    250
    Occupation:
    social worker
    Location:
    Belgium
    When I try this query on Google, I get 263,000 results.

    GOOGLE QUERY:
    site:www.youtube.com/watch?v= "published on nov 9, 2013" -feature=youtube_gdata

    When I enter this query in ScrapeBox, it only scrapes about 300 results.

    I use private proxies that pass the Google test. Does anyone know what might be going on?

    Greetz

    XoRaK
     
  2. miluge

    miluge Regular Member

    Joined:
    Oct 26, 2012
    Messages:
    227
    Likes Received:
    93
    Occupation:
    Sneakerhead
    Location:
    Behind you!
    Which error do you get when scraping? Error 503, 999?
     
  3. Nightly

    Nightly Regular Member

    Joined:
    Oct 18, 2013
    Messages:
    292
    Likes Received:
    79
    If I remember correctly, the proxy tester doesn't use advanced operators, just plain words, to check whether the proxies pass. Just because you can search a word on Google with a proxy doesn't mean it isn't temp-banned from operators. When I'm using private proxies for scraping, I have it on a 1:15 ratio, with a delay to keep my proxies from getting banned as quickly.
     
  4. XoRaK

    XoRaK Regular Member

    No error, it says harvester completed... but that's after 400 or so results, with each query I try...
     
  5. Nightly

    Nightly Regular Member

    You need to turn off the multi-threaded harvester if you want to see the error message when scraping.

    In the Settings tab, uncheck the "Use multi-threaded harvester" option and try scraping again with your proxies.


    ScrapeBox works fine. I scraped 1 million URLs today within an hour or two, and scraped and downloaded 200k+ images. It's probably your proxies, or there just aren't enough search results. Just because it says 200k results doesn't mean it will show all of them. Search your query yourself on Google to see if Google is cutting the results short by only showing the most relevant ones (try going to the last page).
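
To put numbers on the gap between the reported count and what can actually be paged through, here is a minimal sketch. The page size and page limit below are assumptions for illustration only (they are not published by Google and are not taken from ScrapeBox or from this thread).

```python
def retrievable_results(reported_total, results_per_page, max_pages):
    """Upper bound on the URLs one query can return when paging is capped."""
    return min(reported_total, results_per_page * max_pages)


# Hypothetical values: 100 results per page, paging stops after 10 pages.
cap = retrievable_results(263_000, results_per_page=100, max_pages=10)
print(cap)  # prints 1000
```

Whatever the real limits are, a single query cannot harvest more than that bound, however large the reported total is. A query that Google trims for relevance will stop even earlier.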
     
    Last edited: Nov 10, 2013
  6. XoRaK

    XoRaK Regular Member

    I'm guessing it's my proxies then.

     
  7. HelloInsomnia

    HelloInsomnia Jr. Executive VIP Jr. VIP Premium Member

    Joined:
    Mar 1, 2009
    Messages:
    1,816
    Likes Received:
    2,912
    Just because your proxies pass the Google test does not mean that they are able to do this kind of scraping. The Google test only checks whether they can do a basic search such as: weight loss

    What you're trying to do is use an advanced operator. When you start stringing together search operators such as inurl:, "keywords here" and -"minus this bit", the proxies will burn out faster.

    Like Nightly said above, untick the multi-threaded harvester if you're having issues. You may even want to set a delay of a few seconds on top of that. It sounds like your proxies are working but getting burned out fast. The only way to solve this problem is to add more proxies or to increase the wait time between each proxy use.
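
The trade-off between more proxies and longer waits is simple arithmetic. A rough sketch, assuming requests are handed out round-robin and a single global delay separates consecutive requests (both are assumptions for illustration; ScrapeBox's real scheduling may differ, and the proxy labels are made up):

```python
from itertools import cycle


def rest_time_per_proxy(num_proxies, delay_seconds):
    """Seconds between two uses of the same proxy under round-robin."""
    return num_proxies * delay_seconds


def schedule(proxies, num_requests, delay_seconds):
    """Return (start_time, proxy) pairs for a round-robin rotation."""
    rotation = cycle(proxies)
    return [(i * delay_seconds, next(rotation)) for i in range(num_requests)]


# Hypothetical proxy labels, 3-second delay between requests.
proxies = ["proxy-a", "proxy-b", "proxy-c"]
for start, proxy in schedule(proxies, 6, delay_seconds=3):
    print(f"t={start:>2}s  {proxy}")

print(rest_time_per_proxy(len(proxies), 3))  # prints 9
```

Doubling either the proxy count or the delay doubles the rest each proxy gets between queries, which is why either fix helps.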
     
  8. Nightly

    Nightly Regular Member

    Hey buddy, your PM box is full so I couldn't reply. Here is the message:

    I pay for all my proxies, so no can do, buddy. They are username and password protected, and IP protected as well, so they can only be run from my home computer and server. I can, however, scrape a couple thousand results for you if you want, to see if it's a proxy issue or a results issue for you. Just give me a (not too big) keyword list and your custom footprint.

    I use myprivateproxy.net for my private proxies. I have been for a couple of years now, never any problems. I've never heard of buyproxies.org :p

    If you use MPP, make sure you select "Scrapebox" as your use for them when ordering, so they know which proxies to give you. If you select something else, you might get proxies that don't work for your intended use.
     
  9. seosuperman1

    seosuperman1 Regular Member

    Joined:
    May 25, 2010
    Messages:
    295
    Likes Received:
    57
    Bump for the multi-thread..
     
  10. HelloInsomnia

    HelloInsomnia Jr. Executive VIP Jr. VIP Premium Member

    Care to explain?