How to harvest tons of URLs in Scrapebox?

Discussion in 'Black Hat SEO' started by SuzieQBlackHat, Feb 8, 2017.

  1. SuzieQBlackHat

    SuzieQBlackHat BANNED

    Joined:
    Feb 8, 2017
    Messages:
    120
    Likes Received:
    17
    Gender:
    Male
    I have heard about people who can harvest hundreds of thousands of URLs per day in Scrapebox, and I am wondering how they do that. Meanwhile, all I can do is harvest about 4000 URLs per hour. Is there some trick people are using to get so many URLs? Please tell me how to do it too.
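    For scale, here is a rough back-of-the-envelope check of those numbers. It is only a sketch: it assumes the harvester runs around the clock at a constant rate, and the 200,000 target is a hypothetical stand-in for "hundreds of thousands", not a figure anyone quoted.

```python
import math


def urls_per_day(urls_per_hour: int, hours: int = 24) -> int:
    """Daily total, assuming a constant hourly rate (an idealization)."""
    return urls_per_hour * hours


def required_hourly_rate(target_per_day: int, hours: int = 24) -> int:
    """Smallest whole hourly rate that reaches the daily target."""
    return math.ceil(target_per_day / hours)


current_daily = urls_per_day(4000)
print(current_daily)  # 96000

# Hypothetical target standing in for "hundreds of thousands per day".
print(required_hourly_rate(200_000))  # 8334
```

    So 4000 per hour is already close to 100,000 per day if it runs non-stop; the gap is a factor of a few, not orders of magnitude.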
     
  2. SuzieQBlackHat

    bump
     
  3. loopline

    loopline Jr. VIP

    Joined:
    Jan 25, 2009
    Messages:
    3,849
    Likes Received:
    2,044
    Gender:
    Male
    That's a loaded question. You can start by selecting every engine Scrapebox has; that will bump your results.

    Next is probably your proxies. You're going to be able to harvest faster with back-connect proxies if you're only harvesting. They aren't good for posting and many other things, but for harvesting they work nicely. Public proxies, if you have them sorted (which is a topic unto its own), could also be useful when using all the engines in Scrapebox.

    Next is probably your footprints and keywords. If you're using obscure footprints that have few or no results, that will slow things down.

    Etc. There are many things that affect speed, but engines, proxies and queries are probably the top 3.
     
    • Thanks x 1
  4. arsen99

    arsen99 Jr. VIP

    Joined:
    Nov 28, 2010
    Messages:
    306
    Likes Received:
    45
    Speed and the number of URLs depend mostly on proxies and keywords/footprints. I'll skip the question of Internet connection speed here, because nowadays it is rarely a problem.

    You can harvest millions of links if your queries are popular keywords, but that does not always serve the intended purpose. It is often better to go for quality, not quantity. If you choose your keywords more carefully, you may grab fewer URLs, but they will match better and be of higher quality. You will also do it much faster (read: time is money) and with fewer resources (read: proxies and bandwidth).

    Regardless of which keywords you use, you need good proxies to get the most out of Scrapebox. If you focus only on harvesting, it can be done with public proxies, rotating proxies, shared datacenter proxies or private proxies. Each option has pros and cons, and you have to decide for yourself which is best for you, which mostly depends on what you want to achieve and how fast.

    Personally, I recommend our proxy packages for scraping 3 search engines: Google, Yahoo and Yandex. These are private proxies dedicated to harvesting, and they can be purchased at a very low price.

    All the best!
     
  5. SuzieQBlackHat

    Thanks for your reply. I'll definitely consider buying and using your proxies. I just have a couple of questions. I've gotten the hang of how to use Scrapebox but the only real issue I am noticing is that during the phase where I harvest URLs from the keywords, Google reaches like 8000 or 10000 or even 30,000 URLs and then boom, it just stops and no more. I am assuming this is because Google has blocked the IP because it got too many search requests in too short of a time.

    So if I use your proxy servers, do I need to add a delay between requests, and if so, how long, like 30 seconds or 60 seconds or whatever?

    Thanks again for your help.
     
  6. arsen99

    Dear @SuzieQBlackHat

    You are absolutely right.
    The scenario you described fits exactly what happens when too many requests are sent to Google in too short a period of time. Your activity resulted in a block imposed by Google.
    It could happen even after far fewer requests if they contain advanced search operators or queries that Google recognizes as spam.

    Definitely yes. A proxy server is treated as an ordinary IP address and is subject to the same limits. With proxy servers you multiply "your" IP addresses.

    It depends on the keywords, the number of keywords, the duration of the harvest, the query speed and a few other things.
    There is one golden rule: the best measure is your own tests. Of course there are limits, because in the end everything is evaluated by Google's algorithm.
    However, I understand perfectly well that it's nice to have a base to start from, so I recommend you watch a video by loopline:


    It will give you some tips on how this mechanism works and help you better understand the rules of scraping.

    All the best!
     
    • Thanks x 1
  7. SuzieQBlackHat

    Thanks again for your help. I'll check out that video you linked. I also took a look at your massproxy site and damn, you have very good prices: 200 proxies for 60 dollars a month. I'll definitely be buying your proxies once I figure out how to use them properly and how to not get banned from Google.
     
  8. SuzieQBlackHat

    Another issue I am facing now is during the comment posting phase. I was able to harvest 100,000 URLs (no duplicates) within an hour, somehow or another. So I started posting and the rate of success is like 10%. Usually I was getting like 30% before. I tried it on a few different VPN IP addresses and same result. So is this list just crap or is there some other problem? Do I need to use proxies during the comment posting phase? Or use a time gap delay or something?
     
  9. SuzieQBlackHat

    Once I have my questions answered, I'll buy your private proxies. But first I need to know EXACTLY what to do, especially in regards to timeout settings, for both harvesting URLs and posting comments. Otherwise I will make a mistake and burn up the proxies and just waste time and money.
     
  10. loopline

    My first guess is yes, the list is crap, but you didn't really give enough details. What are the errors? Does it give you an error code, or does it say a lot of "unknown platform", or something else?
     
  11. SuzieQBlackHat

    Well I changed my IP address on my VPN and now I am back at 30% success rate again.

    I use Bing to harvest URLs, so I am not worried about the harvesting side anymore. Bing never blocks you. Now the only thing I need to do is to buy the private proxies and then set it up to use for comment posting. And I am assuming I should add in a time delay to the comment posting, like a 10 or 15 second delay? But I cannot find where to add that, and it is not in the Settings section. Please just let me know what is the most simple method to set it up for the highest success rate in the comment poster.
     
  12. loopline

    OK, glad you sorted it. Why would you need a delay? You can't use one, but unless you're posting 100% to the same domain you won't need a delay.

    If you're using the same domain, just set the connections to 1 and let it trickle along.
     
  13. SuzieQBlackHat

    Hi MassProxy arsen99, I was going to buy your 50 private proxies deal for 26 dollars a month, but on the website it says 26,00USD. Maybe you mistakenly used a comma instead of a period? Also, the order page asks for "IP for proxy authentication", but I don't know what to put in that box. I tried messaging you through the Contact page on your site but have not gotten a reply yet, hence I am messaging you here.

     
  14. arsen99

    Hi @SuzieQBlackHat,

    Nice to hear that, but please remember that our proxies work only with a few specifically listed services: Google, Yahoo, Yandex, Youtube, Instagram, Facebook, LinkedIn, Snapchat, VKontakte and Wikipedia for SNIPE packages, and Google, Yahoo and Yandex for SCAPE packages. From what I can see, you are looking for proxies for posting, and our proxies are not suited to that purpose.

    The SNIPE package of 50 proxies costs $26. In our region a comma is used instead of a period in such cases ;)
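    To illustrate the notation, here is a minimal sketch of reading a price written with a decimal comma. The helper name parse_price and its decimal_sep parameter are made up for this example, and it assumes the price contains no thousands separators.

```python
from decimal import Decimal


def parse_price(text: str, decimal_sep: str = ",") -> Decimal:
    """Parse a price such as '26,00USD' into a Decimal.

    Hypothetical helper: keeps only digits and the decimal separator,
    so it assumes there are no thousands separators in the input.
    """
    kept = "".join(ch for ch in text if ch.isdigit() or ch == decimal_sep)
    return Decimal(kept.replace(decimal_sep, "."))


print(parse_price("26,00USD"))  # 26.00
```

    Read this way, "26,00USD" is twenty-six dollars, the same amount as "26.00 USD" in period notation.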

    An explanation of "IP for proxy authentication" can be found on our FAQ page: https://massproxy.com/faqs.php
    Also, I must admit that I cannot find any unresolved ticket.

    All the best!