
Who is a scrapebox pro in here?

Discussion in 'Black Hat SEO Tools' started by swaggerboy, May 10, 2015.

  1. swaggerboy

    swaggerboy Registered Member

    Joined:
    Jan 18, 2011
    Messages:
    72
    Likes Received:
    17
    I just dabbled in Scrapebox and I can't imagine why I've never used this tool before. I know you can do very interesting things with it, and I have some nice ideas for how to combine it with PBN backlinking.

    But harvesting is still a little bit of trouble, and I was wondering if there is anyone here I can add on Skype to talk some IM and Scrapebox. Cheers!
     
  2. fanatik1389

    fanatik1389 Regular Member

    Joined:
    Apr 7, 2014
    Messages:
    299
    Likes Received:
    105
    • Thanks x 3
  3. Repulsor

    Repulsor Power Member

    Joined:
    Jun 11, 2013
    Messages:
    766
    Likes Received:
    275
    Location:
    PHP Scripting ;)
    A user here, loopline, has tons of good info about how to use Scrapebox. I think all his tuts are on YouTube as well. Give it a go.

    I doubt you would get anyone to help you on Skype unless they get paid. That is how it is, I guess.
     
    • Thanks x 1
  4. TheSEOWizard

    TheSEOWizard Power Member

    Joined:
    Aug 20, 2011
    Messages:
    548
    Likes Received:
    156
    Occupation:
    SEO, PBN, Website et all
    Location:
    PBN World/SEO Land
    Loopline is the Scrapebox pro, here or anywhere else. Having said that, most SEOs use SB in more ways than you can imagine. It is the Swiss Army knife of SEO. I personally consider it a whitehat tool nowadays; hardly anyone uses it for blasting in today's SEO scenario. So maybe what you are planning to do has already been done by others.

    What problem are you facing in harvesting? Just enter the terms, use and select the proxies, check the search engines you need, and harvest. Remember that Google might block you too easily, so make sure to use other engines too. If you are using the SB v2.0 beta, there is a shitload of engines in there.

    EDIT: It seems two others already recommended loopline while I was typing this post. Check his YouTube videos, as he has covered almost everything there. And if you still face problems, reply to the SB thread here on BHW; he answers questions almost every day.
     
    • Thanks x 2
    Last edited: May 10, 2015
  5. swaggerboy

    swaggerboy Registered Member

    Joined:
    Jan 18, 2011
    Messages:
    72
    Likes Received:
    17
    Thanks for the advice. Yes, I already checked loopline's videos. I already have a lot of experience in web analytics through Tag Manager, GA, and a lot of other whitehat stuff, but we never used any tools. So it could be interesting for the person to talk some IM with me. I'm not here to leech ;)

    Anyways, I'm trying to harvest blogs related to smartphone repair. When I harvest, I'm using footprints downloaded from a thread here and combine those with the keywords: iphone repair, samsung repair, smartphone repair.

    It's a Dutch website. When I tried it in Dutch for iPhone, just this one keyword, I got back 11 results. I'm using good EU proxies from SquidProxies.
     
  6. loopline

    loopline Jr. VIP

    Joined:
    Jan 25, 2009
    Messages:
    3,724
    Likes Received:
    1,993
    Gender:
    Male
    Home Page:
    Thanks for the recommendations, guys!


    I don't do Skype, but if you post here I'm happy to help.

    I don't fully understand what you're trying to do, though. Are you trying to harvest results from a specific website, or just the web in general?

    Probably the best simple advice I can give you is that Google and other engines see Scrapebox as a browser. So open up your favorite browser, go to Google, and put in one of those footprints you're using in Scrapebox. When you see the results you want in a browser, put the same thing in Scrapebox and you will get the same results.

    Sometimes people just grab a bunch of random stuff, shove it together, and don't understand why it's not working. Yet when you put the same stuff in a browser and look at the results, you can quickly and easily see how to change it to get what you want, and then you put the same thing in Scrapebox.
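
    Loopline's point here is mechanical in practice: a footprint and a keyword are simply joined into one query string, so the exact string you paste into a browser is the string a harvester sends. A minimal sketch of that merge step (the footprints and keywords below are illustrative, not from this thread):

```python
# Merge every footprint with every keyword into the literal query
# strings a harvester would send -- the same strings you can paste
# into a browser first to preview the results.
footprints = ['"powered by wordpress"', 'inurl:blog']   # example footprints
keywords = ["iphone reparatie", "samsung reparatie"]    # example keywords

queries = [f"{fp} {kw}" for fp in footprints for kw in keywords]

for q in queries:
    print(q)
```

    If a query looks wrong in the browser, it will look exactly as wrong in the harvester; fixing the string fixes both.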

    So in a nutshell, Scrapebox does nothing "magical"; no SEO program does. They only automate what you can do manually, so it's often helpful to do it manually until you have a good idea of what you're after, and then automate it.

    It's like when you first learn to drive (assuming you know how to drive a car): you don't hop behind the wheel for the first time ever and jump straight onto the interstate or highway going 70 mph. You start out in a parking lot or on side streets; do the same with Scrapebox. It's like a car: it doesn't really take you anywhere you couldn't walk, it just gets you there faster (more or less).
     
    • Thanks x 1
  7. swaggerboy

    swaggerboy Registered Member

    Joined:
    Jan 18, 2011
    Messages:
    72
    Likes Received:
    17
    Thanks for your response, loopline. I'll try to explain a bit more what I'm after. I also just did another test, but got no results.

    I'm looking for a list of relevant Dutch blogs to do some manual commenting, as a first tier of backlinks to my PBN.

    Step 1: I load up the keywords, which are "iphone reparatie" and "samsung reparatie" (in English: iphone repair and samsung repair). Next, I use a custom footprint, "blog", and I search on all platforms. As I stated in my earlier post, I also used a list from a BHW member with custom footprints for WP blogs.

    I did this process several times in different ways, for example just the keywords, no custom footprint, and only WordPress blogs. This resulted in practically nothing.

    When I harvest exactly as step 1 describes, I get the most irrelevant results back, and they're not even in Dutch. The way you explain it, it harvests the URLs up to page 10 for my keyword, just as if I typed it into Google. So let's say I search for iphone reparatie, use just this keyword, and mark all platforms; I should get the same results as when searching on Google.

    I also use the google.nl search engine in Scrapebox. I don't get why my results are so irrelevant, and I've tried a lot of things. Or is this just the way the tool works? There is not a single relevant website in the results.
     
  8. swaggerboy

    swaggerboy Registered Member

    Joined:
    Jan 18, 2011
    Messages:
    72
    Likes Received:
    17
    Alright, so I was reading Jacob King's guide, and he suggested trying your footprint + KW in the search engine first and getting around 1,000 results. So I used the footprint "wordpress" "reactie" (Dutch for reaction/comment) together with the KW "iphone reparatie", and this gave exactly 1,000 results. When I then used the scraper, it gave me perfect results.

    But I want to know how this works. To me this is kinda weird, because it does not seem to scrape in hierarchical order, the way Google displays results, or do I see this wrong? Even if I scrape with loads of results, it will only take the 1,000 most relevant ones, just like the browser orders them, right?
     
  9. loopline

    loopline Jr. VIP

    Joined:
    Jan 25, 2009
    Messages:
    3,724
    Likes Received:
    1,993
    Gender:
    Male
    Home Page:
    Firstly, glad you got it working. Scrapebox scrapes the first 1,000 results, just like a browser gets, but it doesn't necessarily display them in the order they would appear in a browser. So result 873 might come first, followed by result 431, etc.

    If you need them displayed in browser order, you would need to use either the detailed harvester in V2 or the single-threaded harvester in V1.
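
    The reordering loopline describes is a normal side effect of multithreaded harvesting: whichever worker thread finishes first writes its results first. A hypothetical sketch (not Scrapebox's internals) of how tagging each query with its position lets you restore browser order afterwards:

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def fetch(index, query):
    # Stand-in for a real search request; the random sleep simulates
    # network jitter, which is what scrambles completion order.
    time.sleep(random.uniform(0, 0.05))
    return index, f"results for {query}"

queries = [f"query {n}" for n in range(10)]

with ThreadPoolExecutor(max_workers=5) as pool:
    futures = [pool.submit(fetch, i, q) for i, q in enumerate(queries)]
    # as_completed yields results in finish order, i.e. scrambled...
    results = [f.result() for f in as_completed(futures)]

# ...so sorting by the tagged index restores the original order,
# which is what a single-threaded harvester gives you for free.
ordered = [text for _, text in sorted(results)]
```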
     
  10. JustUs

    JustUs Power Member

    Joined:
    May 6, 2012
    Messages:
    626
    Likes Received:
    582
    @loopline.

    Not really related to this thread: I found a bug in the Custom Harvester in V2. When you click pause, the harvester and Scrapebox lock up, and the only way to close the program is via the Task Manager. While locked up, on my machine (i7 4770K), Scrapebox uses about 9 percent of the CPU time.
     
  11. loopline

    loopline Jr. VIP

    Joined:
    Jan 25, 2009
    Messages:
    3,724
    Likes Received:
    1,993
    Gender:
    Male
    Home Page:
    Does it happen when using only a few keywords or only with large lists?
     
  12. BassTrackerBoats

    BassTrackerBoats Super Moderator Staff Member Moderator Jr. VIP

    Joined:
    Mar 10, 2010
    Messages:
    15,914
    Likes Received:
    29,233
    Occupation:
    I don't actually have a job
    Location:
    Not England
    Home Page:
    OP, next time just use Google.

     
    • Thanks x 2
  13. JustUs

    JustUs Power Member

    Joined:
    May 6, 2012
    Messages:
    626
    Likes Received:
    582
    The lists I am using have 120-200 keywords in them, and I will generally run 10-100 threads. However, I have had it happen on much smaller lists when I maxed out my bandwidth and needed to pause to get something else done for a few minutes. I have also tried letting the pause run overnight to see if it would clear; no luck.
     
  14. loopline

    loopline Jr. VIP

    Joined:
    Jan 25, 2009
    Messages:
    3,724
    Likes Received:
    1,993
    Gender:
    Male
    Home Page:
    lol

    So does it ever pop up an error box? Pausing forces all the threads to stop, and if one is locked by a third-party program, it will stay in "pausing" mode forever.

    You're using .39, right?
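
    The failure mode loopline describes (pause has to wait for every worker thread, so one thread stuck inside a blocked call hangs the whole pause) is easy to reproduce in miniature. A hypothetical sketch, not Scrapebox's actual internals:

```python
import threading

# A worker stuck inside a blocking call (here, a lock held by a
# "third-party program") never reaches the point where it could
# honor a pause request, however a healthy worker would poll for one.
external_lock = threading.Lock()

def worker():
    external_lock.acquire()   # blocks until the third party lets go
    external_lock.release()

external_lock.acquire()       # simulate the third party holding on
stuck = threading.Thread(target=worker)
stuck.start()

stuck.join(timeout=0.2)       # "pause" waits for the worker to stop...
hung = stuck.is_alive()       # ...and it is still alive: the hang

external_lock.release()       # third party releases; worker finishes
stuck.join()
```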
     
  15. JustUs

    JustUs Power Member

    Joined:
    May 6, 2012
    Messages:
    626
    Likes Received:
    582
    Yes, I am using .39. No, an error box never pops up; the splash screen with the circular widget just never ends. I have no idea what would cause a race condition and interfere with SB. I have tried stopping all other programs without any change, and I do not see anything in the services that should not be running. I cannot currently access the SB harvester error logs because of an in-progress long scrape, but I do not think that log would tell us anything. I also run Malwarebytes as a service, and it will stop connections to flaky proxies.

    The Windows Application Error log shows this:
    General:
    Code:
    The program scrapebox.exe version 2.0.0.39 stopped interacting with Windows and was closed. To see if more information about the problem is available, check the problem history in the Action Center control panel.
    Details:
    Code:
    - System
      - Provider
        [ Name ]  Application Hang
      - EventID 1002
        [ Qualifiers ]  0
      Level 2
      Task 101
      Keywords 0x80000000000000
      - TimeCreated
        [ SystemTime ]  2015-05-13T16:26:01.000000000Z
      EventRecordID 18597
      Channel Application
      Computer LPTop
      Security

    - EventData
      scrapebox.exe
      2.0.0.39
      258c
      01d08d452fbbbd0a
      4294967295
      C:\Users\XXXXX\Documents\ScrapeBox Beta\scrapebox.exe
      bbc0c312-f98c-11e4-8282-d43d7edcd8c6

      54006F00700020006C006500760065006C002000770069006E0064006F0077002000690073002000690064006C00650000000000

    --------------------------------------------------------------------------------

    Binary data:

    In Words
    0000: 006F0054 00200070 0065006C 00650076
    0010: 0020006C 00690077 0064006E 0077006F
    0020: 00690020 00200073 00640069 0065006C
    0030: 00000000

    In Bytes
    0000: 54 00 6F 00 70 00 20 00   T.o.p. .
    0008: 6C 00 65 00 76 00 65 00   l.e.v.e.
    0010: 6C 00 20 00 77 00 69 00   l. .w.i.
    0018: 6E 00 64 00 6F 00 77 00   n.d.o.w.
    0020: 20 00 69 00 73 00 20 00    .i.s. .
    0028: 69 00 64 00 6C 00 65 00   i.d.l.e.
    0030: 00 00 00 00               ....
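
    The trailing hex blob in that EventData section is just the hang reason encoded as UTF-16LE; decoding it confirms what the byte dump already spells out letter by letter:

```python
# EventData payload copied from the log above, decoded as UTF-16LE;
# the trailing nulls are the string terminator.
payload = "54006F00700020006C006500760065006C002000770069006E0064006F0077002000690073002000690064006C00650000000000"
reason = bytes.fromhex(payload).decode("utf-16-le").rstrip("\x00")
print(reason)  # Top level window is idle
```

    "Top level window is idle" is Windows' generic way of saying the UI thread stopped pumping messages, which fits a pause stuck waiting on a blocked worker.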
     
  16. Sweetfunny

    Sweetfunny Jr. VIP

    Joined:
    Jul 13, 2008
    Messages:
    1,785
    Likes Received:
    5,067
    Location:
    ScrapeBox v2.0
    Home Page:
    That could be what's happening. I just tried harvesting from multiple engines with a lot of threads and paused/unpaused dozens of times with no issues. I know it's working for me on Win 7, Win 8.1, and Win Server 2008 R2, and I can check on Win 10 later, but I'm sure it will work there too.

    ScrapeBox is probably trying to pause a thread while Malwarebytes is locking the thread to analyze the connection/proxy, and in turn Windows is stepping in and saying "The program stopped interacting with Windows and was closed." So I would try shutting down Malwarebytes and anything else that can interfere with connections, then retry it.

    Also, all your harvester URLs are saved to the Harvester_Sessions folder in ScrapeBox in real time, so nothing is lost.
     
  17. JustUs

    JustUs Power Member

    Joined:
    May 6, 2012
    Messages:
    626
    Likes Received:
    582
    When I finish this particular scrape, I will attempt another without Malwarebytes. I am not worried about losing anything I have scraped, as I am well aware of the saving to the Harvester_Sessions directory; I am far more concerned about having to rescrape the keywords. Most of the scrapes I am doing are slow and use the harvester proxies. I do not know if it will help you, but here is a screenshot of the current scrape I am conducting. The Google connections have a large number of errors; the error logs from previous scrapes show that they are socket errors and HTTP 500 errors. This may also be related to Malwarebytes, but I would not venture a guess at this time. While not shown, there are 34 footprints. This is one that I tried to pause, so please understand that I am hesitant to attempt to pause at this time.

    [screenshot]

    Edit: I became frustrated enough that I shut the computer down and killed Malwarebytes on boot. ScrapeBox now pauses and is scraping faster. There are still many errors, especially with Google (10 min runtime: 1,180 Google errors, 198 Google API, and 125 Bing). When I checked the Malwarebytes log, Malwarebytes was blocking proxy IPs as suspected, and this seems to be why the pause would not function.

    Thanks for the help.
     
    Last edited: May 16, 2015
  18. loopline

    loopline Jr. VIP

    Joined:
    Jan 25, 2009
    Messages:
    3,724
    Likes Received:
    1,993
    Gender:
    Male
    Home Page:
    Glad you got it sorted.

    As for the errors, it's most likely proxy errors due to failing proxies, since more proxies are Google-banned than are banned for other engines. If you have the Automator, I just made a video showing how you can always have fresh proxies using it, and you can have the harvester constantly reloading those fresh proxies at the same time.

    https://www.youtube.com/watch?v=pCgfY0Q6MV4
     
  19. JustUs

    JustUs Power Member

    Joined:
    May 6, 2012
    Messages:
    626
    Likes Received:
    582
    I wish it were that easy. I grabbed all the raw proxies listed in GSA SE Ranker and dumped them into SB, about 300,000 of them. Then I tested them through SB. At the end, I had 14 Google-passed proxies and 1,400 anonymous proxies. This is not the first time I have run that kind of test, either.

    This does not suggest a problem with SB. It does suggest that there is increased competition to use proxies and a subsequent burning out of those proxies.
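
    The kind of bulk check JustUs describes (dump hundreds of thousands of raw proxies into a checker, keep the handful that still pass) can be sketched outside Scrapebox too. A hypothetical stdlib-only checker; the probe URL, timeout, and thread count are illustrative choices, not Scrapebox's:

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import ProxyHandler, build_opener

def passes(proxy, url="https://www.google.com/", timeout=10):
    """Return True if `url` loads through `proxy` ("host:port")."""
    opener = build_opener(ProxyHandler({"http": proxy, "https": proxy}))
    try:
        return opener.open(url, timeout=timeout).status == 200
    except Exception:
        return False        # dead, banned, or too slow

def filter_proxies(proxies, prober=passes, workers=50):
    # Probe the whole list concurrently; keep only the survivors.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        verdicts = list(pool.map(prober, proxies))
    return [p for p, ok in zip(proxies, verdicts) if ok]
```

    A pass rate of 14 Google-capable proxies out of 300,000 raw entries says far more about how burned public lists are than about the checker.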
     
  20. loopline

    loopline Jr. VIP

    Joined:
    Jan 25, 2009
    Messages:
    3,724
    Likes Received:
    1,993
    Gender:
    Male
    Home Page:
    Yes, there is increased competition, and Google bans even faster; combine the two and you get what you see. However, that's not to say those sources aren't just heavily banned, so you may find other sources that are not so heavily banned.

    Otherwise I don't even hassle with all this: I buy backconnect proxies and use them for scraping, and I use private proxies for posting. I have a video showing how I had Scrapebox harvesting over a million URLs per minute from Google here:

    https://www.youtube.com/watch?v=ThtrP-6TUKM

    It was using those backconnect proxies. Mind you, in everyday production I don't use that many, because I have everything running through automation, and when you spread scraping power out consistently over time you don't need as many proxies. But anyway, I harvest advanced operators and 24-hour footprints and all sorts of stuff with these backconnect proxies, and I never have issues (knock on wood), and it's cheap and hassle-free.
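
    The backconnect setup loopline describes needs almost no client-side configuration: you point everything at one fixed gateway, and the provider rotates the exit IP behind it. A hypothetical sketch (the endpoint address is made up):

```python
from urllib.request import ProxyHandler, build_opener

GATEWAY = "203.0.113.10:8000"   # hypothetical backconnect endpoint

# One fixed gateway; the provider rotates the exit IP behind it, so
# every request can leave from a different address without the client
# ever managing a proxy list.
opener = build_opener(ProxyHandler({"http": GATEWAY, "https": GATEWAY}))

# opener.open("https://www.google.com/search?q=...") would now exit
# through whichever IP the gateway currently maps to.
```

    This is why such setups scale: the rotation happens server-side, so the scraper never needs to test, load, or swap proxy lists.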