
Scrapebox - Cannot harvest google results with operators and proxies

Discussion in 'Black Hat SEO' started by johndoes, Feb 10, 2011.

  1. johndoes

    johndoes Junior Member

    Joined:
    Nov 12, 2010
    Messages:
    158
    Likes Received:
    17
    I've tried again and again, but it looks like Google doesn't return results for search operators like site:, allinurl:, allintitle: etc. if you're using proxies.

    When I stop using proxies, ScrapeBox collects results, but of course the IP gets blocked after some time.

    Is there anything I can do to overcome this?

    Regards,
     
  2. jb2008

    jb2008 Senior Member

    Joined:
    Jul 15, 2010
    Messages:
    1,158
    Likes Received:
    972
    Occupation:
    Scraping, Harvesting in the Corn Fields
    Location:
    On my VPS servers
    This is because the IP is already blocked for that proxy. The inurl: etc. operators are not viable for any substantial harvest. Try using quotes instead. If it's a string that appears only in URLs, you will often get the results you are looking for anyway.

    Remember that when you search, you aren't searching only the page content, title, meta description, h1 tags and so on; you are also searching the URL. The inurl: operator simply restricts the match to the URL. So if your string only ever appears in URLs anyway, quotes will suffice. For example, something like "index.php?member?=" or whatever (that's not a real string, I just made it up for explanation purposes), but you get the drift. Be creative with your quotes and you can get around this.
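    [Editor's note: jb2008's quote trick above can be sketched in a few lines of Python. The footprints below are made-up examples, not real ScrapeBox footprints, and the operator list is an assumption.]

    ```python
    # Sketch: turn operator-based footprints into quote-only queries,
    # since Google throttles inurl:/allintitle: far more aggressively
    # on shared proxies. The footprints are invented examples.

    def to_quoted_query(footprint: str) -> str:
        """Strip a leading URL/title operator and wrap the remaining
        string in quotes instead."""
        for op in ("allinurl:", "allintitle:", "inurl:", "intitle:"):
            if footprint.startswith(op):
                return '"%s"' % footprint[len(op):].strip()
        return footprint  # no operator: leave the query untouched

    queries = [to_quoted_query(f) for f in (
        'inurl:index.php?topic=',          # string that only occurs in URLs
        'allintitle:powered by vbulletin',
        'plain niche keyword',             # untouched by the rewrite
    )]
    ```

    The rewrite only pays off when the quoted string is genuinely URL-specific, exactly as jb2008 says; a generic phrase in quotes will match page text too.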
     
    • Thanks Thanks x 5
  3. johndoes

    johndoes Junior Member

    Joined:
    Nov 12, 2010
    Messages:
    158
    Likes Received:
    17
    Thanks for the tip. I tried some variations of footprints, but no luck.

    I know the trick with quotes, but how can you express a query like this without operators?

    inurl:google.com -site:google.com

    or, much simpler:

    allintitle:google is the king

    I think Google is ten times pickier about these webmaster-related queries. I can run simple queries easily, but these operator queries fail with proxies.

    They're starting to figure out what we're doing, and they're not happy about it :)

    Thanks,

     
  4. loopline

    loopline Jr. VIP Jr. VIP

    Joined:
    Jan 25, 2009
    Messages:
    3,377
    Likes Received:
    1,799
    Gender:
    Male
    Home Page:
    Well, if it works without proxies but not with your proxies, then it's your proxies. You need new proxies; try getting them from other sources than the ones you are currently using. I'm not having any issues.
     
    • Thanks Thanks x 1
  5. johndoes

    johndoes Junior Member

    Joined:
    Nov 12, 2010
    Messages:
    158
    Likes Received:
    17
    Yeah, that's what I want to know: whether it won't accept any proxies at all, or only ScrapeBox's internal proxies.

    Lemme look for other proxies.

    thanks

     
  6. mazgalici

    mazgalici Supreme Member

    Joined:
    Jan 2, 2009
    Messages:
    1,489
    Likes Received:
    881
    Home Page:
    Google gets annoyed much faster if you run advanced searches.
     
  7. Volkow

    Volkow Newbie

    Joined:
    Dec 22, 2010
    Messages:
    16
    Likes Received:
    2
    Occupation:
    Web developer
    Location:
    Strasbourg
    Use proxies which allow GET/POST.

    I think. :)
     
  8. chris456

    chris456 Regular Member

    Joined:
    May 17, 2010
    Messages:
    281
    Likes Received:
    567
    The ScrapeBox proxy sources are used by thousands of ScrapeBox users every day, so almost all of those proxies are already flagged. I can't use them either. I would follow both pieces of advice: jb2008's, that it works with quotes (it works for me too), and the other users', that you have to find new public or private proxies, because the built-in sources are useless for inurl: etc. footprint searches.

    I would welcome it if ScrapeBox added a new function:
    an operator-search proxy tester (like the existing Google tester - passed or NOT) that would tell you whether a proxy is good for Google operator searches or not.
     
  9. johndoes

    johndoes Junior Member

    Joined:
    Nov 12, 2010
    Messages:
    158
    Likes Received:
    17
    That's very clever.

     
    • Thanks Thanks x 1
  10. chris456

    chris456 Regular Member

    Joined:
    May 17, 2010
    Messages:
    281
    Likes Received:
    567
    I noticed that this would be your first thanks in 19 posts, so I had to be the first :)
     
    • Thanks Thanks x 1
  11. johndoes

    johndoes Junior Member

    Joined:
    Nov 12, 2010
    Messages:
    158
    Likes Received:
    17
    lol, thanks :)



     
  12. Sweetfunny

    Sweetfunny Jr. VIP Jr. VIP Premium Member

    Joined:
    Jul 13, 2008
    Messages:
    1,747
    Likes Received:
    5,039
    Location:
    ScrapeBox v2.0
    Home Page:
    Yes, I thought of this feature before, but the big problem with it is that to test whether a proxy is blocked for inurl:, the tester needs to actually perform an inurl: search. Then, by the sheer fact of everyone running these test queries, testing would get whatever proxies are still good blocked before they even get used for a real search.
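    [Editor's note: the tester Sweetfunny describes could only look roughly like the sketch below. The block-detection heuristic (503 status, "sorry"/"unusual traffic" page) and the query are assumptions about Google's behavior at the time, not ScrapeBox internals, and, as the post explains, running the test itself burns the proxy.]

    ```python
    import urllib.parse
    import urllib.request

    def looks_blocked(status: int, body: str) -> bool:
        """Heuristic, assumed: Google answers blocked scrapers with a
        503/captcha ("sorry") page instead of a results page."""
        return status == 503 or "/sorry/" in body or "unusual traffic" in body

    def test_proxy_for_operators(proxy: str, timeout: float = 10.0) -> bool:
        """Run one real inurl: query through `proxy` and report whether
        the answer looks like a results page. Note the catch from the
        post above: this test spends the proxy's operator-query quota."""
        url = "http://www.google.com/search?q=" + urllib.parse.quote("inurl:forum")
        opener = urllib.request.build_opener(
            urllib.request.ProxyHandler({"http": proxy}))
        try:
            with opener.open(url, timeout=timeout) as resp:
                body = resp.read().decode("utf-8", "replace")
                return not looks_blocked(resp.status, body)
        except Exception:
            return False  # dead or refusing proxy counts as unusable
    ```

    Keeping the block heuristic in its own function means it can be tuned (or unit-tested) without touching the network code.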
     
  13. johndoes

    johndoes Junior Member

    Joined:
    Nov 12, 2010
    Messages:
    158
    Likes Received:
    17
    That is also clever :)

    I think there's no hope with public proxies.

    I'll try to find other ways to replace those operators.

     
  14. Davidian

    Davidian Newbie Premium Member

    Joined:
    Feb 9, 2011
    Messages:
    36
    Likes Received:
    9
    Why don't you try the HideMyAss VPN? There are a lot of proxies available, although some of them are really slow. If you purchase 10 or 20 private proxies from any proxy seller, they'll be flagged soon if you use ScrapeBox on a daily basis. You have to replace them every 2 or 3 days, I guess.
     
  15. chris456

    chris456 Regular Member

    Joined:
    May 17, 2010
    Messages:
    281
    Likes Received:
    567
    I understand now; it is true and logical. But I may have a solution. First of all, I want to say that I really appreciate your bot, Sweetfunny; browsing this forum for ScrapeBox threads and posts, and seeing your quick answers, you deserve your reputation and your clients.

    My solution: ScrapeBox sells every day and must already have thousands of users running the same tasks and queries. They all use the same six proxy sources, about 6,000 proxies. If everybody, like me, tries operator-footprint searches on them, they burn those shared proxies anyway (you keep hammering them until you realize they are bad and stop the process), right? When I try it and see there are no results, in the time it takes me to realize nothing will be extracted, I am destroying many proxies that still work fine for normal searches and that other users should be using for real, normal queries. So what about forbidding operator queries on the original ScrapeBox proxy sources (I think Google has already given every single one of those proxies its own flagged name, like planets in the universe), keeping them healthier, and adding a new option to test operator footprints with our own proxies?

    There should be some reward for the users who don't burn the original ScrapeBox proxies. The shared lists would stay healthier (as more users buy ScrapeBox you will need to solve this anyway). After an hour of testing I got 400 good proxies out of 6,000, so if everybody hunted for their own proxies, ScrapeBox would work better, and this new operator-test option would be the prize for those who made the effort.

    You win twice: the shared proxies stay healthier, people are more satisfied and get better results, and those who spent time finding their own proxies get the new function - a tester for operator-footprint proxies (very important for me), even though it will burn some of their own proxies in the process.
    Hope you understand :) Thanks
     
    Last edited: Feb 11, 2011
  16. dichotom

    dichotom Jr. VIP Jr. VIP

    Joined:
    Dec 9, 2008
    Messages:
    1,916
    Likes Received:
    544
    Public proxies are garbage, especially when they are the default sources installed in a scraping tool. If you are serious about using the tool, you will buy proxies or find your own sources anyway. Trust me, paid proxies will save you some grey hairs. I don't think anything needs to be changed. Besides, they are not "ScrapeBox proxies"; they are just some public lists that the tool happens to pull in.
     
  17. chris456

    chris456 Regular Member

    Joined:
    May 17, 2010
    Messages:
    281
    Likes Received:
    567
    In other words, you've said what I've said. I wanted to encourage users to search for their own proxies. Those who don't want to don't need to use operator footprints; they can use the normal proxies for normal work, and under my suggestion those normal proxies would stay healthier by not being wasted on operator-footprint queries which, as we all know, will not work anyway. So I suggested that Sweetfunny disallow operator footprints on the original ScrapeBox proxy sources, which suffer from so many users' queries, and add the test function for more serious people who are willing to pay for private proxies or to hunt for better new public ones.

    I need this function; accurate searches are essential for me, so I am ready to buy private proxies. But if I have the possibility to find good, problem-free public proxies, I will very gladly save that money and feed new proxies into the new tester every day. I've heard that even private proxies sometimes have problems with operator footprints.

    I'm just saying an operator-footprint tester would be a nice addition, that's all.

    Sweetfunny can't create it because of the logical argument that an unsuccessful test would burn proxies that still work for normal queries. I am saying that users are burning them anyway, because they try those proxies on real Google searches; hence my suggestion to disallow it, since they won't work anyway.
     
    Last edited: Feb 11, 2011
  18. dichotom

    dichotom Jr. VIP Jr. VIP

    Joined:
    Dec 9, 2008
    Messages:
    1,916
    Likes Received:
    544
    We aren't saying the same thing, though. The "test" is the harvest itself. If the operator footprints don't work, it won't harvest anything. If ScrapeBox hits a proxy that doesn't work, it goes on to the next one. If none of them work, you don't get a harvest. Test complete.

    As far as the proxy lists that come with ScrapeBox - who cares? They are going to get blasted into oblivion either way, so it is irrelevant. The proxies you are calling "ScrapeBox proxies" are not exclusive to ScrapeBox; people are using those lists with other tools as well. If you are going to use public proxies, all you need to do is throw them in and try to use them. If nothing gets harvested, you know they were a pile of burnt-up garbage.
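    [Editor's note: the "harvest is the test" behavior dichotom describes amounts to rotate-on-failure. A minimal sketch, where `fetch(query, proxy)` is a hypothetical callback that returns results or raises when the proxy is blocked:]

    ```python
    def harvest(queries, proxies, fetch):
        """Try each query through the proxy list, skipping to the next
        proxy on failure. A query for which no proxy works simply
        contributes nothing - that empty harvest *is* the test result."""
        results = []
        for query in queries:
            for proxy in proxies:
                try:
                    results.extend(fetch(query, proxy))
                    break  # this proxy worked; move on to the next query
                except Exception:
                    continue  # burnt proxy: rotate to the next one
        return results
    ```

    With this shape there is no separate tester to run: blocked proxies are discovered and skipped as a side effect of harvesting, which is exactly dichotom's point.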
     
  19. chris456

    chris456 Regular Member

    Joined:
    May 17, 2010
    Messages:
    281
    Likes Received:
    567
    Why does the proxy tester exist at all, if testing and harvesting are the same thing? Of course it's the same process, but ScrapeBox already tests proxies for Google, for IP and for speed; why not add another option to test them for operator queries, and then be sure everything will work OK?

    Should I finish a 50-minute scan with a test that, for me, is useless because the proxies it passes still fail on operator queries?

    I don't want to test them again in real work by harvesting. And if I do test them in real work (by harvesting), how can I know which single proxy worked? There will be many of them; shall I pick proxies off the list one by one, investigating which one worked?

    You can also never be sure that private proxies will be OK with this kind of "operator footprint searching" either. A tester would be a nice addition, I am sure.

    Sweetfunny admitted she had been thinking about it too.
    My English is not the best, so maybe I don't get what you are saying, or maybe you don't understand what I mean.
    (I don't use ScrapeBox for commenting, because I would like to work more white hat; I use it for extracting domains from search engines for my niche-relevant keywords. Operator footprints would be more accurate for me, and a tester for this kind of search more comfortable.)

    That's what I wish for: to put my own proxy list in there (private or public, whatever) and test whether I can use the 6,600 "ScrapeBox sources" proxies, or the 10,000 from my own sources, or all of them together, for operator footprints. If at least some of them work, it will save me a lot of time, because the relevancy of the websites I grab will be much more accurate.
     
    Last edited: Feb 11, 2011
  20. kappa84

    kappa84 Power Member

    Joined:
    May 19, 2010
    Messages:
    736
    Likes Received:
    334
    Location:
    Bath, UK
    Public proxies are crap. I did a test with XRumer: with the IP from the VPS and 50 threads I was getting 200-230 links/min; with public proxies, harvested and cleaned, I was getting 16 links/minute.

    Yes, private proxies are worth the money...