
REQ: Scrapebox page scanner footprints

Discussion in 'Black Hat SEO Tools' started by Monie, Sep 11, 2014.

  1. Monie

    Monie Regular Member

    Joined:
    Mar 22, 2014
    Messages:
    212
    Likes Received:
    42
    I think Scrapebox Page Scanner would be a great way to filter URLs and reduce a list to the URLs that GSA SER can actually post to. I expect this would greatly increase the success rate and efficiency of GSA SER.

    If someone has a list of footprints for all (or many, or even a few) of the platforms that GSA SER can post to, for use with the SB page scanner, it would be really great if you could share it.

    Thanks. :)
     
  2. loopline

    loopline Jr. VIP

    Joined:
    Jan 25, 2009
    Messages:
    3,725
    Likes Received:
    1,993
    Gender:
    Male
    Well, you can just go into your GSA folder under Program Files and open the engines folder. There you will find a file for each engine. Just look for lines like these:


    page must have1=href="signup.php"|Powered by <a href="http://www.articledashboard.com">Article Dashboard</a>|<p class="center small">© 2005-2011 Article Dashboard|class="formsubmitart"
    page must have2=login.php"|action=login2submitart.php|submitarticles.php"|Powered by <a href="http://www.articledashboard.com">Article Dashboard</a>|<p class="center small">© 2005-2011 Article Dashboard|class="formsubmitart"
    page must have3=submitarticles.php"|Powered by <a href="http://www.articledashboard.com">Article Dashboard</a>|<p class="center small">© 2005-2011 Article Dashboard|class="formsubmitart"



    Those are the footprints that GSA uses to qualify a page, so the page scanner could do the same.
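    To illustrate the idea, here is a minimal Python sketch that pulls the footprints out of "page must have" lines like the ones quoted above and checks a page against them. It is only a sketch under assumptions: the function names are made up for this example, "|" is assumed to separate alternative footprints, and a page is assumed to qualify if any single footprint appears in its HTML. GSA's real matching rules may differ.

```python
import re

# Matches lines such as: page must have1=foo|bar
# The key name comes from the engine lines quoted above; everything else
# about the file format is an assumption made for this sketch.
RULE_RE = re.compile(r"^\s*page must have(\d+)\s*=\s*(.*)$", re.IGNORECASE)


def parse_rules(engine_text):
    """Return {rule_number: [footprint, ...]} parsed from engine text."""
    rules = {}
    for line in engine_text.splitlines():
        match = RULE_RE.match(line)
        if match:
            # Assumption: "|" separates alternative footprints.
            footprints = [f for f in match.group(2).split("|") if f]
            rules[int(match.group(1))] = footprints
    return rules


def page_matches(html, rules):
    """Assumed semantics: the page qualifies if any footprint is present."""
    lowered = html.lower()
    return any(
        footprint.lower() in lowered
        for footprints in rules.values()
        for footprint in footprints
    )
```

    For example, feeding it the Article Dashboard lines and the HTML of a page that contains class="formsubmitart" would return True, while a page with none of the footprints would return False.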

    BUT before you run off and spend hours doing that, GSA SER already has this functionality built in. Just go to program options >> advanced >> tools >> Import Urls (identify platform and sort in). That loads each page, discards the ones GSA can't post to, and saves the ones it can post to, separated by platform, in the identified global folder.

    So either the page scanner loads each page, checks it, and sorts it, or GSA does. But with Scrapebox you first have to copy the footprints out of all the engine definition files and build the mask file before it will work. Anyone will tell you I think Scrapebox is an amazing tool worth its weight in gold, but I think even the Scrapebox developer would tell you there isn't much sense in reinventing the wheel.
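    If you did want to go the manual route, the copying step could be scripted. The Python sketch below walks an engines folder, collects the footprints from every "page must have" line, and writes one plain-text file per engine with one footprint per line. Everything specific here is an assumption: the function names are hypothetical, the engine files are assumed to end in .ini, and the one-footprint-per-line output is just a convenient intermediate format, not necessarily what the Scrapebox page scanner expects.

```python
from pathlib import Path
import re

# Matches "page must have", optionally numbered, followed by "=".
RULE_RE = re.compile(r"^\s*page must have\d*\s*=\s*(.*)$", re.IGNORECASE)


def collect_footprints(engines_dir):
    """Return {engine_name: [unique footprints]} for every *.ini file."""
    result = {}
    for path in sorted(Path(engines_dir).glob("*.ini")):  # extension assumed
        text = path.read_text(encoding="utf-8", errors="replace")
        seen = []
        for line in text.splitlines():
            match = RULE_RE.match(line)
            if not match:
                continue
            # Assumption: "|" separates alternative footprints.
            for footprint in match.group(1).split("|"):
                footprint = footprint.strip()
                if footprint and footprint not in seen:
                    seen.append(footprint)
        if seen:
            result[path.stem] = seen
    return result


def write_footprint_files(footprints, out_dir):
    """Write one text file per engine, one footprint per line."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for engine, items in footprints.items():
        target = out / f"{engine}.txt"
        target.write_text("\n".join(items) + "\n", encoding="utf-8")
```

    You would call collect_footprints on your own engines folder and then write_footprint_files on the result. Even with that scripted, SER's built-in import still saves you from maintaining the footprint files whenever the engines change.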
     
    • Thanks x 1
  3. Monie

    Wow, OK, thanks so much loopline. This is the first time I've noticed the "Tools" menu in GSA SER, and it sure has a truckload of cool options. :)
     
  4. loopline

    Ha, yes it has a lot of useful tools in there.
     
  5. s4nt0s

    s4nt0s Jr. VIP Premium Member

    Joined:
    Jul 10, 2009
    Messages:
    3,863
    Likes Received:
    1,970
    Location:
    Texas
    Just wanted to mention that a standalone tool for this is coming soon. *teaser screenshot*

    [IMG]
     
    • Thanks x 2
  6. Monie

    Awesome stuff santos! :D
     
  7. Monie

    Santos, will this standalone app be faster than SER at identifying sites? Because SER actually seems to be a bit slow at this.
     
  8. s4nt0s

    Yes, it's pretty fast for sure. You can set however many threads you want for each project, and you can run multiple "identify and sort in" projects at the same time.
     
  9. TrevorB

    TrevorB Senior Member

    Joined:
    Dec 21, 2011
    Messages:
    1,185
    Likes Received:
    361
    Location:
    Canada
    When will that tool be available?
     
  10. s4nt0s

    No exact ETA, but it will be very soon. Give or take a couple of weeks.
     
    • Thanks x 1
  11. Monie

    That's nice, Santos. This will surely improve the efficiency of any SER setup.

    I just discovered that running an identified list (with 10 private proxies and 150 threads) gives me 45 LPM versus just 25 LPM on a raw scraped list.
     
  12. hebnar

    hebnar Newbie

    Joined:
    Sep 21, 2014
    Messages:
    6
    Likes Received:
    0
    Looking forward to having this feature. :)
     
  13. SonaBG

    SonaBG Junior Member

    Joined:
    Feb 17, 2014
    Messages:
    180
    Likes Received:
    18
    Gender:
    Female
    Occupation:
    SEO Consultant
    Location:
    Armenia
    Would be great to have it ;)