Identify forums from list of urls

Discussion in 'Black Hat SEO Tools' started by AgentZero, Jun 22, 2016.

  1. AgentZero

    AgentZero Regular Member

    Joined:
    Jul 10, 2008
    Messages:
    405
    Likes Received:
    83
    Occupation:
    Self Employed
    Location:
    Playboy Mansion
    Hi guys,

    I have a massive list of URLs and want to identify which of them are forums. Are there any good tools that can do this, or a way to do it with ScrapeBox?

    Cheers,
    AZ
     
  2. laur.laurix

    laur.laurix Power Member

    Joined:
    May 8, 2013
    Messages:
    600
    Likes Received:
    222
    Occupation:
    Reverse Engineering Maniac
    Location:
    Mars
    It can be done with SB (ScrapeBox). Take a look at the data extractor plugin.
     
  3. vishalgmistry

    vishalgmistry Regular Member

    Joined:
    Sep 25, 2008
    Messages:
    321
    Likes Received:
    521
    1. Open:
    Code:
    http://tools.buzzstream.com/meta-tag-extractor
    2. Download the CSV.
    3. Filter the data: keep the rows whose title field contains the word "forum" or "discussion".
    4. Delete the rest.

    :)
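
    The filtering in steps 3 and 4 can also be scripted once the CSV is downloaded. This is a minimal sketch only: the column names "url" and "title" and the sample rows are assumptions for illustration, so check the headers of the real export before using it.

```python
import csv
import io

# Keywords taken from step 3 above.
KEYWORDS = ("forum", "discussion")


def filter_forum_rows(csv_text, title_field="title", keywords=KEYWORDS):
    """Return the rows whose title contains any keyword (case-insensitive).

    title_field is an assumed column name; adjust it to the real CSV header.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = []
    for row in reader:
        title = (row.get(title_field) or "").lower()
        if any(keyword in title for keyword in keywords):
            kept.append(row)
    return kept


# Hypothetical sample data standing in for the downloaded CSV.
SAMPLE = """url,title
http://example.com/board,Example Community Forum
http://example.org/,Example Shop
http://example.net/talk,General Discussion Board
"""

for row in filter_forum_rows(SAMPLE):
    print(row["url"])
```

    Matching on the title alone will miss forums that don't use either word in their title, so treat the result as a first pass.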
     
  4. loopline

    loopline Jr. VIP

    Joined:
    Jan 25, 2009
    Messages:
    3,729
    Likes Received:
    1,996
    Gender:
    Male
    Home Page:
    The page scanner can do this. It comes preloaded with the platforms that ScrapeBox supports, but if you're talking about forums you'll probably want to input or build your own list. Once you do that, it will scan the pages and match content or HTML footprints.
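
    To illustrate the idea of footprint matching outside of ScrapeBox, here is a rough sketch. The footprint patterns below are illustrative assumptions based on common forum footers, not ScrapeBox's actual platform definitions, and the page HTML is assumed to have been downloaded already.

```python
import re

# Hypothetical footprint list: platform name -> regex patterns (lowercase).
FOOTPRINTS = {
    "vBulletin": [r"powered by vbulletin"],
    "phpBB": [r"powered by\s+(<a[^>]*>)?phpbb"],
    "SMF": [r"powered by smf", r"simple machines"],
    "XenForo": [r"xenforo"],
}


def detect_forum_platform(html):
    """Return the first platform whose footprint appears in the HTML, else None."""
    text = html.lower()
    for platform, patterns in FOOTPRINTS.items():
        if any(re.search(pattern, text) for pattern in patterns):
            return platform
    return None


# Made-up page snippets for demonstration.
pages = {
    "http://example.com/board": '<div id="footer">Powered by vBulletin</div>',
    "http://example.org/": "<p>Welcome to my shop</p>",
}

for url, html in pages.items():
    print(url, detect_forum_platform(html))
```

    A URL whose page matches no footprint comes back as None, which is why building a list that covers the forum software you care about matters.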




    The custom data scraper mentioned above could be used if you want to scrape specific data from the forums. There is also a video about the custom data scraper on the same channel as the above video.
     
  5. AgentZero

    AgentZero Regular Member

    Thanks for that!

    I used the ScrapeBox page scanner and I seem to be getting 407 errors. Any ideas?
     
  6. loopline

    loopline Jr. VIP

    407 is a proxy authentication error (HTTP 407 Proxy Authentication Required). That's coming from your proxies: either they are set up wrong or the IP authentication isn't set up correctly. Sort out your proxies and that will fix it.
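
    If you want to test a proxy outside of ScrapeBox, something like the following sketch works with Python's standard library. The host, port, credentials and test URL are placeholders (assumptions), and check_proxy is a hypothetical helper name.

```python
import urllib.error
import urllib.request
from http import HTTPStatus
from urllib.parse import quote


def build_proxy_url(host, port, username=None, password=None):
    """Build a proxy URL, embedding URL-encoded credentials when given."""
    if username is None:
        return f"http://{host}:{port}"
    user = quote(username, safe="")
    pwd = quote(password or "", safe="")
    return f"http://{user}:{pwd}@{host}:{port}"


def explain_status(code):
    """Give a short hint for the status code returned through the proxy."""
    if code == HTTPStatus.PROXY_AUTHENTICATION_REQUIRED:  # 407
        return "proxy rejected the credentials or your IP is not authorized"
    if code == HTTPStatus.UNAUTHORIZED:  # 401
        return "the target site, not the proxy, requires authentication"
    return "not an authentication error"


def check_proxy(proxy_url, test_url="http://example.com/", timeout=10):
    """Fetch test_url through the proxy; return the HTTP status or None."""
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    opener = urllib.request.build_opener(handler)
    try:
        with opener.open(test_url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 407 when the proxy refuses authentication
    except OSError:
        return None  # proxy unreachable, timed out, etc.


# Example (placeholder values, needs a real proxy to run):
# status = check_proxy(build_proxy_url("203.0.113.10", 8080, "user", "secret"))
# print(status, explain_status(status))
```

    A 407 with username/password proxies usually points to wrong credentials, while a 407 with IP-authenticated proxies usually means the machine's current IP has not been authorized with the provider.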
     
  7. Joseph Lich

    Joseph Lich BANNED

    Joined:
    Nov 25, 2015
    Messages:
    402
    Likes Received:
    79
    Try without proxies when checking platforms.