How to Scrape ALL AA LINKS On a WEBSITE

Discussion in 'Black Hat SEO' started by Ghoztseo, Feb 12, 2014.

  1. Ghoztseo

    Ghoztseo Power Member

    Joined:
    Nov 24, 2012
    Messages:
    523
    Likes Received:
    208
    Occupation:
    High IQ
    Location:
    Sri Lanka
    Hello guys, I'm using Scrapebox, but I don't know how to scrape all the URLs of a website. I tried Sitemap, but it doesn't work!
     
  2. JournoNick

    JournoNick Jr. VIP Premium Member

    Joined:
    Feb 11, 2014
    Messages:
    613
    Likes Received:
    309
    Location:
    on the frontline
    Are you looking for tier targets?
     
  3. Ghoztseo

    Yes, Journo! Do you know how to do it, bro?
     
  4. reapV

    reapV Registered Member

    Joined:
    Jan 27, 2014
    Messages:
    56
    Likes Received:
    10
    This can be achieved very easily with a small Python or PHP script. I can point you to specific tutorials if you want to throw together a small tool.
     
  5. Ghoztseo

    Yes, reapV, if you don't mind, could you teach me how to scrape that bunch of links?
     
  6. 0_00_0

    0_00_0 Senior Member

    Joined:
    Oct 7, 2010
    Messages:
    1,024
    Likes Received:
    486
    Location:
    Canada
    You can just use the "site:" operator and scrape URLs normally with Scrapebox I believe.
     
  7. Ghoztseo

    Yes, it does, but I want it like this. Check out this example:

    site:xxx.com

    I need the links from xxx.com, and they should look like this:

    xxx.com/xx1
    xxx.com/xx2
    xxx.com/xx3
    xxx.com/xx4

    Like that! It's for mass spamming. BTW, I have tried the Sitemap scraper. It doesn't help me find blogs with a sitemap :\
     
  8. divok

    divok Senior Member

    Joined:
    Jul 21, 2010
    Messages:
    1,015
    Likes Received:
    634
    Location:
    http://twitter.com/divok
    First, scrape all the internal links from the homepage. Keep going as deep as you need, and then, as the last step, extract all the external links from those pages.
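To make the internal/external distinction concrete, here is a minimal sketch using only the Python standard library. It assumes a link is "internal" when its host matches the host of the page it was found on. The names `LinkCollector` and `split_links` are made up for illustration and have nothing to do with Scrapebox.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def split_links(page_url, html):
    """Return (internal, external) sets of absolute URLs found in html.

    Assumption: "internal" means same host as page_url (case-insensitive).
    """
    collector = LinkCollector()
    collector.feed(html)
    site_host = urlparse(page_url).netloc.lower()
    internal, external = set(), set()
    for href in collector.hrefs:
        # Resolve relative links and drop any #fragment part.
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        parts = urlparse(absolute)
        if parts.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, and similar
        if parts.netloc.lower() == site_host:
            internal.add(absolute)
        else:
            external.add(absolute)
    return internal, external
```

You would call `split_links` on each page you download: feed the internal set back in to go deeper, and keep the external set for the last step.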
     
  9. Winternacht

    Winternacht Junior Member

    Joined:
    Jan 7, 2011
    Messages:
    113
    Likes Received:
    46
    Use the link extractor from Scrapebox and scrape the internal links several times. This should give you almost every page on the website.
     
  10. ija61

    ija61 Senior Member

    Joined:
    Mar 2, 2011
    Messages:
    960
    Likes Received:
    634
    Gender:
    Male
    Occupation:
    The first SEO economist:)
    Location:
    Romania
    As mentioned, you can use the Link Extractor addon:

    1. Take your website and harvest with the "site:" operator (list1).
    2. Clean the resulting list (list1), load it into the Link Extractor, and extract internal links (list2).
    3. Clean the resulting list (list2) from the Link Extractor, load it again, and extract internal links (list3).
    4. Remove the links from list3 that you already have in list1 and list2, then load the resulting list into the Link Extractor again.
    ......

    Repeat this until each pass returns few or no new links.

    If you have the Automator, you can use it to automate the process.

    This is the best way to do it with SB. A second way would be to use a web spider, but a good one will cost you money.
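The loop in the steps above (extract, remove what you already have, repeat until nothing new shows up) can be sketched in a few lines of Python. This is only an illustration of the idea, not how Scrapebox works internally. `expand_until_stable` and the `extract_internal` callback are hypothetical names, and the callback is assumed to return the internal links found on one URL.

```python
def expand_until_stable(seed_urls, extract_internal, max_rounds=10):
    """Re-extract internal links until a round finds nothing new.

    seed_urls        -- starting list (list1 in the steps above)
    extract_internal -- hypothetical callback: url -> iterable of internal URLs
    max_rounds       -- safety cap so a huge site cannot loop forever
    """
    known = set(seed_urls)      # every URL collected so far
    frontier = set(seed_urls)   # URLs not yet run through the extractor
    for _ in range(max_rounds):
        if not frontier:
            break  # the last round produced no new links
        found = set()
        for url in frontier:
            found |= set(extract_internal(url))
        frontier = found - known  # keep only links we did not already have
        known |= frontier
    return known


# Toy link graph standing in for a real site.
graph = {"a": ["b", "c"], "b": ["a", "d"], "c": [], "d": ["e"], "e": []}
all_pages = expand_until_stable(["a"], lambda url: graph.get(url, []))
```

Only the new links go back into the extractor each round, which is the same reason step 4 removes the ones already in list1 and list2.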
     
  11. Ghoztseo

    Yes, guys! Thanks for your awesome advice! So I can use the web extractor to harvest internal links easily xD :)