
Question about site spidering.

Discussion in 'Black Hat SEO' started by MarketerX, Sep 17, 2011.

  1. MarketerX

    MarketerX Regular Member

    Joined:
    Mar 7, 2010
    Messages:
    398
    Likes Received:
    120
    Hey everyone, I am learning VB.NET to make a spam bot. I need to spider the sites I'm going to be spamming and only collect links that are within a certain subdirectory of the site.

    Do I need to write my own custom spidering/link-harvesting code, or is there a program that will do this for me? I think HTTrack would do it... just looking for suggestions, because I need to harvest a lot of links to use with my bot.

    :D
     
    Last edited: Sep 17, 2011
  2. Autumn

    Autumn Elite Member

    Joined:
    Nov 18, 2010
    Messages:
    2,197
    Likes Received:
    3,041
    Occupation:
    I figure out ways to make money online and then au
    Location:
    Spamville
    Just use wget to do the fetching. I believe it runs on Windows, and you can limit the paths it will follow via the command-line options. No need to reinvent the wheel. Then you can do your custom parsing on the downloaded HTML files if you need to extract more data.
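    A minimal sketch of the wget approach described above. The host and the /forum directory are placeholders; swap in the real target and a depth that suits you.

    ```shell
    # Hypothetical invocation; example.com and /forum are placeholders.
    # -r        recurse through the site
    # -l 3      limit recursion depth
    # -np       never ascend to the parent directory
    # -I /forum only follow links inside the listed directories
    # --spider  check links without saving pages (drop it to keep the HTML)
    CMD='wget -r -l 3 -np -I /forum --spider http://example.com/forum/'
    echo "$CMD"   # run the command directly once the target suits your needs
    ```

    With --spider removed, the pages land on disk for custom parsing afterwards.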
     
  3. accelerator_dd

    accelerator_dd Jr. VIP Premium Member

    Joined:
    May 14, 2010
    Messages:
    2,441
    Likes Received:
    1,005
    Occupation:
    SEO
    Location:
    IM Wonderland
    I believe this should be in the programming section, but I am sure a mod will move it.

    On your question: I code mainly in C#, but they are pretty similar from what I have heard. When parsing the URLs (with regex, I guess), there is always the possibility to check each one before following through.

    If you are doing a while loop over all the URLs and following each of them recursively, you can just skip an iteration (if the domain is different) and continue with the rest of the links.

    Again, not a VB coder, but I think this is something general, so that's that.
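    The check-before-following idea above can be sketched like this. A minimal stdlib-only example (shown in Python rather than VB.NET); `example.com` and the `/forum` subdirectory are placeholder assumptions, and `harvest` is a hypothetical helper name.

    ```python
    # Subdirectory-limited link harvesting: parse anchors from a page, then keep
    # only links on the same host whose path starts with the target subdirectory.
    from html.parser import HTMLParser
    from urllib.parse import urljoin, urlparse

    class LinkExtractor(HTMLParser):
        """Collects href values from <a> tags."""
        def __init__(self):
            super().__init__()
            self.links = []

        def handle_starttag(self, tag, attrs):
            if tag == "a":
                for name, value in attrs:
                    if name == "href" and value:
                        self.links.append(value)

    def harvest(html, base_url, subdir):
        """Return absolute links that stay on base_url's host and under subdir."""
        parser = LinkExtractor()
        parser.feed(html)
        base_host = urlparse(base_url).netloc
        keep = []
        for href in parser.links:
            absolute = urljoin(base_url, href)
            parts = urlparse(absolute)
            # Skip other domains and anything outside the target subdirectory.
            if parts.netloc == base_host and parts.path.startswith(subdir):
                keep.append(absolute)
        return keep
    ```

    In a real crawler you would fetch each harvested URL, run `harvest` on it again, and track a `seen` set so the recursion never revisits a page.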
     
  4. VIC SEO

    VIC SEO Elite Member

    Joined:
    Feb 19, 2010
    Messages:
    2,156
    Likes Received:
    363
    Gender:
    Male
    Occupation:
    SEO Specialist
    Location:
    iSynergyMedia
    Home Page:
    It might not be the case right now, but after some time we might see sites ranking according to the number of +'s they have against them. Google will consider sites that have a higher number of +'s to be more authoritative than others, and in the end that could be the determining factor. But all of this is just hearsay; we will have to wait and find out.