
WebHarvy: tips

Discussion in 'Black Hat SEO Tools' started by fmartin, Jan 31, 2014.

  1. fmartin

    fmartin Registered Member

    WebHarvy is an easy way to scrape websites, but it has several limitations:
    - You cannot give it a list of URLs to parse (or I have not found out how).
    - You cannot have proxies picked at random (it cycles through the list in order).
    - It cannot run multithreaded.

    So, here are the tips:
    -> Clone your WebHarvy bin folder 10 times.
    -> Start each clone and configure it to use a different 10% of your proxies.
    -> Create 10 .cmd files, each starting 10% of the URLs to parse, using the command line:

    Ex cmd2:

    IF EXIST A:\app\scrapebox\projet\WebHarvy\out\out2.csv ( echo "Already exist A:\app\scrapebox\projet\WebHarvy\out\out2.csv" ) else ( A:\app\webharvy\WebHarvy.exe A:\app\scrapebox\projet\WebHarvy\scrap_pj2.xml -1 A:\app\scrapebox\projet\WebHarvy\out\out2.csv )
    IF EXIST A:\app\scrapebox\projet\WebHarvy\out\out3.csv ( echo "Already exist A:\app\scrapebox\projet\WebHarvy\out\out3.csv" ) else ( A:\app\webharvy\WebHarvy.exe A:\app\scrapebox\projet\WebHarvy\scrap_pj3.xml -1 A:\app\scrapebox\projet\WebHarvy\out\out3.csv )

    ...
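
Typing these lines by hand invites copy-paste slips, so here is a minimal Python sketch that generates them. It assumes the same paths and the scrap_pjN.xml / outN.csv naming used in the example. The function names resume_line and cmd_file_body are made up for illustration, and which job numbers go into which .cmd file is up to you.

```python
# Paths copied from the example above; adjust them to your own layout.
WEBHARVY_EXE = r"A:\app\webharvy\WebHarvy.exe"
PROJECT_DIR = r"A:\app\scrapebox\projet\WebHarvy"


def resume_line(n: int) -> str:
    """Build one 'skip if the output exists, else run WebHarvy' line for job n."""
    config = rf"{PROJECT_DIR}\scrap_pj{n}.xml"
    output = rf"{PROJECT_DIR}\out\out{n}.csv"
    return (
        f'IF EXIST {output} ( echo "Already exist {output}" ) '
        f"else ( {WEBHARVY_EXE} {config} -1 {output} )"
    )


def cmd_file_body(job_numbers) -> str:
    """Join the lines for several jobs into the text of one .cmd file."""
    # Windows batch files conventionally use CRLF line endings.
    return "\r\n".join(resume_line(n) for n in job_numbers) + "\r\n"


if __name__ == "__main__":
    # Print the two lines shown in the example (jobs 2 and 3).
    print(cmd_file_body([2, 3]), end="")
```

Write the result with open("cmd2.cmd", "w", newline="") so that Python does not translate the line endings a second time.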



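    Now you can start the 10 .cmd files. If a run is stopped, restarting it resumes where it left off, because any job whose output file already exists is skipped.
    They all run in parallel, each rotating through its own proxy list, so no two threads use the same proxy.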
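
To cut the proxy list (or the list of jobs) into the 10 slices mentioned above, something like the following Python sketch could help. The helper name split_evenly and the sample proxy addresses are made up for illustration. Each slice still has to be entered into its own WebHarvy clone by hand, since how WebHarvy stores its proxy settings is not covered here.

```python
def split_evenly(items, parts=10):
    """Split items into `parts` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), parts)
    chunks = []
    start = 0
    for i in range(parts):
        # The first `extra` chunks take one additional item each.
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


if __name__ == "__main__":
    # Hypothetical proxy list; replace it with your own.
    proxies = [f"10.0.0.{i}:8080" for i in range(25)]
    for number, chunk in enumerate(split_evenly(proxies), start=1):
        print(f"clone {number}: {len(chunk)} proxies")
```

Because the chunks are contiguous and do not overlap, no proxy ends up in two clones, which is what keeps two threads from sharing one.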
     
  2. harubel

    harubel Newbie

    I have the WebHarvy software and would like to exchange it with anyone interested.