
How to avoid scraping some of the same URLs each time you use ScrapeBox

Discussion in 'Black Hat SEO Tools' started by bubbety, Apr 19, 2013.

  1. bubbety

    bubbety Newbie

    Joined:
    Apr 2, 2012
    Messages:
    5
    Likes Received:
    0
I'm pretty new to ScrapeBox and I'm wondering how I can avoid blasting duplicate URLs to my website each time I harvest for links. I have a few similar topics within my niche, which means my harvests overlap. I can't just harvest all my links in one go. So is there a way/method to avoid the same URLs being chosen? There's too much to sort it into folders and find duplicates that way. Thanks
     
  2. TrevorB

TrevorB Jr. VIP Premium Member

    Joined:
    Dec 21, 2011
    Messages:
    1,185
    Likes Received:
    361
    Location:
    Canada
  3. bubbety

    bubbety Newbie

    Joined:
    Apr 2, 2012
    Messages:
    5
    Likes Received:
    0
Yes, but does that count for other harvests I may have done previously?
     
  4. homenet

    homenet Power Member

    Joined:
    Jan 5, 2009
    Messages:
    790
    Likes Received:
    338
    Location:
    Dimension X
    Use the "Import" button and then use the compare list/compare lists on domain levels to remove any duplicates when compared to your existings list. This is what I do when I'm using Sbox:

    1 - harvest urls
    2 - post to urls
    3 - check 'posted entries' against my URL's and see how many matches you have
    4 - save those matches in a file
    5 - harvest more urls'
    6 - compare my matched entries from step 4 against the newly harvested list and remove dupes (I remove dupes from domain level but you can just remove dupe urls if you want)
    7 - back to step 2 - rinse and repeat!
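Here's a minimal Python sketch of steps 4-6, done outside ScrapeBox itself: it drops freshly harvested URLs whose domain already appears in your posted list. The filenames (posted_urls.txt, new_harvest.txt, to_post.txt) are hypothetical placeholders, not ScrapeBox files, and each file is assumed to hold one URL per line.

```python
from urllib.parse import urlparse

def domain(url: str) -> str:
    """Extract the host from a URL, ignoring any 'www.' prefix."""
    host = urlparse(url.strip()).netloc.lower()
    return host[4:] if host.startswith("www.") else host

# Domains we have already posted to (hypothetical filename).
with open("posted_urls.txt") as f:
    posted = {domain(line) for line in f if line.strip()}

# The newly harvested list (hypothetical filename).
with open("new_harvest.txt") as f:
    fresh = [line.strip() for line in f if line.strip()]

# Domain-level dedupe: keep only URLs on domains not yet posted to.
unique = [url for url in fresh if domain(url) not in posted]

with open("to_post.txt", "w") as f:
    f.write("\n".join(unique) + "\n")

print(f"{len(fresh) - len(unique)} dupes removed, {len(unique)} URLs left to post")
```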
     
  5. TrevorB

TrevorB Jr. VIP Premium Member

    Joined:
    Dec 21, 2011
    Messages:
    1,185
    Likes Received:
    361
    Location:
    Canada
You should have one main URL file with all the URLs of sites that you want to post to. Each time you harvest and add new links, just remember to remove the duplicates.

Now, from what I can understand from your post, you want to make sure that you don't post your similar sites to the same URL on each blast, correct?

Well, to make sure that this does not happen, all you would do is export your link list in batches to make different files of the URLs to post to. Then, once you get a successful submission, just remove that link from all your lists (see the sketch below for one way to do the removal).
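For the "remove that link from all your lists" step, here's a minimal sketch, assuming your batches are plain text files with one URL per line. The batch filenames and the example URL are hypothetical placeholders.

```python
from pathlib import Path

def remove_url(url: str, batch_files: list[str]) -> None:
    """Strip every occurrence of url from each batch file."""
    for name in batch_files:
        path = Path(name)
        lines = path.read_text().splitlines()
        kept = [line for line in lines if line.strip() != url]
        if len(kept) != len(lines):
            # Rewrite the file only if something was actually removed.
            path.write_text("\n".join(kept) + "\n")
            print(f"removed {url} from {name}")

# Hypothetical example: a URL that just got a successful submission.
remove_url(
    "http://example.com/blog/post-1",
    ["batch_1.txt", "batch_2.txt", "batch_3.txt"],
)
```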

Does that make sense?

I'm not that great at explaining things, LOL. Maybe someone else can chime in here to explain it a little better.
     
• Thanks x 1