problem: scrapebox cannot fit more than 1 million urls

Discussion in 'Black Hat SEO' started by nonai, Oct 6, 2014.

  1. nonai

    Oct 10, 2013
    Imagine you have a lot of keywords that you are harvesting. When ScrapeBox reaches one million URLs, it keeps harvesting. However, at that point the harvesting is useless, because ScrapeBox cannot hold more than one million URLs. Am I right?
    So what am I supposed to do if I have a lot of keywords and expect the harvested URLs to go way over one million? It would be nice if SB removed the duplicates behind the scenes while it's harvesting, so the one million limit would not be reached so easily. Can it do that?
  2. bk071

    Nov 24, 2010
    Install the ScrapeBox Crashdump Logger and enable it.
    SB will now create a new file in the harvester folder whenever the previous file reaches 1M URLs. For example, if you harvest 7 million URLs, you will have 7 txt files with a million URLs in each.
  3. Peter Ngo

    Apr 23, 2013
    It is a built-in feature: when the scrape reaches 1 million URLs, it automatically saves to a different file.

    You should change your harvesting folder's size in the settings so it won't interrupt your scrape.
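
    On the duplicates question: since the harvest ends up split across several txt files of a million URLs each, one option is to merge and dedupe them outside SB afterwards. Below is a rough Python sketch of that idea, not anything SB ships with. The function name `merge_unique_urls` and the folder and file names in the usage comment are made up for illustration, and it assumes one URL per line and enough RAM to keep every unique URL in a set (should be fine for a few million).

    ```python
    def merge_unique_urls(input_paths, output_path):
        """Merge several URL list files into one file, skipping duplicates.

        Assumes one URL per line. Keeps the first occurrence of each URL
        and preserves the order in which URLs are first seen.
        Returns the number of unique URLs written.
        """
        seen = set()   # every URL written so far (held in memory)
        written = 0
        with open(output_path, "w", encoding="utf-8") as out:
            for path in input_paths:
                # errors="replace" so one badly encoded line does not stop the merge
                with open(path, "r", encoding="utf-8", errors="replace") as src:
                    for line in src:
                        url = line.strip()
                        if not url or url in seen:
                            continue  # skip blank lines and duplicates
                        seen.add(url)
                        out.write(url + "\n")
                        written += 1
        return written


    # Usage (hypothetical paths, adjust to wherever your harvester files are):
    #   import glob
    #   files = sorted(glob.glob("harvester_output/*.txt"))
    #   total = merge_unique_urls(files, "merged_unique.txt")
    #   print(total, "unique URLs written")
    ```

    This only compares exact lines, so http://example.com/page and http://example.com/page/ count as two different URLs. Write the merged file outside the folder you are globbing, otherwise a second run will read its own output.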