Scrapebox Help

judaculla · Mar 4, 2016

I'm new to scrapebox, and I'm trying to build out a list of auto-approve urls. So far, my process is this:

scrape sites by keyword+footprint
post a comment to harvested sites
check all urls for comment url (filtering for successful posts)
export only urls of successful comment urls
Import successful comment urls into link extractor and extract all internal links
Rinse & Repeat the Link extracting process 2-3 more times

Now, at this point, I have taken a list of about 150-300 urls that allowed auto-approved commenting and expanded it considerably to include more urls from the same sites. (I'm assuming that most sites that allow an auto-approved comment on one url would allow one on all of them.)

So, through this process I find myself with a list of around 80K urls, which accounts for the majority of all posts/articles/pages from those original comment-allowing sites.

The problem is that this final list contains a lot of urls for pages that wouldn't have a comment section (category archives, tag archives, author archives, contact pages, user profiles, etc.), and I need to filter them again to cut the flak out. I could just load them into the Comment Poster again and extract successful urls from those, but that seems inefficient, and I'd also like to avoid alerting said sites to a potential influx of spam until it's already too late.

Is there a way to load my url list into Scrapebox, search the urls against a footprint (like "comment", "leave a comment", etc.), and get back a "found" or "not found" response for each?

kveldulv · Mar 5, 2016

You can filter the URLs with:

Remove / Filter
- Remove URLs containing (mask, e.g. "contact", or *whatever*|*etc*)
- Remove URLs containing entries from (file with list of masks)
- Remove URLs not containing
- Remove URLs not containing entries from

Addons > Page Scanner checks for footprints in the source code of the pages and outputs lists of found/notfound.

Tools > Text file tool can also process files for lines that match/notmatch a mask.
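
For anyone who wants to see what the "containing / not containing" mask filters amount to, here is a rough Python sketch of the same idea applied to a plain list of URLs. It is not how Scrapebox implements it; the function names, the sample masks, and the rule that a mask without `*` is treated as a plain substring are all assumptions made for illustration.

```python
from fnmatch import fnmatchcase

def matches_any(url, masks):
    """Return True if the URL matches at least one mask.

    Assumption: a mask containing '*' is a wildcard pattern; any other
    mask is a simple case-insensitive substring test. Note that fnmatch
    patterns also treat '?' and '[' as special characters.
    """
    url = url.lower()
    for mask in masks:
        mask = mask.lower()
        if "*" in mask:
            if fnmatchcase(url, mask):
                return True
        elif mask in url:
            return True
    return False

def remove_containing(urls, masks):
    """Keep only the URLs that match none of the masks."""
    return [u for u in urls if not matches_any(u, masks)]

def remove_not_containing(urls, masks):
    """Keep only the URLs that match at least one mask."""
    return [u for u in urls if matches_any(u, masks)]

# Hypothetical sample data, for illustration only.
urls = [
    "http://example.com/2016/03/some-post/",
    "http://example.com/category/news/",
    "http://example.com/contact",
    "http://example.com/author/bob/",
]
masks = ["contact", "*/category/*", "*/author/*"]

kept = remove_containing(urls, masks)
dropped = remove_not_containing(urls, masks)
```

With the sample data, `kept` holds only the post URL and `dropped` holds the archive, contact, and author URLs.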

For what you're doing, repeating the same actions over and over, buy the Automator plugin.
With it you can automate the process and save yourself lots of mindless clicking.
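
The found/notfound split that the Page Scanner produces can be pictured with a small Python sketch. This only illustrates the idea of checking page source against a list of footprints; it does not reproduce the addon. The `pages` dictionary (URL mapped to already-saved HTML source), the function name, and the sample footprints are hypothetical.

```python
def split_by_footprint(pages, footprints):
    """Split URLs into (found, not_found) lists.

    pages: dict mapping URL -> HTML source text (assumed already saved).
    footprints: strings to look for, compared case-insensitively.
    """
    lowered = [f.lower() for f in footprints]
    found, not_found = [], []
    for url, html in pages.items():
        text = html.lower()
        if any(f in text for f in lowered):
            found.append(url)
        else:
            not_found.append(url)
    return found, not_found

# Hypothetical sample data, for illustration only.
pages = {
    "http://example.com/post-1/": "<h3>Leave a Comment</h3><form></form>",
    "http://example.com/tag/misc/": "<h1>Posts tagged misc</h1>",
}
found, not_found = split_by_footprint(pages, ["leave a comment"])
```

A plain substring test is the simplest possible check; a real footprint list would probably need several phrases to cover different themes and platforms.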

judaculla · Mar 5, 2016

kveldulv,

Thanks for the help, I think the page scanner and text file tool sound like exactly what I was looking for.

I actually bought SB a while back with some discount that included the Automator plugin as well; I just haven't had a lot of time to play with it yet. The whole day job thing is a HUGE inconvenience....

Anyway, thanks a lot for the help, I'm going to give it a try and see what I can get out of it.

judaculla · Mar 5, 2016

Yeah, it's amazing how powerful Scrapebox is. I'm pretty new to software like this, but it really does blow my mind how much info can be gathered and refined, and how quickly it can be done. I'd say I've been able to cut out ~30% of unwanted urls just with the filtering tools so far.

sms2indo · Mar 6, 2016

I'm new to Scrapebox and need some direction to get started with it.
Are there any posts I can follow?
