Is there a content scraper that does this?

Bostoncab

Elite Member
Joined: Dec 31, 2009
Messages: 2,256
Reaction score: 514
I want a content scraper that I can feed a list of URLs scraped from Scrapebox. I want it to go to each URL, extract the words, images, meta tags, and other page elements, and spit it all out in one .html file.

Any such thing?
 
Not possible, since the HTML tags and content placement differ from site to site and vary across platforms as well. At best, a bot can be made that is platform specific, say for WP, but it will still have very low accuracy because of the varied content placement.
 
My idea was to take the top 1,000 results for a keyword from Scrapebox and eliminate the duplicate URLs. Then have the "bot" or scraper, whatever you want to call it, go to each result, harvest all the content, and stuff it all together on one page. Then I take the HTML page, copy its content into a WP post, upload all the files (images etc.) that the bot harvested to the WP site, and presto: I have one perfect keyword-targeted page of unique, non-spun content.

No good?
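
For reference, the "eliminate the duplicate URLs" step is the easy part. A minimal sketch, assuming the harvested list is plain text with one URL per line; `normalize_url` and `dedupe_urls` are hypothetical names for illustration, not part of Scrapebox, and the normalization rules (lowercase host, drop the fragment and trailing slash) are only one reasonable choice:

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Build a comparison key: lowercase scheme and host, no fragment, no trailing slash."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def dedupe_urls(urls):
    """Keep the first occurrence of each normalized URL, preserving the original order."""
    seen = set()
    unique = []
    for url in urls:
        if not url.strip():
            continue  # skip blank lines in the exported list
        key = normalize_url(url)
        if key not in seen:
            seen.add(key)
            unique.append(url.strip())
    return unique
```

This only removes duplicate addresses. It does not detect the same article published under two different URLs.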
 

Not to create any hype, but something similar is one of the modules of Licorne AIO. It doesn't allow custom URLs, but it scrapes content based on a KW from sources (from Google) and saves images, videos, articles, etc. in an HTML file, thus creating a complete site.

However, what you describe is not possible, or at least not efficient, for the reasons mentioned in the previous post.

It would be best if the harvested URLs were all on a single platform or a single site. Then it could be done even with ubot.
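
To make the trade-off concrete, here is a rough sketch of the generic approach, using only Python's standard library. It is not how Licorne AIO or ubot work; `PageHarvester`, `harvest`, and `combine` are hypothetical names, and fetching the pages is left out. It pulls the title, named meta tags, image sources, and all visible text from one page's HTML, then merges several pages into a single HTML string. Note that it grabs every bit of text, including menus and footers, which is exactly the accuracy problem described above: picking out only the main content needs per-site or per-platform rules.

```python
from html import escape
from html.parser import HTMLParser


class PageHarvester(HTMLParser):
    """Collect title, named meta tags, image sources, and visible text from one page."""

    SKIP = {"script", "style"}  # text inside these tags is not visible content

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.title = ""
        self.meta = {}
        self.images = []
        self.text = []
        self._in_title = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag in self.SKIP:
            self._skip_depth += 1
        elif tag == "meta" and attrs.get("name") and attrs.get("content"):
            self.meta[attrs["name"].lower()] = attrs["content"]
        elif tag == "img" and attrs.get("src"):
            self.images.append(attrs["src"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        chunk = " ".join(data.split())  # collapse whitespace
        if not chunk or self._skip_depth:
            return
        if self._in_title:
            self.title += chunk
        else:
            self.text.append(chunk)


def harvest(html_source):
    """Parse one page's HTML and return the harvested pieces as a dict."""
    parser = PageHarvester()
    parser.feed(html_source)
    parser.close()
    return {
        "title": parser.title,
        "meta": parser.meta,
        "images": parser.images,
        "text": " ".join(parser.text),
    }


def combine(pages):
    """Merge harvested pages (a dict of url -> harvest result) into one HTML string."""
    sections = []
    for url, page in pages.items():
        heading = escape(page["title"] or url)
        body = escape(page["text"])
        imgs = "".join('<img src="%s">' % escape(src, quote=True) for src in page["images"])
        sections.append("<section><h2>%s</h2><p>%s</p>%s</section>" % (heading, body, imgs))
    return "<html><body>" + "".join(sections) + "</body></html>"
```

Image paths are kept as found, so relative ones would still need resolving against each page's URL (for example with `urllib.parse.urljoin`) before the files could be downloaded.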
 