Where to run a large-scale crawler and minimize bandwidth cost? Cloud unlimited in-bound?

Discussion in 'Black Hat SEO Tools' started by darkmonk, Sep 7, 2012.

    Nov 21, 2007
    Lately I've been toying with the idea of building my own private crawler. Designing and building the data storage and crawler architecture is not a problem for me, but I am wondering whether this is a futile endeavor because of bandwidth costs alone. I would love to put this on a cloud platform. Most cloud platforms seem to offer "unlimited" in-bound bandwidth (Amazon EC2, Windows Azure, and HP Cloud, for instance).

    Has anyone tackled this problem? If you start consuming a couple hundred terabytes a month of in-bound bandwidth, do the providers actually honor the "unlimited" in-bound claim?
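
    For a sense of scale, here is a rough back-of-the-envelope conversion from monthly volume to the sustained line rate it implies. It is only a sketch: it assumes decimal terabytes (10^12 bytes), a 30-day month, and perfectly even traffic, and the function name sustained_mbps is just something made up for illustration.

```python
def sustained_mbps(terabytes_per_month: float, days: int = 30) -> float:
    """Average throughput in Mbit/s needed to move the given monthly volume.

    Assumes decimal terabytes (1 TB = 10**12 bytes) and evenly spread traffic.
    """
    bits = terabytes_per_month * 1e12 * 8   # decimal TB -> bits
    seconds = days * 24 * 60 * 60           # length of the month in seconds
    return bits / seconds / 1e6             # bits/s -> Mbit/s


if __name__ == "__main__":
    for tb in (100, 200, 300):
        print(f"{tb} TB/month ~ {sustained_mbps(tb):.0f} Mbit/s sustained")
```

    Under those assumptions, 200 TB a month works out to roughly 617 Mbit/s around the clock, so the real question is probably whether a provider will let you hold most of a gigabit link saturated indefinitely.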