Facebook Sues Data Scraper

Discussion in 'BlackHat Lounge' started by Guaji, Apr 5, 2010.

  1. Guaji

    Guaji Regular Member

    May 28, 2009
    A computer programmer culls data from 210 million Facebook profiles--and pisses Facebook off in the process.

    Remember those fascinating graphs by Peter Warden that used Facebook data to illustrate, for example, people's interests and common names across the U.S.? Facebook has totally squashed the project. But the privacy concerns remain.

    Warden gathered that data from public profiles using "crawling" software similar to what's commonly available on the Web; he was planning to release the set to select researchers, who proposed cross-referencing that data in all sorts of cool ways, trying to find links, for example, between income, employment, and social connections. (Does having more friends equal more cash? Is there a threshold, where too many friends means you're way too social?) As Warden was at pains to point out, the data is exceedingly public: You can still access it through Google's caches; and as Warden writes, "Nobody ever alleged that my data gathering was outside the rules the Web has operated by since crawlers existed."

    Still, Facebook was none too pleased: They first requested a thorough scrubbing of the data, to eliminate any personal info that might be used by spammers. And eventually, they simply threatened to sue Warden, unless he deleted all the data. They were alleging a terms of service violation. Warden didn't have any money to fight the suit, so he deleted the data.

    Facebook doesn't look totally evil though. According to Warden: "From my conversations with technical folks at Facebook, there seems to be a real commitment to figuring out safeguards around the widespread availability of this data." And yet, the problem will probably remain--if not on Facebook, then somewhere else.

    Again, Warden: "To the many researchers I've disappointed, there's a whole world of similar data available from other sources too. By downloading the Google Profile crawling code you can build your own data set, and it's easy enough to build something similar for Twitter. I'm already in the middle of some new research based on public Buzz information, so this won't be stopping my work."

    Still, why can't Facebook actually encourage this type of research, while working on its privacy issues in parallel? They're sitting on top of data the likes of which no one has ever seen before--it's naive to even guess at what sorts of fascinating research could result.

    Here is the original news:
  2. Anonyma

    Anonyma Newbie

    Mar 28, 2010
  3. evilman11

    evilman11 Junior Member

    Apr 6, 2009
    More like threatened to sue; they never really sued the guy. They just told him to take the shit down or they'd financially fuck him in the ass, pretty much. Anyways, that's a lot of damn scraping... :p