
C#: How can I scrape Twitter usernames?

Discussion in 'C, C++, C#' started by kytro360, Sep 7, 2011.

  1. kytro360

    kytro360 Power Member

    Joined:
    Jan 12, 2010
    Messages:
    703
    Likes Received:
    733
    I was wondering if anyone knows the code to scrape usernames from Twitter search. Any help would be appreciated, thanks :)
     
    Last edited: Sep 7, 2011
  2. xcubic

    xcubic Jr. VIP Premium Member

    Joined:
    Apr 24, 2008
    Messages:
    440
    Likes Received:
    582
    Location:
    Internet

    You can scrape someone's friends, and the friends of those friends...
     
  3. darshan1994

    darshan1994 BANNED

    Joined:
    Oct 9, 2009
    Messages:
    654
    Likes Received:
    318
    http://www.codeproject.com/KB/cs/TagBasedHtmlParser.aspx < This could work if you find out which tags Twitter uses for usernames.
     
  4. supernour09

    supernour09 Junior Member

    Joined:
    Jul 20, 2011
    Messages:
    191
    Likes Received:
    6
    I can help with iMacros for 10 bucks.
     
  5. kytro360

    kytro360 Power Member

    Joined:
    Jan 12, 2010
    Messages:
    703
    Likes Received:
    733
    Twitter doesn't really even use tags; they use links.
     
  6. xenon2010

    xenon2010 Regular Member

    Joined:
    Apr 27, 2010
    Messages:
    231
    Likes Received:
    48
    Occupation:
    web and desktop apps programmer
    Location:
    prison
    1- download the page...
    2- use regex to parse the page..
    3- collect the data..
    4- do whatever you want with it..

    It's simple; you can use a simple browser control to achieve it...
     
  7. kytro360

    kytro360 Power Member

    Joined:
    Jan 12, 2010
    Messages:
    703
    Likes Received:
    733
    @xenon2010 :/ Thanks now I just gotta learn how to download the page, and regex :/
     
  8. scriptomania

    scriptomania Junior Member

    Joined:
    Dec 28, 2010
    Messages:
    127
    Likes Received:
    249
    Occupation:
    A full time pirate at sea
    Location:
    The European capital of politics
    This might help you out: http://www.blackhatworld.com/blackhat-seo/black-hat-seo-tools/345255-free-python-codng-available.html

    Something rather simple that I wrote in Python. It uses Twitter's API (so you are restricted to a certain number of calls per hour), but feel free to improve on it if you like. I'll probably publish a new version pretty soon; I'm just so busy with other projects right now.

    Anyhow, if you want to do it without an API (correct me if I'm wrong, guys, maybe Twitter changed some stuff around... idk), you'll need to use C#'s HttpWebRequest class to send a simple "GET" request to download the desired webpage. Then parse it using regex.

    Just look at the source code of the page and try to figure out a pattern. For example, suppose that usernames can be found between <span> tags. Your regex pattern would look something like:

    Code:
    <span>(.*?)</span>
    Resources:

    http requests: http://www.csharp-station.com/HowTo/HttpWebFetch.aspx
    regex: http://msdn.microsoft.com/en-us/library/ms228595(v=vs.80).aspx#Y347
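
A minimal sketch of the two steps described above (GET the page, then run the regex), assuming the usernames really do sit between plain <span> tags. That is only the hypothetical pattern from this post, so check the real page source first. The names `SpanScraper`, `DownloadPage` and `ExtractSpans` are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text.RegularExpressions;

public static class SpanScraper
{
    // Sends a plain GET request and returns the response body as a string.
    public static string DownloadPage(string url)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "GET";
        using (WebResponse response = request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }

    // Returns the text captured between each <span>...</span> pair.
    // Assumes bare <span> tags with no attributes (hypothetical page layout).
    public static List<string> ExtractSpans(string html)
    {
        List<string> results = new List<string>();
        foreach (Match m in Regex.Matches(html, "<span>(.*?)</span>"))
        {
            results.Add(m.Groups[1].Value);
        }
        return results;
    }
}
```

You would chain them as `SpanScraper.ExtractSpans(SpanScraper.DownloadPage(url))`. The capture group (`Groups[1]`) is what drops the surrounding tags; `m.Value` alone would include them.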
     
  9. xenon2010

    xenon2010 Regular Member

    Joined:
    Apr 27, 2010
    Messages:
    231
    Likes Received:
    48
    Occupation:
    web and desktop apps programmer
    Location:
    prison
    You can achieve that with any language...
    If you are looking for a scripting language, go for PHP.
    Use cURL to download the page, then use either regex or a tag parser to extract the usernames.
    Personally I prefer regex (supported by all languages); it's harder but more advanced.

    If you are looking for desktop applications, go for C# or VB (although C# has the nicer syntax). Use the WebClient class or a web browser control to download the page, then use regex or an HTML tag parser to extract the data from the page.

    Now go and learn something useful :p
    And rep me up :D
     
  10. ashishthakkar

    ashishthakkar Junior Member

    Joined:
    Nov 4, 2011
    Messages:
    123
    Likes Received:
    10
    I have done exactly this using Visual Basic 6.

    I don't have the required tools to do it in C#, or I would code it.


    ~ASHISH
     
  11. BigoS

    BigoS Newbie

    Joined:
    May 8, 2011
    Messages:
    21
    Likes Received:
    7
    Location:
    Europe
    Hi, you can find all the information on the Twitter site. Google for "Twitter API", then go to the API resource documentation. I can't post a direct URL because of the moderation system. Cheers
     
  12. crashed

    crashed Jr. VIP Premium Member

    Joined:
    Aug 13, 2008
    Messages:
    958
    Likes Received:
    1,198
    Occupation:
    Guru-slayer
    Location:
    Behind the VPN...
    Have a look at the TweetSharp.Net API for C#... It makes it so easy to grab the IDs of someone's friends, etc.
     
  13. kokoloko75

    kokoloko75 Elite Member

    Joined:
    Jan 1, 2011
    Messages:
    1,628
    Likes Received:
    1,936
    Occupation:
    Design director
    Location:
    Paris (France)
    Old thread...

    Xenon2010 has already shared the method:
    1. Download the page with GET.
    2. Parse the code with a regular expression to extract the usernames.
    3. Put each username in a list box.

    Use Twitter's search feed to get optimized data, for example:
    Code:
    http://search.twitter.com/search.atom?q=love
    Then, extract string like :
    Code:
    <uri>http://twitter.com/IamRainey89</uri>
    And remove "<uri>http://twitter.com/" and "</uri>" in the string.
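
A rough sketch of that "find the string, then strip the wrapper" idea using plain string searching instead of regex. It assumes the feed text contains entries shaped exactly like the `<uri>` example above; `UriTagStripper` and `ExtractUsernames` are made-up names.

```csharp
using System;
using System.Collections.Generic;

public static class UriTagStripper
{
    const string Prefix = "<uri>http://twitter.com/";
    const string Suffix = "</uri>";

    // Walks the feed text, finds each <uri>...</uri> entry,
    // and keeps only the part between the prefix and the suffix.
    public static List<string> ExtractUsernames(string feed)
    {
        List<string> names = new List<string>();
        int pos = 0;
        while (true)
        {
            int start = feed.IndexOf(Prefix, pos, StringComparison.Ordinal);
            if (start < 0) break;               // no more entries
            start += Prefix.Length;             // skip past the prefix
            int end = feed.IndexOf(Suffix, start, StringComparison.Ordinal);
            if (end < 0) break;                 // unterminated entry, stop
            names.Add(feed.Substring(start, end - start));
            pos = end + Suffix.Length;          // continue after this entry
        }
        return names;
    }
}
```

Each returned string can then go straight into the list box from step 3, e.g. with `listBox.Items.Add(name)` in a WinForms app (assumed UI, not shown here).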

    Beny
     
  14. Senotaru

    Senotaru Registered Member

    Joined:
    Jan 17, 2011
    Messages:
    67
    Likes Received:
    11
    If you use the feed method above, here is a regex to scrape out all the usernames:

    Code:
    (?<=<uri>http://twitter.com/).*?(?=</uri>)
    To use it, get the webpage with HttpWebRequest or some other method of getting the source code, then add:

    Code:
    using System.Text.RegularExpressions;
    to the top of your code.

    Where you want to do the scraping use:

    Code:
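    // The lookbehind/lookahead keep the <uri> wrapper out of the match, so Value is just the username.
    // Verbatim string (@"...") so the dot in twitter.com can be escaped as \.
    Regex userRegex = new Regex(@"(?<=<uri>http://twitter\.com/).*?(?=</uri>)");
    foreach (Match item in userRegex.Matches(sourceCodeStringHere))
    {
        string matchedText = item.Value;
        // Your code here
    }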
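
One possible way to fill in the "Your code here" part, as a sketch only: collect each username once, since the same author can show up several times in search results. It assumes usernames should be compared case-insensitively; `FeedUsernames` and `ExtractUnique` are made-up names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class FeedUsernames
{
    // Same lookbehind/lookahead pattern as above, with the dot escaped.
    static readonly Regex UserRegex =
        new Regex(@"(?<=<uri>http://twitter\.com/).*?(?=</uri>)");

    // Returns each username once, in order of first appearance.
    public static List<string> ExtractUnique(string feedSource)
    {
        // HashSet.Add returns false for a value it has already seen.
        HashSet<string> seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        List<string> unique = new List<string>();
        foreach (Match item in UserRegex.Matches(feedSource))
        {
            if (seen.Add(item.Value))
            {
                unique.Add(item.Value);
            }
        }
        return unique;
    }
}
```

Pass in the downloaded feed source, e.g. `FeedUsernames.ExtractUnique(sourceCodeStringHere)`.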
    
     