Scraping usernames from youtube video with VB.net

Discussion in 'Visual Basic .NET' started by Kalashnikov, Mar 20, 2013.

  1. Kalashnikov

    Kalashnikov Junior Member

    Joined:
    Feb 18, 2011
    Messages:
    148
    Likes Received:
    59
    Location:
    Northern Ireland
    Here's what I've started with:

    Code:
    ' Raw HTML of the loaded page (not used below).
    Dim browser_text As String = browser.DocumentText
    ' Walk every element on the page and keep the hrefs that point at a user channel.
    Dim allelements As HtmlElementCollection = browser.Document.All
    For Each webpageelement As HtmlElement In allelements
        Dim raw_id As String = webpageelement.GetAttribute("href")
        If raw_id.Contains("/user/") Then
            Dim new_id As String = raw_id
            txtScrapedList.Items.Add(new_id)
        End If
    Next
    
    It works like this:

    http://pokit.org/get/img/b92c9423a2a6408001f30165222278a5.jpg

    Has anyone got a more efficient method? I imagine that removing duplicates, automating the "show more" button, and restarting the process each time will be extremely inefficient.
     
  2. DarkPixel

    DarkPixel Jr. VIP Premium Member

    Joined:
    Oct 4, 2011
    Messages:
    1,328
    Likes Received:
    1,239
    Location:
    ↓↓↓↓
    Why don't you now delete the beginning of each string, up to where the username starts (delete this: http://www.youtube.com/user/ )? That way you will have the usernames instead of URLs. Also, if there is a "?feature=watch" at the end, delete that too. Voilà, you have usernames.
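The trimming described above could be sketched roughly as follows. This is only an illustration: `ToUserName` is a hypothetical helper name, `SomeUser` is a made-up username, and the choice to cut at the first `?`, `/` or `#` is an assumption about what can follow a username in these links.

```vbnet
Imports System

Module UserNameDemo
    ' Hypothetical helper: reduces a channel URL to the bare username.
    Function ToUserName(url As String) As String
        Const marker As String = "/user/"
        Dim start As Integer = url.IndexOf(marker, StringComparison.OrdinalIgnoreCase)
        If start < 0 Then Return String.Empty ' not a channel link

        ' Keep only what follows "/user/".
        Dim name As String = url.Substring(start + marker.Length)

        ' Assumption: the username ends at the first "?", "/" or "#",
        ' which drops tails such as "?feature=watch".
        Dim cut As Integer = name.IndexOfAny(New Char() {"?"c, "/"c, "#"c})
        If cut >= 0 Then name = name.Substring(0, cut)
        Return name
    End Function

    Sub Main()
        ' Prints: SomeUser
        Console.WriteLine(ToUserName("http://www.youtube.com/user/SomeUser?feature=watch"))
    End Sub
End Module
```

Searching for `/user/` rather than the full `http://www.youtube.com/user/` prefix means relative links and `https` links are handled the same way.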
     
    • Thanks x 1
  3. Kalashnikov

    Kalashnikov Junior Member

    That's not the issue; that's something I will be doing later, once I have a proper method down. The problem is automating the "show more" button, or perhaps finding an alternative method. Just seeing what the bot gurus on the forum would suggest.
     
  4. r000k

    r000k Registered Member

    Joined:
    Jan 10, 2013
    Messages:
    66
    Likes Received:
    30
    Forget loading it into a textbox or whatever. Load it into an in-memory collection such as an ArrayList instead; it is much, much faster, as there is no need to update the UI, and removing duplicates is easy. You're talking about efficiency, yet you're using WebBrowser controls. Ditch them; using web requests is easier than you think.
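As a rough sketch of the in-memory approach, the example below collects usernames from an HTML string into a `HashSet(Of String)` rather than an ArrayList, since a set drops duplicates on insert. The helper name `ExtractUserNames`, the sample HTML, and the regular expression (including the set of characters allowed in a username) are assumptions made for illustration, not anything YouTube documents.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module ExtractDemo
    ' Hypothetical helper: collects every distinct /user/NAME found in an HTML string.
    Function ExtractUserNames(html As String) As HashSet(Of String)
        ' Case-insensitive set, so "Alice" and "alice" count as one user (an assumption).
        Dim names As New HashSet(Of String)(StringComparer.OrdinalIgnoreCase)

        ' Assumed username alphabet: letters, digits, "_", "." and "-".
        For Each m As Match In Regex.Matches(html, "/user/([A-Za-z0-9_.-]+)")
            names.Add(m.Groups(1).Value) ' Add ignores values already in the set
        Next
        Return names
    End Function

    Sub Main()
        ' Made-up markup with one duplicate link.
        Dim sample As String = "<a href=""/user/Alice"">Alice</a> " &
                               "<a href=""/user/Bob?feature=watch"">Bob</a> " &
                               "<a href=""/user/alice"">Alice</a>"

        ' Two distinct names come back; nothing touches the UI.
        For Each name As String In ExtractUserNames(sample)
            Console.WriteLine(name)
        Next
    End Sub
End Module
```

The HTML string itself can come from a plain web request, for example `New System.Net.WebClient().DownloadString(url)`, so no WebBrowser control is needed. The list control can then be filled once at the end from the finished set.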

     
    • Thanks Thanks x 1
    Last edited: Mar 20, 2013
  5. Kalashnikov

    Kalashnikov Junior Member

    See, I'm fairly new to web requests, but I'd like to move in that direction. Do you have any suggestions for how I can load all the comments at once? I don't know how the "show more" button works with web requests.

    Thanks for the response.
     
  6. Kalashnikov

    Kalashnikov Junior Member

    Bump ..
     
  7. ultra.marine

    ultra.marine Registered Member

    Joined:
    Oct 5, 2012
    Messages:
    80
    Likes Received:
    101
    Location:
    Macedonia
    Why don't you try the "All comments" feature? For example:

    Code:
    http://www.youtube.com/all_comments?v=j5-yKhDd64s&page=1
    http://www.youtube.com/all_comments?v=j5-yKhDd64s&page=2
    
    And so on. Think more broadly next time you're trying to make a scraper :D
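Walking those numbered pages with plain web requests might look something like the sketch below. It assumes the `all_comments` URL pattern shown above keeps working for higher page numbers. `ScrapeUserNames`, the `maxPages` cap, the username regular expression, and the stop condition are all hypothetical choices for illustration.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Net
Imports System.Text.RegularExpressions

Module AllCommentsDemo
    ' Hypothetical helper: walks all_comments pages until a page adds no new names.
    Function ScrapeUserNames(videoId As String, maxPages As Integer) As HashSet(Of String)
        Dim names As New HashSet(Of String)(StringComparer.OrdinalIgnoreCase)

        Using client As New WebClient()
            For page As Integer = 1 To maxPages
                Dim url As String = String.Format(
                    "http://www.youtube.com/all_comments?v={0}&page={1}", videoId, page)

                ' DownloadString throws a WebException on network or HTTP errors.
                Dim html As String = client.DownloadString(url)

                Dim before As Integer = names.Count
                For Each m As Match In Regex.Matches(html, "/user/([A-Za-z0-9_.-]+)")
                    names.Add(m.Groups(1).Value)
                Next

                ' Assumed stop condition: a page with no new names means
                ' we have run past the last page of comments.
                If names.Count = before Then Exit For
            Next
        End Using

        Return names
    End Function

    Sub Main()
        ' Video id taken from the example URLs above; 50 is an arbitrary page cap.
        For Each name As String In ScrapeUserNames("j5-yKhDd64s", 50)
            Console.WriteLine(name)
        Next
    End Sub
End Module
```

The stop condition is a guess: a page made up entirely of repeat commenters would end the loop early, so checking for an explicit "no comments" marker in the HTML would be more robust if the page provides one.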
     
    • Thanks Thanks x 2
  8. Kalashnikov

    Kalashnikov Junior Member

    I honestly didn't know such a page existed. You, sir, are my hero!