VB.NET ~ Need help parsing XML loaded in richtextbox (loaded from URL)

simpleonline1234

Junior Member
Jan 26, 2010
I have two rich text boxes and two buttons on my form. The first button grabs the HTML from a URL and converts it to XML, which is then placed in rich text box 1.

The second button is supposed to take the XML from rich text box 1 and parse it to grab all the input elements by their ID.

My issue is that my parser isn't doing anything. My guess is that I'm not quite getting the XML from the first rich text box.

What would be the best way to grab the XML from a rich text box, load it into memory, and then parse it to grab all the id attributes?

Here is my code -- Thanks for any help.

Code:
     Imports mshtml
     Imports System.Text
     Imports System.Net
     Imports System.Xml
     Imports System.IO
     Imports System.Xml.XPath
     
     Public Class Scraper
     
         Private Sub Scraper_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
         End Sub
     
         Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
             '  Note: This example uses two Chilkat products: Chilkat HTTP
             '  and Chilkat HTML-to-XML.  The "Chilkat Bundle" can be licensed
             '  at a price that is less than purchasing each product individually.
             '  The "Chilkat Bundle" provides licenses to all existing  Chilkat components.  Also, new-version upgrades are always free.
     
             Dim http As New Chilkat.Http()
     
             '  Any string argument automatically begins the 30-day trial.
             Dim success As Boolean
             success = http.UnlockComponent("30-day trial")
             If (success <> True) Then
                 TextBox1.Text = TextBox1.Text & http.LastErrorText & vbCrLf
                 Exit Sub
             End If
     
             Dim html As String
             html = http.QuickGetStr("http://www.quiltingboard.com/register.php")
             If (html = vbNullString) Then
                 TextBox1.Text = TextBox1.Text & http.LastErrorText & vbCrLf
                 Exit Sub
             End If
     
             Dim htmlToXml As New Chilkat.HtmlToXml()
     
             '  Any string argument automatically begins the 30-day trial.
             success = htmlToXml.UnlockComponent("30-day trial")
             If (success <> True) Then
                 TextBox1.Text = TextBox1.Text & htmlToXml.LastErrorText & vbCrLf
                 Exit Sub
             End If
     
             '  Indicate the charset of the output XML we'll want.
             htmlToXml.XmlCharset = "utf-8"
     
             '  Set the HTML:
             htmlToXml.Html = html
     
             '  Convert to XML:
             Dim xml As String
             xml = htmlToXml.ToXml()
     
             '  Save the XML to a file.
             '  Make sure your charset here matches the charset
             '  used for the XmlCharset property.
             htmlToXml.WriteStringToFile(xml, "out.xml", "utf-8")
     
             RichTextBox1.Text = xml
         End Sub
     
         Private Sub LoopThroughXmlDoc(ByVal nodeList As XmlNodeList)
             For Each elem As XmlElement In nodeList
                 If elem.HasChildNodes Then
                     LoopThroughXmlDoc(elem.ChildNodes)
                 Else
                     '' Extract the information
                     If elem.HasAttribute("id") Then
                         'elem.Attributes("AssetID").Value.ToString()
                     ElseIf elem.HasAttribute("name") Then
                         'elem.Attributes("AttributeID").Value.ToString()
                     End If
                 End If
             Next
         End Sub
     
         Private Sub Button2_Click_1(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button2.Click
             Dim doc As XmlDocument = New XmlDocument
             doc.Load("xmlFile.xml")
             Dim nodeList As XmlNodeList = doc.GetElementsByTagName("input")
             LoopThroughXmlDoc(nodeList)
         End Sub
     End Class
 
You could take a look at the Html Agility Pack (just google it) instead of converting to XML. It lets you parse a string into an HtmlDocument and select the nodes you are looking for.
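
As a rough sketch of the Html Agility Pack route, assuming the HtmlAgilityPack package is referenced in the project: the module name, the GetInputIds helper and the sample markup below are made up for illustration, and the type is fully qualified because a WinForms project also has its own System.Windows.Forms.HtmlDocument.

```vbnet
Imports System
Imports System.Collections.Generic

Module AgilityPackSketch
    ' Hypothetical helper: returns the id of every <input> element in the HTML text.
    Function GetInputIds(ByVal html As String) As List(Of String)
        ' Fully qualified to avoid clashing with System.Windows.Forms.HtmlDocument.
        Dim doc As New HtmlAgilityPack.HtmlDocument()
        doc.LoadHtml(html)

        Dim ids As New List(Of String)()
        Dim nodes As HtmlAgilityPack.HtmlNodeCollection =
            doc.DocumentNode.SelectNodes("//input[@id]")

        ' SelectNodes returns Nothing (not an empty collection) when nothing matches.
        If nodes Is Nothing Then Return ids

        For Each node As HtmlAgilityPack.HtmlNode In nodes
            ids.Add(node.GetAttributeValue("id", String.Empty))
        Next
        Return ids
    End Function

    Sub Main()
        ' Made-up sample standing in for the downloaded page.
        Dim sample As String =
            "<form><input id=""username"" type=""text""><input name=""noid""></form>"
        For Each inputId As String In GetInputIds(sample)
            Console.WriteLine(inputId)
        Next
    End Sub
End Module
```

With this approach the raw HTML string from the download step could be passed straight in, so the HTML-to-XML conversion would not be needed for this particular task.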
 
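If you do want to go the XML route, once the document is loaded you can use XPath expressions to search through the nodes for what you want. Note that XmlDocument.Load(string) expects a file name or URL; to parse XML you already hold in a string (such as the rich text box's Text), use XmlDocument.LoadXml(string) instead.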

Code:
http://support.microsoft.com/kb/317069
http://www.w3schools.com/xpath/
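
Looking at the posted code, Button2 loads "xmlFile.xml" (Button1 wrote "out.xml") rather than the text in RichTextBox1, and the lines that would extract the attributes are commented out, which would explain why nothing appears to happen. A minimal sketch of the XML route might look like the following. The module name, the GetInputIds helper and the sample XML are invented for illustration, and it assumes the converted XML uses lowercase tag names and declares no default namespace (with a default namespace, a plain //input would match nothing and an XmlNamespaceManager would be needed).

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Xml

Module XmlRouteSketch
    ' Hypothetical helper: returns the id of every <input> element in the XML text.
    Function GetInputIds(ByVal xmlText As String) As List(Of String)
        Dim doc As New XmlDocument()
        ' LoadXml parses a string; Load expects a file name, URL, stream or reader.
        doc.LoadXml(xmlText)

        Dim ids As New List(Of String)()
        ' //input[@id] matches input elements at any depth that carry an id attribute.
        For Each node As XmlNode In doc.SelectNodes("//input[@id]")
            ids.Add(node.Attributes("id").Value)
        Next
        Return ids
    End Function

    Sub Main()
        ' Made-up sample standing in for RichTextBox1.Text.
        Dim sample As String =
            "<html><body><form>" &
            "<input id=""username"" type=""text"" />" &
            "<input name=""noid"" />" &
            "<input id=""email"" />" &
            "</form></body></html>"

        For Each inputId As String In GetInputIds(sample)
            Console.WriteLine(inputId)
        Next
    End Sub
End Module
```

In the form, the Button2 handler could call GetInputIds(RichTextBox1.Text) and write the results wherever they are wanted, for example into the second rich text box. Because the XPath query selects matching elements directly, the recursive loop over child nodes is no longer needed; that loop would also throw an InvalidCastException as soon as it met a text node, since it casts every child to XmlElement.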
 