ASP.NET: Retrieve data from a web page
Summary: An example of how we can make a request to a web page and retrieve the resulting HTML
There are a few situations where it is useful to retrieve the HTML of a web page from code. Fortunately, the HttpWebRequest and HttpWebResponse classes make this relatively easy.
First, we need to know which HTTP method the target page expects: GET or POST. This article assumes a basic understanding of both and shows how to implement each, rather than going into the differences between them. Both examples need a simple page to display the results, so let's start by making a page with a multi-line TextBox on it:
    <%@ Page Language="VB" AutoEventWireup="false" CodeFile="Default1.aspx.vb" Inherits="Default1" %>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" >
    <head runat="server">
        <title>Retrieve data from a web page</title>
    </head>
    <body>
        <form id="form1" runat="server">
        <div>
            <asp:TextBox ID="TextBox1" runat="server" Rows="40" Columns="100" TextMode="MultiLine"></asp:TextBox>
        </div>
        </form>
    </body>
    </html>
Now, we need to write the code based on whichever method we have decided to use.
Using the GET method
This is the shorter of the two methods. Since we don't have to post any data, we can simply set the URL, add any querystring values to it, make the request, and read the response given back to us:
    Imports System.IO
    Imports System.Net

    Partial Class Default1
        Inherits System.Web.UI.Page

        Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
            Dim strURL As String
            Dim strResult As String
            Dim wbrq As HttpWebRequest

            ' Set the URL (and add any querystring values)
            strURL = "http://aspnetlibrary.com/articles.aspx?Page=1"

            ' Create the web request (WebRequest.Create returns the base type, so cast it)
            wbrq = DirectCast(WebRequest.Create(strURL), HttpWebRequest)
            wbrq.Method = "GET"

            ' Read the returned data; Using closes the response and the reader even if an error occurs
            Using wbrs As HttpWebResponse = DirectCast(wbrq.GetResponse(), HttpWebResponse)
                Using sr As New StreamReader(wbrs.GetResponseStream())
                    strResult = sr.ReadToEnd().Trim()
                End Using
            End Using

            ' Write the returned data out to the page
            TextBox1.Text = strResult
        End Sub
    End Class
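Querystring values that contain spaces or symbols need to be escaped before they are added to the URL. The following is a minimal sketch of one way to do that, written as a standalone console module so it can be tried outside a page. The helper name AddQueryValue and the "Search" parameter are hypothetical and exist only for illustration; Uri.EscapeDataString is the framework call doing the escaping.

```vb
Imports System

Module QueryStringSketch

    ' Hypothetical helper: appends one escaped name/value pair to a URL
    Function AddQueryValue(ByVal url As String, ByVal name As String, ByVal value As String) As String
        Dim separator As String
        If url.Contains("?") Then
            separator = "&"   ' the URL already has a querystring
        Else
            separator = "?"   ' this is the first querystring value
        End If
        Return url & separator & Uri.EscapeDataString(name) & "=" & Uri.EscapeDataString(value)
    End Function

    Sub Main()
        Dim strURL As String = "http://aspnetlibrary.com/articles.aspx"
        strURL = AddQueryValue(strURL, "Page", "1")
        ' "Search" is an assumed parameter name, used only to show escaping
        strURL = AddQueryValue(strURL, "Search", "web request & response")
        Console.WriteLine(strURL)
        ' Prints: http://aspnetlibrary.com/articles.aspx?Page=1&Search=web%20request%20%26%20response
    End Sub

End Module
```

The resulting string can be passed to WebRequest.Create in the same way as the hard-coded URL in the GET example.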
Using the POST method
This method is slightly longer than the GET method because we have to specify the data that is going to be sent to the page. We also have to set the type and length (in bytes) of this content before posting it, so the page in question knows what it will be receiving:
    Imports System.IO
    Imports System.Net
    Imports System.Text

    Partial Class Default1
        Inherits System.Web.UI.Page

        Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
            Dim strURL As String = ""
            Dim strPostData As String = ""
            Dim strResult As String = ""
            Dim bytPostData As Byte()
            Dim wbrq As HttpWebRequest

            ' Set the URL to post to
            strURL = "http://www.webcom.com/cgi-bin/form"

            ' Post some values to the page (URL-encode each value so spaces and symbols survive)
            strPostData = String.Format("your_name={0}&userid={1}&form_name={2}", _
                Server.UrlEncode("Mark Smith"), Server.UrlEncode("webcom"), Server.UrlEncode("tutortest"))
            bytPostData = Encoding.UTF8.GetBytes(strPostData)

            ' Create the web request (WebRequest.Create returns the base type, so cast it)
            wbrq = DirectCast(WebRequest.Create(strURL), HttpWebRequest)
            wbrq.Method = "POST"

            ' We don't always need to set the Referer but in this case
            ' the page we are posting to will only issue a response if we do
            wbrq.Referer = "http://www.webcom.com/cgi-bin/form"

            ' ContentLength is a byte count, not a character count
            wbrq.ContentLength = bytPostData.Length
            wbrq.ContentType = "application/x-www-form-urlencoded"

            ' Post the data
            Using stm As Stream = wbrq.GetRequestStream()
                stm.Write(bytPostData, 0, bytPostData.Length)
            End Using

            ' Read the returned data; Using closes the response and the reader even if an error occurs
            Using wbrs As HttpWebResponse = DirectCast(wbrq.GetResponse(), HttpWebResponse)
                Using sr As New StreamReader(wbrs.GetResponseStream())
                    strResult = sr.ReadToEnd().Trim()
                End Using
            End Using

            ' Write the returned data out to the page
            TextBox1.Text = strResult
        End Sub
    End Class
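The content length sent with a POST is measured in bytes, and that only equals the number of characters when the data is plain ASCII. The sketch below, again a standalone console module, compares the two counts for a raw value and for its percent-encoded form. The name "Zoë Smith" is an assumed sample value chosen because it contains a non-ASCII character; it does not come from the form being posted to.

```vb
Imports System
Imports System.Text

Module ContentLengthSketch

    Sub Main()
        ' Assumed sample value: "Zoë Smith" (ChrW(&HEB) is the letter e with a diaeresis)
        Dim strName As String = "Zo" & ChrW(&HEB) & " Smith"

        ' Unencoded form data: the character count and the UTF-8 byte count differ
        Dim strRaw As String = "your_name=" & strName
        Console.WriteLine(strRaw.Length)                          ' 19 characters
        Console.WriteLine(Encoding.UTF8.GetByteCount(strRaw))     ' 20 bytes

        ' Percent-encoded form data is pure ASCII, so both counts agree
        Dim strEncoded As String = "your_name=" & Uri.EscapeDataString(strName)
        Console.WriteLine(strEncoded)                             ' your_name=Zo%C3%AB%20Smith
        Console.WriteLine(strEncoded.Length)                      ' 26 characters
        Console.WriteLine(Encoding.UTF8.GetByteCount(strEncoded)) ' 26 bytes
    End Sub

End Module
```

Encoding the values first and then taking the byte count of the final string keeps the length header consistent with what is actually written to the request stream.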
Both of these simple examples demonstrate an easy way to read the results of a request to a web page. You can parse the results if necessary and display only certain information from the page, but this article should at least get you to that point.
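As a rough illustration of parsing the returned HTML, the sketch below pulls the contents of the title element out of a string with a regular expression. The GetTitle helper and the sample HTML are hypothetical, and a regular expression is only a quick approach for simple, predictable markup; it is not a general HTML parser.

```vb
Imports System
Imports System.Text.RegularExpressions

Module ParseSketch

    ' Hypothetical helper: returns the text of the first <title> element, or an empty string
    Function GetTitle(ByVal html As String) As String
        Dim m As Match = Regex.Match(html, "<title[^>]*>(.*?)</title>", _
            RegexOptions.IgnoreCase Or RegexOptions.Singleline)
        If m.Success Then
            Return m.Groups(1).Value.Trim()
        End If
        Return String.Empty
    End Function

    Sub Main()
        ' Stand-in for the strResult value read from the response in the examples above
        Dim strResult As String = "<html><head><title> Retrieve data from a web page </title></head><body></body></html>"
        Console.WriteLine(GetTitle(strResult))
        ' Prints: Retrieve data from a web page
    End Sub

End Module
```

In the page examples, the same helper could be called on strResult before assigning the value to TextBox1.Text.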
This Hack is part of the ASP.NET Hacks collection


