LessThanDot

Community Wiki

ASP.NET: Retrieve data from a web page

Summary: An example of how we can make a request to a web page and retrieve the resulting HTML

There are a few situations where it is useful to retrieve the HTML from a web page via code. Fortunately, this is made relatively easy by the HttpWebRequest and HttpWebResponse classes.

First, we need to know which HTTP method the target page expects. The two methods are GET and POST; this article assumes you have a basic understanding of both, so I'll simply show an example of each rather than go into the differences between them. For both methods we'll need a simple page to display the results, so let's start by making a page with a TextBox on it:

    <%@ Page Language="VB" AutoEventWireup="false" CodeFile="Default1.aspx.vb" Inherits="Default1" %>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

    <html xmlns="http://www.w3.org/1999/xhtml" >
    <head runat="server">
        <title>Retrieve data from a web page</title>
    </head>
    <body>
        <form id="form1" runat="server">
        <div>
             <asp:TextBox ID="TextBox1" runat="server" Rows="40" Columns="100" TextMode="MultiLine"></asp:TextBox>
        </div>
        </form>
    </body>
    </html>

Now, we need to write the code based on whichever method we have decided to use.


Using the GET method

This is the shorter of the two methods: since we don't have to post any data, we can simply set the URL and append any querystring values to it. We then make a request using this URL and read the response given back to us:

    Imports System.IO
    Imports System.Net

    Partial Class Default1
        Inherits System.Web.UI.Page

        Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
            Dim strURL As String
            Dim strResult As String
            Dim wbrq As HttpWebRequest

            ' Set the URL (and add any querystring values)
            strURL = "http://aspnetlibrary.com/articles.aspx?Page=1"

            ' Create the web request (the cast is required when Option Strict is On)
            wbrq = DirectCast(WebRequest.Create(strURL), HttpWebRequest)
            wbrq.Method = "GET"

            ' Read the returned data; Using closes the response and reader even if an error occurs
            Using wbrs As HttpWebResponse = DirectCast(wbrq.GetResponse(), HttpWebResponse)
                Using sr As New StreamReader(wbrs.GetResponseStream())
                    strResult = sr.ReadToEnd().Trim()
                End Using
            End Using

            ' Write the returned data out to the page
            TextBox1.Text = strResult
        End Sub
    End Class
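
When querystring values come from variables rather than a fixed string, each name and value should be URL-encoded before it is appended to the URL. The following is only a minimal sketch of one way to do that; the BuildUrl helper and the "Tag" parameter are hypothetical names used for illustration and are not part of the original example.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Text

Module QueryStringSketch
    ' Hypothetical helper: appends URL-encoded name/value pairs to a base URL.
    Function BuildUrl(ByVal baseUrl As String, ByVal pairs As IEnumerable(Of KeyValuePair(Of String, String))) As String
        Dim sb As New StringBuilder(baseUrl)
        Dim separator As String = "?"
        ' If the base URL already has a querystring, continue it with "&"
        If baseUrl.Contains("?") Then
            separator = "&"
        End If
        For Each pair As KeyValuePair(Of String, String) In pairs
            sb.Append(separator)
            sb.Append(Uri.EscapeDataString(pair.Key))
            sb.Append("=")
            sb.Append(Uri.EscapeDataString(pair.Value))
            separator = "&"
        Next
        Return sb.ToString()
    End Function

    Sub Main()
        Dim pairs As New List(Of KeyValuePair(Of String, String))
        pairs.Add(New KeyValuePair(Of String, String)("Page", "1"))
        ' "Tag" is an illustrative parameter, not one the target page is known to accept
        pairs.Add(New KeyValuePair(Of String, String)("Tag", "asp.net hacks"))
        ' Prints: http://aspnetlibrary.com/articles.aspx?Page=1&Tag=asp.net%20hacks
        Console.WriteLine(BuildUrl("http://aspnetlibrary.com/articles.aspx", pairs))
    End Sub
End Module
```

The resulting string can be passed to WebRequest.Create in place of a hand-written URL.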

Using the POST method

This method is slightly longer than the GET method because we have to specify the data that is going to be sent to the page. We also have to set the content type and length of this data before posting it, so the page in question knows what it will be receiving:

    Imports System.IO
    Imports System.Net
    Imports System.Text

    Partial Class Default1
        Inherits System.Web.UI.Page

        Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
            Dim strURL As String = ""
            Dim strPostData As String = ""
            Dim strResult As String = ""
            Dim wbrq As HttpWebRequest
            Dim bytPostData As Byte()

            ' Set the URL to post to
            strURL = "http://www.webcom.com/cgi-bin/form"
            ' Post some values to the page (each value is URL-encoded so spaces and symbols survive)
            strPostData = String.Format("your_name={0}&userid={1}&form_name={2}", _
                Server.UrlEncode("Mark Smith"), Server.UrlEncode("webcom"), Server.UrlEncode("tutortest"))
            bytPostData = Encoding.UTF8.GetBytes(strPostData)

            ' Create the web request (the cast is required when Option Strict is On)
            wbrq = DirectCast(WebRequest.Create(strURL), HttpWebRequest)
            wbrq.Method = "POST"
            ' We don't always need to set the Referer but in this case
            ' the page we are posting to will only issue a response if we do
            wbrq.Referer = "http://www.webcom.com/cgi-bin/form"
            ' ContentLength is measured in bytes, not characters
            wbrq.ContentLength = bytPostData.Length
            wbrq.ContentType = "application/x-www-form-urlencoded"

            ' Post the data
            Using rs As Stream = wbrq.GetRequestStream()
                rs.Write(bytPostData, 0, bytPostData.Length)
            End Using

            ' Read the returned data; Using closes the response and reader even if an error occurs
            Using wbrs As HttpWebResponse = DirectCast(wbrq.GetResponse(), HttpWebResponse)
                Using sr As New StreamReader(wbrs.GetResponseStream())
                    strResult = sr.ReadToEnd().Trim()
                End Using
            End Using

            ' Write the returned data out to the page
            TextBox1.Text = strResult
        End Sub
    End Class
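
The content type and length mentioned above both depend on how the posted values are encoded. As a rough sketch, assuming a hypothetical EncodePair helper that is not part of the original example, each name and value can be URL-encoded and the content length taken from the encoded bytes rather than from the character count:

```vb
Imports System
Imports System.Text

Module FormBodySketch
    ' Hypothetical helper: encodes one name/value pair for an
    ' application/x-www-form-urlencoded request body.
    Function EncodePair(ByVal name As String, ByVal value As String) As String
        Return Uri.EscapeDataString(name) & "=" & Uri.EscapeDataString(value)
    End Function

    Sub Main()
        ' Join the encoded pairs with "&" to form the body
        Dim body As String = EncodePair("your_name", "Mark Smith") & "&" & EncodePair("userid", "webcom")
        ' The byte count is the value to assign to ContentLength
        Dim bytes As Byte() = Encoding.UTF8.GetBytes(body)
        Console.WriteLine(body)          ' your_name=Mark%20Smith&userid=webcom
        Console.WriteLine(bytes.Length)  ' 36
    End Sub
End Module
```

Uri.EscapeDataString writes a space as %20; Server.UrlEncode writes it as +. Form handlers normally accept either in a form-encoded body.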

Both of these simple examples demonstrate an easy way to read the results of a request to a web page. You can of course parse the results and display only certain information from the page, but this article will at least get you up to that point.
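
As one illustration of parsing the results, the sketch below pulls the contents of the page's title element out of the returned HTML with a regular expression. The ExtractTitle helper is a hypothetical name, and a regular expression is only a quick approach that suits simple cases like this one; it is not a general HTML parser.

```vb
Imports System
Imports System.Text.RegularExpressions

Module ParseSketch
    ' Hypothetical helper: returns the text inside the first <title> element,
    ' or an empty string if the HTML contains none.
    Function ExtractTitle(ByVal html As String) As String
        Dim m As Match = Regex.Match(html, "<title[^>]*>(.*?)</title>", _
            RegexOptions.IgnoreCase Or RegexOptions.Singleline)
        If m.Success Then
            Return m.Groups(1).Value.Trim()
        End If
        Return String.Empty
    End Function

    Sub Main()
        Dim html As String = "<html><head><title> Retrieve data from a web page </title></head><body></body></html>"
        ' Prints: Retrieve data from a web page
        Console.WriteLine(ExtractTitle(html))
    End Sub
End Module
```

In either of the examples above, the last line of Page_Load could then become TextBox1.Text = ExtractTitle(strResult) to show only the title instead of the whole page.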


This Hack is part of the ASP.NET Hacks collection
