Login or Sign Up to become a member!
LessThanDot Site Logo

LessThanDot

Community Wiki

Less Than Dot is a community of passionate IT professionals and enthusiasts dedicated to sharing technical knowledge, experience, and assistance. Inside you will find reference materials, interesting technical discussions, and expert tips and commentary. Once you register for an account you will have immediate access to the forums and all past articles and commentaries.

LTD Social Sitings

Lessthandot twitter Lessthandot Linkedin Lessthandot facebook Lessthandot rss

Note: Watch for social icons on posts by your favorite authors to follow their postings on these and other social sites.

Navigation

Google Ads

Basic Design Patterns – Aggregating data

From Wiki

Jump to: navigation, search

Commonly, you need to total the values of a numeric field or average them or count the number of records that fit certain criteria or find the first or last record that meets a certain criteria. These types of actions require the use of aggregate functions. In SQL Server, if you use aggregate functions, you must use a GROUP BY that includes all the other columns you are returning in your query. Let’s look at basic aggregate queries. Suppose you want to total an order and find various information about a customer. The OrderDetail table contains the various items in the order and the Order table contains general information about the order. The aggregate function to total a numeric field is SUM(FieldName). SUM will only work on numeric data, you can’t directly sum numbers stored as strings. A typical query might be something like:

  1. SELECT O.OrderNumber, C.CustomerName, SUM(OD.ItemTotal) as OrderTotal
  2. FROM [Order] O
  3. JOIN Customer C on O.CustomerID = C.CustomerID
  4. JOIN OrderDetail OD on O.Order_ID = OD.Order_ID
  5. GROUP BY  O.OrderNumber, C.CustomerName

This query would give you the total amounts for each order for all customers with orders. But that might not be specific enough for you. Suppose you only want to see the orders that total more than $100. Now you have to add a HAVING clause to filter on the aggregate sum. A common mistake is to try to do this in a WHERE clause but aggregates need to be in HAVING clauses. The WHERE Clause is evaluated before the aggregate is calculated and that’s why you can’t use it for that.

  1. SELECT O.OrderNumber, C.CustomerName, SUM(OD.ItemTotal) as OrderTotal
  2. FROM [Order] O
  3. JOIN Customer C on O.CustomerID = C.CustomerID
  4. JOIN OrderDetail OD on O.Order_ID = OD.Order_ID
  5. GROUP BY  O.OrderNumber, C.CustomerName
  6. HAVING SUM(OD.ItemTotal)>100.00

Again this may not be specific enough, suppose you only want the data for one particular customer. This isn’t an aggregate function so you would put this condition in the WHERE clause. You can have both a WHERE and a HAVING clause. The sample query to get the information for just one customer would be:

  1. SELECT O.OrderNumber, C.CustomerName, SUM(OD.ItemTotal) as OrderTotal
  2. FROM [Order] O
  3. JOIN Customer C on O.CustomerID = C.CustomerID
  4. JOIN OrderDetail OD on O.Order_ID = OD.Order_ID
  5. WHERE CustomerID = 101
  6. GROUP BY  O.OrderNumber, C.CustomerName
  7. HAVING SUM(OD.ItemTotal)>100.00

Note also that columns in the SELECT must be in the GROUP BY, but those in the JOIN or the WHERE Clause that are not in the SELECT do not need to be in the GROUP BY. Note also that if you use the aggregate in the HAVING clause, but not in the SELECT, you still need the GROUP BY.

  1. SELECT O.OrderNumber
  2. FROM [Order] O
  3. JOIN Customer C on O.CustomerID = C.CustomerID
  4. JOIN OrderDetail OD on O.Order_ID = OD.Order_ID
  5. WHERE CustomerID = 101
  6. GROUP BY  O.OrderNumber
  7. HAVING SUM(OD.ItemTotal)>100.00

Other aggregates include COUNT(*) which will count the total records, COUNT(FieldName) which will count the total number of non-null records in the specified field, Max(FieldName) which will get the highest value in the field for the grouping and MIN(FieldName) which will get the smallest. If these are numeric fields the MAX and MIN will be the largest and smallest number, but if they are strings it will be the last in an alphabetical sort or the first in an alphabetical sort. This causes problems when you try to use them when you store numeric data in a string field as the results are often not what was expected. For instance 111 would be sorted earlier than 12 in a string field, so 12 might be the MAX() value when you are expecting 111. This will also cause a problem if you are trying to SUM a field which contains numeric data stored as a string. To get around this, the best thing to do is to use the correct data type for the data you are storing. If you can’t change the structure and the database uses a varchar field for the numeric data, you will need to cast the data to a numeric data type before applying the aggregate. An example is shown below:

  1. SELECT O.OrderNumber
  2. FROM [Order] O
  3. JOIN Customer C on O.CustomerID = C.CustomerID
  4. JOIN OrderDetail OD on O.Order_ID = OD.Order_ID
  5. WHERE CustomerID = 101
  6. GROUP BY  O.OrderNumber
  7. HAVING SUM(CAST(OD.ItemTotal as Decimal(10,2)))>100.00

779 Rating: 1.7/5 (39 votes cast)