Friday, March 4, 2011

SQL Statement Help - Select list of CustomerID, OrderDate with the most records in a table.

I'll be using the AdventureWorks Database to illustrate my problem.

I need to show for a particular customer a list of OrderDate with the most Orders.

My intial attempt was as follows:

SELECT CustomerID, OrderDate, COUNT(1) Cnt
FROM Sales.SalesOrderHeader
WHERE CustomerID = 11300
GROUP BY CustomerID, OrderDate
ORDER BY Cnt DESC

This will get us the following result:

CustomerID    OrderDate    Cnt
-----------   ----------          ----
11300     2003-11-22 00:00:00.000  2
11300     2004-01-28 00:00:00.000  2
11300     2004-02-18 00:00:00.000  2
11300     2004-02-08 00:00:00.000  2
11300     2004-02-15 00:00:00.000  1
11300     2004-03-11 00:00:00.000  1
11300     2004-03-24 00:00:00.000  1
11300     2004-03-30 00:00:00.000  1
11300     2004-04-28 00:00:00.000  1
11300     2004-05-03 00:00:00.000  1
11300     2004-05-17 00:00:00.000  1
11300     2004-06-18 00:00:00.000  1
...

Not exactly what I wanted, as the result should only show all records where Cnt = 2, like so:

CustomerID    OrderDate    Cnt
-----------   ----------          ----
11300     2003-11-22 00:00:00.000  2
11300     2004-01-28 00:00:00.000  2
11300     2004-02-18 00:00:00.000  2
11300     2004-02-08 00:00:00.000  2

I'm stuck because I can't wrap my mind around two problems:

1) A customer might have more than one OrderDate with the same Cnt value. This means I can't do something like TOP 1 to get the desired result.
2) Because the number of Orders for each customer may be different, I cannot use the following SQL statement:

SELECT CustomerID, OrderDate, COUNT(1) Cnt
FROM Sales.SalesOrderHeader
WHERE CustomerID = 11300
GROUP BY CustomerID, OrderDate HAVING COUNT(1) > 1
ORDER BY Cnt DESC

This will work for getting the right result for this customer, but will definitely be wrong if the next customer has only one Order for a particular day.

So, either the query is impossible in this situation, or I am approaching the query in the wrong way. Any ideas on this problem is appreciated.

Also, since this will be a query in a stored procedure, any ideas on solving this in T-SQL will be acceptable.

UPDATE: Thanks to Mehrdad, I've been introduced to Common Table Expressions, and Life is Good®. :)

From stackoverflow
  • I don't have a copy of AdventureWorks handy to test with, but could the WITH TIES clause help out? You could do something like:

    SELECT TOP 1 WITH TIES CustomerID, OrderDate, COUNT(*) Cnt
    ...
    ORDER BY COUNT(*) DESC
    

    ... which should get you all the rows that match the value in your ORDER BY clause (the count).

  • You should use TOP n WITH TIES clause to accomplish the task:

    SELECT TOP 1 WITH TIES CustomerID, OrderDate, COUNT(*) Cnt
    FROM Sales.SalesOrderHeader
    WHERE CustomerID = 11300
    GROUP BY CustomerID, OrderDate
    ORDER BY Cnt DESC
    

    Alternatively, you can use a common table expression (CTE) to solve the problem. This solution requires SQL Server 2005 or higher:

    WITH MyTable AS (SELECT CustomerID, OrderDate, COUNT(*) Cnt
                     FROM Sales.SalesOrderHeader
                     WHERE CustomerID = 11300
                     GROUP BY CustomerID, OrderDate)
    SELECT CustomerID, OrderDate, Cnt
    FROM MyTable
    WHERE Cnt = (SELECT MAX(Cnt) FROM MyTable);
    

    If you're using the CTE method, make sure the previous statement is terminated by a semicolon (if it's not the first statement). It's required before a CTE expression.

    alextansc : I've tested this SQL, and it works fine! That final WHERE statement is the main source of my grief because I wasn't able to find a way to reference the Max(Cnt) value. Thank you!
    Mehrdad Afshari : You're welcome! Have fun CTEing!

0 comments:

Post a Comment