I think the row_number() function is one of the best things that’s been added to SQL for some time. If you’re not familiar with it you really should check it out. It’s very useful where you want to join a set of data to another set detailing the latest/biggest/etc something related to the first set. Imagine top (n) and group by combined. The 5 most recent orders from each customer. The person who has used each repoort the most. You get the picture.
It adds a sequential number column to a resultset. The syntax is
row_number() over ([partition by ...] order by ...)
Order by (required) specifies what order you want the rows numbered in. Partition by (optional) controls when the numbering should start back at 1 again.
Suppose you want to list your customers and the value of each one’s latest order. You could do it like this:
select c.Name , (select top 1 o.OrderValue from Orders o where o.CustomerID = c.CustomerID order by o.OrderDate desc) as LatestOrderValue from Customers c
Not too bad in a simple example but it can get really convoluted really quickly in the real world and then you pass that magic cut-off point where the query optimiser suddenly gives you a bad plan.
This is the same thing with row_number():
select * from ( select c.Name , o.OrderValue as LatestOrderValue , row_number() over (partition by o.CustomerID order by o.CustomerID asc, o.OrderDate desc) as RowNum from Customers c inner join Orders o on o.CustomerID = c.CustomerID ) x where RowNum = 1
It looks a bit heavy but it clearly expresses the intention and stays readable when the query gets more complex.