Results 1 to 4 of 4

Thread: How to get TOP n with an ORDER BY

  1. #1
    Join Date
    Oct 2002
    Posts
    933

    How to get TOP n with an ORDER BY

    SELECT EMPNO, ENAME, JOB, SAL FROM ( SELECT EMPNO, ENAME, JOB, SAL FROM EMP ORDER BY JOB DESC) WHERE ROWNUM <= 8

    The statement only works in 8i and not in 8.0.5. So is there a way to get the TOP n records (after the sort) when there is an ORDER BY clause in 8.0.5?

    Thanks,

    Frank

  2. #2
    Join Date
    Dec 2002
    Posts
    4
    Perhaps this doesn't help, but just incase....

    but you could create a temporary table
    insert the rows into the table (ordered)
    and then do your top N query.


    Graeme

  3. #3
    Join Date
    Oct 2002
    Posts
    933
    Graeme,
    Yes, ther are generic options to use View and SP. But I do not have that option in my domain. I just have to retrieve the entire recordset (sorted) and then that the TOP n records...

    Thank,

    Frank

  4. #4
    Join Date
    Dec 2002
    Posts
    1

    top n records in a query

    Here's a little trick from the late 80's and sql history. Newbies to sql and Oracle see the bottom of this post for more explanation of the inner workings.

    select b.salary, b.emp_no, b.etc, count('x') cnt_col
    from emp a, emp b
    where a.salary > b.salary
    order by cnt_col
    group by b.salary, b.emp_no, b.etc
    having count('x') >= 10;

    note an optional "having" clause..
    having count('x') between X and Y
    (substitute in your own values for X and Y)
    which allows you to pick an arbitrary range of values from the returned list.
    If you include two references to the table of interest in the "from" and include an aggragate function in the base query, count('x') say, along with the other columns you are interested in and do the normal group by clause at the end of the query, and then follow it with the "having" clause
    you can arbitrarily pick either the top n elements of interest or the 23rd, through 28th elements or xth through (x+n)th records. In other words you can pick an arbitrary range of the values in question. You will not need a temporary table, nor will you need to worry about the sequence of events where you have problems using "rownum" and "order by", i.e. the rownum is applied to the pre-sorted (pre-ordered) rows.

    You may have to tweak that query a bit but the concept is there. Note that if the tables are huge this runs a while. And it would help performance to index the salary column. But I am digressing into another set of topics.

    Newbies, perhaps you are confused by 'aliasing' in the "from" clause,
    "from emp a, emp b" it is just a shorthand way to keep ambiguous table references clear. It means you can refer to emp in the left hand case as "a", and emp in the right hand case as "b". The query above also aliases the count('x') column with cnt_col. which allows you to refer to a simpler expression in the order by clause.

    There is an order of operations at play here. First a "select" returns a row for every row in the "b" table with a count of how many rows in the "a" table have a salary value greater than the row currently being considered.

    That might be way more rows than you want back for your purposes and you want to avoid having to clean them up on the application side of this thing.

    Then the "order by" sorts the return set
    and gets them in ascending count order (or descending if you say order by cnt_col desc).

    The "group by" is needed in order to make the aggragate "count" function work and doesn't really directly affect the return set of rows.

    The "having" clause does the dirty work of looking at the sorted returned set of a bunch of records and delivers to you the records you are interested in and only those records.

    I hope I didn't bore you with this description, but I know that some people will derive benefit from the more complete description of this somewhat subtle process.

    One important point is the amount of time it can take to use views or build temporary tables make the above solution the better performer in many cases. Meaning, that views take a while to construct (as they come into existence by reference unless they are "materialized views") and will likely run much slower than the above query.

    And don't be confused by the count('x') part. 'x' is literally the letter 'x'. It is a way to not have to go to the table for some value to count. And can speed things up. In this case the only value of interest from the "a" table reference is the a.salary in the "where" clause. In the "select" portion of the query we only use a 'x' to represent an arbitary existence so to speak of a record from the "a" table. The effect of this is that we can use just the index on salary to make the comparisons and don't read data from the table for the "a" references, nor do we consume as much memory. (though I may be misleading you here a bit)
    There are other ways to do it too.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •