Results 1 to 13 of 13

Thread: Indexes and Access Book

  1. #1
    Join Date
    Apr 2003
    Posts
    48

    Indexes and Access Book

    I have to things that I need help with. Can some explain exactly what indexing does. If I was to index last name field on a table what does that property do????

    does anyone know where I can get a good database and access book. ?



    Rob

  2. #2
    Join Date
    Apr 2003
    Posts
    48
    do I want to maybe even index the whole table so if I pick any field that it will search quicker.????

  3. #3
    Join Date
    Mar 2003
    Location
    Jacksonville, Florida
    Posts
    52
    Indexing is applicable to that field if you are going to be doing alot of queries on that table with that field as a ORDER BY field (sort) or WHERE field (condition).

    An INDEX is a way of refering to the data. It is generally a TREE structure (B-Tree most often, one of the most efficient Binary Search Tree forms where searching is done at a worst case scenario on the order of O(n log n) ).

    This index contains entries with two fields, the KEY (the data that determines its place in the tree) and the REFERENCE (the pointer to where the rest of that row is in memory). When rows are entered into the table, the INDEX has added to it all of these rows.

    Tree:
    Code:
                  olivia(3)
                /          \
            jeff(4)         tom(2)
           /    \          /      \
       allen(6) john(7)  sally(1) zeb(5)
    Table:

    record 1: sally
    record 2: tom
    record 3: olivia
    record 4: jeff
    record 5: zeb
    record 6: allen
    record 7: john

    In the figure above, searching on this tree structure is done much quicker than when it is done in a contiguous list (arrays are contiguous lists, data structures in which data is aligned one after the other). The maximum amount of comparisons required to either find a value in the tree structure above (through a successful search OR a failed search with no returned value) is three, as opposed to the 7 comparisons required for a failed search on an array of the same size.

    The premise for the binary tree form is that every value is either greater than, less than, or equal to every other value, so we organize or data in that fashion.

    The formal definition for a Binary Search Tree is:
    1) All items in the right sub tree are greater than the node's root.
    2) All items in the left sub tree are less than the node's root.
    3) Both the right sub tree and left sub tree are Binary Search Trees.

    Code:
                  node (= root)
                 /             \
    left node (< root)      right node (> root)
    We have a root node which is the base node. Now the tree is organized so that when you search, the value for which you are searching (the searchee) you first compare it to the first node, either it is equal to, less than, or greater than this node.

    Now what happens with our original tree is, lets say we are searching for john. First we check the base node. john is less than olivia (lexigraphical comparison), so if john is in the list, it MUST be in the left sub tree of olivia. In this fashion, we have eliminated half of the elements in the tree with one comparison, because we know that what we are searching for could not be in the right sub tree because that sub tree contains only elements greater than olivia. Now we will compare john to jeff. john is greater than jeff, john will be located in the right sub tree of jeff (if it is present in the tree). So now we will search that sub tree. When we compare john to john, we find that we have found the object of our extensive search. Note that in this search, however, only 3 comparisons were made, and that in this particular situation, that is the most comparisons that would be made for an unsuccessful search (considered the worst case scenario in searching because it means that you have eliminated ALL possible candidates for a successful search).

    Now when compared to our unsorted contiguous record listing, you'll find that the if we start from the front and search for john, we will end up comparing against every single object until we get to john, the last record. In this worse case scenario, a total of 7 comparisons are made, that is on the order of O(2n) (These "order of" statements are calculations that also take into account that not only does the comparison take "1" execution cycle, but that the move to the next record also uses at least "1" execution cycle as well).

    Now when we execute a "SELECT" SQL statement, using the "WHERE" clause on the indexed field, it uses the index tree to find the key for that (those) records, then retrieves the records using the reference pointer provided, so when we say:

    SELECT Names.* FROM Names WHERE Name = 'john'

    The SQL Server searches the index for entries that match john, then sees the record(s) associated with that (those) match(es), retrieves the records referenced by those matches (record 7), then retrieves the information requested (Names.*, all fields).

    Another important note. If alot of insertions, deletions, or updates (on indexed fields) happen, you need to periodically "rebuild the indexes" for that table. This is crucial because when they become fragmented, the tree can end up having a worse case search scenario consistent with that of the array:

    Code:
    allen(6)
        \
       jeff(4)
          \
         john(7)
            \
           olivia(3)
              \
             sally(1)
                \
               tom(2)
                  \
                 zeb(5)
    In this scenario, the worse case situation would result in the entire contents of the tree (7 elements) being compared, which IS NOT ACCEPTIBLE. This negates the entire purpose of the index tree (other than sorting, which is the other grace by which the INDEX is loved). So proper index maintenance in crucial. Usually this can be done periodically, nightly if time and use permits.
    Last edited by rwendel; 04-09-2003 at 09:29 AM.

  4. #4
    Join Date
    Feb 2003
    Posts
    102
    Just to add to RWendels reply

    Indexes speed up searches (SELECTS), if applied correctly.

    Indexes slow down Inserts, Updates and Deletes (as the RDBMS need to maintain the indexes at these events).

    I've not done any tests to check out Accesses performance under these circumstances (the Insert, Update and Delete).

    They also add to the size of the mdb (cause it's gotta store the index somewhere).

    Normally one should index both sides of joins, and as RWendel says fields used in ORDER BY and WHERE clauses.

    The more unique a field eg a boolean true/false field is low in uniqueness, an autonumber field very high (well it is unique!), the better the perfomance gain in general from adding an index.

    You can also create multi field (composite) indexes, such as FamilyName, GivenName. In this instance only FamilyName will be used in the search by the optimizer. In SQL Server at least, and I have read elsewhere but not from MS, using a composite index will mean that Access only has to lookup and return the index, not then do a lookup on the row in the table. This is known as 'covering the query' (all the fields for a specified table, required for a query are in the index).

    Opps, I didn't mean to write so much but indexing is an important topic and normally not enough attention is paid to it by people who are beginning to develop systems (I've seen tables with over 60000 rows without any indexes (including no primary key, and boy, did indexing it speed up performance).

    HTH,

    Peter

  5. #5
    Join Date
    Mar 2003
    Location
    Jacksonville, Florida
    Posts
    52

    forgive me for beating the horse...

    Now the other element in our INDEX discussion is sorting, achieved by the "ORDER BY" clause of the SQL statement. An index tree does this for us too when needed. First a small tangent discussion:

    There are 6 types of tree traversal:

    Preorder (Node,Left,Right)
    Inorder (Left,Node,Right)
    Postorder (Left,Right,Node)

    The other 3 types of traversal are the reverse of each of those:

    Reverse Preorder (Node,Right,Left)
    Reverse Inorder (Right,Node,Left)
    Reverse Postorder (Right,Left,Node)

    Well considering what we know about binary search trees from above, two of these might look particularly interesting, considering our current interest in sorting. The Inorder traversal, and its converse, the Reverse Inorder traversal. The interesting thing about the Inorder travsersal is that when we "Visit" each of the items in the tree in the order given by the Inorder traversal, we get a list of items, which happen to be sorted in ascending order, for example:

    Code:
                   olivia
                /          \
            jeff             tom
          /      \         /      \
       allen    john    sally     zeb
    When we do this inorder traversal (Left,Node,Right)(LNR) on our list from above we get:

    allen,jeff,john,olivia,sally,tom,zeb

    First we go the the first node, olivia. We then traverse its left sub tree using the same inorder technique, which results in us ending up at jeff. The same applies, so we must traverse its left sub tree first. The left sub tree, allen, has no left sub tree, so the node root, allen, is visited, then its right sub tree, which does not exist. Now, when the allen sub tree is finished, control is returned to its parent jeff, which then visits the node's root (jeff) and then traverses its right sub tree (john). The exact order of traversal is shown with the numbers below.

    Code:
                    olivia(4)
                 /             \
           jeff(2)            tom(6)
          /       \          /      \
       allen(1)  john(3)  sally(5)   zeb(7)
    http://forums.databasejournal.com/clear.gif
    Last edited by rwendel; 04-09-2003 at 10:15 AM.

  6. #6
    Join Date
    Apr 2003
    Posts
    48
    lets say from the form itself. I am able to search throught the database to any field that is there. Like First Name, phone number, problem of the situation. Say I have 10 fields. Do I put the index property to yes for all of them if I am able to search through them. If so would it make it faster??

    ROb

  7. #7
    Join Date
    Apr 2003
    Posts
    48
    is there a book that will give me practice and more understanding of this???? I am in the process of creating databases.

  8. #8
    Join Date
    Mar 2003
    Location
    Jacksonville, Florida
    Posts
    52
    It would make it faster, however the index is stored in memory, plus for each index there is an added insertion, deletion, and update time to be considered, so you need to weigh which is required to be faster, lookups (searches and sorts) or data manipulation (insertion, deletion, and updating). There is no 'set' guideline, so determine if speed of searching is more important or if speed updating is more important, and there will be your answer.

  9. #9
    Join Date
    Mar 2003
    Location
    Jacksonville, Florida
    Posts
    52
    I like the SQL Server 2000 Administration Guide, from Microsoft Press. It goes over most of what you'll use, including triggers, joins, views, indexing, stored procedures, and the like.

  10. #10
    Join Date
    Mar 2003
    Location
    Jacksonville, Florida
    Posts
    52
    Also I would suggest a good book on Data Structures, which are just a good thing to know. That is where you will learn about the Binary Tree discussed above and other important programming data structures.

  11. #11
    Join Date
    Feb 2003
    Posts
    102
    Rwendel, excellent technical discussion about SQL Servers indexing structure, but remember the OP is using JET as the backend - not SQL Server.

    A recommended book on MS Access is any of the Access Developers 97|2000|2002 Handbook.

    http://www.developershandbook.com/

    Indexing in JET is 'simpler' than SQL Server cause you don't have as much control (and thus can't squeeze out as much performance...).

    Refer to my earlier post about what to index when using JET and I'll say it again

    1. Both sides of joins
    2. Fields used in WHERE and ORDER BY clauses

    You will get better performance on an index when

    1. There is high number of unique values
    2. There are few nulls (if there's stacks then look at the IgnoreNulls property of the index).

    As RWendel says knowing the data structure the RDBMS (Relation Database Management System) uses for indexes is useful, especially as the size and complexity of your application grows.

    Oh and BTW too many indexes will also reduce the concurrency of your database by creating extra locks on the index structures.

    HTH,

    Peter

  12. #12
    Join Date
    Mar 2003
    Location
    Jacksonville, Florida
    Posts
    52
    Yeah, I just realized what forum I was in...

    Anyway...

    Also a good point, I have found that in purchasing a good Access book, check out the discounted books section in any of the major book sellers. Being a poor college student, I found some really good Access 2000 books placed there. There are some differences between Access 2002 and 2000, but not very significant to make a difference in the book you choose, at least I don't think so.

  13. #13
    Join Date
    Apr 2003
    Posts
    48
    thanks alot

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •