I have a store with Products and Sub-Products. The following query below uses two select statements because it needs to find "matching products" OR "the parent of matching sub-products".
Unfortunately, this query is painfully slow and usually timeout. Does anyone have any ideas that would return the same results, but improve speed.
One thing I noticed it that removing the ORDER BY speeds it up quite a bit.
Please me know if you have any ideas.
Here is the query:
Code:
SELECT TOP 50
A.idProduct, A.sku, A.description, A.price, A.listHidden, A.listPrice, A.serviceSpec, A.bToBPrice, A.smallImageUrl, A.noprices, A.stock, A.noStock, A.pcprod_HideBTOPrice, A.pcProd_BackOrder, A.FormQuantity, A.pcProd_BackOrder, A.pcProd_BTODefaultPrice, A.sDesc
FROM products A
WHERE idProduct IN (
SELECT TOP 50 idProduct FROM categories_products, categories WHERE idProduct = A.idProduct AND ((A.serviceSpec<>0 AND A.pcProd_BTODefaultPrice>=0 And A.pcProd_BTODefaultPrice<=999999999) OR (A.serviceSpec=0 AND A.configOnly=0 AND A.price>=0 And A.price<=999999999)) AND A.active=-1 AND A.removed=0 AND categories.idCategory=categories_products.idCategory AND categories.iBTOhide=0 AND categories.pccats_RetailHide=0 AND idSupplier=10 AND ((((content1 LIKE '%apparel%') OR (content2 LIKE '%apparel%') OR (content3 LIKE '%apparel%'))) OR ((A.details LIKE '%apparel%') OR (A.description LIKE '%apparel%') OR (A.sDesc LIKE '%apparel%') OR (SKU LIKE '%apparel%')))
) OR idProduct IN (
SELECT B.pcProd_ParentPrd FROM products B WHERE pcProd_ParentPrd = A.idProduct AND ((B.serviceSpec=0 AND B.configOnly=0 AND B.price>=0 And B.price<=999999999)) AND B.active=0 AND B.pcProd_SPInActive=0 AND B.pcProd_ParentPrd>0 AND B.removed=0 AND idSupplier=10 AND ((((content1 LIKE '%apparel%') OR (content2 LIKE '%apparel%') OR (content3 LIKE '%apparel%'))) OR ((B.details LIKE '%apparel%') OR (B.description LIKE '%apparel%') OR (B.sDesc LIKE '%apparel%') OR (SKU LIKE '%apparel%')))
) ORDER BY A.sku ASC, A.idproduct DESC
I did look at the execution plan, but I am not really sure what to look for. Nothing stood out to me. I will do some research about how to read those.
In the meantime, can you please tell me if the query logic is sound? Do you see anything that looks like a really bad practice? I may not be able to change the actually db. So I need to make sure the query is optimized as much as possible. If it is already looking the best it can be maybe proper indexing with make a huge difference. Right now though I suspect that must be a big problem in the query because it takes so long to run.
If I remove the ") OR idProduct IN (" and just execute one query or the other it is very fast. So maybe using an "OR" between queries is bad practice? Is there a better way? Note, I also tried UNION and it was even slower.
They all say table scan. The largest step is the last step at 34%. It says "index seek". I have posted a link to the plan here. The only thing is this plan was taken from a db with not as much data as the live store.
You can use index tuning wizard to find out if need new index for this query. By the way, rebuild indexes on involved tables then run your query again. Check plan to see if have any difference.
I noticed that both of your subqueries refer back to the outer table (A.productid, etc.) That will cause the subquery to be rerun for every row in the "Products" table. It also makes the "IN" clause useless, since the subquery will never return more than one row. Try moving all of the checks against the "Products" table into the outer query. Also, don't use "TOP" in a subquery unless you're also using an "ORDER BY"... it will skew your results.
*Don't expect me to reply again - I just happened by on Google and couldn't resist giving a response.
Hi Brilliand, Thank you for taking the time to look at the query. That is very interesting indeed. I was under the impression that the nested queries run first and the outer query runs against just that result set. If what you say is correct then it is very inefficient right now. I am going to try moving the conditions into the outer query.
First things first: you have to organize longer queries into a readable fashion otherwise, you'll just go crazy trying to comprehend it all. (see attachment)
Secondly, 'IN's are faster than 'NOT IN's - but I'm not entirely sure they utilize indexes. Beyond that, I see 'like' in there a lot - that keyword ignores indexes completely because the engine has to do a table scan. True, the prior conditions will narrow the result set, but you can't just partially tune your queries - it should always be an 'all or nothing' kind of thing.
The last poster is right, your first subquery is a TOP 50 - it will never examine more than the first 50 rows returned from that query - so make sure you change that.
He's also correct in that the sub-queries need to run for EACH row. If they were in the FROM clause instead of the WHERE, they would be executed once (and first). But because its in the WHERE clause, it is run for each row.
Also, try to consolidate the conditional statements a bit. If it makes sense to you, thats good, but looking at the statement as someone who knows nothing about your business, I see a LOT of and's and or's and that can be intimidating to anyone. It also makes it difficult to troubleshoot later.
VERY IMPORTANT:
The first subquery's FROM clause shows 2 tables. But they are not joined. The only join I see is from the table that has the 'idProduct' column to the PRODUCTS table from the parent query. The table that does not have the 'idProduct' column is never joined. Even if both tables have that column, you need to explicity specify the join conditions.
If you don't specify a join condition, the query engine may do a full join just to play it safe and that is bad. ALWAYS best to join that way you 'know' what it's doing.
Also - join in the FROM clause. Joining in the WHERE clause is inefficient and is not ansi-standard anymore.
ie:
FROM Products P
INNER JOIN Categories_Products CP
ON P.idProduct = CP.idProduct
WHERE This = This
AND That = That
(supposed to be double-spaced indent, but this board doesn't preserve spacing too well..)
As an alternative to LIKE usage, read up on Full-Text Indexes.. if managed properly, FTI's can help a lot.
First subquery:
Lose the TOP 50
Do not join to Products from within the subquery - this is a co-related subquery (a query that cannot be run by itself). If you have to, add the Products table into the subquery - but try to avoid referencing tables outside the query if you can help it.
Make sure the tables listed in the FROM clause are joined
Joins should happen in the FROM clause, not the WHERE
Second subquery:
You are basically joining Products to itself here. (alias 'B' is PRODUCTS and so is 'A') If that's the case, then you don't need this subquery.
I know this is kind of long winded.. but hopefully you got something useful out of it.
Last thing, when using aliases, try to qualify ALL columns with the alias letter. In the first subquery, noone knows what table idSupplier or SKU belongs to.
I second Phantasm's recommendations. You should also look at structuring your code better, like Phantasm's attachment (not my coding style, but it'll do). It's difficult unraveling what your query is trying to do. It's also helpful to post table structures for complex questions like this. As a rule of thumb, if the code is difficult to read, it's more difficult to fix.
One thing that is going to kill performance is your LIKE '%apparel%' statements, that's going to bring the server to its knees, especially if you're searching a lot of records. You also have several instances of code where you're comparing price along these lines: *Price>=0 And *Price<=999999999. Do you have prices outside of that range? Not knowing your data, it's difficult to offer specific fixes. If you, for example, had a bit field on apparel items rather than doing wildcard searches, you could greatly improve your performance.
And as others have suggested, check for indexes on the fields that you are joining on.
I don't know if you're familiar with Red Gate Software. They have some free ebooks that you can download, including one on reading query plans. You can download it from here: http://www.red-gate.com/about/news/ebook.htm. Click on the Download Best of SQLServerCentral.com, Vol. 1 link, and you should find it on the following page.
FROM products A
WHERE
((A.serviceSpec<>0
AND A.pcProd_BTODefaultPrice>=0
And A.pcProd_BTODefaultPrice<=999999999)
OR
(A.serviceSpec=0 AND A.configOnly=0
AND A.price>=0 And A.price<=999999999)
)
AND A.active=-1
AND A.removed=0
AND (A.details LIKE '%apparel%'
OR A.description LIKE '%apparel%'
OR A.sDesc LIKE '%apparel%' SKU LIKE '%apparel%')
idProduct IN (
SELECT TOP 50 idProduct
FROM categories_products, categories
WHERE
AND categories.idCategory=categories_products.idCatego ry
AND categories.iBTOhide=0
AND categories.pccats_RetailHide=0
AND idSupplier=10
AND (
content1 LIKE '%apparel%'
OR content2 LIKE '%apparel%'
OR content3 LIKE '%apparel%' )
union
SELECT B.pcProd_ParentPrd
FROM products B
WHERE B.serviceSpec=0
AND B.configOnly=0
AND B.price>=0
And B.price<=999999999
AND B.active=0
AND B.pcProd_SPInActive=0
AND B.pcProd_ParentPrd>0
AND B.removed=0
AND idSupplier=10
AND (content1 LIKE '%apparel%'
OR content2 LIKE '%apparel%'
OR content3 LIKE '%apparel%'
OR B.details LIKE '%apparel%'
OR B.description LIKE '%apparel%'
OR B.sDesc LIKE '%apparel%'
)
) ORDER BY A.sku ASC, A.idproduct DESC