-
Massive query, where NOT IN does not appear to work?
Hello
I'll keep it to the point so we can get down to the nitty gritty!
Table: tags
Fields: tid INT, tname VARCHAR
Table: uploads
Fields: uid INT, ufile VARCHAR
Table: tags_upoads
Fields: tuid INT, tid INT, uid INT
Step 1: From a webpage, a user uploads an image. The filepath to the upload is stored in the uploads table.
Step 2: A user can then assign tags to an upload to describe its data. When a tag is submitted to the server, it checks against the database to see if it already exists, and if it doesn't it will create a new one.
Step 3: Then the server will link those tag ID's to the upload ID using the tags_uploads table. So, uploads can have many tags.
For reporting purposes, I would like to generate a result from the SQL server that meets the following conditions:
* Shows distinct uploaded filenames
* Only if the tag ID matches a certain tag number(s)
* Only if the tag ID does not match other certain tag number(s)
* Ordered by the popularity of the tags selected (number of times the tag ID has been assigned to an upload ID)
However, imagine now that some uploads are tagged with a tag ID of 5 AND 6, but I want to show the uploads that are tagged 5 and exclude the uploads that are tagged with 6. How is that done?
Imagine this data set in tags_uploads for this example:
tuid=1, uid=1, tid=5
tuid=2, uid=1, tid=6
tuid=3, uid=2, tid=5
tuid=4, uid=2, tid=6
tuid=5, uid=3, tid=5
In this example, uid's 1 and 2 are tagged with tid's 5 and 6. Only uid 5 is tagged with just 5 on its own.
However these results need to be pulled back as part of an extremely large query which selects just distinct uploads, which is as follows:
Code:
SELECT uploads.uid, uploads.ufile, tu.tid, counter.tidcount
FROM uploads
INNER JOIN (
SELECT DISTINCT tags_uploads.uid, tags_uploads.tid
FROM tags_uploads
WHERE tags_uploads.tid
IN ( 5 )
AND tags_uploads.tid NOT
IN ( 6 )
GROUP BY tags_uploads.uid
) AS tu ON uploads.uid = tu.uid
INNER JOIN tags ON tags.tid = tu.tid
INNER JOIN (
SELECT tags_uploads.tid, COUNT( * ) AS tidcount
FROM tags_uploads
WHERE tags_uploads.tid
IN ( 5 )
AND tags_uploads.tid NOT
IN ( 6 )
GROUP BY tags_uploads.tid
) AS counter ON tu.tid = counter.tid
ORDER BY counter.tidcount DESC , uploads.uid DESC
If I remove the code that says AND tags_uploads.tid NOT IN (6) from both subqueries, then I receive the exact same set of results.
Please can somebody explain what I am missing here, and what is required for this to now work?
I feel that this is an exceptionally complex question and I really will put my thanks out to the person that can help me with this. I'd even consider payment in some form.
Thanks and good luck
-
Perhaps a better way to visualise the data would be like this:
uploads
uid = 1, ufile = redapple1.jpg
uid = 2, ufile = redapple2.jpg
uid = 3, ufile = greenapple.jpg
tags
tid = 1, tname = apple
tid = 2, tname = red
tid = 3, tname = green
tags_uploads
tuid = 1, tid = 1, uid = 1
tuid = 2, tid = 1, uid = 2
tuid = 3, tid = 1, uid = 3
tuid = 4, tid = 2, uid = 1
tuid = 5, tid = 2, uid = 2
tuid = 6, tid = 3, uid = 3
So we have 3 pictures of apples. All 3 are tagged as apple. 2 are tagged as red and one is tagged as green.
If I wanted to show pictures of apples, but not red apples, I think the query should be this:
Code:
SELECT uid FROM tags_uploads WHERE tid IN (1) AND tid NOT IN (2)
Which should bring back just one result of the green apple, right? Well, it brings back all results. So I believe I am missing some crucial stage of logic here!
Again, thanks in advance for helping out a poor soul.
-
Also in the above, if you just run a query to show NOT apple's, you still get green's and red's, which are apples
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
|