SQL GROUP BY distinct rows

I have a Postgres database with one very long table and 3 columns, like so:

s_id | c_id | a_id
1 | 1 | 2
1 | 1 | 3
1 | 3 | 15
2 | 1 | 2
2 | 2 | 23
3 | 1 | 2
3 | 3 | 16

I have a query that finds all s_ids that have both c_id 1 and c_id 3, and returns them with their counts:

SELECT s_id, COUNT(s_id) AS matching_clusters
FROM test
WHERE c_id IN (1,3)
GROUP BY s_id
HAVING COUNT(c_id) >= 2
ORDER BY matching_clusters DESC

What I get back is the following:

s_id | matching_clusters
1 | 3
3 | 2

But I only want to count a recurring c_id once, so the results here should be:

s_id | matching_clusters
1 | 2
3 | 2

Any suggestions on how to do this? I thought I could put DISTINCT inside the COUNT, but that didn't work. I could probably join the result back to the table itself with distinct c_id values, but I don't want to re-run the query, because running a query on this table is computationally very expensive.
Best answer

If I understand correctly, then this will work:

SELECT s_id, 2 AS matching_clusters
FROM test
WHERE c_id IN (1,3)
GROUP BY s_id
HAVING COUNT(c_id) >= 2
ORDER BY matching_clusters DESC;

Note that this keeps the original HAVING COUNT(c_id) >= 2, so an s_id with two rows for the same c_id (and none for the other) would still pass the filter.

This may be what you want:

SELECT s_id, COUNT(DISTINCT c_id) AS matching_clusters
FROM test
WHERE c_id IN (1,3)
GROUP BY s_id
HAVING COUNT(DISTINCT c_id) = 2
ORDER BY matching_clusters DESC;

Note the use of DISTINCT in the HAVING clause: combined with the WHERE filter, COUNT(DISTINCT c_id) = 2 holds only when both c_id 1 and c_id 3 are present for that s_id.
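To check the difference between the original query and the COUNT(DISTINCT c_id) version on the sample rows, here is a minimal sketch. It is not run against Postgres: it assumes an in-memory SQLite database as a stand-in, which handles these particular queries the same way. The `run` helper, the constant names, and the extra `s_id` tie-breaker in ORDER BY (added only to make the row order deterministic) are illustrative additions, not part of the original post.

```python
import sqlite3

# Sample rows from the question: (s_id, c_id, a_id).
ROWS = [
    (1, 1, 2), (1, 1, 3), (1, 3, 15),
    (2, 1, 2), (2, 2, 23),
    (3, 1, 2), (3, 3, 16),
]

# The asker's query: counts every matching row, so duplicates inflate the count.
ORIGINAL = """
SELECT s_id, COUNT(s_id) AS matching_clusters
FROM test
WHERE c_id IN (1, 3)
GROUP BY s_id
HAVING COUNT(c_id) >= 2
ORDER BY matching_clusters DESC, s_id  -- s_id tie-breaker is an addition
"""

# The answer's query: counts each c_id once per s_id, in SELECT and in HAVING.
WITH_DISTINCT = """
SELECT s_id, COUNT(DISTINCT c_id) AS matching_clusters
FROM test
WHERE c_id IN (1, 3)
GROUP BY s_id
HAVING COUNT(DISTINCT c_id) = 2
ORDER BY matching_clusters DESC, s_id  -- s_id tie-breaker is an addition
"""


def run(query, rows=ROWS):
    """Load rows into a fresh in-memory table named test and run the query."""
    conn = sqlite3.connect(":memory:")  # SQLite stand-in, not Postgres
    try:
        conn.execute("CREATE TABLE test (s_id INTEGER, c_id INTEGER, a_id INTEGER)")
        conn.executemany("INSERT INTO test VALUES (?, ?, ?)", rows)
        return conn.execute(query).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    print(run(ORIGINAL))       # [(1, 3), (3, 2)]
    print(run(WITH_DISTINCT))  # [(1, 2), (3, 2)]
```

Passing extra rows such as `ROWS + [(4, 1, 7), (4, 1, 8)]` to `run` shows the other effect of DISTINCT in HAVING: the original query returns s_id 4 even though it has no row with c_id 3, while the DISTINCT version leaves it out.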