If your SQL VIEW seems to be running slow, and you’re doing aggregation in it with a GROUP BY clause, note that certain things aren’t pushed down to the base query.
Most SQL servers will pass predicates in the WHERE clause down to the base query, as well as the columns in the SELECT statement.
For example, your SQL VIEW contains:
SELECT column1, column2 FROM mytable;
If your SQL query is the following:
SELECT column1 FROM myview WHERE column1 > 0;
Your VIEW, the base query, will be changed to:
SELECT column1 FROM mytable WHERE column1 > 0;
The list of columns in the SELECT clause has been passed down to the base query, as well as the WHERE predicate. For this query, it saves the IO overhead of processing the second column, and processing the rows where column1 <= 0.
However, passing down the list of columns in the SELECT won’t save the CPU and memory overhead required to do a GROUP BY on columns that aren’t selected.
For example, your SQL VIEW contains:
SELECT column1, column2, SUM(column3) AS total FROM mytable GROUP BY column1, column2;
If your SQL query is this:
SELECT column1, total FROM myview WHERE column1 > 0;
You might expect your VIEW, the base query, to be changed to:
SELECT column1, SUM(column3) AS total FROM mytable WHERE column1 > 0 GROUP BY column1, column2;
However, your VIEW, the base query, will actually be changed to:
SELECT column1, column2, SUM(column3) AS total FROM mytable WHERE column1 > 0 GROUP BY column1, column2;
Even though the SELECT clause doesn’t contain column2, it is required to do the GROUP BY in the VIEW.
This may be what you intended, but if the remaining columns are just metadata for column1 (for example, if column1 is a unique key), you can save on the overhead of grouping on all columns by changing your view to only group on the columns that are needed.
SELECT column1, SUM(column3) AS total FROM mytable GROUP BY column1, column2;