*“The fastest way to perform a calculation is to have already done so.”*

## Introduction

Here’s an interesting problem: given a sequence of numbers between ** M** and

**, how many are prime numbers? How would you solve this and how efficient will your solution be if M=0 and N=10,000,000? On my PC I’m able to get the job done in about 3 seconds.**

*N*## Introducing the Indexed Function

One day while trying to figure out some recursive query I stumble across a SQL Server Central post about calculating a Fibonacci sequence where someone asked, *“why not just store the values in a table?*” It seems obscenely obvious to me now but at the time, but it blew me away that I never thought of it. It still blows my mind how so few developers take advantage of this. Just populate a table (on-disk or in-memory) with the values you need, add indices as needed, then build an inline table valued function (iTVF) to retrieve the value with an index seek. This allows your function to run in O(1) time for each value passed to it. Indexed values are stored as a suitably sorted stream which makes it possible to retrieve the value(s) without a scan. A web search for “*create Fibonacci function*“, “function for prime numbers” or something similar will generally return various algorithms (mostly recursive) but scarce mentions of pre-calculating problems where the values never change. Doesn’t Fibonacci(7) *always* 13? Factorial(5) = 120? When is 7 *not* a prime number? ** I suspect that, the fastest way to perform a calculation is to have already done so**, then save the result in an index for fast retrieval.

In Paul White’s article, *Understanding and using APPLY (Part 1)* he refers to an inline table valued function (iTVF) as “a view that accepts parameters”. With that in mind, what would be the best way to describe an* indexed view (or table) *that* accepts parameters*? I’m thinking ** indexed function**.

## Nasty Fast Primes

Primes is a great example since calculating them can be costly, especially when using a solution which “counts up” iteratively or recursively. For this exercise we’re going to build a prime number function that can determine if an integer between 1 and 10,000,000 is prime and do so *in O(1) time and space*. I know that 99.999% of you don’t need a prime number function so remember that this is for learning and to prove the power of this concept; this technique will work against any immutable set of values.You don’t have to play along or can test this with a smaller number of primes. I’m including the DDL used to create the function so you can keep me honest about the performance tests later in the article.

### The Setup Procedure

This proc indexes and populates a prime numbers table called *idx.prime*. There are two parameters: *@Counter* and *@maxrows*. *@counter* is the number that the proc starts at and should be left at 14. You can use @counter to set up batch inserts. If you execute usp_primes and accept the default values, the function will support 1-10M with a bit column indicating if the number is prime. You don’t have to do 10M rows but it’s ~300MB, uncompressed, if you do. For me it this runs 20 minutes without batching the insert; I did it all in one shot. Note that I keep my indexed functions in an “idx” schema, no harm in using dbo. Once the procedure is done you can create our prime numbers function *as an indexed function*. As mentioned in Part 2 of my PerfML series, functional purity is key so only an iTVF will do. If you’re playing along you will need a one million row tally table, you can get the DDL

**here**and the stored procedure to create an indexed prime number function is attached below.

Note that it creates a schema name idx; this is where my indexed functions live but it’s not necessary.

### The Indexed Function (idx.fnPrimes)

Now we have a sorted set of numbers from 1 to 10M. Our function takes @N as an input and performs a one-row index seek.

**Figure 1: idx.fnPrimes**

```
CREATE FUNCTION idx.fnPrimes
(
@N BIGINT
)
RETURNS TABLE WITH SCHEMABINDING AS RETURN
SELECT p.N, p.PRIME FROM idx.prime AS p WHERE p.N = @N;
GO
```

To check if a number is prime:

**Figure 2: Basic idx.fnPrimes demo**

`SELECT p.N, p.PRIME FROM idx.fnPrimes(11) AS p;`

To return the numbers 1 to 7 along with their prime value we use fnTally to generate the stream of numbers and map them to fnPrimes.

```
SELECT p.N, p.PRIME
FROM dbo.fnTally(1,7) AS f
CROSS APPLY idx.fnPrimes(f.N) AS p;
```

This returns: [(1 1),(2 1),(3 1),(4 0),(5 1),(6 0),(7,1)]; now a quick performance test.

## Performance Test #1: Calculate Primes 1 to 1 Million

The purpose of this test is to prove out the concept of the indexed function. This query will take the values 1 through 10,000,000 and determine if they are prime or not. I am forcing a serial plan since the serial plan actually performs best. @P variable captures the value so as to avoid SSMS spitting out 10 million rows and ruining the performance metrics.

**Figure 3.1: idx.fnPrimes Performance Test**

```
PRINT CHAR(13)+'fnTally serial'+CHAR(13)+REPLICATE('-',90);
GO
DECLARE @ST DATETIME = GETDATE(), @P BIT;
SELECT @P = p.PRIME
FROM dbo.fnTally(1,10000000) AS f
CROSS APPLY idx.fnPrimes(f.N) AS p
OPTION (MAXDOP 1);
PRINT DATEDIFF(MS,@ST,GETDATE());
GO 3
```

**Figure 3.2: idx.fnPrimes Performance Test Results**

```
fnTally serial
------------------------------------------------------------------------------------------
Beginning execution loop
2660
2727
2750
```

Under ** 3 seconds** for idx.fnPrimes to accept the numbers 1 to 10,000,000 and determine if which ones are prime. This with a serial execution plan!

## Unexpected Virtual Index Pitfalls (continued…)

One motivation for giving a name to what I call, the “virtual index,” is to make the following type of problem (and solution) easier to explain. Part 2 of this series you were introduced to some unexpected virtual indexing pitfalls which can be tricky to identify. There’s another issue to watch out for people using SQL 2019 which includes batch mode over rowstore, or users or who leverage Itzik Ben-Gan’s batch mode processing trick to enjoy this functionality on SQL Server 2016+. I use SQL 2019 and the optimizer leverages a window aggregate function that invokes batch mode over rowstore which is the culprit here. Below are two techniques to retrieve the numbers 1 to 7 and identify which are prime.

**Figure 4: Hidden CTE Tally table row retrieval issue**

```
--==== 1. Leveraging fnTally
SELECT p.N, p.PRIME
FROM dbo.fnTally(1,7) AS f
CROSS APPLY idx.fnPrimes(f.N) AS p;
--==== 2. Static Values
SELECT p.N, p.PRIME
FROM (VALUES(1),(2),(3),(4),(5),(6),(7)) AS f(N)
CROSS APPLY idx.fnPrimes(f.N) AS p;
```

Each function returns the expected result set: [(1 1),(2 1),(3 1),(4 0),(5 1),(6 0),(7,1)]; The execution plan for the fnTally solution.

**Figure 4.1: Execution Plan Details**

fnTally retrieved ** 2700 rows** to generate the numbers 1-7! The second query didn’t have this issue. Luckily there are more than a few ways to solve this.

### Solution 1: Disable batch mode over rowstore

Let’s temporarily disable batch mode over rowstore and try again.

**Figure 5.1: Solution #1: Disable batch mode over rowstore query**

```
SELECT p.N, p.PRIME
FROM dbo.fnTally(1,7) AS f
CROSS APPLY idx.fnPrimes(f.N) AS p;
OPTION(USE HINT('DISALLOW_BATCH_MODE'));
```

**Figure 5.2: Solution #1: Execution plan**

7 rows only, much better!

### Solution 2: make_parallel

In my experience Adam Machanic’s make_parallel always forces plans that don’t include batch mode over rowstore. I don’t know if it’s an option with make_parallel in play.

**Figure 6.1: Solution #2: make_parallel**

```
SELECT p.N, p.PRIME
FROM dbo.fnTally(1,7) AS f
CROSS APPLY idx.fnPrimes(f.N) AS p
CROSS JOIN dbo.make_parallel() AS parallel_execution;
```

Below is the portion of the plan where the ROW_NUMBER sequence is created; again, only 7 rows

**Figure 6.2: Solution #2: Execution plan**

### Solution 3: Use a physical tally table

The conventional wisdom seems to be that the CTE tally table is faster than a physical table pretty much *all the time*, a reasonable assumption since the CTE tally table is readless but physical tally tables do require IO. Here is an example of where the physical tally table (correctly indexed) has an edge.

**Figure 7.1: Solution #3: Physical Tally Table**

Note that both a TOP + ORDER BY, WHERE clause filter, or a both together will do just fine.

```
SELECT TOP(7) p.N, p.PRIME
FROM dbo.tally AS f
CROSS APPLY idx.fnPrimes(f.N) AS p
ORDER BY f.N;
```

Again, only **7 numbers** retrieved.

**Figure 7.2: Solution #3: Execution Plan**

This is one of many subtle differences between a persisted tally table and a CTE version. Let’s explore deeper.

## RangeAB and RangeAB2

We will conclude by running two performance tests against one million rows. The one million limit is because that’s what dbo.tally is capped at. For this test let two tally table functions: the first can be considered an “add-on” to dbo.fnTally named dbo.rangeAB. The second takes identical parameters and returns identical results, the only difference is that it leverages physical tally table (dbo.tally) instead of dbo.fnTally.

Both functions can be used to help enforce best practices with as little code as possible. Even better: *Both functions return the exact result set given the same parameters.* *This gives you the ability to test your set-based code against both a physical and virtual tally table index simply adding or removing the number “2” from the end of “RangeAB.”*

The functions and usage examples are located here. For the purposes of this article I will show you the execution plans for a basic usage scenario so we can compare execution plans. Both examples use one of the RangeAB functions to return a series of weeks. We’ll run these with “show actual execution plan” turned on.

**Figure 8.1: Basic Usage Example**

```
DECLARE
@startdate DATE = '20200831',
@weeks BIGINT = 5,
@row1 BIGINT = 1;
BEGIN
--==== RangeAB (fnTally version)
SELECT
WeekNumber = f.RN,
WeeksLeft = f.OP,
DaysFromMedian = ABS(f.DM),
WeekStart = DATEADD(DAY,f.N1-1,@startdate),
WeekEnd = DATEADD(DAY,f.N2-2,@startdate)
FROM dbo.rangeAB(@row1, @weeks*7-(1-@row1), 7, @row1) AS f
ORDER BY f.RN;
--==== RangeAB2 (dbo.tally version)
SELECT
WeekNumber = f.RN,
WeeksLeft = f.OP,
DaysFromMedian = ABS(f.DM),
WeekStart = DATEADD(DAY,f.N1-1,@startdate),
WeekEnd = DATEADD(DAY,f.N2-2,@startdate)
FROM dbo.rangeAB2(@row1, @weeks*7-(1-@row1), 7, @row1) AS f
ORDER BY f.RN;
END;
```

**Figure 8.2: Results (for each)**

```
WeekNumber WeeksLeft DaysFromMedian WeekStart WeekEnd
----------- ---------- --------------- ---------- ----------
1 5 14 2020-08-31 2020-09-06
2 4 7 2020-09-07 2020-09-13
3 3 0 2020-09-14 2020-09-20
4 2 7 2020-09-21 2020-09-27
5 1 14 2020-09-28 2020-10-04
```

### Execution plans

Ignore the 40/60% estimates, they aren’t accurate, both perform almost identically. The *win* is that we can use each to quickly determine which is better – dbo.tally or dbo.fnTally without any logic changes. fnTally uses the Itzik Ben-Gan style ROW_NUMBER over cross joins, the other gets its rows from dbo.tally.

**Figure 8.3: Execution Plans**

## Prime Numbers Test #1: RangeAB for a running total of Primes

Now back to our indexed function, idx.fnPrimes. This first example is a simple running total of prime numbers from 1 to 50. Ignore the @gap parameter for now.

```
DECLARE @gap BIGINT = 1,
@low BIGINT = 0,
@hgh BIGINT = 50;
--==== Solution #1: Primes running total over an Indexed Function + Virtual Index (RangeAB)
SELECT
N = ((rnd.RN-2)*@gap+@low)+1,
TotalPrimes = SUM(p.PRIME+0)
FROM dbo.rangeAB(@low,@hgh,@gap,1) AS rnd
CROSS APPLY dbo.rangeAB(0,rnd.N1,1,0) AS r
CROSS APPLY idx.fnPrimes(r.RN) AS p
GROUP BY rnd.RN
ORDER BY rnd.RN;
--==== Solution #2: Primes running total over two indexed functions
SELECT
N = ((rnd.RN-2)*@gap+@low)+1,
TotalPrimes = SUM(p.PRIME+0)
FROM dbo.rangeAB2(@low,@hgh,@gap,1) AS rnd
CROSS APPLY dbo.rangeAB2(0,rnd.N1,1,0) AS r
CROSS APPLY idx.fnPrimes(r.RN) AS p
GROUP BY rnd.RN
ORDER BY rnd.RN;
```

**Figure 10.1: Primes Test #1 Results (truncated for brevity)**

```
N TotalPrimes
------- -----------
1 1
2 2
3 3
4 3
5 4
6 4
7 5
8 5
...
46 15
47 16
48 16
49 16
50 16
```

## Prime Numbers Test #2: Performance

Here we’ll do two tests, first with a serial plan, then with a parallel plan. What we need to do is create @gap-sized groups and collect a between @low and @hgh along with a count of prime numbers in that group. We’ll compare the performance RangeAB vs RangeAB2. Again, the logic is identical except that the latter leverages dbo.tally. Note the parameters.

First to test serial execution using MAXDOP 1.

**Figure 11.1 Serial Test **

```
SET STATISTICS IO, TIME ON;
DECLARE
@gap BIGINT = 100000,
@low BIGINT = 0,
@hgh BIGINT = 1000000;
PRINT CHAR(10)+'fnTally version'+CHAR(10)+REPLICATE('-',90);
SELECT
L = ((rnd.RN-2)*@gap+@low)+1,
H = (rnd.RN-1)*@gap+@low,
TotalPrimes = SUM(p.PRIME+0)
FROM dbo.rangeAB(@low,@hgh,@gap,1) AS rnd
CROSS APPLY dbo.rangeAB(0,rnd.N1,1,0) AS r
CROSS APPLY idx.fnPrimes(r.RN) AS p
GROUP BY rnd.RN
ORDER BY rnd.RN
OPTION (MAXDOP 1);
PRINT CHAR(10)+'Tally Function V2 - Indexed'+CHAR(10)+REPLICATE('-',90);
SELECT
L = ((rnd.RN-2)*@gap+@low)+1,
H = (rnd.RN-1)*@gap+@low,
TotalPrimes = SUM(p.PRIME+0)
FROM dbo.rangeAB2(@low,@hgh,@gap,1) AS rnd
CROSS APPLY dbo.rangeAB2(0,rnd.N1,1,0) AS r
CROSS APPLY idx.fnPrimes(r.RN) AS p
GROUP BY rnd.RN
ORDER BY rnd.RN
OPTION (MAXDOP 1);
SET STATISTICS IO, TIME OFF;
```

**Figure 11.2: Primes Test #2 Result Set**

```
L H TotalPrimes
----------- ------------ -----------
1 100000 19187
100001 200000 38367
200001 300000 57548
300001 400000 76728
400001 500000 95910
500001 600000 115092
600001 700000 134272
700001 800000 153453
800001 900000 172634
900001 1000000 191814
```

**Figure 11.3: Primes Test #2 Performance (Serial: Time & IO)**

```
fnTally version
------------------------------------------------------------------------------------------
Table 'prime'. Scan count 0, logical reads 16843792, physical reads 0...
Table 'Worktable'. Scan count 0, logical reads 15101551, physical reads 0...
SQL Server Execution Times: CPU time = 23312 ms, elapsed time = 23452 ms.
Tally Function V2 - Indexed
------------------------------------------------------------------------------------------
Table 'prime'. Scan count 0, logical reads 16843792, physical reads 0...
Table 'tally'. Scan count 2, logical reads 9555, physical reads 0...
SQL Server Execution Times: CPU time = 9110 ms, elapsed time = 9163 ms.
```

This a case where dbo.tally finds redemption. 9 Seconds vs 23 – not bad. Next for a parallel execution plan leveraging dbo.make_parallel.

**Figure 11.4: Parallel Test**

```
SET STATISTICS IO, TIME ON;
DECLARE
@gap BIGINT = 100000,
@low BIGINT = 0,
@hgh BIGINT = 1000000;
PRINT CHAR(10)+'fnTally version'+CHAR(10)+REPLICATE('-',90);
SELECT
L = ((rnd.RN-2)*@gap+@low)+1,
H = (rnd.RN-1)*@gap+@low,
TotalPrimes = SUM(p.PRIME+0)
FROM dbo.rangeAB(@low,@hgh,@gap,1) AS rnd
CROSS APPLY dbo.rangeAB(0,rnd.N1,1,0) AS r
CROSS APPLY idx.fnPrimes(r.RN) AS p
CROSS JOIN dbo.make_parallel() AS x
GROUP BY rnd.RN
ORDER BY rnd.RN;
PRINT CHAR(10)+'Tally Function V2 - Indexed'+CHAR(10)+REPLICATE('-',90);
SELECT
L = ((rnd.RN-2)*@gap+@low)+1,
H = (rnd.RN-1)*@gap+@low,
TotalPrimes = SUM(p.PRIME+0)
FROM dbo.rangeAB2(@low,@hgh,@gap,1) AS rnd
CROSS APPLY dbo.rangeAB2(0,rnd.N1,1,0) AS r
CROSS APPLY idx.fnPrimes(r.RN) AS p
CROSS JOIN dbo.make_parallel() AS x
GROUP BY rnd.RN
ORDER BY rnd.RN;
SET STATISTICS IO, TIME OFF;
```

**Figure 11.5: Primes Test #2 Performance (Parallel: Time & IO)**

```
fnTally version (RangeAB)
------------------------------------------------------------------------------------------
Table 'prime'. Scan count 0, logical reads 16843855, physical reads 0...
Table 'Worktable'. Scan count 0, logical reads 15101558, physical reads 0...
SQL Server Execution Times: CPU time = 35877 ms, elapsed time = 7816 ms.
dbo.tally version (RangeAB2)
------------------------------------------------------------------------------------------
Table 'tally'. Scan count 9, logical reads 9604, physical reads 0...
Table 'prime'. Scan count 0, logical reads 16843855, physical reads 0...
SQL Server Execution Times: CPU time = 16687 ms, elapsed time = 3638 ms.
```

The great news is that, in both cases, parallel execution speeds things up for both. The fist version (virtual index) increasing to 7.8 seconds, the second (the indexed function) completes in 3.6 seconds. The reads against the primes table (via the idx.primes indexed function) are excessive. That said, to calculate the numbers of primes, split into 100K chunks up to 1,000,000… that should take longer than 3.6 seconds.

## Conclusion

We have a new way to take complicated functions and speed them up to O(1) time leveraging a new concept I coined as the *Indexed Function*.; a powerful tool for you set-based programming toolbox. We have a way to quickly compare a CTE-Tally to a real one. We’ve learned that a physical tally table, in some cases, can blow the doors off a virtual one.

The Virtual Index and Indexed Functions are indispensable tools in my toolbox. Thanks for reading.

## 2 thoughts on “Virtual Indexing Part 3: The Indexed Function”