Ramblings of a SQL Hillbilly

Or, the mad ravings of a data warehouse forklift operator

Temporary Query Items: Table Variables

Note: This is part of a series on Temporary Query Items.

Previously we’ve talked about Temporary Query Items, what they are, why they matter, and the various factors that might cause you to choose one over another. We’ve also already talked about our first TQI, temporary tables. Without further ado, it’s time for our next one – the Table Variable.

A Little History

Table variables have been present in SQL Server since at least SQL Server 2000. You should be able to make use of them everywhere.

Location, Location, Location

Table variables live in TempDB. There’s an ancient myth that says table variables only live in memory – that’s not entirely true. Like any other data set in use, a table variable may be cached in memory, but its actual home address is TempDB. As with temporary tables, this means that table variables can more easily cause TempDB contention than other TQIs.

Unlike temporary tables, table variables are local in scope. You can only use them in the originating connection.

How Long Does It Live?

Table variables don’t last very long relative to temporary tables, but they do have a fair amount of longevity. When you create a table variable, the data contained within will exist until the completion of the transaction. You can refer to it in multiple queries within one transaction, but it will disappear at the end of transaction.

Indexing

Table variables do support clustered indexes, but ONLY when the index is defined in the variable declaration. This actually improves somewhat in SQL Server 2014, where we have the ability to define non-clustered indexes directly in our table create statements – and therefore, we are able to define non-clustered indexes on table variables there.

One thing that table variables do NOT support is column statistics. This can be a pain point due to the fact that the query optimizer can’t determine how many rows to expect out of a table variable, so it assumes that the number will always be 1. 1 is a terrible default, but it’s better than all the other potential defaults. Large result sets stored in table variables can lead to performance problems depending on the execution plan that is chosen.

Copy That

Like temporary tables, when you create a table variable based on an existing data set, it is created as a separate physical copy of the data. Modifications to the table variable do not affect the original data source. This can be either a strength or a weakness , depending on your purposes for using a TQI.

But Does It Blend?

The data in a table variable is fully modifiable. However, modifications to the table variable result set do not automatically translate back to the original data used to populate it. This can definitely be either a point for or against your choice of TQI.

Show Me The Code!

Test Data Setup

I’ll be using a test table to do some demonstrations. You can create/populate it here:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
create table dbo.JK_Temp_Stuff_Test_Table (
       myIntKey int PRIMARY KEY CLUSTERED,
       myVarchar varchar(10) not null
);

insert into dbo.JK_Temp_Stuff_Test_Table (myIntKey, myVarchar)
values (5, 'Wheeee');

create table dbo.JK_Temp_Stuff_Test_Table_2 (
       myIntKey int PRIMARY KEY CLUSTERED,
       myOtherVarchar varchar(10) not null
);

insert into dbo.JK_Temp_Stuff_Test_Table_2 (myIntKey, myOtherVarchar)
values (5, 'Waffle');

Creation

The creation syntax for a table variable is a hybrid between a CREATE TABLE statement and a DECLARE statement – which makes sense, as a table variable is a table that happens to be a variable.

1
2
3
4
declare @JK_Test_Table_Variable TABLE (
       myIntKey int PRIMARY KEY CLUSTERED,
       myVarchar varchar(10) not null
);

Note that you CANNOT do a SELECT INTO with a table variable.

Modification

Inserting into a table variable is just like inserting into a regular table. However, note that you have to do the INSERT in the same transaction as the CREATE.

1
2
3
4
5
6
7
8
9
10
declare @JK_Test_Table_Variable TABLE (
       myIntKey int PRIMARY KEY CLUSTERED,
       myVarchar varchar(10) not null
);

insert into ##JK_Test_Temp_Table (myIntKey, myVarchar)
values (1, 'A'),
       (2, 'B'),
       (3, 'A'),
       (4, 'C');

Table variables are also updateable:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
declare @JK_Test_Table_Variable TABLE (
       myIntKey int PRIMARY KEY CLUSTERED,
       myVarchar varchar(10) not null
);

insert into @JK_Test_Table_Variable (myIntKey, myVarchar)
values (1, 'A'),
       (2, 'B'),
       (3, 'A'),
       (4, 'C');

update tgt
   set MyVarchar = 'D'
 from @JK_Test_Table_Variable tgt
where myIntKey = 4;

select *
  from @JK_Test_Table_Variable;

Indexing

Unfortunately, you cannot index a table variable after creation. This means that prior to SQL Server 2014, you cannot add nonclustered indexes to table variables. You can, however, declare a table variable with a clustered or unique index on it:

1
2
3
4
5
6
7
8
9
declare @JK_Test_Table_Variable TABLE (
       myIntKey int PRIMARY KEY CLUSTERED,
       myVarchar varchar(10) not null
);

-- Fails
create nonclustered index JK_Test_Table_Variable_SIDX_1 on @JK_Test_Table_Variable (
       myVarchar asc
);

"Two for One Sale, Sunday only!"

It is also not possible to create statistics on a table variable. Therefore, the query optimizer will always estimate that there is 1 row in the table variable, and therefore it may make some very bad plan choices. You’ll find that a lot of people insist that table variables are bad for performance – this is usually the reason. A table variable with millions of rows is almost always going to be slower than about any other option.

Global vs Local

Table Variables are only local. They’re so local, actually, that their use is limited to the transaction they’re created in.

Longevity

No really, only the creating transaction. The insert will fail:

1
2
3
4
5
6
7
8
9
10
11
12
declare @@JK_Test_Table_Variable TABLE (
       myIntKey int PRIMARY KEY CLUSTERED,
       myVarchar varchar(10) not null
);
go

insert into @@JK_Test_Table_Variable (myIntKey, myVarchar)
values (1, 'A'),
       (2, 'B'),
       (3, 'A'),
       (4, 'C');
go

Source Data Dependence

As with temporary tables, data set modifications are done independently of the original data. This should be pretty clear from the INSERT INTO syntax:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
declare @JK_Test_Table_Variable TABLE (
       myIntKey int PRIMARY KEY CLUSTERED,
       myVarchar varchar(10) not null
);

insert into @JK_Test_Table_Variable
select *
  from dbo.JK_Temp_Stuff_Test_Table;

update @JK_Test_Table_Variable
   set myVarchar = 'Whooo';

select * from @JK_Test_Table_Variable;
select * from dbo.JK_Temp_Stuff_Test_Table;

Cleanup

And here’s the cleanup code for this exercise:

1
2
drop table dbo.JK_Temp_Stuff_Test_Table;
drop table dbo.JK_Temp_Stuff_Test_Table_2;

Moving On

We’ve taken some time to look at some attributes of Table Variables, including strengths and weaknesses. As with Temporary Tables, you can’t make a decision about using them in a vacuum. In particular, I’ve cautioned against the use of table variables for large data sets – but for very small data sets, table variables can be very useful. I particularly like them for use with the OUTPUT clause of the MERGE statement when trying to log row counts in SSIS.

Next up: Views.