The database is SQL Server 2008. I have a query which extracts rows from one or more tables and then attempts to insert them into a table variable.
I'd like an efficient way to prevent duplicate inserts, so what I came up with was:
    INSERT INTO @MyTableVariable
    SELECT st.SomeID
    FROM SomeTable st
    INNER JOIN SomeOtherTable sot ON sot.SomeID = st.SomeID
    LEFT JOIN @MyTableVariable t ON t.SomeID = sot.SomeID
    WHERE t.SomeID IS NULL
This, however, does not appear to prevent duplicate inserts in certain scenarios.
It seems (and this makes sense once you think about it and look at the query plan) that only the initial "state" of @MyTableVariable is used in the left join operation. In other words, this prevents duplicates if @MyTableVariable already had SomeID in it before the statement was run, but it does not prevent duplicates if the FROM/INNER JOIN on SomeTable/SomeOtherTable itself produces duplicate SomeIDs.
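A small repro makes the behaviour easy to see. This is only a sketch: the table variables @SomeTable and @SomeOtherTable and their sample rows are made-up stand-ins for the real tables.

```sql
-- Hypothetical sample data; table variables stand in for the real tables.
DECLARE @SomeTable TABLE (SomeID int);
DECLARE @SomeOtherTable TABLE (SomeID int);
DECLARE @MyTableVariable TABLE (SomeID int);

INSERT INTO @SomeTable VALUES (1), (2);
INSERT INTO @SomeOtherTable VALUES (1), (2), (2);  -- SomeID 2 appears twice
INSERT INTO @MyTableVariable VALUES (1);           -- 1 is already present

-- Same shape as the statement above.
INSERT INTO @MyTableVariable
SELECT st.SomeID
FROM @SomeTable st
INNER JOIN @SomeOtherTable sot ON sot.SomeID = st.SomeID
LEFT JOIN @MyTableVariable t ON t.SomeID = sot.SomeID
WHERE t.SomeID IS NULL;

SELECT SomeID FROM @MyTableVariable ORDER BY SomeID;
```

The final SELECT returns 1, 2, 2: the existing 1 is filtered out by the left join, but both copies of 2 get through, because @MyTableVariable is read as it was before the statement started.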
Aside from simply slapping a DISTINCT on the SELECT statement, is there another, more efficient way to handle this?
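As far as I know, SQL Server has no equivalent of INSERT IGNORE or INSERT ... ON DUPLICATE KEY UPDATE (as in MySQL). There is MERGE, of course, but it wouldn't solve your problem, because it would behave the same as your INSERT when the source rows contain duplicates: it would either insert them all or, if the target has a unique key, raise an exception.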
"is there another, more efficient way to handle this?"
In my view, your options are:
- Attempt to find a more specific way of filtering/joining so as not to produce duplicates.
- 'Slap' DISTINCT somewhere at an earlier stage so as to prevent duplicates from entering either of the tables being joined in the first place.
- Turn the table(s) primarily responsible for producing duplicates into subselects with DISTINCT applied locally.
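Two of these options can be sketched as follows. This is an illustration only: the table variables and their rows are made up, and it assumes that SomeOtherTable is the side of the join that repeats SomeID while SomeTable holds each SomeID once.

```sql
-- Stand-in tables with made-up rows (assumption: only @SomeOtherTable
-- repeats SomeID).
DECLARE @SomeTable TABLE (SomeID int);
DECLARE @SomeOtherTable TABLE (SomeID int);
DECLARE @MyTableVariable TABLE (SomeID int);

INSERT INTO @SomeTable VALUES (1), (2), (3);
INSERT INTO @SomeOtherTable VALUES (1), (2), (2);
INSERT INTO @MyTableVariable VALUES (1);

-- More specific filtering: EXISTS only tests for a match, so repeated
-- rows in @SomeOtherTable cannot multiply the output.
INSERT INTO @MyTableVariable (SomeID)
SELECT st.SomeID
FROM @SomeTable st
WHERE EXISTS (SELECT 1 FROM @SomeOtherTable sot WHERE sot.SomeID = st.SomeID)
  AND NOT EXISTS (SELECT 1 FROM @MyTableVariable t WHERE t.SomeID = st.SomeID);

-- Reset the target so the second variant starts from the same state.
DELETE FROM @MyTableVariable WHERE SomeID <> 1;

-- Local DISTINCT: de-duplicate the offending table in a subselect
-- before it takes part in the join.
INSERT INTO @MyTableVariable (SomeID)
SELECT st.SomeID
FROM @SomeTable st
INNER JOIN (SELECT DISTINCT SomeID FROM @SomeOtherTable) sot
        ON sot.SomeID = st.SomeID
WHERE NOT EXISTS (SELECT 1 FROM @MyTableVariable t WHERE t.SomeID = st.SomeID);

SELECT SomeID FROM @MyTableVariable ORDER BY SomeID;
```

With this sample data, either variant leaves @MyTableVariable holding 1 and 2. Which one is cheaper depends on the real data and indexes, so it is worth comparing the query plans.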
If you cannot produce a duplicate-free result set, you'll have to pay (performance-wise) for eliminating possible duplicates. Whichever technique you choose, be it DISTINCT, GROUP BY, or perhaps ranking functions, it will incur some performance penalty, and you should simply accept that.
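For reference, the GROUP BY and ranking-function variants might look roughly like this. It is a sketch only: @Source is a hypothetical stand-in for whatever join produces the duplicates, and its rows are invented.

```sql
-- Hypothetical duplicate-producing source; values are made up.
DECLARE @Source TABLE (SomeID int);
DECLARE @MyTableVariable TABLE (SomeID int);

INSERT INTO @Source VALUES (2), (2), (3), (3), (3);

-- GROUP BY collapses each SomeID to a single row.
INSERT INTO @MyTableVariable (SomeID)
SELECT s.SomeID
FROM @Source s
GROUP BY s.SomeID;

-- Empty the target so the second variant starts from the same state.
DELETE FROM @MyTableVariable;

-- Ranking function: number the rows within each SomeID and keep the first.
-- Useful when other columns decide which duplicate should survive.
WITH ranked AS (
    SELECT s.SomeID,
           ROW_NUMBER() OVER (PARTITION BY s.SomeID ORDER BY s.SomeID) AS rn
    FROM @Source s
)
INSERT INTO @MyTableVariable (SomeID)
SELECT r.SomeID
FROM ranked r
WHERE r.rn = 1;

SELECT SomeID FROM @MyTableVariable ORDER BY SomeID;
```

Both variants leave one row per SomeID (2 and 3 here). For a single column they do the same job as DISTINCT, so the plans, and the cost, are likely to be similar.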
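You need to create a key on the SomeID column of the table variable.
Declare it like this (without IDENTITY, since an identity column generates its own values and would not accept the SomeID values you are inserting):
declare @MyTableVariable table (SomeID int primary key)
The primary key guarantees that the table variable never holds a duplicate SomeID. Note that it does so by making the INSERT fail with a key violation rather than by silently skipping the duplicate row, so the SELECT still has to return each SomeID only once.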
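A sketch of how a keyed table variable might be used, assuming SomeID is a plain int primary key (no IDENTITY, so that explicit values can be inserted). @Source and @Lenient are hypothetical names with invented rows. The IGNORE_DUP_KEY option in the second half is an assumption worth verifying on your own server before relying on it.

```sql
-- Made-up source rows; 2 is duplicated on purpose.
DECLARE @Source TABLE (SomeID int);
INSERT INTO @Source VALUES (1), (2), (2);

-- Plain primary key on the table variable.
DECLARE @MyTableVariable TABLE (SomeID int PRIMARY KEY);

-- Inserting the raw rows would fail with a primary key violation
-- (error 2627) and insert nothing, so the SELECT is de-duplicated first.
INSERT INTO @MyTableVariable (SomeID)
SELECT DISTINCT s.SomeID
FROM @Source s
WHERE NOT EXISTS (SELECT 1 FROM @MyTableVariable t WHERE t.SomeID = s.SomeID);

-- Assumption to verify: the IGNORE_DUP_KEY index option is accepted on the
-- inline primary key of a table variable. Where it is, duplicate rows are
-- discarded with a warning instead of failing the whole statement.
DECLARE @Lenient TABLE (SomeID int PRIMARY KEY WITH (IGNORE_DUP_KEY = ON));

INSERT INTO @Lenient (SomeID)
SELECT s.SomeID
FROM @Source s;

SELECT SomeID FROM @MyTableVariable ORDER BY SomeID;
SELECT SomeID FROM @Lenient ORDER BY SomeID;
```

If the option works as assumed, it is the closest SQL Server gets to INSERT IGNORE, but the server still has to detect and discard each duplicate, so it is not free either.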
Hope this helps