Best practice to populate static data using a Visual Studio 2010 database project?

Question

How do you populate your database with static, source-controlled data using a Visual Studio database project? I have tried all three strategies below, finding each one to be progressively better than the last. I am using but not completely satisfied with strategy 3. Do you have an another alternative?

Place insert scripts in the "Data Generation Plans" folder. Reference the scripts in the "Script.PostDeployment.sql" file to include them in the deployment process.

-- advantage: straight-forward
-- drawback: slooooooow
-- drawback: subsequent deploys must first delete static data or check for non-existence of data => inefficient

Insert the data into the database the first time using whatever method is most convenient (e.g. could be the SSMS edit table feature). Extract that data using the bcp command line utility to create a bunch of data files and add them to your project. Create a script referenced in the "Scripts.PostDeployment.sql" file that executes a "bulk insert" statement for each data file.

-- advantage: much faster than insert statements
-- advantage: can leverage SSMS edit table feature
-- drawback: each bulk insert statement requires a fully qualified file name to the data file so if the data files are located on my machine at "C:\Projects\Dev\Source\foo.dat" then the remote dev machine must also have them at that location or the bulk insert statement fails
-- drawback: must delete existing static data before executing bulk insert statements on subsequent deploys

Create temporary tables during deployment to hold the static data and use the sql merge statement to synchronize these tables with the target tables. See either of these blog posts.

-- advantage: seems like sql merge has the perfect semantics for the problem
-- drawback: the logic for this strategy is repeated in each file -- drawback: table definitions are repeated as temporary tables in the sql merge files

Is there a superior alternative strategy? I gave up on strategy 1 because it was too slow. I dislike strategy 2 due to the fully qualified file name issue. I am satisfied but not thrilled by strategy 3. Is there a best practice?

Do you have the option to leave the data in the target database, rather than re-populating it each time? — David Atkinson, Nov 29 '10 at 16:58
@David: I don't believe there is any such option unless someone can tell me otherwise. It needs to be part of the scripted solution somehow. Strategy 3 handles this with the merge command. Strategy 1 would have to be modified to first check if the data exists before inserting. Likewise for strategy 2. — Tim Partridge, Nov 29 '10 at 17:12
It's probably no consolation to you, but we've just released SQL Source Control 2, which has Static Data Support. Sadly, this doesn't support the database project - Not yet, at least. However, we're seriously considering this. If you're interested, please vote it up here: http://redgate.uservoice.com/forums/39019-sql-source-control/suggestions/1010465-work-with-a-visual-studio-2010-database-project?ref=title — David Atkinson, Mar 11 '11 at 21:41
For strategy 1, slooooooow is an understatement. I'm on a greenfields project at the moment. Started off just running in the DB creation and default data. I moved to the DB project to aim for a better way of maintaining the DB stuff, but now I feel like manually running scripts was easier, purely because it's so slow to deploy, and as you said you have to first manually check for existence if you want to run in data in a post-deploy script and have it idempotent. — jamiebarrow, Mar 18 '11 at 10:20

score 9 · Answer 1 · answered Mar 08 '11 at 14:16

In your insert.sql script, you can put a GUID in the [__RefactorLog] table (which is a system table used by deployment) and check if this GUID exist before inserting your data like this :

:setvar SOMEID "784B2FC9-2B1E-5798-8478-24EE856E62AE" //create guid with Tools\CreateGuid in VS2010

IF NOT EXISTS (SELECT [OperationKey] FROM [dbo].[__RefactorLog] where [OperationKey] = '$(SOMEID )')

BEGIN

...

INSERT INTO [dbo].[__RefactorLog] ([OperationKey] ) values( '$(SOMEID )' )

END

Then you insert data only if not exist or if you want to (by changing the Guid).

score 2 · Answer 2 · answered Nov 29 '13 at 20:03

This is how I solved this problem in case anyone else finds this useful...

The strategy is to set a sqlcmdvars variable before building the database project. This variable would contain the absolute path to the build folder that can be referenced from the post deployment script. Then it would be a simple matter to use that in the deployment script for any additional files or resources that you might need. The advantage of this strategy is that all the paths are relative to the project file instead of requiring a hard coded shared path.

Create a new Sql command variable name $(MSBuildProjectDirectory). This will get overriden in the prebuild script.

Create an msbuild script that would set the sql command variable and build the database.

<Project ToolsVersion="4.0" DefaultTargets="BuildDatabase"  xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
<Import Project="$(MSBuildExtensionsPath)\MSBuildCommunityTasks\MSBuild.Community.Tasks.Targets"/>
<PropertyGroup>
    <DatabaseServer>(Local)</DatabaseServer>
    <DeploymentConnectionString>Data Source=$(DatabaseServer)%3BIntegrated Security=True%3BPooling=False</DeploymentConnectionString>
    <Configuration>Release</Configuration>
</PropertyGroup>
<Target Name="BuildDatabase">
    <!-- Sets the projet path variable so that the post deployment script can determine the location of the bulk insert csv files. -->
    <XmlUpdate
        Prefix="urn"
        Namespace="urn:Microsoft.VisualStudio.Data.Schema.Package.SqlCmdVars"
        XmlFileName="$(MSBuildProjectDirectory)\DatabaseProjectName\Properties\Database.sqlcmdvars"
        XPath="/urn:SqlCommandVariables/urn:Properties/urn:Property[urn:PropertyName='MSBuildProjectDirectory']/urn:PropertyValue"
        Value="$(MSBuildProjectDirectory)\DatabaseProjectName" />

    <MSBuild
            Projects="DatabaseProjectName\DatabaseProjectName.dbproj"
            Properties="Configuration=$(Configuration);
                    TargetDatabase=DatabaseName;
                    TargetConnectionString=$(DeploymentConnectionString);
                    GenerateDropsIfNotInProject=True;
                    BlockIncrementalDeploymentIfDataLoss=False;
                    DeployToDatabase=True;
                    IgnorePermissions=True"
            Targets="Build;Deploy">
        <Output TaskParameter="TargetOutputs" ItemName="SqlFiles"/>
    </MSBuild>
</Target>

Update your post deployment script as follows...

BULK INSERT [dbo].[TableName] FROM '$(MSBuildProjectDirectory)\Scripts\Post-Deployment\Data\YourDataFile.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR='\n')

score 1 · Answer 3 · answered May 05 '11 at 02:08

1

You can use the schema output from the database project to update the target database There is a cmd tool to run it on other machine is not with in view of you vs2010 IDE

So you data will still be the same, unless you have drops on any column

answered May 05 '11 at 02:08

Darrell

11
1

score 0 · Answer 4 · answered Feb 28 '11 at 20:36

0

We haven't rolled our VS 2010 db project into Production yet, but for our in-house project we load the production database into the target database and build/deploy to it during the dev/test phase. That said, I understand that probably will not work for you Tim if you have multiple prod databases and static data that goes out to each. But, it can be done for single prod db shops like ours.

answered Feb 28 '11 at 20:36

vkirkpat

1

1

Thanks for the reply. Do you mind giving me a little more detail? So do you backup prod, restore it to target (hence target has all the data), then run deploy from VS to sync up the schemas? If that's true, then it seems like you can't prevent someone from adding data to prod which will then propagate to target. I'm looking for a source-controlled way of saying: "This is our baseline data and after I click deploy, I'm sure that it exists on target." By corollary, the same method would also support: "Here is a collection of data that forms a test case that is repeatedly deployable" – Tim Partridge Mar 01 '11 at 15:09

Best practice to populate static data using a Visual Studio 2010 database project?

4 Answers4