en-US/about_PSql_Deploy_Seeds.help.txt
about_PSql_Deploy_Seeds ======================= SHORT DESCRIPTION A PSql.Deploy seed is a T-SQL script that populates a target database with data. The Invoke-SqlSeed cmdlet applies seeds to target databases, with configurable parallelism to support both large databases and many-database scenarios. PSql.Deploy supports SQL Server, Azure SQL Database, and compatible databases. LONG DESCRIPTION :: Source Directory Structure :: PSql.Deploy expects seeds to have a specific filesystem layout. src\ The source directory: a set of migrations and │ seeds for one database design. The path can vary. │ ├─ Seeds\ Seeds. Required only if there are any seeds. │ │ │ ├─ TestData\ One seed. The name can vary. │ │ │ │ │ ├─ _Main.sql Top-level script for the seed. It can │ │ │ include other files with the :r directive. │ │ ├─ FileA.sql Example file included by _Main.sql. │ │ ├─ FileB.sql Example file included by _Main.sql. │ │ └─ ... More files, subdirectories, etc. │ │ │ └─ ... More seeds. │ ├─ Migrations\ Migrations. Required only if there are any migrations. │ └─ ... See about_PSql_Deploy_Migrations for details. │ └─ ... PSql.Deploy does not care about other files or directories present in the source directory. Given an arbitrary source directory, PSql.Deploy expects to find seeds in a Seeds subdirectory. Within the Seeds directory, each subdirectory containing a _Main.sql file is an individual seed. The name of the subdirectory determines the name of the seed. A seed name must be a valid directory name but is otherwise unrestricted. The _Main.sql file is the entry point for the seed. The file is a T-SQL script with a few extensions to support modularity and parallelism. :: Seed Script SQLCMD Support :: PSql.Deploy seed scripts support a limited set of SQLCMD directives. Typical seeds use these directives to organize code into multiple files, to separate SQL batches, and to replace hard-coded values with variables. GO Ends the current SQL batch and begins a new one. $(<name>) Replaced by the value of the SQLCMD variable <name>. :r <file> Replaced by the contents of the specified file. The path is relative to the current PowerShell directory ($PWD or Get-Location), matching SQLCMD.EXE behavior. To enable inclusion relative to the seed, PSql.Deploy predefines the SQLCMD variable 'Path' as the full path of the directory containing the _Main.sql file. Thus the directive :r $(Path)\Foo.sql includes the file Foo.sql from the same directory as _Main.sql. The path may be enclosed within double quotes. This is mandatory if the path contains spaces, tabs, or double quotes. To include a double quote in the path, use two double quotes. Examples: :r "$(Path)\My Script.sql" :r "$(Path)/My ""Special"" Script.sql" Note that Windows does not support tabs or double quotes in file names. :setvar <name> <value> Sets the SQLCMD variable <name> to the spacified <value>. The value may be enclosed within double quotes. This is mandatory if the value contains spaces, tabs, or double quotes. To include a double quote in the value, use two double quotes. :setvar MyVar MyValue :setvar MyVar "My Value" :setvar MyVar "My ""Special"" Value" :: Seed Script Parallelism Directives :: Seeds typically consist of many statements that could execute in parallel if the language supported it. For example, inserting rows into different tables could occur in parallel, provided that all rows necessary to satisfy foreign key constraints are already present. Parallelism can significantly reduce the time required to seed large databases. PSql.Deploy recognizes special directives to enable parallelism within seeds. These directives are 'magic' comments: line comments beginning with '--#'. The MODULE directive enables the author to split a seed script into 'modules' that can execute in parallel. A module is a named chunk of SQL that performs one unit of seed work, such as populating a particular table with data. For example, a seed script might contain the following two modules: --# MODULE: Customer INSERT dbo.Customer ... --# MODULE: Product INSERT dbo.Product ... All lines of SQL code after a MODULE directive until the next MODULE directive (or the end of the file) are part of that module. Any lines of SQL code before the first MODULE directive are implicitly part of a module named 'init'. Thus each line of SQL code in a seed script is part of exactly one module. Module names are author-defined. They appear in logging and error messages. A module can contain multiple GO-separated batches. A module can declare dependencies upon other modules. PSql.Deploy's Invoke-SqlSeed cmdlet executes modules in parallel where possible but guarantees that each module's dependencies are completed before the module itself executes. The dependency model is based on 'topics'. A topic is an author-defined name that represents some facet of a seed's work that PSql.Deploy should track. Each module 'provides' one or more topics, meaning that the module contributes to the work represented by each topic. A module always provides its own name as a topic and can use the PROVIDES directive to declare additional provided topics. A module can 'require' one or more topics via the REQUIRES directive. When a module requires a topic, the module will not execute until that topic is complete. For example, consider a seed that populates three tables: Customer, Product, and Order. The Order table has foreign keys to both Customer and Product. Customer and Product can be populated in parallel, but Order cannot be populated until both Customer and Product are complete. The seed script might approach this as follows: --# MODULE: Customer INSERT dbo.Customer ... --# MODULE: Product INSERT dbo.Product ... --# MODULE: Order --# REQUIRES: Customer Product INSERT dbo.Order ... Another approach is to use a topic to represent the concept of 'master data': --# MODULE: Customer --# PROVIDES: MasterData INSERT dbo.Customer ... --# MODULE: Product --# PROVIDES: MasterData INSERT dbo.Product ... --# MODULE: Order --# REQUIRES: MasterData INSERT dbo.Order ... Invoke-SqlSeed executes seed modules in parallel by creating multiple worker tasks. Each worker repeatedly picks the next ready module and executes it on the worker's connection to the target database. In rare cases, an author might want a seed module to execute on every worker. For example, an author might want to set connection options (like NOCOUNT) or to create a temporary table. The WORKER directive enables this behavior. To perform connection setup: --# MODULE: InitWorker --# WORKER: all SET NOCOUNT ON; -- This happens in every connection --# MODULE: Customer --# REQUIRES: InitWorker INSERT dbo.Customer ... Seed directive reference: --# MODULE: <name> [<topic> ...] Starts a new module with the specified name, optionally declaring extra provided topics. The module name itself is a provided topic. --# PROVIDES: <topic> [<topic> ...] Indicates that the current module provides the specified topics. --# REQUIRES: <topic> [<topic> ...] Indicates that the current module requires the specified topics. --# WORKER: all|any Specifies worker execution mode: 'all' means that a copy of the module will run on every worker; 'any' (the default) means that the module executes once on any single available worker. :: See Also :: - about_PSql_Deploy - Invoke-SqlSeed - https://github.com/sharpjs/PSql.Deploy |