{"id":3324,"date":"2025-02-24T12:39:07","date_gmt":"2025-02-24T11:39:07","guid":{"rendered":"https:\/\/thedatastory.nl\/?p=3324"},"modified":"2025-09-17T15:28:29","modified_gmt":"2025-09-17T13:28:29","slug":"staying-on-track-with-dataform-railway-design-streamlining-dataform-development-with-local-setup-and-ci-cd","status":"publish","type":"post","link":"https:\/\/thedatastory.nl\/nl\/data-stories\/staying-on-track-with-dataform-railway-design-streamlining-dataform-development-with-local-setup-and-ci-cd\/","title":{"rendered":"Staying on Track with Dataform Railway Design &#8211; Streamlining Dataform Development with Local Setup and CI\/CD"},"content":{"rendered":"\n<p>Recently, I had the exciting opportunity to host my first session at&nbsp;<strong>MeasureCamp Malm\u00f6 2025<\/strong>, where I presented strategies to&nbsp;<strong>streamline Dataform development<\/strong>&nbsp;using&nbsp;<strong>local environments and CI\/CD integration<\/strong>. Many data teams struggle with inefficient&nbsp;<strong>local testing<\/strong>, accidentally&nbsp;<strong>run test scripts against entire BigQuery datasets<\/strong>, and find it hard to maintain&nbsp;<strong>schema integrity across environments<\/strong>. Developing in Dataform without a structured workflow is like running a railway network without signals, track switches, or station coordination. Without a proper structure, we risk delays, collisions, or even derailment.<\/p>\n\n\n\n<p>This blog post outlines a&nbsp;<strong>Dataform local setup<\/strong>&nbsp;in the form of a well-maintained railway system, integrating&nbsp;<strong>CI\/CD (signal lights), automated schema testing (station maps), environment switching (track switches), and sample data transfers (cargo management)<\/strong>&nbsp;to make Dataform workflows&nbsp;<strong>more efficient and scalable<\/strong>. We will serve as railway engineers, laying down the tracks for a structured Dataform local setup. 
We will look into the challenges of local Dataform development, go over the key features of a robust local setup and provide a step-by-step guide on how to get started, along with a more in-depth look into some of the key features. All aboard! \ud83d\ude82<\/p>\n\n\n\n<p>A more extensive setup guide can be found in the <strong>GitHub repository<\/strong>:&nbsp;<a href=\"https:\/\/github.com\/The-Data-Story\/dataform_local_setup_with_ci\" target=\"_blank\" rel=\"noreferrer noopener\">Dataform Local Setup with CI\/CD<\/a><\/p>\n\n\n\n<b>&nbsp;<\/b>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<b>&nbsp;<\/b>\n\n\n\n<figure class=\"wp-block-image size-full is-resized\"><img fetchpriority=\"high\" decoding=\"async\" width=\"1024\" height=\"1024\" src=\"http:\/\/thedatastory.nl\/wp-content\/uploads\/2025\/02\/AD_4nXdszCn3iIszkyrbOVA2Jl6xoo2F-T9wHgZ3p5Qu-l8OXLisJSokkdG636SFJpTcCLnn8DRP4zJZiOwwa5j_Ag_md3XlqSVAS3SBGcfyt33Ohn0-kN9idUYc.jpg\" alt=\"\" class=\"wp-image-3395\" style=\"width:461px;height:auto\" srcset=\"https:\/\/thedatastory.nl\/wp-content\/uploads\/2025\/02\/AD_4nXdszCn3iIszkyrbOVA2Jl6xoo2F-T9wHgZ3p5Qu-l8OXLisJSokkdG636SFJpTcCLnn8DRP4zJZiOwwa5j_Ag_md3XlqSVAS3SBGcfyt33Ohn0-kN9idUYc.jpg 1024w, https:\/\/thedatastory.nl\/wp-content\/uploads\/2025\/02\/AD_4nXdszCn3iIszkyrbOVA2Jl6xoo2F-T9wHgZ3p5Qu-l8OXLisJSokkdG636SFJpTcCLnn8DRP4zJZiOwwa5j_Ag_md3XlqSVAS3SBGcfyt33Ohn0-kN9idUYc-300x300.jpg 300w, https:\/\/thedatastory.nl\/wp-content\/uploads\/2025\/02\/AD_4nXdszCn3iIszkyrbOVA2Jl6xoo2F-T9wHgZ3p5Qu-l8OXLisJSokkdG636SFJpTcCLnn8DRP4zJZiOwwa5j_Ag_md3XlqSVAS3SBGcfyt33Ohn0-kN9idUYc-150x150.jpg 150w, https:\/\/thedatastory.nl\/wp-content\/uploads\/2025\/02\/AD_4nXdszCn3iIszkyrbOVA2Jl6xoo2F-T9wHgZ3p5Qu-l8OXLisJSokkdG636SFJpTcCLnn8DRP4zJZiOwwa5j_Ag_md3XlqSVAS3SBGcfyt33Ohn0-kN9idUYc-768x768.jpg 768w, 
https:\/\/thedatastory.nl\/wp-content\/uploads\/2025\/02\/AD_4nXdszCn3iIszkyrbOVA2Jl6xoo2F-T9wHgZ3p5Qu-l8OXLisJSokkdG636SFJpTcCLnn8DRP4zJZiOwwa5j_Ag_md3XlqSVAS3SBGcfyt33Ohn0-kN9idUYc-12x12.jpg 12w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>\ud83d\udea7<\/strong><strong> Railway Obstacles in Local Dataform Development<\/strong><\/h3>\n\n\n\n<p>Working with&nbsp;<strong>Google Cloud Dataform<\/strong>&nbsp;presents several hurdles:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>\ud83d\udd10&nbsp;<strong>Security concerns<\/strong>&nbsp;about downloading data to the local environment.<\/li>\n\n\n\n<li>\ud83d\uded1&nbsp;<strong>Manual schema validation<\/strong>, leading to&nbsp;<strong>errors in production<\/strong>.<\/li>\n\n\n\n<li>\u2601\ufe0f&nbsp;<strong>Limited local development support<\/strong>&nbsp;due to cloud dependencies.<\/li>\n\n\n\n<li>\ud83d\udd04&nbsp;<strong>Inefficient data transfer<\/strong>&nbsp;between development and production environments.<\/li>\n<\/ul>\n\n\n\n<p><\/p>\n\n\n\n<p>To prevent derailments, we built a&nbsp;<strong>local-first development approach<\/strong>&nbsp;that covers both&nbsp;<strong>testing and deploying Dataform workflows<\/strong>.<\/p>\n\n\n\n<b>&nbsp;<\/b>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<b>&nbsp;<\/b>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>\u26a1 Laying Down the Tracks &#8211; Key Features of the Local Setup<\/strong><\/h3>\n\n\n\n<b>&nbsp;<\/b>\n\n\n\n<h5 class=\"wp-block-heading\"><strong>\ud83d\udea6 Signal Lights: Automating Schema Validation with CI\/CD<\/strong><\/h5>\n\n\n\n<p>One of the key improvements in this setup is the integration of GitHub Actions for automated schema testing. 
Using these signal lights, we can avoid major disruptions from trains entering the wrong track: every time a pull request (PR) is created, the CI pipeline validates the schema, detecting any added or removed columns, type changes, or new tables before merging. This prevents unintended modifications from disrupting production environments. Developers receive instant feedback on potential schema-breaking changes, reducing deployment risks and ensuring smooth data transitions.<\/p>\n\n\n\n<h5 class=\"wp-block-heading\"><strong><strong>\ud83d\udd04 <strong>Track Switching &#8211; Simplified Environment Management<\/strong><\/strong><\/strong><\/h5>\n\n\n\n<p>Managing multiple environments is essential for structured development, and this setup enables effortless switching between DEV and PROD using dedicated scripts. Instead of manually reconfiguring settings, developers can toggle environments with a single command, ensuring that production data remains untouched during development. The separation of environments using prefixed datasets ensures that testing occurs in isolation, maintaining data integrity across projects.<\/p>\n\n\n\n<figure class=\"wp-block-gallery has-nested-images columns-default is-cropped wp-block-gallery-1 is-layout-flex wp-block-gallery-is-layout-flex\">\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"1024\" height=\"768\" data-id=\"3393\" src=\"https:\/\/thedatastory.nl\/wp-content\/uploads\/2025\/02\/1738050627650-1024x768.jpeg\" alt=\"\" class=\"wp-image-3393\" srcset=\"https:\/\/thedatastory.nl\/wp-content\/uploads\/2025\/02\/1738050627650-1024x768.jpeg 1024w, https:\/\/thedatastory.nl\/wp-content\/uploads\/2025\/02\/1738050627650-300x225.jpeg 300w, https:\/\/thedatastory.nl\/wp-content\/uploads\/2025\/02\/1738050627650-768x576.jpeg 768w, https:\/\/thedatastory.nl\/wp-content\/uploads\/2025\/02\/1738050627650-1536x1152.jpeg 1536w, https:\/\/thedatastory.nl\/wp-content\/uploads\/2025\/02\/1738050627650-16x12.jpeg 
16w, https:\/\/thedatastory.nl\/wp-content\/uploads\/2025\/02\/1738050627650.jpeg 2048w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image size-large\"><img decoding=\"async\" width=\"1024\" height=\"1024\" data-id=\"3394\" src=\"https:\/\/thedatastory.nl\/wp-content\/uploads\/2025\/02\/1737359752477-1024x1024.jpeg\" alt=\"\" class=\"wp-image-3394\" srcset=\"https:\/\/thedatastory.nl\/wp-content\/uploads\/2025\/02\/1737359752477-1024x1024.jpeg 1024w, https:\/\/thedatastory.nl\/wp-content\/uploads\/2025\/02\/1737359752477-300x300.jpeg 300w, https:\/\/thedatastory.nl\/wp-content\/uploads\/2025\/02\/1737359752477-150x150.jpeg 150w, https:\/\/thedatastory.nl\/wp-content\/uploads\/2025\/02\/1737359752477-768x768.jpeg 768w, https:\/\/thedatastory.nl\/wp-content\/uploads\/2025\/02\/1737359752477-12x12.jpeg 12w, https:\/\/thedatastory.nl\/wp-content\/uploads\/2025\/02\/1737359752477.jpeg 1200w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n<\/figure>\n\n\n\n<h5 class=\"wp-block-heading\"><strong><strong>\ud83d\ude82 <strong>Standardized Train Engines: Docker for Local Development<\/strong><\/strong><\/strong><\/h5>\n\n\n\n<p>A Dockerized development environment ensures that every developer has the same engine (setup), without the need for manual installation. By using a preconfigured devcontainer in VS Code, all necessary dependencies and configurations are pre-installed, creating a fully reproducible workspace. This guarantees consistency across team members and eliminates the common \u201cit works on my machine\u201d problem, streamlining collaboration and keeping all trains on schedule.<\/p>\n\n\n\n<h5 class=\"wp-block-heading\"><strong><strong>\ud83d\udce6<strong>Cargo Management: Sample Data Transfers for Testing<\/strong><\/strong><\/strong><\/h5>\n\n\n\n<p>A train doesn\u2019t need to carry every possible container on every journey &#8211; only those that are required. 
Similarly, testing doesn\u2019t require full production datasets. Hence, our local setup includes automated sample data transfers to a BigQuery test project. A dedicated script simplifies the movement of partitioned and incremental data, allowing developers to test transformations with realistic datasets without unnecessary load. Handling data loads programmatically reduces manual setup time and makes the development cycle more efficient. This approach ensures that tests are conducted with relevant data while maintaining a clear separation from production environments.<\/p>\n\n\n\n<b>&nbsp;<\/b>\n\n\n\n<h3 class=\"wp-block-heading\">\ud83d\udee0\ufe0f Step-by-Step Setup Guide<\/h3>\n\n\n\n<b>&nbsp;<\/b>\n\n\n\n<h5 class=\"wp-block-heading\">1. Fork and Clone the Repository<\/h5>\n\n\n\n<pre class=\"wp-block-code\"><code># Fork the repository on GitHub, then clone your fork\ngit clone https:\/\/github.com\/&lt;user&gt;\/dataform_local_setup_with_ci.git\ncd dataform_local_setup_with_ci<\/code><\/pre>\n\n\n\n<p><\/p>\n\n\n\n<h5 class=\"wp-block-heading\">2. Prerequisites<\/h5>\n\n\n\n<ul class=\"wp-block-list\">\n<li>\ud83d\udee0\ufe0f&nbsp;<strong>Git<\/strong>&nbsp;for version control.<\/li>\n\n\n\n<li>\ud83d\udc33&nbsp;<strong>Docker<\/strong>&nbsp;or&nbsp;<strong>OrbStack<\/strong>&nbsp;for containerized development.<\/li>\n\n\n\n<li>\ud83d\udda5\ufe0f&nbsp;<strong>VS Code<\/strong>&nbsp;with the&nbsp;<strong>Dev Containers extension<\/strong>&nbsp;installed.<\/li>\n\n\n\n<li>\u2601\ufe0f&nbsp;<strong>Google Cloud Account<\/strong>&nbsp;with&nbsp;<strong>two projects<\/strong>:&nbsp;<code>dev-project<\/code>&nbsp;and&nbsp;<code>prod-project<\/code>.<\/li>\n\n\n\n<li>\ud83d\udd11&nbsp;<strong>Service Account<\/strong>&nbsp;with&nbsp;<strong>BigQuery permissions<\/strong>.<\/li>\n<\/ul>\n\n\n\n<p><\/p>\n\n\n\n<h5 class=\"wp-block-heading\">3. 
Configure Google Cloud Credentials<\/h5>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Download the&nbsp;<strong>JSON key<\/strong>&nbsp;for the service account.<\/li>\n\n\n\n<li>Save it as&nbsp;<code>GCPkey.json<\/code>&nbsp;in the repository&#8217;s&nbsp;<strong>root directory<\/strong>.<\/li>\n\n\n\n<li>Add&nbsp;<code>GCPkey.json<\/code>&nbsp;to&nbsp;<code>.gitignore<\/code>&nbsp;so the key is never committed.<\/li>\n<\/ul>\n\n\n\n<p><\/p>\n\n\n\n<h5 class=\"wp-block-heading\">4. Start the Dev Container<\/h5>\n\n\n\n<pre class=\"wp-block-code\"><code>code .<\/code><\/pre>\n\n\n\n<ul class=\"wp-block-list\">\n<li>VS Code detects the devcontainer configuration and prompts you to reopen the folder in the container, giving you a&nbsp;<strong>fully configured dev environment<\/strong>.<\/li>\n<\/ul>\n\n\n\n<p><\/p>\n\n\n\n<h5 class=\"wp-block-heading\">5. Initialize Dataform Credentials<\/h5>\n\n\n\n<pre class=\"wp-block-code\"><code>dataform init-creds<\/code><\/pre>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Select&nbsp;<strong>EU<\/strong>&nbsp;as the region.<\/li>\n\n\n\n<li>Choose&nbsp;<strong>JSON service account<\/strong>&nbsp;and provide the path to&nbsp;<code>GCPkey.json<\/code>.<\/li>\n<\/ul>\n\n\n\n<p><\/p>\n\n\n\n<b>&nbsp;<\/b>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<b>&nbsp;<\/b>\n\n\n\n<h3 class=\"wp-block-heading\">\ud83d\udd0d In-Depth Look into Some of the Key Features<\/h3>\n\n\n\n<b>&nbsp;<\/b>\n\n\n\n<h4 class=\"wp-block-heading\"><strong>\ud83d\ude86<\/strong><strong> Running Dataform Commands with Scripts<\/strong><\/h4>\n\n\n\n<h5 class=\"wp-block-heading\">\u2705&nbsp;dataform_exec Script<\/h5>\n\n\n\n<p>Run&nbsp;<strong>Dataform commands<\/strong>&nbsp;with&nbsp;<strong>environment validation<\/strong>:<\/p>\n\n\n\n<p><code>dataform_exec compile&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <em># Validate Dataform code<\/em><br>dataform_exec test&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<em># Run unit 
tests<\/em><br>dataform_exec run --dry-run&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <em># Simulate execution<\/em><br>dataform_exec run --full-refresh &nbsp; &nbsp; <em># Perform a full refresh<\/em><\/code><\/p>\n\n\n\n<p><\/p>\n\n\n\n<h5 class=\"wp-block-heading\">\ud83d\udd04&nbsp;switch_env Script<\/h5>\n\n\n\n<p>Toggle between&nbsp;<strong>development<\/strong>&nbsp;and&nbsp;<strong>production<\/strong>:<\/p>\n\n\n\n<p><code>switch_env dev &nbsp; <em># Switch to DEV<\/em><br>switch_env prod&nbsp; <em># Switch to PROD<\/em><\/code><\/p>\n\n\n\n<b>&nbsp;<\/b>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<b>&nbsp;<\/b>\n\n\n\n<h4 class=\"wp-block-heading\"><strong>\ud83d\udee4\ufe0f<\/strong><strong> Automated CI\/CD Schema Tests &#8211; Keeping Trains on Track<\/strong><\/h4>\n\n\n\n<h5 class=\"wp-block-heading\">How It Works:<\/h5>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Compares the schema<\/strong>&nbsp;in each PR against the existing definitions.<\/li>\n\n\n\n<li><strong>Detects modifications<\/strong>&nbsp;(added or removed columns, type changes, new tables).<\/li>\n\n\n\n<li><strong>Logs warnings\/errors<\/strong>&nbsp;for quick debugging.<\/li>\n\n\n\n<li><strong>Prevents schema-breaking changes<\/strong>&nbsp;from merging into production.<\/li>\n<\/ol>\n\n\n\n<p><\/p>\n\n\n\n<h5 class=\"wp-block-heading\">GitHub Workflow Configuration<\/h5>\n\n\n\n<pre class=\"wp-block-code\"><code>- name: Set up Google Cloud authentication\n  run: echo \"${{ secrets.GCPKEY }}\" &gt; \/tmp\/gcpkey.json<\/code><\/pre>\n\n\n\n<p><\/p>\n\n\n\n<h5 class=\"wp-block-heading\">Example CI\/CD Test Output<\/h5>\n\n\n\n<p><strong>Schema Changes:<\/strong><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code><code>Processing table: project.dataform__report__analytics.paid_campaigns\n&nbsp;&nbsp;+ tesfield (STRING) added\n&nbsp;&nbsp;- cost (FLOAT) removed\n<code>&nbsp;&nbsp;<span style=\"font-family: inherit;font-size: 1rem\">~ SOURCE (STRING \u2192 INTEGER) 
changed<\/span><\/code><\/code><\/code><\/pre>\n\n\n\n<p><\/p>\n\n\n\n<b>&nbsp;<\/b>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<b>&nbsp;<\/b>\n\n\n\n<h4 class=\"wp-block-heading\"><strong>\ud83d\udce6<\/strong><strong> Optimized Sample Data Transfer for Testing<\/strong><\/h4>\n\n\n\n<p>The&nbsp;export_and_load.py&nbsp;script simplifies&nbsp;<strong>data migration<\/strong>&nbsp;between GCP projects.<\/p>\n\n\n\n<h5 class=\"wp-block-heading\">How It Works:<\/h5>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Reads&nbsp;config.json&nbsp;for&nbsp;<strong>source\/target details<\/strong>.<\/li>\n\n\n\n<li><strong>Creates a temporary partitioned table<\/strong>.<\/li>\n\n\n\n<li><strong>Loads data incrementally<\/strong>&nbsp;into the target project.<\/li>\n\n\n\n<li><strong>Deletes temporary resources<\/strong>&nbsp;after transfer.<\/li>\n<\/ol>\n\n\n\n<p><\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>\/usr\/bin\/python3 \/dataform\/src\/exampleData\/export_and_load.py<\/code><\/pre>\n\n\n\n<h5 class=\"wp-block-heading\">Example Configuration:<\/h5>\n\n\n\n<pre class=\"wp-block-code\"><code>{\n  \"tables\": &#91;\n    {\n      \"source_project\": \"prod-project\",\n      \"source_dataset\": \"source_dataset1\",\n      \"source_table\": \"event_20240201\",\n      \"target_project\": \"dev-project\",\n      \"location\": \"EU\",\n      \"partition_size\": 10000,\n      \"max_rows\": 39990000\n    }\n  ]\n}\n<\/code><\/pre>\n\n\n\n<b>&nbsp;<\/b>\n\n\n\n<hr class=\"wp-block-separator has-alpha-channel-opacity\"\/>\n\n\n\n<b>&nbsp;<\/b>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"2560\" height=\"1707\" src=\"https:\/\/thedatastory.nl\/wp-content\/uploads\/2025\/02\/Person-holding-printer-paper-scaled.jpg\" alt=\"Train Journey Planning\" class=\"wp-image-3676\" srcset=\"https:\/\/thedatastory.nl\/wp-content\/uploads\/2025\/02\/Person-holding-printer-paper-scaled.jpg 2560w, 
https:\/\/thedatastory.nl\/wp-content\/uploads\/2025\/02\/Person-holding-printer-paper-300x200.jpg 300w, https:\/\/thedatastory.nl\/wp-content\/uploads\/2025\/02\/Person-holding-printer-paper-1024x683.jpg 1024w, https:\/\/thedatastory.nl\/wp-content\/uploads\/2025\/02\/Person-holding-printer-paper-768x512.jpg 768w, https:\/\/thedatastory.nl\/wp-content\/uploads\/2025\/02\/Person-holding-printer-paper-1536x1024.jpg 1536w, https:\/\/thedatastory.nl\/wp-content\/uploads\/2025\/02\/Person-holding-printer-paper-2048x1365.jpg 2048w, https:\/\/thedatastory.nl\/wp-content\/uploads\/2025\/02\/Person-holding-printer-paper-18x12.jpg 18w\" sizes=\"(max-width: 2560px) 100vw, 2560px\" \/><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>\ud83d\uddfa\ufe0f<\/strong><strong> Final Thoughts &#8211; A Smooth, Efficient Dataform Journey<\/strong><\/h3>\n\n\n\n<p>A structured Dataform local setup turns your development process into a well-managed railway system:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>\u2705&nbsp;<strong>Schema integrity<\/strong>&nbsp;prevents train crashes and the delays that follow<\/li>\n\n\n\n<li>\u2705&nbsp;<strong>Environment consistency<\/strong>&nbsp;gives us a smooth transition between tracks<\/li>\n\n\n\n<li>\u2705&nbsp;<strong>Automated validation<\/strong>&nbsp;acts as our signal lights, checking every pull request before it merges<\/li>\n<\/ul>\n\n\n\n<p><\/p>\n\n\n\n<p>By integrating&nbsp;<strong>GitHub Actions, environment switching, and sample data transfers<\/strong>&nbsp;into their Dataform projects, teams can:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>\ud83d\ude86&nbsp;<strong>Reduce deployment risks<\/strong><\/li>\n\n\n\n<li>\ud83d\udee0\ufe0f&nbsp;<strong>Automate schema validation<\/strong><\/li>\n\n\n\n<li>\ud83d\udd04&nbsp;<strong>Streamline data migration<\/strong>&nbsp;between environments<\/li>\n\n\n\n<li>\ud83e\udd1d&nbsp;<strong>Enhance collaboration<\/strong>&nbsp;with PR-based workflows<\/li>\n<\/ul>\n\n\n\n<p><\/p>\n\n\n\n<p>With the right infrastructure in place, 
Dataform development becomes a smoother, more predictable journey &#8211; where every train arrives on time, every deployment is robust, and every pipeline runs like a well-coordinated railway system! Want some help with your railway structure? Feel free to reach out!<\/p>\n\n\n\n<p>This approach was&nbsp;<strong>presented at MeasureCamp Malm\u00f6 2025<\/strong>, and we&#8217;re excited to share the&nbsp;<strong>GitHub repository<\/strong>:&nbsp;<a href=\"https:\/\/github.com\/The-Data-Story\/dataform_local_setup_with_ci\" target=\"_blank\" rel=\"noreferrer noopener\">Dataform Local Setup with CI\/CD<\/a>&nbsp; \ud83d\ude82<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Explore how to streamline Dataform local development using CI\/CD integration. Automate schema testing, manage environments, optimize workflows, and build scalable, reliable data pipelines.<\/p>\n","protected":false},"author":8,"featured_media":3403,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_price":"","_stock":"","_tribe_ticket_header":"","_tribe_default_ticket_provider":"","_tribe_ticket_capacity":"0","_ticket_start_date":"","_ticket_end_date":"","_tribe_ticket_show_description":"","_tribe_ticket_show_not_going":false,"_tribe_ticket_use_global_stock":"","_tribe_ticket_global_stock_level":"","_global_stock_mode":"","_global_stock_cap":"","_tribe_rsvp_for_event":"","_tribe_ticket_going_count":"","_tribe_ticket_not_going_count":"","_tribe_tickets_list":"[]","_tribe_ticket_has_attendee_info_fields":false,"footnotes":""},"categories":[2],"tags":[],"class_list":["post-3324","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-stories"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.3 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Dataform Local Development with CI\/CD \u2013 A Complete Guide<\/title>\n<meta name=\"description\" 
content=\"Streamline Dataform local development with CI\/CD. Automate testing, enforce schema integrity, and optimize workflows for scalable growth.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/thedatastory.nl\/nl\/data-stories\/staying-on-track-with-dataform-railway-design-streamlining-dataform-development-with-local-setup-and-ci-cd\/\" \/>\n<meta property=\"og:locale\" content=\"nl_NL\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Dataform Local Development with CI\/CD \u2013 A Complete Guide\" \/>\n<meta property=\"og:description\" content=\"Streamline Dataform local development with CI\/CD. Automate testing, enforce schema integrity, and optimize workflows for scalable growth.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/thedatastory.nl\/nl\/data-stories\/staying-on-track-with-dataform-railway-design-streamlining-dataform-development-with-local-setup-and-ci-cd\/\" \/>\n<meta property=\"og:site_name\" content=\"The Data Story\" \/>\n<meta property=\"article:published_time\" content=\"2025-02-24T11:39:07+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-09-17T13:28:29+00:00\" \/>\n<meta property=\"og:image\" content=\"http:\/\/thedatastory.nl\/wp-content\/uploads\/2025\/02\/p-j-o0lnBAQ175A-unsplash1.jpeg\" \/>\n\t<meta property=\"og:image:width\" content=\"640\" \/>\n\t<meta property=\"og:image:height\" content=\"427\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Andrei\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Geschreven door\" \/>\n\t<meta name=\"twitter:data1\" content=\"Andrei\" \/>\n\t<meta name=\"twitter:label2\" content=\"Geschatte leestijd\" \/>\n\t<meta name=\"twitter:data2\" content=\"7 minuten\" \/>\n<script 
type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/thedatastory.nl\\\/data-stories\\\/staying-on-track-with-dataform-railway-design-streamlining-dataform-development-with-local-setup-and-ci-cd\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/thedatastory.nl\\\/data-stories\\\/staying-on-track-with-dataform-railway-design-streamlining-dataform-development-with-local-setup-and-ci-cd\\\/\"},\"author\":{\"name\":\"Andrei\",\"@id\":\"https:\\\/\\\/thedatastory.nl\\\/en\\\/#\\\/schema\\\/person\\\/fa86d66941f08878a6125c15a17e5486\"},\"headline\":\"Staying on Track with Dataform Railway Design &#8211; Streamlining Dataform Development with Local Setup and CI\\\/CD\",\"datePublished\":\"2025-02-24T11:39:07+00:00\",\"dateModified\":\"2025-09-17T13:28:29+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/thedatastory.nl\\\/data-stories\\\/staying-on-track-with-dataform-railway-design-streamlining-dataform-development-with-local-setup-and-ci-cd\\\/\"},\"wordCount\":1166,\"publisher\":{\"@id\":\"https:\\\/\\\/thedatastory.nl\\\/en\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/thedatastory.nl\\\/data-stories\\\/staying-on-track-with-dataform-railway-design-streamlining-dataform-development-with-local-setup-and-ci-cd\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/thedatastory.nl\\\/wp-content\\\/uploads\\\/2025\\\/02\\\/p-j-o0lnBAQ175A-unsplash1.jpeg\",\"articleSection\":[\"Data stories\"],\"inLanguage\":\"nl-NL\"},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/thedatastory.nl\\\/data-stories\\\/staying-on-track-with-dataform-railway-design-streamlining-dataform-development-with-local-setup-and-ci-cd\\\/\",\"url\":\"https:\\\/\\\/thedatastory.nl\\\/data-stories\\\/staying-on-track-with-dataform-railway-design-streamlining-dataform-development-with-local-setup-and-ci-cd\\\/\",\"name\":\"Dataform Local Development with CI\\\/CD \u2013 A Complete 
Guide\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/thedatastory.nl\\\/en\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/thedatastory.nl\\\/data-stories\\\/staying-on-track-with-dataform-railway-design-streamlining-dataform-development-with-local-setup-and-ci-cd\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/thedatastory.nl\\\/data-stories\\\/staying-on-track-with-dataform-railway-design-streamlining-dataform-development-with-local-setup-and-ci-cd\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/thedatastory.nl\\\/wp-content\\\/uploads\\\/2025\\\/02\\\/p-j-o0lnBAQ175A-unsplash1.jpeg\",\"datePublished\":\"2025-02-24T11:39:07+00:00\",\"dateModified\":\"2025-09-17T13:28:29+00:00\",\"description\":\"Streamline Dataform local development with CI\\\/CD. Automate testing, enforce schema integrity, and optimize workflows for scalable growth.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/thedatastory.nl\\\/data-stories\\\/staying-on-track-with-dataform-railway-design-streamlining-dataform-development-with-local-setup-and-ci-cd\\\/#breadcrumb\"},\"inLanguage\":\"nl-NL\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/thedatastory.nl\\\/data-stories\\\/staying-on-track-with-dataform-railway-design-streamlining-dataform-development-with-local-setup-and-ci-cd\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"nl-NL\",\"@id\":\"https:\\\/\\\/thedatastory.nl\\\/data-stories\\\/staying-on-track-with-dataform-railway-design-streamlining-dataform-development-with-local-setup-and-ci-cd\\\/#primaryimage\",\"url\":\"https:\\\/\\\/thedatastory.nl\\\/wp-content\\\/uploads\\\/2025\\\/02\\\/p-j-o0lnBAQ175A-unsplash1.jpeg\",\"contentUrl\":\"https:\\\/\\\/thedatastory.nl\\\/wp-content\\\/uploads\\\/2025\\\/02\\\/p-j-o0lnBAQ175A-unsplash1.jpeg\",\"width\":640,\"height\":427,\"caption\":\"Dataform Railway 
Design\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/thedatastory.nl\\\/data-stories\\\/staying-on-track-with-dataform-railway-design-streamlining-dataform-development-with-local-setup-and-ci-cd\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/thedatastory.nl\\\/en\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Data stories\",\"item\":\"https:\\\/\\\/thedatastory.nl\\\/en\\\/data-stories\\\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Staying on Track with Dataform Railway Design &#8211; Streamlining Dataform Development with Local Setup and CI\\\/CD\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/thedatastory.nl\\\/en\\\/#website\",\"url\":\"https:\\\/\\\/thedatastory.nl\\\/en\\\/\",\"name\":\"The Data Story\",\"description\":\"Data Analyse, Visualisatie &amp; Automation\",\"publisher\":{\"@id\":\"https:\\\/\\\/thedatastory.nl\\\/en\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/thedatastory.nl\\\/en\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"nl-NL\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/thedatastory.nl\\\/en\\\/#organization\",\"name\":\"The Data Story\",\"url\":\"https:\\\/\\\/thedatastory.nl\\\/en\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"nl-NL\",\"@id\":\"https:\\\/\\\/thedatastory.nl\\\/en\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/thedatastory.nl\\\/wp-content\\\/uploads\\\/2021\\\/11\\\/Logo-negatief.svg\",\"contentUrl\":\"https:\\\/\\\/thedatastory.nl\\\/wp-content\\\/uploads\\\/2021\\\/11\\\/Logo-negatief.svg\",\"width\":250,\"height\":49,\"caption\":\"The Data 
Story\"},\"image\":{\"@id\":\"https:\\\/\\\/thedatastory.nl\\\/en\\\/#\\\/schema\\\/logo\\\/image\\\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/thedatastory.nl\\\/en\\\/#\\\/schema\\\/person\\\/fa86d66941f08878a6125c15a17e5486\",\"name\":\"Andrei\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"nl-NL\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/11680e1aebb33aed50e46f4d4f4e3beb3ffe69aa44fa4716c5f2a099ed13d4c3?s=96&d=mm&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/11680e1aebb33aed50e46f4d4f4e3beb3ffe69aa44fa4716c5f2a099ed13d4c3?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/11680e1aebb33aed50e46f4d4f4e3beb3ffe69aa44fa4716c5f2a099ed13d4c3?s=96&d=mm&r=g\",\"caption\":\"Andrei\"},\"url\":\"https:\\\/\\\/thedatastory.nl\\\/nl\\\/author\\\/andrei\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Dataform Local Development with CI\/CD \u2013 A Complete Guide","description":"Streamline Dataform local development with CI\/CD. Automate testing, enforce schema integrity, and optimize workflows for scalable growth.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/thedatastory.nl\/nl\/data-stories\/staying-on-track-with-dataform-railway-design-streamlining-dataform-development-with-local-setup-and-ci-cd\/","og_locale":"nl_NL","og_type":"article","og_title":"Dataform Local Development with CI\/CD \u2013 A Complete Guide","og_description":"Streamline Dataform local development with CI\/CD. 
Automate testing, enforce schema integrity, and optimize workflows for scalable growth.","og_url":"https:\/\/thedatastory.nl\/nl\/data-stories\/staying-on-track-with-dataform-railway-design-streamlining-dataform-development-with-local-setup-and-ci-cd\/","og_site_name":"The Data Story","article_published_time":"2025-02-24T11:39:07+00:00","article_modified_time":"2025-09-17T13:28:29+00:00","og_image":[{"width":640,"height":427,"url":"http:\/\/thedatastory.nl\/wp-content\/uploads\/2025\/02\/p-j-o0lnBAQ175A-unsplash1.jpeg","type":"image\/jpeg"}],"author":"Andrei","twitter_card":"summary_large_image","twitter_misc":{"Geschreven door":"Andrei","Geschatte leestijd":"7 minuten"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/thedatastory.nl\/data-stories\/staying-on-track-with-dataform-railway-design-streamlining-dataform-development-with-local-setup-and-ci-cd\/#article","isPartOf":{"@id":"https:\/\/thedatastory.nl\/data-stories\/staying-on-track-with-dataform-railway-design-streamlining-dataform-development-with-local-setup-and-ci-cd\/"},"author":{"name":"Andrei","@id":"https:\/\/thedatastory.nl\/en\/#\/schema\/person\/fa86d66941f08878a6125c15a17e5486"},"headline":"Staying on Track with Dataform Railway Design &#8211; Streamlining Dataform Development with Local Setup and CI\/CD","datePublished":"2025-02-24T11:39:07+00:00","dateModified":"2025-09-17T13:28:29+00:00","mainEntityOfPage":{"@id":"https:\/\/thedatastory.nl\/data-stories\/staying-on-track-with-dataform-railway-design-streamlining-dataform-development-with-local-setup-and-ci-cd\/"},"wordCount":1166,"publisher":{"@id":"https:\/\/thedatastory.nl\/en\/#organization"},"image":{"@id":"https:\/\/thedatastory.nl\/data-stories\/staying-on-track-with-dataform-railway-design-streamlining-dataform-development-with-local-setup-and-ci-cd\/#primaryimage"},"thumbnailUrl":"https:\/\/thedatastory.nl\/wp-content\/uploads\/2025\/02\/p-j-o0lnBAQ175A-unsplash1.jpeg","articleSection":["Data 
stories"],"inLanguage":"nl-NL"},{"@type":"WebPage","@id":"https:\/\/thedatastory.nl\/data-stories\/staying-on-track-with-dataform-railway-design-streamlining-dataform-development-with-local-setup-and-ci-cd\/","url":"https:\/\/thedatastory.nl\/data-stories\/staying-on-track-with-dataform-railway-design-streamlining-dataform-development-with-local-setup-and-ci-cd\/","name":"Dataform Local Development with CI\/CD \u2013 A Complete Guide","isPartOf":{"@id":"https:\/\/thedatastory.nl\/en\/#website"},"primaryImageOfPage":{"@id":"https:\/\/thedatastory.nl\/data-stories\/staying-on-track-with-dataform-railway-design-streamlining-dataform-development-with-local-setup-and-ci-cd\/#primaryimage"},"image":{"@id":"https:\/\/thedatastory.nl\/data-stories\/staying-on-track-with-dataform-railway-design-streamlining-dataform-development-with-local-setup-and-ci-cd\/#primaryimage"},"thumbnailUrl":"https:\/\/thedatastory.nl\/wp-content\/uploads\/2025\/02\/p-j-o0lnBAQ175A-unsplash1.jpeg","datePublished":"2025-02-24T11:39:07+00:00","dateModified":"2025-09-17T13:28:29+00:00","description":"Streamline Dataform local development with CI\/CD. 
Automate testing, enforce schema integrity, and optimize workflows for scalable growth.","breadcrumb":{"@id":"https:\/\/thedatastory.nl\/data-stories\/staying-on-track-with-dataform-railway-design-streamlining-dataform-development-with-local-setup-and-ci-cd\/#breadcrumb"},"inLanguage":"nl-NL","potentialAction":[{"@type":"ReadAction","target":["https:\/\/thedatastory.nl\/data-stories\/staying-on-track-with-dataform-railway-design-streamlining-dataform-development-with-local-setup-and-ci-cd\/"]}]},{"@type":"ImageObject","inLanguage":"nl-NL","@id":"https:\/\/thedatastory.nl\/data-stories\/staying-on-track-with-dataform-railway-design-streamlining-dataform-development-with-local-setup-and-ci-cd\/#primaryimage","url":"https:\/\/thedatastory.nl\/wp-content\/uploads\/2025\/02\/p-j-o0lnBAQ175A-unsplash1.jpeg","contentUrl":"https:\/\/thedatastory.nl\/wp-content\/uploads\/2025\/02\/p-j-o0lnBAQ175A-unsplash1.jpeg","width":640,"height":427,"caption":"Dataform Railway Design"},{"@type":"BreadcrumbList","@id":"https:\/\/thedatastory.nl\/data-stories\/staying-on-track-with-dataform-railway-design-streamlining-dataform-development-with-local-setup-and-ci-cd\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/thedatastory.nl\/en\/"},{"@type":"ListItem","position":2,"name":"Data stories","item":"https:\/\/thedatastory.nl\/en\/data-stories\/"},{"@type":"ListItem","position":3,"name":"Staying on Track with Dataform Railway Design &#8211; Streamlining Dataform Development with Local Setup and CI\/CD"}]},{"@type":"WebSite","@id":"https:\/\/thedatastory.nl\/en\/#website","url":"https:\/\/thedatastory.nl\/en\/","name":"The Data Story","description":"Data Analyse, Visualisatie &amp; 
Automation","publisher":{"@id":"https:\/\/thedatastory.nl\/en\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/thedatastory.nl\/en\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"nl-NL"},{"@type":"Organization","@id":"https:\/\/thedatastory.nl\/en\/#organization","name":"The Data Story","url":"https:\/\/thedatastory.nl\/en\/","logo":{"@type":"ImageObject","inLanguage":"nl-NL","@id":"https:\/\/thedatastory.nl\/en\/#\/schema\/logo\/image\/","url":"https:\/\/thedatastory.nl\/wp-content\/uploads\/2021\/11\/Logo-negatief.svg","contentUrl":"https:\/\/thedatastory.nl\/wp-content\/uploads\/2021\/11\/Logo-negatief.svg","width":250,"height":49,"caption":"The Data Story"},"image":{"@id":"https:\/\/thedatastory.nl\/en\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/thedatastory.nl\/en\/#\/schema\/person\/fa86d66941f08878a6125c15a17e5486","name":"Andrei","image":{"@type":"ImageObject","inLanguage":"nl-NL","@id":"https:\/\/secure.gravatar.com\/avatar\/11680e1aebb33aed50e46f4d4f4e3beb3ffe69aa44fa4716c5f2a099ed13d4c3?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/11680e1aebb33aed50e46f4d4f4e3beb3ffe69aa44fa4716c5f2a099ed13d4c3?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/11680e1aebb33aed50e46f4d4f4e3beb3ffe69aa44fa4716c5f2a099ed13d4c3?s=96&d=mm&r=g","caption":"Andrei"},"url":"https:\/\/thedatastory.nl\/nl\/author\/andrei\/"}]}},"_links":{"self":[{"href":"https:\/\/thedatastory.nl\/nl\/wp-json\/wp\/v2\/posts\/3324","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/thedatastory.nl\/nl\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/thedatastory.nl\/nl\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/thedatastory.nl\/nl\/wp-json\/wp\/v2\/users\/8"}],"replies":[{"embeddable":true,"href":"https:\/\/thedatastory.nl\
/nl\/wp-json\/wp\/v2\/comments?post=3324"}],"version-history":[{"count":70,"href":"https:\/\/thedatastory.nl\/nl\/wp-json\/wp\/v2\/posts\/3324\/revisions"}],"predecessor-version":[{"id":3678,"href":"https:\/\/thedatastory.nl\/nl\/wp-json\/wp\/v2\/posts\/3324\/revisions\/3678"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/thedatastory.nl\/nl\/wp-json\/wp\/v2\/media\/3403"}],"wp:attachment":[{"href":"https:\/\/thedatastory.nl\/nl\/wp-json\/wp\/v2\/media?parent=3324"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/thedatastory.nl\/nl\/wp-json\/wp\/v2\/categories?post=3324"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/thedatastory.nl\/nl\/wp-json\/wp\/v2\/tags?post=3324"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}