Welcome to Database Version Control’s documentation!#

_images/app_logo_black.png

DVC to version control your database!

License GitHub Workflow Status Readthedocs PyPI Python Commit Last commit GitHub Youtube

Introduction#

What is Version Control?#

In its most general sense, Version Control means to track and manage the different versions of a document or set of documents. While it is mostly practised in the world of software engineering, its use cases also abound in other areas. For instance, book writers often benefit from version controlling their drafts. This enables them to explore different writing styles, themes and so on. Failed attempt in a certain direction can be safely scrapped while the writer jumps back to the version he is most comfortable in.

What is Software Version Control?#

As mentioned, software engineering practices version control heavily. Unlike hardware, software has high malleability. This makes it necessary to keep track of the changes made to the software and to have the ability to jump back to the prior version when the latest version breaks. The current industry standard for Software Version Control is Git, a distributed vrelease-0.1.15ersion control system created by Linux Torvald. It has superseded its predecessors like SVN, which is a local version control system which stores a project’s history in a single server.

What is Database Version Control?#

Interestingly, while there is much discussion around Software Version Control, there seems to be a lack of discussion around database version control. Database, especially Relational Database Management System (RDBMS), does not merely store data. It also stores, among others, data about data (metadata, like DDL) and user access privileges (DCL), Given that a lot of applications are data-driven, which means the software’s behaviour is affected by the data which it receives, version controlling your database should not come off as secondary to version controlling your software.

Examples of Existing Database Version Control Systems in the market

Existing Database Version Control Tools#

There are a number of existing database version control tools.

Commercial#

Examples which are commericial and written in java.

Open Source#

In the python world, we have the below open-source products.

Personal experience#

I have experimented with both alembic and yoyo-migration, but found the following shortcomings:

alembic#

The metadata is mostly stored as files in the repository, and NOT as an entry in the database.

It makes it difficult to directly query the database for its current version.

yoyo-migration#

The SQL migration files must be prefixed with numbers left padded with zeros.

For instance, ‘0001__description.sql’. If we have more than 1000 SQL files, then we probably need to rename the old files (e.g. to ‘00001__descrition.sql’). However, given that file names must be immutable in order for the database version control system to work properly, it means that it always has a upper limit as to how many SQL migration files you can have.

Goal of a custom database version control tool#

This makes me want to create my own version of database version control system (DVC).

My own database version control system in python: Design Requirements

My goal is to have a DVC satisfying the below requirements:

  • It works only with the Postgresql database (Optionally, it should be extendable)

  • It accepts only SQL files (It does not use ORM, like SQLalchemy)

  • It retains rich metadata in the database, so that we can check the database version via SQL.

  • The spirit is that the DVC should act like git, which stores all the git objects in a .git folder. Similarly, there should be a table in the database, which stores all the metadata of the DVC, such that a simple SQL can be used to query the database version.

Features#

This page explains the features of the CLI.

Overview#

Command Line Interface#

Command Line Interface
  • Showing CLI commands.

Rich Database Metadata#

Note

More information about the metadata can be found under design/metadata

Rich Metadata
  • Showing the metadata tables.
    • dvc.database_revision_history table which shows the revision SQL files applied.

    • dvc.database_version_history table which shows the database version which results from the revision SQL files applied.

Upgrade or Downgrade#

Rich Metadata
  • Showing upgrade and downgrade command.
    • Started from database version 1 (Shown via dvc db current)

    • Showed the Revision SQL files under the default folder sample_revision_sql_files.

    • Applied database upgrade via dvc db upgrade.

    • Showed the database version became 2.

    • Applied database downgrade via dvc db downgrade.

    • Showed the database version was back to 1.

Flexible Configuration Format#

Configuration is read either from i. Configuration File (config.yaml) or ii. Environment Variable

Configuration File#

Note

The configuration file template can be generated via dvc cfg init

Config Via Configuration File
  • Showing dvc tool reads configuration from a configuration file.
    • Ran a postgres DB via docker docker run -e POSTGRES_USER=test -e POSTGRES_PASSWORD=test -e POSTGRES_DB=test -p 5433:5432 postgres:latest

    • Copied a config.yaml file with configurations which match the spun up postgres DB.

    • Pinged the DB with dvc db ping. Success!

  • The config.yaml file looks as follows:

credentials:
  dbflavour: postgres
  dbname: 'test'
  host: 'localhost'
  password: 'test'
  port: 5433
  user: 'test'
database_revision_sql_files_folder: sample_revision_sql_files
logging_level: DEBUG
Environment Variable#

Note

The names of the environment variables can be found in the docker compose file

Config Via Env Var
  • Showing dvc tool reads configuration from environment variables
    • Ran a postgres DB via docker docker run -e POSTGRES_USER=test -e POSTGRES_PASSWORD=test -e POSTGRES_DB=test -p 5433:5432 postgres:latest

    • Loaded the env variables to the shell.

    • Pinged the DB with dvc db ping. Success!

  • The environment variables look as follows:

 >>> printnev| grep DVC
 DVC__DATABASE_REVISION_SQL_FILES_FOLDER=sample_revision_sql_files
 DVC__USER=test
 DVC__PASSWORD=test
 DVC__HOST=postgres_db
 DVC__PORT=5432
 DVC__DBNAME=test
 DVC__DBFLAVOUR=postgres
 DEV__LOGGING_LEVEL: DEBUG

Demo Files#

  • This page shows the SQL files used for the demonstraion.

RV1#

Upgrade#
 1-- create schema
 2CREATE SCHEMA IF NOT EXISTS fundamentals;
 3-- create table
 4CREATE TABLE IF NOT EXISTS fundamentals.source
 5  (
 6     source_id INTEGER PRIMARY KEY,
 7     source VARCHAR(50) NOT NULL UNIQUE,
 8     created_at TIMESTAMP NOT NULL DEFAULT (now() at time zone 'utc'),
 9     CONSTRAINT source_ux UNIQUE (source)
10  );
11-- Initialise with pre-set source
12INSERT INTO fundamentals.source (source_id, source)
13VALUES (1, 'yahoo'), (2, 'investing_dot_com') on conflict do nothing;
14-- create table
15CREATE TABLE IF NOT EXISTS fundamentals.price
16  (
17     ticker VARCHAR(10) NOT NULL,
18     date DATE NOT NULL,
19     open   NUMERIC(50,10) NOT NULL CHECK (open >=0),
20     high   NUMERIC(50,10) NOT NULL CHECK (high >=0),
21     low   NUMERIC(50,10) NOT NULL CHECK (low >=0),
22     close   NUMERIC(50,10) NOT NULL CHECK (close >=0),
23     adj_close   NUMERIC(50,10) NOT NULL CHECK (adj_close >=0),
24     volume   NUMERIC(50,10) NOT NULL CHECK (volume >=0),
25     source_id  INTEGER,
26     created_at TIMESTAMP NOT NULL DEFAULT (now() at time zone 'utc'),
27     updated_at TIMESTAMP NOT NULL DEFAULT (now() at time zone 'utc'),
28     PRIMARY KEY (ticker, date),
29     CONSTRAINT fk_source FOREIGN KEY(source_id) REFERENCES fundamentals.source(source_id) ON DELETE CASCADE
30  );
Downgrade#
1drop schema fundamentals cascade;

RV2#

Upgrade#
1create schema if not exists datetime;
2create table if not exists datetime.special_date(
3  date date primary key,
4  is_hk_public_holiday boolean default False,
5  created_at TIMESTAMP NOT NULL DEFAULT (now() at time zone 'utc'),
6  updated_at TIMESTAMP NOT NULL DEFAULT (now() at time zone 'utc')
7);
Downgrade#
1drop schema datetime cascade;

RV3#

Upgrade#
 1create or replace view datetime.vw_trading_days_since_2021 as with weekdays (weekday) as (
 2select
 3	dt
 4-- Set timezone of now() to UTC
 5from
 6	generate_series('2021-01-01':: date, now() at time zone 'utc', '1 day':: interval) dt
 7where
 8	extract(dow
 9from
10	dt) not in (6, 0)
11	),
12	hk_holidays (hk_holiday) as (
13select
14	date
15from
16	datetime.special_date sd
17where
18	is_hk_public_holiday = true
19	),
20trading_days (trading_day) as (
21select
22	weekday
23from
24	weekdays
25except
26select
27	hk_holiday
28from
29	hk_holidays)
30select
31	trading_day
32from
33	trading_days
34order by
35	trading_days ;
Downgrade#
1drop view if exists datetime.vw_trading_days_since_2021;

Step-by-step Guide#

  • This page explains how the CLI can be used

Check version#

/home/ken $ dvc version
WARNING:root:Cannot find logging_level. Using default INFO
0.4.0

Initialise configuration File#

/home/ken $ dvc cfg init
WARNING:root:Cannot find logging_level. Using default INFO
INFO:root:Now generating default config file config.yaml
INFO:root:Reading config from file...
INFO:root:Generating database revision folder

Create Test Database#

 /home/ken $ psql -U postgres
psql (14.3)
Type "help" for help.

postgres=# create database my_test_db;
CREATE DATABASE
postgres=# exit

Create Test Database#

  /home/ken $ vim config.yamlA

  1 credentials:¬
  2 ··dbflavour:·postgres¬
  3 ··dbname:·'my_test_db'¬
  4 ··host:·'localhost'¬
  5 ··password:·xxxxx
  6 ··port:·5432¬
  7 ··user:·'postgres'¬
  8 database_revision_sql_files_folder:·sample_revision_sql_files¬
  9 logging_level:·INFO¬
~                         

Populate sample_database_revision_files_folder#

 /home/ken $ ls ./sample_revision_sql_files
RV1__create_scm_fundamentals_and_tbls.downgrade.sql
RV1__create_scm_fundamentals_and_tbls.upgrade.sql
RV2__create_scm_datetime_and_tbls.downgrade.sql
RV2__create_scm_datetime_and_tbls.upgrade.sql
RV3__datetime__create_vw_trading_days_since_2021.upgrade.sql
RV3__datetime__drop_vw_trading_days_since_2021.downgrade.sql

Ping the database#

 /home/ken $ dvc db ping
INFO:root:Reading config from file...
Database connection looks good!
Database: my_test_db
Host: localhost

Initialise the database#

 /home/ken $ dvc db init
INFO:root:Reading config from file...
INFO:root:Reading config from file...
INFO:root:Reading config from file...
Database init successful!
Database: my_test_db
Host: localhost

Check Current Database Version#

 /home/ken $ dvc db current
INFO:root:Reading config from file...
INFO:root:Reading config from file...
Database Current Version: V0

Do Dry-run for db upgrade#

  • Check the logs to console and ensure the SQL file to be applied is really the correct one.

 /home/ken $ dvc db upgrade --dry-run
INFO:root:Reading config from file...
INFO:root:Reading config from file...
Current Database Version is V0
Next Upgrade Revision Version will be 1
INFO:root:Reading config from file...
INFO:root:Reading config from file...
INFO:root:Reading config from file...
INFO:root:Reading config from file...
Below files will be applied:
[File Path: sample_revision_sql_files/RV1__create_scm_fundamentals_and_tbls.upgrade.sql]
INFO:root:Dry run is complete
Aborted!

Run DB Upgrade#

  • Upgrade the DB and check the db version afterwards

 /home/ken $ dvc db upgrade
INFO:root:Reading config from file...
INFO:root:Reading config from file...
Current Database Version is V0
Next Upgrade Revision Version will be 1
INFO:root:Reading config from file...
INFO:root:Reading config from file...
INFO:root:Reading config from file...
INFO:root:Reading config from file...
Below files will be applied:
[File Path: sample_revision_sql_files/RV1__create_scm_fundamentals_and_tbls.upgrade.sql]
Going to apply file File Path: sample_revision_sql_files/RV1__create_scm_fundamentals_and_tbls.upgrade.sql .....
You sure you want to continue ? [y/N]: y
INFO:root:Now applying sample_revision_sql_files/RV1__create_scm_fundamentals_and_tbls.upgrade.sql and marking to metadata table
INFO:root:Reading config from file...
INFO:root:Reading config from file...

 /home/ken $ dvc db current
INFO:root:Reading config from file...
INFO:root:Reading config from file...
Database Current Version: V1

Use –head#

  • Check the SQL file(s) to be applied with the –head flag

 /home/ken $ dvc db upgrade --head
INFO:root:Reading config from file...
INFO:root:Reading config from file...
Current Database Version is V1
Next Upgrade Revision Version will be 2
INFO:root:Reading config from file...
INFO:root:Reading config from file...
INFO:root:Reading config from file...
INFO:root:Reading config from file...
Below files will be applied:
[File Path: sample_revision_sql_files/RV2__create_scm_datetime_and_tbls.upgrade.sql, File Path: sample_revision_sql_files/RV3__datetime__create_vw_trading_days_since_2021.upgrade.sql]
Going to apply file File Path: sample_revision_sql_files/RV2__create_scm_datetime_and_tbls.upgrade.sql .....
You sure you want to continue ? [y/N]: y
INFO:root:Now applying sample_revision_sql_files/RV2__create_scm_datetime_and_tbls.upgrade.sql and marking to metadata table
INFO:root:Reading config from file...
INFO:root:Reading config from file...
Going to apply file File Path: sample_revision_sql_files/RV3__datetime__create_vw_trading_days_since_2021.upgrade.sql .....
You sure you want to continue ? [y/N]: y
INFO:root:Now applying sample_revision_sql_files/RV3__datetime__create_vw_trading_days_since_2021.upgrade.sql and marking to metadata table
INFO:root:Reading config from file...
INFO:root:Reading config from file...

Use –head#

  • Check the result in the database

 /home/ken $ psql -U postgres -d my_test_db
psql (14.3)
Type "help" for help.

my_test_db=# select * from information_schema.tables where table_schema not in ('information_schema', 'pg_catalog');
 table_catalog | table_schema |         table_name         | table_type | self_referencing_column_name | reference_generation | user_defined_type_cata
log | user_defined_type_schema | user_defined_type_name | is_insertable_into | is_typed | commit_action
---------------+--------------+----------------------------+------------+------------------------------+----------------------+-----------------------
----+--------------------------+------------------------+--------------------+----------+---------------
 my_test_db    | dvc          | database_revision_history  | BASE TABLE |                              |                      |
    |                          |                        | YES                | NO       |
 my_test_db    | dvc          | database_version_history   | BASE TABLE |                              |                      |
    |                          |                        | YES                | NO       |
 my_test_db    | fundamentals | source                     | BASE TABLE |                              |                      |
    |                          |                        | YES                | NO       |
 my_test_db    | fundamentals | price                      | BASE TABLE |                              |                      |
    |                          |                        | YES                | NO       |
 my_test_db    | datetime     | special_date               | BASE TABLE |                              |                      |
    |                          |                        | YES                | NO       |
 my_test_db    | datetime     | vw_trading_days_since_2021 | VIEW       |                              |                      |
    |                          |                        | NO                 | NO       |
(6 rows)

Downgrade the database with –base and –no-confirm#

  • Check the result in the database

 /home/ken $ dvc db downgrade --base --no-confirm
INFO:root:Reading config from file...
INFO:root:Reading config from file...
Current Database Version is V3
Next Downgrade Revision Version will be 3
INFO:root:Reading config from file...
INFO:root:Reading config from file...
INFO:root:Reading config from file...
INFO:root:Reading config from file...
Below files will be applied:
[File Path: sample_revision_sql_files/RV3__datetime__drop_vw_trading_days_since_2021.downgrade.sql, File Path: sample_revision_sql_files/RV2__create_scm_datetime_and_tbls.downgrade.sql, File Path: sample_revision_sql_files/RV1__create_scm_fundamentals_and_tbls.downgrade.sql]
Going to apply file File Path: sample_revision_sql_files/RV3__datetime__drop_vw_trading_days_since_2021.downgrade.sql .....
INFO:root:Now applying sample_revision_sql_files/RV3__datetime__drop_vw_trading_days_since_2021.downgrade.sql and marking to metadata table
INFO:root:Reading config from file...
INFO:root:Reading config from file...
Going to apply file File Path: sample_revision_sql_files/RV2__create_scm_datetime_and_tbls.downgrade.sql .....
INFO:root:Now applying sample_revision_sql_files/RV2__create_scm_datetime_and_tbls.downgrade.sql and marking to metadata table
INFO:root:Reading config from file...
INFO:root:Reading config from file...
Going to apply file File Path: sample_revision_sql_files/RV1__create_scm_fundamentals_and_tbls.downgrade.sql .....
INFO:root:Now applying sample_revision_sql_files/RV1__create_scm_fundamentals_and_tbls.downgrade.sql and marking to metadata table
INFO:root:Reading config from file...
INFO:root:Reading config from file...

Check Database tables again#

  • Check the result in the database

 /home/ken $ psql -U postgres -d my_test_db
psql (14.3)
Type "help" for help.

my_test_db=# select * from information_schema.tables where table_schema not in ('information_schema', 'pg_catalog');
 table_catalog | table_schema |        table_name         | table_type | self_referencing_column_name | reference_generation | user_defined_type_catal
og | user_defined_type_schema | user_defined_type_name | is_insertable_into | is_typed | commit_action
---------------+--------------+---------------------------+------------+------------------------------+----------------------+------------------------
---+--------------------------+------------------------+--------------------+----------+---------------
 my_test_db    | dvc          | database_revision_history | BASE TABLE |                              |                      |
   |                          |                        | YES                | NO       |
 my_test_db    | dvc          | database_version_history  | BASE TABLE |                              |                      |
   |                          |                        | YES                | NO       |
(2 rows)


my_test_db=# \pset format wrapped
Output format is wrapped.
my_test_db=# select * from dvc.database_revision_history;
 revision_id | executed_sql_file_folder |    executed_sql_file_name     | executed_sql_file_content_hash | executed_sql_file_content | operation | revision_applied |       created_at
-------------+--------------------------+-------------------------------+--------------------------------+---------------------------+-----------+------------------+-------------------------
           1 | sample_revision_sql_file.| RV1__create_scm_fundamentals_.| 0f76966869d3d4eb2d3511f1dcd6f6.| -- create schema         +| Upgrade   | RV1              | 2022-06-25 09:56:53.161.
             |.s                        |.and_tbls.upgrade.sql          |.1d                             | CREATE SCHEMA IF NOT EXIS.|           |                  |.503
             |                          |                               |                                |.TS fundamentals;         +|           |                  |
             |                          |                               |                                | -- create table          +|           |                  |
             |                          |                               |                                | CREATE TABLE IF NOT EXIST.|           |                  |
             |                          |                               |                                |.S fundamentals.source    +|           |                  |
             |                          |                               |                                |   (                      +|           |                  |
             |                          |                               |                                |      source_id INTEGER PR.|           |                  |
             |                          |                               |                                |.IMARY KEY,               +|           |                  |
             |                          |                               |                                |      source VARCHAR(50) N.|           |                  |
             |                          |                               |                                |.OT NULL UNIQUE,          +|           |                  |
             |                          |                               |                                |      created_at TIMESTAMP.|           |                  |
             |                          |                               |                                |. NOT NULL DEFAULT (now() .|           |                  |
             |                          |                               |                                |.at time zone 'utc'),     +|           |                  |
             |                          |                               |                                |      CONSTRAINT source_ux.|           |                  |
             |                          |                               |                                |. UNIQUE (source)         +|           |                  |
             |                          |                               |                                |   );                     +|           |                  |
             |                          |                               |                                | -- Initialise with pre-se.|           |                  |
             |                          |                               |                                |.t source                 +|           |                  |
             |                          |                               |                                | INSERT INTO fundamentals..|           |                  |
             |                          |                               |                                |.source (source_id, source.|           |                  |
             |                          |                               |                                |.)                        +|           |                  |
             |                          |                               |                                | VALUES (1, 'yahoo'), (2, .|           |                  |
             |                          |                               |                                |.'investing_dot_com') on c.|           |                  |
             |                          |                               |                                |.onflict do nothing;      +|           |                  |
             |                          |                               |                                | -- create table          +|           |                  |
             |                          |                               |                                | CREATE TABLE IF NOT EXIST.|           |                  |
             |                          |                               |                                |.S fundamentals.price     +|           |                  |
             |                          |                               |                                |   (                      +|           |                  |
             |                          |                               |                                |      ticker VARCHAR(10) N.|           |                  |
             |                          |                               |                                |.OT NULL,                 +|           |                  |
             |                          |                               |                                |      date DATE NOT NULL, +|           |                  |
             |                          |                               |                                |      open   NUMERIC(50,10.|           |                  |
             |                          |                               |                                |.) NOT NULL CHECK (open >=.|           |                  |
             |                          |                               |                                |.0),                      +|           |                  |
             |                          |                               |                                |      high   NUMERIC(50,10.|           |                  |
             |                          |                               |                                |.) NOT NULL CHECK (high >=.|           |                  |
             |                          |                               |                                |.0),                      +|           |                  |
             |                          |                               |                                |      low   NUMERIC(50,10).|           |                  |
             |                          |                               |                                |. NOT NULL CHECK (low >=0).|           |                  |
             |                          |                               |                                |.,                        +|           |                  |
             |                          |                               |                                |      close   NUMERIC(50,1.|           |                  |
             |                          |                               |                                |.0) NOT NULL CHECK (close .|           |                  |
             |                          |                               |                                |.>=0),                    +|           |                  |
             |                          |                               |                                |      adj_close   NUMERIC(.|           |                  |
             |                          |                               |                                |.50,10) NOT NULL CHECK (ad.|           |                  |
             |                          |                               |                                |.j_close >=0),            +|           |                  |
             |                          |                               |                                |      volume   NUMERIC(50,.|           |                  |
             |                          |                               |                                |.10) NOT NULL CHECK (volum.|           |                  |
             |                          |                               |                                |.e >=0),                  +|           |                  |
             |                          |                               |                                |      source_id  INTEGER, +|           |                  |
             |                          |                               |                                |      created_at TIMESTAMP.|           |                  |
             |                          |                               |                                |. NOT NULL DEFAULT (now() .|           |                  |
             |                          |                               |                                |.at time zone 'utc'),     +|           |                  |
             |                          |                               |                                |      updated_at TIMESTAMP.|           |                  |
             |                          |                               |                                |. NOT NULL DEFAULT (now() .|           |                  |
             |                          |                               |                                |.at time zone 'utc'),     +|           |                  |
             |                          |                               |                                |      PRIMARY KEY (ticker,.|           |                  |
             |                          |                               |                                |. date),                  +|           |                  |
             |                          |                               |                                |      CONSTRAINT fk_source.|           |                  |
             |                          |                               |                                |. FOREIGN KEY(source_id) R.|           |                  |
             |                          |                               |                                |.EFERENCES fundamentals.so.|           |                  |
             |                          |                               |                                |.urce(source_id) ON DELETE.|           |                  |
             |                          |                               |                                |. CASCADE                 +|           |                  |
             |                          |                               |                                |   );                     +|           |                  |
             |                          |                               |                                |                           |           |                  |
           2 | sample_revision_sql_file.| RV2__create_scm_datetime_and_.| 993c55319cdeec01775b868b94b7d6.| create schema if not exis.| Upgrade   | RV2              | 2022-06-25 09:59:01.019.
             |.s                        |.tbls.upgrade.sql              |.2e                             |.ts datetime;             +|           |                  |.371
             |                          |                               |                                | create table if not exist.|           |                  |
             |                          |                               |                                |.s datetime.special_date( +|           |                  |
             |                          |                               |                                |   date date primary key, +|           |                  |
             |                          |                               |                                |   is_hk_public_holiday bo.|           |                  |
             |                          |                               |                                |.olean default False,     +|           |                  |
             |                          |                               |                                |   created_at TIMESTAMP NO.|           |                  |
             |                          |                               |                                |.T NULL DEFAULT (now() at .|           |                  |
             |                          |                               |                                |.time zone 'utc'),        +|           |                  |
             |                          |                               |                                |   updated_at TIMESTAMP NO.|           |                  |
             |                          |                               |                                |.T NULL DEFAULT (now() at .|           |                  |
             |                          |                               |                                |.time zone 'utc')         +|           |                  |
             |                          |                               |                                | );                        |           |                  |
           3 | sample_revision_sql_file.| RV3__datetime__create_vw_trad.| ffe591a58e951e7bbe87b0c06b7d8c.| create or replace view da.| Upgrade   | RV3              | 2022-06-25 09:59:01.832.
             |.s                        |.ing_days_since_2021.upgrade.s.|.fa                             |.tetime.vw_trading_days_si.|           |                  |.872
             |                          |.ql                            |                                |.nce_2021 as with weekdays.|           |                  |
             |                          |                               |                                |. (weekday) as (          +|           |                  |
             |                          |                               |                                | select                   +|           |                  |
             |                          |                               |                                |         dt               +|           |                  |
             |                          |                               |                                | -- Set timezone of now() .|           |                  |
             |                          |                               |                                |.to UTC                   +|           |                  |
             |                          |                               |                                | from                     +|           |                  |
             |                          |                               |                                |         generate_series('.|           |                  |
             |                          |                               |                                |.2021-01-01':: date, now().|           |                  |
             |                          |                               |                                |. at time zone 'utc', '1 d.|           |                  |
             |                          |                               |                                |.ay':: interval) dt       +|           |                  |
             |                          |                               |                                | where                    +|           |                  |
             |                          |                               |                                |         extract(dow      +|           |                  |
             |                          |                               |                                | from                     +|           |                  |
             |                          |                               |                                |         dt) not in (6, 0)+|           |                  |
             |                          |                               |                                |         ),               +|           |                  |
             |                          |                               |                                |         hk_holidays (hk_h.|           |                  |
             |                          |                               |                                |.oliday) as (             +|           |                  |
             |                          |                               |                                | select                   +|           |                  |
             |                          |                               |                                |         date             +|           |                  |
             |                          |                               |                                | from                     +|           |                  |
             |                          |                               |                                |         datetime.special_.|           |                  |
             |                          |                               |                                |.date sd                  +|           |                  |
             |                          |                               |                                | where                    +|           |                  |
             |                          |                               |                                |         is_hk_public_holi.|           |                  |
             |                          |                               |                                |.day = true               +|           |                  |
             |                          |                               |                                |         ),               +|           |                  |
             |                          |                               |                                | trading_days (trading_day.|           |                  |
             |                          |                               |                                |.) as (                   +|           |                  |
             |                          |                               |                                | select                   +|           |                  |
             |                          |                               |                                |         weekday          +|           |                  |
             |                          |                               |                                | from                     +|           |                  |
             |                          |                               |                                |         weekdays         +|           |                  |
             |                          |                               |                                | except                   +|           |                  |
             |                          |                               |                                | select                   +|           |                  |
             |                          |                               |                                |         hk_holiday       +|           |                  |
             |                          |                               |                                | from                     +|           |                  |
             |                          |                               |                                |         hk_holidays)     +|           |                  |
             |                          |                               |                                | select                   +|           |                  |
             |                          |                               |                                |         trading_day      +|           |                  |
             |                          |                               |                                | from                     +|           |                  |
             |                          |                               |                                |         trading_days     +|           |                  |
             |                          |                               |                                | order by                 +|           |                  |
             |                          |                               |                                |         trading_days ;    |           |                  |
           4 | sample_revision_sql_file.| RV3__datetime__drop_vw_tradin.| b4fd8e1c50f3f17eab0ad8f05d2d50.| drop view if exists datet.| Downgrade | RV3              | 2022-06-25 10:09:41.442.
             |.s                        |.g_days_since_2021.downgrade.s.|.b1                             |.ime.vw_trading_days_since.|           |                  |.939
             |                          |.ql                            |                                |._2021;                    |           |                  |
           5 | sample_revision_sql_file.| RV2__create_scm_datetime_and_.| d040183e1231b5e4e82f0c7e4133d0.| drop schema datetime casc.| Downgrade | RV2              | 2022-06-25 10:09:41.471.
             |.s                        |.tbls.downgrade.sql            |.c6                             |.ade;                      |           |                  |.216
           6 | sample_revision_sql_file.| RV1__create_scm_fundamentals_.| f28acbf63e26ece3c0324a9cd74dd2.| drop schema fundamentals .| Downgrade | RV1              | 2022-06-25 10:09:41.493.
             |.s                        |.and_tbls.downgrade.sql        |.87                             |.cascade;                  |           |                  |.714

Quickstart#

Executable#

Note

Executables are available on Github Releases

  • The releases page provides executables on the below three Operating Systems.

    • Mac (Latest)

    • Linux Ubunutu (Latest)

    • Windows (Latest)

if you happen to use one of the OSes listed above, you can download the executable directly and use it without installing python!

PyPI Library#

Note

The commandline tool is uploaded to PyPI

Run the below to see it in action.

# Install the library from PyPi
pip install database-version-control

# To get more instructions of the commandline tool, run the below in the terminal
dvc --help

See detailed deployment of the commandline tool

Docker Image#

Note

The tool is containerised and is distributed on Dockerhub

Check out the docker-compose.yml file in the github repository to run a demo!

Run the below to see it in action.

# Clone the repo and checkout release branch
git clone -b release git@github.com:kenho811/Python_Database_Version_Control.git

# cd to the repository
cd Python_Database_Version_Control/docker_compose_demo

# Fnd the docker-compose.yml and run
docker compose up

# Using psql as client, access the postgres DB and see the result
(URL: postgres://test:test@localhost:5433/test)
PGPASSWORD=test psql -U test -d test -h localhost -p 5433

# Check out docker-compose.yml file for usage as a microservice

See detailed deployment of the Docker Image

Deployment#

This page explains how the tool can be deployed.

Docker Image#

This page explains how to use the tool packaged in a Docker Image.

Docker compose#

  • Docker Compose allows spinning up several containers together as a single service.

  • Configurations are stored in a docker-compose.yaml file.

  • For illustration, please refer to the docker-compose.yml file.

Kubernetes#

  • TBC

CLI#

This page explains how to use the tool packaged as commandline tool (CLI).

It is easier to use a configuration file than environment variables.

A typical workflow is as follows:

# Generate configuration file
dvc cfg init

# Update the generated config.yaml

# Test connection to the database with the updated config.yaml
dvc db ping

# Create a folder to store all the SQL files to be applied to the database
# Remember to use the same folder name as specified in the config.yaml file

# Make sure the SQL files follow the naming conventions.

# Run upgrade script
dvc db upgrade

# (Optional) To revert the migration, create a downgrade script and run it
dvc db downgrade

# (Optional) Check the current database version
dvc db current

Design#

This page explains the design of the tool

Overview#

This page explains gives a graphical overview of the tool.

DVC CLI commands and subcommands#

This section explains the client-facing side of the tool. The library is exposed via the commandline dvc.

digraph cli {
   graph [fontname="Verdana", fontsize="12"];
   node [fontname="Verdana", fontsize="12"];
   edge [fontname="Sans", fontsize="9"];
   rankdir="LR";

  dvc -> {cfg, db}
  cfg -> {init, show}
  db -> {ping, init, upgrade, downgrade}


}

DVC CLI Commands and Subcommands#

Data Structures#

Core Structure#

This section explains the core of the tool. It shows the interaction between the classes (marked in yellow). Greyed out items are features yet to be implemented.

digraph core {
   graph [fontname="Verdana", fontsize="12"];
   node [fontname="Verdana", fontsize="12"];
   edge [fontname="Sans", fontsize="9"];
   label = "Core";

   env [label="Env Var"];
   conffile [label="Config File"];
   ConfigReader [label="ConfigReader", shape="class", color="yellow", style="filled"];
   ConfigFileWriter [label="ConfigFileWriter", shape="class", color="yellow", style="filled"];
   DatabaseRevisionFilesManager [label="DatabaseRevisionFilesManager", shape="class", color="yellow", style="filled"];
   sqlfiles [label="SQL Files"];


   subgraph cluster_0 {
       label="SQLFileExecutors and Databases"

       PostgresqlSQLFileExecutor[shape="class", color="yellow", style="filled"];
       MysqlSQLFileExecutor[shape="class", color="grey", style="filled"];
       BigquerySQLFileExecutor[shape="class", color="grey", style="filled"];
       postgresql
       mysql[color=grey, style="filled"];
       bigquery[color=grey, style="filled"];

       PostgresqlSQLFileExecutor -> postgresql[label="apply"]
       MysqlSQLFileExecutor -> mysql[label="apply"]
       BigquerySQLFileExecutor -> bigquery[label="apply"]
   }

   env -> ConfigReader[label="input"];
   ConfigFileWriter -> conffile[label="generate"];
   conffile -> ConfigReader[label="input"];
   ConfigReader -> DatabaseRevisionFilesManager[label="input"];
   DatabaseRevisionFilesManager -> sqlfiles[label="lookup"];
   DatabaseRevisionFilesManager -> {PostgresqlSQLFileExecutor,MysqlSQLFileExecutor,BigquerySQLFileExecutor}[label="call"];

   }

DVC Core structure#

DatabaseVersion and DatabaseRevisionFile#
  • Below is a series of graphs illustrating the relationship between

    • Current Database Version (DatabaseVersion)

    • Target Database Version (DatabaseVersion)

    • One or more Database Revision Files (DatabaseRevisionFiles)

digraph core {
   rankdir="LR";

   curr_dbver [label="V1", shape="cylinder"];
   tar_dbver [label="V2", shape="cylinder"];
   dbrev_file [label="RV2, Upgrade", shape="note"];

   subgraph cluster_0 {

       {rank=same curr_dbver dbrev_file}

       curr_dbver -> dbrev_file[label="plus", arrowhead="none"];

   }

   dbrev_file -> tar_dbver [ltail=cluster_0, lhead=tar_dbver];

   }

V1 + RV2 (Upgrade) = V2#

digraph core {
   rankdir="LR";

   curr_dbver [label="V1", shape="cylinder"];
   tar_dbver [label="V0", shape="cylinder"];
   dbrev_file [label="RV1, Downgrade", shape="note"];

   subgraph cluster_0 {

       {rank=same curr_dbver dbrev_file}

       curr_dbver -> dbrev_file[label="plus", arrowhead="none"];

   }

   dbrev_file -> tar_dbver [ltail=cluster_0, lhead=tar_dbver];

   }

V1 + RV1 (Downgrade) = V0#

digraph core {
   rankdir="LR";

   curr_dbver [label="V3", shape="cylinder"];
   tar_dbver [label="V5", shape="cylinder"];
   dbrev_file_1 [label="RV4, Upgrade", shape="note"];
   dbrev_file_2 [label="RV5, Upgrade", shape="note"];

   subgraph cluster_0 {

       {rank=same curr_dbver tar_dbver}

       tar_dbver -> curr_dbver[label="minus", arrowhead="none"];

   }

   subgraph cluster_ {

       {rank=same dbrev_file_1 dbrev_file_2}

       dbrev_file_1 -> dbrev_file_2[ arrowhead="none"];

   }

   curr_dbver -> dbrev_file_1 [ltail=cluster_0, lhead=cluster_1];

   }

V5 - V3 = [RV4 Upgrade + RV5 Upgrade]#

digraph core {
   rankdir="LR";

   curr_dbver [label="V5", shape="cylinder"];
   tar_dbver [label="V3", shape="cylinder"];
   dbrev_file_1 [label="RV5, Downgrade", shape="note"];
   dbrev_file_2 [label="RV4, Downgrade", shape="note"];

   subgraph cluster_0 {

       {rank=same curr_dbver tar_dbver}

       tar_dbver -> curr_dbver[label="minus", arrowhead="none"];

   }

   subgraph cluster_ {

       {rank=same dbrev_file_1 dbrev_file_2}

       dbrev_file_1 -> dbrev_file_2[ arrowhead="none"];

   }

   curr_dbver -> dbrev_file_1 [ltail=cluster_0, lhead=cluster_1];

   }

V3 - V5 = [RV5 Downgrade + RV4 Downgrade]#

Configuration#

  • Two ways to pass configurations to DVC tool.
    • Via Environment Variables.

    • Via Configuration FIle.

  • Note that Configuration File has precedence OVER Environment Variables. In other words, if a configuration file is detected by DVC tool, the environment variables will be ignored.

Revision SQL files Namaing Convention#

The tool discovers and applies SQL files to the database for version control.

These files need to follow a certain naming conventions.

  • All SQL files are considered revision files

  • They must follow the pattern RV[0-9]*__.*.(upgrade|downgrade).sql. In words, it means

    • They start with the prefix RV

    • After RV, it follows an arbitrary revision number (e.g. RV1, RV2, RV3 etc. etc.)

    • After RV(arbitrary_revision_number), it follows double underscores and an arbitrary number of characters. Everything after __ describes what the SQL file does.

    • After RV(arbitrary_revision_number)__(description), it follows a dot and the character group of either upgrade or downgrade. When applied, an upgrade revision file will move the database version upward by 1, while a downgrade revision file will move the database version downward by 1.

    • After RV(arbitrary_revision_number)__(description).(upgrade/downgrade), it follows a dot and the character group of sql .

    • Overall, RV(arbitrary_revision_number)__(description).(upgrade/downgrade).sql

  • Example SQL revision files

    • RV1__create_scm_company_secrets_and_tbl_earnings.upgrade.sql

    • RV1__delete_scm_company_secrets_cascade.downgrade.sql

    • RV2__alter_scm_company_secrets_tbl_earnings_updated_at_add_index.upgrade.sql

    • RV2__alter_scm_company_secrets_tbl_earnings_updated_at_remove_index.downgrade.sql

Database Metadata#

Just like git which stores all the metadata in a dot git folder (.git), the tool also stores metadata in the database where SQL Revision Files are applied.

  • Schema dvc will be created

    • Table dvc.database_revision_history will be created.
      • History of revision SQL files applied.

    • Table dvc.database_version_history will be created.
      • History of database versions which result from revision SQL files applied.

Contribution#

This page explains how to contribute to the codebase.

Development#

This page explains the development workflow.

Contributors#

# Git clone the repo and checkout master
git clone -b master git@github.com:kenho811/Python_Database_Version_Control.git

# create a feature branch from the master branch
git checkout -b feature/{code_change_theme}

# Pip install dependencies
pip install with `pip install ".[dev]"`

# Enable local githooks
git config --local core.hooksPath .githooks/

# Development

# Write unit + integration tests

# Run pytest
pytest

# Generate Documentation locally. ISLOCAL=1 removes local dependencies.
cd docs
ISLOCAL=1 make clean html

# Open PR against master

Maintainers#

# Review and merge PR into master branch

# Update local master branch
git checkout master
git pull

# Update the tool's version under src/dvc/version.py
# See: https://github.com/kenho811/Python_Database_Version_Control/blob/master/src/dvc/version.py#L1
# Example: __version__ = "{new_version_num}"

# Cut a release branch with the same updated version number
git checkout -b release/{new_version_num}
git push --set-upstream origin release/{new_version_num}
  • The CI pipeline specified here will manage the rest.

Continuous Integration#

The Codebase is on github.

CI is currently done via github action. It is integrated to the following destinations:

The table below shows the details:

CI convention for branches#

Branch

Performs Tests?

Artifacts

DockerHub Tag

Push DockerHub Readme?

PyPI Version

Readthedocs Version

master

Yes

Pytest report.

latest

Yes

N/A

latest

feature/**

Yes

N/A

N/A

No

N/A

N/A

release/{major.minor}

Yes

N/A

N/A

No

N/A

N/A

CI convention for tags#

Tag

Performs Tests?

Artifacts

DockerHub Tag

Push DockerHub Readme?

PyPI Version

Readthedocs Version

release/{major.minor.patch}

Yes

Pytest artifacts + executables in github releases.

release-{major.minor.patch}

No

{majoir.minor.patch}

release-major.minor.patch

Source Documentation#

Src Package#

dvc package#

Subpackages#
dvc.app package#
Subpackages#
dvc.app.cli package#
Subpackages#
dvc.app.cli.commands package#
Subpackages#
dvc.app.cli.commands.database package#
Submodules#
dvc.app.cli.commands.database.backend module#
class dvc.app.cli.commands.database.backend.DatabaseInteractor(config_file_path_str: str)[source]#

Bases: object

Exposes API to interact with Various Database flavours

MAPPING = {SupportedDatabaseFlavour.Postgres: <function DatabaseInteractor.<lambda>>}#
property conn#
Returns

property database_revision_files_manager: dvc.core.config.DatabaseRevisionFilesManager#
Returns

DatabaseRevisionFilesManager

execute_single_sql_file(database_revision_file: dvc.core.struct.DatabaseRevisionFile, mark_only: bool = False) None[source]#

Execute DatabaseRevisionFile to the Database and optionally mark it as metadata

Parameters
  • database_revision_file – Database Revision File to apply to the Database

  • mark_only – whether or not to mark the SQL file as being done as metadata, without actually executing the SQL file

Returns

get_target_database_revision_files(steps: int, pointer: Optional[dvc.core.config.DatabaseRevisionFilesManager.Pointer] = None) List[dvc.core.struct.DatabaseRevisionFile][source]#

Helper to get target database revision files Check number of returned revision files must be same as steps specified

Parameters

steps – Specify how many steps ahead/ backwards.. When None, it goes to the very end in either direction

Returns

List of DatabaseRevisionFiles, if any

property latest_database_version: dvc.core.struct.DatabaseVersion#
Returns

latest Database Version

ping() None[source]#

Ping the database connection

Returns

property sql_file_executor#
Module contents#

database subcommand

Submodules#
dvc.app.cli.commands.config module#

config subcommand

Module contents#
Submodules#
dvc.app.cli.main module#

Starting point of the CLI

Module contents#
Module contents#
dvc.core package#
Subpackages#
dvc.core.database package#
Subpackages#
dvc.core.database.bigquery package#
Module contents#
class dvc.core.database.bigquery.BigQuerySQLFileExecutor(db_conn: psycopg2.extensions.connection)[source]#

Bases: dvc.core.database.SQLFileExecutorTemplate

dvc.core.database.mysql package#
Module contents#
class dvc.core.database.mysql.MySQLSQLFileExecutor(db_conn: psycopg2.extensions.connection)[source]#

Bases: dvc.core.database.SQLFileExecutorTemplate

dvc.core.database.postgres package#
Module contents#
class dvc.core.database.postgres.PostgresSQLFileExecutor(db_conn: psycopg2.extensions.connection)[source]#

Bases: dvc.core.database.SQLFileExecutorTemplate

FILE_HASHER = <dvc.core.hash.FileHasher object>#
METADATA_SQL_FOLDER_PATH = PosixPath('/home/docs/checkouts/readthedocs.org/user_builds/python-database-version-control/checkouts/release-0.5.1/src/dvc/core/database/postgres')#
execute_database_revision(database_revision_file: dvc.core.struct.DatabaseRevisionFile)[source]#

Execute database revision and write to database version control tables :param database_revision_file: :return:

get_latest_database_version() dvc.core.struct.DatabaseVersion[source]#

Get the latest database version :return:

set_up_database_revision_control_tables()[source]#

Create all database revision control schema and tables :return:

Module contents#
class dvc.core.database.SQLFileExecutorTemplate(db_conn: psycopg2.extensions.connection)[source]#

Bases: abc.ABC

Abstract Base Class for all SQLFileExecutors for different datbaases

abstract execute_database_revision(database_revision: dvc.core.struct.DatabaseRevisionFile)[source]#
abstract get_latest_database_version()[source]#
abstract set_up_database_revision_control_tables()[source]#
class dvc.core.database.SupportedDatabaseFlavour(value)[source]#

Bases: enum.Enum

List of database flavours supported in the programme

Postgres = 'postgres'#
Submodules#
dvc.core.config module#
class dvc.core.config.ConfigDefault[source]#

Bases: object

KEY__DATABASE_REVISION_SQL_FILES_FOLDER = 'DVC__DATABASE_REVISION_SQL_FILES_FOLDER'#
KEY__DBFLAVOUR = 'DVC__DBFLAVOUR'#
KEY__DBNAME = 'DVC__DBNAME'#
KEY__HOST = 'DVC__HOST'#
KEY__LOGGING_LEVEL = 'DVC__LOGGING_LEVEL'#
KEY__PASSWORD = 'DVC__PASSWORD'#
KEY__PORT = 'DVC__PORT'#
KEY__USER = 'DVC__USER'#
VAL__DATABASE_REVISION_SQL_FILES_FOLDER = 'sample_revision_sql_files'#
VAL__DBFLAVOUR = 'postgres'#
VAL__DBNAME = ''#
VAL__FILE_PATH: pathlib.Path = PosixPath('config.yaml')#
VAL__FilE_NAME: str = 'config.yaml'#
VAL__HOST = ''#
VAL__LOGGING_LEVEL: str = 'INFO'#
VAL__PASSWORD = ''#
VAL__PORT = 5432#
VAL__USER = ''#
classmethod get_config_dict(database_revision_sql_files_folder: str, user: str, password: str, host: str, port: int, dbname: str, dbflavour: str, logging_level: int, as_file=False)[source]#
Parameters
  • database_revision_sql_files_folder

  • user

  • password

  • host

  • port

  • dbname

  • dbflavour

  • logging_level – Assumed to be integer value

  • as_file – whether to dump the dict as file.

Returns

class dvc.core.config.ConfigFileWriter(config_file_path: Union[pathlib.Path, str] = PosixPath('config.yaml'))[source]#

Bases: object

Read Config Files (in different formats) to Python Dictionary

write_to_yaml() None[source]#
class dvc.core.config.ConfigReader(config_file_path: Union[pathlib.Path, str] = PosixPath('config.yaml'))[source]#

Bases: object

Read Config (in different formats) to Python Dictionary

Precedence in descending order 1. Config File 2. Environment Variable

class dvc.core.config.DatabaseConnectionFactory(config_reader: dvc.core.config.ConfigReader)[source]#

Bases: object

Return connections for various databases

MAPPING = {SupportedDatabaseFlavour.Postgres: 'self.pgconn'}#
property conn: psycopg2.extensions.connection#

Return the expected connection object for different database flavours :return:

property pgconn: psycopg2.extensions.connection#

Return Postgres Database Connection

Returns

validate_requested_database_flavour() dvc.core.database.SupportedDatabaseFlavour[source]#

Validate if requested database flavour is supported

Returns

class dvc.core.config.DatabaseRevisionFilesManager(config_file_reader: dvc.core.config.ConfigReader)[source]#

Bases: object

Manager all Database Revision Files

class Pointer[source]#

Bases: object

Head: ALl the way to the latest

BASE = 'base'#
HEAD = 'head'#
create_database_revision_files_folder() None[source]#

Safely create the database revision files folder.

get_target_database_revision_files_by_pointer(current_database_version: dvc.core.struct.DatabaseVersion, candidate_database_revision_files: List[dvc.core.struct.DatabaseRevisionFile], pointer: dvc.core.config.DatabaseRevisionFilesManager.Pointer) List[dvc.core.struct.DatabaseRevisionFile][source]#

Given current database version and pointer, filter for target database revision files in the folder

Parameters
  • current_database_version

  • candidate_database_revision_files

Returns

get_target_database_revision_files_by_steps(current_database_version: dvc.core.struct.DatabaseVersion, steps: int, candidate_database_revision_files: List[dvc.core.struct.DatabaseRevisionFile]) List[dvc.core.struct.DatabaseRevisionFile][source]#

Given current database version and number of steps, filter for target database revision files in the folder.

Returns

dvc.core.exception module#
exception dvc.core.exception.DatabaseConnectionFailureException[source]#

Bases: Exception

Exception raised when connection to the database fails

exception dvc.core.exception.EnvironmentVariableNotSetException(missing_env_var: str)[source]#

Bases: Exception

Exception raised when required environment variables are not found

exception dvc.core.exception.InvalidDatabaseRevisionFilesException(status: dvc.core.exception.InvalidDatabaseRevisionFilesException.Status, config_file_path: Optional[pathlib.Path], database_revision_file_paths: List[pathlib.Path])[source]#

Bases: Exception

Exception Raised when something is wrong with the DatabaseRevisionFiles

class Status(value)[source]#

Bases: enum.Enum

List of all reasons

FEWER_REVISION_SQL_FILES_FOUND_THAN_REQUIRED_STEPS_SPECIFIED = 103#
MORE_REVISION_SQL_FILES_FOUND_THAN_REQUIRED_STEPS_SPECIFIED = 102#
NONCONSECUTIVE_REVISION_SQL_FILES_FOR_HEAD_OR_BASE_POINTER = 104#
NON_CONFORMANT_REVISION_FILE_NAME_EXISTS = 101#
exception dvc.core.exception.InvalidDatabaseVersionException(database_version: str)[source]#

Bases: Exception

Exception raised when format of Database Version is wrong

exception dvc.core.exception.OperationNotAccountedForException(operation_type=<enum 'Operation'>)[source]#

Bases: Exception

Exception raised when operation is requested but is not yet developed

exception dvc.core.exception.RequestedDatabaseFlavourNotSupportedException(requested_database_flavour: str)[source]#

Bases: Exception

Exception raised when requested database flavour is not supported

dvc.core.file module#
dvc.core.file.validate_file_exist(file_path: pathlib.Path) None[source]#

Throw FileNotFileError if a given file does not exist

dvc.core.hash module#
class dvc.core.hash.FileHasher[source]#

Bases: object

Hash content of any given file

md5(file_path: pathlib.Path) str[source]#

Extract content from a file and hash its output

Parameters

file_path – Pathlib Path

Returns

string

dvc.core.logger module#
class dvc.core.logger.SetRootLoggingLevel(func)[source]#

Bases: object

Used as a decorator to set the root logging level

set_logging_level(logging_level: int)[source]#
dvc.core.regex module#
dvc.core.regex.get_matched_files_in_folder_by_regex(folder_path: pathlib.Path, file_name_regex: str) List[pathlib.Path][source]#

Loop recursively for all files in a given folder. Return those files whose name satisfy the regex.

Parameters
  • folder_path – Path pointing tothe folder with files

  • file_name_regex – regex used to filter for files with the desired file name

Returns

dvc.core.struct module#
class dvc.core.struct.DatabaseRevisionFile(file_path: pathlib.Path)[source]#

Bases: object

Raise error when File Path does not conform to standard

STANDARD_RV_FILE_FORMAT_REGEX = '^RV[0-9]+__.*\\.(upgrade|downgrade)\\.sql$'#
property description: str#

Get the file description

Returns

property ending: str#

Get the file ending

Returns

classmethod get_dummy_revision_file(revision: str, operation_type: dvc.core.struct.Operation) dvc.core.struct.DatabaseRevisionFile[source]#

Return a dummy revision file

Parameters
  • revision

  • operation_type

Returns

property operation_type: dvc.core.struct.Operation#

Get the operation type

Returns

property revision_number: int#

Get the revision number

Returns

class dvc.core.struct.DatabaseVersion(version: str, created_at: Optional[datetime.datetime] = None)[source]#

Bases: object

STANDARD_DATABASE_VERSION_FORMAT_REGEX = '^V[0-9]+$'#
property created_at#
property next_downgrade_database_revision_file: dvc.core.struct.DatabaseRevisionFile#

Get the database revision file for downgrade :return:

property next_upgrade_database_revision_file: dvc.core.struct.DatabaseRevisionFile#

Get the database revision file for upgrade :return:

property version#
property version_number: int#
class dvc.core.struct.Operation(value)[source]#

Bases: enum.Enum

Database Operations Allowed

Downgrade = 'downgrade'#
Upgrade = 'upgrade'#
Module contents#
Submodules#
dvc.version module#
Module contents#

Tests Package#

tests package#

Subpackages#
tests.assets package#
Module contents#
tests.test_dvc package#
Subpackages#
tests.test_dvc.test_app package#
Subpackages#
tests.test_dvc.test_app.test_cli package#
Subpackages#
tests.test_dvc.test_app.test_cli.test_commands package#
Subpackages#
tests.test_dvc.test_app.test_cli.test_commands.test_database package#
Submodules#
tests.test_dvc.test_app.test_cli.test_commands.test_database.test_backend module#
Module contents#
Submodules#
tests.test_dvc.test_app.test_cli.test_commands.test_config module#
tests.test_dvc.test_app.test_cli.test_commands.test_main module#
tests.test_dvc.test_app.test_cli.test_commands.test_main.test__version__only_contain_semver()[source]#

Test dvc version command output version in SemVer format

tests.test_dvc.test_app.test_cli.test_commands.test_sql module#
Module contents#
Module contents#
Module contents#
tests.test_dvc.test_core package#
Submodules#
tests.test_dvc.test_core.test_config module#
class tests.test_dvc.test_core.test_config.TestConfigFileWriter[source]#

Bases: object

pytestmark = [Mark(name='unit', args=(), kwargs={})]#
test__write_dummy_user_configuration(dummy_user_configuration_with_supported_db_flavour, dummy_absent_config_file_path)[source]#

GIVEN a non-existing config file path WHEN ConfigFileReader.user_config is called THEN check dummy user configuration is returned

class tests.test_dvc.test_core.test_config.TestConfigReader[source]#

Bases: object

pytestmark = [Mark(name='unit', args=(), kwargs={})]#
test__when_both_config_file_and_env_var_and_absent__raise_environment_variables_not_set_exception(dummy_absent_config_file_path)[source]#

GIVEN a dummy config file with dummy user configuration, WHEN ConfigFileReader.user_config is called THEN check dummy user configuration is returned

test__when_config_file_is_absent_but_env_var_present__return_expected_user_config_from_env_var(dummy_user_configuration_with_supported_db_flavour, dummy_absent_config_file_path_with_env_var, monkeypatch)[source]#

GIVEN a dummy config file with dummy user configuration, WHEN ConfigFileReader.user_config is called THEN check dummy user configuration is returned

test__when_config_file_is_persent__return_expected_user_config_from_config_file(dummy_user_configuration_with_supported_db_flavour, dummy_existing_config_file_path)[source]#

GIVEN a dummy config file with dummy user configuration, WHEN ConfigFileReader.user_config is called THEN check dummy user configuration is returned

class tests.test_dvc.test_core.test_config.TestDatabaseConnectionFactory[source]#

Bases: object

pytestmark = [Mark(name='unit', args=(), kwargs={})]#
test__pass_user_credentials_to_connect_as_kwargs(dummy_config_file_reader_with_supported_db_flavour, dummy_pgconn)[source]#

(Currently test postgres specifically) GIVEN patched psycopg2.connect WHEN DatabaseConnectionFactory.conn is called THEN check psycopg2.connect is called once and with expected args

test__raise_requested_database_not_supported_exception(dummy_config_file_reader_with_unsupported_db_flavour)[source]#

GIVEN a fake database flavour WHEN DatabaseConnectionFactory.validate_requested_database_flavour is called THEN assert RequestedDatabaseFlavourNotSupportedException is raised

class tests.test_dvc.test_core.test_config.TestDatabaseRevisionFilesManager[source]#

Bases: object

pytestmark = [Mark(name='unit', args=(), kwargs={})]#
test__get_target_database_revision_files_by_pointer(dummy_config_file_reader_with_supported_db_flavour, current_database_version, candidate_database_revision_files, steps, expected_database_revision_files, expected_exception)[source]#
tests.test_dvc.test_core.test_logger module#
class tests.test_dvc.test_core.test_logger.TestSetRootLoggingLevel[source]#

Bases: object

Test Set RootLoggingLevel as a decorator

test__when_config_file_and_env_var_are_absent__set_to_default_logging_level(any_func, dummy_user_configuration_with_supported_db_flavour, dummy_absent_config_file_path)[source]#

GIVEN config file is absent and no env var is set WHEN SetRootLoggingLevel is called THEN use default logging level

Parameters
  • any_func

  • dummy_user_configuration_with_supported_db_flavour

:param dummy_absent_config_file_path :return:

test__when_config_file_is_absent_but_env_var_is_present__set_to_user_defined_logging_level(any_func, dummy_user_configuration_with_supported_db_flavour, dummy_absent_config_file_path_with_env_var)[source]#

GIVEN config file is absent but env vars are set WHEN SetRootLoggingLevel is called THEN use the env vars’ logging levels

Parameters
  • any_func

  • dummy_user_configuration_with_supported_db_flavour

:param dummy_absent_config_file_path_with_env_var :return:

test__when_config_file_is_present_but_env_var_is_absent__set_to_config_file_logging_level(any_func, dummy_user_configuration_with_supported_db_flavour, dummy_existing_config_file_path)[source]#

GIVEN config file is present and no env vars are set WHEN SetRootLoggingLevel is called THEN use the config file’ logging level

Parameters
  • any_func

  • dummy_user_configuration_with_supported_db_flavour

  • dummy_existing_config_file_path

Returns

tests.test_dvc.test_core.test_logger.any_func() Callable[source]#

Return a function which accepts any args and kwargs, but does nothing :return:

tests.test_dvc.test_core.test_regex module#
class tests.test_dvc.test_core.test_regex.TestGetMatchedFilesInFolderByRegex[source]#

Bases: object

pytestmark = [Mark(name='unit', args=(), kwargs={})]#
test__get_matched_files_in_folder_by_regex__assert_number_sql_files(dummy_regex_files_folder_with_correct_files_names, file_name_regex, expected_num_matched_files_paths)[source]#

GIVEN a dummy folder with dummy files WHEN get_matched_files_in_folder_by_regex is called with a certain regex THEN the returned paths should match the regex

tests.test_dvc.test_core.test_struct module#
class tests.test_dvc.test_core.test_struct.TestDatabaseRevisionFile[source]#

Bases: object

test_database_revision_files_comparison(file_1, file_2, predicate, expected)[source]#
test_valid_database_revision_files(sql_file_name: str, expectation)[source]#
class tests.test_dvc.test_core.test_struct.TestDatabaseVersion[source]#

Bases: object

test_valid_dummy_database_revision_files_with_order(target_database_version: dvc.core.struct.DatabaseVersion, current_database_version: dvc.core.struct.DatabaseVersion, expected_dummy_database_revision_files: List[dvc.core.struct.DatabaseRevisionFile])[source]#
Module contents#
Module contents#
Submodules#
tests.conftest module#
Module contents#

Test Report#

  • See latest test report: here

Change Log#

0.5.0#

  • Added –dry-run to both dvc db upgrade and dvc db downgrade. When set to True, the CLI will stop before the execution of SQL files

  • For DatabaseRevisionFilesManager class:
    • Added tests for i. getting files by pointer and ii. getting files by steps methods

    • Made both methods use the same exception handling raise_for_status() method

  • Added option to change logging level in config file and env variables

  • Added demo files and step-by-step guide under in the feature pages

  • Added dvc cfg show to show some configurations

0.4.0#

  • Added dynamic generation of src package and tests package documentation in docs with sphinx-apidoc.

  • Added –head to dvc db upgrade and –base dvc db downgrade respectively

0.3.3#

  • Added CI testing for binaries generated for 3 Oses. For Windows, Ubuntu Linux and Mac, run the below in the CI pipeline:

    • Build binary

    • Run postgres server. Run all DVC commands against it.

    • Push to Github release

0.3.2#

  • Included ./setup.cfg to Dockerfile. Fixed missing .sql files in Docker Image to Dockerhub

0.3.1#

  • Moved package_data to setup.cfg. Fixed missing .sql files in PyPI.

0.3.0#

  • Added –steps and –confirm flags to dvc db upgrade and dvc db downgrade

  • Added dunder methods for the below classes:
    • __le__, __gr__, __eq__ for DatabaseRevisionFile

    • __add__, __sub__, __eq__ for DatabaseVersion

  • Codified the below relationship with dunder methods:
    • DatabaseVersion - DatabaseVersion = [DatabaseRevisionFiles]

    • DatabaseVersion + DatabaseRevisionFile = DatabaseVersion

  • Removed dvc sql generate, as that is Files System related.

  • Added diagram to illustrate DatabaseVersion and DatabaseRevisionFile

0.2.1#

  • Refactored Github workflows. Separated the below components from Github Workflows
    • Running pytest

    • Building and pushing python library to PyPI

    • Building and Pushing artifacts to Readthedocs

    • Building and pushing to Dockerhub

    • Building and pushing Linux, Mac and Windows binaries to Github Releases

0.2.0#

  • Included help text and documentation URL in the CLI.

  • Followed SemVer more closely. Bumped minor version with added feature and bump patch for bug fixes. Switched to using tag (not branch) for releases.

0.1.16#

  • Created binaries for windows, mac and linux with pyinstaller. added sql files to the binaries.

  • Removed confirmation for both upgrade and downgrade command (i.e. dvc db upgrade and dvc db downgrade)

0.1.15#

  • Update Dockerfile. Changed instruction CMD to ENTRYPOINT for dvc command.

  • Added pages to documentation using sphinx

  • Created graphs using graphviz and dots

  • Added Github Action workflows to automate the below:
    • generating pytest report artifacts

    • pushing both artifacts and .rst files to Readtheedocs for building documentation

0.1.14#

  • Created ConfigReader, ConfigFileWriter and other objects to hold states.

  • Added the option to pass configuration as environment variables.

  • Added unit tests and integrations tests (for postgres)

  • Created Dockerfile to containerise the tool

  • Created docker-compose.yml file for demonstration purposes. Created demo_assets to be attached as volume to docker-compose containers.

  • Added Github Action workflows to automate the below:
    • Pushing Docker image (with different tags) and readme to Dockerhub.

    • Pushing the tool to PYPI

Author#

Author

About Me#

  • Language enthusiast

  • Fan of automation via technology

Contact#

Contact me via:

Summary#

Compute (Application) and Storage (Database) are decoupled.

When you make changes to your application code, you should also mke changes to your database. In other words, you probably want to version control both your application code and your database. Without version controlling both, any changes in either side can cause incompatibility issues and break the entire service as a whole.

Use DVC now to version control your database!

Description#

Database Version Control (DVC) is a CLI utility which version controls your database in the following ways:

  • Generate metadata table(s) in your database;

  • For each SQL script applied, update the metadata table(s);

  • Exposes the metadata via CLI commands.

Benefits#

  • Rich metadata is available in the database. The database can be directly queried with SQL for both historical and current database versions.

  • Only plain SQL files are accepted. No extra abstraction layer as is generally available in ORM.

More information#

Supported Databases#