To work in an Agile way, change should be expected and easy to manage. How does this look from a database perspective? Anything that you can “CREATE OR REPLACE” works fine – it can be stored in version control and the latest full version checked out to release to any environment. But what about those things that require migration? Like a table, or data.
A simple example could be an existing table with a ‘person_name’ column that contains first and last name separated by a space. You need to:
- create two new columns ‘first_name’ and ‘last_name’
- migrate the data from the ‘person_name’ column
- then remove the ‘person_name’ column.
A simple question right now would be (assuming I have the existing database and a new database with the changes applied): Is there any tool that could generate a difference script between the two database states to perform those actions? To date I have not found anything that could handle the “simple example”.
To further complicate the problem, not only is there is migration of objects over time, there are dependencies between objects at a point in time (eg foreign keys).
Some initial questions:
- Do you put the original table creation script into source code control and keep it up to date? For what reason (it can never be used except to build a ‘blank’ database)?
- Do you have a database “patch” script associated with a release? How will this work in a development environment with multiple developers adding to the script during development, and also wanting to only run the new changes to the script into their own development areas? How do you rollback a database patch? Or how do you complete a database patch that failed halfway through?
- How to manage continuous integration, where each new change to the database patch should be applied as soon as it is checked in?
- Are you interested in seeing the current state of a table creation script?
- Are you interested in seeing the history of changes to a specific table?
- How to ensure that the objects referenced by a foreign key exist before the foreign key creation?
- How do you branch the database DDL code?
I plan to explore these issues in some further blog posts, ending with a possible solution. Keep reading 🙂
by
Hi, just getting my head around Agile DB development. I am sure your example above is given as a simple example but surely you would have to be really dumb to create a field called Person_Name in the first place. This may appear to be an obnoxious comment, but all the examples I see around Agile refactoring are for refactoring really stupid mistakes, eg breaking 1NF as in this case. It seems from what I have read that the Agile method uses refactoring to solve problems caused by really bad data modelling, whereas Agile should be used to avoid these errors in the first place.
I would be interested in your thoughts on this and would be interested in any real world examples of refactoring complex changes that could not have been preconceived at the modelling phase.
Regards
Steven
Re: Person_name…what about simple web registations (name + email)?
It is a subjective call as to whether a change is a real world refactoring or an omission by the original designer. Some examples from my work:
Today I was adding ‘ROWDEPENDENCIES’ to tables that were not created with that clause (an infrastructure technology change provides for a ‘changestamp’ feature that was once implemented via custom code).
I have worked regularly with legacy systems that contain overloaded fields similar to the person_name example. Pretty much an anti-pattern now, but still out there.
Adding new features to a system in a second phase of development (features which require a new set of tables).
Altering existing tables to add extra attributes that were not required in initial project iterations.
Changing the length/precision of a field to reflect the latest information (that was not available or expected in the intial design).
Moving from a 1:1 relationship with an attribute (stored in the base table as another column) to a 1:many relationship (stored as an intermediate relationship table). Again this example was due to a change in the intial requirements.
It can be (and is) argued that all changes should have been identified completely up-front. However, whatever our approach to software development, change happens. We use our design skills and experience to minimise bad data modelling in any case. When using an adaptive approach like agile, it is also important to have the tools to release a little part of the system and build on it. Change in an agile environment with good data modelling can be as simple as an incremental release of database features.