“On the whole integration databases lead to serious problems because the database becomes a point of coupling between the applications that access it. This is usually a deep coupling that significantly increases the risk involved in changing those applications and making it harder to evolve them. As a result most software architects that I respect take the view that integration databases should be avoided.” – Martin Fowler, https://martinfowler.com/bliki/IntegrationDatabase.html
I have learnt this lesson painfully in the past. The clearest example was an application exposing a new API to replace an older one in a large monolithic app. The desire was to build a model that better represented the domain. The implementer was going to build it as a new Rails application: it would be the new API and would present a better view of the system. At the same time, the assumption was that this API would not change “too much” after it was released. So when reviewing and discussing the solution and how it would be maintained, the fateful words were spoken: “it is not going to change, so let’s move forward and not worry about maintainability now”.
The decision in this instance revolved around sharing the model as a Gem between the two applications, or sharing the data as database views and having the new application consume those views. Sharing a model as a Gem for a concrete implementation of data is not a great idea: every consumer must always have the latest gem deployed, because the gem is the latest representation of the database schema. Any time the schema changes, every consumer containing the gem must be redeployed – even if the change has no bearing on that consumer – so that they all stay in sync and interact with the data correctly. This is particularly true if the model includes code to interpret what is in the database columns meaningfully.
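To illustrate why that interpretation code is the dangerous part, here is a minimal sketch of the kind of model class that might ship in a shared gem. All the names (Order, status_code, the code-to-symbol mapping) are hypothetical, not from the real project:

```ruby
# Hypothetical model class shipped in a shared gem. The logic that
# interprets the raw database column lives here, so every consuming
# application must run the same gem version to read the data correctly.
class Order
  # Assumed mapping for the integer stored in an orders.status_code column.
  STATUS_CODES = { 0 => :pending, 1 => :paid, 2 => :shipped }.freeze

  attr_reader :status_code

  def initialize(status_code)
    @status_code = status_code
  end

  # Consumers call this instead of reading the raw column. If the schema
  # later adds status_code 3, any app still bundling the old gem quietly
  # misreads the new rows as :unknown.
  def status
    STATUS_CODES.fetch(status_code, :unknown)
  end
end
```

The coupling is easy to see: adding a new status code is a schema-plus-gem change, and every consumer must pick up the new gem release before the data is meaningful to it.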
In this instance we chose to share the data as views. If multiple applications owned the schema we could get into trouble managing migrations from multiple sources, so we made the primary application the sole owner of the database schema. That meant any view change required for the new API had to happen in the primary application first, and be deployed to live before the API code could consume it.
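As a sketch, owning the view in the primary application might look like the Rails-style migration below. The view name, columns, and migration class are all illustrative, and the small ActiveRecord stub at the top exists only so the sketch runs outside a Rails app:

```ruby
# Minimal stub so this sketch runs standalone; in a real Rails app the
# migration inherits from the real ActiveRecord::Migration and `execute`
# sends the SQL to the database.
module ActiveRecord
  class Migration
    def self.[](version)
      Class.new(self)
    end

    def execute(sql)
      sql
    end
  end
end unless defined?(Rails)

# Illustrative migration living in the primary application, which owns
# the schema. Any change the new API needs to this view must be made
# here and deployed to production before the API code can rely on it.
class CreateApiOrdersView < ActiveRecord::Migration[7.0]
  VIEW_SQL = <<~SQL.freeze
    CREATE OR REPLACE VIEW api_orders AS
      SELECT id, status_code, placed_at
      FROM orders
      WHERE deleted_at IS NULL
  SQL

  def up
    execute VIEW_SQL
  end

  def down
    execute "DROP VIEW IF EXISTS api_orders"
  end
end
```

Note the deployment ordering this forces: rolling the primary application back also rolls the view back, regardless of which version of it the API was written against.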
Recall the “it’s not going to change” comment. Six months later it turned out that we thought this API was useful and started to use it – both directly and from a search service. Long story short, we did change the data a lot, and this design choice became a real pain. Changes needed only for the API’s data caused multiple repositories to change and be deployed. Deployment could be painful, as the different database schemas were sometimes out of sync: deploying an older version of the main application would downgrade the views that the API consumed. Testing could be painful, as we needed a matching test database built from the main application so that the views were up to date in the test schema. There were lots and lots of little niggles due to the coupling of the implementation.
The worst part of this choice was that when people moved onto the project they struggled to understand the coupling. They would forget the process, because too many things were solved by “we should remember to do XX”. The solutions were too often “humans be better”, rather than systems providing feedback when the humans inevitably did the wrong thing. But it is hard to provide automated feedback when changing code in one repository affects another.
When we did this I was working with the best group of agile software developers I have worked with, and we owned all parts of the code accessing the database – yet it still caused us a lot of pain. It is hard to get off this path once you have started down it.
“As a result most software architects that I respect take the view that integration databases should be avoided” – Martin Fowler
In retrospect, avoiding integration databases really does continue to feel like the smarter choice.