Category Archives: Software Development

JAOO Brisbane: Goldilocks and the Concurrent Processes

Today I attended the first day of JAOO Brisbane. A pleasant diversion from the everyday, a chance to catch up with ghosts of workplaces past, and an opportunity to see presentations by some IT thought leaders.

Erik Meijer started the day wanting to make a case for “fundamentalist functional programming”. The IT community has reached a crisis point of distributed systems and multi-core computers that is not solved by present-day programming languages. A (the?) primary issue to be solved is eliminating hidden side-effects. Stating side-effects explicitly is an enabler for implicit concurrency, rather than, say, the explicit world of threads in Java. Hopefully that is a reasonable summary – the talk was impressive enough just for the journey that we went on.

Finishing the day were Erik Dörnenburg and Martin Fowler searching for design that is just right. There was some discussion of essential and accidental complexity. Essential complexity is required by the problem you are facing; accidental complexity is required by the approach you take. Solutions with too much simplicity are not tasty and neither are solutions with too much complexity. Getting your design just right is primarily a craft. It may be helped by things like Domain Driven Design, or by removing or delaying irreversible architecture choices.


Choose Your Poison: Does business logic belong in the database?

There are debates about where to locate the business logic in a software application. Some say the only place is in some middle language; others argue that the database is the logical home. Once upon a time I guess no-one gave it much thought: you built a piece of software and it ran on the server, and anyone who wanted to use it logged into the server. Or you sent the software off to a user to install on their computer.

Then came two tier applications with part of the software installed on a PC and part residing on a server somewhere. Where did the business logic go? Probably with the piece on the PC, causing heartache whenever it had to be changed. So, the next intelligent move was to centralise the business logic on the server – what was on the server? Probably just the database.

N-tiered applications brought layers separating the storage of data, access to the data, business domain, presentation logic and user interface. Why? Because it means you can change one of the components without having to affect all the other pieces of the software puzzle.

So, there are choices as to where you try to place most of the business logic. In practical terms this equates to a choice between your database and some other language that extracts data from the database.

Some reasons to choose the middle language would be:

  • Your software is meant to work with as many databases as possible.
  • You have no idea about how to use database features like stored procedures.
  • You believe that implementing the business logic in a middle language provides a better separation of concerns or a looser coupling.
  • You believe that the database is only there for data persistence.
  • You believe that the middle language is a better technical choice for manipulation of data. That is, Data Access Objects (and possibly the database schema) can be generated, and Object-Relational Mapping (ORM) frameworks can be fully used.
  • The database is only one of many data sources for your software.
  • It is important to create a solution as quickly as possible at the possible expense of later maintainability.
  • There is a higher probability that your database of initial choice will change to something else.

Some reasons to choose the database would be:

  • Your software is meant to work with as many middle languages/interfaces as possible.
  • You believe in having the business logic as close to the data as possible.
  • You believe in making use of database features beyond simple data storage (particularly if you paid a lot of money for it).
  • You believe that the database is a better technical choice for manipulation of data. That is, stored procedures are used to abstract the implementation of data storage, complex queries can be used and are tuned by database experts, unit testing of database code is performed.
  • There is a higher probability that your middle language of initial choice will change to something else.

I believe that the object-relational impedance mismatch should be acknowledged and managed at the DAO interface, regardless of the use of database features – rather than subverting either the object model or the relational model. Except when the solution indicates otherwise.

My opinion is that the object model and the relational model should both be first-class citizens in your software. Having a stored procedure layer in the database supports the loose coupling of the models (a sketch of this idea follows the list below). Recognising the importance of both aspects of your software supports:

  • better use of the native capabilities of your tool choices
  • better response to change.
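
A minimal sketch of that idea (the procedure, table, column and class names here are hypothetical, not from any particular system): the object side sees only a DAO interface, while the implementation delegates the relational work to a stored procedure.

import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

public interface CustomerDao {
  Customer findCustomer(Long customerId);
}

public class CustomerDaoJdbc implements CustomerDao {

  private final DataSource dataSource;

  public CustomerDaoJdbc(DataSource dataSource) {
    this.dataSource = dataSource;
  }

  public Customer findCustomer(Long customerId) {
    // The relational detail (joins, tuning, schema changes) stays behind the
    // hypothetical find_customer procedure; the object model only sees a Customer.
    try (Connection connection = dataSource.getConnection();
         CallableStatement call = connection.prepareCall("{ call find_customer(?) }")) {
      call.setLong(1, customerId);
      try (ResultSet resultSet = call.executeQuery()) {
        if (!resultSet.next()) {
          return null;
        }
        Customer customer = new Customer(); // assumed simple domain object with setters
        customer.setCustomerId(resultSet.getLong("customer_id"));
        customer.setName(resultSet.getString("customer_name"));
        return customer;
      }
    } catch (SQLException e) {
      throw new RuntimeException("find_customer failed", e);
    }
  }
}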

There are some passionate views about this topic, a few starters to follow up are:

Object/Relational Mapping is the Vietnam of Computer Science. It represents a quagmire which starts well, gets more complicated as time passes, and before too long entraps its users in a commitment that has no clear demarcation point, no clear win conditions, and no clear exit strategy.

Ted Neward: The Vietnam of Computer Science

For modern databases and real world usage scenarios, I believe a Stored Procedure architecture has serious downsides and little practical benefit. Stored Procedures should be considered database assembly language: for use in only the most performance critical situations.

Coding Horror: Who Needs Stored Procedures, Anyways?

Martin Fowler: Domain Logic and SQL


The Documentation Reflex

We can find key information. Key information is whatever anyone needs to do their work.

So what? What can we do to achieve this blissful informed state? What are you going to do about it?

People have many opportunities to document. We make the choice to document, or not, all the time…

  • Making a decision, evaluating options
  • Discussing requirements with a user
  • Today’s task list
  • While programming
  • Asking how to use an application’s feature

So when do we choose to document? Usually when we are forced to, or when the ‘doing’ of a task is assisted by a document…

  • Recording pros and cons for various options
  • Drawing a user interface layout on a whiteboard
  • Writing your task list on a post-it note
  • Writing pseudo code to explore design options
  • Emailing a question to application support

So when do we want to keep a document? There is no easy answer – something will be useful if it is ever used. Beyond the base documentation requirements of our workplace, we are empowered to capture information based on our judgement of importance, balancing cost and value.

What does it take to move documentation, that was functional during a task, to something suitable for keeping? Effort.

  • Recording options in a Word document.
  • Having a whiteboard that will email its user interface drawing to you (so you can attach it to a change request).
  • Changing from a post-it note to task management software.
  • Writing pseudo code in your development tool as code comments.
  • Posting a question to an application forum.

Ideally we want to reduce the effort overhead to zero so that we have a ‘documentation reflex’ – documentation that happens through work habits. Ask yourself if what you are doing is worth documenting. If it is, are you capturing it in a way that minimises your effort to document it?

Encouraging updates

The effort to change documents should also be minimal. Tools can help or hinder:

  • If I have the document open for edit, does it lock out other potential updaters?
  • Is a history of changes recorded – can you revert to a version without all those mistakes you just made?

In most cases authorship should be shared so that everyone feels able to make a change. Adding ceremony to a document (eg prescribing a certain format, or requiring a set of signatures) comes at a cost to ease of update. It is often the case that these things are added to stop a document from being updated – what is your change control policy? Only secure things from change for a reason, because change is not inherently bad.

Specific requirements

In addition to simple opportunities, we may have requirements for specific documentation. These requirements are usually more formal, and are made evident through business or process requirements…

  • Production Change Request
  • Application training

These become project deliverables in their own right – you just have to sit down and put effort into creating them.

Created vs evolving documents

When capturing documents, it is important to note that some are created, and others evolve. Some documentation will record a moment in time (like a diary), for example: options analysis, a decision, or meeting minutes. Whereas others will need constant change, for example: application training, comments within the source code, or a process checklist.

There is a grey area for what is a ‘created’ or ‘evolving’ document. For example the user interface design captured via a whiteboard. If the user requests a change, should the already documented user interface be revisited, or just the change request be captured? The choice in this case comes back to what you have identified as important to capture. Do you accept the initial design and change requests as diary-like documents, with the built system reflecting the current user interface? Or do you need to have an up-to-date model of the user interface?

Making the choice to have an evolving document is more expensive. A guideline can be to make a ‘created’ document the default output, then only evolve that document as a considered decision.

Summary

  • Capture key information through your work practices.
  • Minimise the effort and ceremony to capture information.
  • Create documents by default, evolve them by choice.

How much is enough documentation?

What really constitutes a helpful set of documents for software development? That is possibly a poor choice of words as there is not even a need to restrict ourselves to formal ‘paper’ documents for communication.

Have you ever heard:

  1. "You cannot trust the documentation because it is out of date."
  2. "I’ll need to ask a developer to look at the source code and get back to you on the definition for that business rule."

These statements probably indicate dysfunctional documentation, but not without cause – there are a number of challenges:

  • As a project iterates to maturity, things change. Keeping an up-to-date, extensive set of documents directly reduces a team’s ability to deliver software.
  • Putting little or no effort into documentation creates a long term problem for the people who would like to know key information.
  • What documentation is valuable?
  • How should documentation be captured?
  • How do you find a piece of documentation?

Looking at those challenges we can probably state the communication vision:

We can find key information.

So what is key information?

Key information is whatever anyone needs to do their work. Quite a broad scope, and probably the point of most disagreement in various documentation strategy discussions. How do we agree on a balancing point for the documentation content? It is too expensive to capture everything that anyone might ever need. So, how much will be captured so that it does not unduly hinder what we are doing now…or next year?

A helpful question to ask is: Who is the audience and what is their likely need?

OK, what tactics can be employed to capture key information?

  • Construct a team culture that requires a number of people to know it.
  • Video – a demo, someone talking about it, …
  • Record (audio) someone talking about it.
  • Photograph it (works well for those whiteboards that don’t print).
  • Write it down.

Fine, we have our key information and it is valuable, how do I find it?

  • I already know it.
  • I know where it is.
  • I can ask someone who knows the answer.
  • I can ask someone who knows where the answer is.
  • We have conventions on where to store information artifacts and I can browse them manually.
  • We employ technology to search for it (eg a search engine like Google).

An example:

A software system is developed and maintained.

Identified audiences and their needs:

  • Software developers: high level overview, business requirements/priorities, system architecture, domain model/processes, business rules, source code, key design decisions
  • Business owners: high level overview, domain model/processes, business rules
  • Software users: training, system help

Communication Strategy:

To facilitate communication, an intranet site is created which includes automatically produced documents (eg automated test output, automatic code documentation); a wiki is set up to store virtual artifacts; and a forum is set up for the system users to ask questions and provide general feedback. A search engine indexes the whole lot, so the team can find information.

  1. High level overview: video the business owners presenting their hopes for the system, write a vision for the project (possibly already part of a business case document)
  2. Business requirements/priorities: record simple statements of key requirements, obtain regular access to key business people, provide constant feedback to key business people, provide an issue tracking system that captures issues and comments
  3. System architecture: produce architectural diagrams with technical and business viewpoints
  4. Domain Model/Process: produce domain model and process diagram(s)
  5. Business Rules: write up business rules
  6. Source Code: write source code with comments that may be extracted to a documentation system (for example Javadoc – a small example follows this list).
  7. Key Design Decisions: write Technical Memos focusing on the decision
  8. Training/System help: none required because the system is so intuitive 😉
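
For example (the rule and method here are hypothetical, not from the system above), a Javadoc comment written while coding is picked up by the generated documentation for free:

/**
 * Calculates the discount for an order.
 * <p>
 * Business rule (hypothetical): orders over $1000 receive a 5% discount.
 *
 * @param orderTotal the order total before any discount
 * @return the discount amount
 */
public BigDecimal calculateDiscount(BigDecimal orderTotal) {
  if (orderTotal.compareTo(new BigDecimal("1000")) > 0) {
    return orderTotal.multiply(new BigDecimal("0.05"));
  }
  return BigDecimal.ZERO;
}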

Note the focus is on valuable communication, rather than extensive documentation.


JDBC ResultSet Mapper

In Building a better BeanProcessor I discussed some ideas for reflection mapping of a ResultSet to an Object.

This time around we can look at a solution design.

Diagram

Here is a UML diagram of the solution (click on the image for a bigger, clearer picture).

Note that the diagram contains a couple of example extensions for the DBUtils BeanProcessor and Spring RowCallbackHandler. One of the requirements for the ResultSetMapper is that it can easily be plugged into these bigger frameworks.

ResultSetMapper UML

Basic Mapping

Data mapping will occur against ResultSet columns and Object fields.

By default the Object field will require the @MapToData annotation, though this requirement may be turned on/off with ResultSetMapper.setAnnotationRequired(true/false). Why turn it off? So there is no requirement to add an empty ‘@MapToData’ annotation to every mapped field. Why turn it on? So the source code carries a documentary ‘@MapToData’ annotation indicating that some magic is being performed on the field.

The mapping will use a NameMatcher to compare the field name with the column name (unless a MapToData.columnAliases value is specified). A default NameMatcher – NameConverter – will use simple camelCase to under_score matching.
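
A sketch of what that default matching could look like (the NameMatcher signature here is an assumption of this design, not an existing API):

public interface NameMatcher {
  boolean matches(String fieldName, String columnName);
}

// Default matcher: compares a camelCase field name with an under_score column name,
// eg field jellyId matches column jelly_id.
public class NameConverter implements NameMatcher {

  public boolean matches(String fieldName, String columnName) {
    return toUnderScore(fieldName).equalsIgnoreCase(columnName);
  }

  private String toUnderScore(String camelCase) {
    StringBuilder result = new StringBuilder();
    for (char c : camelCase.toCharArray()) {
      if (Character.isUpperCase(c)) {
        result.append('_').append(Character.toLowerCase(c));
      } else {
        result.append(c);
      }
    }
    return result.toString();
  }
}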

Data mapping will occur, in order of priority (a reflection sketch follows the list):

  1. Via a specified setter method – MapToData.setter
  2. Via the JavaBean standard default setter
  3. Directly into the object field
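
A sketch of that priority order using reflection – a fragment that would sit inside the ResultSetMapper, assuming a MapToData.setter() attribute naming the setter; error handling is omitted:

// Inside the mapper: apply one column value to one field of the target object.
private void applyValue(Object target, Field field, MapToData mapping, Object value)
    throws Exception {
  Class<?> targetClass = target.getClass();

  // 1. A setter named explicitly on the annotation.
  if (mapping != null && mapping.setter().length() > 0) {
    Method setter = targetClass.getMethod(mapping.setter(), field.getType());
    setter.invoke(target, value);
    return;
  }

  // 2. The JavaBean standard setter, eg setJellyType(..) for field jellyType.
  String setterName = "set" + Character.toUpperCase(field.getName().charAt(0))
      + field.getName().substring(1);
  try {
    Method setter = targetClass.getMethod(setterName, field.getType());
    setter.invoke(target, value);
    return;
  } catch (NoSuchMethodException noStandardSetter) {
    // fall through to direct field access
  }

  // 3. Directly into the object field.
  field.setAccessible(true);
  field.set(target, value);
}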

Inheritance

Inheritance means that many target classes may be supplied to the ResultSetMapper, so there must be some way of choosing which target to create. To allow selection of target objects, an ObjectValidator will be used – every type of target class will be constructed from the ResultSet and passed to the ObjectValidator. (The default validator always returns true.)

The advantage of this approach is that the programmer only deals in their domain object, but it does require an overhead of creating extra objects for validation.
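
A sketch of the validator idea (the ObjectValidator signature is an assumption of this design): each candidate target class is populated from the row, and the first candidate accepted by its validator wins.

public interface ObjectValidator<T> {
  // Return true if this populated candidate is the correct target for the row.
  boolean isValid(T candidate);
}

// Example: only accept a JellyBean when the row's jelly_type column said so.
// ("BEAN" is a hypothetical jelly_type value.)
public class JellyBeanValidator implements ObjectValidator<JellyBean> {
  public boolean isValid(JellyBean candidate) {
    return "BEAN".equals(candidate.jellyType);
  }
}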

Aggregates

Solving this problem requires answers to a couple of technical questions:

  • How to specify that an Object is an aggregate target (eg MyDataObject), rather than ‘Just Another Object’ (eg String, MyTransientObject)? Answer: MapToData.isAggregateTarget = true/false
  • How to have multiple fields of the same class with different business meanings (eg MyDataObject start; MyDataObject end;)? Answer: MapToData.columnPrefix or MapToData.columnSuffix

Example

So the example domain from the previous post would now be:

public abstract class Jelly { 
  @MapToData
  Long jellyId; 
  @MapToData
  String jellyType; 
} 
public class JellyAttribute { 
  @MapToData
  String name; 
  @MapToData
  String value; 
} 
public class JellyCompany {
  @MapToData
  String companyName;
  @MapToData
  String address;
}
public class JellyBean extends Jelly { 
  @MapToData
  String targetMarket;
  @MapToData
  BigDecimal weight;
  @MapToData (columnPrefix="flavour_", isAggregateTarget = true)
  JellyAttribute flavour; 
  @MapToData (columnPrefix="colour_", isAggregateTarget = true)
  JellyAttribute colour; 
  @MapToData (isAggregateTarget = true)
  JellyCompany company;
}
public class JellyCup extends Jelly {
  @MapToData
  String productName;
  @MapToData ( columnAliases = { "cup_volume" } )
  BigDecimal volume;
  @MapToData (isAggregateTarget = true)
  JellyAttribute shape;
}
public class JellyCupAndSpoon extends JellyCup {
  @MapToData
  String spoonMaterial;
  @MapToData
  BigDecimal spoonLength;
}

Building a better BeanProcessor

At the moment I’m coding up a ‘BeanMapper’. Something to automatically transform a ResultSet into a JavaBean object. (Why would you want to do this? Writing and maintaining less code will probably do as a worthwhile cause.)

The Apache Commons DbUtils project has a BeanProcessor that does a basic job. It is good for a simple mapping; however, it does not meet the following challenges:

  1. Inheritance – a ResultSet may map to different Objects of the same parent over different rows
  2. Aggregates – a ResultSet may contain the data to build some/all of the aggregate objects
  3. Multiple Objects – a ResultSet may contain the data for many objects in one row

Challenge #1 can occur whenever inheritance is used; Challenges #2 and #3 are linked with improving performance (though I’m finding it difficult to imagine a scenario for #3 right now).

As well as the challenges there are some improvements that may be made:

  • The current BeanProcessor mapping works with the JavaBean standard – getColumnName/setColumnName will map to columnname in the database. A solution that also works with Annotations and/or Object fields would be helpful.
  • Automatic name mapping of the object ‘standard’ (camelCase) to the database ‘standard’ (under_score) would also be helpful.
  • Since I’m a user of the Spring framework, something that I can use as a default RowCallbackHandler would also be helpful (a sketch of this idea is at the end of this post).

So what would my ideal mapping solution look like in operation?

I would have a JDBC DAO that maps a variable ResultSet automatically to a set of target classes…

public interface JellyDAO {
  public Jelly find(Long jellyId);
  public List<Jelly> find(FindRequest findRequest);
} 
public class JellyDAOJdbc implements JellyDAO {

  List<Class> targetClasses;
  List<Class> aggregateClasses;
  BeanMapper beanMapper;

  public JellyDAOJdbc() {
    // The ResultSet may contain data for 
    // any of the following target classes.
    targetClasses = new ArrayList<Class>();
    targetClasses.add(JellyBean.class);
    targetClasses.add(JellyCup.class);
    targetClasses.add(JellyCupAndSpoon.class);
    // The Target classes may contain the following classes
    // that may also come across in the ResultSet.
    aggregateClasses = new ArrayList<Class>();
    aggregateClasses.add(JellyAttribute.class);
    aggregateClasses.add(JellyCompany.class);
    //
    beanMapper = new BeanMapper(targetClasses, aggregateClasses);
  }

  private Jelly processResultSetRow(ResultSet resultSet) {
    // Map the ResultSet to a Jelly class.
    Jelly jelly = (Jelly) beanMapper.toBean(resultSet);
    return jelly;
  }

  private List<Jelly> finder() {
    List<Jelly> jellyList = new ArrayList<Jelly>();

    // Obtain a ResultSet..

    // Loop through the result set..
    while (resultSet.next()) {
      jellyList.add(processResultSetRow(resultSet));
    }

    return jellyList;
  }

}

The Jelly domain model might be something like…

public abstract class Jelly { 
  Long jellyId; 
  String jellyType; 
} 
public class JellyAttribute { 
  String name; 
  String value; 
} 
public class JellyCompany {
  String companyName;
  String address;
}
public class JellyBean extends Jelly { 
  String targetMarket;
  BigDecimal weight;
  JellyAttribute flavour; 
  JellyAttribute colour; 
  JellyCompany company;
}
public class JellyCup extends Jelly {
  String productName;
  BigDecimal volume;
  JellyAttribute shape;
}
public class JellyCupAndSpoon extends JellyCup {
  String spoonMaterial;
  BigDecimal spoonLength;
}
public interface JellyRepository {
  public Jelly find(Long jellyId);
  public List<Jelly> find(FindRequest findRequest);
}

I would also have a database view or stored procedure with the following signature…

jelly_id, jelly_type, target_market, product_name, 
spoon_material, spoon_length, cup_volume, flavour_name, 
flavour_value, colour_name, colour_value, company_name, 
company_address, shape_name, shape_value
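
And since Spring was mentioned in the improvements list, the same mapper could be wrapped as a RowCallbackHandler – a rough sketch, assuming the BeanMapper from the DAO example above:

import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import org.springframework.jdbc.core.RowCallbackHandler;

// Collects one Jelly per ResultSet row using the BeanMapper sketched above.
public class BeanMapperRowCallbackHandler implements RowCallbackHandler {

  private final BeanMapper beanMapper;
  private final List<Jelly> results = new ArrayList<Jelly>();

  public BeanMapperRowCallbackHandler(BeanMapper beanMapper) {
    this.beanMapper = beanMapper;
  }

  public void processRow(ResultSet resultSet) throws SQLException {
    results.add((Jelly) beanMapper.toBean(resultSet));
  }

  public List<Jelly> getResults() {
    return results;
  }
}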

The Four Horsemen of the Data Apocalypse

The data was no longer required.

First came the horseman named Delete, he said in a thunderous voice, “Your data will soon be gone.”

The user replied, “But what if I want to see it again?”

Next came the horseman named Trash, she said, “Your data will be gone soon, but if you sign up for this simple insurance policy, it can be brought back – if you invoke the policy quickly.”

The user replied, “But I may want it at any time in the future.”

Next came the horseman named Archive, he said philosophically, “Your data will not be available, except when it is.”

The user replied, “But I want it to be available at any time.”

Next came the horseman named Retire, and she said, “You’ve put effort into your data, you don’t want it disappearing. It may not be as active as it once was, but with good planning you may keep it working for you always.”

The user smiled and was happy, “Yes, I want them all.”


Repository or DAO?

A few months ago, while considering the design of a new project…the issue occupying my thoughts was the difference between possible design technologies. Two directions: one, a relational database; two, an object model. Why was it an issue? How are the designs kept in sync? Where does a developer turn first if they want to make a change to the core design of an application that is expressed in many places (at least the relational database and the objects)?

So, I considered the idea of a ‘domain model’, turned it around a few times and thought it could be a good thing. Not only would it be the common language between various technical areas, if it was expressed in business terms, the testers and business people could also be involved in a discussion around it, or use it. Convinced this was something good, I Googled ‘domain model’ and found that 1. yes it was a good idea and 2. someone had written a book about it. I ordered Domain Driven Design by Eric Evans and have been reading through it.

Good book (pity it goes for the fashionable ‘xxx driven xxx’ title, which also sounds confusingly similar to Model Driven Development). It has clarified many things for me, and supported other thoughts, like the ‘Ubiquitous Language’ pattern… “A project faces serious problems when its language is fractured.”

Many standard patterns for domain components are listed and discussed in the book. One pattern is called a Repository. It troubled me because elsewhere in the world of development there is a DAO (Data Access Object) pattern. Why was I concerned? Because the two patterns initially seemed to be the same – so which should be used? This is what I understand now.

Repository Responsibilities

  • Provide a common language to all team members (including business representatives to whom DAO would need constant explanation)
  • Provide a higher level of data manipulation – something that may be common to the data regardless of how it is retrieved.
  • Provide a mechanism to manage existing entities (where a Factory might create you a new one).

DAO Responsibilities

  • Provide data access. Perhaps you could say it is about persistence strategies; underneath a DAO interface there are one or more implementations – eg JDBC, Hibernate.

Example: the domain model might represent plants, one entity may be a Tree which has an associated TreeRepository.

TreeDomainModel

Implementation 1:

There is a database table in the background with TREE_ID, LATITUDE, LONGITUDE, HEIGHT columns. Perhaps we code TreeRepository as an interface which a JDBC DAO implements.
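
A minimal sketch of that arrangement (the table name, Tree setters and exception handling are assumptions for illustration):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

public interface TreeRepository {
  Tree findById(Long treeId);
}

// Implementation 1: the repository contract is fulfilled directly by a JDBC DAO.
public class TreeDaoJdbc implements TreeRepository {

  private final DataSource dataSource;

  public TreeDaoJdbc(DataSource dataSource) {
    this.dataSource = dataSource;
  }

  public Tree findById(Long treeId) {
    String sql = "SELECT LATITUDE, LONGITUDE, HEIGHT FROM TREE WHERE TREE_ID = ?";
    try (Connection connection = dataSource.getConnection();
         PreparedStatement statement = connection.prepareStatement(sql)) {
      statement.setLong(1, treeId);
      try (ResultSet resultSet = statement.executeQuery()) {
        if (!resultSet.next()) {
          return null;
        }
        Tree tree = new Tree(); // Tree with simple setters is assumed here
        tree.setTreeId(treeId);
        tree.setLatitude(resultSet.getBigDecimal("LATITUDE"));
        tree.setLongitude(resultSet.getBigDecimal("LONGITUDE"));
        tree.setHeight(resultSet.getBigDecimal("HEIGHT"));
        return tree;
      }
    } catch (SQLException e) {
      throw new RuntimeException("Find tree failed", e);
    }
  }
}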

Implementation 2:

We now have access to a location service which stores the location of things. We also have a database table with TREEID, LOCATIONID, HEIGHT columns. Perhaps we implement a TreeRepository that accesses both the location service and the JDBC DAO to construct a full Tree object.
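
A sketch of this second arrangement (LocationService, Location and TreeDao are hypothetical), with the repository composing the two sources while callers still see only TreeRepository:

// Implementation 2: the repository combines the location service and a JDBC DAO.
public class CompositeTreeRepository implements TreeRepository {

  private final LocationService locationService; // hypothetical external service
  private final TreeDao treeDao;                  // reads TREEID, LOCATIONID, HEIGHT

  public CompositeTreeRepository(LocationService locationService, TreeDao treeDao) {
    this.locationService = locationService;
    this.treeDao = treeDao;
  }

  public Tree findById(Long treeId) {
    Tree tree = treeDao.findById(treeId); // height plus the location id
    if (tree != null) {
      Location location = locationService.locate(tree.getLocationId());
      tree.setLatitude(location.getLatitude());
      tree.setLongitude(location.getLongitude());
    }
    return tree;
  }
}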


How Unit Testing is changing my coding style

One of the books that I am reading at the moment is “First Things First”; it mentions a cultural difference (yes, a blatant generalisation) between east and west along the lines of:

  • West – if it ain’t broke, don’t fix it
  • East – kaizen (continuous improvement).

This idea probably encapsulates the biggest change unit tests have made to my approach. I will now design and code with confidence that I can improve the code at any time.

That idea allows me to start without (too much) future thought as to what will be required – great for someone who makes use of intuition to program. I have often spent considerable non-programming time proving to myself and others that an idea will work. What does having a unit test have to do with this? It provides a framework where change is expected – sometimes an intuitive idea just does not work. I can rely on tests to prove the idea, or not.

A schedule of work can be broken into smaller pieces, each of which I can focus on individually. The individual focus effect provides a productivity gain. Instead of building the mental castle of an entire source code structure, I can concentrate on the unit in front of me. (If there is a side effect because of what I do, it will be reported when the tests are run – note that I include integration testing in my idea of a ‘unit’ test framework.)

Though I was always willing to make cosmetic code improvements to something that I had personally written, I would regularly reach a point where I would be hesitant. Now, things like cosmetic changes to provide better code understanding, and design changes to better suit the application, are done (almost) without hesitation. It is possible to ensure that the code still works even when I make major structural changes.

The cost in using a unit test framework is of course in writing and maintaining the tests. However, the way I have always worked is: code a bit, test a bit, code a bit more, test a bit more, etc. So there has always been a testing cost in what I write; it is just that now the test has also become a deliverable. There could probably be a debate on which approach costs more – possibly over a project lifetime they may be the same? Plus I like the idea that I can code a bit, test that bit, code a bit more, test it all, code a bit more, test it all, etc.
