Author Archives: warren

The Documentation Reflex

We can find key information. Key information is whatever anyone needs to do their work.

So what? What can we do to achieve this blissful informed state? How are you going to go about it?

People have many opportunities to document. We make the choice to document, or not, all the time…

  • Making a decision, evaluating options
  • Discussing requirements with a user
  • Today’s task list
  • While programming
  • Asking how to use an application’s feature

When do we choose to document? Usually when we are forced to, or when the ‘doing’ of a task is assisted by a document…

  • Recording pros and cons for various options
  • Draw a user interface layout on a whiteboard
  • Write your task list on a post-it note
  • Pseudo code to explore design options
  • Email a question to application support

When do we want to keep a document? There is no easy answer; something will be useful only if it is ever used. Beyond the base documentation requirements of our workplace, we are empowered to capture information based on our judgement of importance, balancing cost and value.

What does it take to move documentation that was functional during a task to something suitable for keeping? Effort.

  • Recording options in a Word document.
  • Having a whiteboard that will email its user interface drawing to you (so you can attach it to a change request).
  • Change from post-it note to task management software.
  • Write pseudo code in your development tool as code comments.
  • Post a question to an application forum.

Ideally we want to reduce the effort overhead to zero so that we have a ‘documentation reflex’ – documentation that happens through work habits. Ask yourself if what you are doing is worth documenting. If it is, are you capturing it in a way that minimises your effort to document it?

Encouraging updates

The effort to change documents should also be minimal. Tools can help or hinder:

  • If I have the document open for edit, does it lock out other potential updaters?
  • Is a history of changes recorded – can you revert to a version without all those mistakes you just made?

In most cases authorship should be shared so that everyone feels able to make a change. There is a cost to update ease when adding ceremony to a document (eg prescribing a certain format, or requiring a set of signatures). It is often the case that these things are added to stop a document from being updated – what is your change control policy? Only secure things from change with a reason, because change is not inherently bad.

Specific requirements

In addition to simple opportunities, we may have requirements for specific documentation. These requirements are usually more formal, and are made evident through business or process requirements…

  • Production Change Request
  • Application training

These become project deliverables in their own right – you just have to sit down and put effort into creating them.

Created vs evolving documents

When capturing documents, it is important to note that some are created, and others evolve. Some documentation will record a moment in time (like a diary), for example: options analysis, a decision, or meeting minutes. Whereas others will need constant change, for example: application training, comments within the source code, or a process checklist.

There is a grey area for what is a ‘created’ or ‘evolving’ document. For example the user interface design captured via a whiteboard. If the user requests a change, should the already documented user interface be revisited, or just the change request be captured? The choice in this case comes back to what you have identified as important to capture. Do you accept the initial design and change requests as diary-like documents, with the built system reflecting the current user interface? Or do you need to have an up-to-date model of the user interface?

Making the choice to have an evolving document is more expensive. A guideline can be to make a ‘created’ document the default output, then only evolve that document as a considered decision.

Summary

  • Capture key information through your work practices.
  • Minimise the effort and ceremony to capture information.
  • Create documents by default, evolve them by choice.

How much is enough documentation?

What really constitutes a helpful set of documents for software development? That is possibly a poor choice of words as there is not even a need to restrict ourselves to formal ‘paper’ documents for communication.

Have you ever heard:

  1. "You cannot trust the documentation because it is out of date."
  2. "I’ll need to ask a developer to look at the source code and get back to you on the definition for that business rule."

These statements probably indicate dysfunctional documentation, but not without cause. There are a number of challenges:

  • As a project iterates to maturity, things change. Keeping an up to date, extensive set of documents directly reduces a team’s ability to deliver software.
  • Putting little or no effort into documentation creates a long term problem for the people who would like to know key information.
  • What documentation is valuable?
  • How should documentation be captured?
  • How do you find a piece of documentation?

Looking at those challenges we can probably state the communication vision:

We can find key information.

So what is key information?

Key information is whatever anyone needs to do their work. Quite a broad scope, and probably the point of most disagreement in various documentation strategy discussions. How do we agree on a balancing point for the documentation content? It is too expensive to capture everything that anyone might ever need. So, how much will be captured so that it does not unduly hinder what we are doing now…or next year?

A helpful question to ask is: Who is the audience and what is their likely need?

OK, what tactics can be employed to capture key information?

  • Construct a team culture that requires a number of people to know it.
  • Video – a demo, someone talking about it, …
  • Record (audio) someone talking about it.
  • Photograph it (works well for those whiteboards that don’t print).
  • Write it down.

Fine, we have our key information and it is valuable, how do I find it?

  • I already know it.
  • I know where it is.
  • I can ask someone who knows the answer.
  • I can ask someone who knows where the answer is.
  • We have conventions on where to store information artifacts and I can browse them manually.
  • We employ technology to search for it (eg a search engine like Google).

An example:

A software system is developed and maintained.

Identified audiences and their needs:

  • Software developers: high level overview, business requirements/priorities, system architecture, domain model/processes, business rules, source code, key design decisions
  • Business owners: high level overview, domain model/processes, business rules
  • Software users: training, system help

Communication Strategy:

To facilitate communication, an intranet site is created which includes automatically produced documents (eg automated test output, automatic code documentation). A wiki is set up to store virtual artifacts, and a forum is set up for the system users to ask questions and provide general feedback. A search engine indexes the whole lot, so the team can find information.

  1. High level overview: video the business owners presenting their hopes for the system, write a vision for the project (possibly already part of a business case document)
  2. Business requirements/priorities: record simple statements of key requirements, obtain regular access to key business people, provide constant feedback to key business people, provide an issue tracking system that captures issues and comments
  3. System architecture: produce architectural diagrams with technical and business viewpoints
  4. Domain Model/Process: produce domain model and process diagram(s)
  5. Business Rules: write up business rules
  6. Source Code: write source code with comments that may be extracted to a documentation system (for example Javadoc).
  7. Key Design Decisions: write Technical Memos focusing on the decision
  8. Training/System help: none required because the system is so intuitive 😉

Note the focus is on valuable communication, rather than extensive documentation.

JDBC ResultSet Mapper

In ‘Building a better BeanProcessor’ I discussed some ideas for reflection mapping of a ResultSet to an Object.

This time around we can look at a solution design.

Diagram

Here is a UML diagram of the solution.

Note that the diagram contains a couple of example extensions for the DBUtils BeanProcessor and Spring RowCallbackHandler. One of the requirements for the ResultSetMapper is that it can easily be plugged into these bigger frameworks.

ResultSetMapper UML

Basic Mapping

Data mapping will occur against ResultSet columns and Object fields.

By default the Object field will require the @MapToData annotation, though this requirement may be turned on/off with ResultSetMapper.setAnnotationRequired(true/false). Why have it off? There would be no requirement to add an empty ‘@MapToData’ annotation. So, why have it on? The source code contains a documentary ‘@MapToData’ annotation to indicate that some magic is being performed on the field.

The mapping will use a NameMatcher to compare the field name with the column name (unless a MapToData.columnAlias is specified). A default NameMatcher – NameConverter – will use simple camelCase to under_score matching.
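As a rough illustration of that default matching, here is a minimal sketch of a camelCase to under_score converter. The class and method names (NameConverterSketch, toUnderscore, matches) are my own placeholders for this example, not the real NameConverter API, and it assumes column names are compared without regard to case.

```java
/** Hypothetical sketch of camelCase to under_score name matching. */
final class NameConverterSketch {

    private NameConverterSketch() {
    }

    /** Converts a camelCase field name (jellyId) to an under_score column name (jelly_id). */
    static String toUnderscore(String fieldName) {
        StringBuilder column = new StringBuilder();
        for (int i = 0; i < fieldName.length(); i++) {
            char c = fieldName.charAt(i);
            if (Character.isUpperCase(c)) {
                // Start a new word at every upper case letter (except the first character).
                if (i > 0) {
                    column.append('_');
                }
                column.append(Character.toLowerCase(c));
            } else {
                column.append(c);
            }
        }
        return column.toString();
    }

    /** Assumption: databases often report column names in upper case, so ignore case. */
    static boolean matches(String fieldName, String columnName) {
        return toUnderscore(fieldName).equalsIgnoreCase(columnName);
    }
}
```

With this sketch the field spoonLength matches the column SPOON_LENGTH, while volume does not match cup_volume, which is the sort of case where an alias is needed.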

Data mapping will occur, in order of priority:

  1. Via a specified setter procedure – MapToData.setter
  2. Via the JavaBean standard default setter
  3. Directly into the object field
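The priority order above could be sketched with plain reflection as follows. This is only an illustration under my own assumptions: FieldWriterSketch and SampleJelly are made-up names, the named setter is passed in as a String (standing in for MapToData.setter), and only fields and methods declared directly on the target class are considered.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

/** Hypothetical sample target: one field per mapping route. */
class SampleJelly {
    String jellyType;   // no setter at all: written directly
    String flavour;     // has a JavaBean setter
    String colour;      // has a specially named setter

    void setFlavour(String flavour) {
        this.flavour = flavour.trim();
    }

    void applyColour(String colour) {
        this.colour = "colour:" + colour;
    }
}

final class FieldWriterSketch {

    private FieldWriterSketch() {
    }

    /**
     * Writes value into the named field, trying in order: the specified
     * setter, the JavaBean setter, then the field itself.
     * Returns a label describing the route that was taken.
     */
    static String write(Object target, String fieldName, String namedSetter, Object value) {
        Class<?> type = target.getClass();
        try {
            Field field = type.getDeclaredField(fieldName);
            // 1. A setter specified by name.
            if (namedSetter != null && !namedSetter.isEmpty()) {
                invoke(target, type.getDeclaredMethod(namedSetter, field.getType()), value);
                return "named setter";
            }
            // 2. The JavaBean default setter, setXxx.
            String beanSetter = "set" + Character.toUpperCase(fieldName.charAt(0))
                    + fieldName.substring(1);
            try {
                invoke(target, type.getDeclaredMethod(beanSetter, field.getType()), value);
                return "bean setter";
            } catch (NoSuchMethodException noBeanSetter) {
                // No JavaBean setter declared: fall through to the field.
            }
            // 3. Directly into the field.
            field.setAccessible(true);
            field.set(target, value);
            return "field";
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not map field " + fieldName, e);
        }
    }

    private static void invoke(Object target, Method method, Object value)
            throws ReflectiveOperationException {
        method.setAccessible(true);
        method.invoke(target, value);
    }
}
```

Returning the route taken is purely for demonstration; a real mapper would more likely resolve the route once per field and cache it.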

Inheritance

Inheritance requires that many target classes may be supplied to the ResultSetMapper. However, this means that there must be some way of choosing which target to create. To allow selection of target objects, an ObjectValidator will be used – every type of target class will be constructed from the ResultSet and passed to the ObjectValidator. (The default validator always returns true.)

The advantage of this approach is that the programmer only deals in their domain object, but it does require an overhead of creating extra objects for validation.
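To make the selection idea concrete, here is a small sketch. It uses a Map as a stand-in for a ResultSet row so that it runs without a database, and it assumes the first candidate the validator accepts is the one returned; that rule, the jelly_type values and all the ...Sketch names are assumptions for this example rather than the actual design.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical validator: decides whether a constructed candidate fits the row. */
interface ObjectValidatorSketch {
    boolean isValid(Object candidate);
}

abstract class JellySketch {
    String jellyType;
}

class JellyBeanSketch extends JellySketch {
}

class JellyCupSketch extends JellySketch {
}

final class TargetSelectorSketch {

    private TargetSelectorSketch() {
    }

    /** Builds every candidate from the row; returns the first one accepted, or null. */
    static JellySketch select(Map<String, Object> row,
                              List<Function<Map<String, Object>, JellySketch>> builders,
                              ObjectValidatorSketch validator) {
        for (Function<Map<String, Object>, JellySketch> builder : builders) {
            JellySketch candidate = builder.apply(row);
            if (validator.isValid(candidate)) {
                return candidate;
            }
        }
        return null;
    }

    /** Demo: returns the simple name of the class chosen for the given jelly_type. */
    static String demo(String jellyType) {
        Map<String, Object> row = new HashMap<>();
        row.put("jelly_type", jellyType);

        List<Function<Map<String, Object>, JellySketch>> builders = new ArrayList<>();
        builders.add(r -> fill(new JellyBeanSketch(), r));
        builders.add(r -> fill(new JellyCupSketch(), r));

        // Assumed domain rule: the jelly_type column names the concrete class.
        ObjectValidatorSketch validator = candidate ->
                (candidate instanceof JellyBeanSketch
                        && "BEAN".equals(((JellySketch) candidate).jellyType))
                || (candidate instanceof JellyCupSketch
                        && "CUP".equals(((JellySketch) candidate).jellyType));

        JellySketch chosen = select(row, builders, validator);
        return chosen == null ? "none" : chosen.getClass().getSimpleName();
    }

    private static JellySketch fill(JellySketch jelly, Map<String, Object> row) {
        jelly.jellyType = (String) row.get("jelly_type");
        return jelly;
    }
}
```

Note how the validator only ever sees domain objects, which is the advantage described above, and how every rejected candidate is an object created purely to be thrown away, which is the overhead.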

Aggregates

Solving this problem requires answers to a couple of technical questions:

  • How to specify that an Object is an aggregate target (eg MyDataObject), rather than ‘Just Another Object’ (eg String, MyTransientObject)? Answer: MapToData.isAggregateTarget = true/false
  • How to have multiple fields of the same class with different business meanings (eg MyDataObject start; MyDataObject end;)? Answer: MapToData.columnPrefix or MapToData.columnSuffix
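One way to picture how those two answers combine is to work out which column names a target class expects. The sketch below is hypothetical throughout: MapToDataSketch is a cut-down stand-in for the real annotation (only three elements), and the rule that a prefix is prepended and a suffix appended to each under_score field name is my assumption about how the attributes would be applied.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

/** Cut-down, hypothetical stand-in for the MapToData annotation. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface MapToDataSketch {
    String columnPrefix() default "";
    String columnSuffix() default "";
    boolean isAggregateTarget() default false;
}

class AttributeSketch {
    @MapToDataSketch String name;
    @MapToDataSketch String value;
}

class BeanSketch {
    @MapToDataSketch String targetMarket;
    @MapToDataSketch(columnPrefix = "flavour_", isAggregateTarget = true) AttributeSketch flavour;
    @MapToDataSketch(columnPrefix = "colour_", isAggregateTarget = true) AttributeSketch colour;
}

final class AggregateColumnsSketch {

    private AggregateColumnsSketch() {
    }

    /** Lists, in alphabetical order, the column names the type expects. */
    static List<String> columnsFor(Class<?> type) {
        List<String> columns = new ArrayList<>();
        collect(type, "", "", columns);
        Collections.sort(columns);
        return columns;
    }

    private static void collect(Class<?> type, String prefix, String suffix, List<String> columns) {
        for (Field field : type.getDeclaredFields()) {
            MapToDataSketch mapping = field.getAnnotation(MapToDataSketch.class);
            if (mapping == null) {
                continue; // not a mapped field
            }
            if (mapping.isAggregateTarget()) {
                // Descend into the aggregate, carrying its prefix and suffix along.
                collect(field.getType(), prefix + mapping.columnPrefix(),
                        mapping.columnSuffix() + suffix, columns);
            } else {
                columns.add(prefix + toUnderscore(field.getName()) + suffix);
            }
        }
    }

    private static String toUnderscore(String name) {
        return name.replaceAll("([a-z0-9])([A-Z])", "$1_$2").toLowerCase(Locale.ROOT);
    }
}
```

For BeanSketch this yields colour_name, colour_value, flavour_name, flavour_value and target_market, so the two fields of the same class end up reading from different columns.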

Example

So the example domain from the previous post would now be:

// lang java 
public abstract class Jelly { 
  @MapToData
  Long jellyId; 
  @MapToData
  String jellyType; 
} 
public class JellyAttribute { 
  @MapToData
  String name; 
  @MapToData
  String value; 
} 
public class JellyCompany {
  @MapToData
  String companyName;
  @MapToData
  String address;
}
public class JellyBean extends Jelly { 
  @MapToData
  String targetMarket;
  @MapToData
  BigDecimal weight;
  @MapToData (columnPrefix="flavour_", isAggregateTarget = true)
  JellyAttribute flavour; 
  @MapToData (columnPrefix="colour_", isAggregateTarget = true)
  JellyAttribute colour; 
  @MapToData (isAggregateTarget = true)
  JellyCompany company;
}
public class JellyCup extends Jelly {
  @MapToData
  String productName;
  @MapToData ( columnAliases = { "cup_volume" } )
  BigDecimal volume;
  @MapToData (isAggregateTarget = true)
  JellyAttribute shape;
}
public class JellyCupAndSpoon extends JellyCup {
  @MapToData
  String spoonMaterial;
  @MapToData
  BigDecimal spoonLength;
}

Building a better BeanProcessor

At the moment I’m coding up a ‘BeanMapper’: something to automatically transform a ResultSet into a JavaBean object. (Why would you want to do this? Writing and maintaining less code will probably do as a worthwhile cause.)

The DBUtils Apache commons project has a BeanProcessor that does a basic job. It is good for a simple mapping, however it does not meet the following challenges:

  1. Inheritance – a ResultSet may map to different Objects of the same parent over different rows
  2. Aggregates – a ResultSet may contain the data to build some/all of the aggregate objects
  3. Multiple Objects – a ResultSet may contain the data for many objects in one row

Challenge #1 can occur whenever inheritance is used. Challenges #2 and #3 are linked with improving performance (though I’m finding it difficult to imagine a scenario for #3 right now).

As well as the challenges there are some improvements that may be made:

  • The current BeanProcessor mapping works with the JavaBean standard – getColumnName/setColumnName will map to columnname in the database. A solution that also works with Annotations and/or Object fields would be helpful.
  • Automatic name mapping of the object ‘standard’ (camelCase) to the database ‘standard’ (under_score) would also be helpful.
  • Since I’m a user of the Spring framework, something that I can use as a default RowCallbackHandler would be helpful too.

So what would my ideal mapping solution look like in operation?

I would have a JDBC DAO that maps a variable ResultSet automatically to a set of target classes…

// lang java
public interface JellyDAO {
  public Jelly find(Long jellyId);
  public List find(FindRequest findRequest);
} 
public class JellyDAOJdbc implements JellyDAO {

  List targetClasses;
  List aggregateClasses;
  BeanMapper beanMapper;

  public JellyDAOJdbc() {
    // The ResultSet may contain data for 
    // any of the following target classes.
    targetClasses = new ArrayList();
    targetClasses.add(JellyBean.class);
    targetClasses.add(JellyCup.class);
    targetClasses.add(JellyCupAndSpoon.class);
    // The Target classes may contain the following classes
    // that may also come across in the ResultSet.
    aggregateClasses = new ArrayList();
    aggregateClasses.add(JellyAttribute.class);
    aggregateClasses.add(JellyCompany.class);
    //
    beanMapper = new BeanMapper(targetClasses, aggregateClasses);
  }

  private Jelly processResultSetRow(ResultSet resultSet) {
    // Map the ResultSet to a Jelly class.
    Jelly jelly = (Jelly) beanMapper.toBean(resultSet);
    return jelly;
  }
  
  private List finder() {
    // Collects one Jelly per row.
    List jellyList = new ArrayList();

    // Obtain a ResultSet..

    // Loop through result set..
    {
      jellyList.add(processResultSetRow(resultSet));
    }

    return jellyList;
  }

}

The Jelly domain model might be something like…

// lang java 
public abstract class Jelly { 
  Long jellyId; 
  String jellyType; 
} 
public class JellyAttribute { 
  String name; 
  String value; 
} 
public class JellyCompany {
  String companyName;
  String address;
}
public class JellyBean extends Jelly { 
  String targetMarket;
  BigDecimal weight;
  JellyAttribute flavour; 
  JellyAttribute colour; 
  JellyCompany company;
}
public class JellyCup extends Jelly {
  String productName;
  BigDecimal volume;
  JellyAttribute shape;
}
public class JellyCupAndSpoon extends JellyCup {
  String spoonMaterial;
  BigDecimal spoonLength;
}
public interface JellyRepository {
  public Jelly find(Long jellyId);
  public List find(FindRequest findRequest);
}

I would also have a database view or stored procedure with the following signature…

jelly_id, jelly_type, target_market, product_name, 
spoon_material, spoon_length, cup_volume, flavour_name, 
flavour_value, colour_name, colour_value, company_name, 
company_address, shape_name, shape_value

The Four Horsemen of the Data Apocalypse

The data was no longer required.

First came the horseman named Delete, he said in a thunderous voice, “Your data will soon be gone.”

The user replied, “But what if I want to see it again.”

Next came the horseman named Trash, she said, “Your data will be gone soon, but if you sign up for this simple insurance policy, it can be brought back – if you invoke the policy quickly.”

The user replied, “But I may want it at any time in the future.”

Next came the horseman named Archive, he said philosophically, “Your data will not be available, except when it is.”

The user replied, “But I want it to be available at any time.”

Next came the horseman named Retire, and she said, “You’ve put effort into your data, you don’t want it disappearing. It may not be as active as it once was, but with good planning you may keep it working for you always.”

The user smiled and was happy, “Yes, I want them all.”

Repository or DAO?

A few months ago, while considering the design of a new project…the issue occupying my thoughts was the difference between possible design technologies. Two directions: one a relational database, two an object model. Why was it an issue? How are the designs kept in sync? Where does a developer turn to first if they want to make a change to the core design of an application that is expressed in many places (at least the relational database and the objects)?

So, I considered the idea of a ‘domain model’, turned it around a few times and thought it could be a good thing. Not only would it be the common language between various technical areas, if it was expressed in business terms, the testers and business people could also be involved in a discussion around it, or use it. Convinced this was something good, I Googled ‘domain model’ and found that 1. yes it was a good idea and 2. someone had written a book about it. I ordered Domain Driven Design by Eric Evans and have been reading through it.

Good book (pity it goes for the fashionable ‘xxx driven xxx’ title, which also sounds confusingly similar to Model Driven Development). It has clarified many things for me, and supported other thoughts, like the ‘Ubiquitous Language’ pattern… “A project faces serious problems when its language is fractured.”

Many standard patterns for domain components are listed and discussed in the book. One pattern is called a Repository. It troubled me because elsewhere in the world of development there is a DAO (Data Access Object) pattern. Why was I concerned? Because the two patterns initially seemed to be the same, so which should be used? This is what I understand now.

Repository Responsibilities

  • Provide a common language to all team members (including business representatives to whom DAO would need constant explanation)
  • Provide a higher level of data manipulation – something that may be common to the data regardless of how it is retrieved.
  • Provide a mechanism to manage existing entities (where a Factory might create you a new one).

DAO Responsibilities

  • Provide data access. Perhaps you could say it is about persistence strategies: underneath a DAO interface there are one or more implementations – eg JDBC, Hibernate.

Example: the domain model might represent plants, one entity may be a Tree which has an associated TreeRepository.

TreeDomainModel

Implementation 1:

There is a database table in the background with TREE_ID, LATITUDE, LONGITUDE, HEIGHT columns. Perhaps we code TreeRepository as an interface which a JDBC DAO implements.

Implementation 2:

We now have access to a location service which stores the location of things. We also have a database table with TREE_ID, LOCATION_ID, HEIGHT columns. Perhaps we implement a TreeRepository that accesses both the location service and a JDBC DAO to construct a full Tree object.
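A minimal sketch of Implementation 2 might look like the following. Only TreeRepository and the column names come from the example above; Tree's fields, TreeRecord, TreeDao, Location, LocationService and CompositeTreeRepository are hypothetical names chosen for illustration, and the choice to return null for an unknown tree is an assumption.

```java
/** The domain entity the rest of the team talks about. */
final class Tree {
    final long treeId;
    final double latitude;
    final double longitude;
    final double height;

    Tree(long treeId, double latitude, double longitude, double height) {
        this.treeId = treeId;
        this.latitude = latitude;
        this.longitude = longitude;
        this.height = height;
    }
}

/** The Repository: expressed purely in domain terms. */
interface TreeRepository {
    Tree find(long treeId);
}

/** One row of the TREE_ID, LOCATION_ID, HEIGHT table. */
final class TreeRecord {
    final long treeId;
    final long locationId;
    final double height;

    TreeRecord(long treeId, long locationId, double height) {
        this.treeId = treeId;
        this.locationId = locationId;
        this.height = height;
    }
}

/** The DAO: data access only; a JDBC or Hibernate class would implement it. */
interface TreeDao {
    TreeRecord findRecord(long treeId);
}

final class Location {
    final double latitude;
    final double longitude;

    Location(double latitude, double longitude) {
        this.latitude = latitude;
        this.longitude = longitude;
    }
}

/** Stand-in for the external location service. */
interface LocationService {
    Location locate(long locationId);
}

/** Repository that combines the DAO and the location service into a full Tree. */
final class CompositeTreeRepository implements TreeRepository {
    private final TreeDao treeDao;
    private final LocationService locationService;

    CompositeTreeRepository(TreeDao treeDao, LocationService locationService) {
        this.treeDao = treeDao;
        this.locationService = locationService;
    }

    @Override
    public Tree find(long treeId) {
        TreeRecord record = treeDao.findRecord(treeId);
        if (record == null) {
            return null; // assumption: an unknown tree is reported as null
        }
        Location location = locationService.locate(record.locationId);
        if (location == null) {
            throw new IllegalStateException("No location for tree " + treeId);
        }
        return new Tree(record.treeId, location.latitude, location.longitude, record.height);
    }
}
```

Callers only ever see TreeRepository and Tree, so moving from Implementation 1 to Implementation 2 would not change their code. That seems to be the practical difference between the two patterns.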

Our new robot – what next?

We recently bought a Roomba robot vacuum cleaner…it gives us back some time. One of the best bits of technology that we have purchased lately.

It started me thinking about the mid nineties, when I signed up with my first ISP and jumped into the internet, and the early eighties, when my father brought home a new Apple II computer.

Wonder what consumer robots will be like in 10 years?

Also, I wonder what will be ‘introduced’ in the next 10 year hop – nano-bots, bio-tech?