The latest aspect of software we’ve been discussing and implementing for our ontology is ‘life cycle’. This is one of the features ‘bought’ by the workshop attendees in our planning poker session. We are interpreting this as a requirement for the ontology to capture information about software that allows users to understand where in the software development life-cycle an application sits, or how ‘mature’ it is.
Software maturity is, at first glance, relatively straightforward: mature software is stable, robust, maintained and supported. Microsoft Excel is mature; the script knocked up by a PhD student is not. A little more thought, however, reveals that, like versions, we are trying to capture loose conventions – a difficult task! Even where developers explicitly state the maturity of their software, giving it a ‘beta’ label for example, potential users are often no wiser about how stable or well supported it really is – Google was famous for its use of the ‘beta’ label on software that was implemented on a scale and with a robustness that many products never attain. Statements made by developers must always be considered in the knowledge that they reflect not just the maturity of the software but other factors, such as marketing or political decisions.
Even existing models of maturity are not particularly helpful. For example, we might hope that the Capability Maturity Model would help, but this is really an aid to project management that focuses on software maturity from the point of view of the development process. Another (admittedly old) measure of maturity is the Software Maturity Index from the IEEE, which essentially considers maturity as the rate at which the code is changing; again this is a metric for developers that is of little relevance to our ontology.
So, how can we handle the life-cycle/maturity question? The critical questions to ask when determining the maturity of software seem to be the following:
- Is the software robust (is it relatively free of bugs)?
- Is the software maintained (will bugs be fixed)?
- Is the software under development (can we expect new features, improvements to the UI, algorithms or performance; will the software evolve and adapt to the changing usage environment)?
In addition, we have some other questions that we can ask about software that give useful information to users and reflect the level of maturity:
- Is the software supported (do the publishers or developers respond to questions from users)?
- Has the software been around a long time?
- Does the software have a significant user base?
You will notice that many of these (in fact most of them) match competency questions that were gathered at our workshop. There were also other questions asked (see the full list) that showed that people wanted to be able to compare applications – ‘which is the best/fastest/most robust software to read this data?’. This type of question is more or less impossible to answer directly, as it depends on many subjective factors, such as a user’s preference for a graphical user interface over a command line. What we can do, however, is capture as much objective information as possible, and capture it in such a way that users can ask sensible questions of the ontology and get information that leads them as close as possible to the answers.
These concrete statements might include:
- Release dates. Including the release date for the first version of some software, and for subsequent versions, will allow us to get an idea of how long the software has been around, how frequently it is updated, and how long it is since the last release. Of course, this information is only as good as what is entered in the ontology, and the open world assumption means that we cannot infer that the most recently released version of software in the ontology is actually the most recent unless we explicitly say so. Deciding how best to use release dates will require us to strike a pragmatic balance between saying as much as possible about software and minimising maintenance requirements.
- Sources of support. URLs for discussion forums will not only directly help people using software, but will also help those using the SWO to discover and compare tools for analysing data. The presence of forums about software indicates a level of support and usage – both of which were of interest to people at the workshop.
- The declared status. Notwithstanding the comments I made above about the unreliability of the status given by developers or publishers, this information is objective (in the sense that a given status has been declared) and it makes sense to pass it on to SWO users where available. (Incidentally, this suggests a need for provenance/reliability codes, to describe the source and reliability of information; I hope to write about that later.)
- Bug counts. For open source software it is often possible to get a count of the number of bugs – outstanding bugs, fixed bugs, and those which are unlikely to be fixed. This information is superficially appealing as it directly deals with the first two questions above. However, a number of problems suggest that this is not really suitable for direct inclusion in the ontology: first, the raw numbers take no account of the complexity of the software or how significantly a bug impacts on usability; more importantly, this information is constantly changing and would cause serious maintenance difficulties.
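The release-date reasoning above can be sketched in code. This is only an illustration: the tool names and dates below are invented, and the real SWO would hold this information as ontology statements rather than Python data – but it shows the kind of signals (age, release cadence, time since last release) that stored release dates make derivable.

```python
from datetime import date

# Hypothetical release-date records for two tools; names and dates are
# invented purely for illustration, not taken from the SWO itself.
releases = {
    "ToolA": [date(2005, 3, 1), date(2007, 6, 15), date(2010, 11, 2)],
    "ToolB": [date(2010, 1, 10), date(2010, 5, 3), date(2010, 9, 20)],
}

def maturity_signals(dates, today=date(2011, 1, 1)):
    """Derive simple maturity indicators from a list of release dates."""
    dates = sorted(dates)
    age_days = (today - dates[0]).days      # how long the software has existed
    since_last = (today - dates[-1]).days   # how stale the latest release is
    # Mean gap between successive releases, as a rough update cadence.
    gaps = [(b - a).days for a, b in zip(dates, dates[1:])]
    cadence = sum(gaps) / len(gaps) if gaps else None
    return {
        "age_days": age_days,
        "days_since_last_release": since_last,
        "mean_release_interval_days": cadence,
    }

for name, dates in releases.items():
    print(name, maturity_signals(dates))
```

Note that, as said above, the open world assumption means these derived numbers are only lower bounds on what is true of the software: the ontology may simply not record a newer release.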
In conjunction with other information already in the ontology, such as publisher and developer, this simple list should cover the majority of requirements without making the ontology too complex or difficult to maintain, allowing users to get a feel for the maturity of software.
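To give a feel for how these objective pieces of information might combine to approximate the ‘how mature is this tool?’ question, here is a minimal sketch. It assumes a hypothetical flat record per application and arbitrary thresholds; the SWO itself models these as OWL classes and properties, and real users would tune the criteria to their own needs.

```python
from datetime import date

# Hypothetical records combining the objective fields discussed above.
# Field names, values and URLs are invented for illustration only.
software = [
    {"name": "ToolA", "declared_status": "stable",
     "forum_url": "http://example.org/toola-forum",
     "first_release": date(2005, 3, 1), "latest_release": date(2010, 11, 2)},
    {"name": "ToolB", "declared_status": "beta",
     "forum_url": None,
     "first_release": date(2010, 1, 10), "latest_release": date(2010, 9, 20)},
]

def looks_mature(rec, today=date(2011, 1, 1), min_age_years=3):
    """A crude proxy for maturity: long-lived, recently updated, supported."""
    old_enough = (today - rec["first_release"]).days >= min_age_years * 365
    recently_updated = (today - rec["latest_release"]).days <= 2 * 365
    supported = rec["forum_url"] is not None
    return old_enough and recently_updated and supported

mature = [rec["name"] for rec in software if looks_mature(rec)]
print(mature)  # under these thresholds only ToolA qualifies
```

The point is not this particular heuristic but that each test uses only objective, recorded facts – exactly the kind of statements the list above proposes to capture.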