metaware

In modern programming, there’s a lot of cool stuff that can be done with metadata. Failing to consider it in one form or another is setting aside a major tool in any system architecture. Sadly, I see this from time to time: a project has a lot of information associated with it, but the metadata ends up being expressed only as requirements and is never applied any further in automation. My post on Edgewater’s blog highlights code generation as one of my favorite ways to leverage metadata, using things like standardized requirements documentation as the metadata.

Wikipedia argues that metadata is divided into two categories: “data about data types”, such as XSDs and WSDLs, and “data about content”. To me, that division is mostly helpful in determining application. According to their current article, metadata is akin to information about a container such as a box… how big is it? how much can it hold? Metacontent is information about what the box contains… what is it? how much does it weigh? what are its storage instructions? what are the contents of the instruction manual? The former information is often best applied at design time. The latter is often more economically applied at runtime. I would argue that it goes deeper than that, as well. One man’s metacontent is another’s metadata. The term metadata could be used to cover:

raw data
structure/language definition (syntax) (describing how to express a schema)
structure/language expression definition (schema)
content
content description/classification

Metadata in information systems has two major aspects: expression and application. Expression is everything from the media used to the language (even the characters) used to communicate information. Expression implies how the information can be managed and transmitted, and even how the information can be applied. Application is all about how the information is used, which affects the resources required to leverage it at the time it is applied.

In application development, metadata can take many forms. Here’s a list of a few forms from the development world, along with pros and cons of each:

BLOB (Fully Custom/Encoded Expression)

Notes: At the core of all other computer data expression is a specialized BLOB. I mean, on disk, there’s no visible difference between an Excel Spreadsheet and a JPEG image… it’s just a block of bytes. The meaning of each byte is interpreted by the application that uses it. (Its beauty is in the eye of the beholder.)
Pro’s: Most flexible expression. Any information that can be expressed within a computer can be encoded to live in a Binary file.
Con’s: Any “beholder” must typically be created from scratch, making it typically tightly coupled with its application. Harder to manage in terms of media & transmission.

Plain Text

Pro’s: Very flexible, customizable, can be loosely coupled to the application. Lots of options with respect to media and transmission.
Con’s: Plain text can take any form, often has to be parsed using custom tools, care has to be taken to make sure plain text is extensible, human readable, and structured enough for processing.

XML

Pro’s: Much more structured by definition, easy to extend by definition. Many, many useful tools exist to define, manage, communicate, index, and consume XML. Very loosely tied to its application(s).
Con’s: Not quite as human-readable as say, plain text might be.

JSON

Pro’s: Extensible, structured, blurs the line between code & metadata, since JSON (JavaScript Object Notation) can be evaluated by JavaScript runtimes as code. Meant for web client consumption.
Con’s: not nearly as broadly supported as XML, tools are not available in as many contexts. Not so well supported for consumption outside web clients.

XAML

Pro’s: XML based, and like JSON, is an object initialization notation. It is more widely known for describing UI elements in WPF and Silverlight. Directly compiles to code.
Con’s: Requires WPF or Silverlight runtimes to use, not well supported beyond client-side rich apps, media may be typically described as “embedded in binary code”.

Embedded in Programming Languages

Pro’s: typically compiled into code, metadata can be expressed as code literals or code attribution. This means the data is almost instantly available to the running application that consumes it. Done well, especially with code attribution, this tends to provide meaningful human-readable information to a programmer (typically more so than comments) and a runtime processor at the same time.
Con’s: embedded metadata must be processed at runtime, every time. In cases where there are large amounts of data involved, done wrong, this can be time consuming. Typically forces metadata content management into source code control, which arguably may not be the best way to manage it. (Source control of metadata is a “Pro”, but putting it in something like TFS can make it hard to reach.) Literals and attribution can tie metadata to the code that consumes it.
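
To make the attribution idea concrete, here is a minimal sketch in Python. Python has no direct equivalent of .NET attributes, so I'm using dataclass field metadata as a stand-in. The Customer class, its labels, and the describe() helper are all made up for illustration.

```python
from dataclasses import dataclass, field, fields


@dataclass
class Customer:
    # The metadata mapping is ignored by dataclasses itself; it exists purely
    # so that other code (here, describe()) can read it at runtime.
    name: str = field(metadata={"label": "Customer name", "max_length": 40})
    credit_limit: float = field(
        default=0.0, metadata={"label": "Credit limit", "unit": "USD"}
    )


def describe(cls):
    """Return {field name: label} by reading the metadata on each field."""
    return {f.name: f.metadata.get("label", f.name) for f in fields(cls)}


print(describe(Customer))
```

The labels sit right next to the fields they describe, which is the “Pro” above. They also can’t change without a code change, which is the “Con”.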

Excel Spreadsheets

Pro’s: typically easily read & shared, easy to manage & communicate as a file.
Con’s: not so easy to collaborate on, typically concurrent content managers must merge changes. Automation can be klunky.

Database Tables

Pro’s: Easy medium for a developer to work with
Con’s: Not so easy a medium for anyone else, hard to manage in source control, requires custom tools for end-users to work with if they need to.

SharePoint Lists

Pro’s: Easy medium for anyone to use / collaborate, SharePoint provides content management tools. Data is easily decoupled, accessible via web (and web services), and exportable to files such as Excel spreadsheets.
Con’s: Hard to keep metadata sync’d with source code control. (If this is a requirement, the workaround involves exporting documents and checking them in.)

Metadata application, in programming terms, can be accomplished in several different places. It can be applied at:

Design Time

Common application: code generation based on metadata or metacontent information.
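
As a rough illustration of design-time application, the sketch below renders source code from a small metadata description. The ENTITY structure and the generate_class() helper are assumptions of mine, not any particular tool’s format.

```python
# Hypothetical metadata: one entity described as plain data.
ENTITY = {
    "name": "Invoice",
    "fields": [("number", "str"), ("amount", "float"), ("paid", "bool")],
}


def generate_class(entity):
    """Render Python source for a simple class from the entity description."""
    params = ", ".join(f"{name}: {type_name}" for name, type_name in entity["fields"])
    lines = [f"class {entity['name']}:", f"    def __init__(self, {params}):"]
    for name, _type_name in entity["fields"]:
        lines.append(f"        self.{name} = {name}")
    return "\n".join(lines) + "\n"


source = generate_class(ENTITY)
print(source)

# At design time the source would normally be written to a file and checked
# in; exec() is used here only to show that the generated text is valid code.
namespace = {}
exec(source, namespace)
invoice = namespace["Invoice"]("INV-1", 125.0, False)
```

A real generator would more likely work from a template and a richer schema, but the shape is the same: metadata in, source code out.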

Run Time

Common application: runtime behavior/presentation modification and/or extension based on metacontent information. This could include interpreted language processors. One could argue that .NET Common Intermediate Language (CIL) is metadata for that reason, since a runtime interprets it and executes native code from it.
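
A small sketch of the runtime case, assuming the metacontent arrives as JSON that the application reads while running. The rule names (label, format, hidden) are invented for this example.

```python
import json

# Hypothetical metacontent, shipped as data rather than code so that it can
# change without a rebuild.
DISPLAY_RULES = json.loads("""
{
  "amount": {"label": "Amount", "format": "{:.2f} USD"},
  "paid": {"label": "Paid?"},
  "internal_id": {"hidden": true}
}
""")


def render(record, rules):
    """Build display lines for a record, driven entirely by the rules."""
    lines = []
    for key, value in record.items():
        rule = rules.get(key, {})
        if rule.get("hidden", False):
            continue  # the metacontent decides what gets shown
        label = rule.get("label", key)
        text = rule.get("format", "{}").format(value)
        lines.append(f"{label}: {text}")
    return lines


for line in render({"amount": 125.0, "paid": False, "internal_id": 7}, DISPLAY_RULES):
    print(line)
```

Changing the presentation means editing the rules, not the render() function.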

Test Time

Common application: baseline test data, test configuration. (In turn, these can be applied at “design time” such as generating unit test code, or runtime, such as gathering baseline test data to compare to results, or determining executability and/or parameters for tests.)
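
Here is one way the runtime flavor of this might look: baseline data that supplies the parameters, the expected results, and whether a case runs at all. The BASELINE table and the apply_discount() function under test are placeholders I made up.

```python
# Hypothetical baseline: inputs plus the results they are expected to produce.
BASELINE = [
    {"name": "no discount", "subtotal": 100.0, "rate": 0.0, "expected": 100.0},
    {"name": "ten percent", "subtotal": 100.0, "rate": 0.10, "expected": 90.0},
    {"name": "skipped case", "subtotal": 50.0, "rate": 0.5, "expected": 25.0,
     "enabled": False},
]


def apply_discount(subtotal, rate):
    """Code under test (a stand-in for real application logic)."""
    return round(subtotal * (1 - rate), 2)


def run_baseline(cases):
    """Run every enabled case and return {case name: passed?}."""
    results = {}
    for case in cases:
        if not case.get("enabled", True):
            continue  # the metadata decides executability
        actual = apply_discount(case["subtotal"], case["rate"])
        results[case["name"]] = actual == case["expected"]
    return results


print(run_baseline(BASELINE))
```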

Deploy Time

Common application: Deployment targets/configuration based on build content metacontent.

Code generation is one of the most compelling uses for metadata in information systems, but it’s certainly not the only use. One of my other favorite topics, SharePoint, highlights this by making metacontent a cornerstone of content and document management, classification, reporting, and search indexing. Throwing a document in a plain old file share provides you with a little metadata; the folder names in its file path typically have meaning, as do the dates, times, and even the permissions applied to it. Throwing the same document in SharePoint provides a ton more information about the document content by default, and typically indexes it, making search a far more powerful tool.

Tech in the 603, The Granite State Hacker

Code Generation

March 11, 2011 Jim Wilcox (The Granite State Hacker)

I’ve been evangelizing code generation since the work I did at Providus / FRS Global…

One of my arguments on the topic got published by Edgewater…

I love the picture on it… 🙂


Application Platform Infrastructure Optimization

September 25, 2008 Jim Wilcox (The Granite State Hacker)

In doing some research for a client on workflow in SharePoint, I came across this interesting article about the differences between BizTalk 2006 and the .NET Workflow Foundation (WF).

The article itself was worth the read for its main point, but I was also interested in Microsoft’s Application Platform Infrastructure Optimization (“APIO”) model.

The “dynamic” level of the APIO model describes the kind of system that I believe the .NET platform has been aiming at since 3.0.

I’ve been eyeing the tools… between MS’s initiatives, my co-workers’ project abstracts, and the types of work that are coming down the pike in consulting. From the timing of MS’s releases, and the feature sets thereof, I should have known that the webinars they’ve released on the topic have been around for just over a year.

This also plays into Microsoft Oslo. I have suspected that Windows Workflow Foundation, or some derivative thereof, is at the heart of the modeling paradigm that Oslo is based on.

All this stuff feeds into a hypothesis I’ve mentioned before that I call “metaware”, a metadata layer on top of software. I think it’s a different shade of good old CASE… because, as we all know… “CASE is dead… Long live CASE!”

Metaware

April 6, 2008 Jim Wilcox (The Granite State Hacker)

There’s been a fair amount of buzz in the IT world about IT-Business alignment lately. The complaint seems to be that IT produces solutions that are simply too expensive. Proposed solutions range from “Agile” methodologies to dissolving the contemporary IT group into the rest of the enterprise.

I think there’s another piece that the industry is failing to fully explore.

I think what I’ve observed is that the most expensive part of application development is actually the communications overhead. It seems to me that the number one reason for bad apps, delays, and outright project failures, is firmly grounded in communications issues. Getting it “right” is always expensive. (Getting it wrong is dramatically worse.) In the current IT industry, getting it right typically means teaching analysts, technical writers, developers, QA, and help desk significant aspects of the problem domain, along with all the underlying technologies they need to know.

In the early days of “application development”, software based applications were most often developed by savvy business users with tools like Lotus 1-2-3. The really savvy types dug in on dBase. We all know why this didn’t work, and the ultimate response was client-server technology. Unfortunately, the client-server application development methodologies also entrenched this broad knowledge sharing requirement.

So how do you smooth out this wrinkle? I believe Business Analytics, SOA/BPM, Semantic web, portals/portlets… they’re providing hints.

There have been a few times in my career where I was asked to provide rather nebulous functionality to customers. Specifically, I can think of two early client-server projects where the users wanted to be able to query a database in uncertain terms of their problem domain. In both of these cases, I built application UIs that allowed the user to express query details in easy, domain-specific terms. User expressions were translated dynamically by software into SQL. All of the technical jargon was hidden away from the user. I was even able to allow users to save favorite queries, and share them with co-workers. The apps enabled the users to look at all their information in ways that no one, not even I, had considered beforehand. The apps worked without giving up the advances of client-server technology, and without forcing the user into technical learning curves. These projects were both delivered on time and on budget. As importantly, they were considered great successes.
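
I don’t have the code from those projects, so the following is only a sketch of the general technique, using Python’s built-in sqlite3 module. The orders table, the FIELDS and OPERATORS vocabularies, and to_sql() are all hypothetical.

```python
import sqlite3

# Hypothetical mapping from the user's vocabulary to columns and operators.
FIELDS = {"customer": "customer_name", "order total": "total", "region": "region"}
OPERATORS = {"is": "=", "is at least": ">=", "is at most": "<="}


def to_sql(criteria):
    """Translate [(field, operator, value), ...] into parameterized SQL."""
    clauses, params = [], []
    for field_name, operator, value in criteria:
        # Names come from the fixed tables above and values are bound as
        # parameters, so user text never ends up in the SQL string.
        clauses.append(f"{FIELDS[field_name]} {OPERATORS[operator]} ?")
        params.append(value)
    where = " AND ".join(clauses) if clauses else "1 = 1"
    sql = f"SELECT customer_name, total FROM orders WHERE {where} ORDER BY total"
    return sql, params


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_name TEXT, region TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("Acme", "East", 250.0), ("Bolt", "East", 75.0), ("Crane", "West", 900.0)],
)

# A "favorite query" is just data, so it can be saved and shared.
saved_query = [("region", "is", "East"), ("order total", "is at least", 100)]
sql, params = to_sql(saved_query)
rows = conn.execute(sql, params).fetchall()
print(sql)
print(rows)
```

The user only ever sees terms like “order total is at least 100”; the SQL stays behind the curtain.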

In more recent times, certain trends have caught my attention: the popularity of BI (especially cube analysis), and portals/portlets. Of all the tools/technologies out there, these are the ones actively demanded by business end-users. At the same time, classic software application development seems to be in relatively reduced demand.

Pulling it all together, it seems like IT departments have tied business innovation to the rigors of client-server software application development. By doing this, all the communications overhead that goes with doing it right is implied.

It seems like we need a new abstraction on top of software… a layer that splits technology out of the problem domain, allowing business users to develop their own applications.

I’ve hijacked the word “metaware” as a way of thinking about the edge between business users as process actors (wetware) and software. Of course, it’s derived from metadata application concepts. At first, it seems hard to grasp, but the more I use it, the more it seems to make sense to me.

Here is how I approach the term…

As I’ve mentioned in the past, I think of people’s roles in business systems as “wetware”. Wikipedia has an entry for wetware that describes its use in various domains. Wetware is great at problem solving.

Why don’t we implement all solutions using wetware?

It’s not fast, reliable, or consistent enough for modern business needs. Frankly, wetware doesn’t scale well.

Hardware, of course, is easy to grasp… it’s the physical machine. It tends to be responsible for physical storage and high-speed signal transmission, as well as providing the calculation iron, and general processing brains for the software. It’s lightning fast, and extremely reliable. Hardware is perfect in the physical world… if you intend to produce physical products, you need hardware. Hardware application extends all the way out to wetware, typically in the form of human interfaces. (The term HID tends to neglect output such as displays. I think that’s an oversight… just because monitors don’t connect to USB ports doesn’t mean they’re not human interface devices.)

Why do we not use hardware to implement all solutions?

Because hardware is very expensive to manipulate, and takes lots of specialized tools and engineering know-how to implement relatively small details. The turnaround time on changes makes it impractical, from a risk-management standpoint, for general purpose / business application development.

Software in the contemporary sense is also easy to grasp. It is most often thought to provide a layer on top of a general purpose hardware platform to integrate hardware and create applications with semantics in a particular domain. Software is also used to smooth out differences between hardware components and even other software components. It even smooths over differences in wetware by making localization, configuration, and personalization easier. Software is the concept that enabled the modern computing era.

When is software the wrong choice for an application?

Application software becomes a problem when it doesn’t respect separation of concerns between integration points. The most critical “integration point” abstraction that gets flubbed is between business process and the underlying technology. Typically, general purpose application development tools are still too technical for user domain developers, and so quite a bit of communications overhead is required even for small changes. This communications overhead becomes expensive, and is complicated by generally difficult deployment issues. While significant efforts have been made to reduce the communications overhead, these tend to attempt to eliminate artifacts that are necessary for the continued maintenance and development of the system.

Enter metaware. Metaware is similar in nature to software. It runs entirely on a software-supported client-server platform. Most software engineers would think of it as process owners’ expressions interpreted dynamically by software. It’s the culmination of SOA/BPM… for example BPMN (Notation) that is then rendered as a business application by an enterprise software system.
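
To give a feel for “process owners’ expressions interpreted dynamically by software”, here is a toy sketch. It is not BPMN or any real workflow engine; the process format, step names, and the run_process() interpreter are all invented for illustration.

```python
# Hypothetical process definition, the kind of artifact a process owner might
# maintain. Each step names an action or decision and where to go next.
EXPENSE_PROCESS = {
    "start": "check_amount",
    "steps": {
        "check_amount": {"decide": "amount_over_limit",
                         "yes": "manager_review", "no": "approve"},
        "manager_review": {"do": "notify_manager", "next": "approve"},
        "approve": {"do": "record_approval", "next": None},
    },
}

# The technology side: small, reusable building blocks owned by IT.
ACTIONS = {
    "notify_manager": lambda case: case["log"].append("manager notified"),
    "record_approval": lambda case: case["log"].append("approved"),
}
DECISIONS = {"amount_over_limit": lambda case: case["amount"] > 500}


def run_process(process, case):
    """Walk the process definition, dispatching to the registered blocks."""
    step_name = process["start"]
    while step_name is not None:
        step = process["steps"][step_name]
        if "decide" in step:
            outcome = DECISIONS[step["decide"]](case)
            step_name = step["yes"] if outcome else step["no"]
        else:
            ACTIONS[step["do"]](case)
            step_name = step["next"]
    return case["log"]


print(run_process(EXPENSE_PROCESS, {"amount": 800, "log": []}))
print(run_process(EXPENSE_PROCESS, {"amount": 100, "log": []}))
```

The split is the point: the process definition is the business user’s artifact, while the actions, decisions, and interpreter are IT’s.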

While some might dismiss the idea of metaware as buzz, it suggests important changes to the way IT departments might write software. Respecting the metaware layer will affect the way I design software in the future. Further, respecting metaware concepts suggests important changes in the relationship between IT and the rest of the enterprise.

Ultimately it cuts costs in application development by restoring separation of concerns… IT focuses on building and managing technology to enable business users to express their artifacts in a technologically safe framework. Business users can then get back to innovating without IT in their hair.

Tag: metaware

Metadata in Software Development

Code Generation

Application Platform Infrastructure Optimization

Metaware