At the NSF Invitational Workshop on Distributed Information, Computation and Process Management for Scientific and Engineering Environments (DICPM) I gave a presentation entitled STEP Models and Technology. At the end of the presentation I was asked to provide more pointers to STEP/EXPRESS related information sources. This note is in response to that request. I also append some personal suggestions resulting from the problems exposed and discussed during the workshop.
The portion on STEP and EXPRESS is essentially what I have been sending as a generic response to interested enquirers over the years that I have been in academia and at the National Institute of Standards and Technology (NIST). I have updated some of the pointers for this occasion.
My comments on the relationship between STEP/EXPRESS and the Workshop topics follow the overview of STEP and EXPRESS. Pointers to sources of information about STEP and EXPRESS come at the end of this note.
In my generic response I always included the following disclaimer, which also holds for the whole of this document.
This is purely for your information. There are no guarantees that the information below is either current or correct. There is no endorsement, either express or implied, concerning any organization or commercial product by myself or any past, present or future employer or any other organization with which I have been, am, or will be connected.
STEP (STandard for the Exchange of Product model data) is the colloquial term for the International Standard ISO 10303 Industrial automation systems and integration - Product data representation and exchange, the first release of which occurred in 1994. STEP is being developed under the auspices of ISO TC184/SC4.
STEP is published as a series of Parts (e.g., Part 11 is ISO 10303-11:1994).
STEP is targeted at the exchange of data describing a product between Computer Aided X (X = CAD, CAM, etc.) systems, and also at the long-term retention of such data. Specifically, the exchangeable product data is defined in the Application Protocols. EXPRESS is the language used within STEP to formally define the semantics of the data, and the 20 series of Parts specify the standard data exchange mechanisms (e.g., data file or API access).
EXPRESS itself is a lexical, object-flavoured information modeling language and is defined in ISO 10303-11:1994. EXPRESS-G is an iconic language that provides a subset of the lexical modeling capabilities; this is defined in Annex D of ISO 10303-11:1994. EXPRESS-I is another member of the family and is designed for the display of data instances and the specification of abstract test cases. It is defined in ISO/TR 10303-12:1997.
EXPRESS is used in many other activities outside STEP, for example in the EDIF standards and in the Petrotechnical Open Software Corporation's standards. Other examples include asset management by the London Stock Exchange and use by the Human Genome Project.
Currently a new edition of the EXPRESS Language Reference Manual is in preparation and should be available for its first public ballot around the beginning of 1998. A new member of the EXPRESS family, called EXPRESS-X, is under development. This is being designed as a formal language for semantic and data mapping purposes.
Part 21 of STEP defines a file format for the exchange of data corresponding to an EXPRESS information model. It includes both the specification of the actual file format and a specification of a mapping from an EXPRESS model specification to a data representation.
Part 22 of STEP defines a generic Standard Data Access Interface (SDAI, an API) for accessing data, held in computer memory or on disc, that corresponds to an EXPRESS information model. Other Parts in this series specify mappings from the generic SDAI to particular programming languages such as C++, IDL, Java, ... (in various stages of development).
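As a rough illustration of the kind of file that Part 21 describes, the following Python sketch writes and re-reads a heavily simplified exchange structure. The section keywords (ISO-10303-21, HEADER, DATA, ENDSEC) and the `#n=ENTITY(...);` instance form follow the general shape of Part 21 files, but the POINT entity, the schema name EXAMPLE_SCHEMA and the two helper functions are hypothetical. A real header also carries FILE_DESCRIPTION and FILE_NAME entries, which are omitted here, and the reader handles only flat lists of numbers. It should be taken as a sketch under those assumptions and not as a conforming implementation.

```python
import re

def write_part21(schema_name, instances):
    """Serialize {id: (entity_name, [numbers])} as a simplified Part 21-style text.

    Assumption: only flat numeric attributes; real files allow strings,
    references to other instances, nested aggregates, and more.
    """
    lines = [
        "ISO-10303-21;",
        "HEADER;",
        f"FILE_SCHEMA(('{schema_name}'));",  # names the governing EXPRESS schema
        "ENDSEC;",
        "DATA;",
    ]
    for inst_id in sorted(instances):
        name, attrs = instances[inst_id]
        attr_text = ",".join(repr(float(a)) for a in attrs)
        lines.append(f"#{inst_id}={name}({attr_text});")
    lines += ["ENDSEC;", "END-ISO-10303-21;"]
    return "\n".join(lines)

# One instance per line: #<id>=<ENTITY_NAME>(<comma separated attributes>);
INSTANCE_RE = re.compile(r"^#(\d+)=([A-Z_][A-Z0-9_]*)\((.*)\);$")

def read_part21(text):
    """Recover {id: (entity_name, [floats])} from the DATA section only."""
    instances = {}
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if line == "DATA;":
            in_data = True
        elif line == "ENDSEC;":
            in_data = False
        elif in_data:
            match = INSTANCE_RE.match(line)
            if match:
                attrs = [float(a) for a in match.group(3).split(",") if a]
                instances[int(match.group(1))] = (match.group(2), attrs)
    return instances
```

Writing a dictionary of instances with write_part21 and passing the result to read_part21 returns the same dictionary, which is the round trip that a neutral exchange file is meant to support.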
Taken together, EXPRESS and the implementation methods provide a technical solution to a diverse range of information modeling and data exchange needs.
The EXPRESS family of languages has been, and is being, developed under the auspices of ISO TC184/SC4. EXPRESS itself is an object-flavored lexical language for information modeling. The EXPRESS Language Reference Manual also defines a graphical subset of the lexical language called EXPRESS-G. The third member of the family is called EXPRESS-I and is a lexical language for the display of data instances and also for the formal definition of test cases. The Schenck and Wilson book (see below) provides a more user-oriented view of these languages than the ISO standard documents, and also outlines a modeling methodology.
A fourth member of the family, called EXPRESS-X, is in preparation as a mapping language for data translation between two EXPRESS models that are similar in semantic meaning but which differ in their data forms. Finally, a second edition of EXPRESS and EXPRESS-G is nearly ready for balloting as an ISO Committee Draft document. Edition 2 adds dynamic modeling and OOPL type methods to the static capabilities of the original EXPRESS language. It will also extend the text definition aspects from the current limitation to ASCII characters to include the full ISO 10646 character set.
In terms of software tools for authoring, the members of the EXPRESS family only require a text editor, apart from EXPRESS-G, which requires some drawing capabilities.
EXPRESS was originally developed to provide a formal, and computer processible, means of defining the data necessary to describe a product (anything from a microchip to a battleship) throughout its lifecycle, from time of conception through its manufacture to its time of disposal. ISO 10303, commonly known as STEP (STandard for the Exchange of Product model data), uses EXPRESS as the formal specification of the required data and its relationships. EXPRESS is also used in other standards, such as EDIF for electronic printed wiring boards, ISO TC 211 for Geographic Information Systems, and will be used in forthcoming editions of the SGML and XML standards. It has also found broad applications within industry, such as POSC (Petrotechnical Open Software Corporation) for modeling oilfield exploration and production information, the Human Genome Project for data exchange between genomic databases, and the London Stock Exchange for Asset Management. Many European ESPRIT Projects use EXPRESS. The Swedish Defence Materiel Administration book (see below) describes some of these, as well as describing its use within a CALS environment. US projects and industrial consortia that use EXPRESS include, among others, the CAD Framework Initiative (CFI) and the National Industrial Information Infrastructure Protocols (NIIIP) project.
There are basically two aspects to EXPRESS: (1) it provides for the modeling of data and data relationships with a very general and powerful inheritance mechanism (much more than is provided in OO programming languages), and (2) it includes a full procedural programming language which is used to specify constraints on data instances. As noted, EXPRESS-G is a subset of EXPRESS as it does not include the constraint portions of the lexical language. EXPRESS models may be written in the style of Entity-Relationship, CODASYL, Relational, Object Oriented, or other kinds of data modeling. It may also be considered to be a Set Theoretic specification language, and some have even gone so far as to indicate that it might be classed as a higher order predicate logic language.
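The two aspects can be pictured with a small Python analogy. The sketch below is not EXPRESS and is not generated from an EXPRESS model. The Person and Employee entities and the where_rules and violations methods are hypothetical names chosen for illustration. Class inheritance stands in for an EXPRESS supertype/subtype relationship (and is much less general than what EXPRESS offers), while the rule methods stand in for constraints on data instances.

```python
from dataclasses import dataclass

@dataclass
class Person:
    """Hypothetical supertype: attributes plus named constraints."""
    name: str
    age: int

    def where_rules(self):
        # Each entry mimics one named constraint evaluated on an instance.
        return {
            "name_not_empty": len(self.name) > 0,
            "age_non_negative": self.age >= 0,
        }

    def violations(self):
        # Names of the constraints that this instance fails, sorted.
        return sorted(rule for rule, ok in self.where_rules().items() if not ok)

@dataclass
class Employee(Person):
    """Hypothetical subtype: inherits attributes and constraints, adds its own."""
    salary: float

    def where_rules(self):
        rules = super().where_rules()
        rules["salary_positive"] = self.salary > 0
        return rules
```

An Employee instance is checked against both its own rule and the rules inherited from Person, so violations() on an employee with an empty name and a negative salary reports both failures.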
Models described using EXPRESS are intended to be implementation independent. As well as providing some unique capabilities, EXPRESS has borrowed from many other languages including Ada, Algol, C, C++, Euler, Modula-2, Pascal, PL/I and SQL. It straddles both programming languages and database specification languages. Being lexical, the language can be compiled, and there are a number of both commercial and public domain compilers available. Typically, these compile EXPRESS into another high level language. Compilers have been developed on the one hand to generate C, C++, Prolog, etc., and on the other hand to generate DDLs (Data Definition Languages) for both Relational and OO databases, such as Oracle, ObjectStore and Versant, as well as SQL. Some companies are using EXPRESS as the definition language for Data Warehouses. Software tools are also available that support modeling using either EXPRESS or EXPRESS-G and enable the transformation between the EXPRESS and the EXPRESS-G representations. A compendium, now a little old, of EXPRESS-based tools and tool suppliers is available (see below). I expect that more tools will become available. For example, I recently saw a pre-production demonstration of a bidirectional translator between EXPRESS and UML class diagrams.
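To give a feel for what generating a DDL involves, here is a toy Python sketch that turns an already-parsed entity description into a SQL CREATE TABLE statement. It does not parse EXPRESS text and does not reflect how any particular compiler works. The TYPE_MAP table, the entity_to_ddl function, the surrogate id column and the example point entity are all assumptions made for illustration. Only a few EXPRESS simple types are covered, and an OPTIONAL attribute is mapped to a nullable column.

```python
# Hypothetical mapping from a few EXPRESS simple types to SQL column types.
TYPE_MAP = {
    "INTEGER": "INTEGER",
    "REAL": "DOUBLE PRECISION",
    "STRING": "VARCHAR(255)",
    "BOOLEAN": "BOOLEAN",
}

def entity_to_ddl(entity_name, attributes):
    """Build a CREATE TABLE statement for one entity.

    attributes: list of (attribute_name, express_type, is_optional) tuples.
    Assumption: each entity becomes one table with a surrogate integer key;
    subtypes, aggregates and constraints are ignored in this sketch.
    """
    columns = ["  id INTEGER PRIMARY KEY"]
    for attr_name, express_type, is_optional in attributes:
        sql_type = TYPE_MAP[express_type]
        null_clause = "" if is_optional else " NOT NULL"
        columns.append(f"  {attr_name} {sql_type}{null_clause}")
    return f"CREATE TABLE {entity_name} (\n" + ",\n".join(columns) + "\n);"

if __name__ == "__main__":
    print(entity_to_ddl("point", [("x", "REAL", False), ("label", "STRING", True)]))
```

A real translator would also have to decide how to represent inheritance, aggregate types and the constraint portions of a model, which is where the tools differ most.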
One theme that ran through the DICPM Workshop discussions was the question of meta data for scientific data. There appeared to be general agreement that meta data was required, but how to provide it, how to persuade the data-generating scientists to perform the extra work necessary to supply the extra information, what form the meta data should take, and so on, were all open questions.
The situation at the Workshop with respect to scientific data seemed to me to be remarkably similar to the situation that manufacturing companies faced in the early 1980s. In the engineering world the solution has become STEP, but it has taken a tremendous effort to get it to the stage where it is in regular use within manufacturing industry. The drivers behind STEP were industry's need to exchange data between a company and its suppliers, to exchange data between partners in some temporary collaboration (for example several companies working on one weapons contract), and the Department of Defense's (DoD) need to be able to accept data in a vendor-neutral electronic form. The drivers behind the exchange of scientific data are broadly similar but I don't know the entity, or entities, that are equivalent to the DoD; perhaps NSF or some of the professional societies?
I suggest that the scientific community may be able to build on the foundation that is provided by STEP, particularly the STEP technology, and most particularly, on EXPRESS. It is, of course, up to the representatives of the community to decide whether or not this is appropriate, but I do urge that at least it should be considered.
This section provides some pointers to information sources related to STEP and EXPRESS. There is no particular order to the list.
The current Chair of the Secretariat is Lisa Phillips (firstname.lastname@example.org).
EUG have held an annual conference since 1991. Past proceedings can be ordered from Paula Popson at: email@example.com
NIST hosts various Email exploders for discussion on all aspects of STEP. The exploders are managed by the majordomo software. To start off, send Email to firstname.lastname@example.org with the two line message
HELP
END
and you will receive further instructions. The discussion groups most relevant to EXPRESS are:
NIST has recently announced a new mail exploder, email@example.com. This is for discussions on the NIST Expresso environment and tools for the development of EXPRESS models and checking of data exchange files. For more information go to http://www.mel.nist.gov/msidstaff/denno/nist-expresso.html. Note that NIST Expresso is not the same as the Expresso toolkit from IGD, Germany mentioned above.