Future Astronomical Systems

Doug Tody (tody@noao.edu)
Fri, 20 Oct 1995 06:23:09 +0100

Hi Folks,

I have spent some time looking at the FADS discussions to date and
considering what if anything to post. It is tough, as the issues are just
too complex to get into in a short email, particularly with the ADASS coming
up. Books could be (and have been) written on the issues touched on in
these discussions.

My main comment is that, except for a few cautious comments by people such
as Brian Glendenning, Eric Mandel, and a few others, most of the
discussion has been focused on systems design. Whether they realize it or
not some folks are talking about nothing less than designing a new data
analysis system from scratch comparable to existing systems such as IRAF or
AIPS* (or Khoros, or AVS, or ...). Assuming that such an effort were
successful, which is a very big assumption, what we would end up with after
many years of effort would be just another system, not a set of standard
interfaces. This is true because you are talking about integrating together
a number of complex interfaces to support real world applications, which is
classical systems design, not a standards effort where some low level
library or data structure is defined. A standard API is not very likely to
happen unless these interfaces look a lot like something that is already out
there.

Many things have been said about basic system design, most of which I agree
with. The user interface should be separated from the applications code.
The application should be usable with a variety of user interfaces. One
should not get too deeply in bed with any one operating system or with any
one windowing system such as X, or for that matter with any one programming
language, at least not if the software is expected to be around for a
while. There should be a class library implementing the primary data
structures and access routines needed for astronomical data. A
multiprocess, object oriented or object-like architecture based on messaging
(e.g. distributed objects) is needed for a large system to keep things
modular and flexible. This does not necessarily mean that OO languages need
to be used at the module level.
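
As a minimal sketch of the separation described above (this is not code from
IRAF or any other existing system; the names image_stats, handle_message and
text_ui are hypothetical, and JSON is only an assumed message encoding), the
application code below knows nothing about its user interface and is reached
only through messages:

```python
import json


def image_stats(pixels):
    """Application code: pure computation, no knowledge of any user interface."""
    flat = [value for row in pixels for value in row]
    return {
        "npix": len(flat),
        "min": min(flat),
        "max": max(flat),
        "mean": sum(flat) / len(flat),
    }


# Hypothetical task registry; a real system would populate this dynamically.
TASKS = {"image_stats": image_stats}


def handle_message(message):
    """Messaging layer: decode a request, dispatch to a task, encode the reply."""
    request = json.loads(message)
    result = TASKS[request["task"]](**request["params"])
    return json.dumps({"task": request["task"], "result": result})


def text_ui(reply):
    """One possible front end: render a reply as plain text lines."""
    body = json.loads(reply)
    return "\n".join(f"{key} = {value}" for key, value in sorted(body["result"].items()))


if __name__ == "__main__":
    request = json.dumps({"task": "image_stats", "params": {"pixels": [[1, 2], [3, 6]]}})
    print(text_ui(handle_message(request)))
```

Because the task is reached only through handle_message, a GUI, a script, or
a process on another machine could presumably replace text_ui without any
change to image_stats.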

A high level application-specific environment is needed for large scale
applications development by professional programmers, and is especially
needed for development by scientific users. Ideally the user should be able
to extend the system by programming in a high level interpreted language
(scripting language) providing a lot of built-in application specific
functionality. If the user needs to do any compiled programming the
interfaces should be well isolated and defined so that the user doesn't need
to know any more about the guts of the system than they need to. Naturally
one needs a suite of standard applications as well for things everyone
needs to do, such as data access and management, graphics, image display,
and so on.

And so it goes: we could go on like this forever and probably will. Good
stuff. It was good to see all this reiterated in the FADS discussions.
There is not much new here however. We already have systems which follow
the above model; IRAF is one example.

== Large Systems ==

Let's not confuse "monolithic" with "large system". A large system does not
have to be monolithic, and in fact cannot be if it is larger than a certain
point. Large systems, if they are successful, are always modular inside
with well defined interfaces, even if they appear from the outside to the
user to be monolithic (i.e. well integrated, a whole?). It is the smaller
packages which are most likely to be monolithic, and in fact this may be one
of their advantages since for a small system it may make things simpler. A
well designed large system will be modular and usable either at the module
level (library, task, package, server, etc.) or as an integrated system.
Both modes of usage are desirable for a single system.

Astronomy has a difficult software challenge in that the problem we have to
solve is very complex but our resources are quite limited. We operate in
half a dozen or so quite different wavelength regimes, with many classes of
instrument in each, and the data sets can be both large and inherently
complex. Telescopes and instruments are constantly evolving. Large
applications may take a number of man years to write and large packages many
man years to implement; 50-100 man years is not unheard of for large
astronomical projects. Modern digital datasets (world wide) are many
Terabytes in size and growing. Most of this data is now archived or will
be soon, and we will need to be able to access these complex data sets
many years in the future.

Because we have to deal with such large problems with very limited
resources, longevity of our systems is very important (much more important
than in most commercial systems for example), as is reuse. Software must be
well structured, with applications sharing common code libraries, or there
is more duplication of effort than we can afford and programs will not be
well integrated. This is why we have large systems. We have large systems
mainly because they are needed to solve the data processing problems faced
by the major centers, e.g. standard reductions for the telescopes and
instruments operated by the centers.

== User Programming ==

Much of the FADS discussion has focused on the concerns of individual
programmers or users who want to write some software but are faced with
either going it alone, or trying to work with someone else's large system.
From the individual programmer's point of view the large system is both a
curse and a blessing. It is a curse because you have to learn a large
system and many design decisions are being forced upon you. It is a
blessing because the large system provides most of the facilities you need to
build a new application, and if you use these common facilities your
application will automatically work with other applications written for the
same environment. Less obviously, if the environment you develop for is
stable over time, i.e., isolates your application from the platform, window
system, user interface, and so on, your application may continue to function
indefinitely with minimal or no maintenance.

An important thing to realize about user programming (folks in the community
developing software for an existing system) is that supporting this is often
secondary to delivering applications software written by the project itself,
for internal use or to support a particular community of users. In the case
of IRAF for example our primary responsibility, at NOAO and at the other
institutions developing software for IRAF, has always been to write our own
applications software. We want folks in the wider community to write
software for IRAF, we support this, and a lot of such software has been
written - but supporting this has never been our top priority. I can't
speak for the AIPS* systems but my impression is that they are in much the
same boat. This is in contrast with IDL, which is just the opposite,
emphasizing nothing but user software development (this is one of the main
reasons for the popularity of IDL - IDL is a very "hands-on" system, and
people always like software they wrote themselves much better than someone
else's software).

Applications development will always be a top priority for the IRAF projects
but in the future we would like to do more to support user software
development. A system makes a name for itself initially with its
application software. As a system continues to grow however, software
development by the user community becomes increasingly important as
otherwise the size of the core programming group or groups will limit
growth. Easy user programmability at several levels is essential for a
scientific system if end users are to get the most out of a system, hence
programmability or extensibility is an important measure of the overall
quality of a system.

== Future Systems ==

So where do we go from here? BOFs like FADS help focus attention on the
issues and this is positive. I can't see much however coming of "resolutions"
which are not backed by the community or any significant resources, and it
is even sillier to talk about anything called a "mandate". Give me a break!
Things happen because people make them happen. Standards aren't designed
by committees; they are successful software products which a lot of people
already use.

It is very unlikely that, within the relatively small worldwide astronomical
community, any committee is going to go off and invent a new system to
challenge the ones we already have. I wonder sometimes with the decline in
science funding worldwide if we will see any more large general purpose
software systems come out of astronomy, at least in the next decade or so.
Probably not, unless they are small systems that leverage off of some
external technology (e.g. the Web-based solutions we are seeing now in some
areas). Given the complexity and uniqueness of the astronomical software
problem this type of approach is unlikely to address our real problems.
There are lots of useful little things out there in the free software
community we can use but we still have to build systems that are useful for
astronomy.

More likely, improvements will come in small ways: by making our existing
systems better, by improving the interoperability of our systems, and by
taking advantage of new developments outside astronomy such as fast, cheap
PCs, the great wealth of free software increasingly available, and
technological improvements such as the global network and distributed
processing. As Fred Brooks has said (several times in the past decade),
"there is no silver bullet". People making the right decisions make things
work.

My guess is that most advances in interoperability will come not from a
standard API for astronomical applications, but from improvements to the
integration of modules at the process level, e.g., distributed objects of
various types. APIs define class libraries and nontrivial class libraries
are inherently complex. Large class libraries form a hierarchy of
integrated classes and dependent classes. Cross-system integration at this
level is very difficult. Integration of entire programs (objects) at the
process level is much easier because you don't have to standardize what lies
within the process/object boundaries. In the commercial arena Visual Basic
is a successful example of this, as is UNIX shell programming for that
matter. In the FADS discussions, James Coggins' description of ARCTIC sounds
like a similar evolution although the emphasis here seems to have been more
on a flexible structure than on integrating heterogeneous environments.
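
To illustrate why integration at the process level is easier, here is a
minimal sketch in the spirit of a UNIX shell pipeline. It assumes nothing
about IRAF or ARCTIC: PRODUCER and CONSUMER are hypothetical stand-in tools,
each run as a separate process, and the only thing they agree on is the text
stream that passes between them.

```python
import subprocess
import sys

# Two stand-in "tools". Each runs as its own process; neither knows how the
# other is implemented, only that numbers arrive one per line as text.
PRODUCER = "for value in (3, 1, 2): print(value)"
CONSUMER = "import sys; print(sum(int(line) for line in sys.stdin))"


def run_tool(source, stdin_text=""):
    """Run one tool as a separate process and return what it wrote to stdout."""
    done = subprocess.run(
        [sys.executable, "-c", source],
        input=stdin_text,
        capture_output=True,
        text=True,
        check=True,
    )
    return done.stdout


def pipeline(*sources):
    """Chain tools so each one reads the previous tool's output on stdin."""
    data = ""
    for source in sources:
        data = run_tool(source, data)
    return data


if __name__ == "__main__":
    print(pipeline(PRODUCER, CONSUMER), end="")
```

Either tool could be rewritten in another language or taken from another
system without touching the other, which is the property a shared class
library cannot easily offer.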

== The Open IRAF Initiative ==

These issues have been a matter of much discussion within the IRAF community
for the past several years. We feel that IRAF, at least as currently
presented to the community, is too much of a closed, tightly integrated
system - probably what has been meant by "monolithic" in these FADS
discussions. IRAF serves the needs of the large centers or projects well
but is less successful at addressing the needs of the individual astronomer
or small project who/which needs to develop their own software. We have
facilities to support such use but they fall far short of what we or our
users would ideally like to have. IRAF is a very modular system internally
with most of the characteristics mentioned in the first few paragraphs
above, although this may not be evident to someone on the outside looking in.

Of course we in the IRAF community are already doing something about this
and many of you will have already heard about this effort, which is called
the Open IRAF initiative. This addresses a lot of the concerns which have
been expressed, e.g. breaking the system modules out as separate products
and making them useful individually, and giving the programmer more
flexibility in how they make use of the facilities provided by the system.
It would be inappropriate to go into this in detail here but very briefly
the main elements are as follows:

1) Provide language bindings for the main IRAF libraries (the class
libraries mentioned above). In addition to the native IRAF
preprocessor language, which provides a high level of isolation
from language evolution, C, Fortran and C++ will eventually be
supported, allowing applications programmers to choose which
language to program in. Note that IRAF programs written in C
(for example) are still IRAF programs, indistinguishable from
other IRAF programs unless you look at the source code.

2) Export the main IRAF libraries so that they can be used in non-IRAF
programs. These are external host-level programs, or programs in
other astronomical systems, which call the IRAF libraries using
one of the language bindings. The reverse is also supported, i.e.
calling external libraries from within IRAF programs.

3) There is a third category of enhancements we have been calling
"object glue" for lack of a better name. This refers to things
which make it easier to use pieces of IRAF with pieces of other
systems. For example, calling IRAF tasks standalone outside the
normal IRAF system (this has always been possible), using the CL as
a host (e.g. UNIX) shell for scripting (#!/bin/cl scripts), which
allows entire IRAF tasks to be treated as host level tasks,
improved interprocess communication and messaging, distributed
object support, and support for runtime access to external data
structures such as FITS and the PC image formats.

There are lots of other things going on, for example the X11IRAF code is a
completely separate product from IRAF which is nonetheless well integrated
with IRAF when presented to the user in the installed system. But it could
be used in any other system as well. The Widget Server, a part of X11IRAF,
addresses the problem of separating the user interface GUI from the
applications code and insulating the system from the eventual demise of X
mentioned by Will Deich (even if X goes on for years, and it will, you may
still want to use something else on a new platform).

There is a lot more involved in this as would be the case in any large
system. This project will be discussed further in the IRAF BOF and in the
IRAF developer's workshop on Thursday.

Finally, lest anyone get the wrong impression - despite all this emphasis
on system stuff we are still writing lots of applications software within
the IRAF projects!

Doug Tody National Optical Astronomy Observatories IRAF project
tody@noao.edu P.O. Box 26732, Tucson, Arizona, 85726 iraf@noao.edu