Cartographic Computing Technology

Michael D. Godfrey

Research Department, Sperry Univac, Blue Bell, PA, USA

For presentation at Euro-Carto I, Oxford, 13 December, 1981

Contents

1.0 Introduction
2.0 Cartographic Data
2.1 Information about the Data
2.2 The Data
3.0 Limits of Computation
3.1 Physical Constraints
3.2 Economic Constraints
3.3 Ergonomic Constraints
4.0 Technology
4.1 Logic - VLSI Systems
4.2 Data Storage
4.3 Data Management
4.4 Systems Software Services
4.5 Languages
5.0 Conclusions
6.0 References


1. Introduction

The purpose of this paper is to present some views about the organization of data, and some facts about the nature and development of computing. The present state of computing in general, and data management in particular, is somewhat confused. One major cause of confusion is the fact that most computing systems are still constructed using architectural models which were established in the 1950s. These systems run software for which the architectural models, if any, were established in the 1960s.

A key assumption of these architectures is best given by quotation from the original source. F. L. Alt [1] (p. 270) stated: “A necessary condition in order that a problem may be handled efficiently on any large machine, in preference to computing it by hand and with desk machines, is that the input of orders and numbers be small in comparison with the number of operations performed.” This, and the evident technical trouble of handling large volumes of data, have set the architectural framework for machines down to the present. It is still true today that the data are always fetched into the instruction processing unit for computation, rather than sending the instructions to where the data are. This can only make sense if there is a relatively small quantity of data to be moved. Thus, it is not surprising that the demands that many users place on present systems are not met in an efficient and natural manner.

Add to this the fact that very few people today know or remember why the current architectures were originally created, and confusion is nearly inevitable. However, I will argue that things are going to get better in basic and important ways. Specifically, it will be recognized that the data are in overwhelmingly greatest quantity, and that the basic requirement of computing is the ability to describe data and suitable data transformations. The descriptive and data-oriented structure of current computing problems will motivate new architectures to address these problems in a direct and efficient manner.

2. Cartographic Data

Machine-assisted cartography is just one discipline where the volume and complexity of data have overwhelmed current techniques. There still exists a strong tendency to accept the conventional wisdom of computing and attempt to describe and manipulate cartographic data by essentially manual means. If computers are to be really useful in cartography they must automate the storage, manipulation, and description of the data. Thus, two problems must be addressed. First is the management of information about the data, i.e. the relational structure. Second is the management of the data themselves. Solving the first problem greatly simplifies the solution of the second.


2.1 Information about the Data

Within cartographic data processing organizations there have been numerous attempts to describe cartographic data by means based on the tape units of F. Alt’s Bell Labs machine. Sequential files are defined and the layout of the records in the files is specified in complete detail. Even if this procedure allows the new user to actually read the tape (not a likely event), it leaves him with the burden of developing the relational structure of the data and its organization into a form suitable for his problem. Thus, hardly any of the work involved in understanding the meaning of the data has been accomplished. I propose that the description of cartographic data in such terms as “field 16 of the master header record...” is entirely inappropriate for purposes of using the data. Such internal details should be left to the computer.

The user of cartographic data sees the structure of the data as a set of relations. Each member of a relation may itself be a relation, or it may be a data item. Such relational structures must be general in both a definitional and a dynamic sense. The base of such a structure is naturally a dictionary of known objects. From this dictionary the user may obtain the names of interesting objects. He may then ask questions about these objects to discover their attributes, including relations to other objects, and he may insert new objects, attributes, and relations. Thus, the appropriate definition of cartographic data for purposes of user understanding, and for data interchange, is a data dictionary and the rules for construction and manipulation of data and relations.
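The dictionary-and-relations view described above can be illustrated with a small sketch. The object names, attributes, and relations below are entirely hypothetical; this is a minimal illustration of the kind of interface a user might see, not a description of any existing system.

```python
# Hypothetical data dictionary: each named object carries attributes
# and named relations to other objects. Internal storage details
# (record layouts, field offsets) are invisible at this level.
dictionary = {
    "road_a40": {"attributes": {"class": "trunk road"},
                 "relations": {"crosses": ["river_thames"]}},
    "river_thames": {"attributes": {"name": "Thames", "kind": "river"},
                     "relations": {"crossed_by": ["road_a40"]}},
}

def known_objects():
    """Names of all objects the dictionary knows about."""
    return sorted(dictionary)

def attributes_of(name):
    """Attributes of a named object."""
    return dictionary[name]["attributes"]

def related(name, relation):
    """Objects standing in the given relation to the named object."""
    return dictionary[name]["relations"].get(relation, [])
```

The user navigates by name and relation only; “field 16 of the master header record” never appears.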

2.2 The Data

Given the structure indicated in the previous Section, the data are fully described by suitable attribute and descriptor entries. Therefore, the internal representation of the data is no longer visible or of interest to the user. This solves a wide range of representation and compatibility problems. Specifically, as pointed out by Haralick [2], the question of raster representation versus vector format becomes an internal implementation issue. The data system may be requested to present the data in either raster or vector form. Either can be presented regardless of the internal form of the data. In order for the data to be understandable to users, the data must be fully described in terms of attributes and relations. If this is done then the user need only see the information which he requests by name or by relation as it is presented by the data system.
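As a toy illustration of presentation being independent of internal form, the sketch below stores a feature internally as a vector polyline but can present it in either view on request. The function names, and the vertex-only rasterization, are assumptions made for illustration only.

```python
# Internal form: a list of (x, y) vertices. The user asks for a view;
# the internal representation never changes.
def as_vector(feature):
    """Present the feature in vector form."""
    return list(feature)

def as_raster(feature, size):
    """Present the feature on a size-by-size grid (vertices only,
    for brevity; a real system would rasterize the whole line)."""
    grid = [[0] * size for _ in range(size)]
    for x, y in feature:
        grid[y][x] = 1
    return grid

coastline = [(0, 0), (1, 2), (3, 3)]
```

The same internal polyline answers both a vector request and a raster request, which is the implementation freedom the text describes.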

3. Limits of Computation

It has become popular to assume that computing systems will continue to improve in speed, capacity, reliability, and performance per unit cost. Sometimes it is even assumed that such improvement can be expected to be uniform across various characteristics, or even uniform in time. However, there are a number of constraints on such improvement. These constraints are physical, economic, and ergonomic.

3.1 Physical Constraints

The basic physical constraints on computing are the speed of light and the fact that energy is required for entropy destruction. It now appears that efficient computing will not operate at speeds greatly in excess of what is achievable today. However, the efficiency of this computing will increase by several orders of magnitude. Computation at, say, 4 MIPS (Million Instructions Per Second), which now requires on the order of 30 kVA, may be done in the future at an energy consumption rate of a few watts. This will tend to imply that processors which operate at a few MIPS will become readily available in convenient and low-cost forms. This trend is already well underway with systems such as the VAX 11/730, or Motorola MC68000 based systems. Such systems will improve rapidly for several more years.

The physical constraints on the storage of data are not so well understood for practical purposes. In any case, the kind of change which VLSI has brought to processing is not evident in data storage technology. It is expected that magnetic disk media will provide the main general-purpose data storage mechanism for the foreseeable future. Such devices will continue to be improved in data density, data access, and cost per bit of storage, but the improvements will tend to be by factors of two or four per five-year generation. Thus, the rate of change of efficiency for data storage will lag far behind that for processing. This will tend to mean that there will be an increasing substitution of processing for storage, where this is possible. For example, data compression will become increasingly attractive in a wide range of contexts. Generally, data storage techniques which require substantial computation in order to present the data correctly will not be at a comparative disadvantage against alternatives which require less computation but more storage space. In addition, the cardinal rule of data management, namely that the data should not be replicated, will become even more firmly established.
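The substitution of processing for storage can be seen in miniature with a general-purpose compressor; the highly redundant sample data below are invented for illustration.

```python
import zlib

# Highly redundant data, as raster map data often are.
data = bytes(1000) * 10   # 10,000 zero bytes

packed = zlib.compress(data, level=9)   # spend computation ...
assert len(packed) < len(data)          # ... to save storage
assert zlib.decompress(packed) == data  # and recover the data exactly
```

As processing becomes cheap relative to disk, trades of this kind become attractive in more and more contexts.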

3.2 Economic Constraints

The basic economic factors in computing may be listed as follows. For $1000 one may purchase:

1. 1 MIPS processing resource
2. 30 MB disk storage
3. 30 × 10³ MB tape storage.

While these relations and values will change somewhat during the next few years, the obvious point is that for data management purposes the bulk of the cost will be in disk storage devices. This fact will pose serious problems for anyone who is contemplating the establishment or operation of large database systems. The effectiveness of disk storage will improve to such an extent that investment in disk equipment should continue to be amortized over a short period. Typically, upgrading of disks will only be possible by unit, or subsystem, replacement. Due to the electromechanical operation, the reliable operating life will always be relatively short.
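The figures above imply roughly the following costs per megabyte; the arithmetic below simply restates the paper's numbers.

```python
budget = 1000.0    # dollars
disk_mb = 30.0     # MB of disk storage per $1000
tape_mb = 30.0e3   # MB of tape storage per $1000

disk_cost_per_mb = budget / disk_mb   # about $33 per MB of disk
tape_cost_per_mb = budget / tape_mb   # about $0.03 per MB of tape
```

A thousand-fold gap between disk and tape cost per megabyte is why the bulk of a large database budget goes to disk subsystems.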

3.3 Ergonomic Constraints

It is often assumed that only the computing system limits the usefulness of the system, and that the ergonomic interface can be improved without limit if only the right techniques can be developed and applied. However, there are fixed limits to such a process. People read at about 200 baud and listen or talk at rates only moderately above this. In addition, such rates cannot be sustained for long periods, as we all know. The use of graphics is intended to improve on this performance, but often the result is that much more information is presented but not much more is understood. The trend from scenic maps to various forms which are intended to contain more quantitative information has led to many cases of obscure and difficult information presentation.

The effective limit of the bandwidth required between the computer and the user seems to be determined by the characteristics of the user. This limit seems to be quite a small value, and is evidently below the bandwidths which are now in use in many terminal and display systems. Thus, it is unlikely that these bandwidths will be increased substantially in the future.

4. Technology

In this Section I will discuss the current state of technology and give what indications I can of the near-term prospects for significant changes. However, the “present” for me tends to be what seem to be solved research and development problems. At present it takes about six years from development to delivery of a major computer system. Thus, in the worst case, the present may be up to six years away.

4.1 Logic - VLSI Systems

The current state of development in VLSI (Very Large Scale Integration) permits the construction of processor or memory chips which contain on the order of 100,000 logic devices. Examples of such chips are the Intel iAPX 432 or the Motorola MC68000. Either of these is a full computing system with internal performance comparable to a small mainframe such as the IBM 4331 or DEC VAX 11/780. Both of these chips are built on a scale of about 4 micron minimum feature size. The operation speed is about 0.5 MIPS, and the address width is 32 bits for the 432 and 23 bits for the MC68000. Purchase prices are about $200 to $300.


The basic objective of the main VLSI development is to produce higher integration through smaller feature size and operation at the minimum power point of the technology. Under these conditions, very roughly, system performance increases as the inverse square of the minimum feature size. Thus, a processing chip using 2 micron lithography will operate at about 4 times the rate of a similar chip made at 4 micron size. At present, advanced experimental systems can fabricate at 2 micron size. The technology for 1.2 micron size is well established for operation by about 1983, with 0.7 micron projected for about 1985. These features of fabrication capability imply that single-chip processors which operate at internal processing rates of 4 to 6 MIPS will be available during the next few years. There will be an increasing tendency to improve current designs by reduced scaling. For example, the MC68000, which currently operates at 8 MHz, is being enhanced to 12 MHz. This general path of development will not likely produce the fastest possible computation speed, but it will produce the most efficient (minimum power) processing which can be presently imagined.

However, it is easy to realize that the full capability of the increasing levels of device integration will only be realized if a completely new approach to design can be developed. The current VLSI circuit design techniques are surprisingly similar to the current state of ‘automated’ cartography. Specifically, while a computer is used to retain, display, and carry out simple manipulations on the data, the hard parts are still done by people. These people must have extraordinary skill and motivation. In the VLSI case they are asked to lay out the best topography for on the order of 100,000 interconnected objects. In the VLSI community the analogy is often made with the layout of streets and houses on a map. With current levels of complexity these procedures result in development costs of about $10⁷ per device and development times of several years. Errors in such work are almost inevitable and are catastrophic. Individuals are literally ‘burned out’ by the demands of such work.

Research work is underway to provide true automation of such designs, and therefore to extend the design possibilities to the vastly more complex structures which will be possible in the near future. More progress depends on the outcome of this research than on any other activity in computing today. This research requires new understanding not only of graphics, topography, and data manipulation, but also of computer language technology.
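The inverse-square rule quoted above can be restated numerically; this is only the rough rule of thumb from the text, not a device model.

```python
def relative_performance(old_feature_um, new_feature_um):
    """Very rough rule of thumb: performance scales as the inverse
    square of minimum feature size."""
    return (old_feature_um / new_feature_um) ** 2

# Shrinking from 4 micron to 2 micron lithography roughly
# quadruples performance; a further shrink to 1 micron would,
# on this rule, give a sixteen-fold gain over the 4 micron chip.
relative_performance(4.0, 2.0)
relative_performance(4.0, 1.0)
```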

4.2 Data Storage

The current state of magnetic disk storage is represented by the IBM 3370. The 3380, which is about to start deliveries, represents about a factor of 4 improvement over the 3370. The main technical change is the introduction of thin film heads for reading and writing on the disk surface. The thin film heads help to permit reduced dimensions and reduced costs. Developments which are underway, including new recording media and a new recording mode, should produce another factor of 10 improvement in performance during the coming 5 to 10 years. However, at the point then reached it is hard to see how substantial additional gains in magnetic recording could be achieved.


The use of optical recording has been given much recent publicity. In principle, this technology has advantages in cost and storage performance of about 2 orders of magnitude as compared with 3380 level magnetic recording. However, substantial unsolved technical problems stand in the way of effective use of optical recording for computer data systems. These problems include the difficulty of writing data as opposed to reading it, an unacceptably high intrinsic error rate, and the lack of archival storage quality. While it is widely believed, and hoped, that magnetic disk will be replaced by something of much higher efficiency and stability, that replacement is not yet clearly in sight.

4.3 Data Management

Data management is a relatively young branch of computing. It, unfortunately, got off to a rather poor start. For these reasons it is just beginning to become effective. It is now being realized that the essential benefit of data management is the automation of the management of the names and relations of objects. Early data management systems either did not fully recognize the need for full automation or thought of such automation in a relatively useless static sense. It may still take one or two generations of implementations before the implications of these needs are fully assimilated. Clear expressions of the needs from users of data management services could significantly accelerate development.

For many reasons, not the least of which may be legal constraints, data management systems will increasingly be logically distinct from other processing functions within a computing system. However, this structure, which many people have promoted for more or less good reasons, conflicts in a basic way with all current computing system designs and implementations. It is for this reason that the various ‘database processor’ systems which have been investigated or announced recently have faded or been dropped. The basic conflict is that current system designs are based on the ‘single point of serialized control’ model of computing system design. This model was implicitly created with the first computers, and has been with us ever since. A new model which recognizes decentralized control as a basic (or the basic) construct will be required before autonomous data management systems can begin to function effectively.

4.4 Systems Software Services

It is this technology area for which I, possibly optimistically, hold out the greatest hopes for the near future. The pressure to create systems which are better matched to users’ needs is substantial today. In large part this pressure comes from the personal computer market. Such systems as CP/M convince the naive user that computers are not so difficult. It is becoming harder for the major systems vendors to explain why all the convenience and ease of use of CP/M has to be discarded when using a large system.

4.5 Languages

Reviews of technology in computing do not usually single out languages as a prominent area of deficiency. However, it is my view that the deficiency of our present languages is as inhibiting as (and is related to) our inability to do efficient VLSI design. Current language systems are based on the assumption that the algorithm is the central concept of computing and that this concept must be directly expressed by the user. This, of course, stems from the original conception of computers, which applied many operations to small quantities of data.

It should be clear by now that the natural means of expression of a wide range of problems which are suitable for computer processing is in terms of the input data and the desired output data. The transformation which is required in order to produce the output data is the natural expression of the problem. Such a transformation is most easily expressed in terms of transformation rules. This has nothing overtly to do with algorithms. It may be all right for the computer to apply some algorithms in the process of carrying out the desired transformation, but this should not be of any concern to the user. While substantial research has yet to be done, it appears that such rule-based languages will not only be natural and easy to use, but will be vastly more compact and efficient than current language systems. Such results will provide the basis for greatly improved means of data access and manipulation.
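A rule-based transformation of the kind described might look like the sketch below: the user states rules relating input records to output records, and any internal algorithm is left to the system. The record fields and the rules themselves are invented for illustration.

```python
# Each rule pairs a condition on a record with a transformation of it.
# The user states *what* should hold of the output, not *how* to
# traverse or store the records.
rules = [
    (lambda r: r["kind"] == "contour",   lambda r: {**r, "style": "thin"}),
    (lambda r: r["kind"] == "coastline", lambda r: {**r, "style": "bold"}),
]

def transform(records):
    """Apply every matching rule to every record. The iteration
    strategy here is one possible algorithm; the rule set itself
    does not prescribe it."""
    out = []
    for rec in records:
        for condition, action in rules:
            if condition(rec):
                rec = action(rec)
        out.append(rec)
    return out
```

The same rule set could equally be executed by a quite different engine; that freedom is the point of the rule-based formulation.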

5. Conclusions

This paper has attempted to present some facts about current computing technology which are relevant to the use of computers in cartography, and particularly to cartographic data management. The presentation of these facts has, of course, been colored by my own perceptions, and I have taken the liberty of injecting a few direct opinions. I hope that the facts may be helpful and that the opinions will stimulate some serious discussion. The field of automated cartography is important and should be a source of enlightened criticism of current computing. This criticism could help to stimulate the basic change which computing will have to undertake if it is to become an effective contributor to individual and social well-being.

6. References

[1] F. L. Alt, “A Bell Telephone Laboratories’ Computing Machine-II,” reprinted in The Origins of Digital Computers: Selected Papers, ed. B. Randell, Springer-Verlag, 1973.

[2] R. M. Haralick, “A Spatial Data Structure for Geographic Information Systems,” in Map Data Processing, Academic Press, Inc., 1980.
