License

This work is licensed under the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) License. To view a copy of this license, visit https://creativecommons.org/licenses/by-sa/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.

Copyright 1992 Andrew Tyler

Released under CC BY-SA 4.0 2018

Preface

“A picture is worth a thousand words”. This statement sums it all up.

A few years ago, when I first opened a book on computer graphics, I was stunned by the beautiful simulations of life-like objects generated by computers. But these were from state-of-the-art machines, far more powerful than the popular personal microcomputers of the time, which were almost exclusively 8-bit.

With the advent of 16-bit micros things changed markedly. Their extra power and memory had an immediate impact on all graphics applications, from painting programs to fast flight simulators sporting solid 3D primitives (objects). The low price and high power of micros such as the Commodore Amiga meant that anyone could enjoy high quality computer graphics (especially in games) at relatively low cost. But enjoying other people’s programs is only half the fun. Surprisingly, writing them is not really as difficult as it looks. Of course there is a fair amount of technology to be learnt along the way, but a good deal of the dramatic effect comes from the speed of the machines themselves, performing fairly standard algorithms very fast.

When I first became interested in graphics programming and wanted it to be as fast as possible in machine code, it seemed to me that essential information was spread thinly in the literature. There were certainly books on machine code programming and on computer graphics; there were even a few books on machine code graphics programming. But somehow I could never quite find the balance I was looking for. Standard texts on computer graphics seemed amazingly obscure on certain aspects of transforms, in particular how to picture a scene from an arbitrary view point. I felt, quite unreasonably perhaps, that there was a tendency to hide it all behind a smokescreen of professional mystique; certainly it helped considerably to understand the mathematics of vectors and matrices, but surely all this had been worked out years ago and ought to be fairly straightforward? Perhaps it was just me! Anyway I wanted to write 3D solid graphics programs that would run in real time (like a flight simulator), and couldn’t find anyone who would tell me how to do it. For sure the people who write commercial games knew, but they weren’t telling — for obvious reasons! There were a few very useful serialised articles in magazines but, by necessity I’m sure, these were often too brief and not exactly what I wanted.

Things came to a head when I was assigned to give a college course on Advanced Microcomputer Software (which was another way of saying “Assembly Language Programming on the 68000”). Teaching programming, especially in assembly language, can be a very sterile pastime unless the application is interesting. What better application than graphics and what better machine (for the price) than the Amiga?

This book arose from my efforts to penetrate the world of computer graphics and make some of the basics understandable (I hope) to non-specialists. It is about fast 3D (so-called vector) graphics in assembly language. There is certainly no guarantee that the programs in this book are the most efficient, most elegant and fastest of their kind. But they are reasonably fast. Certainly as fast as some commercial programs! The astute reader will undoubtedly be able to make improvements (and tell me, I hope).

There is no assumption that the reader has any prior knowledge of any of the following subjects, all of which eventually figure heavily in the graphics process: the Amiga Operating System, vectors and matrices. There are further explanations in the Appendices. That is not to say that the book contains exhaustive discussions of these subjects, only sufficient for the purpose in hand. The enthusiast will undoubtedly wish to add to them.

As regards the assembly language, although an Appendix contains a list of the instruction set and (most important) the addressing modes, it is assumed that the reader who wishes to fully understand what is going on will have on hand a 68000 code reference book (they are available in pocket form very cheaply).

For the writing, assembly, debugging and running of the programs in the book the powerful and friendly Devpac Amiga assembler from Hisoft, which is one of several good commercial assembler/debuggers, has been used. This comes as an integrated package within which all functions can be performed. Further information on the assembler is given in Appendix 2.

The book is laid out in serial form. Each chapter deals with a different topic and illustrates its application with example programs. To the experienced reader the early chapters will seem pedestrian. To the newcomer they will not. There is really no easy introduction to the overall process and so each stage (a somewhat artificial division) is dealt with in detail separately. Each stage of the graphics “pipeline” does a specific task and has its own algorithm and strategy. The chapters are laid out to reflect the build up of the overall process. Each chapter has its own example programs and the programs saved from the earlier chapters are used in later ones so that they don’t have to be entered more than once. In this way the example programs at the end of the book end up being the largest and most complex, though the amount of code you have to enter for each new chapter doesn’t really increase very much. The programs are written for the Amiga but can be modified to run on any 68000-based computer since, with the exception of certain specifics concerned with the screen and operating system, the graphics routines are entirely independent and self-contained.

Computer graphics is a vast subject; a book of this length can only cover a small part. Especially since it is not just descriptive but contains working programs. Techniques such as Ray Tracing and Radiosity methods are perhaps better suited to a future, more powerful generation of personal computers. But that will come; it is likely that many of the software routines discussed here will be replaced in future machines by hardware “geometry engines”.

Until then, 3D graphics will have to be done by “bashing the bytes”.

One last very important word of caution. The experienced programmer knows all about it.

Do make frequent back-ups of your work — about every two hours. In writing programs of this kind, close to the hardware, there is no safety net! A faulty program can easily crash the system and spew garbage at the disk drive as it goes down. It’s happened to me. You want to lose as little of your recent work as possible. Associated with this is a useful practice — put on the disk write-protect before you run a new program. It’ll help save the disk if it crashes.

Good luck.

Andrew Tyler, 1992

1. An Overview

Computer graphics is not a minority interest of computer freaks. It is a multi-billion dollar industry. Even in 1982 when Hollywood spent 3 billion dollars on movie production, the world commercial computer graphics industry spent 2 billion dollars and was growing at the rate of 30% a year. In the same year in the U.S. 10 billion dollars were spent on video games. There has been no halt since that time. Computer graphics is very big business indeed.

The microcomputer owner meets some of the best graphics for his machine in games, many of which use advanced concepts straight out of the professional computer journals. For small machines there are always limitations on what can be achieved, determined by the speed of the processor and the size of RAM. But in recent years the popular microcomputer has been extremely good value for money, having considerable computational power at very low price and providing complex graphics at minimal cost. The Amiga is just such a computer. This explosion in the power/price ratio of computer hardware has put immense computing capability in the hands of the popular micro owner and made advanced graphics techniques, which were the domain of the professional, available to anyone.

The aim of this book is to develop fast 3D solid graphics routines which run in real time and include features such as windowing (clipping), hidden surface removal, illumination from a light source, joystick control, and full perspective and rotational transforms, ending up with a flight simulator type program. The programs are written in 68000 machine code to run on an Amiga 500 but the algorithms are valid for any machine. In short, everything needed to get started on a flight simulator.

The programs are written in assembly language for maximum speed and have been tested and run using the Hisoft Devpac Amiga assembler. There are many excellent commercial assemblers available at modest expense, and even some in the public domain. There is nothing more irritating when looking for a persistent and obstinate bug in a program than an unfriendly assembler. The Devpac assembler has been a friendly and helpful companion through the many hours required to develop the programs in this book.

1.1. A New Medium

What is ‘computer graphics’? It is certainly shrouded in mystique to some degree. Because it is still a relatively young subject its evolution is continuing apace, and is intimately linked to the power of current computers and the special graphics hardware incorporated in them. The solutions to many of the problems of yesterday, once based in software, are now provided at great speed in hardware. It is likely that much of the software of the kind developed in this book will be replaced in future machines by dedicated ‘geometry engines’.

1.1.1. Is it Art, or What?

Humans are very good at generating and recognising complex visual patterns but not very good at doing arithmetic. By contrast, digital computers were designed to be perfect at binary arithmetic. What else they can do depends on how well complex mathematical functions can be constructed from basic binary arithmetic. There is a limitation here since numbers in a computer cannot be more accurate than the number of bits assigned to them but, apart from that, it is clear that complex mathematical calculations can be done quickly on even very modest microcomputers.

In computer graphics, the computer adds tremendous speed to any calculation associated with geometry, which is the mathematics of drawing. Because geometry is concerned with the exact mathematical relations between lines and surfaces, it is ideally matched to the way the computer works. This is the good and the bad news of drawing with computers: precise mathematical functions can be expressed graphically at lightning speed but making them look like natural objects requires considerably more work. In fact much of the effort in computer graphics is now concerned with ‘messing up’ the perfect but sterile images of geometry to make them fit for human consumption. Doing this has less to do with computers and more to do with the traditional skills of animation discovered many years ago by Walt Disney.

It is very easy to draw precise mathematical shapes with a computer because such shapes can be generated from a formula. A circle is an example of a simple mathematical function. For a circle centred at the origin of an x-y coordinate system the formula is

  x^2 + y^2 = r^2

Such a function is a good starting point for a billiard ball but a poor starting point for an apple, although superficially the difference is not all that great (both have an overall spherical shape with a shiny exterior). Let’s consider how we might use a computer to draw an apple.

First of all there has to be a good starting point. There is no such thing as a mathematical formula for an apple. All apples are different. However, apples do have a typical shape and that is what the human artist knows from experience. But an artist would not draw all the apples in a still life with the same shape, it would be too boring. Programming a computer to avoid repetition and simplicity is difficult.

One way to draw apples would be to use equations of curves having the apple shape. By choosing functions with high powers of x, y and z, as much sharpness or flatness as desired can be included. This is the world of bicubic patches, Bezier functions and beta-splines. This would certainly allow variation, but with considerable computational effort. One way to do this would be to hold different apple outlines as (x,y) coordinate pairs in a data base and then use curve and surface fitting techniques to connect them as in a “join the dots” picture. This is how the famous teapot of Martin Newell, which was a prototype in the early development of modelling solid surfaces, was constructed. In technical language it can be constructed from an outline consisting of three Bezier curves. Since the teapot is symmetrical, its surface (with the exception of the spout) is then generated by rotating the outline about the central vertical axis.
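As a rough illustration of the curve-based approach, the sketch below evaluates a single cubic Bezier curve and then sweeps the resulting outline about the vertical axis to form a surface of revolution. It is written in Python rather than assembler purely for readability. The control points and the names `bezier_point` and `revolve` are invented for this example; they are not Newell's teapot data.

```python
import math

def bezier_point(p0, p1, p2, p3, t):
    """Point on a cubic Bezier curve at parameter t (0 <= t <= 1)."""
    u = 1.0 - t
    b0, b1, b2, b3 = u * u * u, 3 * u * u * t, 3 * u * t * t, t * t * t
    x = b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0]
    y = b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1]
    return (x, y)

def revolve(outline, steps):
    """Sweep a 2D outline of (radius, height) pairs about the vertical axis."""
    surface = []
    for k in range(steps):
        angle = 2.0 * math.pi * k / steps
        ring = [(r * math.cos(angle), h, r * math.sin(angle))
                for (r, h) in outline]
        surface.append(ring)
    return surface

# Hypothetical control points for a rounded profile, as (radius, height)
CONTROL = [(0.0, 0.0), (1.5, 0.2), (1.5, 1.8), (0.0, 2.0)]

# Nine points along the curve, then twelve copies around the axis
outline = [bezier_point(*CONTROL, i / 8) for i in range(9)]
surface = revolve(outline, 12)
```

Four control points are enough to describe the whole profile, which is the attraction of the method; the cost is the arithmetic needed for every point on every ring.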

Another way is to avoid curves altogether, and instead subdivide the surface of the apple into many flat facets like a gemstone. The little facets, being flat and many sided, are polygons and the surface of the apple is a polygon mesh. This approach is less time consuming than using curved patches but there remains the problem of disguising the sharp boundary edges between polygons.

This leads to the next level of refinement in producing a convincing image. A mathematical function on its own knows nothing of the laws of physics. These are so familiar to us that we take them for granted: glass is transparent but wood is opaque, metals look bright and shiny but human skin is dull and diffuse. Somehow these subtle but essential clues must be included. The most important first step is to make the rear surfaces of opaque objects invisible. This is called hidden surface removal which, despite the apparent simplicity of the task, turns out to be quite difficult. Much time has been spent investigating efficient and thorough ways of doing this. Next there must be visual clues to the surface structure. One obvious step is to illuminate it with a light source so that one side is brighter than the other.
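One common way to approach both steps for a single convex object is sketched below in Python. It is offered only as an illustration and is not necessarily the method developed later in this book. It assumes each face lists its vertices anticlockwise when seen from outside the object, and that `light_dir` points from the surface towards the light; the function names are invented for the example.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def face_normal(v0, v1, v2):
    """Outward normal, assuming anticlockwise vertex order seen from outside."""
    return cross(sub(v1, v0), sub(v2, v0))

def is_visible(face, viewer):
    """A face is drawn only if its normal points towards the viewer."""
    n = face_normal(*face[:3])
    return dot(n, sub(viewer, face[0])) > 0

def brightness(face, light_dir):
    """Cosine of the angle between the normal and the light, clamped at 0."""
    n = face_normal(*face[:3])
    length = math.sqrt(dot(n, n)) * math.sqrt(dot(light_dir, light_dir))
    return max(0.0, dot(n, light_dir) / length)

# Front and back faces of a unit cube, viewed from further along +z
FRONT = [(0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1)]
BACK = [(0, 0, 0), (0, 1, 0), (1, 1, 0), (1, 0, 0)]
VIEWER = (0, 0, 5)
```

With these values `is_visible(FRONT, VIEWER)` is true and `is_visible(BACK, VIEWER)` is false, so the rear face is simply never drawn. The test is only sufficient for a single convex object; scenes with several objects need more work.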

At the next level of refinement the surface must be textured and patterned in a “natural” way to look real. In this the programmer is aided by the mathematics of fractals, developed and promoted by Benoit Mandelbrot. This is the geometry of self-similar structures and quite different from the geometry of Euclid where structures are built from perfect lines and surfaces. Natural objects appear to have a lot in common with self-similar structures and even if the similarity is not exact, they are convincingly modelled by them. A self-similar structure is one which has the same appearance at any level of magnification. Of course natural objects may only satisfy this definition over a limited range of dimensions but it often produces very convincing results. For example, the side branch of a fern when magnified looks like the main branch and small pebbles under magnification look like boulders. Nature is full of such structures. An additional bonus is that algorithms have been discovered which allow self-similar structures and landscapes to be generated from a relatively small amount of information. This relieves the programmer of carrying a colossal database from which to generate each separate detail of a complex scene.
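To give a feel for how little information such an algorithm needs, here is a sketch of midpoint displacement, one well-known way of generating a rough, roughly self-similar skyline. It is an illustrative choice in Python, not a routine from this book; the function name, the `roughness` parameter and the use of a seeded random generator are all assumptions made for the example.

```python
import random

def midpoint_displacement(left, right, levels, roughness=0.5, seed=1):
    """Build a jagged height profile between two end heights.

    Each pass inserts a randomly displaced midpoint between every pair of
    neighbours, then shrinks the displacement range by 'roughness', so the
    same kind of detail reappears at ever smaller scales.
    """
    rng = random.Random(seed)
    heights = [left, right]
    spread = 1.0
    for _ in range(levels):
        refined = []
        for a, b in zip(heights, heights[1:]):
            mid = (a + b) / 2.0 + rng.uniform(-spread, spread)
            refined += [a, mid]
        refined.append(heights[-1])
        heights = refined
        spread *= roughness
    return heights

# Two end heights, a roughness and a seed expand into 65 height samples
profile = midpoint_displacement(0.0, 0.0, levels=6)
```

The whole profile is reproducible from four numbers, so nothing beyond those needs to be stored.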

All of these steps are essential to give a convincing image. The fact that so much visual richness is required to make an image look real testifies to the very advanced pattern recognition capability of human beings.

When all this is done, what have we got? Just a very roundabout way of painting an apple? The difference is that once created in software the graphic entity has an independent existence. The picture on the screen is just the final stage. Even if not being currently displayed, it can evolve according to rules included in the program. There is not even the constraint to create objects which are modelled on real life. It is possible to invent new “lifeforms” inside the computer. In Computer Aided Design (CAD) this is what happens all the time. Machines are designed, built and tested inside the computer long before they exist as material objects. In simulators and games this aspect is pushed as far as possible. Computer games specialise in generating artificial realities; the more exotic the better.

Future developments in input-output devices will undoubtedly have a major impact on what is currently called computer graphics. At the moment the emphasis is on generating realistic images. But images are only computer output designed for human input through the eyes. What will it be called when all of the senses are involved? Already, with the aid of spectacles which give separate input to each eye and tactile stimulation on the hands, it is possible to enter totally into the world inside the computer. This is Virtual Reality or Cyberspace. What will it be like when the computer couples directly into the human nervous system without the need for an intermediate interface? Aside from the minor consideration of feeding the body, it will be possible to live out an entirely artificial existence inside the computer.

Computer graphics is the thin end of a very long wedge which started when computers first produced a visual output in response to human input. Where it will end is unknown, but along the way it is sure to be lots of fun.

1.2. What Can You Do With A 16-bit Micro?

The answer to this question is best illustrated by looking at what is achievable on a powerful commercial system, of which a good example is the Reyes system developed at Lucasfilm Ltd and currently in use at Pixar. This has been used to make a number of well known short film sequences including “The Adventures Of Andre and Wally B”, “Luxo Jr.”, “Red’s Dream” and the animated knight sequence from “Young Sherlock Holmes”. The Reyes system was set up to compute a full length feature film in about a year, incorporating graphics as visually rich as real life. Assuming a movie film lasts about 2 hours and the film runs at 24 frames per second, this means each frame must be computed (rendered) in approximately three minutes.
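The three-minute figure follows directly from the numbers quoted. A short check in Python, assuming a 365-day year of continuous computing:

```python
MINUTES_PER_YEAR = 365 * 24 * 60        # one year of continuous computing
FRAMES = 2 * 60 * 60 * 24               # 2 hours of film at 24 frames per second

minutes_per_frame = MINUTES_PER_YEAR / FRAMES
print(FRAMES, round(minutes_per_frame, 2))
```

That is 172,800 frames to be rendered in 525,600 minutes, or a little over three minutes each.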

The basic strategy in this system is to represent each object (geometric primitive) in a scene by a mesh of micropolygons which are subpixel-sized quadrilaterals with an area of 1/4 of a pixel (the smallest visible unit on the screen). All the shading and visibility calculations are done on these micropolygons. The overall picture is constructed like a movie set with only the visible parts actually being drawn. Micropolygons are deemed to be invisible if they lie outside a certain viewing angle or are too close or too far away. The final system includes subtleties such as motion blurring, the effect whereby objects in motion appear to be blurred at their trailing edges. This is one of the devices used to enhance the impression of motion and is another lesson learned from traditional cartoonists.

A very complex picture in this system typically uses slightly less than 7 million micropolygons to render a scene of resolution 1024x612 pixels. With 4 light sources and 15 channels of texture a picture takes about 8 hours of CPU time to compute on a CCI 6/32 computer which is 4-6 times faster than a VAX11/780. Frames from “Young Sherlock Holmes” were the same resolution and took an hour per frame to compute. In the final movie all the stored frames are played back as in a conventional film.

But it’s not necessary to go as far as this to produce high quality pictures. There are now “personal” graphics stations available at prices almost within the reach of mortals. The Personal Iris machines manufactured by Silicon Graphics are good examples. They offer 256 colours (8 planes) from a palette of 4096 and, using a hardware “geometry engine”, are able to perform transforms such as scaling, rotation, hidden-line removal and lighting, amongst others, to produce 3D motion in real-time. The CPU is a 20MHz R3000 RISC processor with a R3010 FPU (floating point unit). Here RISC technology has been used to maximise the speed, but it is interesting to note that before 1986 Silicon Graphics used the 68000 processor. It will not be long before machines such as these drop into the personal computer market.

What about a micro like the Amiga 500 with 512 kbytes of RAM and a CPU working at 7 MHz? The potential for detailed graphics is somewhat less, especially if frames are to run in real time, sufficiently fast to avoid intolerable flicker. But it is surprising how much can be achieved. For speed, building up solid objects using polygon meshes is most attractive since it only requires that the vertices be stored, and a large object can be described by a very small amount of information.

Moreover, since polygons are sets of vertices joined by straight lines, the most complex algebra involved will be that of simple geometry. This is the strategy we will use.

1.3. Assembled for Speed

There are many computer languages but assembly language gives the best opportunity of getting as close to the hardware as possible and tailoring to the application in hand. All the programs in this book are written in 68000 assembly language and except for “housekeeping chores” and specific hardware functions in the first chapter, do not use any of the routines in the Amiga operating system. The programs could therefore easily be rewritten to run on a processor other than the 68000 since the most difficult thing is the overall program structure. Language details are secondary.

Assembly language is very exacting and unforgiving with a masochistic charm all of its own. It really has very little grammatical structure beyond the syntax of the instructions themselves, and the main criteria for efficient programming are speed, economic use of registers and memory, and efficient parameter passing. Sometimes there is conflict between these, especially where there is no shortage of memory. Where speed is all important, programs often sacrifice brevity in order to avoid time-consuming subroutine calls.

The programs in this book have been assembled and run using the Devpac Amiga assembler from Hisoft but any other assembler will do providing changes are made to comply with its format. The simple but powerful INCLUDE directive allows files to be pulled together at assembly time without the need to define global variables. The INCLUDE directive can be nested to any depth that memory will allow so that each chapter can INCLUDE the programs from earlier ones. In this way there is hardly any duplication, and a program file, once entered, can be used later. The overall program therefore grows steadily in size as the book progresses and practically no programming effort is wasted. The final program INCLUDE’s all earlier parts. This is the only linking which needs to be done and it is painless.

Appendix 2 gives a brief description of assembler usage in general and the Devpac assembler in particular, including those commands which have been found to be most useful.

1.4. Writing for a 16-bit Micro

Writing programs in assembler for a 16-bit micro is quite different from writing for an 8-bit micro. Apart from the more powerful addressing modes available, there is a fundamental difference which centres on the ideas embodied in position dependent and independent code. The picture is somewhat confused by other similar sounding terms such as absolute and relocatable code. We shall discuss what these mean because they have a profound effect on how a program is written in assembler.

In an 8-bit micro usually only one program at a time is loaded in RAM and at a fixed location. Of course where an operating system oversees the running of programs, such as CP/M, things are more complicated. But in small micros with built in BASIC and very little else, the operating system reserves fixed space for its variables area and frees everything else for the current program. Knowing where the program resides in memory makes life simple for the programmer since fixed addresses can be assigned for variables and these will never change. A program which directly addresses fixed memory locations is said to be written in position dependent or absolute code.

Though such code can be written for computers with operating systems, there is another way of doing things which gives much greater flexibility, and allows several programs to reside in memory simultaneously. A consequence of this is that the actual position in memory of a particular program will not be known until run time. As a result, no actual numerical address can be referred to in the program since it is not fixed until the program is loaded and run.

There are several ways of overcoming this problem. One way is to use an addressing mode of the processor specifically designed to generate position independent code. This is called PC (program counter) relative addressing. What it does is locate an address not as an absolute value but relative to the value of the program counter where the reference is made. The assembled code will tell the processor to calculate the actual address by adding or subtracting a displacement to the current value of the program counter, which will always have a fixed value relative to the start of the program.
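The idea can be modelled in a few lines of Python. This is a toy model with invented names and offsets, not 68000 code. On the real processor the displacement is measured from the extension word that follows the opcode and is limited to a signed 16-bit value; both details are simplified away here.

```python
def assemble_displacement(ref_offset, target_offset):
    """Displacement stored in the instruction; fixed at assembly time."""
    return target_offset - ref_offset

def effective_address(load_base, ref_offset, displacement):
    """Address worked out at run time: current PC plus the displacement."""
    pc = load_base + ref_offset
    return pc + displacement

# Hypothetical offsets from the start of the program
REF, TARGET = 0x10, 0x80
disp = assemble_displacement(REF, TARGET)

# The same stored displacement finds the target wherever the program loads
for base in (0x20000, 0x54000):
    print(hex(effective_address(base, REF, disp)))
```

The assembled displacement never changes, yet the computed address moves with the program, which is exactly what position independent code requires.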

Another way is to calculate all addresses from a base address, or pointer, held in an address register. The program will then constantly refer to offsets from the address register but no actual value for the address need be specified when the program is being written. The register cannot, of course, be used for anything else while it is reserved in this way. The special register will have to be set up at the start of the program with the correct pointer. A good pointer is the address of the end of the program.

Another way is to allow the assembler to take care of everything and generate relocatable code. This is code where no reference to specific addresses is made, but instead labels are used. The label name is chosen to be informative and of assistance to the programmer. For example, COLOUR might be the label for the long word address where the byte length value of the current colour of a polygon is held. The assembler will mark such a label as relocatable and its address will finally be fixed by the computer operating system when the program is loaded.
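What the loader does with those marks can be pictured with a toy model in Python. The three-entry program image, the relocation list and the `load` function are all hypothetical and greatly simplified; a real executable file format carries more information than this.

```python
def load(image, reloc_offsets, base):
    """Copy the image and add the load address to every marked entry."""
    loaded = list(image)
    for offset in reloc_offsets:
        loaded[offset] += base
    return loaded

# Entry 1 holds the address of COLOUR as an offset from the program start.
# The strings stand in for opcodes, which the loader never alters.
IMAGE = ["MOVE.B COLOUR,D0 (opcode)", 0x0008, "RTS (opcode)"]
RELOCS = [1]          # positions the assembler marked as relocatable

in_memory = load(IMAGE, RELOCS, 0x20000)
print(hex(in_memory[1]))
```

The programmer only ever writes the label; the assembler records where it was used and the loader patches in the real address once the program's position is known.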

All of the programs in this book use relocatable code generated by the Devpac assembler. It is simple to write.

The instruction set of the 68000 is long and complex. To fully appreciate its power and elegance the reader should refer to the Motorola 16-Bit Microprocessor User’s Manual. A brief listing is given in Appendix 1.

1.5. The Programs

The programs in this book have been written using the Devpac assembler and are ready to run. Once a program has been entered all that is necessary is to assemble it from within the editor and it will run as described. The program files all have the extension .s since they are source files. If a program is to run independently it can be assembled to disc with the file extension .prg. In fact for reasons of space, it is likely that all the programs beyond Chapter 6 will have to be assembled to disc.

The programs have all been run extensively to ensure they are as bug free as possible, and the listings have been obtained from within the assembler Editor using the PRINT BLOCK facility to ensure that there are no further stages of transcription during which errors might creep in. However as with all human endeavours, there can be no guarantee that the programs are completely bug free.

The programs are undoubtedly neither the fastest nor most elegant examples of their kind in existence but, in a tutorial of this kind where the emphasis is on teaching, the main point is to understand how things are done. The astute reader will quickly discover clever ways of improving them. In any case the best commercial programs are proprietary and kept secret from us.

1.6. The Amiga Operating System

The Amiga operating system is large and complex and operates at many levels. There are often many ways of doing the same thing depending on the level of entry.

Using the device independent routines ensures that programs are portable, i.e. they are shielded from hardware details and in principle work on any machine with the same operating system. The penalty is one of speed. Generally the closer you get to the hardware, the faster things run.

Apart from this all the programs are “original” (if there is such a thing in programming) and tailored closely to the graphics applications.

2. Modelling a 3D World

One of the most fascinating aspects of computers is the way they can be used to build life-like models. The great attraction of realistic computer games and, at the more serious end, simulators stems from the way the computer screen can be made to look like a window onto an invented universe - a Virtual Reality. Some famous scientists, impressed with the similarity to the process of creation, have even gone so far as to consider theories of reality based on a real Universe built up from ‘bits’ of information. Whatever the fundamental significance of it all, the fact remains that computers offer a new dimension for human expression and experience. Simply put, they provide the possibility to create alternative realities where the laws of Nature may or may not apply. All sorts of strange and exotic situations can be invented and investigated. For human beings, who relate most easily to objects and situations met in everyday life (and dreams), what appears on the computer screen should look familiar. Great effort has gone into constructing models of this kind. In a simulator which is supposed to accurately depict reality, the emphasis is on models which obey the laws of Nature precisely.

In this chapter we will look at a way of modelling which provides a very fast and reasonably accurate picture of real objects. For the most part, but not completely, this involves polyhedral structures with polygonal faces as the building blocks, the so-called ‘vector’ graphics. Spheres and other objects with a high degree of symmetry can also be drawn quickly. Actually, to set the record straight, vector graphics originally meant something else. It was a name given to a mode of display where points on the monitor were joined directly by an electron beam that could be switched quickly from one part of the screen to another. This did not require much memory devoted to the screen and gave very fast ‘wire-frame’ pictures. The displays on monitors today do not use this technique. Instead, the image is built up from horizontal raster scans from one side of the image to the other. It is called raster scan (or scan conversion) graphics. The speed with which an outline can be filled by raster scans makes it a very useful technique. However the name vector graphics has become commonly used to describe the graphics modelling technique itself, not the display technology. The adjective “vector” here really refers to the extensive use made of vector geometry in the programs.

One other important technique is the Block Image Transfer type of graphics, in which SPRITES play an important role. The Amiga has a piece of hardware on board, the BLITTER (BLock Image TransfER), which handles such operations very quickly. In such graphics, blocks of memory are manipulated as a whole, which is very useful since, once laid out in RAM, scan conversion need not be done a second time. The block of bytes is simply moved to the screen area. Some very clever and fast things can be done this way, particularly with sprites, but the relationships between the parts of the image are essentially determined by how the block is initially laid out in RAM. Sprite graphics is not discussed any further in this book.

Having said that, it is likely that the next generation of popular computers will have hardware implementation of all the common graphics functions including the ‘vector’ graphics we are about to discuss. It is very probable that soon all graphics functions will be done by very fast hardware ‘geometry engines’.

2.1. 3-D Modelling

“Real-time” 3D modelling has to be very fast. This is because humans can spot the flicker of the picture if it changes more slowly than about once every 50 milliseconds. In order to work in real time, the viewer has to be able to enter new data through the keyboard, joystick or mouse and see its effects immediately. The solid 3D structures which can be transformed and drawn on this time scale most easily are polyhedra.

Polyhedra make good graphics building blocks or ‘primitives’ for several reasons:

  • they are completely defined by their vertices,

  • the faces are polygons with straight edges,

  • in any transformation only the vertices need to be recalculated,

  • a transformed polygon is also a polygon,

  • polygons can quickly be filled in to look ‘solid’ using raster scans.

What all this means really is that it’s very hard to draw and shade in curved surfaces which don’t have high symmetry (like circles) and the only 3D objects without curved surfaces are polyhedra.

In fact computer graphics does not have a monopoly on the use of polyhedra as basic building blocks. The real world uses them extensively; many houses are made from bricks, which are six-sided polyhedra.

2.2. Transformations and Frames of Reference

All of the above statements concerning polyhedra can be translated into a definite mathematical framework called vector algebra, which is a very elegant and precise formulation of the mathematics of lines and planes. It becomes even more useful when presented in matrix form and it is this approach which usually appears in text books on computer graphics. For someone with little knowledge of advanced mathematics this looks very intimidating. Actually it’s not. Many secondary school syllabuses handle simple rotations using 2x2 matrices, and it really isn’t much more complicated than that. For those of us who do not wish to blaze new trails in the world of mathematics it is simply a case of understanding the general method and taking the results on trust. After all, once you have seen the transforms working you can use them in your programs and forget about them. There’s no need to re-invent the wheel.

For the moment though, in order to see the problem laid out in its entirety, let’s consider all the various stages of transforms, as shown in Figure 2.1. The distinction between the view frame and the world frame, and transformations between them, is discussed in further detail in Appendix 6.

2.2.1. The Object Frame

An object which exists inside the computer has quite a complicated life before it is seen on the screen. Most of this complication arises from the various transforms required to make it ‘lifelike’. But whatever they are (rotations, translation or even something more exotic), the object must preserve its original identity, i.e. its relative dimensions. This matters because no calculation can be absolutely precise: with the picture being recalculated faster than 20 times each second, if each new position were computed from the previous one rather than from the original definition, it would not be long before accumulated errors made the object unrecognisable (this problem crops up in all our calculations which, for speed, are done in only limited accuracy). Therefore it is necessary to constantly refer back to the original data which define the object. We call this place, in which the object is defined, the object frame (there is nothing sacrosanct about this name, other people have invented other names). Of course it doesn’t ‘exist’ in any real sense, it’s just that the numbers which fix the positions of the vertices are coordinates measured from some origin. This origin is where the object frame is said to be located. The object frame can be positioned so as to reflect the symmetry of the object. For example, the natural object frame of a cube could be a Cartesian (x,y,z) coordinate system centred at the centre of symmetry (centre of gravity) of the cube, with the sides of the cube parallel to the x, y and z axes of the coordinate system as shown in Figure 2.1.

figure 02 01
Figure 2.1 Frames of reference

There may be several object frames combined together, particularly when a complex object is made up of several simpler objects. The process of sticking together simple objects (primitives) to make a complex one involves just the kind of transforms we have been talking about. These transforms are sometimes referred to as instance transforms.

2.2.2. The World Frame

Having constructed a complex object — which can be thought of as an ‘actor’ in the scenario we are about to create — it is necessary to place it in the arena with all other ‘actors’. This common space, inhabited by all objects is called the world frame. It is the place where the Laws of Nature play a role. For example, objects which are not subject to any force either remain at rest or move at constant velocity. That’s Newton’s First Law. Since this world is our creation, we do not have to stick to these laws, if we wish. This is the place where collisions are tested for. We will call the transform which moves the object into its final position in the world frame the object-to-world transform. It will consist of some combination of rotation and translation.

2.2.3. The View Frame

Everyone in the real world has a different view of it, and the same thing applies to the world we are creating inside the computer. The only difference is that there is only one screen and therefore only one viewer. The view of the world depends on where the observer is standing and looking.

The view of the world seen by the observer is most easily represented by the view frame. This is a set of x, y, and z-axes which follow the gaze of the observer. Usually the z-axis points forward and in our convention the x-axis points vertically up. In this picture, an object which is straight ahead at a distance of 100 will have the coordinates (0,0,100) in the view frame and if the observer rotates to the left by 90 degrees it will have view frame coordinates (0,100,0). In general the view frame’s position in the world frame will be changing continuously. In a flight simulator, for example, the view frame is the view from the cockpit.

It might appear at first sight that there is an unnecessary duplication of points of view in all these frames of reference. However they define a natural hierarchy within which the overall picture can be constructed to make it easy to take account of the relative motions of the observer and graphics primitives (objects).

One thing in particular is worth noting. Rotating the view frame to the left or moving the scene to the right results in the same relative motion and gives the same picture on the screen. This suggests that there is a simple connection between the two motions. In the language of mathematics, one is said to be the inverse of the other. We will return to this when we look at the rotations in detail, and the point is examined further in Appendix 6.
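The numbers quoted for the view frame can be checked with a few lines of Python. This is only a sketch: the function name `rotate_about_x` is ours, and it assumes the convention described above (right-handed axes, x up, y to the right, z forward). Turning the observer to the left is a positive rotation of the view frame about the vertical x-axis, so the coordinates of a fixed object are found by applying the inverse (negative) rotation.

```python
import math

def rotate_about_x(point, degrees):
    """Rotate a point about the x-axis (right-hand rule, x pointing up)."""
    x, y, z = point
    a = math.radians(degrees)
    c, s = math.cos(a), math.sin(a)
    return (x, y * c - z * s, y * s + z * c)

# An object straight ahead at a distance of 100.
ahead = (0, 0, 100)

# The observer turns left by 90 degrees, so the scene is rotated by -90.
seen = rotate_about_x(ahead, -90)
print([round(v) for v in seen])
```

Applying `rotate_about_x(seen, 90)` brings the point back to where it started, which is what it means for one motion to be the inverse of the other.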

2.2.4. The Screen

This is the logical screen, the block of RAM on which pictures are drawn before being displayed. It is mapped out following the way RAM is allocated to the screen, which in turn depends on the screen resolution, as described in Chapter 3. This results in the origin (the point with screen coordinates (0,0)) being right at the top left hand corner of the screen. To get from the view frame to the screen we must make a ‘projection’ onto a plane, called the view plane, of the objects which we wish to display. This is called a perspective transform and must preserve the ordering in space, so that objects which are farther away look smaller. It is done by tracing “rays” from objects to the view point, which is the location of the observer’s eye. The intersection of these rays with the view plane defines the outlines as they will appear on the screen.
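The ray tracing described here reduces to similar triangles. The following Python sketch is an illustration only, not the routine used later in the book: it assumes the eye sits at the origin of the view frame, that the view plane lies at a hypothetical distance `d` along the z-axis, and that the centre of a 320x200 screen is at (160, 100). The function name and default values are ours.

```python
def project_to_screen(point, d=200, centre=(160, 100)):
    """Project a view-frame point (x up, y right, z forward) to the screen.

    Returns (column, row) with the origin at the top left-hand corner,
    or None if the point is not in front of the observer.
    """
    x, y, z = point
    if z <= 0:
        return None
    # Similar triangles: the ray from the point to the eye crosses the
    # view plane at a fraction d/z of the way out from the eye.
    column = centre[0] + d * y / z
    row = centre[1] - d * x / z      # screen rows increase downwards
    return (round(column), round(row))

print(project_to_screen((50, 50, 200)))   # nearer
print(project_to_screen((50, 50, 400)))   # same offset, farther away
```

The second point has the same x and y as the first but twice the z, so it lands closer to the centre of the screen: objects which are farther away look smaller.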

The transform to the screen coordinate system is almost the last stage, but not quite; the screen has limits. It may turn out that parts of the picture lie outside the screen RAM, that part of memory allocated to the screen. If no attempt is made to restrict points to appear on the visible screen then the program will attempt to plot them outside screen RAM, which could lead to a system crash. For this reason, unless it is absolutely certain that no point to be displayed will ever lie outside the screen RAM, only part of what is visible on the view plane will reach the screen. This is “windowing”. What is not visible must be “clipped” away. The outline which defines the window on the display is called a view port. To express clearly the effort that has gone into producing the final image, this is sometimes also called the clip frame.

There is even a need to clip in three dimensions in the view frame itself. Objects which are a long way away from the observer should not be displayed, and no time should be wasted worrying about them. It is a consequence of having a finite drawing resolution on the screen that small objects become badly distorted. Ultimately all very distant objects will end up as single pixels and the horizon could have a cluster of dots all over it. Sets of parallel lines will ultimately converge to a single line which will then never diminish in intensity. To stop all of this it makes sense to clip out altogether objects which are more than a certain distance from the origin of the view frame.
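The two tests described in this section can be sketched in Python as follows. This is a deliberately minimal illustration: it tests single points only, whereas a real clipping routine has to trim polygons to the window, and the cut-off distance `MAX_DISTANCE` is a hypothetical value chosen for the example.

```python
WIDTH, HEIGHT = 320, 200     # the low resolution screen
MAX_DISTANCE = 5000          # hypothetical cut-off, in view-frame units

def on_screen(column, row):
    """True if the pixel lies inside screen RAM and is safe to plot."""
    return 0 <= column < WIDTH and 0 <= row < HEIGHT

def near_enough(point):
    """True if a view-frame point is within the cut-off distance."""
    x, y, z = point
    # Compare squared distances to avoid taking a square root.
    return x * x + y * y + z * z <= MAX_DISTANCE * MAX_DISTANCE
```

An object which fails `near_enough` is dropped before any further work is done on it; a pixel which fails `on_screen` is never written.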

2.3. Coordinate Systems

When we try to put all of these transforms on a mathematical basis we immediately run into a sticky and irritating problem — how to define the coordinate systems. It is standard in engineering, science and most of mathematics to work in right-handed Cartesian coordinates. A right-handed and a left-handed Cartesian coordinate system are both shown in Figure 2.2. In keeping with this convention we will also always use a right-handed Cartesian coordinate system. However, be warned, this is not standard in the world of computer graphics. Left-handed systems abound and sometimes both conventions are used at the same time!

There is another frequently used convention within computer graphics which, if we are to stick with it, forces the orientation of the axes in the view frame. It is that the positive z axis points forward into the picture, along the direction in which the observer is looking.

Putting all this together, we have chosen to end up with the various coordinate systems shown in Figure 2.1. Positive x is up and in the world frame the y-z plane defines ground level.

Coordinate systems and frames of reference are also discussed in Appendix 6.

2.4. Vectors and Matrices

For someone who loves computing but not mathematics, the introduction of matrices and vectors is not very welcome. Although it is possible to do all of the required mathematics by straightforward algebra, vectors and matrices establish an elegant and consistent framework within which to work. In addition there are properties of matrices which make them especially useful. An example is when a series of transforms take place in succession, such as when a rotation of an object about the x-axis is followed by a rotation about the y-axis. Instead of calculating the coordinates of the object twice, after each rotation, it is possible to concatenate (multiply together) the two transformation matrices and then perform the combined transform once only. This can save a lot of time when there are many points to transform.
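The saving from concatenation can be seen in a short Python sketch. The helper names (`mat_mul`, `mat_vec`, `rot_x`, `rot_y`) and the angles are ours, chosen only for illustration. Column vectors are assumed, so a rotation about x followed by a rotation about y is written as the product `rot_y * rot_x`.

```python
import math

def mat_mul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def mat_vec(m, v):
    """Product of a 3x3 matrix and a column vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

def rot_x(degrees):
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_y(degrees):
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]

# Concatenate once: rotate about x by 25 degrees, then about y by 40.
combined = mat_mul(rot_y(40), rot_x(25))

p = [10.0, 20.0, 30.0]
two_step = mat_vec(rot_y(40), mat_vec(rot_x(25), p))
one_step = mat_vec(combined, p)
print(two_step)
print(one_step)
```

The two results agree to within rounding error. For a single point nothing is gained, but for an object with many vertices the combined matrix is calculated once and each vertex then needs one matrix multiplication instead of two. Note that the order matters: `mat_mul(rot_x(25), rot_y(40))` is a different transform.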

We will discuss the various types of transforms in detail as they come up. Appendix 5 also explains matrices and vectors.

figure 02 02
Figure 2.2 Right-handed and left-handed coordinate systems

2.4.1. Vectors

Vectors are a mathematical shorthand notation which tell you how far to go in a given direction. Vectors go together with matrices. Here again there are two conventions concerning vectors. Vectors can be row vectors or column vectors. This doesn’t mean very much, except that it changes how a vector looks when it is written down and the arrangement of elements inside the transformation matrices. In the teaching of engineering and science it is more usual to write vectors in column form and we will adhere to this convention exclusively throughout the book.

2.5. Data Structures

2.5.1. Variables and Labels

One of the most difficult things to get used to when first using assembly language is that there are no algebraic variables, just data stored in registers and at memory locations. You can’t add x to y but you can add the contents of register d0 to the contents of register d1. In a 16-bit system such as the Amiga even memory locations become hard to locate because they are not always known when the program is written (except for addresses of registers used by the Operating System, which are fixed). This is in contrast to simpler 8-bit micros where PEEKing and POKEing allows access to anywhere in RAM at addresses which will be fixed and always available to the program. The problem with a micro with an advanced operating system, like the Amiga, is that until a program is actually loaded in the machine and ready to run, its exact location will not be known. There is a way of forcing the Operating System to load the program at a particular memory location by the use of absolute code (set by the assembler directive ORG) but that builds inflexibility into the program and may lead to clashes with other software. That may not be a problem with a game which will tie up the computer all to itself, though it may fall victim to later modifications in the operating system.

The general philosophy is to produce programs which are insulated from all of this and come as complete self-contained packages which can be located and run anywhere in RAM. At first sight there appear to be insurmountable problems with this approach: how can you set up a table of data and later find it and how can you set up a table of addresses (jump vectors) of subroutines to execute depending on the outcome of a test? There are various solutions to these problems, some of which utilise particular addressing modes of the processor and others of which rely on the assembler, as we have already mentioned in the discussion of position independent and relocatable code in Chapter 1. The problem is solved by the extensive use of labels which are temporary substitutes for addresses which will be calculated later.

Labels play a very prominent part in any assembler program. The way they appear in the code makes them look like algebraic variables but they are not. A label is a pointer to a memory location where the current value of a variable is held, or it is a pointer to another part of the program. This is where much of the difficulty arises.

2.5.2. Lists

Finding ways of efficiently storing and accessing data has been the subject of intense study in computing. In computer graphics it is very important, particularly where speed matters. The important thing is to store data in a form such that it is easy to get at for the problem in hand. It may not always be in the best form for all applications all the time, and some manipulation may be required along the way.

In vector graphics where primitives are modelled by polyhedral structures with polygonal faces, what is most important are lists of vertices (corners) and the straight line edges joining them. Figure 2.3 illustrates a house modelled in this way. There is more than one way of setting up a data list to describe this structure, but the one we will most commonly use has at its centre the list of connections which describe the surfaces uniquely: the edge list. One thing to avoid is having to repeat the actual coordinates of the vertices more than once. It is better to give each vertex a number and instead refer to this. When the x, y and z-coordinates of a vertex are required they can be drawn from the list of coordinates by the powerful indexed addressing modes of the 68000, providing the position in the list is simply related to the vertex number. To make this point clear, here are the lists which are needed to draw the house. There will be other lists as well, containing other attributes such as the colour of each surface and so on, but they are not shown here. The house is not very complicated, but sufficiently so to show how long the lists might become for a really complex object.

First the number of polygons in the house as a whole must be specified. Each plane face qualifies: four walls, two sloping roofs, one floor, one door, so we have:

  surface number: 8

There is only one entry here but if there were other buildings it would be a list. Then the number of edges in each surface is given, where the entry has the same position as the number (circled) of the surface as shown in the figure:

edge numbers: 5, 4, 5, 4, 4, 4, 4, 4

After this the ordered list of vertex numbers going clockwise round the exterior face makes up the edge list. To make the data most useful to the program, the first vertex for each surface is again repeated at the end of its group to make a closed loop.

edge list: 7,8,9,2,1,7,1,2,3,4,1,4,3,10,5,6,4,6,5,8,7,6,5,10,9,8,5,2,9,10,3,2,1,4,6,7,1,11,12,13,14,11

Finally the actual coordinates, in whatever scale is being used, are given for x, y and z in the order of vertex numbers:

x coordinates: 0,100,100,0,100,0,0,100,150,150,0,50,50,0

y coordinates: 50,50,50,50,-50,-50,-50,-50,0,0,50,50,50,50

z coordinates: -100,-100,100,100,100,100,-100,-100,-100,100,-10,-10,10,10

These data would be used to define the house in the object frame. Following the transformation to the world frame some of the lists, the edge list, the edge numbers and the surface number would all be unchanged but the coordinates in the world frame would be different.
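To show how the lists fit together, here is a Python sketch which walks the edge list using the edge numbers and looks up coordinates by vertex number. The data are copied from the lists above; the function names and the choice of Python are ours, purely for illustration, since the programs in this book do the equivalent work in assembly language.

```python
edge_numbers = [5, 4, 5, 4, 4, 4, 4, 4]

edge_list = [7, 8, 9, 2, 1, 7,   1, 2, 3, 4, 1,   4, 3, 10, 5, 6, 4,
             6, 5, 8, 7, 6,   5, 10, 9, 8, 5,   2, 9, 10, 3, 2,
             1, 4, 6, 7, 1,   11, 12, 13, 14, 11]

xs = [0, 100, 100, 0, 100, 0, 0, 100, 150, 150, 0, 50, 50, 0]
ys = [50, 50, 50, 50, -50, -50, -50, -50, 0, 0, 50, 50, 50, 50]
zs = [-100, -100, 100, 100, 100, 100, -100, -100, -100, 100, -10, -10, 10, 10]

def vertex(number):
    """Coordinates of a vertex; vertex numbers start at 1."""
    i = number - 1
    return (xs[i], ys[i], zs[i])

def surfaces():
    """Split the edge list into one closed loop of vertex numbers per surface."""
    loops = []
    position = 0
    for edges in edge_numbers:
        # A surface with n edges has n + 1 entries, because the first
        # vertex is repeated at the end to close the loop.
        loops.append(edge_list[position:position + edges + 1])
        position += edges + 1
    return loops

for number, loop in enumerate(surfaces(), start=1):
    print(number, [vertex(v) for v in loop])
```

Each coordinate appears only once in the data, however many surfaces share the vertex, and a transform need only touch the three coordinate lists.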

figure 02 03
Figure 2.3 A house modelled as a polygon mesh

2.6. Summary

What should be one’s attitude towards these very mathematical aspects of 3D graphics? If you are mathematically inclined, then it makes sense to try to understand what’s going on in detail. This gives you the power to write your own transforms and explore some of the very interesting effects that can be produced. If you are not mathematically inclined then just regard the mathematical transforms as software “black boxes” to be “plugged in” as required. The transforms in this book are structured to allow you to do this. You only have to understand how to present data to them.

3. Drawing on the Screen

In this chapter we look at how the Amiga screen is addressed. This is detail which is highly specific to the Amiga but of great importance for fast graphics since our intention is to draw 3D solid objects in real time. A very important aspect of this will be filling in polygonal shapes quickly.

No matter how complex graphics programs are, ultimately their output must appear on the screen. For the new programmer there are a number of confusing terms associated with producing visible output: playfields, bit planes, screens. These concepts arise because of the power and flexibility built into the Amiga.

Simply put, the Amiga has been designed with a powerful set of tools to implement 2-dimensional sprite graphics (the graphics of Pacman, icons and many popular games of the scrolling variety), and this is reflected in the graphics terms. That is not to say that 3-dimensional graphics is difficult; on the contrary it is very well catered for but, as we will see, for 3-D only a small part of the graphics arsenal included in the Amiga need be used.

First of all remember that the picture that ends up on the monitor screen is a direct “map” of the contents of RAM. The word map, as it is used here, is a bit of mathematician’s jargon. It means that what is in RAM entirely defines what is on the monitor screen, though it will not look at all like it. RAM is simply a 1-D series of contiguous (all in a line) bytes, which the hardware converts to the 2-D picture. To make life easy, we will refer to that part of memory dedicated to displaying pictures by a special name, Video RAM, a term commonly used in computer systems. In the Amiga, Video RAM must lie in that section of memory which can be accessed by all the custom hardware (called Agnus, Paula and Denise) and which is called Chip RAM. The prime requirements of Video RAM are that it must be laid out so as to allow easy drawing and to hold colour information. Understanding how colour is included is the key to understanding the layout of Video RAM.

There is an additional and important complication in the graphics we will be doing. It is called double buffering or screen buffering and is essential for flicker-free pictures. What it amounts to is having two chunks of Video RAM available: one to draw on and the other to display. In the Atari ST these are given the helpful names logical and physical screen, respectively. They are switched back and forth so that whilst a picture is being drawn on one screen (logical) the other (physical), which holds the last complete picture, is displayed on the monitor. Then when the new picture is completed it is put on display and becomes the physical screen. The old physical screen then becomes the new logical screen and is erased ready for drawing the next picture. It helps to think of each new picture as a frame, in the movie sense, so that the real-time graphics evolves like an interactive movie.

The programmer arranges that the switch from one screen to the other is naturally synchronised to the program; the program doesn’t ask for the switch until the new frame is complete and the hardware doesn’t change the display until the raster on the screen has reached the bottom right-hand corner and is ready to fly back to the top. The short time for this to occur — called the vertical blank — is more than sufficient for the hardware to switch the screens.
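The bookkeeping behind screen buffering is simple, as the following Python sketch suggests. It is a model only: the class name `DoubleBuffer` is ours, the two screens are ordinary byte arrays rather than Chip RAM, and the wait for the vertical blank is represented by a comment.

```python
PLANE_BYTES = 8000    # one 320 x 200 bit plane
PLANES = 5

class DoubleBuffer:
    def __init__(self):
        size = PLANE_BYTES * PLANES
        self.screens = [bytearray(size), bytearray(size)]
        self.logical = 0     # index of the screen being drawn on
        self.physical = 1    # index of the screen on display

    def next_frame(self, draw):
        """Erase the logical screen, draw on it, then swap the screens."""
        buffer = self.screens[self.logical]
        buffer[:] = bytes(len(buffer))      # erase the old picture
        draw(buffer)                        # build the new frame
        # On the real machine the swap happens at the vertical blank.
        self.logical, self.physical = self.physical, self.logical
        return self.screens[self.physical]  # the frame now on display
```

Only the two indices change at the swap; no picture data is copied, which is why the switch fits comfortably into the vertical blank.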

3.1. The Screen

To understand the problem, think of the differences between the actual monitor screen and the block of Video RAM holding the image. The monitor screen is a rectangular end of a cathode ray tube on which an electron beam writes. To make this look like a picture the beam moves very quickly from left to right and top to bottom in a series of ‘raster’ scans; the picture is made up of closely spaced horizontal lines each made up of units called pixels. There isn’t really a solid picture at all, it just looks that way from a distance. To see this for yourself, inspect the monitor screen closely with a magnifying glass.

Memory, on the other hand, is laid out as a contiguous line of bytes, which are the smallest elements the microprocessor can directly address. Of these, the smallest resolvable unit is the bit (8 bits = 1 byte). Somehow each bit in memory must directly relate to the smallest ‘spot’ or pixel on the screen. The great flexibility of the Amiga allows many variations depending on the number of lines and size of a pixel. Taken together these two constitute the screen resolution: the more lines and the smaller the pixel, the greater the resolution. We will examine only the case illustrated in the programs in this book: the so-called low resolution with 32 colours and pixel matrix of 320x200 (320 across and 200 up). This mode allows us to produce fast real-time pictures with a good colour range. Other modes may be higher resolution and more colourful but are too slow. The interested reader is referred to the Amiga Hardware Reference Manual for details of other modes.

3.1.1. Playfields and Sprites

At the heart of the Amiga graphics system is the playfield. The playfield is really nothing more than our logical screen. Playfields go together with sprites to make up the 2-D orientated system which is so powerful for Pacman-like images and icons. Sprites are pre-drawn images, 16 bits (one word) wide and any number of lines high, which can be copied onto the playfield very fast. By first preparing in memory several sprites showing different orientations of an object, and then copying them onto the playfield in succession, it is possible to simulate 3-D type motion. As far as we are concerned this is a cheat. The graphic objects, or primitives, we will create will have an independent existence of their own and only at the instant of drawing will they be converted to a pattern of bytes. Since we will never use sprites they are of no further interest to us.

So the place in chip RAM where the picture is to be drawn is called the playfield. As a consequence of the flexibility built into the Amiga system, the playfield can be too big (or too small) for the monitor screen. This does not lead to a catastrophe, it simply means, if it is too big, that only part of the playfield will be visible at any one time. This is an excellent arrangement for games and displays where the background can be made to scroll across the screen. As a further complication you don’t even have to fill the whole screen. The picture can be “windowed” down to the desired size at the time of display. Once again, we will not utilise these variations. In our case the playfield will exactly fill the monitor and the window will include the whole playfield. This will not in any way limit our graphics.

Having decided on the screen resolution, size and number of colours, we can now calculate the playfield size in RAM. This leads naturally to the idea of bit-planes.

3.1.2. Bit-Planes and Colour

A playfield holds the picture which will be displayed on the monitor and, in our case of low resolution, must hold sufficient information to display 200 lines vertically, 320 pixels horizontally and show 32 colours. Colour makes the big complication and is the key to understanding why the playfield looks the way it does.

To get a feel for what is going on let’s first consider the simplest possibility, that of only 2 colours. In this case each pixel can be either on or off. The simplest playfield we could construct to do this is where the first 40 bytes (40 x 8 pixels) represent the first raster scan line, the second 40 bytes the second scan line and so on. In this way each bit in memory represents a pixel; if it is set to 1 the pixel is on and if it’s set to 0 the pixel is off. To fill the entire screen the playfield would have to be 200 x 40 = 8000 bytes in size. That is in fact how it is done in this case. The array of 8000 bytes is called a bit-plane, for that is exactly what it is. In this case the playfield contains a single bit plane. Moving up to 4 colours doesn’t present a problem in this scheme but now 2 bit planes are needed. There is still a one-to-one connection between each pixel on the screen and a particular bit in each bit plane, but whether the bit is set 1 or 0 fixes the colour. For example, the second pixel in the top row on the screen has coordinates x = 1, y = 0 (it’s a peculiar feature of computer displays that x = 0, y = 0 (the origin) starts at the top left-hand corner of the screen). This pixel corresponds to the second bit in both the first and second bit planes, which are just two 8000 byte-sized blocks of RAM. Now the way in which colour can be encoded becomes clear. If the bit in both planes is 0 then the pixel has colour 0, if the bit in plane 2 is 0 but the bit in plane 1 is 1 the pixel has colour 1, and so on as shown below:

  bit value in plane 1    bit value in plane 2    colour
          0                       0                 0
          1                       0                 1
          0                       1                 2
          1                       1                 3

It is therefore possible to have 4 different colours for the second pixel depending on how the second bit in each bit plane is set. Clearly there is a pattern to this: 1 bit plane gives 2 colours, 2 bit planes give 4 colours, 3 bit planes give 8 colours, 4 bit planes give 16 colours and 5 bit planes give 32 colours. Expressed mathematically, the formula is:

  number of colours = 2 ** (number of bit planes).

In the Amiga the bit planes are separate 8000 byte blocks of RAM. In the Atari ST there are 4 bit planes but they are interwoven so that it is not so easy to see what a given pixel is doing.

For our purposes, we have 5 bit planes and therefore 32 colours at our disposal. Figure 3.1 shows the 5 bit planes set to fill pixel (x=2,y=3) with colour 13.
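The arithmetic of the bit planes can be tried out in Python. This is a sketch for illustration, with names of our own choosing. It assumes that within each byte the leftmost pixel is the most significant bit, which is how the Amiga hardware reads a bit plane, and it follows the table above in treating bit plane 1 as the least significant bit of the colour number.

```python
WIDTH, HEIGHT, PLANES = 320, 200, 5
BYTES_PER_LINE = WIDTH // 8                 # 40 bytes per scan line
PLANE_BYTES = BYTES_PER_LINE * HEIGHT       # 8000 bytes per bit plane

def make_playfield():
    """Five separate bit planes, all cleared to colour 0."""
    return [bytearray(PLANE_BYTES) for _ in range(PLANES)]

def locate(x, y):
    """Byte offset and bit mask of pixel (x, y) within any one plane."""
    return y * BYTES_PER_LINE + x // 8, 0x80 >> (x % 8)

def set_pixel(planes, x, y, colour):
    offset, mask = locate(x, y)
    for n, plane in enumerate(planes):
        if (colour >> n) & 1:
            plane[offset] |= mask           # this plane contributes a 1
        else:
            plane[offset] &= ~mask & 0xFF   # this plane contributes a 0

def get_pixel(planes, x, y):
    offset, mask = locate(x, y)
    return sum(1 << n for n, plane in enumerate(planes) if plane[offset] & mask)

planes = make_playfield()
set_pixel(planes, 2, 3, 13)
print(get_pixel(planes, 2, 3))
print(PLANE_BYTES * PLANES, 2 ** PLANES)
```

Colour 13 is 01101 in binary, so the bit for this pixel is set in planes 1, 3 and 4 and clear in planes 2 and 5. The whole playfield occupies 5 x 8000 = 40000 bytes.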

One last question remains. The Amiga 500 has 4096 colours at its disposal; how is this vast range related to the 32 we can display with the 5 bit planes? The answer is that at any one time we can display only 32 out of the total range of 4096. This list of the colours we have selected to currently display (colours 0 to 31) is contained in a special series of registers which, taken together, are called the Colour Table or Colour Palette. It is the programmer’s responsibility to write the values of the colours selected into the table at the start and, if they are changed at any time, to rewrite the table. The codes for the standard palette, which is how the Amiga is set up when it is turned on, are listed in the file data_00.s. The custom hardware takes care of the practical details of converting the 32 numbers in the colour table to colours on the screen. Of course the programmer has to know what number makes which colour in order to write the table; we will discuss this point later in more detail when examining the example program and again in Chapter 7.
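As a preview, the figure of 4096 comes from 16 levels each of red, green and blue (16 x 16 x 16 = 4096). The Python sketch below packs the three levels into a single number, assuming the usual Amiga arrangement of four bits per component with red in the highest position; the function name is ours and the details should be checked against the later discussion and the Amiga Hardware Reference Manual.

```python
def colour_word(red, green, blue):
    """Pack three levels, each 0 to 15, into one 12-bit colour value."""
    for level in (red, green, blue):
        if not 0 <= level <= 15:
            raise ValueError("each level must be between 0 and 15")
    return (red << 8) | (green << 4) | blue

print(hex(colour_word(15, 15, 15)))   # white
print(hex(colour_word(15, 0, 0)))     # full red
```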

figure 03 01
Figure 3.1 The five bit planes for 32 colours

3.1.3. Copper Lists

The Copper is a friendly name for the graphics coprocessor which handles nearly all of the graphics system. It works independently of the 68000 processor leaving it free to get on with the program execution. Again it is not our purpose here to savour all the wonderful and exotic functions of the Copper, such as setting up a screen with horizontal slices of different resolution and colour depth. We will discuss only those aspects of the Copper’s function which are used in the example programs.

There is one function for which the Copper is essential for our purposes: mapping the playfield onto the monitor screen. In our case with double-buffering (one screen to draw on and another to display) there is the additional chore of swapping screens during the vertical blank. The Copper must be given the right information at the right time to take care of all this. This information is gathered together in the Copper lists.

All the Copper needs to do its job are the addresses in RAM of the two sets of 5 bit planes that make up the two screens. These addresses are entered, one at a time, together with corresponding special System addresses (called unsurprisingly Bit Plane Pointers) into two lists which the Copper will use to make the display. Of course the Copper will also be told how many bit planes to use, so our Copper list is really nothing more than a series of linkages connecting each bit plane address with its destination System register. In order to construct the display the Copper will, at the vertical blank, move the addresses of the bit planes to be displayed into the bit plane pointers. The hardware will take care of the rest. More details are given with the examples.
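The shape of such a list can be sketched in Python before we meet the real thing. The details here are assumptions drawn from the Amiga Hardware Reference Manual rather than from the example programs: each entry is a pair of 16-bit words (register offset, value), each bit plane pointer is split into a high and a low word in consecutive registers starting at offset $0E0, and the pair $FFFF, $FFFE marks the end of the list. The function name and the screen address are hypothetical.

```python
PLANE_BYTES = 8000
PLANES = 5
BPL1PTH = 0x0E0      # offset of the first bit plane pointer (high word)

def copper_list(screen_address):
    """Words telling the Copper where the five bit planes of a screen are."""
    words = []
    for n in range(PLANES):
        address = screen_address + n * PLANE_BYTES
        register = BPL1PTH + 4 * n
        words += [register, address >> 16]          # high word of address
        words += [register + 2, address & 0xFFFF]   # low word of address
    words += [0xFFFF, 0xFFFE]                       # end of list
    return words

# A hypothetical screen address; in practice it comes from the allocator.
print([hex(w) for w in copper_list(0x20000)])
```

With double buffering there are two such lists, one for each screen, and swapping screens amounts to telling the Copper which list to use.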

3.2. The Blitter and the Screen

The blitter, which is another hardware coprocessor residing in the Agnus chip, is indeed a mighty device. Its presence overshadows the entire operation of the Amiga making possible very high speed graphics. The word BLIT comes from BLock Image Transfer and the sound of the word itself is just right to suggest the powerful operations that can be done.

For our purposes the blitter plays a key role at several basic stages of assembling a picture on a playfield. It can be used to draw outlines, fill them in, copy them to the bit planes and perform fast erasure (though for our purposes we will bypass the line drawing for reasons explained later). In every respect it gives a great speed advantage, performing elementary graphics functions independent of the 68000, leaving it free to get on with the program. In a system without a blitter all the graphics functions would have to be done by the main processor, which slows things down.

The blitter’s main function is to copy bytes from one place to another very quickly. This is especially important in 2-D style graphics where sprites “move” by being copied from one playfield position to another. Remember a sprite is a rectangular word-wide block (one word = 2 bytes); the blitter only handles rectangles of words. In our case it will be used in the filling in of outlines and transferring images. Let’s have a look at how a picture will be constructed. Once again, this is not an exhaustive nor general description of the operations of the blitter. It is quite specifically directed towards explaining how the blitter has been used in our 3-D graphics “pipeline”.

3.2.1. The Bit Planes Layout — An Overview

It will be easier to understand what is to follow if the general bit plane strategy employed in the example programs is explained at this stage. It is not claimed by the author that this strategy is the best there is. Other programmers, no doubt, will have better ideas. However the method used relies on correct, documented, use of the custom hardware and provides an excellent insight into its operation at the deepest level.

The drawing of a new picture on the logical screen follows several stages. Each new graphics object (primitive) to be added to the composite picture is drawn from filled polygons on a special bit plane called the mask plane, which is not one of the playfield bit planes. The name “mask” has a special significance since it is projected onto the playfield bit planes in a way which is determined by what is already there and the colour of the object. This all sounds rather complicated, and so it is.

To understand the problem clearly, remember we are using the blitter throughout and must stick to the rules. The blitter likes to work in word-wide rectangles of RAM and so we copy a rectangle which completely encloses the image. Now we have to be careful since if we simply copy the image on the mask plane to the playfield it will blot out a whole rectangle including everything to the side of the object as well as what is underneath. What we want to do is only cover what is obscured by the object itself, not everything in its blit rectangle. Somehow the “background” itself will have to figure in the copying process.

To make things easy, consider the case where there is only one bit plane, though of course in fact we will always have five. We’ll discuss the added complication of several bit planes afterwards. With one bit plane in the playfield, let us suppose that we are building up a composite picture from a triangle and a square. The square has been copied to the playfield bit plane and the triangle has been drawn on the mask plane and is ready to be copied to the bit plane. Figure 3.2 shows what is going on.

figure 03 02
Figure 3.2: Copying the triangle to the bit plane
figure 03 03
Figure 3.3: What we don’t want - the square is blitted out
figure 03 04
Figure 3.4: Correctly copying the triangle

If we simply blitted the rectangular mask containing the triangle to the bit plane it would block out the entire square as shown in Figure 3.3. What we want instead is the result in Figure 3.4.

The blitter can handle this: it just needs to be shown the destination background rectangle and with a clever bit of logic called a “cookie-cut” function (like cutting out cookies with a shaped cutter) will combine both the triangle and square without obscuring anything.
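As a rough illustration of the cookie-cut idea for a single bit plane, the sketch below treats each row of the blit rectangle as a Python integer with one bit per pixel. It assumes the commonly quoted cookie-cut rule, result = (mask AND image) OR (NOT mask AND background). The function names and the 8-pixel rows are hypothetical, and the blitter codes actually used by the example program are left to the later discussion.

```python
WIDTH = 8  # hypothetical row width in pixels, one bit per pixel

def cookie_cut(mask, image, background):
    """Combine one row: take the image where the mask is 1, the background elsewhere."""
    keep = ~mask & ((1 << WIDTH) - 1)   # NOT mask, limited to WIDTH bits
    return (mask & image) | (keep & background)

def plain_copy(image):
    """A plain rectangular copy ignores the background completely."""
    return image

square_row = 0b00111100     # background already on the bit plane
triangle_row = 0b11100000   # one row of the triangle on the mask plane

# Here the triangle acts as its own mask: mask and image are the same bits.
good = cookie_cut(triangle_row, triangle_row, square_row)
bad = plain_copy(triangle_row)

print(format(good, "08b"))  # triangle and square both survive
print(format(bad, "08b"))   # the square has been blotted out
```

The first result corresponds to Figure 3.4 and the second to Figure 3.3.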

There is the additional complication of the 5 bit planes required to encode 32 colours to discuss. Now we can consider the triangle to have a colour — colour 5 for example. We don’t know yet what colour this is; it could be anything, depending on what number we choose to set in colour register 5. What colour 5 does mean is that inside the triangle, bits must be set 1 in bit planes 1 and 3 and cleared 0 in all the other planes. It isn’t sufficient to only set the bits in planes 1 and 3 and ignore the others. Hence the true meaning of the object as a mask is revealed; the mask, like a stencil, must set bits 1 in some bit planes and clear bits 0 in others, depending on the colour.
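The stencil behaviour can be sketched in a few lines. This is only a model of the logic, not of the blitter: each bit plane row is a Python integer, and the helper names stencil_row and pixel_colour are hypothetical.

```python
NUM_PLANES = 5
WIDTH = 8  # hypothetical row width in pixels

def stencil_row(planes, mask, colour):
    """Set mask bits in planes where the colour has a 1, clear them where it has a 0."""
    full = (1 << WIDTH) - 1
    result = []
    for n, row in enumerate(planes):      # n = 0 corresponds to bit plane 1
        if (colour >> n) & 1:
            result.append(row | mask)
        else:
            result.append(row & ~mask & full)
    return result

def pixel_colour(planes, bit):
    """Read back the colour number of one pixel from all the planes."""
    return sum(((row >> bit) & 1) << n for n, row in enumerate(planes))

background = [0b11111111] * NUM_PLANES   # colour 31 everywhere
mask = 0b00111000                        # one row of the object on the mask plane
after = stencil_row(background, mask, 5)

print(pixel_colour(after, 4))  # inside the mask
print(pixel_colour(after, 0))  # outside the mask
```

Starting from a background of colour 31 shows why clearing matters: if only planes 1 and 3 were set and the others ignored, the pixels inside the mask would stay at colour 31 instead of becoming colour 5.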

Taken altogether the above discussion sounds like a lot of lengthy programming and hard work. It isn’t. These complex manoeuvres can be simply accomplished by writing the appropriate codes to blitter registers. Understanding what codes to write is another thing. But we will delay the details until the example program is presented so that you have something to look at.

Finally the blitter can be used to erase the logical screen just before the next frame is drawn. It turns out that erasing five bit planes at a time (40,000 bytes) is a time consuming operation helped enormously by the blitter.

3.3. Drawing

At the heart of our fast graphics program are the routines which draw and fill in polygons. Using polyhedra as models for solid 3D objects will produce many polygonal surfaces to fill in. The job is best done in two stages: first an outline is drawn by joining up the vertices at the corners, then the region within the outline is filled in. It turns out that this isn’t quite as simple as it sounds. To understand why, let’s look at how fast line drawing and region filling can be done.

The procedure for drawing lines quickly on a computer screen has been around for a long time. It is called the Bresenham algorithm. It is fast because it needs only integer addition and subtraction; in particular it doesn’t use division or multiplication, which are among the slowest instructions in the 68000 set. The blitter itself has the capability to draw lines and even special lines for outlines to be blitter filled, but our routines do not use the blitter for line drawing. Why is this so? The answer lies in the way the blitter draws outlines and what it expects them to look like for filling in.

The blitter can only fill in a closed polygon by a series of raster scans starting at the top of the screen and working down a line at a time. It expects to find only two pixels set on each line and will then start filling in at the first pixel and stop at the second. Now you see the problem. If it ever finds more than two pixels set per scan line it’s in trouble. Suppose, by accident, there are three pixels set. It will start filling at the first, stop filling at the second and start again at the third. But without a fourth pixel to stop at it will go on filling to the edge of the screen, which is not what is desired.
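The trouble with a stray third pixel is easy to reproduce in a toy model. The sketch below is a simplified, left-to-right imitation of the fill rule just described, with a scan line held as a Python list of 0s and 1s; it is not a description of the blitter's internal workings, and fill_row is a hypothetical name.

```python
def fill_row(row):
    """Fill one scan line by toggling at each set pixel; outline pixels stay set."""
    out = []
    inside = False
    for pixel in row:
        if pixel:
            inside = not inside   # start filling, or stop filling
            out.append(1)
        else:
            out.append(1 if inside else 0)
    return out

two_pixels = [0, 1, 0, 0, 1, 0, 0, 0]
three_pixels = [0, 1, 0, 1, 0, 1, 0, 0]

print(fill_row(two_pixels))    # a clean span between the two outline pixels
print(fill_row(three_pixels))  # the fill restarts and runs to the edge
```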

For most carefully chosen situations a special line drawing mode of the blitter can be selected so that there is no problem but, with the dynamic action of 3-D graphics, situations occasionally arise where the blitter ends up drawing an outline which has more than two pixels set per scanline. This is not really a fault of the blitter but rather a complication of passing information to it concerning the positions of polygon vertices to be connected in the outline. To solve the problem in the general case becomes laborious, so for the routines in this book the problem has been avoided by using custom routines that always work, instead of the blitter. There is no loss of speed here since drawing outlines is very quick compared to all the other chores.

3.3.1. The Bresenham Line Drawing Algorithm

How can you draw a line between two points without using the equation

y = mx + c

where a multiplication is required to get the y-value for a given x value?

Fortunately this problem was solved many years ago, in 1962, by J.E. Bresenham. The problem at that time was to control a digital plotter which could neither multiply nor divide. Such operations are available on the 68000 but they are time consuming and we want to avoid them where possible. The great advantage of the Bresenham algorithm is that it can find all the screen coordinates of a line using only additions and subtractions. When described in algebraic terms the Bresenham algorithm looks intimidating but, like all great ideas, is really very simple. Of course some (though not all) commercial programs use algorithms which draw lines and fill polygons faster than the Bresenham method will allow, but having understood it you can try to do better. In any case it is very elegant and very fast.

The problem facing us is to find the (x,y) coordinate pairs along the sides of a polygon so that we can use them as the start and end points for horizontal lines to do a fill. The fill of a very small area, chosen so as to exaggerate the irregularity caused by the pixels, is shown in Figure 3.5. Regarding the boundary as a line, we see that it looks different in different screen resolutions. At the normal resolution, the position of a pixel on the screen is specified by an integer value between 0 and 319 horizontally and between 0 and 199 vertically. With this limitation any line (unless it is either horizontal or vertical) will, under a magnifying glass, look like a staircase. This is shown in Figure 3.6. There is clearly no need for us to try to calculate the coordinates of a point to better accuracy than the screen resolution will allow, which means that integer arithmetic is quite adequate. There is no point in calculating the position of a point on the screen to 4 places of decimals because it can only be plotted to no places of decimals. The Bresenham strategy owes its success to the way it fits in with the pixel layout of the screen. Here is the way it works.

figure 03 05
Figure 3.5: A small polygon enlarged to show pixels

Let us suppose that we are plotting a line on the screen which starts at the point S(x1,y1) and ends at point T(x2,y2) as shown in Figure 3.6. These points will, of course, lie precisely on the line. Now we could take a pencil and ruler and draw an ideal mathematical line between the two end points and then shade in those pixels which lie closest to the line. This is how our line will look on the screen. The result is shown in the figure where the pixels are represented by squares. We want an algorithm to do what the human brain does automatically in deciding which points to shade.

Here is the Bresenham algorithm which does this. To make the picture simpler we replace each pixel by a dot at its centre which makes very clear the degree to which each pixel misses the ideal line. Suppose we have just reached the point A, which didn’t lie precisely on the line, and we have to choose which point to do next. The next point could be B(x+1,y) or C(x+1,y+1). It seems an obvious choice; point C because it is closer. Closer in this sense means a shorter vertical distance to the line at the point E from the centre of the pixel. We can call this the error. On the diagram, error t is less than error s. Notice that somehow we didn’t consider point H in this decision. That’s because the angle of the line is less than 45°. If the angle had been greater than 45°, we would have considered the points H and C. Already it is clear that lines of slope less than 1 (angle less than 45°) are a different case from lines of slope greater than 1 (angle greater than 45°). We will come back to this later.

figure 03 06
Figure 3.6: Pixel positions along a line

Well it looks like the problem is solved! Just inspect the next two points ahead, like B and C, calculate the vertical distance of each to the line and choose the shorter. In principle that’s it. If the vertical distance up to the ideal line is taken as a positive error (like s) and a vertical distance down to the line is taken as a negative error (like t) then the overall quantity on which the choice is based is (s-t):

  if (s-t) = D is positive, the next point is C
  if (s-t) = D is negative, the next point is B.

The quantity (s-t) is called the decision variable D for obvious reasons.

Bresenham’s great innovation was to spot two tricks to make this a simple operation. The first is that since only the sign of (s-t) matters, any quantity which is proportional to (s-t) will do. The second is that there is no need to re-do this calculation each time. The value of D used for the present choice can be quickly corrected to find the value of D for the next choice.

So it goes like this. The updated decision variable, D, is tested to see if it is positive or negative. If it is negative the next point to set is B. Then D is updated accordingly. If it is positive, the next point to set is C. Then D is updated accordingly. We just have to find out what these updates are and what the value of D at the very start of the line should be.

The key to answering these questions is to look at how to get from A to B or from A to C. To get from A to B do a horizontal move; to get from A to C do a horizontal followed by a vertical move. To calculate the errors associated with the individual horizontal and vertical moves it is simpler to look at point S. From this a horizontal move produces an error of AF, but a simple vertical move to G produces an error of -SG (points below the ideal line have a positive error and points above have a negative error). But SG is equal to SA, so we really only have to consider the relative lengths of the vertical and horizontal sides of the triangle SAF. But, very important, triangle SAF is similar to the overall triangle SUT and the sides are in proportion:

  AF/SA = TU/SU = (y2-y1)/(x2-x1) = dy/dx

where dy is the overall distance in y and dx is the overall distance in x from the start to the end of the line.

As we have said, anything in proportion will do, so the errors could be taken as dy and -dx. A further factor of 2, which still keeps everything in proportion, will bring us into line with Bresenham’s original scheme:

simple horizontal move: error = 2dy

simple vertical move: error = -2dx

For the actual moves from A to B or from A to C:

horizontal move (AB): error1 = 2dy

horizontal plus vertical move (AC): error2 = 2dy-2dx

These are the updates which must be made to the decision variable D, for the next choice.

Finally, what value of D should we start with? Everything works fine if we take the starting value D1 as the average error of error1 and error2

  D1 = (error1 + error2)/2 = 2dy-dx

To summarise, here’s the algorithm

  1. initialize the first point to x1,y1 and the initial value of D to D1,

  2. if D is -ve, increment x but don’t increment y and make D = D + error1; if D is +ve, increment both x and y and make D = D + error2,

  3. repeat step 2 until x = x2.
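The three steps translate almost directly into code. The following is a minimal sketch for lines of slope between 0 and 1 drawn from left to right; the function name is hypothetical, and treating D = 0 like a positive D is an assumption, since the summary does not say which way a tie should go.

```python
def bresenham_shallow(x1, y1, x2, y2):
    """Points on a line with 0 <= slope <= 1 and x1 < x2, using only add and subtract."""
    dx = x2 - x1
    dy = y2 - y1
    error1 = dy + dy             # 2dy: update after a horizontal move (A to B)
    error2 = error1 - dx - dx    # 2dy-2dx: update after a move from A to C
    d = error1 - dx              # starting value D1 = 2dy-dx
    x, y = x1, y1
    points = [(x, y)]
    while x != x2:
        x += 1
        if d < 0:
            d += error1
        else:                    # assumption: D = 0 is treated like a positive D
            y += 1
            d += error2
        points.append((x, y))
    return points

print(bresenham_shallow(0, 0, 5, 2))
```

For the line from (0,0) to (5,2) the decision variable starts at -1 and then takes the values 3, -3, 1, -5 and -1, so the vertical steps occur at x = 2 and x = 4.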

Now what about lines which have a slope greater than 1? The solution is very simple. To see it clearly, just draw a line with slope greater than 1 on a piece of tracing paper and clearly label the x and y axes. Now turn the tracing paper over. With the y axis horizontal and the x axis vertical, it now looks like our original line of slope less than 1 except that the x and y axes have been interchanged. Everything therefore works exactly as before if x and y are interchanged in the formulae.
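The tracing paper trick can be sketched as a thin wrapper around the shallow-line routine. The names are hypothetical and, to keep the example short, it only handles lines running up and to the right (x2 > x1 and y2 > y1).

```python
def shallow(x1, y1, x2, y2):
    """Bresenham for 0 <= slope <= 1 and x1 < x2 (tie D = 0 treated as positive)."""
    dx, dy = x2 - x1, y2 - y1
    error1 = dy + dy
    error2 = error1 - dx - dx
    d = error1 - dx
    x, y = x1, y1
    points = [(x, y)]
    while x != x2:
        x += 1
        if d < 0:
            d += error1
        else:
            y += 1
            d += error2
        points.append((x, y))
    return points

def line(x1, y1, x2, y2):
    """Handle slope > 1 by interchanging x and y, then swapping the results back."""
    if (y2 - y1) > (x2 - x1):
        swapped = shallow(y1, x1, y2, x2)        # pairs come back as (y, x)
        return [(x, y) for (y, x) in swapped]
    return shallow(x1, y1, x2, y2)

print(line(0, 0, 2, 5))
```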

3.3.2. Tailoring Bresenham to the Polygon Fill

The blitter is ideally suited to filling in outlines but, as we have said, it needs those outlines to be drawn in a special way. In particular it wants to find only two pixels set on each scan line: one to start filling at and the other to stop.

The procedure we have described will certainly generate points along a line, but for our purpose we do not need them all. When considering lines of slope less than 1, points which lie on the horizontal part of the “staircase”, such as S and A, all have the same y coordinate but different x coordinates. Only the x-coordinate of the first one, S, is required since the others, like A, will confuse the blitter. The first one in the line follows immediately the change in sign of D. Our version of the Bresenham algorithm is modified to generate only the start and end coordinates of horizontal lines for raster scans to fill a convex polygon. It is not exactly a Bresenham algorithm in the usual sense since the coordinates it generates would, if plotted alone, produce a line full of holes along horizontals. But that’s how the blitter likes it.
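One way to picture the modification is to run the ordinary algorithm but keep a point only when a new scan line is reached. The sketch below does this for a line of slope between 0 and 1; it illustrates the idea only and is not the routine from core_00.s, and edge_starts is a hypothetical name.

```python
def edge_starts(x1, y1, x2, y2):
    """For 0 <= slope <= 1 and x1 < x2, keep only the first pixel on each scan line."""
    dx = x2 - x1
    dy = y2 - y1
    error1 = dy + dy
    error2 = error1 - dx - dx
    d = error1 - dx
    x, y = x1, y1
    kept = [(x, y)]                  # the start point S is always kept
    while x != x2:
        x += 1
        if d < 0:
            d += error1              # same scan line: this pixel is skipped
        else:
            y += 1
            d += error2
            kept.append((x, y))      # first pixel on the new scan line
    return kept

print(edge_starts(0, 0, 5, 2))
```

Plotted alone these points leave holes along each horizontal run, but every scan line crossed by the edge receives exactly one pixel from it.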

3.4. Example Program

There is one example program included in this chapter but it is quite long and it illustrates all of the points included in the above discussion and plenty more besides. It doesn’t do a great deal; it draws two solid triangles one after another with double buffering, but contains all the routines we will need in later stages of the book. The coordinates of the triangles are given at the end of the file polydraw.s.

It is not claimed here that the routines are the best or fastest possible. Other versions may be more elegant and faster. But these programs are fast and do the job adequately. Besides, they do have an educational value, illustrating various aspects of assembly code programming and functions of the Amiga hardware. When you have studied how they work you may wish to make your own improvements.

In order to prevent programs evolving in a disorganised mess, several files have been set up containing subroutines of a similar kind. There will always be a main control program which will have a different name in each chapter. In addition there will probably be a core file which will contain all the important subroutines, a systm file with “housekeeping” (like erasing screens) subroutines, a bss file containing labels of variables and a data file containing numbers. Files generally end with the .s extension to show that they are source files. In general these files are added in at assembly with the powerful include directive so that once written they are there for the future.

Here is a discussion of each self contained program file, with an explanation of its salient features. The files themselves are listed on the succeeding pages. They are ready for assembly by Devpac Amiga, but any 68000 assembler can be used providing changes are made to fit in with its rules of syntax as specified in its manual.

Since this is not a textbook of all the hardware details of the Amiga, it is inevitable that frequently only brief mentions are given to many features which deserve a complete section to themselves. This is regrettable but essential to preserve the flow of the narrative and not to lose sight of the overall objective. The interested reader will find a complete hardware description in the Amiga Hardware Reference Manual.

3.4.1. Polydraw.s

This is the main control program. It calls lots of subroutines which are contained in the other files brought in by the include directives at the beginning. The include directive makes the assembler insert the whole of the source file referred to at that point in the program when the assembling is done. All of the subroutines called in this file are described in detail in the core_00.s and systm_00.s sections.

First memory is allocated for screens and copper lists. Then the Copper lists are constructed followed by the blitter allocation and the writing of the colour table. There is also a look-up table written to speed access to the mask plane. Then the program enters an infinite loop, which you can only interrupt by resetting the machine. It first draws a triangle on screen 1, the first logical screen, and displays screen 2 which, at the start is empty. Then it displays the triangle on screen 1 (now the physical screen) and draws an inverted triangle on screen 2 which has now become the next logical screen. Now the bottom of the loop has been reached and with a

bra blit_loop

the cycle is repeated. All the important work is done in the long subroutine poly_fil, which does just what its name suggests: it fills polygons.

That’s the main program. You could try changing the shapes which are drawn. They are filled polygons and the coordinates of their vertices are given at the end of the file. More about that later.

The program is simple but it illustrates the things we have been talking about and contains all the routines we will use later.

3.4.2. systm_00.s

This is a “housekeeping” file; it contains utility routines which are frequently used. At the start is the routine to allocate memory space for the screens and the Copper lists; this must be done by calls to routines which the Amiga Operating System collects into special libraries. The Amiga uses dynamic memory management in which free space is allocated as required. Unlike the early 8-bit micros, you can’t assume that a particular range of memory addresses is reserved for programs and another range for the screen. You have to ask the Operating System for space and wait to see what you get. If the system cannot find memory to allocate it will return a 0 in the register d0 after the call to allocmem. The program doesn’t check for this condition, but it is most likely to occur with large programs and is something to watch for.

For our purposes the screens must be in chip memory so that they can be accessed by the custom hardware. Remember this is the lower 512k range (if your computer only has this much then it’s all chip memory) and it is possible to specify that the space we want is in this memory. There is another request we can make: we can ask for the memory to be cleared when it is allocated, since it’s useful to start off with a clean slate. Both of these requests are made by placing the long word $10002 in d1 before calling the allocmem routine. Before we look at this in any more detail, let’s see briefly how libraries are used.

Libraries

There are several libraries in the Amiga. They are sets of routines of a similar kind provided “free of charge” by the Operating System. The way they are laid out and the fact that they are collected into libraries reflects the C language orientation of the system. Of course you could replace them with your own routines, but why bother (no doubt if the Amiga programmers had included a set of vector graphics routines in the libraries this book would never have been written).

The way in which a particular library function is accessed in assembler is quite straightforward since each subroutine starts at a fixed distance (offset) in memory from the start, or base, address of the library (each library will have a different base address). So to run a library routine you first get the base address, then set up any entry parameters that are required and finally jump to the offset from the base address. When the function has done its work you are returned to the main program.

In allocmem a particularly important library called Exec library is used to allocate memory for the screens and Copper lists. Exec is a kind of Master Library with functions that are immediately available after reset. Its base address is always the same but in any case it is stored as a long word at memory address 4 and can be placed in a register, such as a6, by the instruction

move.l 4,a6

This is what happens in allocmem except that the constant execbase (defined equal to 4 in equates.s) is used instead as it is more readable.

Looking at allocmem we can see that first the base address of Exec library is placed in register a6, then the space required, equal to 12 bit planes (to build the two five-plane screens, the mask plane and the store plane, where the background is stored preceding the cookie-cut), is written into register d0. Then the long word $10002, being a combination of $2 to specify chip RAM and $10000 for clearing memory during allocation, is written into d1. A call to the function allocmem (offset by -198 bytes from the base address) then allocates the memory, which is partitioned into the various parts. Finally, using further calls to the function allocmem, space is allocated for the two Copper lists, one for each screen. Notice that the offsets are given the names of the function to call, as defined in the equates.s file, so as to make the program readable. One other interesting feature is that function offsets are all negative. Positive offsets lead into the ExecBase structure which is where many important Operating System variables are stored.
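The numbers involved in the request are easily checked. In the sketch below the flag values are those quoted in the text and MEMF_CHIP and MEMF_CLEAR are the conventional Exec names for them; the plane size assumes a 320 by 200 pixel plane, and the order in which the block is partitioned is purely hypothetical.

```python
MEMF_CHIP = 0x00002    # ask for chip RAM
MEMF_CLEAR = 0x10000   # ask for the memory to be cleared

requirements = MEMF_CHIP | MEMF_CLEAR       # the long word placed in d1

PLANE_BYTES = 320 * 200 // 8                # one bit per pixel
PLANES_PER_SCREEN = 5
total_planes = 2 * PLANES_PER_SCREEN + 2    # two screens, mask plane, store plane
total_bytes = total_planes * PLANE_BYTES    # the size placed in d0

# Hypothetical partitioning of the single block into its parts (byte offsets).
layout = {
    "screen1": 0,
    "screen2": PLANES_PER_SCREEN * PLANE_BYTES,
    "mask": 2 * PLANES_PER_SCREEN * PLANE_BYTES,
    "store": (2 * PLANES_PER_SCREEN + 1) * PLANE_BYTES,
}

print(hex(requirements), total_planes, total_bytes)
```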

Following this the two Copper lists are set up. The Copper can perform several useful functions, but all we plan to use it for is to switch screens during the vertical blank. The two Copper lists contain the addresses of the system plane pointers, bpl1pt, bpl2pt,..etc., side by side with the actual screen plane addresses, in succession. That’s all that is required to make up the Copper lists. The hardware will do the job of displaying the screen pointed to by the active Copper list. There’s one small trick at the end of the lists. The copper is a processor in its own right and it wants to execute instructions. The last instruction in each of the lists (the long word $fffffffe) tells the copper to wait until the screen raster gets to the impossible position y = $ff and x = $fe. This is impossible since the largest horizontal position it can record is $e2. So what it does is to wait for an event that will never happen. It is put out of its misery when the vertical blank interrupt occurs, at which point we will switch the copper lists to implement double buffering.
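To make the layout concrete, here is a sketch that builds such a list as a sequence of 16-bit words. It assumes the usual hardware layout in which each Copper move is a register offset followed by a data word, each bit plane pointer is a pair of word registers (high word then low word) and the first pointer sits at offset $0e0 from the chip register base. The function name and the screen address are hypothetical.

```python
BPL1PTH = 0x0E0     # offset of the first bit plane pointer (high word)
PLANE_BYTES = 8000  # one 320 x 200 bit plane

def build_copper_list(plane_addresses):
    """Pair each bit plane pointer register with the address to load into it."""
    words = []
    for n, address in enumerate(plane_addresses):
        reg_high = BPL1PTH + 4 * n                       # bplNpt, high word
        words += [reg_high, (address >> 16) & 0xFFFF]
        words += [reg_high + 2, address & 0xFFFF]        # bplNpt, low word
    words += [0xFFFF, 0xFFFE]                            # wait for an impossible position
    return words

screen_base = 0x20000                                    # hypothetical chip RAM address
planes = [screen_base + n * PLANE_BYTES for n in range(5)]
copper_list = build_copper_list(planes)

print([format(w, "04x") for w in copper_list[:8]])
```

Five planes need twenty words of moves, so with the terminating long word each list occupies 44 bytes.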

In blit_alloc the blitter is reserved for our use alone and task switching is turned off. Let’s discuss what’s going on here.

The blitter can perform many functions; it is even involved in reading the disk drive. Since we intend using it intensively it makes sense to disable all other applications. To do this we have to use the OwnBlitter function in the graphics library. There is a problem here in that, unlike the ExecBase, the base address of other libraries is not known to the system; it must be found. There is however an Exec library function called OpenLibrary which finds other library addresses. OpenLibrary requires a pointer to the library name in register a1 (an ASCII string will do) and the library version in d0. It returns the library address in d0 (or a zero if it fails) which can then be used as a base address for offsets for its own functions. In the program the base address is placed in a6. Having got the graphics library base address in a6 it is straightforward to jump to the OwnBlitter function as an offset defined in the equates.s file and reserve the blitter for our program alone.

Multitasking

The Amiga is a multitasking machine. Despite having only one CPU it can appear to run several programs simultaneously. This means that each task is switched on for a short time and then switched off again until its next go. Each task is slowed down but the overall effect is of multitasking. As far as we are concerned, multitasking should not occur; the last thing we want is another application altering data structures. Multitasking is switched off with the Exec library function Forbid.

DMA

Next in the subroutine init_scrns the playfield structure is set up together with the initial logical/physical screen assignment for completeness (actually the screen assignment needn’t be done here since it will be switched back and forth in the main program loop anyway). Before messing with these important data structures, one final source of outside meddling is eliminated: DMA.

DMA means direct memory access, which is a way of letting various parts of the system read and write to memory independently without having to overburden the 68000 processor. Of course, not everything can be allowed to have access to memory simultaneously and the overall control is managed by a separate processor called the DMA controller, which can be manipulated through its main register called DMACON. DMACON has two separate parts: one for read-only and the other for write-only. We wish to send instructions to DMACON so it is the write-only part that concerns us. The highest bit (15) of DMACON has a special function when a word is written to the register: if it’s 1 it sets the written bits, if it’s 0 it clears the written bits. So if you write $8004 to it bit 2 will be set, but if you write $0004 it will be cleared. Only bit 2 will be affected by this; other bits will remain undisturbed from whatever state they are in. In the write-only DMACON the bit assignments are:

  BIT       NAME      FUNCTION
  15        SET/CLR   set/clear bits
  14        -         no function
  13        -         no function
  12 & 11   -         unassigned
  10        BLTPRI    if 1, blitter has priority over 68000
  9         DMAEN     master enable for all bits 0 - 8
  8         BPLEN     enable bit plane DMA
  7         COPEN     enable Copper DMA
  6         BLTEN     enable blitter DMA
  5         SPREN     enable sprite DMA
  4         DSKEN     enable disk DMA
  3 - 0     AUDxEN    enable channels for audio DMA

To get to DMACON the base address, $dff000, of the chip register structure is put into a5 and the register itself accessed through the offset DMACON defined in equates.s. While the playfield and screen are being set up all DMA is shut off by writing the word $03ff to DMACON.
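The set/clear behaviour of bit 15 can be modelled in a few lines. This is a sketch of the logic only: the register is represented by a Python integer, write_dmacon is a hypothetical name, and the word $83c0 is shown simply as an example of switching on the master enable together with bit plane, Copper and blitter DMA.

```python
SETCLR = 0x8000
DMAEN = 1 << 9
BPLEN = 1 << 8
COPEN = 1 << 7
BLTEN = 1 << 6

def write_dmacon(state, word):
    """Return the new DMA control bits after writing a word to DMACON."""
    if word & SETCLR:
        return state | (word & 0x7FFF)     # bit 15 = 1: set the written bits
    return state & ~word & 0x7FFF          # bit 15 = 0: clear the written bits

state = 0x07FF
state = write_dmacon(state, 0x03FF)        # shut off all DMA (bits 0 to 9)
print(format(state, "04x"))

state = write_dmacon(state, SETCLR | DMAEN | BPLEN | COPEN | BLTEN)
print(format(state, "04x"))
```

Note that BLTPRI (bit 10) survives the write of $03ff untouched, because only the bits that are 1 in the written word are affected.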

The Playfield

The playfield initialization which follows next writes to a number of chip registers associated with the playfield hardware. Since we are working with five bit planes in low resolution and a standard 320 x 200 size window in the non-interlaced mode, setting these registers is about as simple as it can be, though in general the setting up is a fairly complicated operation.

First of all we have to decide how much of the playfield is to be displayed since there is an option here to have a playfield which is larger than the on-screen display. The on-screen display is called the window. Such an option would be useful in sprite graphics with scrolling scenery. We just want the playfield and window to be the same size. There are two registers called DIWSTRT (display window start) and DIWSTOP (display window stop) which have to be written to accordingly. The point of it all is that although the electron beam scans the whole monitor screen, it doesn’t draw on all of it and we have to tell the system where the display part of the screen starts and ends. This is to avoid the edges where the picture is distorted and also to leave space for the blanking gaps. Without further ado, we will use the standard values for low resolution: DIWSTRT = $2c81, DIWSTOP = $f4c1.
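It is reassuring to check that these standard values really do describe a 320 by 200 window. The sketch below decodes them, assuming the original chip set conventions that the high byte of each register is the vertical position and the low byte the horizontal position, that the horizontal stop position has an implied ninth bit of 1, and that the ninth bit of the vertical stop position is the inverse of its eighth. The function name is hypothetical.

```python
def decode_window(diwstrt, diwstop):
    """Return (width, height) of the display window in low resolution pixels and lines."""
    v_start = diwstrt >> 8
    h_start = diwstrt & 0xFF
    v_stop = diwstop >> 8
    if v_stop < 0x80:
        v_stop += 0x100                   # V8 is taken as the inverse of V7
    h_stop = (diwstop & 0xFF) + 0x100     # H8 is implied to be 1
    return h_stop - h_start, v_stop - v_start

print(decode_window(0x2C81, 0xF4C1))
```

The window therefore starts at horizontal position 129 on line 44 and stops at position 449 on line 244.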

In addition to the window position it is necessary to tell the system how to fetch and display data from the bit planes in the window. This information is contained in the DDFSTRT (data-fetch start) and DDFSTOP (data-fetch stop) registers. Once again, without further ado, we will use the normal low resolution values: DDFSTRT = $0038, DDFSTOP = $00d0.
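These two values fix how many words are fetched for each line of each bit plane. As a quick check, assuming that in low resolution one word per plane is fetched every 8 units of the data-fetch counter, the arithmetic can be sketched as follows (words_per_line is a hypothetical name):

```python
def words_per_line(ddfstrt, ddfstop):
    """Number of 16-bit words fetched per line of one bit plane in low resolution."""
    return (ddfstop - ddfstrt) // 8 + 1

words = words_per_line(0x0038, 0x00D0)
pixels = words * 16
bytes_per_plane = words * 2 * 200     # 200 lines in the window

print(words, pixels, bytes_per_plane)
```

Twenty words give 320 pixels per line and 8,000 bytes per plane, which agrees with the 40,000 bytes quoted earlier for five planes.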

The job isn’t done yet. Now we have to set up the bit plane control registers BPLCON0, BPLCON1 and BPLCON2 and the modulo registers BPL1MOD and BPL2MOD. Of these really only BPLCON0 matters as it establishes the number of bit planes in colour. It is set to the value %0101001000000000, which means low resolution (bit 15 = 0), 5 bit planes (bits 12 to 14), no hold and modify (bit 11 = 0), no dual playfield (bit 10 = 0), in colour (bit 9 = 1), no genlock audio (bit 8 = 0), no light pen, interlace mode or external synchronization (bits 3, 2 and 1 respectively); bits 0 and 4 to 7 are unused. BPLCON1 sets up scrolling which we aren’t interested in and BPLCON2 is concerned with sprites which once again we don’t want.
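The bit fields can be unpicked mechanically. The following sketch decodes the value using the bit positions listed above; the dictionary keys are descriptive labels chosen for this example rather than official register field names.

```python
def decode_bplcon0(value):
    """Split a BPLCON0 word into the fields described in the text."""
    return {
        "high_resolution": (value >> 15) & 1,
        "bit_planes": (value >> 12) & 7,
        "hold_and_modify": (value >> 11) & 1,
        "dual_playfield": (value >> 10) & 1,
        "colour": (value >> 9) & 1,
        "genlock_audio": (value >> 8) & 1,
        "light_pen": (value >> 3) & 1,
        "interlace": (value >> 2) & 1,
        "external_sync": (value >> 1) & 1,
    }

fields = decode_bplcon0(0b0101001000000000)
print(hex(0b0101001000000000), fields["bit_planes"], fields["colour"])
```

In hexadecimal the same value is $5200.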

The BPL1MOD and BPL2MOD registers are concerned with displaying a rectangular fraction of a playfield and would contain what are called modulo values for even and odd numbered bit planes. Modulo values make it possible for the system to know where the rectangular part is relative to the whole playfield. We’ll meet this idea again when we get to the blitter. Right now there is no modulo value to worry about since we’re using the entire playfield.

Finally, with the Copper list directed to screen 1, the Copper is started by writing to the COPJMP1 register, and bit plane, Copper and blitter DMA are turned on.

Colours

The next routine in systm_00.s sets up the colour palette. Remember this is the list of 32 colours (32 out of a possible 4096) which reside in the colour table. It transfers the contents of the colour list at col_tble in the file data_00.s into the 32 chip colour registers starting at COLOR00. The colours are a full set, following the spectrum, but starting with black since the first colour is the background.
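The figure of 4096 comes from the way a colour is stored: four bits each for red, green and blue packed into one word. A small sketch, assuming that layout and that the colour registers are consecutive words starting with COLOR00 at offset $180 from the chip register base, is given below. The three colours listed are made up for the example and are not the contents of col_tble.

```python
COLOR00 = 0x180   # offset of the first colour register from the chip register base

def rgb4(red, green, blue):
    """Pack three 4-bit intensities (0 to 15) into one 12-bit colour word."""
    return (red << 8) | (green << 4) | blue

def colour_register_offset(n):
    """Offset of colour register n (0 to 31); each register is one word."""
    return COLOR00 + 2 * n

palette = [rgb4(0, 0, 0), rgb4(15, 0, 0), rgb4(15, 15, 15)]  # hypothetical entries

print([format(c, "03x") for c in palette], 16 ** 3)
```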

What remains in the file systm_00.s is concerned with drawing and is a set of routines to erase the screens and other planes, which we will put to one side for a moment, and the two complementary screen buffering routines drw1_shw2 and drw2_shw1. What these do is to point the system register COP1LC at the to-be-displayed copper list and save the address of the to-be-drawn screen in the log_screen pointer ready for the blitter.

At the very end in v_blnk is a routine to wait for the vertical blank, the moment when the electron beam in the monitor flies back up to the top of the screen. In fact to avoid drawing at the very top of the screen, where there will be distortion of the picture, there are a number of lines which aren’t drawn. Of course that does not mean that we lose the top of the picture, it simply means that the picture is not started until it is some way down the monitor screen. A special chip register VPOSR keeps a record of the vertical position of the raster scan and by reading it we can find out where it is. By switching the screens when the scan is in the “hidden” band at the top we can hide the change. A sufficiently hidden line is number $10. What v_blnk does is to wait until this position is reached. VPOSR is really a long word register with the vertical position in the bit range 8 to 16, i.e. it spans two words, so those bits have to be singled out.
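Singling out bits 8 to 16 amounts to a shift and a mask. Here is a rough Python model of the idea, assuming the long word has already been read from VPOSR; the names vertical_position, wait_for_line and read_vposr are made up for this sketch and the real v_blnk routine may differ in detail.

```python
def vertical_position(vposr_long):
    """Return the 9-bit beam line held in bits 8 to 16 of the long word."""
    return (vposr_long >> 8) & 0x1FF

def wait_for_line(read_vposr, line=0x10):
    """Poll read_vposr() until the beam is on `line`; return the poll count."""
    polls = 1
    while vertical_position(read_vposr()) != line:
        polls += 1
    return polls

# Stand-in for the hardware: three successive readings of the register.
samples = iter([0x00000F00, 0x00000F80, 0x00001000])
print(wait_for_line(lambda: next(samples)))
```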

3.4.3. core_00.s

Here’s where all the action is. This file contains the important subroutine, poly_fill, which does the drawing.

Notice how the routine blt_chk is used before writing to the blitter registers throughout this large section of code. That is because the blitter runs independently of the 68000 CPU and is likely to be completing the previous task when it is next needed. As a consequence we must wait for it to finish. Any attempt to alter its register contents while it’s still going will lead to strange effects.

Let’s briefly recap what’s going on here. The routine is passed the coordinates of the vertices of a polygon which is to be filled with a particular colour. To do this it first has to draw an outline in a form acceptable for filling and then fill it using the blitter for speed. In addition the blitter will be involved in erasing the screen ready for the next frame. Though the routines for that are in the file systm_00.s, they look very similar to the one which fills a polygon. In fact in erasure the screen is simply filled with nothing! Let’s look in detail at what poly_fill contains.

Drawing the Outline

Before starting anything the mask plane is erased.

We have already met the modified Bresenham algorithm that draws outlines ready for the blitter to fill. That is what is done in Part 1 of poly_fill. First the coordinates of the outline are calculated and temporarily stored in a reserved part of RAM (the x-buffer) consisting of 200 long words, one per scan line, which starts at xbuf. Each long word refers to a particular y-position on the mask plane and holds in its high and low words the start and end x-coordinates of the outline. This turns out to be a simple way of avoiding setting more than two pixels per scan (y) line. The x-buffer can only hold one start and one end x-value, and that’s an end to it!
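To make the x-buffer idea concrete, here is a small Python model. It is a sketch under stated assumptions, not a translation of the listing: the edges are walked with simple rounded interpolation instead of the Bresenham stepping, and the name fill_xbuf is invented. What it keeps is the essential trick that descending edges write the first slot of each entry and ascending edges the second, so a scan line can never hold more than two x-values.

```python
def fill_xbuf(vertices, height=200):
    """vertices is a closed list [(x1, y1), ..., (xn, yn), (x1, y1)].
    Returns one [first, second] pair per scan line (None = not written)."""
    xbuf = [[None, None] for _ in range(height)]
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:]):
        if y1 == y2:
            continue                      # horizontal edges are ignored
        slot = 1 if y2 > y1 else 0        # ascending -> second slot
        if y2 < y1:                       # walk every edge with y increasing
            x1, y1, x2, y2 = x2, y2, x1, y1
        dy = y2 - y1
        for y in range(y1, y2 + 1):
            # rounded integer interpolation along the edge
            x = x1 + ((x2 - x1) * (y - y1) + dy // 2) // dy
            xbuf[y][slot] = x
    return xbuf

# The first triangle from Polydraw.s
xbuf = fill_xbuf([(100, 160), (150, 70), (190, 140), (100, 160)])
print(xbuf[70], xbuf[150], xbuf[160])
```

Lines outside the range ymin to ymax are never touched, which is why the listing records those two limits as it goes.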

Once the outline has been assembled in the x-buffer it is drawn onto the mask plane in Part 2.

Filling In With the Blitter

In Part 3 of poly_fill the outline drawn in the mask plane is filled in, and in Part 4 it is copied onto each of the five bit planes which make up the logical screen. What appears on each bit plane is determined by the colour and what logic has been included.

During the fill of the outline on the mask plane we can save time by confining the blitter’s attention to a rectangle containing the polygon. This is especially important when only small objects are being drawn. We don’t want the blitter wasting time looking at parts of the screen which are empty. The vertices of the polygon can be used to construct the rectangle.

The settings for the blitter registers now start to make some sense when interpreted in terms of the basic blitting process, i.e. copying data from here to there. To see the blitter in all its glory we will have to wait until the mapping of the mask plane to the screen bit planes is discussed (next).

One curious feature of blitter filling is that it works backwards; the fill occurs in order of descending addresses from right to left in screen coordinates and therefore starts at the highest address on the outline which is the “bottom right-hand” vertex. This address must be given to the two blit pointers A and D. Normally in a simple block image transfer these pointers point respectively at the source and destination addresses. In this case they are one and the same thing.
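The right-to-left behaviour is easy to model for a single scan line. The sketch below is a Python model of an inclusive fill, assuming the fill state starts cleared and is toggled by every set pixel met on the way; the name inclusive_fill_row is invented for illustration.

```python
def inclusive_fill_row(bits):
    """bits is one scan line as a list of 0/1, index 0 being the leftmost pixel.
    The row is scanned from right to left; each set pixel toggles the fill
    state, and inclusive mode keeps the boundary pixels themselves."""
    out = [0] * len(bits)
    filling = 0
    for i in range(len(bits) - 1, -1, -1):   # descending order
        out[i] = bits[i] | filling
        filling ^= bits[i]
    return out

print(inclusive_fill_row([0, 0, 1, 0, 0, 0, 1, 0]))
print(inclusive_fill_row([0, 1, 0]))
```

The second call shows what goes wrong with a single pixel on a line: the fill is switched on and never off again, so it runs to the left-hand edge. In this model that is the reason the outline must have exactly two set pixels on every scan line.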

The modulo registers BLTAMOD and BLTDMOD are set with the difference between the plane width and the width of the blit rectangle, both in bytes. These allow the blitter to start at the right place on each successive line.
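As a worked example of these quantities, the following Python sketch repeats the arithmetic done in Part 3 of the listing for a 40-byte-wide plane. The function name fill_blit_geometry is an assumption of this sketch; the numbers fed in are the limits of the first triangle in Polydraw.s.

```python
PLANE_WIDTH_BYTES = 40   # one 320-pixel line, as in the text

def fill_blit_geometry(xmin, xmax, ymax):
    """Return (width in words, modulo in bytes, start offset in bytes)."""
    first_word = xmin >> 4                  # xmin/16
    last_word = xmax >> 4                   # xmax/16
    width_words = last_word - first_word + 1
    modulo = PLANE_WIDTH_BYTES - 2 * width_words
    # descending blits start at the bottom right-hand word of the rectangle
    start_offset = ymax * PLANE_WIDTH_BYTES + 2 * last_word
    return width_words, modulo, start_offset

print(fill_blit_geometry(xmin=100, xmax=190, ymax=160))
```

So for this triangle the blitter handles 6 words per line and skips 28 bytes to reach the next one, instead of working through all 20 words of every line.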

The contents of the registers BLTCON0 and BLTCON1 now have specific meanings. In BLTCON0 the top bits 12-15 are concerned with shifting, which we never use, so they are set to $0. Bits 8-11 turn on DMA channels A, B, C and D (bit 11 is A, bit 8 is D). In the limited application of filling only A and D are used so the nibble is $9. Bits 0-7 set the logic function which combines the channels in a particular way. In this case, to simply copy A to D, they are set to $F0 (the setting of these logic functions is discussed in more detail below). Together these give the value $09F0. In BLTCON1 the top bits 12-15 are again concerned with shifting and not used. Bit 3 is set for an inclusive fill, which means include the boundary lines (bit 4 is set instead for an exclusive fill). Bit 1 has to be set for descending mode. Together these give the value $000A. There are two other registers, BLTAFWM and BLTALWM, which contain masks to filter out unwanted bits within a word but to the side of the object being copied. In filling they are set to all 1s.

Copying With the Blitter

Here’s where the blitter excels. What we want to do is copy the filled-in polygon shape from the mask plane to the five bit planes which make up the logical screen. How we go about this is linked to both the colour of the polygon and how the composite picture on the screen is put together. Remember the 32 available colours have their codes entered in the 32 colour registers of the colour table. That’s where the 5 bit planes come from, since 2^5 = 32. The number of the colour register to be used, when converted to binary, tells you which bit planes have to be set and which have to be cleared to use its colour. So if we want to use the colour in register 7 (binary 00111), the first, second and third bit planes have to be set to 1 and the fourth and fifth bit planes have to be cleared over regions where the image is set in the mask plane. As far as the blitter is concerned, this means five block image transfers, which is what it was made for.
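The link between a colour register number and the bit planes is just its binary expansion, lowest bit first. A minimal Python illustration follows; the function name is invented for the sketch.

```python
def planes_for_colour(register, depth=5):
    """Return one 0/1 flag per bit plane, first plane first."""
    return [(register >> plane) & 1 for plane in range(depth)]

print(planes_for_colour(7))    # the example from the text
print(planes_for_colour(12))   # the second triangle in Polydraw.s
```

This is the same test the listing makes with a right shift of the colour word, one plane at a time.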

The additional complication comes from the way in which the composite picture on the logical screen is put together. Since it is to be a scene of some kind, where distant objects are obscured by near ones, we have to put the distant objects in first and the near ones in last. Hence in copying an image from the mask plane to the bit planes some care must be taken not to mess up what is already there, particularly because what is copied is everything within the rectangle containing the polygon, including the space around the polygon (all this is shown in figures 3.2, 3.3 and 3.4). The key to all this is minterms, which are a way of describing the logical operations that control how the mask and its screen destination interact. What we want to do is the “cookie cut” function. Here’s how it works.

First let’s assume a filled polygon has been drawn in the mask plane. Now we are going to copy the rectangle which just surrounds the polygon to the first bit plane of the logical screen. If the first colour bit is set, what is set 1 in the mask must be set 1 in the first bit plane. But what is set 0 in the mask (i.e. the space around the polygon but within the entire blitted rectangle), must not overwrite what is already in the bit plane. On the other hand, if the first colour bit is not set, we want what is set 1 in the mask to be set 0 in the bit plane. But the border around the polygon to the edge of the blitted rectangle must not alter what is already on the bit plane.

Now it is quite clear that what we are doing is combining the image already on the bit plane with the mask image with bit-wise logic. We can think of the process as a logical combination of two source data channels, mask and bit plane, to produce a destination channel, the final bit plane. This is how the blitter treats the data. It has four DMA channels — three sources called A, B and C and a destination channel called D. For our purposes, the mask plane is in channel A, the original bit plane data is in channel B and the final combination is in channel D. Channel C is not used. To avoid a conflict, since the bit plane is both a source and destination, the active rectangle in it is first saved as the storeplane. So in the actual logical combination it is the storeplane which is channel B.

Having labelled the planes A, B and D it is now easy to state what we want to happen in the logical combination:

If the colour bit is set, a bit in D (final bit plane) must be set if it is set in A (mask) OR B (store); any other combination leaves it cleared: D = A OR B

If the colour bit is not set, a bit in D must be set if it is NOT set in A AND it is set in B; any other combination leaves it cleared: D = (NOT A) AND B.
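The two rules can be tried out on a single 16-bit word. This is a Python sketch of the logic only, with invented names and made-up test patterns; on the Amiga the blitter applies it to every word of the rectangle.

```python
def cookie_cut(mask, store, colour_bit, width=16):
    """Combine mask (channel A) with saved plane data (channel B)."""
    full = (1 << width) - 1
    if colour_bit:
        return (mask | store) & full      # D = A OR B
    return (~mask & store) & full         # D = (NOT A) AND B

mask = 0x0FF0      # polygon covers the middle eight pixels of the word
store = 0xAAAA     # whatever was already on the bit plane
print(hex(cookie_cut(mask, store, 1)), hex(cookie_cut(mask, store, 0)))
```

In both results the outer pixels keep their original pattern; only the pixels under the mask are forced to 1 or to 0.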

That’s really all there is to it. It’s called a “cookie-cut” function because the mask is cut out and laid on top of the bit plane, I suppose. The logical instruction is given by setting appropriately the LF bits (numbers 0 - 7) of register BLTCON0. This is done by expanding the logical expression in terms of products involving A, B and C, each of which is called a minterm. Each minterm has one of the LF bits dedicated to it, so if that minterm shows up in the expansion, the LF bit is set. The details need not concern us. For the logic expressions above the values to be entered into the LF byte are:

  expression      LF byte
  A OR B          $FC
  (NOT A) AND B   $0C
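For anyone who does want the details, the LF byte can be generated mechanically. The Python sketch below assumes the usual assignment in which the minterm for a given combination of A, B and C owns bit number 4A + 2B + C; with that assumption it reproduces the values in the table and the BLTCON0 words used in core_00.s. The names lf_byte and bltcon0 are invented for the sketch.

```python
def lf_byte(func):
    """Build the LF byte for func(a, b, c), one bit per minterm."""
    value = 0
    for a in (0, 1):
        for b in (0, 1):
            for c in (0, 1):
                if func(a, b, c):
                    value |= 1 << (4 * a + 2 * b + c)
    return value

def bltcon0(channels, lf):
    """channels is the nibble for bits 8-11 (A=8, B=4, C=2, D=1); no shift."""
    return (channels << 8) | lf

set_region = lf_byte(lambda a, b, c: a | b)            # A OR B
clear_region = lf_byte(lambda a, b, c: (1 - a) & b)    # (NOT A) AND B
straight_copy = lf_byte(lambda a, b, c: a)             # D = A

print(hex(set_region), hex(clear_region), hex(straight_copy))
print(hex(bltcon0(0xD, set_region)), hex(bltcon0(0x9, straight_copy)))
```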

Now looking back at core_00.s we can see this happening. First the active rectangle (the one containing the polygon in the mask) in the bit plane is saved into the storeplane. This is done as a straight copy for which the minterms give the LF byte $F0. Bits 8-11 in BLTCON0 specify in order which of the four DMA channels are being used; in this straight copy, only A and D.

The entries to the other blitter registers deserve some comment. BLTxMOD (x stands for the channel letter) requires the difference between the bit plane width (40 bytes) and the width, in bytes, of the current blit rectangle, which of course will vary with the program. BLTSIZE wants the number of lines in its upper ten bits and the rectangle width in words in its lower six bits.
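Packing the size word is a shift and an add, exactly as done at the end of Part 3 of the listing. A small Python sketch, using the first triangle of Polydraw.s as the example (91 lines from y = 70 to y = 160, and 6 words wide); the function name is made up.

```python
def pack_blit_size(height_lines, width_words):
    """Height goes in the upper ten bits, width in words in the lower six."""
    return ((height_lines & 0x3FF) << 6) | (width_words & 0x3F)

size = pack_blit_size(height_lines=160 - 70 + 1, width_words=6)
print(hex(size))
```

Writing this word to the size register is what actually starts the blit, which is why it is always the last register to be set.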

Then the logical combination of the mask and the source bit plane occurs depending on the value of the current colour bit. The procedure is repeated for each of the five colour planes.

As a final point, the blitter is also used to clear screens in systm_00.s.

3.4.4. equates.s

This contains the offsets for hardware registers, library functions and other constants.

3.4.5. bss_00.s

This contains the variables which are calculated during the program.

3.4.6. data_00.s

This mainly contains the colours for the standard palette.

  *****************************************************************************************
  *                                     Polydraw.s                                        *
  *****************************************************************************************
  * SECTION TEXT
  * assembler directive
    opt d+        put in labels for debugging
    bra main      don't try to execute the includes

  * all these files are to be included here
    include equates.s   all the constants
    include bss_00.s    variable locations
    include data_00.s   mainly standard palette colours
    include systm_00.s  a lot of housekeeping routines
  	include core_00.s   the meat
  *****************************************************************************************
  * here's the main control program
  main
    bsr   alloc_mem   allocate memory for screens etc
    bsr   copr_list   set up the copper lists
    bsr   blit_alloc  take over the blitter
    bsr   colr_set    set up the standard palette
    bsr   wrt_phys_tbl  look-up table for fast screen access

  * The program cycles here; screen buffering is used
  blit_loop:
  * first draw a triangle
    bsr   drw1_shw2     draw on screen 1, display screen 2
    lea   my_coords,a0  the vertices defined in the triangle
    move.l  a0,coords_lst here's where to find them
    move.w  #2,colour   coloured red
    move.w  #3,no_in    3 sides to a triangle
    bsr     poly_fill   draw the outline and fill it red

  * then an inverted triangle
    bsr     drw2_shw1   draw on screen 2, display screen 1
    lea     my_inv_coords,a0  the vertices
    move.l  a0,coords_lst   here they are
    move.w  #12,colour    coloured green
    bsr     poly_fill     draw the outline and fill it green
    bra blit_loop         repeat the cycle



  * The coordinates of the two triangles

  my_coords dc.w 100,160,150,70,190,140,100,160
  my_inv_coords dc.w 100,80,160,160,170,90,100,80

    END
  ******************************************************************************************
  *                                     Core_00.s                                          *
  *                                                                                        *
  ******************************************************************************************
  * This fills a polygon.
  * It consists of 4 parts:
  * 1. the x coords of the boundary are stored in xbuf
  * 2. the outline is drawn in the mask plane
  * 3. the outline is filled by the blitter
  * 4. the blitter copies the mask to the bitplanes
  ******************************************************************************************
  * Part 1. Fill the buffer with the outline.
  * a3 pointer to crds_in coords list (x1,y1...xn,yn,x1,y1)
  * a2 pointer to xbuf
  * d0-x1: d1-y1: d2-x2: d3-y2: d4-vertex number/decision vertex
  * d5-lowest y: d6-highest y/the increment: d7-edge counter
  * Polygon vertices are ordered anticlockwise
  ******************************************************************************************
  poly_fill
  	bsr	blit_mask															clear mask plane
  *INITIALISE ALL VARIABLES
  filxbuf
  	move.w	no_in,d7
  	beq	fil_end																	quit if no more edges
  	move.l	coords_lst,a3
  	subq.w	#1,d7 															counter of num edges
  	move.w	#MINIMUM_Y,d5
  	clr.w	d6																				maximum y to zero
  filbuf1
  	lea	xbuf,a2
  	addq.w	#2,a2																point to ascending side
  	move.w	(a3)+,d0													next x1
  	move.w	(a3)+,d1													next y1
  	move.w	(a3)+,d2													next x2
  	move.w	(a3)+,d3													next y2
  	subq.w	#4,a3 															point back to x2
  *FIND THE HIGHEST AND LOWEST Y VALUES: THE FILLED RANGE OF XBUF
  	cmp.w d5,d1 																test(y1-miny)
  	bge filbuf3																	miny unchanged
  	move.w	d1,d5																miny is y1
  filbuf3
  	cmp.w	d1,d6																	test(maxy-y1)
  	bge filbuf5																	unchanged
  	move.w	d1,d6																maxy is y1
  filbuf5
  	exg d5,a5 																		save miny
  	exg d6,a6 																		save maxy
  	clr.w	d4																				init. decision var
  	moveq	#1,d6 															init. increment

  * All lines fall into 2 categories: [slope<1], [slope>1].
  * The difference is whether x and y are increasing or decreasing.
  * See if line is ascending [slope>0] or descending [slope<0].

  	cmp.w	d1,d3 																(y2-y1)=dy
  	beq y_limits																ignore horizontal altogether
  	bgt ascend 																	slope > 0
  * It must be descending. Direct output to LHS of buffer. a2 must
  * be reduced and we have to reverse the order of the vertices.

  	exg d0,d2 																		x1 and x2
  	exg d1,d3 																		y1 and y2
  	subq.w	#2,a2																point to left hand buffer
  ascend
  	sub.w	d1,d3																	now dy is positive

  * Set up y1 as index to buffer
  	lsl.w	#2,d1
  	add.w	d1,a2

  * Check the sign of the slope
  	sub.w	d0,d2 															(x2-x1)=dx
  	beq	vertical 														special case to deal with
  	bgt	pos_slope

  * It must have a negative slope but we deal with this by making the
  * increment negative.
  	neg.w	d6 																		increment is negative
  	neg.w	d2 																		dx is positive
  * Now decide if the slope is High (>1) or Low (<1).
  pos_slope
  	cmp.w	d2,d3																test (dy-dx)
  	bgt hislope																slope is > 1

  * Slope is < 1 so we want to increment x every time and then
  * check whether to increment y. If so this value of x must be saved
  * dx is the counter. Initial error D1=2dy-dx.
  * If last D -ve, then x=x+inc, dont record x, D=D+err1
  * If last D +ve, then x=x+inc, y=y+inc, record this x, D=D+err2
  * err1=2dy; err2=2dy-2dx
  * d0=x: d2=dx: d3=dy: d6=incx.

  	move.w	d2,d5
  	subq.w #1,d5 														dx-1 is the counter
  	add.w d3,d3 															2dy=err1
  	move.w	d3,d4															2dy
  	neg.w	d2 																		-dx
  	add.w d2,d4 															2dy-dx= D1
  	add.w d4,d2 															2dy-2dx=err2
  	move.w d0,(a2)													save first x
  inc_x
  	add.w d6,d0																x=x+incx
  	tst.w d4 																		what is the decision?
  	bmi no_stk 																dont inc y dont record x
  	add.w #4,a2 															inc y, record x. next buffer place
  	move.w d0,(a2) 												save this x
  	add.w d2,d4 															update decision D=D+err2
  	bra.s next_x
  no_stk
  	add.w d3,d4 															D=D+err1
  next_x
  	dbra d5,inc_x 													increment x again
  	bra y_limits

  * The slope is > 1 so change the roles of dx and dy.
  * This time increment y each time and record the value of x after having done so.
  * Init error D1 = 2dx-dy
  * If last D -ve, then y=y+inc, D=D+err1, record x
  * If last D +ve, then x=x+inc, y=y+inc, D=D+err2, record x
  * err1=2dx, err2=2(dx-dy)
  * d2=dx: d3=dy: d6=inc: d0=x
  hislope
  	move.w	d3,d5
  	subq.w #1,d5 														dy-1 is counter
  	add.w d2,d2 															2dx=err1
  	move.w	d2,d4 														2dx
  	neg.w d3 																		-dy
  	add.w d3,d4 															D1=2dx-dy
  	add.w d4,d3 															2dx-2dy=err2
  	move.w	d0,(a2) 												save 1st x
  inc_y
  	addq.w #4,a2 														next place in buffer
  	tst.w d4 																		what is the decision
  	bmi	same_x 																dont inc x
  	add.w d6,d0 															inc x
  	add.w d3,d4 															D=D+err2
  	bra.s next_y
  same_x
  	add.w d2,d4 															D=D+err1
  next_y
  	move.w	d0,(a2) 												save x value
  	dbra d5,inc_y
  	bra y_limits
  * The vertical line x is constant. dy is the counter
  vertical
  	move.w	d0,(a2) 											save next x
  	addq.w #4,a2 													next place in buffer
  	dbra d3,vertical 									for all y
  * Restore the y limits
  y_limits
  	exg d5,a5
  	exg d6,a6
  next_line
  	dbra d7,filbuf1 										do rest of lines (if any left)
  * This part ends with min y in d5 and max y d6
  	move.w	d6,ymax
  	move.w	d5,ymin
  *****************************************************************************************
  * PART 2. Copy the xbuf to the mask plane.
  * Set up the pointer
  	lea xbuf,a0 														base address of buffer
  	move.l maskplane,a1 						base address of mask plane
  	lea msk_y_tbl,a2 									mask plane y look up table
  	sub.w	d5,d6 														num pairs to set -1
  	move.w	d6,d7 													is the counter
  	beq fil_end 														quit if all sides horizontal
  	move.w	d5,d2 													miny is the start
  	lsl.w #2,d5 														4*min y = offset into xbuf
  	add.w d5,a0 														for the address to start
  	subq.w	#1,d2 													reduce initial y
  poly2
   addq #1,d2 															next y
   move.w (a0)+,d0 										next x1
   move.w (a0)+,d1 										next x2
   cmp.w d0,d1 														test(x1-x2)
   beq poly4 																cant draw a line with one point
   move.w d2,d5 													pass y
   bsr set_pix 														set the 2 pixels
  poly4
  	dbra d7,poly2													repeat for all y values
  *****************************************************************************************
  * PART 3. Fill in the outline
  * Confine the blit to the rectangle (xmax-xmin)*(ymax-ymin).
  * First xmax and xmin are recorded to define the rectangle.
  	bsr blt_chk
  frme
  	move.w	no_in,d7
  	subq.w	#1,d7
  	movea.l coords_lst,a3 				here they are
  	move.w	#MINIMUM_X,xmin				initialise xmin
  	clr.w	xmax 															and xmax
  x_test
  	move.w	(a3),d0 											next x
  	cmp.w	xmin,d0 												test(x1-xmin)
  	bgt lnblit4 														xmin unchanged
  	move.w	d0,xmin 											this is x min
  lnblit4
  	cmp.w xmax,d0 												test(x1-xmax)
  	blt lnblit5 														xmax unchanged
  	move.w	d0,xmax 											this x is xmax
  lnblit5
  	addq.l #4,a3 													increment x pointer
  	dbra d7,x_test 											for all x

  * Here's the fill blit. Several things must be found.
  * Calculate the address of the bottom rh corner of the rectangle
  * bltstrt contains its offset in the plane
  	move.w xmax,d0
  	lsr.w	#SIXTEEN,d0 														xmax/16
  	move.w	d0,d2 													save it
  	add.w d2,d2 														*2 = byte position in row
  	move.w	ymax,d1
  	mulu #WIDTH,d1															row address
  	add.w d2,d1
  	ext.l d1
  	move.l d1,bltstrt 								save offset in the plane

  * address to start blit
  	movea.l maskplane,a0 					plane base address
  	add.l d1,a0 														plus offset is where blit starts
  	move.l #$dff000,a5
  	move.l a0,bltapt(a5) 			 SOURCE
  	move.l a0,bltdpt(a5) 			 DESTINATION

  * bltmod says how much of plane to blit
  	move.w xmin,d1
  	lsr.w #SIXTEEN,d1 													 xmin/16
  	sub.w d1,d0 														xmax/16 - xmin/16
  	addq.w #1,d0 												 word width of window
  	move.w d0,bltwidth 							save it
  	move.w #WIDTH,d2
  	add.w d0,d0 														width in bytes
  	sub.w d0,d2 														blitmod
  	move.w d2,blitmod
  	move.w d2,bltamod(a5)					SOURCE MODULO
  	move.w d2,bltdmod(a5) 				DESTINATION MODULO

  * set the control registers for a simple descending fill.
  	move.w #$09f0,bltcon0(a5) USE A&D D=A (no shift)
  	move.w #$000a,bltcon1(a5) INCLUSIVE FILL, DESCENDING
  	move.w #$ffff,bltafwm(a5)
  	move.w #$ffff,bltalwm(a5)

  * set the size and do the blit
  	move.w ymax,d0
  	sub.w ymin,d0
  	addq.w #1,d0
  	lsl.w #6,d0 													set height
  	add.w bltwidth,d0 							and width
  	move.w d0,blitsize 						sizeof blit
  	move.w d0,bltsize(a5) 			do the fill

  * PART 4.
  * Copy the mask to the screen bitplanes which must be set or cleared
  * depending on its colour bit.
  * Only the smallest rectangle is blitted.
  * The mask is used in the cookie cut function:
  * If the colour bit is set, the masked region is set
  * If the colour bit is clear, the masked region is cleared.
  pln_cpy
  	bsr blt_chk
  	move.w #DEPTH-1,d7 									number of planes to blit
  	move.w colour,d6
  	move.w #0002,bltcon1(a5)    COPY DESCENDING
  	move.w blitmod,d0
  	move.w d0,bltamod(a5)
  	move.w d0,bltdmod(a5)
  	move.w d0,bltbmod(a5)
  	IFD DOUBLE_BUFFERING
  	move.l workplanes,a2        get address of planepointers list
  	ELSEIF
  	move.l showplanes,a2        get address of planepointers list
  	ENDC
  ;	sub.l  #WIDTH*HEIGHT,a0      (ready to increment in next part)
  ;	add.l  bltstrt,a0            offset to draw at

  nxtplane 	                    ;LOOP POINT
  	bsr    blt_chk
  ;	add.l  #WIDTH*HEIGHT,a0 get next bitplane base address
   move.l (a2)+,a0              get next address into a0
   add.l bltstrt,a0             and add offset to start drawing at...

  * store the destination plane first, (copy to storeplane)
  	move.l a0,bltapt(a5)        SOURCE
  	move.l storeplane,a1        destination
  	add.l  bltstrt,a1           start position of rectangle
  	move.l a1,bltdpt(a5)        in plane 6
  	move.w #$09f0,bltcon0(a5)   straight copy
  	move.w blitsize,bltsize(a5) store destination plane
  	bsr    blt_chk

  * now mask region and set/clear as colour bit dictates
  	movea.l maskplane,a1       the mask
  	add.l   bltstrt,a1         start here
  	move.l  a1,bltapt(a5)      A IS MASK
  	move.l  storeplane,a1
  	add.l   bltstrt,a1         offset
  	move.l  a1,bltbpt(a5)      B IS STOREPLANE
  	move.l  a0,bltdpt(a5)      DESTINATION

  * do we set or clear the masked region?
  	lsr.w #1,d6               get colour bit into carry flag
  	bcc bltclr	               bit is zero so clear masked region

  * we have to set the masked region
  	move.w #$0dfc,bltcon0(a5) NO SHIFT: USE A,B,D: D=A OR B
  	bra bltcopy

  bltclr
  * clear region
  	move.w #$0d0c,bltcon0(a5) NO SHIFT: USE A,B,D: D=NOT A AND B

  bltcopy
  	move.w blitsize,bltsize(a5) perform the required blit function
  	dbf d7,nxtplane             do all the planes
  * done
  fil_end
  	rts
  *****************************************************************************************
  * Get pixel address and mask to set pixels in the mask plane which
  * mark start and end of a scan line .
  * d0=x1: d1=x2: d5=y: a0=xbuf: a1=maskplane base: a2=msk y line tbl
  set_pix
  	lsl.w #2,d5                  4*y is offset in table
  	movea.l 0(a2,d5.w),a3        row address in mask plane
  	move.l a3,a4                 save it

  * set pixel x1
  	move.w d0,d3                 save x1
  	lsr.w #EIGHT,d0              byte num in row (/8)
  	adda.w d0,a3                 the byte containing the pixel
  	andi.w #$0007,d3             pixel num in byte
  	subi #7,d3
  	neg.w d3                     bit to set
  	clr.w d0
  	bset d3,d0                   this is a mask
  	or.b d0,(a3)                 set the pixel

  * set pixel x2
  	move.l a4,a3                 restore row address
  	move.w d1,d3                 save x2
  	lsr.w #EIGHT,d1              byte num in row (/8)
  	adda.w d1,a3                 the byte containing the pixel
  	andi.w #$0007,d3             pixel num in byte
  	subi #7,d3
  	neg.w d3                     bit to set
  	clr.w d0
  	bset d3,d0                   this is a mask
  	or.b d0,(a3)                 set the pixel
   rts
  *****************************************************************************************
  * Get the screen address of a word
  * at a0=base: d0=x: d1=y:
  scrn_wrd
  	move.w #WIDTH,d2                plane width
  	mulu d1,d2                   y*width
  	add.l a0,d2                  + base
  	lsr.w #SIXTEEN,d0            x/16
  	add.w d0,d0                  word pos in row
  	ext.l d0
  	add.l d0,d2                  address
  	rts
  *****************************************************************************************
  * See if last blit is finished
  blt_chk
  	move.l #$dff000,a5
  	move.l d7,-(sp)
  blt_chk1
  	move.w dmaconr(a5),d7
  	btst.l #14,d7                blitter busy bit still set?
  	bne blt_chk1
  	move.l (sp)+,d7
  	rts
  ************************************************************************************
  *	                                    BSS_00.s                                     *
  *                Put all the variables you want to use in here                     *
  ************************************************************************************
  * POLYGON VARIABLES
  colour         ds.w 1 current colour
  no_in          ds.w 1 number of polygon vertices
  xmin           ds.w 1 limits
  xmax           ds.w 1 for
  ymin           ds.w 1 xbuf
  ymax           ds.w 1
  coords_lst     ds.l 1

  * BITPLANE VARIABLES
  maskplane      ds.l 1 base addresses
  storeplane     ds.l 1
  log_screen     ds.l 1
  msk_y_tbl      ds.l 200 scan line addresses
  xbuf           ds.l 200 x buffer
  bltstrt        ds.l 1
  bltwidth       ds.w 1
  blitsize       ds.w 1
  blitmod        ds.w 1
  showplanes_list ds.l DEPTH a list of pointers to each of the bitplanes per playfield
  workplanes_list ds.l DEPTH

   IFD A500
  scrn1_base     ds.l 1
  scrn2_base     ds.l 1
  cl1adr         ds.l 1
  cl2adr         ds.l 1
  cladr          ds.l 1
  oldcop         ds.l 1
   ENDC

  gfxversion					ds.w	1 lib version
  vbi_flag       ds.w 1
  show_bitmap				ds.l	1 BitMap structure pointers
  work_bitmap				ds.l	1
  showlist							ds.l	1 Copperlist pointers
  worklist							ds.l	1
  showplanes					ds.l	1 VIDEO Ram pointers
  workplanes					ds.l	1
  draw_buffer				ds.l	1	Bitplane sized memory to construct objects in
  frame_done     ds.l 1 flag to indicate frame finished

  File_Handle				ds.l	1
  File_Buffer				ds.l	1

  DosBase								ds.l	1
  GrafBase							ds.l	1
  IntuiBase						ds.l	1
  LowLevelBase			ds.l	1
  OldActiView				ds.l	1
  intHandle						ds.l	1	V40 interrupt handle
  colormap							ds.l	1
  ReturnMsg						ds.l	1	For system to use when we quit

  * SYSTEM STRUCTURES I WANT TO USE.
  	EVEN
  vblank									ds.b	IS_SIZE	     ;store interrupt structure here.
  	EVEN
  my_view								ds.b	v_SIZEOF
  	EVEN
  my_viewport				ds.b	vp_SIZEOF
  	EVEN
  my_rasinfo					ds.b	ri_SIZEOF

4. Windowing

If a picture is larger than the limits of the screen then there is a problem with what happens to the excess. Unless some provision is made for this possibility, the program will attempt to write to addresses outside of the section of RAM reserved for the screen, which in our case is the five bit planes called the logical screen. Unless we are sure that everything will always lie within the screen size, those sections of the picture which lie outside must be clipped off. Confining a picture in this way is called windowing because of the obvious analogy to someone looking out of a window. The screen is a window onto the internal world of the computer. This window could be the maximum allowed on a given resolution or something smaller (one obvious way to make graphics fast is to keep the picture small so that not much has to be drawn). The freedom to vary the size of the visible image can even give rise to special effects, an aperture opening for example. Because of the ‘clipping off’ of the unwanted parts of the picture that takes place, we shall call the outline of this window the clip frame.

The algorithm we need is one which will handle filled polygons. It is not sufficient to just chop off vertices where they exceed the clip frame. The line left by the chop must become an additional edge to close the polygon. Once again an elegant solution to this problem was found many years ago by Sutherland and Hodgman.

4.1. Sutherland-Hodgman Clipping Algorithm.

The Sutherland-Hodgman algorithm is actually more powerful than we require; it can handle polygons of any shape. In this book, for speed, only convex (round-shaped, all external angles greater than zero) polygons are filled. The requirement to be convex is a consequence of a later constraint; the need to keep the hidden-surface-removal algorithm simple. This is something we will meet at a later stage.

figure 04 01
Figure 4.1 Windowing a polygon

Strictly speaking, Sutherland-Hodgman does not require polygons to be convex nor does it require the clipping frame to be a rectangle. But, for simplicity, the version given here does use a rectangular clipping frame parallel with the monitor screen. The boundaries of the clipping frame are defined by xmin, xmax, ymin and ymax and are shown for a general polygon in Figure 4.1. The Sutherland-Hodgman strategy is to find the intersections in turn of all of the edges of the polygon with each boundary. Since our boundary has four sides this means that four cycles of the polygon will be made. On each cycle some of the original edges may be lost and new ones added.

As each new vertex is examined, various actions are taken which depend on the position of it and the previous vertex. These cases are illustrated in Figure 4.1 and examined below:

  1. If the next vertex is outside the frame, (A), check the position of the previous vertex, (C). If that was in, find the point of intersection, (S), of the edge joining them with the clip frame and save it. Don’t save the next vertex (A).

  2. If the next vertex is inside the frame, (B), check the position of the previous vertex, (A). If that was out, find the point of intersection, (R), of the edge joining them with the clip frame and save it. Also save the next vertex, (B).

This is the algorithm applied to all the vertices going round the polygon.
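
As a rough illustration of the two rules above, here is a sketch in Python rather than in the 68000 assembly used for the real program. It makes one pass against the single limit xmin. The function name clip_xmin and the list-of-pairs format are assumptions made for this sketch only, and the point of intersection is found with an ordinary division purely for clarity.

```python
def clip_xmin(vertices, xmin):
    """Clip a closed polygon against the limit x >= xmin.

    `vertices` is a list of (x, y) pairs with the first vertex repeated
    at the end. The result has the same format (empty if nothing is left).
    """
    out = []
    prev = vertices[0]
    if prev[0] >= xmin:              # first vertex is a special case
        out.append(prev)
    for nxt in vertices[1:]:
        prev_in = prev[0] >= xmin
        next_in = nxt[0] >= xmin
        if prev_in != next_in:       # the edge crosses the limit
            (x1, y1), (x2, y2) = prev, nxt
            # Exact intersection, for illustration only (uses a division).
            y = y1 + (y2 - y1) * (xmin - x1) / (x2 - x1)
            out.append((xmin, y))
        if next_in:                  # an inside vertex is always saved
            out.append(nxt)
        prev = nxt
    if out and out[0] != out[-1]:    # the chop has opened the polygon,
        out.append(out[0])           # so close it again
    return out


square = [(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)]
print(clip_xmin(square, 5))
```

Clipping the square at xmin = 5 keeps its right-hand half, and the line left by the chop becomes a new edge closing the polygon.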

Once again it might appear that calculating points of intersection of sloping lines with the clip frame requires a lot of mathematical computation involving divisions and multiplications. Surprisingly this is not so. As usual in assembly language programming, where variables are not abstract algebraic symbols but the contents of memory locations or registers, it is possible to find answers using only addition and subtraction and, where it occurs, division and multiplication by powers of two, which can be done quickly by right and left shifts.

To illustrate this consider the case where the previous point was outside but the next point is inside the frame limit xmin. This is shown in more detail in Figure 4.2 where the two possible cases, depending on which point is closest to the limit, are examined. As part of the process to determine that B(x2,y2) lies inside and A(x1,y1) lies outside the limit, it is necessary to compare both x1 and x2 with xmin. But instead of just using the COMPARE instruction, the actual differences (xmin-x1) and (xmin-x2) are calculated and the sign of the result used as the basis for decision. Note that (xmin-x1) is positive and (xmin-x2) is negative. Having then decided that there is a point of intersection to determine and save, these differences are used as the starting point for calculating the point of intersection in the following way.

One of the coordinates of the point of intersection is already known: it is xmin, the limit itself. It remains to find the y value at the intercept. This is done iteratively in the following way. The average of A and B is calculated by adding coordinates and dividing by 2. The result, T1, is the midpoint of the edge, so it halves the interval known to contain the intercept, and we can see which side of the boundary it lies on from the sign of the average of (xmin-x1) and (xmin-x2). More important, the average of y1 and y2 will be the intercept value itself if the average of (xmin-x1) and (xmin-x2) is zero, because when this happens the two points are either evenly spaced on either side of the boundary, or coincident with it. This is the basis of the iterative algorithm used in the example program.

What happens the first time is that the average of y1 and y2 and the average of (xmin-x1) and (xmin-x2) are calculated by means of an addition and a shift right (a quick divide by two). This yields the y coordinate of the point T1. If the average of (xmin-x1) and (xmin-x2) is zero then the intercept has been found. If the x-average is negative, as at point T1 in case 1, then T1 lies inside the boundary and the next average must be taken between (xmin-x1) and (xmin-xT1). Likewise, the next y-average must be taken between y1 and yT1. If, on the other hand, the initial average of (xmin-x1) and (xmin-x2) is positive, as in case 2, the next average must be taken between (xmin-xT1) and (xmin-x2) and the next y-average between yT1 and y2. This iterative process continues until the x-average is zero, at which point the current y-average is the y coordinate of the point of intersection, which is then saved.
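
The iteration can be tried out in a few lines of Python before tackling the assembly version. This is only a sketch under stated assumptions: the function name y_intercept is borrowed from the listing, but the argument order and the guard against bad input are inventions for this example. It follows the sign convention of the text, where (xmin-x1) is positive for the outside point A and (xmin-x2) is negative for the inside point B. Python's >> is an arithmetic shift, so it behaves like the shift right described above.

```python
def y_intercept(dx_out, y_out, dx_in, y_in):
    """Find y where an edge crosses a vertical limit by repeated halving.

    dx_out = (xmin - x1) > 0 belongs to the outside end A(x1, y1);
    dx_in  = (xmin - x2) < 0 belongs to the inside end B(x2, y2).
    Only additions and right shifts are used.
    """
    if dx_out <= 0 or dx_in >= 0:
        raise ValueError("the two ends must lie strictly on opposite sides")
    while True:
        dx_mid = (dx_out + dx_in) >> 1   # average of the two differences
        y_mid = (y_out + y_in) >> 1      # average of the two y values
        if dx_mid == 0:                  # the midpoint is on the limit
            return y_mid
        if dx_mid < 0:                   # midpoint is inside: it replaces B
            dx_in, y_in = dx_mid, y_mid
        else:                            # midpoint is outside: it replaces A
            dx_out, y_out = dx_mid, y_mid


# Edge from A(20,100) to B(140,180) crossing xmin = 50.
print(y_intercept(50 - 20, 100, 50 - 140, 180))   # prints 120
```

Because every halving rounds down, the answer can be a pixel away from the exact intercept. For the edge from (20,100) to (200,20) with xmin = 50 the sketch gives 86, where exact arithmetic gives about 86.7.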

figure 04 02
Figure 4.2 Intersection of the boundary by iteration

4.2. Example Program

The example program clips a polygon using a version of the Sutherland-Hodgman algorithm and then fills it. The polygon is shown in Figure 4.3.

4.2.1. clipfrme.s

This is the control program plus the data for the polygon vertices. The coordinates in my_data are, as usual, in the order x0,y0,x1,y1,...,x0,y0, with the first coordinate repeated at the end. The clip frame limits are also given in the data and you can change them to suit yourself.

figure 04 03
Figure 4.3 Windowed polygon

4.2.2. core_01.s

Here is where the actual clipping routine resides (together with all the other routines used so far by means of the include core_00.s directive at the end). Most of the work is done by the subroutine clip. It looks rather long but that is to try to make it more readable. Because many of its parts are very similar, it would be possible to make it shorter with inner subroutine calls, but then it would be harder to follow. It is a complicated routine but that is a consequence of the rather difficult task it does, which has been described above.

It is laid out in the order that it clips against boundaries: xmin first followed by the others. In all, four complete traversals of the data are made, with new vertices being added each time. The data for the vertices is input on the first traversal from crds_in and output to crds_out. The next traversal reverses the order. Because there are four traversals, the data ends up back where it started in crds_in, ready for the next part of the program, to follow in later chapters.
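
The way the four traversals pass the data back and forth can be sketched in Python as follows. This is an illustration under assumptions, not a transcription of clip: the helper names clip_pass and clip are made up for the sketch, the intersections use an integer division instead of the shift-and-add iteration, and the two lists are swapped by name where the assembly routine swaps its pointers. The polygon and the frame limits are the ones given in clipfrme.s.

```python
def clip_pass(src, axis, limit, keep_greater):
    """One traversal: clip the closed polygon `src` against one limit.

    axis 0 tests x, axis 1 tests y; keep_greater selects a minimum
    limit (True) or a maximum limit (False).
    """
    def inside(p):
        return p[axis] >= limit if keep_greater else p[axis] <= limit

    other = 1 - axis
    dst = []
    prev = src[0]
    if inside(prev):
        dst.append(prev)
    for nxt in src[1:]:
        if inside(prev) != inside(nxt):          # edge crosses the limit
            num = limit - prev[axis]
            den = nxt[axis] - prev[axis]
            cut = [0, 0]
            cut[axis] = limit
            cut[other] = prev[other] + (nxt[other] - prev[other]) * num // den
            dst.append(tuple(cut))
        if inside(nxt):
            dst.append(nxt)
        prev = nxt
    if dst and dst[0] != dst[-1]:                # close the polygon
        dst.append(dst[0])
    return dst


def clip(polygon, xmin, xmax, ymin, ymax):
    """Four traversals, the output of each becoming the next input."""
    crds_in, crds_out = list(polygon), []
    passes = [(0, xmin, True), (0, xmax, False),
              (1, ymin, True), (1, ymax, False)]
    for axis, limit, keep_greater in passes:
        if not crds_in:                          # nothing left to clip
            break
        crds_out = clip_pass(crds_in, axis, limit, keep_greater)
        crds_in, crds_out = crds_out, crds_in    # reverse the roles
    return crds_in


pentagon = [(20, 100), (200, 20), (300, 80), (260, 180), (140, 180), (20, 100)]
clipped = clip(pentagon, 50, 270, 50, 150)
print(len(clipped) - 1, "sides:", clipped)
```

With these limits the five-sided polygon comes back with seven sides, every vertex lying on or inside the clip frame.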

4.2.3. bss_01.s

As the number of variables gets larger, so new bss files appear. The earlier ones have to be included.

  * clipfrme.s
  *
  * Program for chapter 4
  * A program to clip and fill a polygon to a window (clip frame)
  * defined by the limits clp_xmin, clp_xmax, clp_ymin, clp_ymax
  *

    *SECTION TEXT
    opt     d+          include labels for debugging
    bra     main        don't execute the includes
    include equates.    constants
    include systm_00.s  housekeeping
    include core_01.s   important subroutines

  main  bsr set_up    screens, copper, blitter, etc

  blit_loop:
        bsr drw_shw2
        move.w  #12-1,d7  six pairs of coords for vertices
        lea     crds_in,a0    destination
        move.l  a0,a3         ready for drawing
        lea     my_data,a1    from here

  clp_loop
        move.w  (a1)+,(a0)+   transfer
        dbf     d7,clp_loop   them all

        move.w  #5,no_in    5 sides to the polygon
        move.w  my_colour,colour      set the colour
        move.w  my_xmin,clp_xmin      set the
        move.w  my_xmax,clp_xmax      clip
        move.w  my_ymin,clp_ymin      frame
        move.w  my_ymax,clp_ymax      limits

        bsr     drw2_shw1             draw on screen 2, display 1
        bsr     clip                  window it
        bsr     poly_fill             fill it
        bsr     drw1_shw2             show the drawing

  loop_again:
        bra     loop_again            forever

  *SECTION DATA
  * A pentagon
  my_data       dc.w    20,100,200,20,300,80,260,180,140,180,20,100
  * which is pink
  my_colour     dc.w    24

  * The window limits
  my_xmin       dc.w    50
  my_xmax       dc.w    270
  my_ymin       dc.w    50
  my_ymax       dc.w    150

  *SECTION BSS
    include bss_01.s

  * SECTION DATA
    include data_00.s

  END
  *****************************************************************************************
  *                                  Core_01.s                                            *
  *                                                                                       *
  * A version of the Sutherland-Hodgman clipping algorithm.
  * It goes around the polygon clipping it against one boundary at a time. It goes
  * around four times in all.
  * a0=crds_in: a1=crds_out: a2=no_out: a3=saved(crds_out):
  * d0=current limit: d1=x1: d2=y1: d3=x2: d4=y2: d5=(saved)x2: d6=(saved)y2:
  *****************************************************************************************
  * core_00.s is included once, at the end of this file

  * First clip against xmin.
  clip
  	bsr   clip_ld1           set up pointers
  	tst.w d7                 any sides to clip?
  	beq   clip_end           not this time...

  * do first point as a special case.
   move.w (a0)+,d5           1st x
   move.w (a0)+,d6           1st y
   move.w clp_xmin,d0        limit
   cmp.w  d0,d5              test(x1-xmin)
   bge    xmin_save          inside limit
   bra    xmin_update        outside limit

  * do successive vertices in turn
  xmin_next
  	move.w (a0)+,d3          x2
  	move.w (a0)+,d4          y2
  	move.w d3,d5             save x2
  	move.w d4,d6             save y2

  * now test for position
   sub.w   d0,d3             x2-xmin
   bge     xmin_x2in         x2 is in

  * x2 is outside, where is x1?
   sub.w   d0,d1             x1-xmin
   blt     xmin_update       both x2 and x1 are outside

  * x2 is out but x1 is in so find intersection, needs d1=dx1(+ve):d3=dx2(-ve)
  * d2=y1: d4=y2:
  * find the y intercept and save it.
   bsr     y_intercept

  * but because it's out, don't save x2.
   bra     xmin_update
  xmin_x2in

  * x2 is in but where is x1? GOD KNOWS!!
   sub.w   d0,d1             x1-xmin
   bge     xmin_save         both x1 and x2 are in

  * x2 is in but x1 is out so find intercept, but need -ve one in d3, so swap
   exg     d1,d3
   exg     d2,d4
   bsr     y_intercept
  xmin_save
   move.w  d5,(a1)+          save x
   move.w  d6,(a1)+          save y
   addq.w  #1,(a2)           inc count
  xmin_update
   move.w  d5,d1             x1=x2
   move.w  d6,d2             y1=y2
   dbra    d7,xmin_next

  * The last point must be the same as the first
   movea.l a3,a4             pointer to first x
   subq #4,a1                point to last x
   cmpm.l (a4)+,(a1)+        check first and last x and y
   beq xmin_dec              already the same
   move.l (a3),(a1)          move first to last
   bra clip_xmax
  xmin_dec
   tst.w (a2)                if count
   beq clip_xmax             is not already zero
   subq.w #1,(a2)            reduce it

  * Now clip against xmax. Essentially the same as above except that the order
  * of subtraction is reversed so that the same subroutine can be used to find
  * the intercept.
  clip_xmax
   bsr clip_ld2              set up pointers
   tst.w d7                  any to do?
   beq clip_ymin             no...

  * do first point as a special case.
   move.w (a0)+,d5          1st x
   move.w (a0)+,d6          1st y
   move.w clp_xmax,d0
   cmp.w d5,d0              test (xmax-x1)
   bge xmax_save            inside limit
   bra xmax_update          outside limit

  * do successive vertices in turn
  xmax_next
   move.w (a0)+,d3           x2
   move.w (a0)+,d4          y2
   move.w d3,d5             save x2
   move d4,d6               save y2

  * now test for position
   sub.w d0,d3
   neg.w d3                 xmax-x2
   bge xmax_x2in            x2 is in

  * x2 is outside. where is x1?
   sub.w d0,d1
   neg.w d1                 xmax-x1
   blt xmax_update          both x2 and x1 are out

  * x2 is out but x1 is in so find intersection
  * needs dx1(+ve) in d1, and dx2(-ve) in d3, y1 in d2 and y2 in d4
  * find the intercept and save it.

   bsr y_intercept
  * but because its out dont save x2
   bra xmax_update

  * x2 is in but where is x1
  xmax_x2in
   sub.w d0,d1
   neg.w d1                 xmax-x1
   bge xmax_save            both x1 and x2 are in

  * x2 is in but x1 is out so find intercept
  * but must have the -ve one in d3,so switch
   exg d1,d3
   exg d2,d4
   bsr y_intercept

  xmax_save
   move.w d5,(a1)+         save x
   move.w d6,(a1)+         save y
   addq.w #1,(a2)          inc count
  xmax_update
   move d5,d1              x1=x2
   move d6,d2              y1=y2
   dbra d7,xmax_next

  * the last point must be the same as the first

   movea.l a3,a4           pointer to first x
   subq #4,a1              point to last x
   cmpm.l (a4)+,(a1)+      check 1st and last x and y
   beq xmax_dec            already the same
   move.l (a3),(a1)        move first to last
   bra clip_ymin
  xmax_dec
   tst.w (a2)              if count
   beq clip_ymin           is not already zero
   subq.w #1,(a2)          reduce it

  clip_ymin
   bsr clip_ld1            set up pointers
   tst.w d7                any to do?
   beq clip_ymax           no...
  * do first point as a special case
   move.w (a0)+,d5         1st x
   move.w (a0)+,d6         1st y
   move.w clp_ymin,d0      this limit
   cmp.w d0,d6             test (y1-ymin)
   bge ymin_save           inside limit
   bra ymin_update         outside limit
  * do successive vertices in turn
  ymin_next
   move.w (a0)+,d3         x2
   move.w (a0)+,d4          y2
   move d3,d5              save x2
   move d4,d6              save y2
  * now test for position
   sub.w d0,d4             y2-ymin
   bge ymin_y2in           y2 is in
  * y2 is outside where is y1?
   sub.w d0,d2             y1-ymin
   blt ymin_update         both y2 and y1 are out

  * y2 is out but y1 is in so find intersection
  * needs x1 in d1, x2 in d3, dy1 in d2 and dy2 in d4
  * find the intercept and save it
   bsr x_intercept

  * but because its out, dont save y2
   bra ymin_update
  ymin_y2in

  * y2 is in but where is y1
   sub.w d0,d2           y1-ymin
   bge ymin_save         both y1 and y2 are in

  * y2 is in but y1 is out so find intercept
  * but must have the -ve one in d4 so switch
   exg d1,d3
   exg d2,d4
   bsr x_intercept
  ymin_save
   move.w d5,(a1)+        save x
   move.w d6,(a1)+        save y
   addq.w #1,(a2)         increment no
  ymin_update
   move d5,d1             x1=x2
   move d6,d2             y1=y2
   dbra d7,ymin_next

  * the last point must be the same as the first
   movea.l a3,a4         pointer to first x
   subq.w #4,a1          point to last x
   cmpm.l (a4)+,(a1)+    check first and last x and y
   beq ymin_dec          already the same
   move.l (a3),(a1)      move first to last
   bra clip_ymax
  ymin_dec
   tst.w (a2)            if count
   beq clip_ymax         is not already zero
   subq.w #1,(a2)        reduce it
  clip_ymax
   bsr clip_ld2
   tst.w d7              any to do?
   beq clip_end          no...
  * do first point as a special case
   move.w (a0)+,d5       1st x
   move.w (a0)+,d6        1st y
   move.w clp_ymax,d0
   cmp.w d6,d0           test(ymax-y1)
   bge ymax_save
   bra ymax_update
  * do vertices in turn
  ymax_next
   move.w (a0)+,d3      x2
   move.w (a0)+,d4      y2
   move d3,d5           save x2
   move d4,d6           save y2
  * test for position
   sub.w d0,d4
   neg.w d4             ymax-y2
   bge ymax_y2in
  * y2 is outside where is y1?
   sub.w d0,d2
   neg.w d2             ymax-y1
   blt ymax_update      both y2 and y1 are out
  * y2 is out but y1 is in so find intersection
   bsr x_intercept
   bra ymax_update
  ymax_y2in
  *y2 is in but where is y1?
   sub.w d0,d2
   neg.w d2             ymax-y1
   bge ymax_save        both y1 and y2 are in
  * y2 is in but y1 is out so find intercept
   exg d1,d3
   exg d2,d4
   bsr x_intercept
  ymax_save
   move.w d5,(a1)+       save x
   move.w d6,(a1)+       save y
   addq.w #1,(a2)        increment num
  ymax_update
   move.w d5,d1          x1=x2
   move.w d6,d2          y1=y2
   dbra d7,ymax_next

  * the last point must be the same as the first
   movea.l a3,a4        pointer to first x
   subq.w #4,a1         point to last x
   cmpm.l (a4)+,(a1)+   check first and last x and y
   beq ymax_dec         already the same
   move.l (a3),(a1)     move first to last
   bra clip_end
  ymax_dec
   tst.w (a2)          if count
   beq clip_end        is not already zero
   subq.w #1,(a2)      reduce it
  clip_end
   lea crds_in,a0
   move.l a0,coords_lst
   rts

  clip_ld1
   lea crds_in,a0      pointer to vertex coords before
   lea crds_out,a1     and after this clip
   move.l a1,a3        saved
   move.w no_in,d7     this many sides before
   lea no_out,a2       where the number after is stored
   clr.w no_out
   rts

  clip_ld2
   lea crds_out,a0     pointer to vertex coords before
   lea crds_in,a1      and after this clip
   move.l a1,a3        saved
   move.w no_out,d7    this many sides before
   lea no_in,a2        where the number after is stored
   clr.w no_in
   rts

  y_intercept
   tst.w d1
   beq yint_out
   tst.w d3
   beq yint_out
   movem d5/d6,-(sp)
  yint_in
   move.w d2,d6
   add.w d4,d6
   asr.w #1,d6
   move.w d1,d5
   add.w d3,d5
   asr.w #1,d5
   beq yint_end
   bgt yint_loop
   move d5,d3
   move d6,d4
   bra yint_in
  yint_loop
   move d5,d1
   move d6,d2
   bra yint_in
  yint_end
   move.w d0,(a1)+
   move.w d6,(a1)+
   addq.w #1,(a2)
   movem (sp)+,d5/d6
  yint_out
   rts

  x_intercept
   tst.w d2
   beq xint_out
   tst.w d4
   beq xint_out
   movem d5/d6,-(sp)
  xint_in
   move d1,d5           x1
   add.w d3,d5          x1+x2
   asr.w #1,d5          (x1+x2)/2=<x> a possible intercept
   move d2,d6           dy1
   add.w d4,d6          dy1+dy2
   asr.w #1,d6          (dy1+dy2)/2 =<dy>
   beq xint_end         if <dy>=0, boundary reached
   bgt xint_loop        if not loop again
   move d6,d4           unless <dy> is -ve and becomes dy2
   move d5,d3           and <x> becomes x2
   bra xint_in          and try again
  xint_loop
   move d5,d1           <x> is new x1
   move d6,d2           and <dy> is new dy1
   bra xint_in

  xint_end
   move.w d5,(a1)+      store intercept <x>
   move.w d0,(a1)+      and the y as new vertex coords
   addq.w #1,(a2)       and increment the vertex count
   movem (sp)+,d5/d6
  xint_out
   rts                  next vertex
  * Leaves with a list of vertex coords at crds_in
  * the number of polygon sides at no_in

  set_up:
  * set up memory, screens, blitter etc
    bsr     alloc_mem
    bsr     copr_lst
    bsr     blit_alloc
    bsr     colr_set
    bsr     wrt_phys_tbl
    rts

    include core_00.s     add on the previous core





  *****************************************************************************************
  *****************************************************************************************
  *                                   BSS_01                                              *
  *****************************************************************************************
   include bss_00.s

  * Polygon attributes
  crds_in      ds.w 100 input coords
  crds_out     ds.w 100 output as above
  no_out       ds.w 1   output number
  colr_lst     ds.w 20  list of polygon colours
  clp_xmax     ds.w 1   clip frame limits
  clp_xmin     ds.w 1
  clp_ymin     ds.w 1
  clp_ymax     ds.w 1

5. Getting Things Into Perspective

It is a curious thing that distant objects look smaller than ones which are close. They aren’t smaller, but they do subtend a smaller angle at the eye. For any scene to look real therefore, the size of primitives must diminish as they recede into the distance. All of this is done by the eye and the brain. Simulating the same effect on the computer screen is what the perspective transform is all about.

You don’t really need to understand much maths to use the transforms in this book. The maths and the transforms have all been worked out; you only have to understand how to feed data to them. The perspective transform is just such an example. However, to understand and use transforms fully requires some understanding of maths and matrices. We will introduce these as the need arises. The Appendices also contain information on these topics.

5.1. The Perspective Transform

The perspective transform is a set of mathematical operations which project an image of an object from the world reference frame onto the screen. This has a similarity to the way in which a shadow is formed, except that a shadow falls behind the object and is larger, whereas the perspective projection falls on the screen, between the view point and the object, and is smaller. This is shown in Figure 5.1.

One aspect that crops up repeatedly in transforms and matrices is the use of homogeneous coordinates. Yet it is possible to avoid using them altogether and in many cases it is an inconvenience to use them at all. What do they mean? Do they matter? In this chapter we find out about homogeneous coordinates and how to use them in the perspective transform which is done using matrix multiplication just to illustrate the method. At the same time it will be clear how to do the transform without using matrix multiplication at all. It just turns out that the perspective transform is a good opportunity to try it out.

figure 05 01
Figure 5.1 Perspective projection of a cube

Figure 5.1 shows an object, in this case a cube, defined inside the computer in the world frame and projected onto the screen. The screen lies in the xv-yv plane of the view frame and the projected image is defined by the points where the ‘rays’ from the view point (also called the centre of projection, at -d along the zv axis) pierce the view plane. The window is the area of the view plane which is visible on the screen. That’s really all there is to it. The view point plays a very important role in this scheme and could be placed anywhere. Placing it along the -z axis makes the algebra simple and centres the projection about the view frame origin. This is a very simple type of projection; draughtsmen use many other kinds. But it works fine and the algebra associated with it is minimal.

To make life simple, take the case where the window entirely fills the monitor screen. Then the distinction between the two disappears. Let’s look at how a very simple object projects onto the screen. This is shown in Figure 5.2. As part of the transform it is also necessary to adjust to the screen coordinate system, where the origin is at the top left-hand corner. There are three coordinate systems shown in the diagram: the view frame (xv,yv,zv), the screen frame (xs,ys), and the projected coordinates (Xv,Yv). This projected coordinate system is an intermediate one, introduced for convenience and centred at the view frame origin.

figure 05 02
Figure 5.2 Perspective projection of a line

From the similar triangles ABC and ADE and the similar triangles ABF and ADG we get the results:

Xv/xv = d/(zv+d) and Yv/yv = d/(zv+d)

or

Xv = xv.d/(zv+d) and Yv = yv.d/(zv+d).

It only remains to choose where to centre the projection on the visible screen. If it is to be centred half-way across at the bottom then, in screen coordinates,

xs = Yv+Wx/2 and ys = Wy-Xv

where Wx and Wy are the width and height of the screen in the current resolution.

In low resolution Wx=320 and Wy=200. In what follows we shall only consider low resolution, though a conversion from one resolution to another is straightforward.

In low resolution the perspective transform becomes, for display in screen coordinates:

xs = 160+yv.d/(zv+d) and ys = 200-xv.d/(zv+d)

These transforms can be worked out using straightforward algebra. The only thing to look out for is that the denominator doesn’t ever become zero because this will cause a ‘divide by zero’ exception. The program can be set up to watch out for this.
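
Written out as a short Python sketch, the low resolution transform looks like this. The function name perspective and its keyword arguments are assumptions made for the illustration. The value d=100 matches the view point used later in the example program, and the test on the denominator is one possible way of watching out for the divide by zero.

```python
def perspective(xv, yv, zv, d=100, wx=320, wy=200):
    """Project the view frame point (xv, yv, zv) to screen coordinates.

    Returns (xs, ys), or None when the point is level with or behind
    the view point so that (zv + d) is zero or negative.
    """
    denom = zv + d
    if denom <= 0:
        return None
    big_xv = xv * d / denom      # Xv = xv.d/(zv+d)
    big_yv = yv * d / denom      # Yv = yv.d/(zv+d)
    xs = wx / 2 + big_yv         # centred half-way across
    ys = wy - big_xv             # measured up from the bottom of the screen
    return xs, ys


print(perspective(50, 40, 100))    # a point 100 units behind the screen
print(perspective(50, 40, -100))   # level with the view point: None
```

A point lying in the view plane itself (zv = 0) is not scaled at all, which is a handy check when changing the data.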

5.2. Homogeneous Coordinates

The perspective transform, above, is quite simple but has a serious disadvantage if it is to be concatenated with several other types of transform. Remember, in the jargon of matrix transforms, concatenation simply means multiplying matrices together. That is the advantage of writing transforms as matrices. Where several transforms (rotations etc.) take place in succession, the overall transform can be constructed by multiplying the individual transforms and then applied to the coordinates in one go. The problem with this perspective transform is that as it stands it cannot be written as a matrix at all.

Basically, a matrix can represent any transform which is linear, which means there is a proportional relation between the initial and the transformed coordinates. What we would like to see for the transforms between Xv,Yv and xv,yv,zv are equations like

Xv = a.xv + b.yv + c.zv

Yv = d.xv + e.yv + f.zv

where the coefficients a,b,c,d,e and f are simple numbers.

Then it could be written as a matrix product (see Appendix 6 for more information on matrices)

  | Xv |   | a b c |   | xv |
  |    | = |       | * | yv |
  | Yv |   | d e f |   | zv |

Unfortunately the perspective transform we have derived does not have this form. What messes it up is the (zv+d) in the denominators; the coordinates themselves have to be in the numerators. Therefore as it stands our transform cannot be put into 3x3 matrix form. The perspective transform isn’t the only one to suffer from this problem. Simple translations do as well. The way out of the problem is to go to homogeneous coordinates.

As far as we are concerned the use of homogeneous coordinates is just a trick to get round this problem. The trick is to introduce another dimension, temporarily, to give more “space”. That’s all this extra dimension does because in this extra dimension all vertices have the same value, 1. In homogeneous coordinates the point (xv,yv,zv) becomes (xv,yv,zv,1).

How does this help? Now the transform can be written as a product but there are penalties to pay: the matrix product will generate an extra term which must be divided into the others. Also all matrices are now bigger (4x4). Here’s how it works.

First do the perspective transform in homogeneous coordinates to give an intermediate result:

  | d.xv |   | d 0 0 0 |   | xv |
  | d.yv |   | 0 d 0 0 |   | yv |
  |  0   | = | 0 0 0 0 | * | zv |
  | zv+d |   | 0 0 1 d |   | 1  |

Then divide by the fourth element (zv+d) to give

Xv = xv.d/(zv+d)

Yv = yv.d/(zv+d).

Finally translate to the screen centre (this translation can also be done as a matrix multiplication in homogeneous coordinates but that would be making work for the sake of it):

xs = 160 + yv.d/(zv+d)

ys = 200 - xv.d/(zv+d).
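
The same three steps (multiply, divide by the fourth element, translate) can be checked with a small Python sketch. The helper names perspective_matrix, mat_vec and project are hypothetical, chosen only for this example, and the matrix is stored a row at a time.

```python
def perspective_matrix(d):
    """The 4x4 perspective matrix for a view point at -d on the zv axis."""
    return [
        [d, 0, 0, 0],
        [0, d, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 1, d],
    ]


def mat_vec(m, v):
    """Multiply a 4x4 matrix by a 4-element column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def project(xv, yv, zv, d=100):
    """Perspective transform via homogeneous coordinates (low resolution)."""
    hx, hy, _, w = mat_vec(perspective_matrix(d), [xv, yv, zv, 1])
    if w == 0:
        raise ZeroDivisionError("point is level with the view point")
    big_xv = hx / w              # divide by the fourth element (zv+d)
    big_yv = hy / w
    return 160 + big_yv, 200 - big_xv


print(mat_vec(perspective_matrix(100), [50, 40, 100, 1]))  # intermediate result
print(project(50, 40, 100))                                # screen coordinates
```

For the point (50,40,100) the intermediate result is (5000, 4000, 0, 200), and dividing through by 200 gives Xv = 25 and Yv = 20, the same as the direct formulas.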

The perspective matrix has zeros for most of its elements and so many of the multiplications are a waste of time. In the program at the end of this section which illustrates the transform, we have used the homogeneous form. It serves as a useful introduction to matrix multiplication in assembly language and allows us to try a few little-used assembler instructions.

5.3. Example program

The example program shows a view of a plane with the letter “A” (an A monolith) sloping forwards in the world frame. When the perspective transform is done (together with windowing and everything else) it appears on the screen like the opening logo in a movie, where the words diminish into the distance. Figure 5.3 shows how the plane is set up in the view frame. Figure 5.4 shows how it looks on the screen.

figure 05 03
Figure 5.3: "A" monolith
figure 05 04
Figure 5.4: Screen picture of A monolith

You can look at the coordinates in the data file and change them if you wish to see how it looks in different orientations. If you want, you can change the data altogether to draw something different, but first read carefully how the data is laid out. This is explained more fully below in the data file. Be careful to join up the characters and label the vertices properly.

5.3.1. perspect.s

This is the control program. Its function is to load up the data, draw the picture and terminate with a key press. The data are stored in the file data_01.s, described below.

5.3.2. data_01.s

This is discussed next because it contains lists of the data. Understanding how these are used is essential to understanding how the program works. Since we start off with an object drawn in 3D in the view frame, each of its vertices must be fixed by three coordinates (xv,yv,zv). The lists of these are held at my_datax, my_datay and my_dataz. There is a scheme to identify each vertex in these lists. Each vertex has a number as shown in Figure 5.5. To find its coordinates simply read in from the start counting the first coordinate as number zero. The number of vertices in each polygon is given at vectors.

figure 05 05
Figure 5.5: Vertex numbers of A monolith

More data than this is required to actually draw the picture. The connections between the vertices are specified in my_edglst. For each polygon there is a list of connections in this table. The overall object is split into 6 polygons, all of which lie in the same plane. The vertex connections for these, going clockwise and closing the polygon, are

polygon 0: 0,1,2,3,0
polygon 1: 4,5,6,4
polygon 2: 7,8,9,10,7
polygon 3: 11,12,13,11
polygon 4: 14,15,4,18,14
polygon 5: 16,17,19,6,16

Arranged in this way all the information required to draw the object is readily available. To colour in the polygons a list of individual colours is held at my_colour. Notice that in this picture it was decided to construct the “A” by drawing an outline (polygon 1) and masking out the open parts (polygons 2 and 3) with the background colour, rather than by drawing each segment separately. This is also evident from the actual colour list, my_colour, where it can be seen that the background is gold and the letters are magenta. Doing it this way saves a bit of time but may lead to problems when the boundaries don’t quite match up. To supplement these lists the total number of polygons is given at my_npoly. These are the data blocks which must be loaded up at initialisation. Other variables are calculated by the various parts of the program as it goes.
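
To make the layout of the connection table concrete, here is a Python sketch that splits a flat edge list back into its polygons. It rests on two assumptions: that my_edglst holds the six connection lists above end to end, and that my_nedges holds the number of edges of each polygon (the values below are simply counted from those lists). The function name polygons is made up for the sketch.

```python
# Number of edges in each polygon, counted from the connection lists.
my_nedges = [4, 3, 4, 3, 4, 4]

# The six connection lists stored end to end, each one closed by
# repeating its first vertex number.
my_edglst = [0, 1, 2, 3, 0,
             4, 5, 6, 4,
             7, 8, 9, 10, 7,
             11, 12, 13, 11,
             14, 15, 4, 18, 14,
             16, 17, 19, 6, 16]


def polygons(nedges, edglst):
    """Split the flat edge list into one closed vertex list per polygon."""
    result = []
    pos = 0
    for n in nedges:
        # n edges need n+1 entries because the first vertex is repeated.
        result.append(edglst[pos:pos + n + 1])
        pos += n + 1
    return result


for number, verts in enumerate(polygons(my_nedges, my_edglst)):
    print("polygon", number, verts)
```

Each vertex number found this way is an index into the coordinate lists, counting the first coordinate as number zero.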

You can change these lists to draw anything you wish. Just remember it is a 3D object in the view frame and coordinates are easiest to determine from views along the different axes. It must also be placed in front of the view plane as shown in Figure 5.3.

5.3.3. data_02.s

The 4x4 matrix for the perspective transform is stored here, a row at a time, with a viewpoint at -100 on the view frame z axis. It isn’t included with data_01.s since that file will only be used once.

If you can’t follow the matrix multiplication used in the transform, don’t worry. Just think of the transform as a piece of ‘machinery’ to perform a function. If you want to alter the angle of view, change each of the numbers 100 to the new position of the view point. Remember 100 here is the distance of the view point along the negative zv axis.

5.3.4. bss_02.s

This contains a list of the variables used by the programs. Data is loaded into the variable blocks from the data file data_01.s by the control program. What goes where is clear from the control program. The data consists of the lists of the x, y, and z coordinates of the vertices in the view frame, and other attributes as described in the previous sections.

5.3.5. core_02.s

This has two parts: the perspective transform, and polydraw which takes care of clipping and the actual drawing.

The perspective transform is done by matrix multiplication in homogeneous coordinates. It could be done by direct algebra but it is done this way to illustrate the use of homogeneous coordinates and matrix multiplication in a very compact way. Also it utilises a useful but little-used assembler instruction, LINK. When invoked, this causes the processor to open a space on the stack, called a frame, where data can be stored without interfering with the main stack. The pointer to the frame, one of the address registers, is declared in the LINK instruction together with the space required. The processor takes care of adjusting the regular stack pointer clear of the frame. In the present case it’s where the intermediate perspective calculations are stored. When finished with, the frame is closed by means of the UNLK instruction and the tidying up of the stack pointer is taken care of by the processor.

The perspective transform calculates the projections of the vertices on the view plane and stores them in two lists: scoordsx and scoordsy.

Polydraw is the final part. It contains all the previous subroutines necessary to complete the drawing. It also contains at the start a test for the visibility of each polygon. This is in anticipation of things to come. The test is to look for a colour number greater than $1f. Such a value would have been set earlier if the polygon was found to be facing away from the view point.

  *
  * perspect.s
  *

  *SECTION TEXT
    opt   d+        labels for debugging
    bra   main      don't execute the includes
    include core_02.s core subroutines
    include systm_00.s

  main  bsr   set_up    allocate memory etc

  * Transfer data from the data file to variables locations:
  * first the edge numbers and colours
    move.w  my_npoly,d7 no of polygons?
    beq     main        if none, quit
    move.w  d7,npoly    or becomes
    subq.w  #1,d7       the counter
    move.w  d7,d0       save it
    lea     my_nedges,a0  source
    lea     snedges,a1    destination
    lea     my_colour,a2    source
    lea     col_lst,a3      destination
  loop0 move.w  (a0)+,(a1)+   transfer edge nos
        move.w  (a2)+,(a3)+   transfer colours
        dbra    d0,loop0

  * second the edge list and coordinates
      move.w  d7,d0 restore count
      lea     my_nedges,a6
      clr     d1
      clr     d2

  loop1 add.w (a6),d1
        add.w (a6)+,d2
        addq  #1,d2     last one repeated each time
        dbra  d0,loop1  = total no of vertices
        subq  #1,d2     the counter
        lea   my_edglst,a0  source
        lea   sedglst,a1    destination
  loop2 move.w  (a0)+,(a1)+  pass it
        dbra    d2,loop2
        move.w  d1,vncoords
        subq    #1,d1
        lea     vcoordsx,a1
        lea     my_datax,a0
        lea     vcoordsy,a3
        lea     my_datay,a2
        lea     vcoordsz,a5
        lea     my_dataz,a4

  loop3 move.w  (a0)+,(a1)+
        move.w  (a2)+,(a3)+
        move.w  (a4)+,(a5)+
        dbra    d1,loop3

  * the clip frame boundaries
        move.w  my_xmin,clp_xmin  ready
        move.w  my_xmax,clp_xmax  for
        move.w  my_ymin,clp_ymin  clipping
        move.w  my_ymax,clp_ymax  clipping
  * Calculate the perspective view and draw it
  bit_loop:
    bsr drw_shw2
    bsr perspective
    bsr polydraw
    bsr drw2_shw1
  pers_loop
    bra pers_loop forever

  *SECTION DATA
    include data_01.s
    include data_02.s
  *SECTION BSS
    include bss_02.s
    END
  ******************************************************************************************
  *                                     Core_02.s
  *                                 Perspective stuff
  ******************************************************************************************
   include core_01.s

  perspective
   move.w vncoords,d7 any points to do?
   beq prs_end
   subq.w #1,d7 counter
   lea vcoordsx,a0
   lea vcoordsy,a1
   lea vcoordsz,a2
   lea scoordsx,a4
   lea scoordsy,a5
   link a6,#-32      open 16 word frame
  prs_crd
   moveq #3,d6
   lea persmatx,a3
  prs_elmnt
   move.w (a0),d0
   move.w (a1),d1
   move.w (a2),d2
   muls (a3)+,d0
   muls (a3)+,d1
   muls (a3)+,d2
   add.l d1,d0
   add.l d2,d0
   move.w #1,d1
   muls (a3)+,d1
   add.l d1,d0
   move.l d0,-(a6)
   dbra d6,prs_elmnt
   move.l (a6)+,d3
   bne prs_ok
   addq #1,d3
  prs_ok
   addq.l #4,a6
   move.l (a6)+,d4
   divs d3,d4
   add.w #160,d4
   move.w d4,(a4)+
   move.l (a6)+,d4
   divs d3,d4
   sub.w #199,d4
   neg.w d4
   move.w d4,(a5)+
   addq.l #2,a0
   addq.l #2,a1
   addq.l #2,a2
   dbra d7,prs_crd
   unlk a6
  prs_end
   rts

  polydraw
   move.w npoly,d7
   beq polydraw5
   subq #1,d7
   lea scoordsx,a0
   lea scoordsy,a1
   lea sedglst,a2
   lea snedges,a3
   lea col_lst,a4
  polydraw2
   move.w (a4)+,d0
   cmp.w #$1f,d0
   ble polydraw3

   move.w (a3)+,d0
   addq.w #1,d0
   add d0,d0
   adda.w d0,a2
   bra polydraw4
  polydraw3
   move.w d0,colour
   move.w (a3)+,d0
   beq polydraw3
   move.w d0,no_in
   lea crds_in,a5

  polydraw1
   move.w (a2)+,d1
   lsl #1,d1
   move.w 0(a0,d1.w),(a5)+
   move.w 0(a1,d1.w),(a5)+
   dbra d0,polydraw1
   movem.l d7/a0-a4,-(sp)
   bsr clip
   bsr poly_fill
   movem.l (sp)+,d7/a0-a4
  polydraw4
   dbra d7,polydraw2
  polydraw5
   rts
  ******************************************************************************************
  * BSS_02.s
  ******************************************************************************************
   include bss_01.s

  scoordsx ds.w 100 xcoords
  scoordsy ds.w 100 ycoords
  sedglst ds.w 100 edge connections
  snedges ds.w 20 number of edges in each polygon
  npoly ds.w 1 number of polygons in this object
  col_lst ds.w 20 colours

  vcoordsx ds.w 100 viewframe xcoords
  vcoordsy ds.w 100
  vcoordsz ds.w 100
  vncoords ds.w 1
  *****************************************************************************************
  * Data_01.s
  *****************************************************************************************
   include data.s
   IFND TRANSFORM
  my_datax dc.w 115,115,25,25,43,107,43,40,65,65,40,75
           dc.w 88,75,34,34,34,34,43,43
  my_datay dc.w -100,100,100,-100,-70,20,73,-55,-20,30
           dc.w 53,-8,10,22,-40,-91,90,20,-50,48
  my_dataz dc.w 120,120,0,0,24,108,24,20,53,53
           dc.w 20,66,84,66,12,12,12,12,24,24
   ENDC
  my_edglst dc.w 0,1,2,3,0,4,5,6,4,7,8,9,10,7
            dc.w 11,12,13,11,14,15,4,18,14,16,17,19,6,16
  my_nedges dc.w 4,3,4,3,4,4
  my_npoly dc.w 6
  my_colour dc.w 5,23,5,5,23,23
  my_xmin dc.w 0
  my_xmax dc.w 319
  my_ymin dc.w 0
  my_ymax dc.w 199

  * data_02.s


  persmatx:
    dc.w    100,0,0,0,0,100,0,0,0,0,0,0,0,0,1,100

    include equates.s

6. Simple Rotations

What we want to do here is rotate an object in the world frame. In our world model this is part of what happens when an object is moved from its object frame to the world frame. In addition, in general, there will be an associated translation as it is moved to its current location. As an example of simple rotations in action, the object-to-world transform is a good thing to do next. In a complex world with several different objects, each one would have different translations and rotations to bring them all together to make the world picture.

Let’s take a simple world with just one object to start with. We already have a good example to work on - the monolith with the “A” written on it, which was used to illustrate the perspective transform. The data is already entered and ready to go. What we would like to see is the monolith rotating in the centre of the screen. That’s what we’ll do next.

6.1. Geometric Transforms

Geometric transforms are those which change the coordinates of objects. Are there any other kinds? Yes, those which change frames of reference, called coordinate transforms. In mathematical language a geometric transform is the inverse of a coordinate transform (this topic is also discussed in Appendix 6). An example of the latter kind is the transform from world frame to view frame. Remember, the view frame is the set of axes attached to the observer (you) moving through the world frame. Seen from the view frame of an observer on the move, the coordinates of all objects are continuously changing. Although coordinate and geometric transforms are two sides of the same coin, the viewing transform is a bit more difficult to follow and is done later in Chapters 9 and 10. In this section simple rotations about the x, y and z axes are presented without mathematical derivation. Turn to Appendix 6 for an additional mathematical description.

6.2. Rotations About the Principal Axes

A spinning top is a good example of an object undergoing geometric rotation about the vertical axis. As far as we are concerned here, the mathematics used to do this is just ‘heavy machinery’. There is no real need to know how it is derived in order to use it. The transforms we are about to discuss are illustrated in Figure 6.1.

6.2.1. Rotation about the x-axis

figure 06 01
Figure 6.1: Rotations around the x, y and z axes

This is illustrated in Figure 6.1(1) by a point P with coordinates (x,y,z) being rotated about the x-axis by an angle θ to arrive at the point P' with coordinates (x',y',z'). Representing the points by vectors clearly shows the rotation. Notice how the sense of the rotation is defined. It is clockwise when looking along the positive x-axis from behind the y-z plane. In terms of the column vectors, the transform can be written as a matrix product

$$
\begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} =
\begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix}
$$

In simple algebra, with the matrix product multiplied out:

x' = x

y' = y.cosθ - z.sinθ

z' = y.sinθ + z.cosθ

For conciseness, the matrix is abbreviated to R'(θ) and the transform is then abbreviated to

P' = R'(θ).P
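The multiplied-out equations translate directly into a few lines of code. The following is a minimal sketch in Python using floating point, purely to show the arithmetic; the function name is an assumption, and the book's own routines use integer tables instead, as described later in this chapter.

```python
import math

def rotate_x(point, theta_deg):
    """Rotate (x, y, z) about the x-axis by theta degrees."""
    x, y, z = point
    c = math.cos(math.radians(theta_deg))
    s = math.sin(math.radians(theta_deg))
    # x is unchanged; y and z mix according to the matrix rows
    return (x, y * c - z * s, y * s + z * c)

# A unit vector along z, turned through 90 degrees, ends up along -y
# (apart from tiny floating-point residue in the other components).
print(rotate_x((0, 0, 1), 90))
```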

6.2.2. Rotation about the y-axis

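In this case the point P is rotated about the y-axis by an angle φ as shown in Figure 6.1(2). As before, the rotation R'(φ) is clockwise looking along the positive y-axis from behind the x-z plane. Expressed as a matrix product, the transform is

$$
\begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} =
\begin{pmatrix} \cos\phi & 0 & \sin\phi \\ 0 & 1 & 0 \\ -\sin\phi & 0 & \cos\phi \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix}
$$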

6.2.3. Rotation about the z-axis

In Figure 6.1(3) the point P is rotated about the z-axis by an angle γ. The rotation R'(γ) is clockwise looking along the z-axis from behind the x-y plane.

$$
\begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} =
\begin{pmatrix} \cos\gamma & -\sin\gamma & 0 \\ \sin\gamma & \cos\gamma & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix}
$$

6.2.4. Composite Rotations

When all three types of rotation are done simultaneously things become a good deal more complicated. This is because the order of rotation matters; rotating first by θ, second by φ and third by γ does not end up with P in the same place as with any other order. This may seem to be a surprising result. In mathematical jargon, three dimensional rotations are said to be noncommutative. To illustrate the point look at Figure 6.2.

figure 06 02
Figure 6.2: The order of rotations matters

This has two parts to it. In part 1 a vector which lies along the z axis to start with is first rotated about the x axis by 90° and then about the z axis by 90°. It ends up pointing along the x axis. In part 2 the order of rotations is reversed. Consequently the first rotation does nothing and the second leaves it pointing along the -y axis. Clearly, changing the order of rotation alters the end result.
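The two cases in Figure 6.2 can be checked numerically. The sketch below, in Python with floating point, applies the x and z rotation equations from the previous sections in both orders; the helper names are assumptions made for the example.

```python
import math

def rot_x(p, deg):
    """Rotate p about the x-axis by deg degrees."""
    x, y, z = p
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return (x, y * c - z * s, y * s + z * c)

def rot_z(p, deg):
    """Rotate p about the z-axis by deg degrees."""
    x, y, z = p
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return (x * c - y * s, x * s + y * c, z)

def tidy(p):
    """Round away floating-point noise so the result reads as whole numbers."""
    return tuple(int(round(v)) for v in p)

start = (0, 0, 1)                             # unit vector along z
part1 = tidy(rot_z(rot_x(start, 90), 90))     # x first, then z
part2 = tidy(rot_x(rot_z(start, 90), 90))     # z first, then x
print(part1, part2)
```

Part 1 finishes along the x axis and part 2 along the -y axis, in agreement with the figure.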

A consequence of this is that keeping count of the individual rotations θ, φ and γ separately provides insufficient information to get to the final position. The order of rotation must also be given. Where the individual rotations are small and frequent, such as in an object following a complex path, a different strategy must be found to keep track of the orientation. This is discussed in Chapter 10.

For the moment this is not such a problem. Performing a simple sequence of rotations in the world frame, or as part of the object-to-world transform, may only require three rotations about the individual axes in a simple order. To have a consistent scheme, we rotate first by γ, second by φ and third by θ. In shorthand the overall transform when all these rotations take place in this order is:

P' = R'(θ).R'(φ).R'(γ).P

Notice how the first rotation appears next to the original point P, and later rotations appear farther to the left. This is the order of matrix multiplication with column vectors.

There is no need to perform the matrix products on the vector separately. Their product can be found beforehand to produce one resultant matrix, which can then be multiplied by the vector in one single operation. This combined (concatenated) rotation is denoted by R'(θ,φ,γ).
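To make the idea of concatenation concrete, the following Python sketch multiplies the three axis matrices together once and then applies the single result to a point. It uses floating point and hypothetical function names; the listing of core_03.s later in this chapter builds the same nine elements directly from products of sines and cosines.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def combined_rotation(theta_deg, phi_deg, gamma_deg):
    """Concatenate the x, y and z rotations; gamma is applied first."""
    t, p, g = (math.radians(a) for a in (theta_deg, phi_deg, gamma_deg))
    ct, st = math.cos(t), math.sin(t)
    cp, sp = math.cos(p), math.sin(p)
    cg, sg = math.cos(g), math.sin(g)
    rx = [[1, 0, 0], [0, ct, -st], [0, st, ct]]
    ry = [[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]]
    rz = [[cg, -sg, 0], [sg, cg, 0], [0, 0, 1]]
    return matmul(rx, matmul(ry, rz))

def apply(m, p):
    """Multiply a 3x3 matrix by a column vector."""
    return tuple(sum(m[i][k] * p[k] for k in range(3)) for i in range(3))

r = combined_rotation(20, 50, 30)
print(r[0])                      # cos(phi)cos(gamma), -cos(phi)sin(gamma), sin(phi)
print(apply(r, (25, 100, 0)))    # one matrix product per vertex
```

The saving is that the nine elements are worked out once per frame, however many vertices the object has.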

6.3. The Object-to-World Transform

This is a good transform to illustrate what we have been talking about.

The point of this transform is to move an object from its object reference frame to the world frame, where it appears in the cluster of all the other objects which make up the world picture. The object-to-world transform is illustrated in Figure 6.3 for the general case of all three rotations and a translation. In this case the angles are specific to the transform and are called oθ, oφ and oγ to distinguish them from other angles which will appear later in other transforms. The displacement is (Oox,Ooy,Ooz) and the transform, written in vector notation, is:

$$
\begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} =
R' \begin{pmatrix} x \\ y \\ z \end{pmatrix} +
\begin{pmatrix} \mathrm{Oox} \\ \mathrm{Ooy} \\ \mathrm{Ooz} \end{pmatrix}
$$

Notice that the translation has not been implemented as a matrix multiplication, but has been left as a vector addition. Like the perspective transform, the translation can be converted to a matrix product in homogeneous coordinates to put it on the same footing as everything else and allow it to be included in concatenation. This is not done here because it can be incorporated simply as an addition following the rotation transform. Further information on homogeneous coordinates is given in Appendix 6.
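The equivalence of the two approaches can be sketched as follows. This Python fragment is only an illustration, with hypothetical names: one function rotates and then adds the displacement, the other folds both into a single 4x4 matrix acting on (x, y, z, 1). A 90° rotation about the z axis is used because its matrix contains only whole numbers.

```python
def object_to_world(rot, offset, p):
    """Rotate p by the 3x3 matrix rot, then add the displacement offset."""
    rotated = [sum(rot[i][k] * p[k] for k in range(3)) for i in range(3)]
    return tuple(r + o for r, o in zip(rotated, offset))

def homogeneous(rot, offset):
    """Fold the same rotation and displacement into one 4x4 matrix."""
    rows = [list(rot[i]) + [offset[i]] for i in range(3)]
    rows.append([0, 0, 0, 1])
    return rows

def apply4(m, p):
    """Multiply a 4x4 matrix by the column vector (x, y, z, 1)."""
    v = list(p) + [1]
    out = [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]
    return tuple(out[:3])

ROT_Z90 = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]   # 90 degrees about z
OFFSET = (300, 0, 200)                         # the Oox, Ooy, Ooz set in otranw.s
point = (25, 100, 0)

print(object_to_world(ROT_Z90, OFFSET, point))
print(apply4(homogeneous(ROT_Z90, OFFSET), point))
```

Both lines print the same world coordinates. The separate addition costs three adds per vertex, against the extra multiplications of the 4x4 form.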

One way to think of the object frame is as a set of axes centred on the world frame origin. This is certainly a valid picture since without any rotation or translation, the object would appear at the world frame origin. The translation is essential to avoid superimposing all objects at the world frame origin. If the angles are continuously changed between frames then the object will rotate in the world frame. Since we already have the perspective transform in place from the previous chapter we can watch this happen.

figure 06 03
Figure 6.3: General geometric transform in the world frame

6.4. Example Program

This is a program to set up the object-to-world transform and use it to show the A monolith rotating about the z-axis of the world frame. Also the sines and cosines of angles must be calculated for the rotation matrices. How these are done is discussed below in the example programs.

6.4.1. otranw.s

This is the main control program. This time the initialisation is more extensive because a lot of data transfer takes place. The data to draw the A monolith is in the file data_01.s as before, but now it has to be transferred to the object variables list. The rotation takes place as it is transferred from the object frame to the world frame.

At the moment we can only show rotation by an angle oγ about the zw axis. This is because rotating all the way round either of the other axes, by oθ or oφ, would try to display the rear side of the monolith. This cannot be done because of the way the polygon filling routine is set up to expect polygons in the screen frame to have an anticlockwise connected edge list. The rear side has this order reversed and in trying to cope with this the routine draws garbage. Normally the rear side of an object is not visible and would be dealt with in that way. As yet we do not have the capability to test for visibility. This is done in Chapter 7. If it were desired to show the back of the monolith it would have to be entered in the data as a separate object in a back-to-back arrangement.

The program shows the rotation of the A monolith about the zw axis in the world frame through the range of angles 0° to 360° in 10° steps. You can alter the angular increment between each frame and the displacement (Oox,Ooy,Ooz) to see what effect these have. For very large objects it is a good idea to have a small window so that only a small fraction of a large object will actually get drawn so that speed is maintained without losing the impression of size. This explains why many games have a very small window, which is the only part that needs to be re-drawn each frame, surrounded by a large static control panel which is drawn only once at the beginning.

6.4.2. data_03.s

The rotation transform uses the sines and cosines of the angles oθ, oφ and oγ. For a program operating in Basic these would be calculated to many significant digits using a series approximation. There is no time for that here. We have to resort to the method used before hand calculators were invented: tables. The table in this file contains the sines of all the angles between 0° and 90° in 1° increments, each multiplied by the factor 16384, which is 2^14. The reason for this is straightforward. It shifts each value 14 binary places to the left and allows us to work in units of 1/16384, so that products can be determined to high accuracy. However, it must be remembered that at the end of the calculation of a new coordinate the result must be divided by 16384 to restore it to its correct size. There is no point in knowing the final coordinate to greater accuracy than plus or minus 1, since this is the smallest increment which can be displayed on the screen. Also, if the trigonometric functions were not multiplied by 16384, their magnitudes would all fall in the range 0 to 1 and in integer arithmetic would be rounded to one or other of these values, which would then give either zero or the unchanged coordinate for every product. The point of choosing 2^14 as a factor is that it can be introduced or removed very quickly by 14 left or right shifts. Greater accuracy could be obtained using a larger factor, but 16384 is quite adequate for our purposes providing steps are taken to correct for errors where they occur.
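The scaling trick is easy to try out. The following Python sketch builds a table of the same form and multiplies a coordinate by a scaled sine; the names SCALE, SINTABLE and scaled_product are assumptions for the example, and Python's floating-point sine stands in for however the book's table was originally prepared.

```python
import math

SCALE = 1 << 14          # 16384

# Sines of 0..90 degrees as integers in units of 1/16384.
SINTABLE = [round(math.sin(math.radians(a)) * SCALE) for a in range(91)]

def scaled_product(coord, trig):
    """Multiply a coordinate by a scaled sine, then drop the scale factor."""
    return (coord * trig) >> 14     # a 14-bit right shift divides by 16384

# sin(30 degrees) is stored as 8192, so 100 * sin(30 degrees) comes out as 50.
print(SINTABLE[30], scaled_product(100, SINTABLE[30]))

# Without the scale factor the sine would have to be stored as the integer 0.
print(100 * int(math.sin(math.radians(30))))
```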

For greatest speed it makes most sense to have separate tables for both sines and cosines. This is not done here, mainly to illustrate how the symmetry of sines and cosines allows any value in the entire range 0° to 360° to be calculated from the range 0° to 90°. The time to do this is very small compared, for example, to the time taken to actually fill the polygon, but for greater speed separate tables should be used.

6.4.3. core_03.s

The first part of the subroutine here uses the look-up table in data_03.s to find the sines and cosines of the angles used in the rotation, ready for use in the transform matrix. This uses the result that the sine or cosine of any angle in the range 0° to 360° can be found from that of an equivalent angle in the range 0° to 90°. Finding this equivalent angle is what the start of the first part is all about.
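The quadrant logic can be sketched in Python as follows. This is an illustration of the idea rather than a line-by-line port of sincos: the table is generated on the spot, the names are assumptions, and the angle is reduced with a remainder operation where the assembler subtracts 360 once.

```python
import math

SINTABLE = [round(math.sin(math.radians(a)) * 16384) for a in range(91)]

def sincos(angle):
    """Return (sin, cos) of a whole-degree angle, each scaled by 16384."""
    angle %= 360                         # simplification of the listing's test
    if angle < 90:                       # first quadrant: read the table directly
        a, sin_sign, cos_sign = angle, 1, 1
    elif angle < 180:                    # second: use 180 - angle, cosine negative
        a, sin_sign, cos_sign = 180 - angle, 1, -1
    elif angle < 270:                    # third: use angle - 180, both negative
        a, sin_sign, cos_sign = angle - 180, -1, -1
    else:                                # fourth: use 360 - angle, sine negative
        a, sin_sign, cos_sign = 360 - angle, -1, 1
    # cos(a) = sin(90 - a), so one table serves both functions
    return sin_sign * SINTABLE[a], cos_sign * SINTABLE[90 - a]

print(sincos(210))    # sine and cosine are both negative in the third quadrant
```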

In the second part, the matrix is constructed and then used to transform the object coordinates by matrix multiplication as was done in the earlier perspective transform. Although only rotations about the z axis are done in this example, the matrix can handle rotations about all three axes as described above. At the end of the rotational transform, the displacements Oox, Ooy and Ooz are added to place the object at the desired location in the world frame.
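A minimal integer version of this second part might look like the Python sketch below. It assumes a rotation about the z axis only, uses hypothetical function names, and relies on Python's floating-point sine and cosine to build the scaled matrix. The 14-bit right shift plays the role of the shift-and-swap sequence in the listing.

```python
import math

def scaled(value):
    """Convert a sine or cosine to an integer in units of 1/16384."""
    return round(value * 16384)

def rotation_z_scaled(gamma_deg):
    """Integer matrix for a rotation about the z-axis, elements times 2^14."""
    c = scaled(math.cos(math.radians(gamma_deg)))
    s = scaled(math.sin(math.radians(gamma_deg)))
    return [[c, -s, 0], [s, c, 0], [0, 0, 16384]]

def otranw_point(matrix, p, offset):
    """Rotate one vertex with integer arithmetic, rescale, then displace."""
    out = []
    for row, shift in zip(matrix, offset):
        total = sum(m * v for m, v in zip(row, p))   # still scaled by 2^14
        out.append((total >> 14) + shift)            # remove the scale, add Oo
    return tuple(out)

m = rotation_z_scaled(90)
print(otranw_point(m, (25, 100, 0), (300, 0, 200)))
```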

6.4.4. bss_03.s

New variables lists.

  * otranw.s
  * Simple rotations for Chapter 6
  *

  *SECTION TEXT
    opt   d+
    bra   main
    include systm_00.s
    include core_03.s     important subroutines

  main  bsr   set_up      allocate memory etc.

  * transfer  all the data
    move.w    my_npoly,d7   no. of polygons
    move.w    d7,npoly      pass it
    subq.w    #1,d7         the counter
    move.w    d7,d0         save it
    lea       my_nedges,a0    source
    lea       snedges,a1      destination
    lea       my_colour,a2    source
    lea       col_lst,a3      destination

  loop0 move.w  (a0)+,(a1)+   transfer edge nos.
        move.w  (a2)+,(a3)+   transfer colours
        dbra    d0,loop0
  * Calculate the number of vertices altogether
      move.w    d7,d0         restore count
      lea       my_nedges,a6
      clr       d1
      clr       d2
  loop1 add.w   (a6),d1       no more than this
        add.w   (a6)+,d2      total number of vertices
        addq    #1,d2         and last one repeated each time
        dbra    d0,loop1

  * Move the edge list
        subq    #1,d2         the counter
        lea     my_edglst,a0    source
        lea     sedglst,a1      destination

  loop2   move.w (a0)+,(a1)+     pass it
          dbra    d2,loop2

  * and the coords  list
          move.w  d1,oncoords
          subq    #1,d1         the counter
          lea     ocoordsx,a1
          lea     my_datax,a0
          lea     ocoordsy,a3
          lea     my_datay,a2
          lea     ocoordsz,a5
          lea     my_dataz,a4
  loop3
        move.w    (a0)+,(a1)+
        move.w    (a2)+,(a3)+
        move.w    (a4)+,(a5)+
        dbra      d1,loop3
  * and the window limits
        move.w  my_xmin,clp_xmin      ready
        move.w  my_xmax,clp_xmax      for
        move.w  my_ymin,clp_ymin      clipping
        move.w  my_ymax,clp_ymax

  * place it in the world frame
        move.w  #300,Oox              300 in the air
        move.w  #200,Ooz              200 in front
        clr.w   Ooy                   dead centre
  * initialise for rotation
        clr.w   otheta                init angles
        move.w  #50,ophi              tilt it up 50 degrees
        clr.w   ogamma
        clr.w   screenflag            0=screen  1 draw, 1=screen 2 draw

  * Start the rotation about zw axis (can't rotate about others
  * or we'll  see back of it).
  loop5 move.w  #360,d7               a cycle
  loop4
        move.w  d7,ogamma             next angle gamma
        move.w  d7,-(sp)              save the angle
        tst.w   screenflag            screen 1 or screen2?
        beq     screen_1              draw on screen 1,display screen2
        bsr     drw2_shw1             draw on screen 2, display screen1
        clr.w   screenflag            and set the flag for next time
        bra     screen_2
  screen_1:
        bsr     drw_shw2              draw on 1, display 2
        move.w  #1,screenflag         and set the flag for next time
  screen_2:
        bsr     otranw                rotational transfers
  * pass on the new coords
        move.w  oncoords,d7
        move.w  d7,vncoords
        subq.w  #1,d7
        lea     wcoordsx,a0
        lea     wcoordsy,a1
        lea     wcoordsz,a2
        lea     vcoordsx,a3
        lea     vcoordsy,a4
        lea     vcoordsz,a5
  loop6 move.w  (a0)+,(a3)+
        move.w  (a1)+,(a4)+
        move.w  (a2)+,(a5)+
        dbra    d7,loop6
  * Complete the picture
        bsr     perspective           perspective
        bsr     polydraw              finish the picture
        move.w  (sp)+,d7
        sub.w   #10,d7                reduce the angle by 10 degrees
        bgt     loop4                 next angle
        bra     loop5                 or repeat the cycle
        bra     main                  this could go on forever

  *SECTION DATA
        include data_01.s
        include data_03.s
  *SECTION BSS
        include bss_03.s

        END
  * data_03.s
  * A sine look-up table
  *
  * table of sines from 0 to 90 degrees in increments of 1 degree
  * multiplied  by 2^14 (16384). Used to find the sine or cosine
  * of any angle

  sintable:
        dc.w  0,286,572,857,1143,1428,1713,1997,2280,2563,2845,3126
        dc.w  3406,3686,3964,4240,4516,4790,5063,5334,5604,5872,6138
        dc.w  6402,6664,6924,7182,7438,7692,7943,8192,8438,8682,8923
        dc.w  9162,9397,9630,9860,10087,10311,10531,10749,10963,11174
        dc.w  11381,11585,11786,11982,12176,12365,12551,12733,12911
        dc.w  13085,13255,13421,13583,13741,13894,14044,14189,14330
        dc.w  14466,14598,14726,14849,14968,15082,15191,15296,15396
        dc.w  15491,15582,15668,15749,15826,15897,15964,16026,16083
        dc.w  16135,16182,16225,16262,16294,16322,16344,16362,16374
        dc.w  16382,16384

        include data_02.s       the perspective transfers
  *****************************************************************************************
  *                    Core_03.s (subroutines for chapter six).                           *
  *****************************************************************************************
  *               sincos - returns the sine and cosine of given angle
  *               otranw - transforms obj coords to world coords.
  *****************************************************************************************
   include core_02.s

  * The sine and cosine of an angle are found. The sintable covers the positive quadrant  *
  * 0-90 degrees and can be used to generate any sin or cos in the range 0 - 360 degrees  *
  * d1=angle in degrees. Returns sin in d2; cos in d3.
  sincos
   lea sintable,a5
   cmp #360,d1            test(angle-360)
   bmi less360
   sub #360,d1            make it less than 360
  less360
   cmp #270,d1            test(angle-270)
   bmi less270
   bsr over270
   rts
  less270
   cmp #180,d1            test(angle-180)
   bmi less180
   bsr over180
   rts
  less180
   cmp #90,d1
   bmi less90
   bsr over90
   rts
  less90
   add d1,d1              *2 for offset into table
   move.w 0(a5,d1.w),d2   get sine
   subi #180,d1           cos(angle)=sin(90-angle)
   neg  d1                offset into table for cosine
   move.w 0(a5,d1.w),d3   cosine
   rts
  over270
   subi #360,d1
   neg  d1                360-angle
   add  d1,d1             table offset
   move.w 0(a5,d1.w),d2   get sine
   neg  d2
   subi #180,d1           cos(angle)=sin(90-angle)
   neg  d1                offset into table for cosine
   move.w 0(a5,d1.w),d3   cosine
   rts
  over180
   subi #180,d1
   add  d1,d1             table offset
   move.w 0(a5,d1.w),d2   get sine
   neg  d2
   subi #180,d1           cos(angle)=sin(90-angle)
   neg  d1                offset into table for cosine
   move.w 0(a5,d1.w),d3   cosine
   neg  d3
   rts
  over90
   subi #180,d1
   neg  d1                180-angle
   add  d1,d1              table offset
   move.w 0(a5,d1.w),d2   get sine
   subi #180,d1           cos(angle)=sin(90-angle)
   neg  d1                offset into table for cosine
   move.w 0(a5,d1.w),d3   cosine
   neg  d3
   rts
  ******************************************************************************************
  * The subroutines for transforming object coords to world coords.                        *
  * Includes rotations given by otheta, ophi and ogamma about the world axes wx,wy,wz and  *
  * a displacement Oox, Ooy, Ooz relative to the world origin.                             *
  * Part 1. Construct the matrix for the rotations.                                        *
  ******************************************************************************************
  * Convert object rotation angles and store for rotation matrix.
  otranw
   move.w otheta,d1
   bsr    sincos
   move.w d2,stheta
   move.w d3,ctheta
   move.w ophi,d1
   bsr    sincos
   move.w d2,sphi
   move.w d3,cphi
   move.w ogamma,d1
   bsr    sincos
   move.w d2,sgamma
   move.w d3,cgamma
  * construct transform matrix otranw. (all elements end up doubled)
   lea stheta,a0
   lea ctheta,a1
   lea sphi,a2
   lea cphi,a3
   lea sgamma,a4
   lea cgamma,a5
   lea o_wmatx,a6         matrix
  * do element OM11
   move.w (a3),d0         cphi
   muls   (a5),d0         cphi*cgamma
   lsl.l  #2,d0
   swap   d0              /2^14
   move.w d0,(a6)+        OM11
  * do OM12
   move.w (a3),d0         cphi
   muls   (a4),d0         cphi*sgamma
   neg.l  d0
   lsl.l  #2,d0
   swap   d0              /2^14
   move.w d0,(a6)+        OM12
  * do OM13
   move.w (a2),(a6)+      sphi
  * do OM21
   move.w (a1),d0         ctheta
   muls   (a4),d0         ctheta*sgamma
   move.w (a0),d1         stheta
   muls   (a2),d1         stheta*sphi
   lsl.l  #2,d1
   swap   d1
   muls   (a5),d1         stheta*sphi*cgamma
   add.l  d1,d0           stheta*sphi*cgamma + ctheta*sgamma
   lsl.l  #2,d0
   swap   d0
   move.w d0,(a6)+
  * do OM22
   move.w (a1),d0         ctheta
   muls   (a5),d0         ctheta*cgamma
   move.w (a0),d1         stheta
   muls   (a2),d1         stheta*sphi
   lsl.l  #2,d1
   swap   d1
   muls   (a4),d1         stheta*sphi*sgamma
   sub.l  d1,d0           ctheta*cgamma - stheta*sphi*sgamma
   lsl.l  #2,d0
   swap   d0
   move.w d0,(a6)+
  * do OM23
   move.w (a0),d0         stheta
   muls   (a3),d0         stheta * cphi
   neg.l  d0
   lsl.l  #2,d0
   swap   d0
   move.w d0,(a6)+
  * do OM31
   move.w (a0),d0         stheta
   muls   (a4),d0         stheta*sgamma
   move.w (a1),d1         ctheta
   muls   (a2),d1         ctheta*sphi
   lsl.l  #2,d1
   swap   d1
   muls   (a5),d1         ctheta*sphi*cgamma
   sub.l  d1,d0           stheta*sgamma-ctheta*sphi*cgamma
   lsl.l  #2,d0
   swap   d0
   move.w d0,(a6)+
  * do OM32
   move.w (a0),d0        stheta
   muls   (a5),d0        stheta*cgamma
   move.w (a1),d1        ctheta
   muls   (a2),d1        ctheta*sphi
   lsl.l  #2,d1
   swap   d1
   muls   (a4),d1        ctheta*sphi*sgamma
   add.l  d1,d0
   lsl.l  #2,d0
   swap   d0
   move.w d0,(a6)+
  * do OM33
   move.w (a1),d0        ctheta
   muls   (a3),d0        ctheta*cphi
   lsl.l  #2,d0
   swap   d0
   move.w d0,(a6)+
  *****************************************************************************************
  * PART 2: transform object coords to world coords. Matrix elements are scaled by 2^14   *
  * and the products must be adjusted (divided by 2^14) when we're finished.
   move.w oncoords,d7  number
   ext.l  d7           any to do ?
   beq    otranw3
   subq.w #1,d7        adjust counter for dbra
   lea    ocoordsx,a0
   lea    ocoordsy,a1
   lea    ocoordsz,a2
   lea    wcoordsx,a3
   lea    wcoordsy,a4
   lea    wcoordsz,a5
   exg    a3,d3        save address(  not enough a regs!!)
   link a6,#-6         stack frame of 3 words

  otranw1
   moveq.l #2,d6       3 rows in the matrix
   lea     o_wmatx,a3  point at matrix
  * calculate the next wx,wy and wz
  otranw2
   move.w (a0),d0      ox
   move.w (a1),d1      oy
   move.w (a2),d2      oz
   muls   (a3)+,d0     ox*MI1
   muls   (a3)+,d1     oy*MI2
   muls   (a3)+,d2     oz*MI3
   add.l  d1,d0
   add.l  d2,d0
   lsl.l  #2,d0
   swap   d0
   move.w d0,-(a6)     save it
   dbra   d6,otranw2   repeat for three elements

   move.w (a6)+,d0
   add.w  Ooz,d0       add displacement
   move.w d0,(a5)+     becomes wz
   move.w (a6)+,d0
   add.w  Ooy,d0
   move.w d0,(a4)+     becomes wy
   exg    a3,d3        restore wx, save matrix pointer
   move.w (a6)+,d0
   add.w  Oox,d0
   move.w d0,(a3)+     becomes wx
   exg    a3,d3        save wx restore matrix pointer
   addq.l #2,a0        point to next ox
   addq.l #2,a1        oy
   addq.l #2,a2        oz
   dbra   d7,otranw1   repeat for all coords
   unlk   a6
  otranw3
   rts
  * bss_03.s
  *
    include bss_02.s
  * Object frame variables
  otheta    ds.w    1     rotation of object coords about wx
  ophi      ds.w    1     ditto wy
  ogamma    ds.w    1     ditto wz
  ocoordsx  ds.w    200   vertex x coords
  ocoordsy  ds.w    200   ditto y
  ocoordsz  ds.w    200   ditto z
  oncoords  ds.w    1     number
  Oox       ds.w    1     object origin x in world frame
  Ooy       ds.w    1     ditto y
  Ooz       ds.w    1     ditto z

  * World frame variables
  wcoordsx  ds.w    200
  wcoordsy  ds.w    200
  wcoordsz  ds.w    200

  * Variables for the o_w transform
  o_wmatx   ds.w    9     the matrix elements

  * General
  screenflag  ds.w 1      0 display screen 1, 1 for screen 2
  stheta      ds.w 1       trig functions of current angle
  ctheta      ds.w 1
  sphi        ds.w 1
  cphi        ds.w 1
  sgamma      ds.w 1
  cgamma      ds.w 1

7. Hidden Surfaces and Illumination

A computer is a fast number cruncher, but it doesn’t know anything about the real world. When it comes to conveying simple everyday experiences like not being able to see through solid opaque objects, the computer is a real loser. There are no codes in the processor instruction set which allow us to easily convey such information. It seems obvious to us that the rear sides of opaque objects are not visible and that an opaque object will obscure those behind it. Making the computer show this simple fact of life is hard work. It is called the hidden surface problem and it is the basis of some very time-consuming algorithms in computer graphics.

For any micro without dedicated graphics hardware, this becomes a severe problem since the burden of computation falls on the main processor, and of necessity therefore, any strategy we adopt to deal with hidden surfaces cannot be too time consuming. As a consequence, the geometry of the objects themselves cannot be so complex as to require a time consuming hidden surface algorithm. The simplest solution is to require that all polyhedra be convex, i.e. each surface polygon looks outward and not towards another polygon. It is possible to deal with simple polyhedra which are not convex but we shall only consider ones which are convex. It is always possible to construct complex objects out of several convex polyhedra and the strategy then is to draw the furthest ones first and the nearest ones last. This is the so called ‘painter’ algorithm by which objects in the background are naturally obscured by those in the foreground. More of this later.

The procedure for deciding whether a surface is visible combines naturally with the calculation to decide how brightly it is illuminated by a distant light source, a necessary attribute if the object is to look real. Surfaces which face towards the light source must be brighter than those which face away. We shall combine both of these into a single algorithm in this chapter.

7.1. Hidden Surface Removal

In the simple strategy for convex polyhedra adopted here, deciding whether a surface is visible requires a substantial amount of vector algebra (which can be minimised by pre-calculating certain surface parameters). The procedure is straightforward: a polygonal surface is visible if it faces the view point. The problem is how to convert the word “faces” into a mathematical expression. This is done in the following way.

Each surface has associated with it a vector which points out at right angles from the surface so that the polyhedron as a whole looks like a porcupine. All such vectors have the same length, which is chosen to be unity. They are called surface normal unit vectors. The only difference between two unit vectors is their direction, which reflects the different directions in which the surfaces face as shown in Figure 7.1. Of course, for the purposes of calculation, 1 is not a useful size for a vector and so it is multiplied by the factor 16384 (2^14). This keeps quantities within word size and makes multiplication and division simple.

figure 07 01
Figure 7.1 A convex polyhedron showing surface normal vectors

Seeing whether a surface is visible from the view point now consists of testing whether its unit vector points in the same or the opposite direction to a vector (the view vector) drawn from the view point to the surface. There is a basic vector product which performs this test. It is called the scalar or dot product. Appendix 6 explains products involving vectors. In the language of mathematics, where the view vector is V and the surface normal vector is n, the scalar product will yield a positive result if the surface is hidden and a negative result if it is visible:

hidden: scalar product V.n is positive

visible: scalar product V.n is negative.

The scalar product itself is really nothing more than the distance from the view point to the surface times the cosine of the angle between the view vector and the surface normal. The sign of the product naturally follows therefore from the fact that the cosine of an angle less than 90° is positive whereas the cosine of an angle between 90° and 180° is negative. Figure 7.2 shows the directions of the vectors for a visible and a hidden surface. All this is very satisfactory except for one thing; the surface normal unit vector must be calculated and that is not so simple. Here the unit vector is calculated in view frame coordinates.

figure 07 02
Figure 7.2 Visible and hidden surfaces
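The visibility test can be sketched in a few lines. The following Python fragment is an illustration only, not the book's 68000 routine; the function names `dot` and `is_visible` and the sample coordinates are assumptions made for the example. It follows the convention given above: the view vector runs from the view point to a vertex on the surface, and a negative scalar product means the surface is visible.

```python
def dot(a, b):
    """Scalar (dot) product of two 3-component vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def is_visible(viewpoint, vertex, normal):
    """Return True if the surface faces the view point.

    The view vector V runs from the view point to a vertex on the surface.
    The surface is visible when V.n is negative.
    """
    view_vector = tuple(p - v for p, v in zip(vertex, viewpoint))
    return dot(view_vector, normal) < 0


# View point on the -z axis, as in the example program (vwpointz = -100).
viewpoint = (0, 0, -100)
front_normal = (0, 0, -16384)  # points towards the viewer, scaled by 2^14
back_normal = (0, 0, 16384)    # points away from the viewer

print(is_visible(viewpoint, (0, 0, 50), front_normal))   # True
print(is_visible(viewpoint, (0, 0, 150), back_normal))   # False
```

Only the sign of the product is used, so the scaling of the normal by 2^14 has no effect on the outcome.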

As a brief digression, it’s worth mentioning that the test for visibility can be done without any reference to vector products. The way that data lists have been set up, with the list of edge connections of a polygon going clockwise when viewed from the front, can be used to give a simple test for visibility. When converted to screen coordinates by the perspective transform, visible polygons have their edge list going anticlockwise. Projected polygons with clockwise screen edge lists will therefore have come from polygons facing away from the screen and which should be hidden. A test for this can easily be constructed.
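One way such a test might be constructed is with the signed area of the projected polygon. The sketch below is an assumption about how it could be done, not code from the example programs; the names `signed_area_twice` and `is_anticlockwise` are invented here. Which sign corresponds to "anticlockwise" depends on whether the screen y axis points up or down; the sketch assumes y points up, so the test must be inverted for a display whose y coordinate increases downwards.

```python
def signed_area_twice(points):
    """Twice the signed area of a 2D polygon (the 'shoelace' sum).

    With the y axis pointing up, the result is positive for an
    anticlockwise vertex order and negative for a clockwise one.
    """
    total = 0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap round to the first vertex
        total += x1 * y2 - x2 * y1
    return total


def is_anticlockwise(points):
    """True if the vertex list runs anticlockwise (y axis pointing up)."""
    return signed_area_twice(points) > 0


square_acw = [(0, 0), (10, 0), (10, 10), (0, 10)]
square_cw = list(reversed(square_acw))

print(is_anticlockwise(square_acw))  # True
print(is_anticlockwise(square_cw))   # False
```

The test needs only integer multiplications and additions on screen coordinates, but it yields no normal vector that could be reused for illumination.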

We choose to use the scalar product here because the normal unit vectors, once calculated, can also be used to determine the level of illumination of each surface.

7.2. Calculating the Surface Normal Unit Vector

The procedure to calculate the normal unit vectors requires quite a lot of vector algebra and time consuming multiplications. It can be minimised by working out some relevant quantities beforehand and storing the data in a list in the usual way. In fact the normal vectors themselves could be completely worked out in the object frame and transformed together with the vertices at each stage. There are substantial advantages to doing it this way.

Instead, we choose to calculate the vectors in view frame coordinates because of the way it fits in nicely with the evolution of our program and the tutorial objective of the book. The particular vector product which allows us to calculate the normal vector is called a cross product. It’s more difficult to understand than the scalar product but it’s precisely what we want. Appendix 5 also covers this topic.

A vector product is illustrated in Figure 7.3 for a single polygon. Going round the perimeter of the polygon, the first two edges we meet are from vertices 1 to 2 and 2 to 3. Let us call the vectors associated with these edges A12 and A23. The normal vector B is then calculated as the cross product between them:

B = A23 x A12.

This shorthand notation is all fairly meaningless until translated into a set of mathematical operations. The x, y and z components of A12 and A23 are:

A12x = x2-x1, A12y = y2-y1, A12z = z2-z1

A23x = x3-x2, A23y = y3-y2, A23z = z3-z2

and the components of B are:

Bx = A12z.A23y-A12y.A23z

By = A12x.A23z-A12z.A23x

Bz = A12y.A23x-A12x.A23y

figure 07 03
Figure 7.3 The vector product of two vectors

These multiplications constitute the bulk of the calculation.

There is one final step. What we want is the unit vector. The vector B is in the right direction but its size is too large. To get the unit vector, each of the components must be divided by the magnitude of B. This provides an additional chore because the magnitude of B is calculated from:

B = √(Bx^2+By^2+Bz^2)

which requires taking a square root. How this is done is explained in the example program.

Once the magnitude B has been calculated, the components of the unit vector are

bx = Bx/B, by = By/B, bz = Bz/B.
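The whole calculation, from three vertices to a scaled unit normal, can be sketched as follows. This is a floating-point illustration in Python rather than the word arithmetic of the example program; the function name `surface_normal` and the constant `SCALE` are assumptions introduced here, and the library square root stands in for the iterative routine described later.

```python
import math

SCALE = 1 << 14  # unit vectors are stored multiplied by 2^14 = 16384


def surface_normal(v1, v2, v3):
    """Unit normal from the first three vertices of a polygon.

    Returns integer components scaled by 2^14.
    """
    a12 = [v2[i] - v1[i] for i in range(3)]  # edge from vertex 1 to 2
    a23 = [v3[i] - v2[i] for i in range(3)]  # edge from vertex 2 to 3
    # B = A23 x A12, written out component by component
    bx = a12[2] * a23[1] - a12[1] * a23[2]
    by = a12[0] * a23[2] - a12[2] * a23[0]
    bz = a12[1] * a23[0] - a12[0] * a23[1]
    magnitude = math.sqrt(bx * bx + by * by + bz * bz)
    if magnitude == 0:
        raise ValueError("first three vertices lie on one line")
    return tuple(int(round(SCALE * c / magnitude)) for c in (bx, by, bz))


# A polygon lying in the x-y plane has its normal along the z axis.
print(surface_normal((0, 0, 0), (0, 10, 0), (10, 10, 0)))  # (0, 0, 16384)
```

The guard against a zero magnitude is an addition for safety; it matters only if the first three vertices of a polygon are collinear.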

The line-of-sight vector (view vector) from the view point to the first vertex of the surface in the edge list is then found and its scalar product taken with the normal vector. On the basis of this test, the surface is either flagged as hidden or else its level of illumination is calculated. We discuss illumination next.

7.3. Illumination and Colour

It is possible to employ the most elaborate computations to construct geometrically accurate 3D models, and yet the attributes which make them look real may be very subtle and less obvious. In sprite graphics, the shadow on the ground which follows the motion of a projectile is a small but essential clue to its altitude. In 3D, one of the easiest and most dramatic ways to add realism to a model is illumination by a light source. Facets which face the light source are more brightly illuminated than those which face away. As the object changes its orientation, so the changes in illumination give additional visual clues to its shape and structure. This is what we shall try to simulate next. There are limitations to what can be achieved on the Amiga, not so much a consequence of software constraints, but mainly resulting from the way colour is implemented in the colour palette. The way in which illumination is determined is very similar to the way visibility is tested for, but in this case an actual number must be generated, depending on the angle of the surface to the light source.

The direction of the beam of light emanating from a light source is specified by a vector, called the illumination vector. It would be possible to simulate a diverging or converging beam by having this vector change its direction across the field of illumination, but for simplicity the beam is taken to be parallel. Consequently a single vector is sufficient to define the direction of the beam. Likewise, the intensity of the light is taken to be constant everywhere. These approximations are valid for a distant light source such as the Sun, and the difference for a near light source is hardly noticeable. This illumination vector is also a unit vector (i.e. it has a magnitude of unity).

Because we have already calculated the surface normal unit vectors, everything is set up to find the level of illumination of each facet on the surface. Figure 7.4 illustrates the calculation. It is nothing more than the scalar product of the illumination vector and the normal vectors. This is a realistic calculation since the level of illumination does depend on the cosine of the angle between the two vectors.

There is one minor modification we will use in the calculation. Consider how the earth is illuminated by the Sun: the side which faces the Sun is brightly lit but the side which faces away would be pitch black if it weren’t for the reflected light of the Moon (forgetting the light from the stars). In a room a single light source is sufficient to illuminate everything, though much of this is back-reflected light from the walls and all the objects in the room. This is the basis of the Radiosity method of illumination calculation which is used in very advanced graphics to simulate realism to a high degree. We can incorporate a very rudimentary version of this into our method, using the scalar product to set an illumination level even where it is negative, so there is some illumination even on the dark side of objects.

figure 07 04
Figure 7.4 Surface illumination

Here then is the method in outline: for each surface, take the scalar product of the illumination vector with the normal unit vector; since both vectors are 1 in magnitude, this will yield a result between +1 (minimum illumination) and -1 (maximum illumination). If you’re confused by the sign, remember in our geometry the illumination vector points away from the light source. Since in our method all unit vectors are multiplied by 2^14 (16384), the scalar product will actually yield a result somewhere in the range -2^28 to +2^28. Adding 2^28 to this result and dividing by 2^24 (by right shifting) reduces this to the range 0 to 32, which the program clamps to 0 to 31. This result can then be used to index 32 different colour shades. How this is done requires a brief explanation of the colour table again.
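The outline can be followed through with a short sketch. The Python below is an illustration of the arithmetic described in the text, not a transcription of the assembly routine; the function name `illumination_level` is an assumption, and the clamp to 31 keeps the result within the 32 available shades.

```python
def illumination_level(ill_vec, normal):
    """Shade level for a visible surface: 0 is brightest, 31 is darkest.

    Both arguments are unit vectors with integer components scaled by 2^14.
    """
    product = sum(i * n for i, n in zip(ill_vec, normal))  # -2^28 .. +2^28
    level = (product + (1 << 28)) >> 24                    # 0 .. 32
    return max(0, min(level, 31))                          # clamp to 0 .. 31


light = (0, -16384, 0)  # beam travelling from +y towards -y

print(illumination_level(light, (0, 16384, 0)))   # 0, surface faces the light
print(illumination_level(light, (16384, 0, 0)))   # 16, surface edge-on
print(illumination_level(light, (0, -16384, 0)))  # 31, surface faces away
```

A surface facing directly away from the light still receives the darkest shade rather than black, which is the rudimentary back-lighting mentioned above.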

7.3.1. The Colour Table

In low resolution 32 different colours can be displayed simultaneously out of a possible 4096. This selection of 32 is called the colour table or palette. There are tricks to exceed 32 for the screen as a whole by changing the colour palette frequently whilst a picture is being drawn (during the horizontal blank, for example). We will use the basic 32. For what follows Figure 7.5 will be of assistance. The standard palette settings are listed in the file data_01.s.

figure 07 05
Figure 7.5: Generation of colours for the colour table

Basically, a colour is made by combining red, green and blue, each in any one of sixteen intensities. This means there are 16x16x16 = 4096 possible combinations. At any one time 32 of these 4096 colours can be displayed on the screen simultaneously. Why 32? Because there are 5 colour planes in low resolution, as we have seen in Chapter 3, and each plane is represented by a bit so that up to 32 combinations are available. The 5-bit value of the colour is used to index a ‘pot’ from the colour palette which contains the word number of the colour.

All that remains is to find out how to generate the colour word in the palette from the red, green and blue settings. In fact the colours follow exactly as they are presented when written in hexadecimal. A setting of $0fff (white) means red = $f, green = $f and blue = $f. If you want to write them in decimal, the recipe is:

colour value = 256*(red setting) + 16*(green setting) + 1*(blue setting)
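As a quick check of the recipe, here is a small Python sketch. The function name `colour_word` is an assumption made for this illustration; the range check simply enforces the sixteen intensities (0 to 15) available to each of red, green and blue.

```python
def colour_word(red, green, blue):
    """Combine three 4-bit settings (0..15) into a 12-bit palette word."""
    for setting in (red, green, blue):
        if not 0 <= setting <= 15:
            raise ValueError("each setting must lie in the range 0 to 15")
    return 256 * red + 16 * green + 1 * blue


print(hex(colour_word(15, 15, 15)))  # 0xfff, white
print(hex(colour_word(15, 0, 0)))    # 0xf00, full red
print(hex(colour_word(0, 0, 8)))     # 0x8, half-intensity blue
```

Stepping one setting while holding the other two fixed generates a run of shades of a single colour, which is what the shading modes below rely on.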

The chosen colours must then be loaded into the palette. That is what is done in the example program. For our purposes, in order to simulate lighting, the colours will be different shades of the same colour. There is obviously going to be a trade off here. With a maximum of 32 colours the following combinations are considered here:

mode 0 32 shades of one colour

mode 1 16 shades of 2 colours

mode 2 8 shades of 4 colours

mode 3 4 shades of 8 colours

The one we will use is mode 2, though the others are catered for in the software.

7.4. Example Programs

The example programs show the A monolith in rotation with hidden surface removal and illumination. The program is set up with rotation about the x axis but this can be altered as desired. The monolith is coloured in red and blue but can be changed to green and white by changing its intrinsic colours as described below. It is also good fun to set up alternative palettes in different colours following the colour recipe above.

7.4.1. illhide.s

This is the control program. It still uses the data for the A monolith to display it rotating about any, or all three of the object frame axes. Because we now have hidden surface removal, it doesn’t matter if the angles become large enough to display the back. Nothing will be displayed because the back is hidden. The program is set for rotation about the x-axis of the object frame.

The colour palette has been set up to use 7 shades of blue and 8 shades of red, 8 of green and 8 of white. The first colour in the palette has the value 0 which is black and is used by the system to provide the background. The shading mode is flexible and is set up by means of a key, called illkey, which has a value equal to the mode number, above. The program is set up in mode 2.

7.4.2. core_04.s

This calculates surface normal vectors, determines whether a surface is visible and if so calculates the level of illumination and the final palette colour as outlined in the text. Because of the limitations of word multiplication in the calculation of normal vectors, objects are restricted to linear dimensions of less than about 200.

First of all the surface normal vectors are calculated as described above. In the subroutine nrm_vec the normal vector is converted to a unit vector by dividing each of its components by the magnitude of the vector. The magnitude is calculated by Pythagoras’ theorem in 3D and requires a square root operation which is done in the subroutine sqrt by an iterative process.

The square root algorithm works in the following way. Suppose the square root of a number, N, is known approximately; call it sqrt1. Then a better approximation, sqrt2, can be found by dividing the number by sqrt1, adding this to sqrt1 and dividing by 2, i.e.

sqrt2 = 1/2(sqrt1 + N/sqrt1).

sqrt2 is a better approximation than sqrt1. Then starting with sqrt2 an even better approximation, sqrt3, can be found in the same way. Each one of these recalculations is called an iteration. Starting with a modest approximation, only three iterations are needed in the routine to calculate a square root accurate to 1 part in 2^16, i.e. as accurate as a word will allow.
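The iteration is easy to try out. The Python sketch below mirrors the idea of the routine (a starting value taken from the position of the highest set bit, followed by three iterations) but is not a line-by-line translation of it; the name `approx_sqrt` and the exact starting value of 2 raised to half the bit position are assumptions. Python integers do not overflow, so the word-size limits of the 68000 divide instruction do not arise here.

```python
def approx_sqrt(n, iterations=3):
    """Approximate integer square root by repeated averaging.

    Starts from 2^(k/2), where k is the position of the highest set bit
    of n, then applies new = (old + n/old)/2 the given number of times.
    Returns 0 for n <= 0.
    """
    if n <= 0:
        return 0
    highest_bit = n.bit_length() - 1
    guess = 1 << (highest_bit // 2)
    for _ in range(iterations):
        guess = (guess + n // guess) // 2
    return guess


print(approx_sqrt(1000000))  # 1000
print(approx_sqrt(1 << 28))  # 16384
```

For 1000000 the successive trial values are 512, 1232, 1021 and 1000, which shows how quickly the estimate settles.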

The line-of-sight vector used to determine visibility in vis_ill is taken from the view point to the first vertex on a surface. There is no ambiguity here since at the point where a surface just ceases to be visible all vertices give a line-of-sight vector perpendicular to the surface normal. The illumination vector is specified by its components ill_vecx, ill_vecy and ill_vecz each multiplied by 2^14 for accuracy, as usual.

If a surface is invisible, the illumination is set to the flag value $20, one more than the highest shade level, $1f. Otherwise the intrinsic colour, 0, 1, 2 or 3 in mode 2 (the mode used here), is then combined with the shading to produce a number to index the colour palette. This is a tricky calculation and best understood by following the algorithm through.

Specifically, let’s look at the case when the colour is 0 or 1, so that the shades from 1 to 7 (blue) (0 is reserved for black, the background) or from 8 to 15 (red) are selected. The actual shading level then fixes which colour in the group is chosen, with the lightest being 1 (blue) and 8 (red) and the darkest being 7 (blue) and 15 (red).
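The combination of shade and intrinsic colour can be followed through in a few lines of Python. This is a sketch of the logic as read from the set_colr routine in the listing, with ordinary shifts standing in for the 68000 rotate; the names `palette_index` and `HIDDEN` are assumptions made for the illustration.

```python
HIDDEN = 0x20  # flag value passed on for a hidden surface


def palette_index(illumination, intrinsic_colour, illkey=2):
    """Palette entry for a surface, given its shade level (0..31),
    its intrinsic colour number and the shading mode (illkey, 0..3)."""
    if illumination > 0x1F:
        return HIDDEN                          # hidden: nothing to draw
    shade = illumination >> illkey             # fewer shades per colour
    base = intrinsic_colour << (5 - illkey)    # first entry of the group
    index = base + shade
    return index if index > 0 else 1           # entry 0 is the background


print(palette_index(0, 0))     # 1, lightest blue (0 would be background)
print(palette_index(31, 0))    # 7, darkest blue
print(palette_index(0, 1))     # 8, lightest red
print(palette_index(31, 1))    # 15, darkest red
print(palette_index(0x20, 1))  # 32, the hidden flag
```

In mode 2 the shade level is divided by 4, giving 8 shades, and the colour number is multiplied by 8 to select the group, which reproduces the ranges 1 to 7 and 8 to 15 quoted above.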

7.4.3. data_04.s

This contains the illumination vector components, which in this example define a light source shining from right to left in the view frame. This is clearly no good in general since the light source should be fixed in the world frame and transformed like everything else to the view frame.

Following this are the intrinsic colours (blues, reds, greens or greys in this case) corresponding to the four possibilities, 0, 1, 2 or 3 in mode 2. The colours for the palette are listed in hexadecimal as explained above.

7.4.4. bss_04.s

Additional variables.

  * illhide.s
  * A program illustrating illumination and hidden surface removal
  *

  *SECTION TEXT
    opt   d+
    bra   main
    include systm_00.s
    include core_04.s       illumination, hidden surface removal

  main  bsr init            set up memory and new palette, etc

  * transfer all the data from my lists to progress lists
        bsr transfer
  * place it in the world frame
        move.w  #0,Oox      on the ground
        move.w  #100,Ooz    100 in front
        clr.w   Ooy         dead centre
  * Initialise angles for rotation
        clr.w   otheta
        move.w  #50,ophi    tilt it forward
        clr.w   ogamma
  * Initialise screens
        clr.w   screenflag  0=screen 1 draw, 1=screen 2 draw

  * Start the rotation about the xw axis
  loop5 move.w  #360,d7     a cycle
  loop4 move.w  d7,otheta   next theta
        move.w  d7,-(sp)    save the angle
        tst.w   screenflag  screen 1 or screen 2?
        beq     screen_1    draw on screen 1, display screen2
        bsr     drw2_shw1   draw on screen 2, display screen1
        clr.w   screenflag  and set the flag for next time
        bra     screen_2
  screen_1:
        bsr     drw1_shw2   draw on 1,display 2
        move.w  #1,screenflag and set the flag for next time
  screen_2:
        bsr     otranw        object-to-world transform

  * pass on the new coords
        move.w  oncoords,d7
        move.w  d7,vncoords
        subq.w  #1,d7
        lea     wcoordsx,a0
        lea     wcoordsy,a1
        lea     wcoordsz,a2
        lea     vcoordsx,a3
        lea     vcoordsy,a4
        lea     vcoordsz,a5
  loop6 move.w  (a0)+,(a3)+
        move.w  (a1)+,(a4)+
        move.w  (a2)+,(a5)+
        dbra    d7,loop6

  * Test for visibility and lighting
      bsr       illuminate      if it is visible find the shade

  * Complete the drawing
      bsr       perspective     perspective
      bsr       polydraw        finish the picture
      move.w    (sp)+,d7
      sub.w     #10,d7          decrement in 10 degree steps
      bgt       loop4
      bra       loop5

  * SECTION DATA
      include   data_01.s
      include   data_03.s
      include   data_04.s
  * SECTION BSS
      include   bss_04.s
  END
  *****************************************************************************************
  * Core_04.s
  *
  *****************************************************************************************
   include core_03.s

  illuminate
  calc_nrm
   move.w npoly,d7
   beq    nrm_out
   subq   #1,d7           counter
   lea    vcoordsx,a0
   lea    vcoordsy,a1
   lea    vcoordsz,a2
   lea    sedglst,a3
   lea    snedges,a4
   lea    snormlst,a5
  * calculate the surface normal unit vectors
  next_nrm
   move.l a5,-(sp)       save pointer to normals list
   move.w (a3),a5        first vertex of next surface
   move.w 2(a3),a6       second vertex
   add    a5,a5          *2 for offset
   add    a6,a6          again
   move.w 0(a0,a6.w),d1  x2
   sub.w  0(a0,a5.w),d1  x2-x1 = A12x
   move.w 0(a1,a6.w),d2  y2
   sub.w  0(a1,a5.w),d2  y2-y1 = A12y
   move.w 0(a2,a6.w),d3  z2
   sub.w  0(a2,a5.w),d3  z2-z1 = A12z
   move   a6,a5
   move.w 4(a3),a6       third vertex
   add    a6,a6          *2 for offset
   move.w 0(a0,a6.w),d4  x3
   sub.w  0(a0,a5.w),d4  x3-x2 = A23x
   move.w 0(a1,a6.w),d5  y3
   sub.w  0(a1,a5.w),d5  y3-y2 = A23y
   move.w 0(a2,a6.w),d6  z3
   sub.w  0(a2,a5.w),d6  z3-z2 = A23z
   movea.w d2,a5         save
   muls   d6,d2
   movea.w d3,a6         save
   muls   d5,d3          ditto
   sub.l  d2,d3          Bx
   move.l d3,-(sp)       save to stack
   move.w a5,d2          restore
   move.w a6,d3          restore
   movea.w d3,a5         save
   muls   d4,d3
   movea.w d1,a6
   muls   d6,d1
   sub.l  d3,d1          By
   move.l d1,-(sp)       save it
   move.w a6,d1          restore
  * last component -  no need to save values
   muls   d5,d1
   muls   d4,d2
   sub.l  d1,d2          Bz
   move.l d2,-(sp)       save it
   movem.l (sp)+,d4-d6   Bx in d6, By in d5 and Bz in d4
  nrm_cmpt
   lsr.l  #2,d4          /4 to prevent overspill
   lsr.l  #2,d5
   lsr.l  #2,d6
   move.w d4,d0
   move.w d5,d1
   move.w d6,d2
   move.l d7,-(sp)       save
   bsr    nrm_vec        calculate unit vectors bx, by, bz
   move.l (sp)+,d7       restore
   move.w d0,d4
   move.w d1,d5
   move.w d2,d6
   move.l (sp)+,a5       restore pointer to normals list
   move.w d6,(a5)+       save nx
   move.w d5,(a5)+       save ny
   move.w d4,(a5)+       save nz
   move.w (a4)+,d0       num vertices in this surface
   add    #1,d0          edge list always repeats the first
   add    d0,d0          *2 for offset
   adda.w d0,a3          adjust pointer to next surface
   dbra   d7,next_nrm    do all surfaces
  nrm_out
  vis_ill
  * Find visibility and level of illumination of surface by taking the scalar
  * product of the surface normal vector with the line of sight vector from viewpoint
  * and illumination respectively.
   move.w npoly,d7
   subq.w #1,d7
   lea    vcoordsx,a0
   lea    vcoordsy,a1
   lea    vcoordsz,a2
   lea    sedglst,a4
   lea    snedges,a3
   lea    snormlst,a5
   lea    slumlst,a6
   move.w ill_vecx,d0
   move.w ill_vecy,d1
   move.w ill_vecz,d2
  * line of sight vector is taken between the first vertex on the surface and viewpoint
  next_ill
   move.w (a4),d6        1st point on next surface
   add    d6,d6          *2 for offset
   move.w 0(a0,d6.w),d3  is line of sight x cmpnt, xls
   move.w 0(a1,d6.w),d4  yls
   move.w 0(a2,d6.w),d5  z
   sub.w  vwpointz,d5     zls: vpoint lies on -zv axis
   muls   (a5),d3        nx*sx
   muls   2(a5),d4       ny*sy
   muls   4(a5),d5       nz*sz
   add.l  d4,d3
   add.l  d5,d3          scalar product
   bmi    visible        negative if surface visible
  * it is hidden
   move.w #$20,(a6)+     set illumination for hidden
  ill_tidy
   addq.w #6,a5          update normals pointer
   move.w (a3)+,d5       current num edges
   addq   #1,d5          first vertex is repeated
   add    d5,d5          2 bytes per word
   adda.w d5,a4          update edge list pointer
   dbra   d7,next_ill
   bra    set_colr
  * The surface is visible so find illumination level.
  visible
   move.w d0,d3          copy illum vector
   move.w d1,d4
   move.w d2,d5
   muls   (a5),d3        nx*illx
   muls   2(a5),d4       ny*illy
   muls   4(a5),d5       nz*illz
   add.l  d4,d3
   add.l  d5,d3          -2^28<scalar prod <+2^28
   add.l  #$10000000,d3  add 2^28: 0 < scalar prod < 2^29
   move.w #24,d4
   lsr.l  d4,d3
   cmp.w  #$1f,d3        keep in range 0 to $1f
   ble    vis_1          correct
   move.w #$1f,d3        for
   bra    ill_save       errors
  vis_1
   cmp.w  #0,d3
   bge    ill_save
   clr    d3
  ill_save
   move.w d3,(a6)+      save it
   bra    ill_tidy      next ...
  *
  *
  set_colr
   move.w npoly,d7
   subq.w #1,d7
   move.w illkey,d0     how many shades per colour
   lea slumlst,a0       levels of illumination
   lea srf_col,a1
   lea col_lst,a2       colour for display
   move.w #5,d6
   sub.w d0,d6          5-illkey
  next_col
   move.w (a0)+,d1      next illumination
   cmp.w #$1f,d1        is it hidden
   ble    set_col        no
   move.w #$20,(a2)+    it is, set flag
   addq.l #2,a1         point to next intrinsic colour
   bra set_next
  set_col
   lsr.w  d0,d1          shift by illkey: divide by 1, 2, 4 or 8
   move.w (a1)+,d2       the intrinsic colour
   rol.b  d6,d2          0 or 0, 16 or 0,8,16,24 = base
   add.w  d1,d2          illumination + colour base
   bgt    pass_col
   move.w #1,d2         avoid background
  pass_col
   move.w d2,(a2)+      = final colour
  set_next
   dbra d7,next_col
   rts

  *****************************************************************************************
  transfer
   move.w my_npoly,d7
   move.w d7,npoly
   subq.w #1,d7           counter
   move.w d7,d0
   lea my_nedges,a0
   lea snedges,a1
   lea intr_col,a2        intrinsic colours
   lea srf_col,a3         program    intrinsic colours
  loop0
   move.w (a0)+,(a1)+     transfer edge numbers
   move.w (a2)+,(a3)+     transfer intrinsic colours
   dbra d0,loop0
  * calculate the number of vertices altogether
   move.w d7,d0
   lea my_nedges,a6
   clr d1
   clr d2
  loop1
   add.w (a6),d1
   add.w (a6)+,d2
   addq #1,d2
   dbra d0,loop1
  * move the edge list
   subq #1,d2             counter
   lea my_edglst,a0
   lea sedglst,a1
  loop2
   move.w (a0)+,(a1)+
   dbra d2,loop2
  * and the coords list
   move.w d1,oncoords
   subq.w #1,d1
   lea ocoordsx,a1
   lea my_datax,a0
   lea ocoordsy,a3
   lea my_datay,a2
   lea ocoordsz,a5
   lea my_dataz,a4
  loop3
   move.w (a0)+,(a1)+
   move.w (a2)+,(a3)+
   move.w (a4)+,(a5)+
   dbra d1,loop3
  * and the window limits
   move.w my_xmin,clp_xmin
   move.w my_xmax,clp_xmax
   move.w my_ymin,clp_ymin
   move.w my_ymax,clp_ymax
   rts
  *****************************************************************************************
  * normalise a vector: unnormalised components in d0,d1,d2
  * return normalised components (scaled by 2^14) in d0,d1,d2
  nrm_vec
  * save the components, then square them
   move d0,d3
   move d1,d4
   move d2,d5
   muls d0,d0
   muls d1,d1
   muls d2,d2
  * sum of squares
   add.l d1,d0
   add.l d2,d0
  * calculate the magnitude
   bsr sqrt
  * multiply the components by 2^14
   move.w #14,d7
   ext.l d3
   ext.l d4
   ext.l d5
   lsl.l d7,d3
   lsl.l d7,d4
   lsl.l d7,d5
  * divide by magnitude to derive normalised components
   divs d0,d3
   divs d0,d4
   divs d0,d5
  * return normalised components
   move.w d3,d0
   move.w d4,d1
   move.w d5,d2
   rts
  *****************************************************************************************
  * Find the sqrt of a long word N in d0 in three iterations: sqrt=1/2(squrt+N/squrt)
  * approximate starting value found from highest bit in d0: Result passed in d0.W
  sqrt
   tst.l d0
   beq sqrt2        quit if zero
   move.w #31,d7    31 bits to examine
  sqrt1
   btst d7,d0       is this bit set?
   dbne d7,sqrt1
   lsr.w #1,d7      bit is set: 2^(d7/2) approx root
   bset d7,d7       raise 2 to this power
   move.l d0,d1
   divs d7,d1       N/squrt
   add d1,d7        squrt+N/squrt
   lsr.w #1,d7      /2 gives new trial value
   move.l d0,d1     N
   divs d7,d1
   add d1,d7
   lsr.w #1,d7      second result
   move.l d0,d1
   divs d7,d1
   add d1,d7
   lsr.w #1,d7      final result
   move.w d7,d0
  sqrt2
   rts
  *****************************************************************************************
  ******************************************************************************************
  * data_04.s
  ******************************************************************************************
   include data_03.s

  ill_vecx dc.w -100
  ill_vecy dc.w -16384 ;LIGHT SHINING FROM +Y TO -Y
  ill_vecz dc.w 0
  vwpointz dc.w -100
  illkey dc.w 2
  intr_col dc.w 0,1,0,0,1,1

  OTHER_PALETTE EQU 1 ;to use with illumination
  *****************************************************************************************
  *                                     bss_04.s                                          *
  *****************************************************************************************
   include bss_03.s

  * VARIABLES FOR SURFACE ILLUMINATION AND COLOUR
  snormlst ds.w 100
  slumlst  ds.w 40
  srf_col   ds.w 40

8. General Transforms in 3D

In this chapter we investigate a number of transforms of various kinds involved in the manipulation of 3D structures.

8.1. Geometric Transforms

Combinations of simple rotations and displacements are extensively used in the construction of a complex scene consisting of several graphics primitives in different locations and with different orientations. Besides these instance transforms, there are other more exotic distortions that can be used. Structures can be manipulated in a variety of ways:

rotation - a change of orientation,

shear - distortion,

scaling - change in size,

reflection - replacement by a mirror image,

inversion - inside out and back to front.

In general, any 3x3 matrix will produce a combination of scaling and shear. In the special case that all lengths and angles are left unchanged and the object is not mirrored, what results is a pure rotation. Sometimes shears with fixed (simple) matrix elements are used to simulate rotation by fixed angles. The first three of these transforms are illustrated in this chapter, with input and control from the keyboard and joystick.

Transformations of these kinds are easily implemented using matrices and several of them can be combined by concatenation (multiplication) of the individual matrices prior to actually transforming the points. Where a large number of points is concerned, this saves a lot of time compared to performing each transform separately. An example of this is shown in the programs.
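The saving from concatenation can be seen in a small sketch. The Python below is an illustration only and is not taken from the example programs; the helper names `mat_mul` and `transform` and the two sample matrices are assumptions. Points are treated as column vectors, so in a product of matrices the one on the right acts first.

```python
def mat_mul(a, b):
    """Product of two 3x3 matrices: the result applies b first, then a."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transform(m, p):
    """Apply a 3x3 matrix to a point treated as a column vector."""
    return tuple(sum(m[i][k] * p[k] for k in range(3)) for i in range(3))


scale = [[2, 0, 0], [0, 2, 0], [0, 0, 2]]  # double every dimension
shear = [[1, 1, 0], [0, 1, 0], [0, 0, 1]]  # x' = x + y

combined = mat_mul(shear, scale)           # scaling first, then shear

points = [(1, 0, 0), (0, 1, 0), (1, 2, 3)]
one_pass = [transform(combined, p) for p in points]
two_pass = [transform(shear, transform(scale, p)) for p in points]

print(one_pass)              # [(2, 0, 0), (2, 2, 0), (6, 4, 6)]
print(one_pass == two_pass)  # True
```

Each point transformed separately costs two matrix-vector products, eighteen multiplications in all, whereas the concatenated matrix costs nine per point after a single matrix product.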

8.1.1. Rotations

When the joystick is moved or a key is pressed we want to see a corresponding rotation on the screen. In principle, doing this is very simple. For example, a movement of the joystick to the left could cause a positive rotation about the x-axis and a movement to the right could cause a negative rotation. Other joystick movements could produce rotations about other axes. The matrices for simple rotations about the x, y, and z axes have all been listed in Chapter 6.

Following each movement of the joystick, a new set of object vertices could be generated by multiplying the old vertices by the appropriate rotation matrix. In this way the results of the previous rotation would be used as the starting point for the next. The problem with doing this is that errors in the accuracy with which binary arithmetic is done in the transformations accumulate from frame to frame and eventually reduce the picture to chaos. A solution to this problem is to redraw the object each time from a reference position (like the object frame) with information stored in a set of “signposts” (unit vectors again) which have been continuously rotated with the object to keep up with joystick movements. Then the object is only transformed once each time. This method is essential in the viewing transform when the observer is moving freely. This is discussed extensively in the next chapter.

Alternatively, there is a simple way to implement rotations, but with a motion determined by a scheme similar to that involving lines of longitude and latitude, where rotations about the y and x axes are added up separately and finally put together at the end. In this scheme, several movements of the joystick (say) may have taken place both left and right (rotation about the x axis) and up and down (rotation about the y axis) in any order, but only the separate totals are recorded. A single movement of the joystick may correspond to a 1° increment in that direction.

As an example, suppose the total rotation about the y-axis is 40° and the total rotation about the x-axis is 83°. Then the overall rotation is taken to be a single rotation about the y-axis of 40° followed by a single rotation about the x-axis of 83°. Note that this isn’t the same as rotating about the x-axis first and the y-axis second, which gives a different result. The fact that the order is important is a peculiar property of rotations. Since rotations can be written as matrices, it follows that the order of multiplication is important for matrices too.
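The effect of the order can be checked numerically. The Python sketch below uses the usual right-handed rotation matrices acting on column vectors; these sign conventions are an assumption and may differ from those of the matrices listed in Chapter 6, but the conclusion that the two orders disagree does not depend on them. The helper names are invented for the illustration.

```python
import math


def rot_x(degrees):
    """Rotation about the x axis (assumed right-handed convention)."""
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    return [[1, 0, 0], [0, c, -s], [0, s, c]]


def rot_y(degrees):
    """Rotation about the y axis (assumed right-handed convention)."""
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]


def mat_mul(a, b):
    """Product of two 3x3 matrices: the result applies b first, then a."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transform(m, p):
    """Apply a 3x3 matrix to a point treated as a column vector."""
    return tuple(sum(m[i][k] * p[k] for k in range(3)) for i in range(3))


y_then_x = mat_mul(rot_x(83), rot_y(40))  # rightmost matrix acts first
x_then_y = mat_mul(rot_y(40), rot_x(83))

point = (0.0, 0.0, 1.0)
print(transform(y_then_x, point))
print(transform(x_then_y, point))
```

The two printed points differ markedly in their x and y components, although both are still at unit distance from the origin, as they must be for pure rotations.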

Doing a rotation about the y axis first, followed by a rotation about the x axis, does provide a recipe for always getting to the same orientation every time. This is just like finding a position on the globe uniquely using circles of longitude and latitude. The first rotation about the y axis gives the angle of latitude, and the second rotation about the x-axis gives the angle of longitude. This results in a simple scheme to orientate an object but, as we will see, the joystick response seems strange since what happens on the screen depends on the total current angles of rotation.

If this seems confusing then consider the complementary scheme of leaving the object stationary and moving the observer to different orientations at some fixed distance from the object. This is what has been done in the example program in this chapter. Figure 8.1 illustrates what is going on in the world frame. You can imagine a long pole, AB, between the object and the observer, with the observer looking down the pole towards the object. The rotations which take place change the orientation of the pole. In the example program, movement of the joystick left or right changes θ and movement up or down changes φ. We are now dealing with things the other way round to just rotating the object.

figure 08 01
Figure 8.1 Rotating the observer about the object

The observer is at the angles shown in the figure and we have to find out what he/she sees. As drawn, the observer is closest to the vertex C and sees it pretty well head-on, so in the observer’s reference frame (where the pole is horizontal) things appear as in Figure 8.2. How can this view be constructed from knowing only the angles θ and φ, and the distance AB? Like most problems involving rotations it is easier than it looks and has a lot to do with the complementary nature of geometric (moving the object) and coordinate (moving the observer) transforms, which are discussed extensively in Appendix 6.

figure 08 02
Figure 8.2 The view seen by the observer

The problem is solved by finding what rotations of the line AB about the world axes bring it back into line with the zw axis. The sequence of rotations to do this is

  1. rotate about xw by (-θ) bringing it into the xw-zw plane,

  2. rotate about yw by (-φ) bringing it along the zw axis,

(3. rotate about zw by (-γ) to make xw the “up” direction).

This last step is put in parentheses since it is not actually implemented in the program, i.e. there is no “twist” of the observer involved.

If this sequence of rotations is actually applied to the object with the viewer fixed in position along the world frame zw axis, then the overall result is the same. This is precisely what is done in the example program. The sequence of rotations which must be applied to object about its centre are, in order (remember the one at the right acts first):

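$$
\begin{pmatrix} \cos\gamma & \sin\gamma & 0 \\ -\sin\gamma & \cos\gamma & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} \cos\phi & 0 & -\sin\phi \\ 0 & 1 & 0 \\ \sin\phi & 0 & \cos\phi \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & \sin\theta \\ 0 & -\sin\theta & \cos\theta \end{pmatrix}
$$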

which when multiplied (concatenated) out give the single matrix whose elements appear in the program. After transforming all the vertices with this matrix, all that remains to do is to add on the distance AB (also called Ovz) to each z coordinate.
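The concatenation can be checked numerically. The sketch below, in Python for illustration only, multiplies out the three rotations about zw, yw and xw (written as θ, φ, γ matrices in the order given, rightmost acting first) and compares the product with the nine elements as they are assembled term by term in the routine `wtranv_1` of the listing. The helper names, the test angles and the sample vertex are assumptions.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices held as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def trig(theta, phi, gamma):
    """Sines and cosines of the three angles (given in degrees)."""
    angles = [math.radians(a) for a in (theta, phi, gamma)]
    return [math.sin(a) for a in angles], [math.cos(a) for a in angles]

def concatenated(theta, phi, gamma):
    """Gamma matrix times phi matrix times theta matrix."""
    (st, sp, sg), (ct, cp, cg) = trig(theta, phi, gamma)
    m_theta = [[1, 0, 0], [0, ct, st], [0, -st, ct]]
    m_phi = [[cp, 0, -sp], [0, 1, 0], [sp, 0, cp]]
    m_gamma = [[cg, sg, 0], [-sg, cg, 0], [0, 0, 1]]
    return matmul(m_gamma, matmul(m_phi, m_theta))

def program_elements(theta, phi, gamma):
    """The elements WM11..WM33 as built one at a time in wtranv_1."""
    (st, sp, sg), (ct, cp, cg) = trig(theta, phi, gamma)
    return [[cp * cg, st * sp * cg + ct * sg, st * sg - ct * sp * cg],
            [-cp * sg, ct * cg - st * sp * sg, st * cg + ct * sp * sg],
            [sp, -cp * st, ct * cp]]

def to_view(matrix, vertex, ovz):
    """Rotate a vertex, then add the viewing distance Ovz to its z coordinate."""
    x, y, z = (sum(matrix[i][j] * vertex[j] for j in range(3)) for i in range(3))
    return (x, y, z + ovz)

a = concatenated(40, 83, 10)
b = program_elements(40, 83, 10)
largest_difference = max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))
print("largest difference:", largest_difference)
print("sample vertex in view frame:", to_view(a, (50, 0, 0), 150))
```

The two sets of elements agree to within floating-point rounding, which confirms that the listing is the multiplied-out form of the three matrices.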

We will use this particularly simple transform to the observer’s reference frame again in Chapter 10 in a flight simulator where the angles (called Euler angles) can be easily related to joystick movement. It’s OK if you don’t mind the restriction of the way the angles are defined. In general, more freedom may be desired.

8.1.2. Scaling

Scaling is very straightforward. It simply makes the object larger or smaller. The scale change occurs independently along the three axes. For a general scale change, with different scale factors, a, b and c, along the three axes the transformation matrix is

$$
\begin{pmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{pmatrix}
$$

If both b and c are unity and a is greater than unity, then the resulting distortion is a stretch along the x axis. This is what is implemented in the example program. It is shown in Figure 8.3.
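As a small worked example, here is a sketch in Python (for illustration only; the function name and the sample vertex are hypothetical). It also shows why the stretch in the example program is described later as "y and z reduced by 1/2": a stretch of 2 along x followed by an overall size reduction of 1/2 leaves x unchanged and halves the other two coordinates.

```python
def scale(vertex, a, b, c):
    """Apply independent scale factors a, b, c along x, y and z."""
    x, y, z = vertex
    return (a * x, b * y, c * z)

corner = (100, 100, 100)
stretched = scale(corner, 2, 1, 1)            # a = 2, b = c = 1
reduced = scale(stretched, 0.5, 0.5, 0.5)     # overall size reduction

print(stretched)
print(reduced)
```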

8.1.3. Shear

A shear distortion has the effect of displacing one face relative to its opposite. In the simplest case, one of the coordinates is increased in proportion to one of the others. If x increases in proportion to z, the matrix is:

$$
\begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
$$

and both y and z remain unchanged. This is illustrated in Figure 8.4 and included in the example program.

figure 08 03
Figure 8.3 A stretch along the x axis

If x increases in proportion to both y and z the distortion becomes more exotic. This is shown in Figure 8.5 and also included in the example program. The matrix is

$$
\begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
$$
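The effect of the two shear matrices on a single vertex can be sketched as follows (Python, for illustration only; the names and the sample vertex are hypothetical). In both cases only x changes.

```python
# Shear with x increasing in proportion to z.
SHEAR_XZ = [[1, 0, 1],
            [0, 1, 0],
            [0, 0, 1]]

# Shear with x increasing in proportion to both y and z.
SHEAR_XYZ = [[1, 1, 1],
             [0, 1, 0],
             [0, 0, 1]]

def transform(matrix, vertex):
    """Multiply a 3x3 matrix by a column vector."""
    return tuple(sum(matrix[i][j] * vertex[j] for j in range(3)) for i in range(3))

vertex = (10, 20, 30)
print(transform(SHEAR_XZ, vertex))   # x becomes 10 + 30
print(transform(SHEAR_XYZ, vertex))  # x becomes 10 + 20 + 30
```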

figure 08 04
Figure 8.4 A shear in the x direction, proportioned to z

8.2. Instance Transforms

Up till now, although motion has been 3-dimensional, the only structure displayed has been the flat A monolith. Now, six such monoliths are joined together to make an A cube.

Instance transforms are usually taken to mean those changes of orientation and position which set primitives in the world space and we use the term to describe the set of operations which construct the A cube. Once constructed, the cube can be used as a basis to illustrate the transforms we have been discussing.

To construct a cube in this way, a monolith is first laid down in the yw-zw plane and then successively rotated and displaced five more times to make up the other sides.

figure 08 05
Figure 8.5 A shear in the x direction proportioned to both y and z

This is illustrated in Figure 8.6 where the sides are numbered. The angles of rotation and displacements of the six sides are in the lists inst_angles and inst_disp in the data file data_05.s, and are in the order θ, φ, γ and x, y, z.

figure 08 06
Figure 8.6: Construction of an A cube
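The construction can be sketched in Python, for illustration only. The angle and displacement lists are copied from data_05.s; everything else is an assumption: the face is taken as the outer square of the monolith lying at z = 0 in its own frame (the first four entries of my_datax and my_datay), and each side is placed by a standard right-handed rotation about x by θ, then about y by φ, then a displacement. The routine `otranw` itself is not shown in this chapter, so its conventions may differ.

```python
import math

# Outer square of one face in the object frame (assumed from my_datax / my_datay).
FACE = [(100, 0, 0), (100, 100, 0), (0, 100, 0), (0, 0, 0)]

# One (theta, phi, gamma) triple and one (x, y, z) displacement per side.
INST_ANGLES = [(0, 0, 0), (90, 0, 0), (180, 0, 0), (270, 0, 0), (0, 270, 0), (0, 90, 0)]
INST_DISP = [(0, 0, 0), (0, 100, 0), (0, 100, 100), (0, 0, 100), (100, 0, 0), (0, 0, 100)]

def place(vertex, angles, disp):
    """Rotate about x by theta, then about y by phi, then displace.

    gamma is zero for every side in the data, so it is ignored here.
    """
    x, y, z = vertex
    theta, phi, _ = (math.radians(a) for a in angles)
    y, z = (y * math.cos(theta) - z * math.sin(theta),
            y * math.sin(theta) + z * math.cos(theta))
    x, z = (x * math.cos(phi) + z * math.sin(phi),
            -x * math.sin(phi) + z * math.cos(phi))
    return tuple(round(c + d) for c, d in zip((x, y, z), disp))

cube = [[place(v, a, d) for v in FACE] for a, d in zip(INST_ANGLES, INST_DISP)]
corners = {v for face in cube for v in face}

for number, face in enumerate(cube, start=1):
    print(number, face)
```

Under these assumptions the six placed squares share exactly eight corner points, the corners of a cube of side 100, which suggests the lists are meant to be read this way.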

8.3. Physical Realism

Physical objects have more subtle attributes than shape and colour. This is particularly evident when motion occurs. Real objects do not move instantaneously from one place to another, nor do they achieve their final velocity the instant motion begins. There is an acceleration period whilst the velocity builds up to its maximum value. Likewise a real object cannot reduce its speed to zero instantaneously. A period of deceleration is required. Acceleration and deceleration are both evidence of an additional attribute of a physical object, its inertia or mass. The mass of an object determines how rapidly it can be accelerated or brought to rest. In building realistic computer models of physical objects it is important to pay attention to these details. The role of the mass of a body in determining its motion is really summarised in Newton’s Laws of Motion. In essence, they say that if a body is acted on by a force it will accelerate in proportion to the force and, if there is no force, it remains at constant velocity (or at rest).

In the example programs, some attempt has been made to incorporate these laws by modelling joystick movements as applied forces. The result is that motion of the image does not follow immediately, but with an acceleration determined by its inertia. In addition, the effect of friction is incorporated so that if the applied force is removed the velocity drops to zero, and even when it is constantly applied there is a maximum to the velocity. In the programs, the motion is purely rotational but the same principles hold true.
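The scheme can be sketched in a few lines of Python (for illustration only). The limit of 25 and the friction of 1 per frame follow the routine `angle_update` in the listing, and the push of 2 per frame follows one of the joystick options; the function name and the calling loop are hypothetical.

```python
MAX_INC = 25  # largest allowed change of angle per frame

def step(angle, inc, push):
    """Advance one frame.

    push is the joystick 'force' (+2, -2 or 0); friction removes 1 from the
    size of the increment every frame, and the increment is clamped.
    """
    inc += push
    if inc > 0:
        inc = min(inc - 1, MAX_INC)
    elif inc < 0:
        inc = max(inc + 1, -MAX_INC)
    angle = (angle + inc) % 360   # keep the angle in the range 0..359
    return angle, inc

angle, inc = 0, 0
for frame in range(40):           # joystick held over: speed builds up, then saturates
    angle, inc = step(angle, inc, 2)
print("after holding:", inc)

for frame in range(25):           # joystick released: friction brings it to rest
    angle, inc = step(angle, inc, 0)
print("after release:", inc)
```

With the joystick held the increment grows by a net 1 per frame until it reaches the maximum; once released it falls by 1 per frame and the rotation coasts to a stop.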

8.4. Input from the Keyboard and Joystick

To interact properly with the program the observer has to be able to alter the flow of the program. Otherwise, no matter how complex the program is, it is entirely deterministic. What this means is that started from an initial condition, the end of the program is entirely determined. Even the so called “random number generator” in the computer is deterministic even though it has so many possible outcomes it looks random. The real world seems to be random but nobody knows for sure since you can’t re-run history!

The simplest ways to interact with the program are to input data from the keyboard and mouse, but soon we’ll all be wearing headsets of stereoscopic viewers and tactile sensors. The age of Virtual Reality is upon us. When someone figures out how to connect directly to the brain things will really get interesting. Right now we’ll settle for input from the keyboard and joystick which, because there are specific registers in memory dedicated to the task, is straightforward.

8.4.1. Keyboard

There are several ways of reading the keyboard directly through library functions. We will not use any of them. Because we have direct access to the System Registers, we’ll go directly there for information.

Specifically we want to read the function keys F1 to F7. The information is in the byte at the address $bfec01. When one of these keys is pressed, the byte will be as follows:

KEY     F1    F2    F3    F4    F5    F6    F7
CODE    $5f   $5d   $5b   $59   $57   $55   $53

All we have to do is read it and act accordingly.
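The table can be turned into a lookup, sketched here in Python for illustration only. The second function rests on an assumption taken from general Amiga hardware documentation rather than from this text: that the keyboard sends each raw key number rotated left by one bit and inverted, with the top bit of the raw number marking a key release. On that assumption the codes above correspond to raw key numbers $50 to $56. All names are hypothetical.

```python
# Byte read from $bfec01 -> function key, from the table above.
FUNCTION_KEYS = {0x5F: "F1", 0x5D: "F2", 0x5B: "F3", 0x59: "F4",
                 0x57: "F5", 0x55: "F6", 0x53: "F7"}

def function_key(byte):
    """Return the name of the function key for a register byte, or None."""
    return FUNCTION_KEYS.get(byte)

def raw_key(byte):
    """Undo the assumed invert-and-rotate encoding.

    Returns (raw key number, pressed).
    """
    inverted = ~byte & 0xFF
    code = ((inverted >> 1) | (inverted << 7)) & 0xFF   # rotate right one bit
    return code & 0x7F, (code & 0x80) == 0

for byte in sorted(FUNCTION_KEYS, reverse=True):
    number, pressed = raw_key(byte)
    print(FUNCTION_KEYS[byte], hex(byte), hex(number), pressed)
```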

8.4.2. Joystick

The joystick can also be read directly from an address in memory; in this case it is the word at $dff00c. Reading the joystick is a little more complicated than the keyboard but only because there are four possibilities: left, up, right or down.

It works this way:

BITS SET     only 8    8 and 9    only 0    0 and 1
DIRECTION    up        left       down      right

Inspection of the bits provides an easy test of which direction the joystick has been moved.
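The test can be sketched as follows (Python, for illustration only; the function name is hypothetical). It follows the table above and examines the bits in the same order as the routine `joy_in` in the listing: bit 8 first, then bit 9, then bits 0 and 1.

```python
def joystick_direction(word):
    """Decode the joystick word read from $dff00c into a direction name."""
    if word & (1 << 8):                        # up or left
        return "left" if word & (1 << 9) else "up"
    if word & (1 << 0):                        # down or right
        return "right" if word & (1 << 1) else "down"
    return None                                # stick is centred

for value in (0x0100, 0x0300, 0x0001, 0x0003, 0x0000):
    print(hex(value), joystick_direction(value))
```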

8.5. Example Program

The program shows a cube with the letter A written on each face in rotation under the control of the joystick. In addition the cube can be subject to shear and scaling transforms whilst the rotation takes place by pressing the function keys as detailed below.

8.5.1. trnsfrms.s

This is the control program. After initializing variables, it reads the joystick and keyboard settings to choose the rate of rotation, viewing distance and whether a shear or scale change should take place. Both of these latter transforms are accompanied by a size reduction to keep word-size variables within range.

Once input is complete the cube is assembled, unrotated and undistorted, in the world frame by the multiple object-to-world transform for all the sides. Following this the distortion is concatenated with the viewing transform to produce the overall transform which then converts the vertices for perspective projection.

8.5.2. core_05.s

Here are the new subroutines. The first part is concerned with constructing the rotation transform from the viewing angles vθ, vφ and vγ and then using it (after it is concatenated with the shear) to transform the vertices. Following this the routines are concerned with reading the joystick and keyboard and making adjustments accordingly.

In order to simulate inertia, movements of the joystick are converted not to angles of rotation themselves but to increments to the angles of rotation, up to a maximum. These increments are added to the angles each time to give the total angles to rotate.

In addition, the increments are always decremented by 1 each time to give built-in frictional slowing down. The procedure to implement joystick alternatives uses a vector jump table to the various possible subroutines. This is an elegant way of avoiding testing for each possibility in a long list. This technique is also used for keyboard input.

There are seven possible keyboard inputs concerned entirely with the function keys F1 to F7:

F1 move closer (continuously) to a minimum distance,

F2 move away (continuously),

F3 implement shear 1 (x increases with z, called xshear),

F4 implement shear 2 (x increases with y and z, called yshear),

F5 implement a stretch (y and z reduced by 1/2),

F6 stop movement (of F1 and F2),

F7 quit - reset the system.

Input from F3, F4 and F5 is used to set the bottom three bits of a word length flag, shearflag, in a toggle fashion using the bit-change instruction. This simply NOTs the appropriate bit to provide a record of whether the transform should be implemented. The routine which examines which flag bits are set also includes the option of combinations of them which are not actually used for anything, and can be used to try other transforms (providing products do not exceed word size in the concatenation products).
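The toggle and the table lookup can be sketched together (Python, for illustration only). The bit numbers and the order of the table entries follow the key handlers and the `shear_jump` table in the listing; the Python function names and the strings they return are hypothetical stand-ins for the subroutines.

```python
X_BIT, Y_BIT, Z_BIT = 2, 1, 0   # toggled by F3, F4 and F5 respectively

def toggle(flags, bit):
    """Flip one flag bit, as the bit-change instruction does."""
    return flags ^ (1 << bit)

def null():
    return "no distortion"

def z():
    return "zshear"

def y():
    return "yshear"

def x():
    return "xshear"

def user():
    return "unused combination"

# Same order as shear_jump: the index is the value of the three flag bits.
SHEAR_JUMP = [null, z, y, user, x, user, user, user]

def dispatch(flags):
    """Pick the routine straight from the table instead of testing each case."""
    return SHEAR_JUMP[flags & 0b111]()

flags = 0
flags = toggle(flags, X_BIT)     # F3 pressed once
print(dispatch(flags))
flags = toggle(flags, X_BIT)     # F3 pressed again
print(dispatch(flags))
```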

Finally the shear and rotation matrices are multiplied to produce the overall transform to act on the cube.

8.5.3. bss_05.s

New variables for this chapter.

8.5.4. data_05.s

New data for this chapter. In particular note that the 3x3 matrices for the shears and stretch are arranged in column order to simplify the matrix concatenation routine.
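The benefit of column order is that each element of the product becomes the sum of products of two runs of three consecutive numbers. A sketch in Python, for illustration only: the shear list is copied from `xshear` in data_05.s, the function name is hypothetical, and the halving that the assembly routine performs to keep products within a word is left out.

```python
# xshear exactly as stored in data_05.s: column by column.
XSHEAR_COLUMNS = [1, 0, 0,   0, 1, 0,   1, 0, 1]

def concat(rows, columns):
    """Multiply a row-order 3x3 matrix by a column-order 3x3 matrix.

    Both are flat lists of nine numbers; the result is in row order.
    """
    product = []
    for r in range(3):
        row = rows[3 * r:3 * r + 3]              # three consecutive elements
        for c in range(3):
            column = columns[3 * c:3 * c + 3]    # three consecutive elements
            product.append(sum(a * b for a, b in zip(row, column)))
    return product

IDENTITY_ROWS = [1, 0, 0,   0, 1, 0,   0, 0, 1]
print(concat(IDENTITY_ROWS, XSHEAR_COLUMNS))
```

Multiplying by the identity simply returns the shear in row order, which shows the stored list does represent the matrix with x increasing in proportion to z.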

  *
  * trnsfrms.s
  * Various transforms

  *SECTION TEXT
    opt     d+
    bra     main
    include systm_00.s
    include core_05.s       motion of the view frame

  main bsr      init
  * transfer all the data
      bsr       transfer
      move.w    oncoords,vncoords
      move.w    vncoords,wncoords
  * Initialise dynamical variables
      move.w    #-50,Ovx        view frame initial position
      move.w    #0,Ovy
      move.w    #150,Ovz
      clr.w     vtheta          initialise rotation angles to zero
      clr.w     vphi
      clr.w     vgamma
      clr.w     shearflag       set flag to no shear
      move.w    #25,vtheta_inc  initial rotation rates
      move.w    #25,vphi_inc
      clr.w     speed
      clr.w     screenflag      0=screen 1 draw, 1=screen 2 draw
  loop4:
  * Switch the screens each time round
      tst.w     screenflag      screen 1 or screen 2?
      beq       screen_1        draw on screen 1, display screen2
      bsr       drw2_shw1       draw on screen 2, display screen1
      clr.w     screenflag      and set the flag for next time
      bra       screen_2
  screen_1:
      bsr       drw1_shw2       draw on 1, display 2
      move.w    #1,screenflag   and set the flag for next time
  screen_2:
  * look for changes in the rotation angles
      bsr       joy_in
  * check function keys for a shear or a change of speed
      bsr       key_in
  * Adjust to new rotation angles and speed
      bsr       angle_update
      bsr       speed_adj

  * Construct compound object from same face at different position
      move.w    nparts,d7     how many parts in the object
      subq      #1,d7
      lea       inst_angles,a0  list of angles for each part
      lea       inst_disp,a1    ditto displacements
  * Do one face at a time
  instance:
      move.w    d7,-(sp)        save the count
      move.w    (a0)+,otheta    next otheta
      move.w    (a0)+,ophi      next ophi
      move.w    (a0)+,ogamma    next ogamma
      move.w    (a1)+,Oox       next displacements
      move.w    (a1)+,Ooy
      move.w    (a1)+,Ooz
      movem.l   a0/a1,-(sp)     save position in list
      bsr       otranw          object to world transform
      bsr       wtranv_1        construct the rotation transform
      bsr       shear           concatenate with shear (if flag set)
      bsr       wtranv_2        and transform the points
      bsr       illuminate      if it is visible find the shade
      bsr       perspective
      bsr       polydraw        draw the face
      movem.l   (sp)+,a0/a1     restore pointers
      move.w    (sp)+,d7        restore the count
      dbra      d7,instance     for all the parts of the object
      bra       loop4
  *SECTION DATA
      include   data_00.s
      include   data_03.s
      include   data_05.s
  *SECTION BSS
      include   bss_05.s

      END
  ******************************************************************************************
  * Core_05.s
  * A set of subroutines for transforming world coords. Including rotations of vtheta
  * vphi and vgamma about the x,y and z axes as well as x, y and z shears.
  *
  ******************************************************************************************
   include Core_04.s
  * The matrix for the rotations is constructed.
  * convert rotation angles to sin & cos and store for rotation matrix.
  wtranv_1
   bsr view_trig        find the sines and cosines
  * construct transform matrix wtranv.
   lea stheta,a0
   lea ctheta,a1
   lea sphi,a2
   lea cphi,a3
   lea sgamma,a4
   lea cgamma,a5
   lea w_vmatx,a6
  * do element WM11
   move.w (a3),d0      cphi
   muls   (a5),d0      cphi*cgamma
   lsl.l  #2,d0
   swap   d0
   move.w d0,(a6)+     WM11
  * do element WM12
    move.w (a1),d0     ctheta
    muls   (a4),d0     ctheta*sgamma
    move.w (a0),d1     stheta
    muls   (a2),d1     stheta*sphi
    lsl.l  #2,d1
    swap   d1
    muls   (a5),d1     stheta*sphi*cgamma
    add.l  d0,d1       stheta*sphi*cgamma + ctheta*sgamma
    lsl.l #2,d1
    swap d1
    move.w d1,(a6)+
  * do WM13
   move.w (a0),d0      stheta
   muls   (a4),d0      stheta * sgamma
   move.w (a1),d1      ctheta
   muls   (a2),d1      ctheta*sphi
   lsl.l  #2,d1
   swap   d1
   muls   (a5),d1      ctheta*sphi*cgamma
   sub.l  d1,d0        stheta*sgamma - ctheta*sphi*cgamma
   lsl.l  #2,d0
   swap   d0
   move.w d0,(a6)+
  * do WM21
   move.w (a3),d0      cphi
   muls   (a4),d0      cphi*sgamma
   lsl.l  #2,d0
   swap   d0
   neg    d0
   move.w d0,(a6)+
  * do WM22
   move.w (a1),d0     ctheta
   muls   (a5),d0     ctheta*cgamma
   move.w (a0),d1     stheta
   muls   (a2),d1     stheta*sphi
   lsl.l  #2,d1
   swap   d1
   muls   (a4),d1     stheta*sphi*sgamma
   sub.l  d1,d0       ctheta*cgamma - stheta*sphi*sgamma
   lsl.l  #2,d0
   swap   d0
   move.w d0,(a6)+
  * do WM23
   move.w (a0),d0     stheta
   muls   (a5),d0     stheta*cgamma
   move.w (a1),d1     ctheta
   muls   (a2),d1     ctheta*sphi
   lsl.l  #2,d1
   swap   d1
   muls   (a4),d1     ctheta*sphi*sgamma
   add.l  d0,d1
   lsl.l  #2,d1
   swap   d1
   move.w d1,(a6)+
  * do WM31
   move.w (a2),(a6)+
  * do WM32
   move.w (a3),d0    cphi
   muls   (a0),d0    cphi*stheta
   lsl.l  #2,d0
   swap   d0
   neg    d0
   move.w d0,(a6)+
  * do WM33
   move.w (a1),d0    ctheta
   muls   (a3),d0    ctheta*cphi
   lsl.l  #2,d0
   swap   d0
   move.w d0,(a6)+
   rts
  *****************************************************************************************
  * PART 2: Transform the World coords to view coords.
  wtranv_2
   move.w wncoords,d7
   ext.l d7 any to do?
   beq wtranv3
   subq.w #1,d7
   lea wcoordsx,a0
   lea wcoordsy,a1
   lea wcoordsz,a2
   lea vcoordsx,a3
   lea vcoordsy,a4
   lea vcoordsz,a5
   exg   a3,d3        save cos we're short of registers
   link  a6,#-6       save 3 words
  wtranv1
   moveq.l #2,d6     3 rows in matrix
   lea w_vmatx,a3    init matx pointer
  * calculate the next wx, wy and wz
  wtranv2
   move.w (a0),d0     wx
   move.w (a1),d1     wy
   move.w (a2),d2     wz
   sub.w  #50,d0      wx-50
   sub.w  #50,d1      wy-50
   sub.w  #50,d2      wz-50
   muls   (a3)+,d0    wx*Mi1
   muls   (a3)+,d1    wy*Mi2
   muls   (a3)+,d2    wz*Mi3

   add.l  d1,d0
   add.l  d2,d0       wx*Mi1+wy*Mi2+wz*Mi3
   lsl.l  #2,d0
   swap   d0
   move.w d0,-(a6)
   dbra   d6,wtranv2  repeat for 3 elements

   move.w (a6)+,d0
   add.w  Ovz,d0
   move.w d0,(a5)+     becomes vz
   move.w (a6)+,(a4)+
   exg a3,d3           restore vx, save matx pointer
   move.w (a6)+,d0
   add.w  #100,d0
   move.w d0,(a3)+     becomes vx
   exg a3,d3           save vx, restore matx pointer
   addq.l #2,a0        point to next wx
   addq.l #2,a1 wy
   addq.l #2,a2 wz
   dbra   d7,wtranv1   repeat for all ocoords
   unlk   a6           close frame
  wtranv3
   rts
  *
  * Calculate the sines and cosines of the view angles
  view_trig
   move.w vtheta,d1 theta
   bsr sincos
   move.w d2,stheta sine
   move.w d3,ctheta cosine
   move.w vphi,d1
   bsr sincos
   move.w d2,sphi
   move.w d3,cphi
   move.w vgamma,d1 gamma
   bsr sincos
   move.w d2,sgamma
   move.w d3,cgamma
   rts
  *
  * Read jstick and update vars accordingly.
  joy_in
   move.w $dff00c,d0 read jstick register
  * convert value to angle totals
  angle_speed
   btst #8,d0 up or left?
   beq dwn_rt nope
   btst #9,d0 left?
   beq up
   bra left
  dwn_rt
   btst #0,d0 down or right?
   beq joy_out
   btst #1,d0 right?
   beq down
   bra right
  joy_out
   rts
   IFD JOY1
  * set up the increments to angles +/-10 is the limit
  up
   subq.w #2,vphi_inc
   rts
  down
   addq.w #2,vphi_inc
   rts
  left
   addq.w #2,vtheta_inc
   rts
  right
   subq.w #2,vtheta_inc
   rts
   ENDC
   IFD JOY2
  up
   move.w #350,vyangle
   bsr rot_vy
   rts
  down
   move.w #10,vyangle
   bsr rot_vy
   rts
  left
   move.w #10,vxangle
   bsr rot_vx
   rts
  right
   move.w #350,vxangle
   bsr rot_vx
   rts
   ENDC
   IFD JOY3
  up
   bsr rot_down
   rts
  down
   bsr rot_up
   rts
  left
   bsr rot_left
   rts
  right
   bsr rot_right
   rts
   ENDC
   IFD JOY4
  up
   move.w #-5,vphi_inc
   rts
  down
   move.w #5,vphi_inc
   rts
  left
   move.w #5,vtheta_inc
   rts
  right
   move.w #-5,vtheta_inc
   rts
   ENDC
  **************************************************************
  angle_update
   move.w vtheta_inc,d0
   bmi    vth_neg
   beq    chk_phi
   subq.w #1,vtheta_inc
   cmp.w  #25,vtheta_inc
   ble    chk_phi
   move.w #25,vtheta_inc
   bra    chk_phi
  vth_neg
   addq.w #1,vtheta_inc
   cmp.w  #-25,vtheta_inc
   bge    chk_phi
   move.w #-25,vtheta_inc
  chk_phi
   move.w vphi_inc,d0
   bmi    vph_neg
   beq    chk_out
   subq.w #1,vphi_inc
   cmp.w  #25,vphi_inc
   ble    chk_out
   move.w #25,vphi_inc
   bra    chk_out
  vph_neg
   addq.w #1,vphi_inc
   cmp.w  #-25,vphi_inc
   bge    chk_out
   move.w #-25,vphi_inc
  chk_out
  * update vtheta
   move.w vtheta,d0         the previous angle
   add.w  vtheta_inc,d0     increase by increment
   bge    thta_1             check it lies between 0 and 359
   add    #360,d0
   bra    thta_2
  thta_1
   cmp.w  #360,d0
   blt    thta_2
   sub    #360,d0
  thta_2
   move.w d0,vtheta becomes the current angle
  * update vphi
   move.w vphi,d0
   add.w  vphi_inc,d0
   bge    phi_1
   add    #360,d0
   bra    phi_2
  phi_1
   cmp.w  #360,d0
   blt    phi_2
   sub    #360,d0
  phi_2
   move.w d0,vphi
   rts
  *****************************************************************************************
  key_in
  in_key
   clr.w d0
   move.b $bfec01,d0
   cmp.b #$5f,d0
   beq f1
   cmp.b #$5d,d0
   beq f2
   cmp.b #$5b,d0
   beq f3
   cmp.b #$59,d0
   beq f4
   cmp.b #$57,d0
   beq f5
   cmp.b #$55,d0
   beq f6
   cmp.b #$53,d0
   beq f7
   rts
   IFD JOY3
  f1 bsr roll_left
   rts
  f2 bsr roll_right
   rts
  f3 move.w #-2,speed
   rts
  f4 move.w #2,speed
   rts
  f5 move.w #3,speed
   rts
  f6 move.w #0,speed       stop
   rts
  f7 move.w #QUIT,quitflag
   rts
   ELSEIF
  f1 move.w #-1,speed      reverse
   rts
  f2 move.w #1,speed       forward
   rts
  f3 bchg.b #2,shearflag   toggle x shearflag
   rts
  f4 bchg.b #1,shearflag   toggle yshearflag
   rts
  f5 bchg.b #0,shearflag   toggle z shearflag
   rts
  f6 move.w #0,speed       stop
   rts
  f7 move.w #QUIT,quitflag
   rts
   ENDC
  ******************************************************************************************
  * concatenate the shear with the rotation
  shear
   clr d0
   move.b shearflag,d0 flag is lower 3 bits
   and #$f,d0
  * there are 8 possibilities 111 - 000, xyz respectively
   lea shear_jump,a0
   lsl.w #2,d0 get offset
   move.l 0(a0,d0.w),a0
   jmp (a0)
  shear_jump
   dc.l null,z,y,user1,x,user2,user3,user4
  null
   rts
  z
   lea zshear,a0
   lea w_vmatx,a1
   bsr concat
   rts
  y
   lea yshear,a0
   lea w_vmatx,a1
   bsr concat
   rts
  user1
   rts
  x
   lea xshear,a0
   lea w_vmatx,a1
   bsr concat
   rts
  user2 rts
  user3 rts
  user4 rts
  *
  * Multiply two 3x3 matrices pointed to by a0 and a1
  * order is (a1)x(a0) with result sent to temp store at (a2)
  * (a0) is in column order while (a1) and (a2) are in row order, of word length elements.
  * Finally (a2) is copied to (A1).
  concat
   lea tempmatx,a2
   move.w   #2,d7       3 rows
  conc1
   move.w   #2,d6
   movea.l  a0,a3       reset shear pointer
  conc2
   move.w  (a1),d1
   ext.l   d1
   lsr.l   #1,d1
   move.w 2(a1),d2
   ext.l   d2
   lsr.l   #1,d2
   move.w  4(a1),d3
   ext.l   d3
   lsr.l   #1,d3
   muls    (a3)+,d1
   muls    (a3)+,d2
   muls    (a3)+,d3
   add.w   d2,d1
   add.w   d3,d1
   move.w  d1,(a2)+     next product element
   dbra    d6,conc2     do all elements in row
   addq.w  #6,a1        point to next row
   dbra    d7,conc1     for all rows
  * transfer result back to rotation matrix
   lea tempmatx,a0
   lea w_vmatx,a1
   move.w #8,d7 num elements -1
  conloop
   move.w (a0)+,(a1)+
   dbra d7,conloop
   rts
  * set the velocity components
  speed_adj
   move.w  speed,d0
   lsl.w   #3,d0       scale it
   move.w  Ovz,d1
   cmp.w #10,Ovz
   bgt    adj_out
   move.w #10,Ovz
  adj_out
   add.w d0,Ovz
   rts
  ******************************************************************************************
  * bss_05.s
  ******************************************************************************************
   include bss_04.s

  * World Frame Variables
  wncoords   ds.w 1   num vertices in world frame
  * View frame vars
  vtheta     ds.w 1   rotation of view frame about wx
  vphi       ds.w 1   wy
  vgamma     ds.w 1   wz
  Ovx        ds.w 1   view frame x origin in world frame
  Ovy        ds.w 1
  Ovz        ds.w 1
  * General transform matrices
  w_vmatx    ds.w 9
  tempmatx   ds.w 9
  * joystick
  joy_data   ds.w 1
  * Dynamic vars
  speed      ds.w 1
  vtheta_inc ds.w 1
  vphi_inc   ds.w 1
  vgamma_inc ds.w 1
  shearflag  ds.w 1
  quitflag   ds.w 1
  *****************************************************************************************
  * Data_05.s
  *****************************************************************************************
  TRANSFORM EQU 1

   include data_04.s

  my_datax dc.w 100,100,0,0,20,90,20,15,45,45,15,55
           dc.w 70,55,10,10,10,10,20,20
  my_datay dc.w 0,100,100,0,15,60,87,25,40,65,74,46
           dc.w 55,61,30,5,95,60,25,74
  my_dataz dc.w 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0

  xshear dc.w 1,0,0,0,1,0,1,0,1
  yshear dc.w 1,0,0,1,1,0,1,0,1
  zshear dc.w 2,0,0,0,1,0,0,0,1

  nparts      dc.w 6
  inst_angles dc.w 0,0,0,90,0,0,180,0,0,270,0,0,0,270,0,0,90,0
  inst_disp   dc.w 0,0,0,0,100,0,0,100,100,0,0,100,100,0,0,0,0,100

9. Flying Around The World

9.1. Introduction

A flight simulator? Well not exactly, but getting there.

In order to fully implement the simulation of independent motion of the observer, we require a little more vector algebra. The task is to construct a view of the world model from the point of view of an observer free to move in any direction. This is different from the simple procedure we used in the previous chapter. We now wish to operate a joystick and navigate our way through the assembly of objects constructed in the world frame. We want the view on the screen to move up or down when the joystick is pushed forward or pulled back and to move to the left or right when the joystick is moved to the right or the left. In other words, all of the motion on the screen must be relative to the observer’s current position. Even if the pilot of a plane is flying upside down, his perception of “up” is directed towards the roof of the cockpit, which as far as someone on the ground is concerned is “down”. What matters is that all of the movements corresponding to “up”, “down”, “left” and “right” apply to the observer’s reference frame, which we have called the view frame. Unlike rotation by Euler angles, which we used in the previous chapter, here we want the rotations to be about the view frame axes.

To be specific, let’s ask what we expect to happen when the joystick is pulled back. We expect to see the picture move vertically upwards, and this must always happen no matter what the orientation of the observer. Suppose we have got into the position where the aircraft, or whatever it is being controlled, is flying horizontally but with its wings vertical.

Figure 9.1 shows this orientation.

If the joystick is pulled back, object A will come into view at the top of the screen and the object B will go out of view at the bottom of the screen. The view seen by the pilot of the plane is shown in Figure 9.2. Herein lies the problem. The pilot has a very definite perception of what is “up” and what is “down” at any given moment and, while this does not change in the cockpit, it is changing continuously with respect to the world outside. In the previous chapter it was easy to relate “up” to an increase of vφ and left to an increase in vθ, but when referenced from the view frame all these motions depend on the orientation of the observer at any given instant.

figure 09 01
Figure 9.1 World view of observer (airplane)

There is more than one way of solving this problem. One method is to use control matrices to perform rotations of coordinates after they have been transformed to the view frame. The control matrices perform simple rotations about the view x, y and z axes. This method is employed in the next chapter. Another way is to keep a constant record of the position and orientation of the view frame in the world frame and to generate movements of the view frame resulting from movements of the joystick. This second method relies heavily on the notion of a set of view frame axes undergoing rotations and translations following the path of the observer. It also embodies the notion of rotation about an arbitrary axis, which we introduce in this chapter and which is very useful for performing rotations about any axis in the world frame.

We could, of course, decide to accept the limitations of Euler angles to fix the view frame orientation in the simpler orbital-like fashion. In Chapter 10, we show how a flight simulator works well by using each of these approaches.

figure 09 02
Figure 9.2 Observer’s view (from airplane)

9.2. Coordinate Transforms and Direction Cosines

Here’s a bit of maths. It’s not as hard as it looks.

If you know the coordinates of the vertices of an object in one reference frame and want to know what they are in another, it is necessary to do a coordinate transform. (Remember the other type of transform is called a geometric transform, which is what happens when the object itself is moved inside a single reference frame). If a point has coordinates (xw,yw,zw) in the world frame, it will have coordinates (xv,yv,zv) in the view frame. Thus the point A in Figure 9.3 has coordinates (0,0,50) in the world frame, and coordinates (0,-50,0) in the view frame (what is seen on the screen has later to be worked out by means of the perspective transform). As far as rotations are concerned there is always a linear relation between these two sets of coordinates, and for this case we can write in general terms:

xv = n11.xw + n12.yw + n13.zw

yv = n21.xw + n22.yw + n23.zw

zv = n31.xw + n32.yw + n33.zw

where the n’s are numbers that remain to be worked out. This relation can also be written as a matrix product:

 xv =   n11 n12 n13     xw

 yv =   n21 n22 n23  *  yw

 zv =   n31 n32 n33     zw
figure 09 03
Figure 9.3 Point A seen in two coordinate frames

The n matrix is the transformation matrix. The elements n11, n12, etc., are specific to the relative orientation of the two reference frames and are called the direction cosines.
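As a rough illustration of the matrix product, here is a minimal Python sketch. The particular matrix N is an assumption, chosen only because it reproduces the mapping of point A quoted in the text; it has not been read off Figure 9.3. The function name world_to_view is also just illustrative.

```python
# Rows of N are (n11, n12, n13), (n21, n22, n23), (n31, n32, n33).
# ASSUMPTION: this orientation is hypothetical, picked so that point A
# maps from (0,0,50) to (0,-50,0) as stated in the text.
N = [
    [1, 0, 0],   # xv =  xw
    [0, 0, -1],  # yv = -zw
    [0, 1, 0],   # zv =  yw
]

def world_to_view(n, pw):
    """Return (xv, yv, zv), the product of the 3x3 matrix n with (xw, yw, zw)."""
    return tuple(sum(n[r][c] * pw[c] for c in range(3)) for r in range(3))

A_world = (0, 0, 50)
print(world_to_view(N, A_world))  # (0, -50, 0)
```

Only the rotational part of the transform is shown; the displacement of the view frame origin is left out.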

To see how the direction cosines are related to the geometry, look at Figure 9.4. The direction cosines are simply the cosines of the angles between the axes of the reference frames. It is quite hard to draw a comprehensive diagram which is not confusingly messy but, for example, n11 is the cosine of the angle between xv and xw, n12 is the cosine of the angle between xv and yw, n13 is the cosine of the angle between xv and zw and so on:

n11 = cos(a), n12 = cos(b), n13 = cos(c).
figure 09 04
Figure 9.4 Direction cosines

If these direction cosines can be found, the problem of converting world frame coordinates into view frame coordinates is solved. We are however still left with the problem of converting movements of the joystick into changes in the direction cosines. It is clear that we should solve the problem with a strategy that centres on the direction cosines. Here is one way it can be done.

9.3. Base Vectors and Direction Cosines

Just for a moment let’s forget all about the maths. Let’s try to visualise what’s going on from the point of view of a second, stationary, independent observer at rest in the world frame and able to see both the world frame and the moving observer simultaneously. This is the point of view of a man on the ground watching a plane fly past. Think of the plane as the view frame but with the fuselage replaced by the zv axis, the wings replaced by the yv axis and the vertical tail wing in the direction of the xv axis. Although he is not in the plane, the stationary observer can calculate the view according to the pilot if he knows the position and orientation of the plane at any instant.

To see how that view would change when the pilot pulls back the joystick, for example, he has only to rotate the plane about the axis of the wings (the angle depends on how long the joystick is pulled back), which is a rotation about the yv axis. Since the plane is moving forward during the rotation this has the added complication of making it fly upwards. Like the stationary observer, we need to keep a continuous record of the position and orientation of the view frame as it flies around the world.

To do this, imagine three unit vectors in the directions of the view frame axes. In vector geometry these unit vectors are given a special name. They are called base vectors. At the very start of the program let us suppose that the view frame is positioned coincident with the world frame. This is equivalent to having a second set of world frame base vectors at the airfield from where the plane has taken off. (Actually it isn’t really necessary to have them start off coincident and in general they don’t, but it makes the argument easier to visualise).

Now at each stage of the subsequent motion it is necessary to record the position and orientation of the view frame unit vectors. It is not possible simply to keep a running total of how many degrees the plane rotated to the left (about vx) or up (about vy) since we have no way of knowing how to translate this information into the final orientation of the plane after many movements. In the method of Euler angles used in the previous chapter it was possible to keep a running total since the first angle referred to rotation about an axis of the static world frame. But now we are using angles referred to the view frame which is moving all the time.
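One way to see why running totals of angles are not enough is that rotations about different axes do not commute: the same two turns applied in a different order leave a vector pointing somewhere else. The Python sketch below uses two exact 90 degree rotation matrices; a right-handed convention is assumed purely for the illustration.

```python
def mat_vec(m, v):
    """Multiply the 3x3 matrix m by the column vector v."""
    return tuple(sum(m[r][c] * v[c] for c in range(3)) for r in range(3))

# Exact 90 degree rotations about the x and y axes (right-handed, assumed).
ROT_X_90 = [[1, 0, 0], [0, 0, -1], [0, 1, 0]]
ROT_Y_90 = [[0, 0, 1], [0, 1, 0], [-1, 0, 0]]

v = (0, 0, 1)
x_then_y = mat_vec(ROT_Y_90, mat_vec(ROT_X_90, v))
y_then_x = mat_vec(ROT_X_90, mat_vec(ROT_Y_90, v))

print(x_then_y)  # (0, -1, 0)
print(y_then_x)  # (1, 0, 0)
```

Both sequences contain "90 degrees about x" and "90 degrees about y", yet the final vectors differ, so the totals alone cannot fix the orientation.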

Here comes the big question. Suppose we can keep a record of the positions of the view frame base vectors, what do they have to do with the original transform? The answer is very simple: the components in the world frame of the view frame base vectors are just the direction cosines that are the elements, n11 to n33, of the world-to-view transform matrix. In other words, where iv, jv and kv are the view frame base vectors and iw, jw and kw are the world frame base vectors, the relation between them is:

iv = n11.iw + n12.jw + n13.kw

jv = n21.iw + n22.jw + n23.kw

kv = n31.iw + n32.jw + n33.kw.

Or, writing the view frame base vectors in terms of their world frame components

        n11             n21             n31

iv =    n12   ,   jv =  n22 , kv =      n32

        n13             n23             n33

At the start of the motion, when the view frame and world frame axes were aligned, the view frame base vectors had components

        1             0             0

iv =    0   ,   jv =  1 , kv =      0

        0             0             1

If we can keep a record of the view frame base vectors we therefore have the direction cosines immediately available to construct the view from the cockpit. The strategy is straightforward but there are some tricky problems to solve on the way.
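The link between base vectors and the transform can be checked with a few lines of Python. This is only a sketch: the orientation used is hypothetical, and the helper names are not taken from the example programs. The base vectors are stacked as the rows of the matrix, and each one is then transformed into the view frame, where it should land on its own axis.

```python
def matrix_from_base_vectors(iv, jv, kv):
    """Stack the world-frame components of iv, jv, kv as the matrix rows."""
    return [list(iv), list(jv), list(kv)]

def world_to_view(n, pw):
    """Multiply the 3x3 matrix n by the world-frame vector pw."""
    return tuple(sum(n[r][c] * pw[c] for c in range(3)) for r in range(3))

# ASSUMPTION: a hypothetical orientation (view frame tipped by 90 degrees).
iv = (1, 0, 0)
jv = (0, 0, -1)
kv = (0, 1, 0)

N = matrix_from_base_vectors(iv, jv, kv)
print(world_to_view(N, iv))  # (1, 0, 0)
print(world_to_view(N, jv))  # (0, 1, 0)
print(world_to_view(N, kv))  # (0, 0, 1)
```

With the starting components given above (view frame aligned with the world frame) the same construction yields the unit matrix, so the transform leaves every point unchanged.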

9.4. Rotating the Base Vectors: Rotation About an Arbitrary Axis

The base vectors which fix the current orientation of the view frame depend on what movements have already taken place. Suppose at a given instant the view frame is oriented with its base vectors in the positions shown in Figure 9.5. The base vector of the vx axis, iv, has three components in the world frame n11, n12 and n13 (the other unit vectors jv and kv also have components but for clarity these are not shown in the diagram). Now suppose a movement of the joystick occurs corresponding to a rotation about the vy axis. To find the new components of iv and kv (jv remains unchanged in this rotation) we must rotate them about vy. The vy axis is the axis of rotation and is specified in the world frame by its direction cosines. But we are in luck! This problem has already been solved. It is known as rotation about an arbitrary axis. Since at this point yv can be pointing anywhere in the world frame, the axis is very arbitrary. In fact the solution to the problem is given in just the format most useful to us. It is in the form of a matrix for rotation by an angle about an axis specified by its direction cosines. Just the form we want. The transform can also be used for rotation about any other axis in the world frame. All that is required are the three direction cosines.

Once constructed, the rotation matrix can be multiplied by iv and kv to yield the new components of iv and kv, which then replace the old ones and are also used directly to construct the world-to-view transform (there is a catch, which we’ll discuss shortly).

For rotation by an angle 𝛿 about an axis with direction cosines n1, n2 and n3 (just the last index in the cosine to show it can refer to any axis), the matrix is

n1.n1+(1-n1.n1)cos(𝛿)	    n1.n2(1-cos(𝛿))-n3sin(𝛿)	n1.n3(1-cos(𝛿))+n2sin(𝛿)

n1.n2(1-cos(𝛿))+n3sin(𝛿)	n2.n2+(1-n2.n2)cos(𝛿)	    n2.n3(1-cos(𝛿))-n1sin(𝛿)

n1.n3(1-cos(𝛿))-n2sin(𝛿)	n2.n3(1-cos(𝛿))+n1sin(𝛿)	n3.n3+(1-n3.n3)cos(𝛿)
figure 09 05
Figure 9.5 Rotation of base vector
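The matrix above translates almost directly into code. The following Python sketch uses floating point rather than the scaled integers of the example program, and the function names are illustrative only. The axis (n1, n2, n3) is assumed to be a unit vector and the angle is in radians.

```python
import math

def rotation_matrix(n1, n2, n3, delta):
    """Matrix for rotation by delta (radians) about the unit axis (n1, n2, n3)."""
    c = math.cos(delta)
    s = math.sin(delta)
    k = 1.0 - c
    return [
        [n1 * n1 + (1 - n1 * n1) * c, n1 * n2 * k - n3 * s, n1 * n3 * k + n2 * s],
        [n1 * n2 * k + n3 * s, n2 * n2 + (1 - n2 * n2) * c, n2 * n3 * k - n1 * s],
        [n1 * n3 * k - n2 * s, n2 * n3 * k + n1 * s, n3 * n3 + (1 - n3 * n3) * c],
    ]

def rotate(m, v):
    """Apply the 3x3 matrix m to the vector v."""
    return tuple(sum(m[r][c] * v[c] for c in range(3)) for r in range(3))

# Rotating the x base vector by 90 degrees about the z axis should give
# (to within rounding) the y base vector.
m = rotation_matrix(0.0, 0.0, 1.0, math.pi / 2)
print(rotate(m, (1.0, 0.0, 0.0)))
```

A useful sanity check is that the axis itself is left unchanged by the rotation, whatever the angle.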

9.5. Accumulating Errors

Broadly speaking, all the ingredients required to steer the view frame through the world frame controlled by joystick movements are in place. Let us lay out the algorithm as it stands at the moment:

  • movement of the joystick specifies a rotation of the view unit vectors about one of the view frame axes,

  • construct the rotation matrix to rotate the other two unit vectors about this axis and replace them with their new components,

  • use the components of the unit vectors, now called direction cosines, to construct the world-to-view transform,

  • perform the transform and display the picture

  • and repeat the cycle.

This is all OK and it works. For a while.

Eventually it will lead to a degenerating picture, or worse a chaotic mess, because of accumulating errors. As it stands the program has a built-in pathological self-destruct. Because calculations are done in integer arithmetic, and sines and cosines are calculated to an accuracy no better than 1 in 16384, given enough transforms, large errors will accumulate in the unit vectors and, as a consequence, the world-to-view transform. In life nothing is perfect and this is a good example of that adage. In addition, the algorithm has feedback in that joystick movements are made on the basis of the picture on the screen that is generated, in turn, from the transform constructed from the joystick movements. This has all the ingredients necessary to create chaos, and so it does.

In order to beat the accumulation of errors, the cycle of error accumulation must be broken. This is achieved by regenerating the base vectors afresh each time. This requires more work but it solves the problem. Figure 9.6 shows the stages in the regeneration of the view frame unit vectors.

The vectors that matter most are kv, the one that points in the direction of motion and iv, the pointer to the “up” direction. Without these two it is not possible to define either the direction of motion or which way is up as far as the pilot is concerned. Let’s suppose that, because of errors in the last transform, we have three unit vectors iv’, jv’ and kv’ which are slightly wrong. The errors will result in the base vectors not being at right-angles to each other and not having size equal to unity. As a first step, the vector kv’ is normalised, i.e. its magnitude is made to be unity. It becomes kv. This at least ensures that if its direction is slightly wrong, its size isn’t. The only effect a slightly wrong direction will have is that the view will be slightly in error, but that hardly matters since the view is being constantly adjusted by the joystick anyway. Second, the vector cross product of kv and iv’ is taken in order to generate a new vector at 90 degrees to them both. A vector cross product has just this property (see Appendix 5). This new vector is in the direction that jv would have if it weren’t in error. The new vector is then normalised i.e. its magnitude is made to be 1, and it becomes the new jv. Third, the vector cross product of the new kv and the new jv is taken, and normalised, in order to generate a new iv. In this way all three unit vectors are regenerated each frame and errors do not accumulate (it is interesting to remove the regeneration stage in the example program and watch the disintegration take place).

figure 09 06
Figure 9.6 Regeneration of base vectors

The components of the new unit vectors then become the components of the viewing transform matrix and the cycle is repeated.
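The three regeneration steps can be sketched in Python as follows. Floating point is used here for clarity, whereas the example program works with integers scaled by 2^14, and the function names are illustrative. The operand order of the cross products (kv with iv’ for jv, then jv with kv for iv) follows the order used in the listing, so that the result is a right-handed set.

```python
import math

def cross(a, b):
    """Vector cross product a x b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalise(v):
    """Scale v to unit magnitude."""
    m = math.sqrt(dot(v, v))
    return tuple(c / m for c in v)

def regenerate(iv_err, kv_err):
    """Rebuild an orthonormal iv, jv, kv from slightly wrong iv' and kv'."""
    kv = normalise(kv_err)              # step 1: fix the size of kv
    jv = normalise(cross(kv, iv_err))   # step 2: jv at 90 degrees to kv and iv'
    iv = normalise(cross(jv, kv))       # step 3: iv at 90 degrees to jv and kv
    return iv, jv, kv

# Hypothetical base vectors carrying small accumulated errors.
iv, jv, kv = regenerate((1.02, 0.03, -0.01), (0.02, -0.01, 0.97))
print(dot(iv, jv), dot(jv, kv), dot(kv, iv))  # all very close to zero
```

Note that the erroneous jv’ is never used: kv fixes the direction of motion, iv’ supplies the sense of "up", and everything else is rebuilt from those two.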

The technical details are discussed as they appear in the example programs.

9.6. Clipping in 3D

Any part of an object which lies behind the view plane (zv < 0) must not be drawn. If this is attempted, the program will not crash but what appears on the screen will be garbage. This is because the polygon drawing routines expect to see the edge list of vertices go clockwise round the perimeter of a polygon and this will be wrong for polygons projected backwards onto the view plane. In addition, objects that lie too far from the view plane should not be drawn either. This is because nothing can be drawn smaller than a pixel, and very distant objects reduce to an incoherent cluster of pixels.

Besides these obvious cases, there is no point in wasting time on objects that lie too far outside the field of view. This field of view is defined by the frustum (truncated pyramid) defined by the line of sight from the view point to the viewport boundaries. This is illustrated in Figure 9.7.

figure 09 07
Figure 9.7 Windowing in 3D

In a more leisurely application it would be possible to clip polyhedra to the boundary of the frustum in a 3D generalisation of the way polygons have been clipped to the screen window. In this application that would be too time consuming. Here, the centre of symmetry (Oox,Ooy,Ooz) is used to locate objects in the field of view and the angle of the frustum is increased to lie beyond the screen limit. This means that some time is wasted drawing distant objects which cannot be seen, but objects that are close up are not abandoned the instant their centres pass beyond the field of view. They are marked as visible but only part will appear on the screen as a result of screen clipping.

The top and base of the frustum are called the hither and yon planes. In the example program they are defined by the equations zv=100 (hither) and zv=2000 (yon). The sides of the frustum of the field of view are defined (where the viewport centre coincides with the view frame origin) by the planes

zv + 100 = xv	          side A

zv + 100 = -xv	        side B

(1.2).(zv + 100) = yv	  side C

(1.2).(zv + 100) = -yv	side D

but the actual sides used in the program extend beyond this limit, for reasons explained above, and are described by

8.(zv + 100) = ±xv	sides A	and	B

8.(zv + 100) = ±yv	sides C	and	D
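The visibility test implied by these planes is short enough to sketch in Python. The constant and function names are illustrative, and the treatment of the boundaries (hither included, yon excluded, sides included) is an assumption that mirrors the branch instructions in the listing for viewtest.

```python
HITHER = 100   # nearest visible zv
YON = 2000     # farthest visible zv
WIDEN = 8      # slope of the widened frustum sides

def centre_is_visible(vox, voy, voz):
    """Test an object centre (Vox, Voy, Voz), given in view frame coordinates."""
    if voz < HITHER or voz >= YON:
        return False                    # too close or too far away
    limit = WIDEN * (voz + 100)         # half-width of the frustum at this depth
    return abs(voy) <= limit and abs(vox) <= limit

print(centre_is_visible(0, 0, 500))     # True: straight ahead
print(centre_is_visible(0, 0, 50))      # False: nearer than the hither plane
print(centre_is_visible(5000, 0, 500))  # False: outside the widened sides
```

Only the centre of the object is tested; the parts that overhang the screen are left to the 2D screen clipping.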

9.7. Velocity of the Observer

The observer (you) does not only use the joystick to do rotations. The observer also has a velocity that may be changing as time passes. To include velocity, all that has to be done is to increment the observer’s position in the world frame in proportion to the velocity. The velocity is a vector, so it has direction as well as size - speed is the magnitude of the velocity. The procedure is to change each component of the observer’s position, each frame, by an amount proportional to the speed times the relevant component of the base vector kv.

In other words, if the view frame is pointing only in the direction of the zw axis, only Ovz should be incremented each time. On the other hand if the view frame is pointing along the xw axis, only Ovx should be incremented each time. For anything between, Ovx, Ovy and Ovz should be incremented in proportion to the components of kv in those directions. This ensures that the direction in which the observer is looking is the direction of motion. The details are explained in the example program.
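As a minimal sketch of this update in Python (the function names are illustrative), the first function works with a true unit vector kv, while the second assumes kv is stored scaled by 2^14 and that the speed is multiplied by 8 first, as in the listing for vel_adj.

```python
def advance(position, kv, speed):
    """Move the observer along the unit vector kv by `speed` for one frame."""
    return tuple(p + speed * k for p, k in zip(position, kv))

def advance_fixed(position, kv_fixed, speed):
    """Integer version: kv_fixed holds the components of kv scaled by 2^14."""
    step = speed << 3                    # the listing scales the speed by 8
    return tuple(p + ((step * k) >> 14)  # divide the product by 2^14
                 for p, k in zip(position, kv_fixed))

# Observer 200 units behind the world origin, looking along zw.
print(advance((0, 0, -200), (0, 0, 1), 10))           # (0, 0, -190)
print(advance_fixed((0, 0, -200), (0, 0, 16384), 5))  # (0, 0, -160)
```

If kv points along xw instead, only the first component of the position changes, as described above.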

9.8. Example Programs

In this program it is possible to fly round the A cube. The program starts with the cube at mid-screen and with the observer stationary. Pressing F2 causes the view frame to move towards the cube at constant speed (pressing F1 causes it to retreat). Thereafter motion is controlled by the joystick. It is possible to fly past the cube and then do an about turn to return to it. Because of 3D clipping, the cube is not displayed if it comes closer than 100 or is farther away than 2000 or is outside the field of view (see above). Motion can be stopped by pressing F6 and the program aborted by pressing F7.

9.8.1. wrld_vw.s

This is the control program. Much of it is similar to that of the previous chapter. It draws an A cube that can be flown around under the control of the joystick. This time the joystick performs rotations about the axes of the view frame, i.e. the pilot. When the joystick is pulled back the viewer looks upwards into the world and if there is forward motion he/she follows a rising trajectory. Other motions of the joystick produce corresponding motion as if the viewer were flying through the world frame. In this way it is possible to fly past an object and then sweep through an arc to return to it.

The program follows the sequence described above. First the view frame base vectors are initialized. Following this the joystick is read and immediately the view frame unit vectors in the world frame are rotated. Then the keyboard is read to see if the speed has changed. Following this the new position of the view frame in the world frame is calculated from the speed and the view frame z-axis base vector kv which is now pointing along the new direction of motion. In motion that is not in a straight line, the velocity is changing all the time (the velocity is a vector and so it can change if its direction changes even if its size, the speed, doesn’t). Finally the unit vectors are themselves regenerated to avoid accumulating errors and passed on directly as the elements of the world-to-view transform before drawing the picture of the A cube.

The function keys F1 and F2 are reverse and forward respectively. F6 is stop and F7 resets the system. Be careful to press the keys lightly and not hold them down since the keyboard buffer is not cleared between frames.

There are no subtleties such as inertia in the motion but these could be incorporated along the lines described in the previous chapter.

9.8.2. core_06.s

Here is where all the work is done. The subroutine dircosines regenerates the base vectors and passes the new values to the viewing transform matrix. To do the regeneration requires vector cross products and normalisation (i.e. scaling the size of the vector to unity). To normalise a vector requires dividing each of its components by the magnitude of the vector, which must be calculated as the square root of the sum of the squares of the components. This is dealt with using the nrm_vec routine used previously for the illumination calculation.
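The arithmetic of normalisation in scaled integers can be sketched in Python as below. This is not a transcription of nrm_vec, whose listing appears in an earlier chapter; it is an assumed equivalent that represents unity by 2^14 and uses the integer square root, so small rounding differences from the real routine are to be expected.

```python
import math

ONE = 1 << 14   # unity in the scaled integer representation

def nrm_vec_sketch(x, y, z):
    """Scale (x, y, z) so that its magnitude is approximately 2^14."""
    mag = math.isqrt(x * x + y * y + z * z)   # integer square root
    if mag == 0:
        return (0, 0, 0)                      # a zero vector has no direction

    def scale(c):
        q = (abs(c) * ONE) // mag             # truncate towards zero
        return q if c >= 0 else -q

    return (scale(x), scale(y), scale(z))

print(nrm_vec_sketch(0, 0, 5))   # (0, 0, 16384)
print(nrm_vec_sketch(3, 4, 0))   # (9830, 13107, 0)
```

The inputs in the program are already close to magnitude 2^14, so the truncation in the square root matters far less there than it would for very short vectors.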

In the subroutine in_joy, the joystick is read and action taken immediately to rotate the view frame base vectors about an axis in the world frame, which here is one of the base vectors, but could be any axis defined by its direction cosines. The matrix for rotation is constructed in v_rot_matx. The elements of this are quite large but the overall work is minimised by calculating pairs of elements at a time due to the similarity of elements with their row and column indices interchanged.
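The listing for v_rot_matx relies throughout on the instruction sequence muls, lsl.l #2, swap to multiply two numbers that are both scaled by 2^14. The Python sketch below imitates that sequence on 32-bit values to show why it amounts to dividing the product by 2^14; the helper names are illustrative and the masking is only there to mimic the register widths.

```python
ONE = 1 << 14   # 1.0 in the scaled representation

def to_signed16(w):
    """Interpret the low 16 bits of w as a signed word."""
    w &= 0xFFFF
    return w - 0x10000 if w & 0x8000 else w

def fixmul(a, b):
    """Product of two values scaled by 2^14, itself scaled by 2^14."""
    product = (a * b) & 0xFFFFFFFF          # muls: 32-bit signed product
    shifted = (product << 2) & 0xFFFFFFFF   # lsl.l #2: shift left two places
    return to_signed16(shifted >> 16)       # swap: keep the former high word

print(fixmul(ONE, ONE))        # 16384, i.e. 1.0 * 1.0 = 1.0
print(fixmul(8192, 8192))      # 4096,  i.e. 0.5 * 0.5 = 0.25
print(fixmul(-ONE, 8192))      # -8192, i.e. -1.0 * 0.5 = -0.5
```

Shifting left by 2 and taking the high word is the same as shifting right by 14, but on the 68000 the swap is cheaper than a long multi-bit shift.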

In vel_adj the new direction of motion, which is the direction pointed to by the kv vector, is combined with the speed to produce a displacement of the view frame. What this amounts to is simply multiplying the components of kv by the speed and adding them to Ovx, Ovy and Ovz, the current value of the view frame origin in the world.

The test for visibility of objects follows the criteria explained above, where the object frame origin (Oox,Ooy,Ooz) is examined to see if it lies in the frustum defined as the field of view. To do this, the origin itself is first transformed into the view frame where it becomes (Vox,Voy,Voz).

One final routine, scrn_adj, is included to reset the centre of the screen at the origin of the world frame. This is not the same as simply moving the view frame in the world frame since it affects the appearance of perspective. Having the view frame centred on the screen is more natural to “flying around in space” experiences.

9.8.3. bss_06.s

This contains the few new variables introduced in this section: the base vectors and the rotations resulting from movement of the joystick.

  * wrld_vw.s
  * Joystick control of the view frame for chapter 9
  *

  *SECTION TEXT
    opt   d+
    bra   main
    include systm_00.s    screens and tables
    include core_06.s     new subroutines

  main  bsr   init        allocate memory etc.
  * transfer all the data
        bsr   transfer
        move.w  oncoords,vncoords
        move.w  vncoords,wncoords
  * Initialise dynamical variables
        move.w  #0,Ovx    view frame
        move.w  #0,Ovy    starts off
        move.w  #-200,Ovz 200 behind world frame
  * Set up view frame base vectors
  * 1.  iv
        lea     iv,a0     align
        move.w  #$4000,(a0)+ view
        clr.w   (a0)+       frame
        clr.w   (a0)        axes
  * 2.  jv
        lea     jv,a0       with
        clr.w   (a0)+       the
        move.w  #$4000,(a0)+  world
        clr.w   (a0)        frame
  * 3.  kv
        lea     kv,a0       axes
        clr.w   (a0)+
        clr.w   (a0)+
        move.w  #$4000,(a0)

        clr.w   speed       start at rest
        clr.w   screenflag  0=screen 1 draw, 1=screen 2 draw
        clr.w   viewflag
  loop4:
  * Switch the screens each time round
        tst.w   screenflag  screen 1 or screen2?
        beq     screen_1    draw on screen 1, display screen2
        bsr     drw2_shw1   draw on screen 2, display screen1
        clr.w   screenflag  and set the flag for next time
        bra     screen_2
  screen_1:
        bsr     drw1_shw2   draw on 1, display 2
        move.w  #1,screenflag
  screen_2:
  * Look for changes in the view frame angles
        bsr     in_joy      read joystick rotate view frame
  * See if the function keys have been pressed to change the speed
        bsr     key_in
  * Adjust to new velocity
        bsr     vel_adj
  * Recalculate view frame base vectors and set up the world-view
  * transform matrix
        bsr     dircosines
  * See if the object is within the visible angle of view
        bsr     viewtest
        tst.b   viewflag    is it visible
        beq     loop4       no,try again
  * Construct compound objects from same face at different positions
        move.w  nparts,d7   how many parts in the object
        subq    #1,d7
        lea     inst_angles,a0   list of angles for each part
        lea     ins_disp,a1     ditto displacements
  * Do one face at a time
  instance:
        move.w  d7,-(sp)      save the count
        move.w  (a0)+,otheta  next otheta
        move.w  (a0)+,ophi    next ophi
        move.w  (a0)+,ogamma  next ogamma
        move.w  (a1)+,Oox     next displacements
        move.w (a1)+,Ooy
        move.w  (a1),Ooz
        movem.l a0/a1,-(sp)   save position in the list
        bsr     otranvw       object to world transform
        bsr     w_tran_v      world to view transform
        bsr     illuminate    if not hidden find the shade
        bsr     perspective   perspective
        bsr     scrn_adj      centre window
        bsr     polydraw      draw this face
        movem.l (sp)+,a0/a1   restore pointers
        move.w  (sp)+,d7      restore the parts count
        dbra    d7,instance   for all the parts of object
        bra     loop4         draw the next frame
  *SECTION DATA
        include data_00.s
        include data_03.s
        include data_05.s
  *SECTION WSS
        include bss_06.s

        END
  *****************************************************************************************
  *                                     Core_06.s                                         *
  *                             subroutines for Chapter 9                                 *
  *****************************************************************************************
  CORE6 EQU 1

   include Core_05.s

  * Find the direction cosines for the transform from the world frame to view frame.
  * These are components of the view frame base vectors in the world frame.
  * To avoid accumulating errors they are regenerated and normalised to a magnitude of:
  * 2^14.
  dircosines
   lea    iv,a0
   lea    jv,a1
   lea    kv,a2
  * Kv is normalised
   move.w (a2),d0
   move.w 2(a2),d1
   move.w 4(a2),d2
    bsr   nrm_vec
   move.w d0,(a2)         new components
   move.w d1,2(a2)
   move.w d2,4(a2)
  * calc vj from cross product of vk & vi using subroutine AxB.
  * A pointer in a2: B pointer in a0:
    bsr   AxB
   move.w d0,(a1)
   move.w d1,2(a1)
   move.w d2,4(a1)
  * finally the cross product of kv & jv is used for iv.
   lea    jv,a2
   lea    kv,a0
    bsr   AxB
   lea    iv,a1
   move.w d0,(a1)        regenerated iv
   move.w d1,2(a1)
   move.w d2,4(a1)
  * The components of the view frame base vectors in the world frame are the elements
  * of the transform matrix required for the world to view transform.
   lea w_vmatx,a0
   lea iv,a1
   lea jv,a2
   lea kv,a3
   move.w (a1)+,(a0)+ matrix elements of the view transform
   move.w (a1)+,(a0)+
   move.w (a1)+,(a0)+
   move.w (a2)+,(a0)+
   move.w (a2)+,(a0)+
   move.w (a2)+,(a0)+
   move.w (a3)+,(a0)+
   move.w (a3)+,(a0)+
   move.w (a3)+,(a0)+
   rts
  *****************************************************************************************
  AxB
   move.w 2(a2),d0 Ay
   muls   4(a0),d0 bz*Ay
   move.w 4(a2),d1 Az
   muls   2(a0),d1 By*Az
   sub.l  d1,d0    Bz*Ay-By*Az
  * 2nd component
   move.w 4(a2),d1 Az
   muls   (a0),d1  Bx*Az
   move.w (a2),d2  Ax
   muls   4(a0),d2 Bz*Ax
   sub.l  d2,d1    Bx*Az-Bz*Ax
  * 3rd component
   move.w (a2),d2  Ax
   muls   2(a0),d2 By*Ax
   move.w 2(a2),d3 Ay
   muls   (a0),d3  Bx*Ay
   sub.l  d3,d2    By*Ax-Bx*Ay
  * Reduce them to < word size by dividing by 2^14
   move   #14,d7
   lsr.l  d7,d0
   lsr.l  d7,d1
   lsr.l  d7,d2
  * normalise them
   bsr    nrm_vec
   rts
  *****************************************************************************************
  * Do a rotation of the view frame about one of the view frame axes in the world frame.
  * The direction cosines for the axis are the base vector components.

  * First a rotation about the view frame x-axis, vx.
  rot_vx
   lea    iv,a0        the axis of rotation
   move.w vxangle,d1   the angle to rotate
    bsr   v_rot_matx    construct the rotation matrix
  * only jv and kv are affected
   lea    jv,a0        1st transform
    bsr   rot_view
   lea    kv,a0        2nd transform
    bsr   rot_view
   rts
  *--------------------------------------------------
  rot_vy
   lea    jv,a0
   move.w vyangle,d1
   bsr    v_rot_matx
  * only iv and kv are affected
   lea    iv,a0        1st transform
    bsr   rot_view
   lea    kv,a0        2nd transform
    bsr   rot_view
   rts
  *--------------------------------------------------
  rot_vz
   lea    kv,a0
   move.w vzangle,d1
   bsr    v_rot_matx
  * only iv and jv are affected
   lea    iv,a0        1st transform
    bsr   rot_view
   lea    jv,a0        2nd transform
    bsr   rot_view
   rts
  *--------------------------------------------------
  * Rotate a view frame base vector. The vector is pointed to by a0. Since it is
  * a unit vector it is specified by three components which are the direction cosines.
  * (nx, ny, nz).
  rot_view
   moveq #2,d6            rows in matrix
   lea   vrot_matx,a3
   link  a6,#-6
  rot_vw1
   move.w (a0),d0        nx components
   move.w 2(a0),d1       ny
   move.w 4(a0),d2       nz
   muls   (a3)+,d0       nx*Mi1
   muls   (a3)+,d1       ny*Mi2
   muls   (a3)+,d2       nz*Mi3
   add.l  d1,d0
   add.l  d2,d0
   lsl.l  #2,d0
   swap   d0
   move.w d0,-(a6)
   dbra   d6,rot_vw1

   move.w (a6)+,4(a0)   z
   move.w (a6)+,2(a0)   y
   move.w (a6)+,(a0)    x
   unlk   a6
   rts
  ***************************************************************************************
  * Construct the rotation matrix for rotations about an arbitrary axis specified by a
  * unit vector with components (direction cosines) n1, n2, n3.
  * ENTRY: Pointer to direction cosines in a0: Angle in d1 (as loaded by the callers).
  v_rot_matx
   lea vrot_matx,a6
   bsr sincos
   move.w d2,d6        sine delta
   move.w d3,d7        cos delta
  * elements M12 and M21
   move   #16384,d5
   move   d5,d0
   move.w (a0),d1      n1
   muls   2(a0),d1     n1*n2
   lsl.l  #2,d1
   swap   d1
   sub.w  d7,d0        1-cosdelta
   move   d0,d4
   muls   d1,d0
   lsl.l  #2,d0
   swap   d0           n1*n2(1-cosdelta)
   move   d0,d2
   move.w 4(a0),d1     n3
   muls   d6,d1        n3*sindelta
   lsl.l  #2,d1
   swap   d1
   sub.w  d1,d0        n1*n2(1-cosdelta)-n3*sindelta
   move.w d0,2(a6)     M12
   add.w  d1,d2        n1*n2(1-cosdelta)+n3*sindelta
   move.w d2,6(a6)     M21
  * elements M13 and M31
   move   d4,d0        1-cosdelta
   muls   (a0),d0      n1*(1-cosdelta)
   lsl.l  #2,d0
   swap   d0
   muls   4(a0),d0     n1*n3(1-cosdelta)
   lsl.l  #2,d0
   swap   d0
   move   d0,d2
   move.w 2(a0),d1     n2
   muls   d6,d1        n2*sindelta
   lsl.l  #2,d1
   swap   d1
   add.w  d1,d0        n1*n3(1-cosdelta)+n2*sindelta
   move.w d0,4(a6)     M13
   sub.w  d1,d2        n1*n3(1-cosdelta)-n2*sindelta
   move.w d2,12(a6)    M31
  * elements M23 and M32
   move   d4,d0        1-cosdelta
   muls   2(a0),d0     n2*(1-cosdelta)
   lsl.l  #2,d0
   swap   d0
   muls   4(a0),d0     n2*n3(1-cosdelta)
   lsl.l  #2,d0
   swap   d0
   move   d0,d2
   move.w (a0),d1      n1
   muls   d6,d1        n1*sindelta
   lsl.l  #2,d1
   swap   d1
   sub.w  d1,d0        n2*n3(1-cosdelta)-n1*sindelta
   move.w d0,10(a6)    M23
   add.w  d1,d2        n2*n3(1-cosdelta)+n1*sindelta
   move.w d2,14(a6)    M32
  * element M11
   move.w (a0),d1      n1
   muls   d1,d1        n1*n1
   lsl.l  #2,d1
   swap   d1
   move   d5,d2        1
   sub.w  d1,d2        1-n1*n1
   muls   d7,d2        (1-n1*n1)cosdelta
   lsl.l  #2,d2
   swap   d2
   add.w  d2,d1        n1*n1+(1-n1*n1)cosdelta
   move.w d1,(a6)      M11
  * element M22
   move.w 2(a0),d1     n2
   muls   d1,d1        n2*n2
   lsl.l  #2,d1
   swap   d1
   move   d5,d2        1
   sub.w  d1,d2        1-n2*n2
   muls   d7,d2        (1-n2*n2)cosdelta
   lsl.l  #2,d2
   swap   d2
   add.w  d2,d1        n2*n2+(1-n2*n2)cosdelta
   move.w d1,8(a6)     M22
  * element M33
   move.w 4(a0),d1     n3
   muls   d1,d1        n3*n3
   lsl.l  #2,d1
   swap   d1
   move   d5,d2
   sub.w  d1,d2        1-n3*n3
   muls   d7,d2        (1-n3*n3)cosdelta
   lsl.l  #2,d2
   swap   d2
   add.w  d2,d1        n3*n3+(1-n3*n3)cosdelta
   move.w d1,16(a6)    M33
   rts
  ************************************************************************
  w_tran_v
   move.w wncoords,d7
   ext.l d7 any to do?
   beq w_tranv3
   subq.w #1,d7
   lea wcoordsx,a0
   lea wcoordsy,a1
   lea wcoordsz,a2
   lea vcoordsx,a3
   lea vcoordsy,a4
   lea vcoordsz,a5
   exg   a3,d3        save cos we're short of registers
   link  a6,#-6       save 3 words
  w_tranv1
   moveq.l #2,d6     3 rows in matrix
   lea w_vmatx,a3    init matx pointer
  * calculate the next vx, vy and vz
  w_tranv2
   move.w (a0),d0          wx
   move.w (a1),d1          wy
   move.w (a2),d2          wz
   sub.w  Ovx,d0
   sub.w  Ovy,d1
   sub.w  Ovz,d2
   muls   (a3)+,d0         wx*Mi1
   muls   (a3)+,d1         wy*Mi2
   muls   (a3)+,d2         wz*Mi3

   add.l  d1,d0
   add.l  d2,d0            wx*Mi1+wy*Mi2+wz*Mi3
   lsl.l  #2,d0
   swap   d0
   move.w d0,-(a6)
   dbra   d6,w_tranv2     repeat for 3 elements

   move.w (a6)+,(a5)+
   move.w (a6)+,(a4)+
   exg    a3,d3           restore vx, save matx pointer
   move.w (a6)+,(a3)+
   exg    a3,d3           save vx, restore matx pointer
   addq.l #2,a0           point to next wx
   addq.l #2,a1           wy
   addq.l #2,a2           wz
   dbra   d7,w_tranv1     repeat for all ocoords
   unlk   a6              close frame
  w_tranv3
   rts
  ****************************************************************************************
  * Set the velocity components
  vel_adj
   lea     kv,a0
   moveq.l #14,d7         ready to divide by 2^14
   move.w  speed,d0
   lsl.w   #3,d0          scale it
   move    d0,d1
   move    d0,d2
   muls    (a0),d0        v*VZx
   lsr.l   d7,d0          /2^14
   add.w   d0,Ovx         xw speed component
   muls    2(a0),d1       v*VZy
   lsr.l   d7,d1
   add.w   d1,Ovy         yw speed component
   muls    4(a0),d2       v*VZz
   lsr.l   d7,d2
   add.w   d2,Ovz
   rts
  ****************************************************************************************
  * Test whether the primitive is visible. See whether its centre (Oox,Ooy,Ooz) lies within
  * the angle of visibility. Oox, Ooy and Ooz are transformed to view coords and then tested.
  viewtest
   moveq.l #2,d6 rows in matrix
   lea     w_vmatx,a3
   link    a6,#-6
   move.w  Oox,d3
   addi.w  #50,d3
   move.w  Ooy,d4
   addi.w  #50,d4
   move.w  Ooz,d5
   addi.w  #50,d5
   sub.w   Ovx,d3       Oox-Ovx relative to the view frame
   sub.w   Ovy,d4
   sub.w   Ovz,d5
  tran0v
   move    d3,d0
   move    d4,d1
   move    d5,d2
   muls    (a3)+,d0     *Mi1
   muls    (a3)+,d1     *Mi2
   muls    (a3)+,d2     *Mi3
   add.l   d1,d0
   add.l   d2,d0        *Mi1+*Mi2+*Mi3
   lsl.l   #2,d0
   swap    d0
   move.w  d0,-(a6)
   dbra    d6,tran0v   repeat for three elements

   move.w  (a6)+,d3    Voz
   move.w  (a6)+,d2    Voy
   move.w  (a6)+,d1    Vox
   move.w  d3,Voz
   move.w  d2,Voy
   move.w  d1,Vox
   unlk    a6
  * Clip Voz. For visibility must have 100<Voz<2000
   cmpi.w  #100,d3     test(Voz-100)
   bmi     invis
   cmpi.w  #2000,d3    test(Voz-2000)
   bpl     invis
  * is it within the view angle?
   addi.w  #100,d3     Voz+100
   add.w   d3,d3       *2
   add.w   d3,d3       *4
   add.w   d3,d3       *8
  * First test horizontal position
   tst.w   d2          is Voy +ve or -ve
   bpl     pos_y
   neg.w   d2
  pos_y
   cmp.w   d2,d3       Voy is +, (test(8*(Voz+100)-Voy))
   bmi     invis
  * Test vertical position
   tst.w   d1 Vox
   bpl     pos_x
   neg.w   d1
  pos_x
   cmp.w   d1,d3       test(8(Voz+100)-Vox)
   bmi     invis
  * It IS visible
   st      viewflag
   rts
  * It is INVISIBLE
  invis
   sf      viewflag
   rts
  **************************************************************************************
  *Adjust screen coords so that view frame (0,0) is at centre
  scrn_adj
   move.w  vncoords,d7
   beq     adj_end
   subq.w  #1,d7
   lea     scoordsy,a0
  adj_loop
   subi.w  #100,(a0)+
   dbra    d7,adj_loop
  adj_end
   rts
  *****************************************************************************************
  *                                     bss_06.s                                          *
  *****************************************************************************************
   include bss_05.s

  * VARIABLES FOR ROTATING THE VIEW FRAME
  iv       ds.w 3 view frame base vector components in world
  jv       ds.w 3
  kv       ds.w 3
  vxangle  ds.w 1 rotation angles about these axes
  vyangle  ds.w 1
  vzangle  ds.w 1
  vrot_matx ds.w 9 rotation matrix about an arbitrary axis
  * VISIBILITY
  viewflag ds.w 1
  Vox      ds.w 1 object centre in view frame
  Voy      ds.w 1
  Voz      ds.w 1

10. A World Scene

In this chapter a world containing many objects is constructed.

The transition from a single graphics primitive to a scene containing several brings a host of new problems. For example, in the complex scene of many objects, spatial relationships must be preserved; objects in the foreground must not be obscured by those in the distance. Some form of depth sorting is required that orders objects for drawing on the basis of their distance from the observer.

Just as important is a sound strategy for ignoring all objects outside the immediate environment of the observer. In a world consisting of hundreds of objects spread out over a landscape, it would be pointlessly time consuming to attempt to draw them all. As in real life, the observer need only be concerned with those that are close by and affect current decisions. We examine these aspects of the multi-object world in turn.

10.1. A Database

Associated with each object in the complex world will be a list of its attributes (type, position, colour, rotation angles, etc.), and the set of lists of all the objects is a database. It contains all information needed to draw the view seen by the observer. Exactly how this database is laid out in memory is very important in determining the speed with which it can be accessed for graphics.

To explain this point further, consider the choices available in ordering the objects in the database. Objects could be entered in the database in order of increasing x (world) coordinate or increasing y coordinate or increasing z coordinate, or indeed at random with no spatial order whatsoever. Objects could be listed according to their type, colour or any one of their attributes. Of all the possibilities there will be those that provide fast access to those objects which are going to be drawn, i.e. those in the immediate vicinity of the observer. It is clear that some kind of ordering in position is needed to achieve this.

10.1.1. A Map

The position of an object in the world is specified by its three coordinates in the form (xw,yw,zw). It is clear that ordering the database in any one single coordinate (xw or yw or zw) alone will not provide an immediate picture of where each object is in relation to its neighbours.

What is needed is a database where the objects are arranged in 3D order. This is difficult to visualise until it is realised that what is being described is nothing more than a map. The similarity to an ordinary route map is fairly exact for the world we will construct, which consists of objects sitting on a surface, just like the surface of the Earth. The advantage of a map of this kind (which is a 2D array) is that the spatial relations of all the objects that lie in a particular region are immediately obvious.

Figure 10.1: Layout of world “tiles”

What is actually done is shown in Figure 10.1. The world space is divided into a 16*16 array of “tiles” (just like on the bathroom wall), each of which has the dimensions 256*256. Each tile is a unit of space to be considered for display. It can contain a collection of objects; in the example program it contains just one, for simplicity. Of course this is not a very extensive world, but there is nothing in the method which limits it to these dimensions; it could be as big as you like and the individual tiles as small as you like. However, “wrap” occurs, so that when the observer strays off any edge he reappears on the opposite side; in this way the world is effectively “infinite”, like a sphere. For our purposes a 16*16 tile world is sufficient to illustrate the method. Each tile defines a region of space which, for the purposes of display, is a single entity. To construct the view seen by the observer, all that has to be done is to find her/his position on the tile grid, select the nearest-neighbour tiles, find which ones are in front of the observer and draw the objects placed on them.

How can this 2D array be laid out in the 1D contiguous RAM? There is nothing new here. The screen itself is a 2D world which is represented in memory as a 1D database. The pixel is analogous to a tile and the four bits which specify its colour are analogous to the data list specifying the attributes of the object on the tile. An arrangement of information in this way, where each element is linked to its adjacent ones, is called a linked list. In this case, the links are permanent and implied by the physical position in the array. The world database is thus a list of 256 bytes, each one holding the attributes of one tile in the 16*16 tile world. In the example program it is held in the file data_08.s. The list starts at map_base and every 16th byte starts a new row of tiles in the z direction. The tile position in the list, mod 16, represents the 16 y values. In this model the world is flat and x does not vary.
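The addressing rule can be sketched in a few lines. The sketch below is in Python rather than the 68000 assembly of the listings, purely for readability; the names `MAP_SIZE` and `tile_index` are invented for illustration and do not appear in the example program. It assumes, as described above, that the offset from map_base is 16*z + y with both tile coordinates wrapped into the range 0-15.

```python
MAP_SIZE = 16  # tiles per side of the world map


def tile_index(ty, tz):
    """Offset from map_base of the tile at (ty, tz), with wrap-around."""
    return (tz % MAP_SIZE) * MAP_SIZE + (ty % MAP_SIZE)


print(tile_index(3, 2))    # 35: third row (z = 2), fourth tile along y
print(tile_index(-1, 0))   # 15: stepping off the y = 0 edge wraps to y = 15
print(tile_index(16, 16))  # 0: one full circuit in both directions
```

Because the map side is a power of two, the wrap reduces to masking with $f, which is what the listing does with andi.w.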

There is very little information needed for the attributes, since the position in space is automatically included by the tile’s position in the list. The first (high) nibble gives the colour of the background (1-15) and the second (low) nibble gives the type of object which is to sit on the tile. At present only six are possible (listed in data_06.s), but in principle there is no limit.
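As a small illustration, an attribute byte can be unpacked as follows. This is a Python sketch, not code from the example program; `decode_tile` is a made-up name, and it assumes that the "first" nibble is the high nibble (background colour) and the "second" is the low nibble (object type), which matches the description of data_06.s later in the chapter.

```python
def decode_tile(attr):
    """Split a tile attribute byte into (background colour, object type)."""
    colour = (attr >> 4) & 0x0F   # assumed: high nibble is the background colour
    obj_type = attr & 0x0F        # assumed: low nibble is the object type
    return colour, obj_type


print(decode_tile(0x32))  # (3, 2): background colour 3, object type 2
```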

10.2. Sorting

As mentioned above, once the visible objects in the near vicinity to the observer have been identified there is the problem of ordering them for drawing so that the more distant ones are drawn first. This is commonly known as the painter’s algorithm, since in painting a picture the last brush stroke overlays earlier ones.

There are many well known algorithms for sorting data in order. Most of the more exotic varieties have been developed to handle large databases with a large number of entries (records). In our case it is necessary to sort a small number (<16) of records in depth order. Sorting at this level is efficiently done by one of the simplest sorting methods, called a bubble sort. Note that at this stage we are referring to the attributes and other accumulated data about the objects to be drawn as records. A record is a set of data of different types where each data type is confined to specific parts or “fields”. This is how data for visible objects is carried around in the example programs. A record is constructed containing all the relevant data to draw in the tile and during depth sorting the records are actually sorted like a deck of cards. That way, although the depth field is the basis for sorting, it carries with it other information for drawing, reducing the retrieval of additional data at a later stage to a minimum. Of course, to avoid slowing things down too much it’s important to keep the record short. In the example program a record consists of 2 long words divided into 7 fields.

10.2.1. A Bubble Sort

Let’s illustrate the bubble sort by direct example from the program. In this we have a short list of records for the visible objects to be displayed. The field on which the sort is based is the second word in the record. It is the distance of the object from the origin of the view frame in the positive z direction, i.e. the direction in which the viewer is looking. The other fields are unimportant for the sorting. Figure 10.2 shows a possible arrangement of simple objects in front of the view frame. The number on each object is its type, which is the content of the second field on its record. A suitable order in which they should be drawn so that objects in the rear lie behind those in the forefront is: 2,1,4,5,6,3. But this is unlikely to be the order in which the tiles have been retrieved from the database. Let us suppose that they have been withdrawn in the order 6,1,3,4,2,5. The sorting now begins.

Figure 10.2: Depth ordering of objects

The procedure in a bubble sort is to go through the list comparing each entry with its successor and making a switch if necessary. In the present case we will order the list with the objects to be drawn first at the top of the list, i.e. the list will be in the order: distant objects - near objects. In the first sweep, the first pair, 6 and 1, is examined, found to be in the wrong order and exchanged. At the same time, to record that the list was found to be out of order, a flag is set. This leaves 1 as the first entry and 6 as the second. Then the next pair, 6 and 3, is examined. The order here is OK so no switch is made. This is continued through the entire list. Each time a switch is made the flag is set (of course it can only be set once, so the following swaps do nothing to the flag). The following lines show the progression of the first sweep:

6,1,3,4,2,5   start
1,6,3,4,2,5   1st pair tested
1,6,3,4,2,5   2nd
1,6,4,3,2,5   3rd
1,6,4,2,3,5   4th
1,6,4,2,5,3   5th

Notice how, like bubbles, the distant objects “float” to the top.

At the end of the list the flag is tested to see if a switch was made. If so the entire list is tested again. This is repeated until a pass is made in which the flag was not set, in which case the list is in order and the sort is deemed to be complete.
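The sweep-and-flag procedure can be sketched as follows, in Python rather than the 68000 assembly of vis_srt. The function names and the depth values are invented for illustration; the depths are chosen only so that the far-to-near order is 2,1,4,5,6,3 as in Figure 10.2. Each record is reduced here to a (type, depth) pair, whereas the program moves the whole two-long-word record.

```python
def bubble_pass(records):
    """One sweep through the list. Swaps neighbours so the larger depth
    comes first; returns True if any swap was made (the 'flag')."""
    swapped = False
    for i in range(len(records) - 1):
        if records[i + 1][1] > records[i][1]:   # successor is farther away
            records[i], records[i + 1] = records[i + 1], records[i]
            swapped = True
    return swapped


def depth_sort(records):
    """Repeat sweeps until one completes without setting the flag."""
    while bubble_pass(records):
        pass
    return records


# Invented depths giving the far-to-near order 2,1,4,5,6,3.
DEPTH = {2: 600, 1: 500, 4: 400, 5: 300, 6: 200, 3: 100}
records = [(t, DEPTH[t]) for t in (6, 1, 3, 4, 2, 5)]

bubble_pass(records)
print([t for t, _ in records])   # [1, 6, 4, 2, 5, 3] after the first sweep

depth_sort(records)
print([t for t, _ in records])   # [2, 1, 4, 5, 6, 3] when the sort is complete
```

For a list this short the quadratic cost of the method is negligible, which is the reason given above for preferring it to more elaborate sorts.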

10.2.2. The Viewing Transform

In this chapter we include two different ways of constructing the view seen by the observer. The first uses control matrices and is a simpler version of the view transform used in the previous chapter. The second is altogether different and much simpler; it uses the Euler angles met in Chapter 8 and is widely used in elementary flight simulators. It is slightly limited as a consequence of the way the angles are defined. We discuss the application of control matrices first.

10.2.3. Control Matrices

Let us suppose that we have reached the stage where all the transforms have been done to present a scene from the viewpoint of an observer. The vertices of all visible objects will then be given in the frame of reference of the observer, i.e. the view frame. If, for example, as a consequence of a movement of the joystick, the observer moves his head to the left, all that is required to show the new view is to rotate the vertices to the right. Rotation of the observer about any axis in his reference frame can be implemented by rotating the view frame vertex coordinates in the opposite direction. Such a transform is called a coordinate transform since it calculates the view seen from a different coordinate system, i.e. the rotated coordinate system of the observer.

So it seems that all that is required to show the view of the observer, as he flies through the world, is to multiply the view frame coordinates by the sequence of rotation matrices representing his accumulated motion to date. It won’t work! First a record of the total sequence of rotations would have to be kept and then, for each frame, they would have to be multiplied out in order. Not exactly an efficient algorithm for fast graphics. After a while the picture would stop altogether as hundreds of matrix multiplications were done for each frame. What is the solution?

The solution to this problem is very similar to the method used in the previous chapter where the view frame base vectors were rotated and then used to construct the view transform. In this case the procedure is done backwards. At any instant, as a result of calculations done to display the previous frame, we know the view transform matrix. This is the starting point for the next frame. The sequence of events at the end of the calculations will be to: 1) do the view transform to convert vertices to the view frame, 2) do the rotations about view frame axes we have been talking about, 3) finally, do the perspective transform and everything else that follows. Here now is the solution to the problem. Instead of regarding the view transform, (V), and the view frame rotations, (C), as separate transforms to be done in sequence to the vertices, (PW), in the world frame to produce first the view frame vertices (PV) and then the rotated vertices (PV’),

(C)(V)(PW) = (C)(PV) = (PV’),

we concatenate (multiply out) (C) and (V) beforehand to produce a rotated view transform, (V’):

(C)(V)(PW) = (V’)(PW) = (PV’).

In this scheme each rotation of the observer is brought about by pre-multiplying the view transform by a “control” matrix appropriate to the rotation. The control matrices for the separate rotations about the view frame xv, yv and zv axes are:

        |  1    0     0    |
        |                  |
 (Cx) = |  0   cosθ  sinθ  |
        |                  |
        |  0   -sinθ  cosθ |
        |                  |

        | cosθ  0   -sinθ  |
        |                  |
 (Cy) = |  0    1     0    |
        |                  |
        | sinθ  0   cosθ   |
        |                  |

        | cosγ  sinγ  0    |
        |                  |
 (Cz) = | -sinγ cosγ  0    |
        |                  |
        |   0     0   1    |
        |                  |

Notice that these are exactly the same as the geometric transforms of Chapter 6 except that the sine terms have the opposite sign. This is because

sin(-θ) = -sin(θ)

and shows that the coordinate transforms are the same as geometric transforms with negative angles, i.e. they correspond to backward rotations. This is saying mathematically what we know to be true: rotating the observer’s head to the left achieves the same end result as rotating the scene to the right. (See Appendix 6).
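The following Python sketch (not the book's 68000 code) builds the three control matrices in floating point and checks the concatenation idea numerically: applying (C) after (V) to a vertex gives the same result as applying the single pre-multiplied matrix (V’) = (C)(V). All names here are invented for illustration, and the "current view transform" is just an arbitrary rotation standing in for whatever the previous frame produced.

```python
import math


def control_x(deg):
    """Coordinate rotation (Cx) about the view frame x axis."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[1, 0, 0], [0, c, s], [0, -s, c]]


def control_y(deg):
    """Coordinate rotation (Cy) about the view frame y axis."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0, -s], [0, 1, 0], [s, 0, c]]


def control_z(deg):
    """Coordinate rotation (Cz) about the view frame z axis."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, s, 0], [-s, c, 0], [0, 0, 1]]


def matmul(a, b):
    """3x3 matrix product a*b."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transform(m, p):
    """Apply the 3x3 matrix m to the column vector p."""
    return [sum(m[i][k] * p[k] for k in range(3)) for i in range(3)]


view = control_z(30)             # stand-in for the current view transform (V)
point_w = [10.0, 20.0, 30.0]     # an arbitrary world vertex (PW)
c = control_y(5)                 # the control matrix for this frame

two_step = transform(c, transform(view, point_w))   # (C)((V)(PW))
new_view = matmul(c, view)                          # (V') = (C)(V), done once
one_step = transform(new_view, point_w)             # (V')(PW)

print(all(abs(a - b) < 1e-9 for a, b in zip(one_step, two_step)))  # True
```

The saving is that the 3x3 product is formed once per frame, after which every vertex needs only one matrix multiplication.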

The physical motions corresponding to the rotations are shown in Figure 10.3. They are: yaw (rotation about the x axis), pitch (rotation about the y axis) and roll (rotation about the z axis).

To speed things up the control matrices can be precalculated. If it is accepted that rotations always occur in 1 degree increments then the elements of the matrices will be sin(1) and cos(1) (multiplied by 16384 as usual). This is indeed what is done in the example program file data_07.s, where angle increments are taken to be 5 degrees, although here rotations only occur about the xv and yv axes.
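As a rough sketch of what such a precalculated table contains, the Python below scales the sine and cosine of a fixed increment by 16384 (2^14) and lays the nine elements out in row order. The function names are invented, and rounding to the nearest integer is an assumption; the values actually stored in the data file may have been truncated instead.

```python
import math

SCALE = 1 << 14   # 16384, the fixed-point scale used throughout the book


def fixed(x):
    """Scale a value in the range -1..1 to a 16-bit fixed-point integer."""
    return round(x * SCALE)   # assumption: round to nearest


def control_matrix_x(deg):
    """(Cx) for a fixed increment, as nine scaled words in row order."""
    c = fixed(math.cos(math.radians(deg)))
    s = fixed(math.sin(math.radians(deg)))
    return [SCALE, 0, 0,
            0, c, s,
            0, -s, c]


def fixed_dot(row, vec):
    """Row of scaled elements times a vector, then divide by 2^14."""
    return sum(m * v for m, v in zip(row, vec)) >> 14


print(control_matrix_x(5))
# [16384, 0, 0, 0, 16322, 1428, 0, -1428, 16322]
print(fixed_dot([0, 16322, 1428], [0, 100, 0]))   # 99
```

The final shift plays the same role as the lsl.l #2 followed by swap in the listings, which together extract the product divided by 2^14.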

There still remains the need to ensure that errors do not accumulate. So, remembering that the rows of the view transform can be visualised as the view frame base vectors, we regenerate the view matrix rows by vector products as was done in Chapter 9.
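One way to picture this regeneration is sketched below in Python, using floating point. It is not the Chapter 9 routine, and the choice of which base vector to keep and which to rebuild is an assumption made here for illustration: kv (the viewing direction) is kept and normalised, iv is rebuilt as jv x kv, and jv is then rebuilt as kv x iv, so that the three rows are again mutually perpendicular unit vectors.

```python
import math


def cross(a, b):
    """Vector (cross) product a x b."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def normalise(v):
    """Scale v to unit length."""
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]


def regenerate(iv, jv, kv):
    """Rebuild an orthonormal right-handed set from slightly drifted rows."""
    kv = normalise(kv)               # assumption: keep the viewing direction
    iv = normalise(cross(jv, kv))    # perpendicular to both jv and kv
    jv = cross(kv, iv)               # completes the right-handed set
    return iv, jv, kv


# Rows that have drifted slightly away from being perpendicular.
iv, jv, kv = regenerate([1.0, 0.01, 0.0], [0.0, 1.0, 0.02], [0.01, 0.0, 1.0])
dot = lambda a, b: sum(x * y for x, y in zip(a, b))
print(abs(dot(iv, jv)) < 1e-12, abs(dot(jv, kv)) < 1e-12, abs(dot(kv, iv)) < 1e-12)
```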

The details of all these stages are shown in the example program, wrld_scn.

10.2.4. Euler Angles

We have already discussed these in section 8.1.1. Euler angles are a way of specifying the orientation of one reference frame with respect to another using only three angles but with some restriction as to how the angles are defined. Most important is that they specify rotations about different axes in a fixed order. There are many combinations possible. The sequence defined below is the one beloved of aeronautical engineers and is called the 321 sequence because it describes rotations about the x, y, and z axes in order. These correlate with motions of the joystick and so describe yaw (bearing), pitch and roll, but note that yaw here, being an initial rotation about the world frame axis, wx, is different from that described in section 10.2.3. The physical rotations of the observer are shown in Figure 10.3.

Figure 10.3: Aeronautical terms for view frame rotations

Here is the sequence of rotations (displacements have already been subtracted off) which carry the world reference frame into the observer’s view frame. It is illustrated in Figure 10.4. Both frames are coincident to begin with and rotations are about view frame axes, wherever they are at the time:

  1. rotate by θ about the x axis - the same for both frames (yaw)

  2. rotate by φ about the y axis (pitch)

  3. rotate by γ about the z axis (roll)

The end product is the orientation of the view frame.

Looking back to section 8.1.1, it will be seen that this is precisely the sequence of rotations done there and so the results, in particular the final matrix product, can be used directly. The results are illustrated in the example program eulr_scn.

Figure 10.4: Rotation sequence of Euler angles
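The three-step sequence can be sketched as a matrix product. The Python below is an illustration only (the names are invented and floating point is used instead of the program's scaled integers). It assumes that each step is a coordinate rotation of the same form as the control matrices given earlier, and that because each rotation is about the view frame axis "wherever it is at the time", each new matrix pre-multiplies the product so far. The closed-form product in section 8.1.1 is not reproduced here, so treat the ordering as an assumption to be checked against it.

```python
import math


def coord_rotation(axis, deg):
    """Coordinate rotation about one axis of the current view frame."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    if axis == "x":
        return [[1, 0, 0], [0, c, s], [0, -s, c]]
    if axis == "y":
        return [[c, 0, -s], [0, 1, 0], [s, 0, c]]
    if axis == "z":
        return [[c, s, 0], [-s, c, 0], [0, 0, 1]]
    raise ValueError(axis)


def matmul(a, b):
    """3x3 matrix product a*b."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def euler_view_matrix(theta, phi, gamma):
    """World-to-view matrix for yaw theta, pitch phi, roll gamma (degrees)."""
    m = coord_rotation("x", theta)                # 1. yaw about x
    m = matmul(coord_rotation("y", phi), m)       # 2. pitch about the new y
    m = matmul(coord_rotation("z", gamma), m)     # 3. roll about the new z
    return m


v = euler_view_matrix(10, 20, 30)
# The rows of a pure rotation are unit vectors.
print(all(abs(sum(x * x for x in row) - 1) < 1e-12 for row in v))  # True
```

With only three angles to store and update, this route avoids the accumulation of errors that the control-matrix method has to correct, at the price of the restriction on how the angles are defined.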

10.3. Running Times

The example program in this chapter allows you to roam around a world containing 256 different graphic entities under the control of the joystick as in a rudimentary flight simulator. There is no limitation here; a larger world database could be constructed with no additional time penalty. A world of this limited size has been used because it is sufficient to illustrate the procedures involved without involving excessively long listings.

Because of the serial way the book has introduced the different stages of getting a moving picture on the screen, and the manner in which programs have been included together to make an overall program of increasing power, there has been an inevitable compromise in speed. The final program in this last chapter could be rationalised and simplified to become substantially faster.

10.4. Example Program

10.4.1. wrld_scn.s and eulr_scn.s

There are two main control programs here. They both allow free flight through a landscape of moving objects but differ in the type of viewing transform used. In one of them, wrld_scn.s, motion is controlled through the joystick and keyboard by means of rotations about the instantaneous axes of the observer’s coordinate frame. In the other, eulr_scn.s, the joystick increments or decrements the Euler angles to vary the orientation of the observer’s reference frame. The detailed controls are:

wrld_scn: up, down, left, right = joystick; roll left = F1, roll right = F2.

eulr_scn: up, down, left, right = joystick.

In both cases the other function keys are:

reverse=F3, slow forward=F4, fast forward=F5, stop=F6, abort=F7.

10.4.2. data_06.s

This is the data file of the graphics primitives, which are simple 3D structures. They appear littered about the landscape according to the database in data_08.s where the primitive associated with each tile is specified in the low nibble of the attribute byte. There are 6 types (0-5) vectored from a jump table at the address primitive. There is no limit to the variety or number; to include a new one simply add one more label to the jump vectors and fill in the details at the end of the list. The primary jump vectors at primitive point to a list of secondary vectors, which are the tables of data for each particular type. For a particular type, data is given in a series of lists:

  • the secondary pointers,

  • the intrinsic colours (0, 1, 2 or 3 for 8 shades of 4 colours),

  • the number of faces on each polyhedral object,

  • the list of edge numbers on each face,

  • the list of vertex connections on all faces in order,

  • the three sets of x, y and z coordinates of the vertices,

  • the total number of vertices and

  • the type of rotation which the object is undergoing.

The type of rotational motion which each type displays is specified in the lowest nibble of the high word of the variable 0n (where n is the type number) and the low word is used by the program to hold the current angle but appears as 0 in the list. The type of rotation is given by the bit which is set in the nibble:

bit 0 - rotation about x axis of object frame

bit 1 - ditto y

bit 2 - ditto z

so that any combination of simultaneous rotations can be included.
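Decoding such a nibble is a matter of testing each bit in turn, as in this short Python sketch. The constant and function names are invented for illustration and are not taken from data_06.s.

```python
ROT_X, ROT_Y, ROT_Z = 0b001, 0b010, 0b100   # bits 0, 1 and 2 of the nibble


def rotation_axes(flags):
    """List the object-frame axes about which the object is rotating."""
    return [name
            for bit, name in ((ROT_X, "x"), (ROT_Y, "y"), (ROT_Z, "z"))
            if flags & bit]


print(rotation_axes(0b101))   # ['x', 'z']: simultaneous rotation about x and z
print(rotation_axes(0))       # []: a static object
```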

10.4.3. data_07.s

Here are the four control matrices for positive and negative rotations about the view frame x and y axes laid out in row order.

10.4.4. data_08.s

Here are the 256 bytes which make up the 16*16 tile world unit. In the program, wrap-around occurs so that motion beyond the extreme left boundary returns the viewer to the right boundary. In this sense, like a sphere, the world is “infinite”. In each byte the high nibble gives the actual colour of the background (0-7, no illumination) and the low nibble gives the object type (0-15) sitting on the tile. Only 6 types are used in the program. The reader can easily invent new ones.

10.4.5. core_07.s

The first subroutine in the core, patch_ext, first takes the observer’s current position and normalises it to lie within the world map. This is where the wrap-around occurs. Following this the location in tile coordinates (Ty,Tz) is calculated by dividing the y and z positions by 256. Remember there are 16*16 tiles spread out over the y-z plane. This is the vertical projection of the observer’s position onto the plane. Then the attributes of the 16 tiles centred about this position are retrieved from the database and, for each tile, stored as the first byte of the first word in the 4-word record which accompanies each one. The offset of each tile from the observer’s position is saved in the second byte of the first record word. This collection of potentially visible tiles is called a patch.
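The steps of patch_ext can be followed more easily in a higher-level sketch. The Python below mirrors the assembly listing rather than replacing it: the function name and the tuple layout are invented for illustration, and each "record" is simplified to (attribute, y offset, z offset), with the offsets in world units relative to the patch centre as in words 3 and 4 of the listing's record.

```python
TILE = 256      # world units per tile side
MAP_SIZE = 16   # tiles per side of the map


def extract_patch(world_map, oposy, oposz):
    """Return the tile coords, the local observer position and the 4*4 patch."""
    oposy &= 0xFFF                      # wrap the position into 0..4095
    oposz &= 0xFFF
    ty, tz = oposy >> 8, oposz >> 8     # tile coordinates: position / 256
    ovy = oposy - ty * TILE             # observer relative to the tile origin
    ovz = oposz - tz * TILE
    patch = []
    for dz in range(-2, 2):             # 4 rows: z offsets -2..1
        for dy in range(-2, 2):         # 4 tiles per row: y offsets -2..1
            index = ((tz + dz) & 0xF) * MAP_SIZE + ((ty + dy) & 0xF)
            patch.append((world_map[index], dy * TILE, dz * TILE))
    return (ty, tz), (ovy, ovz), patch


world_map = list(range(256))            # dummy map: each byte is its own index
tile, local, patch = extract_patch(world_map, 300, 4100)
print(tile, local)    # (1, 0) (44, 4): z = 4100 has wrapped round to 4
print(patch[0])       # (239, -512, -512): tile (15, 14), reached by wrapping
print(len(patch))     # 16
```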

Following this a visibility test is done on every tile in the patch. The test here does not consider a frustum of visibility, but only whether the centre of the tile lies in front of the observer. The central parameter calculated for each tile during this test is its distance (zv) in front of the observer. This is also saved as the second word in the record for depth sorting later. Less than half the tiles pass the visibility test. The visibility sort, next, simply uses a bubble sort to place the records in order of depth, that is in order of decreasing distance from the observer. The tiles with records at the top of the list will be drawn first since they are farthest away.
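The depth-only test can be sketched as below, in Python with the same 2^14 scaling as the listings. The names are invented, and the view matrix is assumed to hold its rows in the order x, y, z, so that the third row yields the depth Voz. The limits follow the testview listing: a tile passes if 50 <= Voz < 2000.

```python
def to_view(view_matrix, centre, observer):
    """Transform a tile centre into view frame coords (matrix scaled by 2^14)."""
    rel = [c - o for c, o in zip(centre, observer)]
    return [sum(m * r for m, r in zip(row, rel)) >> 14 for row in view_matrix]


def in_front(voz, near=50, far=2000):
    """Depth-only visibility test."""
    return near <= voz < far


def visible_tiles(view_matrix, observer, centres):
    """Keep the tiles that pass, paired with their depth for sorting later."""
    visible = []
    for centre in centres:
        vox, voy, voz = to_view(view_matrix, centre, observer)
        if in_front(voz):
            visible.append((voz, centre))
    return visible


IDENTITY = [[16384, 0, 0], [0, 16384, 0], [0, 0, 16384]]   # looking along +z
centres = [(0, 128, 384), (0, 128, -128), (0, -384, 128)]
print(visible_tiles(IDENTITY, (0, 100, 50), centres))
# [(334, (0, 128, 384)), (78, (0, -384, 128))]
```

The second tile lies behind the observer and is dropped; the other two carry their depth with them, ready for the bubble sort.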

The subroutine which follows, drw_it, sets up the data to draw each tile and its resident object in the ordered list of visible tiles, and calls all the earlier subroutines to draw the complete picture. There is a lot going on at this stage. The background on each tile is just a cross of a particular colour so that all the tiles together define a grid on which the objects sit. Since the background is the same for every tile, it is entered directly from the program rather than being stored in a data file. Also since it has a fixed colour without varying illumination, there is no need to call the time-consuming illumination calculations.

The data lists for each object are pulled in from the data file and before it is drawn its new angle in the world frame is determined for whatever mode of rotation is active.

10.4.6. bss_07.s

New variables

10.4.7. systm_01.s

Just a few routines to set up the system. In particular the view point is moved back a bit to -300 on the zv axis to reduce the perspective distortion and eliminate the possibility of parts of objects falling behind the observer, which would not cause the system to crash but would produce a display of spectacular garbage as the basic drawing routines attempt to cope with drawing backwards.

Also a bit of a cheat. The Amiga is being stretched with this program and it helps to speed things up by reducing the size of the window (clip frame) on the screen so that the picture is smaller (ever wondered why games show a tiny screen surrounded by a lot of static ornamentation looking like a console?).

10.4.8. core_08.s

This is the core file for the Euler angle transform.

10.5. Epilogue

How far have we got? What’s next? For a start the overall program can be speeded up considerably by rationalising the anomalies caused by the serial way in which programs have been introduced in this book.

There also remains the inclusion of the third party (you, the world scene and the alien). So far the graphic entities have been static in the sense that their evolution has been determined by their attributes. To give entities life requires that their actions evolve independent of the deterministic structure of the program. But there is really only one truly random element in this scenario - you, the observer. Hence to create life within the computer it is necessary to make the entities respond to your actions. This is of course what happens in all games. Aliens head for the target. To invent a third party is no more complicated than has already been done in reading the movements of the joystick to follow the motion of the observer. In the case of the third party there are no joystick movements, but rather, the response to world conditions.

  * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
  * wrld_scn.s
  * A multi-object scene
  * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
  * A world scene consisting of various types of graphics primitives
  * in motion. The viewer is free to *fly* to any location. At any
  * position a patch consisting of 4*4 "tiles" is visible.
  * Joystick controls yaw and pitch. F1 and F2 control roll.
  * Don't hold down keys as keyboard buffer is not cleared.

  * SECTION TEXT
      opt   d+
      bra   main
      include systm_01.s
      include core_07.s

  main:
  * Initialize the system.
      bsr   init_vars     initialize view transform
      bsr   flg_init      initialize  flags

  loop:
  * Read input and make adjustments.
      bsr   swp_scn       swap the screens
      bsr   dircosines    regenerate view matrix
      bsr   joy_read      see which direction to move
      bsr   in_key        update the speed
      bsr   adj_vel       adjust the velocity

  * Draw the scene
      bsr   scne_drw      everything to complete the picture

  * Draw the next frame
      bra   loop

  *SECTION DATA
      include data_00.s
      include data_06.s
      include data_07.s
      include data_08.s
  *SECTION BSS
      include bss_07.s

  END
  *****************************************************************************************
  *                                   Core_07.s                                           *
  *                           subroutines for chapter 10                                  *
  *****************************************************************************************
   include Core_06.s

  scne_drw          ; draw a scene of several primitives
   bsr patch_ext      select the local scene
   bsr sight_tst      select only the visible ones
   bsr vis_srt        sort in depth order
   bsr drw_it         draw them in depth order
   rts
  *****************************************************************************************
  * Extract the tile patch. Put the 16 tiles in a list at patch_lst
  patch_ext
   move.w oposx,d0 observers x pos
   move.w oposy,d1
   move.w oposz,d2
  * Find position in world. Keep to range 4096
   andi.w #$fff,d0   range x
   andi.w #$fff,d1   range y
   andi.w #$fff,d2   range z
   move.w d0,oposx   restore x etc..
   move.w d1,oposy
   move.w d2,oposz
   move.w d1,d3
   move.w d2,d4
  * Find coords of patch centre=local world origin
   lsr.w  #8,d1
   move.w d1,Ty      y coord. in 16*16 layout
   lsr.w  #8,d2
   move.w d2,Tz      z coord
  * Coords of view frame, referenced to this origin
   lsl.w  #8,d1      Ty*256
   lsl.w  #8,d2      Tz*256
   sub.w  d1,d3      oposy-Ty*256 = Ovy
   move.w d3,Ovy
   sub.w  d2,d4      oposz-Tz*256 = Ovz
   move.w d4,Ovz
   move.w oposx,Ovx (the height is universal)

  * Fetch the attributes of the 16 surrounding tiles from the map and calculate their world
  * coords. Store the data in a record/structure with the format:-
  * WORD 1 : HIBYTE - graphics attribute
  *          LOBYTE - clear
  * WORD 2 : Voz tile centre z in view frame coords
  * WORD 3 : tile y in local world coords
  * WORD 4 : ditto z
  * Ty & Tz are the patch centre coords = local world origins.

   move.w  Ty,d0
   move.w  Tz,d1
  * A 4*4 patch of tiles centred on Ty,Tz is retrieved
   move.w  #-2,d5          z offset of start tile
   lea     map_base,a0
   lea     patch_lst,a1    the local list of 4*4
   move.w  #3,d7           4 z values
  tile_lp1
   move.w  #-2,d4          reset start yoffset
   move.w  #3,d6           4 y values
   move.w  d1,d3           origin Tz
   add.w   d5,d3           +offset = next z
   andi.w  #$f,d3          stay in range 0-15
   lsl.w  #4,d3            *16
  tile_lp2
   move   d0,d2            origin Ty
   add.w  d4,d2            +offset = next y
   andi.w #$f,d2           stay in range 0-15
   add.w  d3,d2            16*z+y = tile address in map
   move.b 0(a0,d2.w),d2    fetch attribute in low byte
   swap   d2               of high word
   clr.w  d2               0 for low word
   lsl.l  #8,d2            everything into high word
   move.l d2,(a1)+         store the first half of the record
  * Calculate the tile local coords: Ooy & Ooz. Coords are offset*256.
   movem.l d4/d5,-(sp)     stack offsets
   lsl    #8,d4            yoffset*256
   swap   d4               in high word
   lsl    #8,d5            zoffset*256
   move.w d5,d4            in low word
   move.l d4,(a1)+
   movem.l (sp)+,d4/d5     restore offsets
   addq   #1,d4            next y offset
   dbra   d6,tile_lp2      for all tiles in this row
   addi.w #1,d5            next z offset
   dbra   d7,tile_lp1      for all rows
   rts
  ******************************************************************************************
  sight_tst
   lea    patch_lst,a0     pointer to source list
   lea    vis_lst,a1       list of visible tiles
   lea    vis_cnt,a2       count of previous
   clr.w  (a2)             zero count
   move.w #15,d7           16 tiles in a patch
   clr.w  Oox              all tiles are on the ground
  sight_tst1
   move.w 4(a0),d0
   addi.w #128,d0
   move.w d0,Ooy           tile
   move.w 6(a0),d0
   addi.w #128,d0
   move.w d0,Ooz           centres
   movem.l d7/a0-a2,-(sp)
   bsr    testview         is tile within field of vision
   movem.l (sp)+,d7/a0-a2
   tst.b  viewflag         visible?
   beq    nxt_tile
   addq.w #1,(a2)          yes, increment visible count
   move.w Voz,2(a0)        save the depth for sorting
   move.l (a0),(a1)+       transfer 1st half to visible list
   move.l 4(a0),(a1)+      2nd half
  nxt_tile
   addq   #8,a0            point to next record
   dbra   d7,sight_tst1    for all tiles
   rts
  ******************************************************************************************
  *Test whether the primitive is visible.
  * Tile centre (Oox, Ooy, Ooz) transformed to view coords then tested. Correct for 2^14.
  testview
   moveq.l #2,d6          3 rows in matrix
   lea     w_vmatx,a3     init matx pointer
   link    a6,#-6         3 words to store temporarily
   move.w  Oox,d3
   move.w  Ooy,d4
   move.w  Ooz,d5
   sub.w   Ovx,d3         Oox-Ovx rel to the view frame
   sub.w   Ovy,d4         Ooy-Ovy
   sub.w   Ovz,d5         Ooz-Ovz
  tranv0
   move d3,d0 restore
   move d4,d1
   move d5,d2
   muls (a3)+,d0 *Mi1
   muls (a3)+,d1 *Mi2
   muls (a3)+,d2 *Mi3
   add.l d1,d0
   add.l d2,d0 *Mi1+*Mi2+*Mi3
   lsl.l #2,d0
   swap d0 /2^14
   move.w d0,-(a6) save it
   dbra d6,tranv0 repeat for 3 elements
   move.w (a6)+,d3 off my stack becomes Voz
   move.w (a6)+,d2 off my stack becomes Voy (centre in view frame)
   move.w (a6)+,d1 off my stack becomes Vox
   move.w d3,Voz
   move.w d2,Voy
   move.w d1,Vox
   unlk a6
  * Clip Voz. To be visible must have 50<Voz<2000
  * This test only looks at depth.
   cmp.w #50,d3 test(Voz-50)
   bmi notvis fail
   cmp.w #2000,d3 test(Voz-2000)
   bpl notvis fail
   st viewflag we can see it
   rts
  notvis
   sf viewflag can't see it
   rts
  *****************************************************************************************
  * Order the visible tiles in order of decreasing Voz (the distance of the tile centre from
  * the view frame origin). Largest Voz's (furthest) should be drawn first.
  vis_srt
   move.w vis_cnt,d7 number to do
   beq srt_quit
   subq #1,d7
   beq srt_quit
   subq #1,d7
  * Bubble sort
  vis_srt1
   lea vis_lst+2,a0 pointer to 1st record Voz
   movea.l a0,a1
   addq.l #8,a1 pointer to 2nd Voz
   move d7,d6 reset count
   clr.w srt_flg
  vis_srt2
   cmpm.w (a0)+,(a1)+ test(Voz2-Voz1)
   ble no_swap 1st is farther
   move.l -4(a0),d0 fetch 1st record
   move.l (a0),d1
   move.l -4(a1),-4(a0) make
   move.l (a1),(a0) 2nd the 1st
   move.l d0,-4(a1) & 1st
   move.l d1,(a1) 2nd
   st srt_flg
  no_swap
   addq.l #6,a0 point to next record Voz
   addq.l #6,a1 and the one following
   dbra d6,vis_srt2
   tst.w srt_flg
   beq srt_quit
   bra vis_srt1
  srt_quit
   rts
  *****************************************************************************************
  drw_it
  * draw the visible tiles
   move.w vis_cnt,d7
   beq drw_it_out
   subq.w #1,d7
   lea vis_lst,a0 ptr to list
  drw_it1
   movem.l d7/a0,-(sp)
   bsr set_prim drw next prim
   movem.l (sp)+,d7/a0
   addq.l #8,a0 next record
   dbra d7,drw_it1
  drw_it_out
   rts
  *****************************************************************************************
  * Set up next primitive for drawing; pointer to record in a0.
  * 1. DO BACKGROUND
  set_prim
   move.l a0,-(sp) save ptr
   bsr ldup_bkg
   bsr otranw obj->world
   bsr w_tran_v world->view
  * Background always visible at a constant illumination level
   movea.l (sp)+,a0 restore ptr
   move.w (a0),d0 1st word of record
   move.l a0,-(sp) save pointer
   lsr.w #8,d0 top byte
   lsr.w #4,d0 top nibble is colour
   move.w d0,col_lst the final colours
   move.w d0,col_lst+2
   bsr perspective
   bsr scrn_adj centre it
   bsr polydraw
  *2. Draw the object
   movea.l (sp)+,a6 restore pointer
   bsr ldup_obj
   bsr otranw
   bsr w_tran_v
   bsr illuminate
   bsr perspective
   bsr scrn_adj
   bsr polydraw
   rts
  *****************************************************************************************
  * Load background data as program data. Background is a grid.
  ldup_bkg
   move.w #2,npoly 2 rectangles
   move.l #$40004,snedges 4 edges in each
   lea sedglst,a2 edgelist 0,1,2,3,0,4,5,6,7,4
   move.l #1,(a2)+ edges 0,1
   move.l #$20003,(a2)+ 2,3
   move.l #4,(a2)+ 0,4
   move.l #$50006,(a2)+ 5,6
   move.l #$70004,(a2)+ 7,4
  * The background vertices define a cross. All x coords are zero.
   lea ocoordsx,a2  vertex coords x =
   move.l #0,(a2)+ 0,0
   move.l #0,(a2)+ 0,0
   move.l #0,(a2)+ 0,0
   move.l #0,(a2)  0,0
   lea ocoordsy,a2 y =
   move.l #$ff800080,(a2)+ -128,128
   move.l #$80ff80,(a2)+ 128,-128
   move.l #$fffcfffc,(a2)+ -4,-4
   move.l #$40004,(a2) 4,4
   lea ocoordsz,a2
   move.l #$40004,(a2)+ 4,4
   move.l #$fffcfffc,(a2)+ -4,-4
   move.l #$ff800080,(a2)+ -128,128
   move.l #$80ff80,(a2)+ 128,-128
   move.w #8,oncoords
   move.w #8,vncoords
   move.w #8,wncoords
  * The tile centre in the world frame is Oox=0 and the contents of the 3rd & 4th
  * words of the records.
   move.w #0,Oox
   move.w 4(a0),Ooy 3rd word
   addi.w #128,Ooy
   move.w 6(a0),Ooz 4th word
   addi.w #128,Ooz
   clr.w otheta no orientation
   clr.w ophi
   clr.w ogamma
   rts
  ****************************************************************************************
  * This has no label in the book and therefore it seems unlikely that it will ever be used.
   move.w #1,npoly 1 rectangle
   move.l #$4,snedges 4 edges
   lea    sedglst,a2 edgelist 0,1,2,3,0
   move.l #1,(a2)+ edges 0,1
   move.l #$20003,(a2)+ 2,3
   move.l #0,(a2)+ 0,4
  * The background vertices are the corners of the tile.
   lea ocoordsx,a2 vertex coords x =
   move.l #0,(a2)+ 0,0
   move.l #0,(a2)+ 0,0
   move.l #0,(a2)+ 0,0
   move.l #0,(a2)+  0,0
   lea ocoordsy,a2 y =
   clr.l (a2)+
   move.l #$ff00ff,(a2)
   lea ocoordsz,a2
   move.l #$ff,(a2)+ 0,255
   move.l #$ff0000,(a2)+ 255,0
   move.w #4,oncoords
   move.w #4,vncoords
   move.w #4,wncoords
  * The tile centre in the world frame is Oox=0 and the contents of the 3rd & 4th
  * words of the records.
   move.w #0,Oox
   move.w 4(a0),Ooy 3rd word
   move.w 6(a0),Ooz 4th word
   clr.w otheta no orientation
   clr.w ophi
   clr.w ogamma
   rts
  ****************************************************************************************
  ldup_obj
  * Find out what type of object it is.
   move.w (a6),d0 top word
   lsr.w #8,d0 top byte
   andi.w #$f,d0 low nibble is type (call it n)
   lsl.w #2,d0 *4 for offset
   lea primitive,a5 ptr to vector table
   movea.l 0(a5,d0.w),a5 ptr to type n lists
   movea.l 4(a5),a2 pointer to npolyn
   move.w (a2),d7 got it
   move.w d7,npoly
   subq.w #1,d7
   move d7,d0
   movea.l 8(a5),a0 ptr to nedge list
   movea.l a0,a4 saved
   lea snedges,a1 destination
   move.l (a5),a2 ptr to intrinsic colours
   lea srf_col,a3 dest
  obj_lp1
   move.w (a0)+,(a1)+ transfer edge numbers
   move.w (a2)+,(a3)+ transfer intrinsic colours
   dbra d0,obj_lp1
  * Calculate total number of edges
   move.w d7,d0 restore count
   clr d1
   clr d2
  obj_lp2
   add.w (a4)+,d2 number of edges
   addq #1,d2 and with last repeated
   dbra d0,obj_lp2
  * Move the edge list
   subq #1,d2 counter
   movea.l 12(a5),a0 edglstn, the source
   lea sedglst,a1 dest
  obj_lp3
   move.w (a0)+,(a1)+ pass it
   dbra d2,obj_lp3
  * and the coords list
   movea.l 28(a5),a0 ptr to num vertices
   move.w (a0),d1 num vertices
   move.w d1,oncoords
   move.w d1,vncoords
   move.w d1,wncoords
   subq #1,d1 counter
   movea.l 16(a5),a0 ptr to object x
   lea ocoordsx,a1
   movea.l 20(a5),a2 object y
   lea ocoordsy,a3
   movea.l 24(a5),a4 object z
   movea.l a5,a6
   lea ocoordsz,a5
  obj_lp4
   move.w (a0)+,(a1)+
   move.w (a2)+,(a3)+
   move.w (a4)+,(a5)+
   dbra d1,obj_lp4
  * Increment the rotation angle
   bsr next_rot
   addi.w #128,Ooy
   addi.w #128,Ooz
   rts
  *****************************************************************************************
  * Increment the rotation of the object.
  next_rot
   movea.l 32(a6),a0 ptr to angle and flag
   move.l (a0),d0 top word is flag, bottom is angle
   move.l d0,d1
   andi.l #$ffff,d0 the angle
   addi.w #2,d0 increment it
   cmp #360,d0
   blt obj_lp5
   subi #360,d0
  obj_lp5
   move.w d0,2(a0) next angle
  * see what angles to rotate
   swap d1
   andi.w #$f,d1 flag in lo nib
  * flags are set:bit 0= xrot 1=yrot 2=zrot
   lsl.w #2,d1 offset
   lea rot_vec,a0 ptr to jump table
   move.l 0(a0,d1.w),a0
   jmp (a0)
  rot_vec
   dc.l no_rot,rotx,roty,rotxy,rotz,rotxz,rotyz,rotxyz
  no_rot rts
  rotx
   move.w d0,otheta
   rts
  roty
   move.w d0,ophi
   rts
  rotxy
   move.w d0,otheta
   move.w d0,ophi
   rts
  rotz
   move.w d0,ogamma
   rts
  rotxz
   move.w d0,otheta
   move.w d0,ogamma
   rts
  rotyz
   move.w d0,ophi
   move.w d0,ogamma
   rts
  rotxyz
   move.w d0,otheta
   move.w d0,ophi
   move.w d0,ogamma
   rts
  ****************************************************************
  * These are the rotation routines to which the joystick reader sends us.
  rot_down
   lea rot_y_neg,a0 ptr to ctrl matrix
   bsr ctrl_view
   rts
  rot_up
   lea rot_y_pos,a0 ptr to ctrl matrix
   bsr ctrl_view
   rts
  rot_left
   lea rot_x_pos,a0 ptr to ctrl matrix
   bsr ctrl_view
   rts
  rot_right
   lea rot_x_neg,a0 ptr to ctrl matrix
   bsr ctrl_view
   rts
  roll_left
   lea rot_z_neg,a0 ptr to ctrl matrix
   bsr ctrl_view
   rts
  roll_right
   lea rot_z_pos,a0 ptr to ctrl matrix
   bsr ctrl_view
   rts
  ctrl_view
  * multiply the control matrix pointed to by a0 by the view matrix
  * to calculate the new elements of the view base vectors.
  * 1.base vector iv
   lea w_vmatx,a1 ptr to view matrix
   lea iv,a2 ptr to view frame base vector
   move.w #2,d6 3 elements to iv
   movea.l a1,a3 set view ptr
  iv_loop
   move.w (a3),d1    next view elements
   move.w 6(a3),d2
   move.w 12(a3),d3
   muls (a0),d1
   muls 2(a0),d2
   muls 4(a0),d3
   add.l d2,d1
   add.l d3,d1
   lsl.l #2,d1
   swap d1
   move.w d1,(a2)+ next element in base vector
   addq.l #2,a3 next column in base vector
   dbra d6,iv_loop
  *2. No need to do jv; it's calculated from the other two.
  *3. base vector kv
   lea kv,a2
   move.w #2,d6
   movea.l a1,a3
  kv_loop
   move.w (a3),d1
   move.w 6(a3),d2
   move.w 12(a3),d3
   muls 12(a0),d1
   muls 14(a0),d2
   muls 16(a0),d3
   add.l d2,d1
   add.l d3,d1
   lsl.l #2,d1
   swap d1
   move.w d1,(a2)+ next element in base vector
   addq.l #2,a3 next column in base vector
   dbra d6,kv_loop
   rts
  *****************************************************************************************
  * Set the velocity components
  adj_vel
   lea kv,a0
   move.w #14,d7
   move.w speed,d0
   lsl.w #4,d0
   move d0,d1
   move d0,d2
   muls (a0),d0 v*VZx
   lsr.l d7,d0
   add.w d0,oposx xw speed component
   bpl adj1
   clr.w oposx oposx must be > 0
  adj1
   muls 2(a0),d1 v*VZy
   lsr.l d7,d1
   add.w d1,oposy yw speed component
   muls 4(a0),d2 v*VZz
   lsr.l d7,d2
   add.w d2,oposz zw speed component
   rts
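The matrix products in testview and ctrl_view above hold each matrix element as a 16-bit integer scaled by 2^14 (so 16384 stands for 1.0) and remove the scale factor with lsl.l #2 followed by swap. The Python sketch below is only a model of that arithmetic and is not part of the program; the function names are invented for illustration.

```python
# Model of the 2^14 fixed-point arithmetic used by testview and ctrl_view.
# All names are hypothetical; only the arithmetic mirrors the listing.

ONE = 1 << 14  # 16384 represents 1.0


def to_signed16(value):
    """Interpret the low 16 bits of value as a signed word."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def lsl2_swap(product):
    """Model `lsl.l #2,d0` then `swap d0`, reading the low word of d0."""
    shifted = (product << 2) & 0xFFFFFFFF  # 32-bit register after lsl.l #2
    return to_signed16(shifted >> 16)      # swap brings the high word down


def transform_row(row, vector):
    """One row of the matrix product: three muls, two add.l, then rescale."""
    total = sum(m * v for m, v in zip(row, vector))
    return lsl2_swap(total)


print(transform_row([ONE, 0, 0], [100, 0, 0]))        # 100
print(transform_row([16322, 0, -1428], [100, 0, 0]))  # 99
```

In this model the result equals the sum of products divided by 2^14 and rounded down, provided the sum lies between -2^29 and 2^29; outside that range the left shift would lose the sign.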
  *****************************************************************************************
  *                                        bss_07.s                                       *
  *                                variables for chapter 10                               *
  *****************************************************************************************
   include bss_06.s
  * Observer's position in world.(mod 4096)
  oposx ds.w 1
  oposy ds.w 1
  oposz ds.w 1

  * Tile offset in 16*16 patch
  Ty ds.w 1
  Tz ds.w 1

  * Tile lists
  patch_lst ds.l   32 records of 16 tiles in patch
  vis_lst   ds.l   32 records of visible tiles

  * List variables
  vis_cnt ds.w 1   number of visible tiles
  srt_flg ds.w 1   set during depth sorting
  *****************************************************************************************
  *                                System_01.s                                            *
  *****************************************************************************************
  init_vars:
  * set up the screens
   bsr  init
  * Set up the view point
   move.w #100,oposx
   clr.w oposy
   clr.w oposz
  * and the clip frame
   move.w #50,clp_xmin
   move.w #270,clp_xmax
   move.w #30,clp_ymin
   move.w #170,clp_ymax
  * Set up view frame base vectors
  *1. iv
   lea    iv,a0                 align view frame axes
   move.w #$4000,(a0)+
   move.w #0,(a0)+
   move.w #0,(a0)
  *2. jv
   lea   jv,a0                  with the world frame
   clr.w (a0)+
   move.w #$4000,(a0)+
   clr.w (a0)
  *3.kv
   lea    kv,a0
   move.w #0,(a0)+
   clr.w (a0)+
   move.w #$4000,(a0)

  flg_init:
  * Initialize flags and other variables
   clr.w speed                  start at rest
   clr.w screenflag             0=screen 1 draw, 1=screen 2 draw
   clr.w viewflag
  * Move the view point to -300 on the view frame z axis
   lea   persmatx,a0
   move.w #300,d0
   move.w d0,(a0)
   move.w d0,10(a0)
   move.w d0,30(a0)
   rts
  swap_scn:
    tst.w screenflag  screen 1 or screen2?
    beq   screen_1    draw on screen 1, display screen2
    bsr   drw2_shw1   draw on screen 2, display screen1
    clr.w screenflag  and set the flag for next time
    bra   screen_2

  screen_1:
    bsr     drw1_shw2   draw on 1, display 2
    move.w  #1,screenflag and set the flag for next time
  screen_2:
    rts
  *****************************************************************************************
  *                                     data_06.s                                         *
  *                                Data for chapter 10                                    *
  *****************************************************************************************

   include data_05.s            (ensure we include data_03.s as well).
  * The vector table of graphics primitives in 8 shades of 4 colours.
  primitive:
    dc.l prim0,prim1,prim2,prim3,prim4,prim5

  * Now follow the vector tables for each primitive.

  prim0 ; A simple block
   dc.l colrs0,npoly0,nedg0,edglst0,prm0x,prm0y,prm0z,npts0,theta0
  colrs0  dc.w 1,1,1,1,1
  npoly0  dc.w 5
  nedg0   dc.w 4,4,4,4,4
  edglst0 dc.w 0,1,2,3,0,3,2,4,5,3,5,4,6,7,5,7,6,1,0,7,1,6,4,2,1
  prm0x   dc.w 0,50,50,0,70,0,70,0
  prm0y   dc.w -6,-6,6,6,6,6,-6,-6
  prm0z   dc.w -6,-6,-6,-6,6,6,6,6
  npts0   dc.w 8
  theta0  dc.l $10000

  prim1 ; An inverted pyramid
   dc.l colrs1,npoly1,nedg1,edglst1,prm1x,prm1y,prm1z,npts1,theta1
  colrs1  dc.w 2,2,2,2,3
  npoly1  dc.w 5
  nedg1   dc.w 3,3,3,3,4
  edglst1 dc.w 0,1,2,0,0,2,3,0,0,3,4,0,0,4,1,0,1,4,3,2,1
  prm1x   dc.w 0,75,75,75,75
  prm1y   dc.w 0,-32,32,32,-32
  prm1z   dc.w 0,-32,-32,32,32
  npts1   dc.w 5
  theta1  dc.l $10000

  prim2 ; A nugget.
   dc.l colrs2,npoly2,nedg2,edglst2,prm2x,prm2y,prm2z,npts2,theta2
  colrs2  dc.w 1,1,0,1,0,0,1,0,1,1,0,1,0,1
  npoly2  dc.w 14
  nedg2   dc.w 4,4,4,4,4,4,4,4,4,4,4,4,4,4
  edglst2 dc.w 1,6,4,2,1,0,1,2,3,0,3,2,4,5,3,4,6,7,5,4,6,1,0,7,6,8,0,3,11,8,3
          dc.w 5,10,11,3,5,7,9,10,5,7,0,8,9,7,8,11,13,12,8,11,10,14,13,11,10,9
          dc.w 15,14,10,9,8,12,15,9,12,13,14,15,12
  prm2x   dc.w 40,60,60,40,60,40,60,40,20,20,20,20,0,0,0,0
  prm2y   dc.w -30,-10,10,30,10,30,-10,-30,-30,-30,30,30,-10,10,10,-10
  prm2z   dc.w -30,-10,-10,-30,10,30,10,30,-30,30,30,-30,-10,-10,10,10
  npts2   dc.w 16
  theta2  dc.l $70000

  prim3 ; A Tee.
   dc.l colrs3,npoly3,nedg3,edglst3,prm3x,prm3y,prm3z,npts3,theta3
  colrs3  dc.w 2,2,2,2,2,2,2,2,2,2
  npoly3  dc.w 10
  nedg3   dc.w 4,4,4,4,4,4,4,4,4,4
  edglst3 dc.w 0,1,2,3,0,3,2,4,7,3,4,5,6,7,4,5,1,0,6,5
          dc.w 8,11,14,15,8,13,14,11,10,13,12,13,10,9,12,8,15,12,9,8
          dc.w 12,15,14,13,12,10,11,8,9,10
  prm3x   dc.w 0,45,45,0,45,45,0,0,70,45,45,70,45,45,70,70
  prm3y   dc.w -10,-10,10,10,10,-10,-10,10,128,128,128,128,-128,-128,-128,-128
  prm3z   dc.w -10,-10,-10,-10,10,10,10,10,10,10,-10,-10,10,-10,-10,10
  npts3   dc.w 16
  theta3  dc.l $10000

  prim4 ; A roller
   dc.l colrs4,npoly4,nedg4,edglst4,prm4x,prm4y,prm4z,npts4,theta4
  colrs4  dc.w 1,0,1,0,1,0,1,1
  npoly4  dc.w 8
  nedg4   dc.w 4,4,4,4,4,4,6,6
  edglst4 dc.w 1,2,8,7,1,0,1,7,6,0,5,0,6,11,5,4,5,11,10,4,3,4,10,9,3
          dc.w 2,3,9,8,2,4,3,2,1,0,5,4,6,7,8,9,10,11,6
  prm4x   dc.w 0,40,40,0,-40,-40,0,40,40,0,-40,-40
  prm4y   dc.w -8,-8,-8,-8,-8,-8,8,8,8,8,8,8
  prm4z   dc.w -45,-20,20,45,20,-20,-45,-20,20,45,20,-20
  npts4   dc.w 12
  theta4  dc.l $20000

  prim5 ; Another roller
   dc.l colrs5,npoly5,nedg5,edglst5,prm5x,prm5y,prm5z,npts5,theta5
  colrs5  dc.w 3,2,3,2,3,2,3,3
  npoly5  dc.w 8
  nedg5   dc.w 4,4,4,4,4,4,6,6
  edglst5 dc.w 1,2,8,7,1,0,1,7,6,0,5,0,6,11,5,4,5,11,10,4,3,4,10,9,3
          dc.w 2,3,9,8,2,4,3,2,1,0,5,4,6,7,8,9,10,11,6
  prm5x   dc.w 0,40,40,0,-40,-40,0,40,40,0,-40,-40
  prm5y   dc.w -8,-8,-8,-8,-8,-8,8,8,8,8,8,8
  prm5z   dc.w  -45,-20,20,45,20,-20,-45,-20,20,45,20,-20
  npts5   dc.w 12
  theta5  dc.l $40000
  *****************************************************************************************
  *                                       data_07.s                                       *
  *                            Control matrices for rotation                              *
  *****************************************************************************************
  * +ve rotation about the view frame x axis (LEFT) by 5 degrees.
  rot_x_pos:
    dc.w 16384,0,0,0,16322,1428,0,-1428,16322

  *-ve rotation about the xv axis (RIGHT)
  rot_x_neg:
    dc.w 16384,0,0,0,16322,-1428,0,1428,16322

  *+ve rotation about the yv axis (UP)
  rot_y_pos:
    dc.w 16322,0,-1428,0,16384,0,1428,0,16322

  *-ve rotation about the yv axis (DOWN)
  rot_y_neg:
    dc.w 16322,0,1428,0,16384,0,-1428,0,16322

  *+ve rotation about the zv axis (ROLL RIGHT)
  rot_z_pos:
    dc.w 16322,1428,0,-1428,16322,0,0,0,16384

  *-ve rotation about the zv axis (ROLL LEFT)
  rot_z_neg:
    dc.w 16322,-1428,0,1428,16322,0,0,0,16384
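The entries 16322 and 1428 in the control matrices are consistent with cos 5° and sin 5° scaled by 2^14 and rounded to the nearest integer. The Python sketch below is a check on the tables rather than part of the program; control_matrices is a hypothetical helper, and the element order is taken from the dc.w lines above.

```python
import math

ONE = 1 << 14  # fixed-point scale: 16384 represents 1.0


def control_matrices(degrees):
    """Return (rot_x, rot_y, rot_z) as flat 9-element lists scaled by 2^14."""
    angle = math.radians(degrees)
    c = round(math.cos(angle) * ONE)
    s = round(math.sin(angle) * ONE)
    rot_x = [ONE, 0, 0, 0, c, s, 0, -s, c]
    rot_y = [c, 0, -s, 0, ONE, 0, s, 0, c]
    rot_z = [c, s, 0, -s, c, 0, 0, 0, ONE]
    return rot_x, rot_y, rot_z


rot_x_pos, rot_y_pos, rot_z_pos = control_matrices(5)
print(rot_x_pos)  # [16384, 0, 0, 0, 16322, 1428, 0, -1428, 16322]
```

Calling control_matrices(-5) gives the three negative-rotation tables in the same way, so a different step angle could be tried by regenerating all six tables together.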
  *****************************************************************************************
  *                                       data_08.s                                       *
  * The world map for chapter 10. Each byte gives the attribute of a size 256*256 tile in *
  * a 16*16 tile world. The attributes' composition is thus:-                             *
  * High Nibble : Background colour (1-7)                                                 *
  * Low  Nibble : Primitive type (0-5)                                                    *
  *****************************************************************************************
  map_base
   dc.b $62,$62,$62,$50,$41,$35,$35,$35
   dc.b $35,$35,$35,$43,$45,$54,$54,$64
   dc.b $62,$62,$62,$55,$42,$33,$35,$35
   dc.b $35,$35,$32,$44,$45,$54,$54,$64
   dc.b $52,$52,$52,$52,$44,$35,$34,$35
   dc.b $35,$30,$35,$41,$44,$54,$54,$64
   dc.b $45,$41,$42,$42,$42,$35,$22,$23
   dc.b $23,$20,$25,$25,$44,$44,$40,$65
   dc.b $33,$35,$30,$32,$32,$22,$25,$25
   dc.b $25,$23,$24,$24,$35,$32,$35,$31
   dc.b $35,$32,$35,$35,$32,$22,$11,$11
   dc.b $10,$10,$24,$24,$33,$35,$32,$34
   dc.b $20,$25,$25,$25,$20,$21,$13,$13
   dc.b $13,$13,$20,$25,$25,$25,$20,$25
   dc.b $24,$25,$25,$25,$21,$21,$13,$13
   dc.b $13,$13,$20,$20,$25,$25,$20,$25
   dc.b $20,$25,$25,$25,$22,$22,$13,$13
   dc.b $13,$13,$14,$24,$25,$25,$22,$23
   dc.b $25,$23,$25,$25,$23,$22,$13,$13
   dc.b $13,$13,$14,$23,$25,$25,$25,$25
   dc.b $31,$35,$30,$35,$31,$21,$22,$22
   dc.b $20,$20,$20,$35,$35,$34,$20,$33
   dc.b $45,$40,$40,$40,$41,$41,$22,$22
   dc.b $22,$25,$30,$40,$40,$42,$45,$41
   dc.b $40,$40,$41,$41,$44,$45,$30,$35
   dc.b $35,$35,$32,$45,$40,$50,$55,$55
   dc.b $61,$61,$61,$51,$53,$45,$35,$32
   dc.b $35,$35,$31,$45,$40,$50,$60,$60
   dc.b $61,$61,$61,$52,$55,$44,$33,$35
   dc.b $33,$35,$30,$45,$40,$50,$60,$60
   dc.b $61,$61,$61,$55,$51,$45,$30,$35
   dc.b $32,$35,$35,$41,$45,$50,$60,$60
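Each map byte packs two attributes, as the header of data_08.s describes: the high nibble is the background colour and the low nibble is the primitive type. The Python sketch below only illustrates that decoding for the first sixteen bytes of the table; decode_attribute is a hypothetical name and nothing here is part of the program.

```python
def decode_attribute(attribute):
    """Split a map byte into (background colour, primitive type)."""
    colour = (attribute >> 4) & 0x0F   # high nibble
    prim_type = attribute & 0x0F       # low nibble
    return colour, prim_type


# The first sixteen bytes of map_base, copied from the listing.
first_bytes = [0x62, 0x62, 0x62, 0x50, 0x41, 0x35, 0x35, 0x35,
               0x35, 0x35, 0x35, 0x43, 0x45, 0x54, 0x54, 0x64]

print([decode_attribute(b) for b in first_bytes[:4]])
# [(6, 2), (6, 2), (6, 2), (5, 0)]
```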
  * eulr_scn.s
  * A multi-object scene
  *****************************
  * A world scene consisting of various types of graphics primitives
  * in motion. The viewer is free to "fly" to any location with
  * flight simulator type control from the joystick. At any
  * position a patch consisting of 4*4 tiles is visible.

  * SECTION TEXT
    opt   d+
    bra   main
    include systm_01.s
    include core_08.s

  main:
  * Initialize the system
    bsr   init_vars   initialize view transform
    bsr   flg_init    initialize flags

  loop:
  * Read input and make adjustments
    bsr   swp_scn     swap the screens
    bsr   joy_look    see which directions to move
    bsr   angle_update  change the euler angles
    bsr   wtranv_l      construct the view transform
    bsr   vtran_move    move it to the base vectors
    bsr   in_key        update the speed
    bsr   adj_vel       adjust the velocity

  * Draw the scene
    bsr   scne_drw      everything to complete the picture

  * Draw the next frame
    bra   loop

  * SECTION DATA
    include data_00.s
    include data_06.s
    include data_07.s
    include data_08.s
  * SECTION BSS
    include bss_07.s

  END
  *****************************************************************************************
  *                                 Core_08.s                                             *
  *                  Subroutines for euler_scn Chapter 12                                 *
  *****************************************************************************************
   include core_07.s    previous subroutines

  joy_look:
  * Change the euler angles etheta and ephi (vtheta and vphi from chapter 10 are the same)
  * Read the joystick and update the variables accordingly

    move.w  $dff00c,d0    joystick 2
    btst    #8,d0
    beq     eright_dn
    btst    #9,d0
    beq     eup_jy
    bra     eleft_jy

  eright_dn:
    btst    #0,d0
    beq     eout_jy
    btst    #1,d0
    beq     edown_jy
    bra     eright_jy
  eout_jy rts

  eup_jy:
    bsr     erot_down     rotate view frame down about vy axis
    rts
  edown_jy:
    bsr     erot_up       rotate up about vy axis
    rts
  eleft_jy:
    bsr     erot_left     rotate left about vx axis
    rts
  eright_jy:
    bsr     erot_right    rotate right about wx axis
    rts

  erot_down:
  * Rotate down about the yv axis. Decrement ephi (same as vphi)
    move.w  #-5,vphi_inc
    rts

  erot_up:
  * Rotate up about the xw axis. Increment ephi (same as vphi)
    move.w  #5,vphi_inc
    rts
  erot_left:
  * Rotate left about the xw axis. Increment etheta
    move.w  #5,vtheta_inc
    rts
  erot_right:
  * Rotate right about the xw axis. Decrement etheta
    move.w  #-5,vtheta_inc
    rts
  vtran_move:
  * move the view transform matrix to the base vectors
  * really just a change of label
   lea iv,a0
   lea jv,a1
   lea kv,a2
   lea w_vmatx,a3
   move.w (a3)+,(a0)+ all
   move.w (a3)+,(a0)+ iv
   move.w (a3)+,(a0)+
   move.w (a3)+,(a1)+ all
   move.w (a3)+,(a1)+ jv
   move.w (a3)+,(a1)+
   move.w (a3)+,(a2)+ all
   move.w (a3)+,(a2)+ kv
   move.w (a3),(a2)
   rts

Appendix A: 68000 Instruction Set

Entire books have been written concerning the 68000 instruction set. There is insufficient space here to do more than outline the essentials. A succinct but thorough discussion is given in the Motorola 16-Bit User’s Manual.

The central feature of assembly language programming is that there are no abstract algebraic variables as in regular mathematics or high level languages such as BASIC. It is not possible to make statements such as

LET x=y+z

though it is possible to effect equivalent manipulations of data.

In assembly language, names such as x, y or z are labels representing addresses in RAM. At these addresses can be found binary numbers which are the current values of the parameters associated with the labels. There is a similarity to algebraic variables but at every stage it is the binary number itself which is manipulated either in memory or in the processor registers. The addressing modes of the 68000 are designed to deal with all the ways data needs to be addressed or directed through the system during the execution of the various instructions.
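To make the idea concrete, the Python sketch below models RAM as an array of bytes and labels as addresses into it, then performs the equivalent of LET x=y+z in three steps, with the corresponding 68000 instruction noted beside each. It is an illustration only: the addresses and the helper names are invented, and the values are stored high byte first, as the 68000 does.

```python
memory = bytearray(8)                  # a tiny stand-in for RAM
labels = {"x": 0, "y": 2, "z": 4}      # hypothetical addresses


def read_word(address):
    """Fetch the 16-bit word stored at address (high byte first)."""
    return int.from_bytes(memory[address:address + 2], "big")


def write_word(address, value):
    """Store the low 16 bits of value at address."""
    memory[address:address + 2] = (value & 0xFFFF).to_bytes(2, "big")


write_word(labels["y"], 1200)
write_word(labels["z"], 34)

d0 = read_word(labels["y"])                    # move.w y,d0
d0 = (d0 + read_word(labels["z"])) & 0xFFFF    # add.w  z,d0
write_word(labels["x"], d0)                    # move.w d0,x

print(read_word(labels["x"]))  # 1234
```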

The 68000 instruction set is extensive and powerful. It has two important aspects: the instructions themselves and their addressing modes, which form the basic framework for data acquisition and manipulation.

A.1. Registers

The 68000 processor has eight 32-bit data registers (D0-D7) dedicated to data; seven 32-bit address registers (A0-A6), which can be used for data and addresses; two 32-bit stack pointers (both called A7 but used separately, one for the system and one for the user), set to point to last-in, first-out temporary storage areas of RAM (stacks); one 32-bit program counter to keep track of program progress; and one 16-bit status register of flags to record the results of operations. The 32-bit registers can be used to handle the five basic data types: bits, BCD digits (4 bits), bytes (8 bits), words (16 bits) and long words (32 bits).

A.2. Addressing Modes

Each instruction is concerned with the manipulation of data of some kind somewhere in the microcomputer system: in the processor, in memory or from external hardware. The addressing modes are designed for the many ways data is accessed. There are six basic types: Register Direct, Register Indirect, Absolute, Immediate, Program Counter Relative and Implied, which encompass the 14 modes listed below. For each instruction, the data about to be manipulated (which can be an address) is located somewhere in the system. The addressing modes give the ways this location is found. In its most general form this to-be-determined address is called an effective address (ea).

A.2.1. Addressing Modes

Immediate Data Addressing
  Immediate                              the data is the next word
  Quick Immediate                        the data is included with the instruction

Implied
  ea = SR, SP or PC

Register Direct
  Address Register Direct                ea = An
  (data contained in named address register)
  Data Register Direct                   ea = Dn
  (data contained in named data register)

Absolute Data Addressing
  Absolute Short                         ea = (next word)
  (data is at address given at next word following instruction)
  Absolute Long                          ea = (next 2 words)

Register Indirect Addressing
  Register Indirect                      ea = (An)
  (data is at address given in named address register)
  Postincrement Register Indirect        ea = (An)+
  (as (An), then increment register)
  Predecrement Register Indirect         ea = -(An)
  (as (An) but predecrement register)
  Register Indirect with Offset          ea = d16(An)
  (as (An) plus a word length addition)
  Indexed Register Indirect with Offset  ea = d8(An,Xn)
  (as (An) plus a byte length addition together with the contents of an address or data register acting as an index)

An important version of register indirect addressing is PC relative, where the program counter is used instead of An in d16(An) and d8(An,Xn). This allows reference to memory locations relative to the current program counter and is used to generate position independent code. It is not used in this book since the assembler generates relocatable code which achieves the same end.
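The register indirect modes differ only in how the effective address is formed from the address register. The Python sketch below models that calculation for the five modes; it is a simplified illustration with invented function names, and it ignores 32-bit wrap-around and the choice between word and long index registers. The listings use all of these forms, for example 4(a0), (a1)+, -(sp) and 0(a5,d0.w).

```python
def ea_indirect(regs, an):
    return regs[an]                       # (An)


def ea_postincrement(regs, an, size):
    address = regs[an]                    # (An)+ : use, then step forward
    regs[an] += size
    return address


def ea_predecrement(regs, an, size):
    regs[an] -= size                      # -(An) : step back, then use
    return regs[an]


def ea_offset(regs, an, d16):
    return regs[an] + d16                 # d16(An)


def ea_indexed(regs, an, xn, d8):
    return regs[an] + regs[xn] + d8       # d8(An,Xn)


regs = {"a0": 0x1000, "d0": 8}
print(hex(ea_offset(regs, "a0", 4)))          # 0x1004
print(hex(ea_indexed(regs, "a0", "d0", 0)))   # 0x1008
print(hex(ea_postincrement(regs, "a0", 2)))   # 0x1000
print(hex(regs["a0"]))                        # 0x1002
print(hex(ea_predecrement(regs, "a0", 2)))    # 0x1000
```

Here size is 1, 2 or 4 for byte, word or long word operands.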

A.3. Instruction Set

In general, instructions have associated with them a source operand and a destination operand. What these actually mean depends on the instruction. For example, in a MOVE instruction they do exactly what they imply: supply the source and destination effective addresses. In an ADD instruction they give the addresses of the two numbers to be added. These operands follow the instruction, on the same line. The instruction itself is like the verb of the sentence.

In addition the instruction has attributes. These are the permitted data sizes, which can be one or more of the types: byte, word or long word depending on the instruction. Also as a consequence of the instruction certain flags will be set or cleared in the condition code (status) register.

The list below gives the assembler mnemonics for the main instruction types.


Mnemonic  Action

ABCD      add decimal with extend
ADD       add
AND       logical and
ASL       arithmetic shift left
ASR       arithmetic shift right
Bcc       branch conditionally*
BCHG      bit test and change
BCLR      bit test and clear
BRA       branch always
BSET      bit test and set
BSR       branch to subroutine
BTST      bit test
CHK       check register against bounds
CLR       clear operand
CMP       compare
DBcc      test condition, decrement and branch*
DIVS      signed divide
DIVU      unsigned divide
EOR       exclusive OR
EXG       exchange registers
EXT       sign extend
JMP       jump
JSR       jump to subroutine
LEA       load effective address
LINK      link stack
LSL       logical shift left
LSR       logical shift right
MOVE      move
MOVEM     move multiple registers
MOVEP     move peripheral data
MULS      signed multiply
MULU      unsigned multiply
NBCD      negate decimal with extend
NEG       negate
NOP       no operation
NOT       ones complement
OR        logical or
PEA       push effective address
RESET     reset external devices
ROL       rotate left
ROR       rotate right
ROXL      rotate left with extend
ROXR      rotate right with extend
RTE       return from exception
RTR       return and restore
RTS       return from subroutine
SBCD      subtract decimal with extend
Scc       set conditionally*
STOP      stop
SUB       subtract
SWAP      swap data reg.
TAS       test and set operand
TRAP      trap
TRAPV     trap on overflow
TST       test
UNLK      unlink

* A list of condition codes is shown below:

A.3.1. Condition Codes


CC  carry clear
CS  carry set
EQ  equal
F   false (never true)
GE  greater or equal
GT  greater than
HI  high
LE  less or equal
LS  low or same
LT  less than
MI  minus
NE  not equal
PL  plus
T   always true
VC  no overflow
VS  overflow

The condition codes follow instructions such as DBcc and Bcc, but be careful! The codes test the result of a calculation in the order (destination operand) - (source operand), placing the result (if any) in (destination).
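The depth test in testview shows the ordering at work: cmp.w #50,d3 forms Voz-50 (destination minus source) and bmi then branches if that result is negative. The Python sketch below models only the N (negative) flag for word-sized compares; the function names are invented, and like BMI and BPL it takes no account of overflow.

```python
def n_flag_after_cmp_w(source, destination):
    """N flag after `cmp.w source,destination`: sign bit of destination - source."""
    return bool((destination - source) & 0x8000)


def depth_visible(voz):
    """Model the two compares and branches at the end of testview."""
    if n_flag_after_cmp_w(50, voz):          # cmp.w #50,d3 / bmi notvis
        return False
    if not n_flag_after_cmp_w(2000, voz):    # cmp.w #2000,d3 / bpl notvis
        return False
    return True


print([depth_visible(v) for v in (49, 50, 1999, 2000)])
# [False, True, True, False]
```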

DBcc (which is used for loop processing) will go to the next instruction if the condition is true, whereas Bcc (used for a straight branch) will branch if the condition is true (and go to the next instruction if it is false).

The most obvious loop instruction DBRA (decrement a counter and branch until it is -1) is actually absent from the 68000 set. But instead DBF (decrement and branch, never true) achieves the same result. Most assemblers implement DBRA anyway (but convert it to DBF on assembly), as a service to mankind.
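Because the loop ends when the counter reaches -1 rather than 0, the counter is loaded with one less than the number of passes wanted; sight_tst, for instance, loads d7 with 15 to process 16 tiles. The Python sketch below models the word-sized counter of DBRA to show this; dbra_loop_count is a hypothetical name used only for illustration.

```python
def dbra_loop_count(initial):
    """Count how many times the loop body runs for `dbra dn,label`."""
    counter = initial & 0xFFFF
    passes = 0
    while True:
        passes += 1                        # the loop body
        counter = (counter - 1) & 0xFFFF   # dbra decrements the low word
        if counter == 0xFFFF:              # reached -1: fall through
            break
    return passes


print(dbra_loop_count(15))  # 16
```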

A.4. Variations of Instruction Types

Here are additional variations of the main types. Most important are the endings -Q and -I which refer to faster “Quick” and “Immediate” versions; Quick being the faster of the two.
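The Quick forms are short because the constant is held inside the instruction word itself, which limits its range: ADDQ and SUBQ accept constants from 1 to 8, and MOVEQ accepts -128 to 127. The Python sketch below simply encodes those ranges; the helper names are invented for illustration.

```python
def suggest_add(value):
    """Pick ADDQ when the constant fits its 1 to 8 range, else ADDI."""
    return "addq" if 1 <= value <= 8 else "addi"


def fits_moveq(value):
    """MOVEQ carries an 8-bit signed constant."""
    return -128 <= value <= 127


print(suggest_add(1), suggest_add(128))   # addq addi
print(fits_moveq(2), fits_moveq(300))     # True False
```

So a step such as adding 1 to a counter can use the Quick form, whereas adding 128 to a coordinate needs the Immediate form.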


ADDA          add address
ADDI          add immediate
ADDQ          add quick
ADDX          add with extend
ANDI          AND immediate
ANDI to CCR   AND immediate to condition codes
ANDI to SR    AND immediate to status register
CMPA          compare address
CMPI          compare immediate
EORI          exclusive OR immediate
EORI to CCR   exclusive OR immediate to condition codes
EORI to SR    exclusive OR immediate to status register
MOVEA         move address
MOVEQ         move quick
MOVE to CCR   move to condition codes
MOVE to SR    move to status register
MOVE from SR  move from status register
MOVE to USP   move to user stack pointer
NEGX          negate with extend
ORI           OR immediate
ORI to CCR    OR immediate to condition codes
ORI to SR     OR immediate to status register
SUBA          subtract address
SUBQ          subtract quick
SUBI          subtract immediate
SUBX          subtract with extend

Appendix B: Devpac Assembler

There are many good assemblers available. The Devpac Amiga 2 Assembler/Debugger by HiSoft has been used to develop the programs in this book. This appendix covers only a small subset of its facilities which has been found to be especially useful.

It provides for editing, assembling, running and debugging a program all within the one environment. This gives the speediest development of programs.

B.1. GenAm2

This is the combined editor, assembler and debugger. You can write programs, run and debug them all within GenAm2.

B.1.1. The Editor

This is a friendly screen editor, allowing you to roam freely through the entire program. Tabs can be set to convenient column positions in the instruction line which will consist of the following fields separated by spaces:

label mnemonic operand(s) comment

The label is actually an address in RAM though it appears in the program as a user-friendly word, usually having a meaning which is relevant to the program. For example, if it is the point to which the program returns in a repetitive loop, it might be simply “loop”. Instruction mnemonics and operands have been discussed in Appendix A. The comment field should explain in an informative way what is going on so that the progress of the program can be easily understood. An example might be

  loop move.w d0,(a0) save the flag
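The field layout can be illustrated by splitting a line at its white space. The Python sketch below is not how GenAm2 works internally; it assumes that a label, when present, starts in the first column, that lines beginning with * are comments, and that every instruction has an operand field (a line such as rts followed by a comment would be split wrongly). The name split_fields is invented.

```python
def split_fields(line):
    """Split a source line into (label, mnemonic, operands, comment)."""
    if not line.strip() or line.lstrip().startswith("*"):
        return ("", "", "", line.strip())          # blank or comment line
    has_label = not line[0].isspace()
    parts = line.split(None, 3 if has_label else 2)
    if not has_label:
        parts.insert(0, "")                        # no label on this line
    parts += [""] * (4 - len(parts))               # pad missing fields
    return tuple(parts)


print(split_fields("loop move.w d0,(a0) save the flag"))
# ('loop', 'move.w', 'd0,(a0)', 'save the flag')
```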

B.1.2. Moving About the File

Gross movements about a file are easily done by using the Amiga key (the outline A on the right-hand side). To go to the start (top) or end (bottom) of a file press Amiga+T or Amiga+B, respectively.

The cursor keys can be used to control movement within the screen.

B.1.3. Editing Text

Whole lines can be deleted by pressing Control+Y, and restored by pressing Control+U (useful for repeating lines). Deleting within a line can be done by pressing Backspace (backwards) or Delete (forwards).

B.1.4. Text Movement

Among the most useful facilities are those which handle blocks of text. First move the cursor to the start of the block and press F1. Go to the end of the block and press F2. A marked block can be manipulated in several ways (Help lists these):

F3 saves a block.

F4 copies it (to where the cursor is).

Shift+F4 saves it to the block buffer, from where it can be pasted into the next file.

Shift+F3 or Shift+F5 deletes it (but also saves it in the block buffer in case you made a mistake!).

F5 pastes in the block (at the cursor).

Amiga+W prints it out.

B.1.5. Assembly

A program can be assembled in several ways. Just to see whether it will assemble choose the Output to None option. This is the best thing to try on the first attempt. To run and debug a program choose the Output to Memory option. To save the assembled program to run independently choose the Output to Disk option and name it with the file extension .PRG. For the programs in this book beyond Chapter 6 it is probably best to assemble them to disc to avoid running out of space. They would then be run as executable programs from the CLI, for example.

B.1.6. Options

There are many options available which affect how the assembly should take place. The option OPT-D (written at the top of the source file but after a BRA to the actual program) is very useful and will retain labels in the debugger, which helps enormously to follow the program.

B.1.7. Directives

Assembler directives, which have a similar appearance to assembler instruction mnemonics but which are unique to the assembler, are fairly standard. The common ones, such as EQU (or =), DC, DS, used to fix the values of labels, set up (tables of) constants and to set up variables space, respectively, are used extensively throughout the example programs. Also used extensively to pull in files at assembly is the INCLUDE directive. This has made it possible to build up the book and the overall program by stages. The programs themselves show best how the directives are used.

B.1.8. Debugging

All assembly language programs have errors. Often, more time is spent debugging programs than writing them and so it helps to have a good debugger.

The debugger is actually called MonAm and is available as a free standing program or within the Editor. Using it within the Editor makes the cycle of editing, assembling, running, debugging complete. Most likely you will want to single step through a program and watch what happens in the 68000 registers and in memory. Three windows display the register contents, a disassembled section of program around the current address of the program counter and the contents of a selected part of memory. A fourth small window passes messages. For the purpose of changing addresses and register contents, any one of the display windows can be made active by toggling Tab.

B.1.9. Executing Programs

There are many ways of monitoring a program. Here are some of them:

Ctrl+Z (or Ctrl+Y)   single step; every instruction executed
Ctrl+T               single step; skips BSR’s, JSR’s, LineA, Traps
Ctrl+A               single step; places a breakpoint after next instruction
                     (useful for by-passing DBF’s (DBRA’s))
Run                  produces a prompt for the type of run, e.g.
                     G   run at full speed to next breakpoint

B.1.10. Breakpoints

These allow you to stop the program at specific addresses. They control the flow of the program in the different running modes. Here are simple controls:

Amiga+B   set a breakpoint at an address
Ctrl+K    clears all set breakpoints
U         asks for an address to run to
Help      show Help and breakpoints

B.1.11. Miscellaneous

Control+C   terminate MonAm
L           list labels
P           print out (active window)
M           modify address
Amiga+A     set the starting address (active window)
Amiga+R     change contents of named register

B.1.12. Hunting for Bugs

This is a skill learned through experience. The most useful tip is to check programs thoroughly before trying them. Try to construct programs in a structured way, in modules, each of which can be thoroughly tested independently before joining them all together. Do not rely on the Debugger to find the mistakes. By that time you’ll have forgotten what each part of the program was for. Don’t be in a hurry; don’t spend one hour “bugging” and ten hours debugging!

A most common error is a bus error. This is when the program counter finds itself pointing to a wrong part of memory. This is often caused by the Stack getting out of order, particularly when a return address from a subroutine is required. Look to see how you have been using the Stack during the subroutine.

Appendix C: Number Systems

C.1. Binary

Computers are made from electronic switches which are either off (0) or on (1). The number system which can be constructed out of such units is called binary (base 2), meaning out of 2; the system which goes in powers of 10 is called denary (base 10). In the binary system numbers are assembled from powers of 2. For example:

  13 (base 10) = 1*2^3 + 1*2^2 + 0*2^1 + 1*2^0

Instead of writing numbers out in this long form it is usual to arrange only the coefficients of the powers of 2 in columns. The column number, counted from 0 at the right, gives the power of 2. Hence the number 13 is written as

  13 (base 10) = 1101 (base 2)
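The expansion in powers of 2 can be checked with a few lines of code. This is only a sketch, written in Python rather than 68000 assembler for brevity; the function name `to_binary` is made up for this illustration.

```python
def to_binary(n, bits=4):
    """Return the coefficients of the powers of 2 in n, highest power first."""
    digits = []
    for power in range(bits - 1, -1, -1):
        coefficient = (n >> power) & 1   # 1 if 2**power is present in n
        digits.append(str(coefficient))
    return "".join(digits)

print(to_binary(13))     # 1101
print(int("1101", 2))    # 13
```

Reading the result from the right, the columns stand for 2^0, 2^1, 2^2 and 2^3.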

Each one of the units in the binary number is called a binary digit, or bit for short. A group of four bits is called a “nibble”, especially loved by assembly language programmers who make frequent use of it.

A group of 8 bits also has a special name, a “byte”, whose common use largely dates from the age of 8-bit microcomputers, which transferred data in bytes. In more recent 16-bit microprocessors (this microprocessor labelling scheme refers to the size of the data bus) such as the 68000, groups of 16 and 32 bits are commonly used; these are called “words” and “long words” respectively.

C.2. Hexadecimal (hex for short)

Humans count in powers of 10 (probably because they have 10 fingers), and find it unnatural to count in powers of 2. But some link with the binary system is necessary for assembly language programmers, especially when memory locations are being inspected. To this end the hexadecimal number system is commonly used. In it nibbles are abbreviated into single symbols. For the values up to 9 ordinary denary numbers are used but for the values 10 to 15 (the maximum value of a nibble) new symbols are needed. Here a great opportunity has been lost. Instead of inventing new computer age symbols, the letters of the alphabet A, B, C, D, E, F have been hijacked. Hexadecimal means base 16.

In the three systems binary, denary and hexadecimal respectively, the equivalence is:


Binary   Denary   Hexadecimal

0000     0        0
0001     1        1
0010     2        2
0011     3        3
0100     4        4
0101     5        5
0110     6        6
0111     7        7
1000     8        8
1001     9        9
1010     10       A
1011     11       B
1100     12       C
1101     13       D
1110     14       E
1111     15       F
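Because each hex digit stands for exactly one nibble, converting a longer binary number to hex is just a matter of taking it four bits at a time. The sketch below illustrates this for a 16-bit word; it is Python rather than assembler, and the name `word_to_hex` is hypothetical.

```python
HEX_DIGITS = "0123456789ABCDEF"

def word_to_hex(value):
    """Convert a 16-bit word to four hex digits, one nibble at a time."""
    digits = ""
    for shift in (12, 8, 4, 0):              # top nibble first
        nibble = (value >> shift) & 0b1111   # isolate four bits
        digits += HEX_DIGITS[nibble]
    return digits

print(word_to_hex(0b1101111111110000))   # DFF0
```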

C.3. Negative Numbers

Negative numbers in binary are hard to get the hang of. This is because there is no special symbol reserved for the minus sign and it must be encoded within the number itself. It is done in the following way.

For simplicity, suppose we are working only in nibble size numbers (in fact there aren’t any instructions to handle only numbers of this size on the 68000; a nibble must be part of a larger number). To deal in negative numbers the total possible range, 0-15, is split equally. The interval 0-7 inclusive (8 numbers) is reserved for positives and the range 15-8 inclusive (also 8 numbers) is reserved for negatives (the range -1 to -8). It’s not as daft as it sounds. A negative number is obtained by counting backwards from 0. If there is nothing below 0 the next best thing to do is to go to the top and count down. In a practical sense this is a good method because all the negative numbers have their top bit set. The top bit is like a minus sign turned vertical. There is a fancy name for this convention: 2’s complement. There is a simple recipe for getting the negative of a number: write it in binary, switch all the 1’s to 0’s and 0’s to 1’s and then add 1. Let’s try it. We know that -2 is in fact 14 so here’s the check:

Step 1

  +2 is 0010

Step 2 (2’s complement)

  change bits 1101
  and add 1 to give 1110

which is 14 and therefore correct.
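The same recipe can be tried for other values with a short sketch. It is in Python for convenience, and `negate` is a made-up name; the mask simply throws away anything that carries out of the chosen number of bits.

```python
def negate(value, bits=4):
    """2's complement negation: invert every bit, then add 1."""
    mask = (1 << bits) - 1          # 1111 for a nibble
    inverted = value ^ mask         # switch 1's to 0's and 0's to 1's
    return (inverted + 1) & mask    # add 1, keeping only the low bits

print(format(negate(0b0010), "04b"))   # 1110
print(negate(14))                      # 2
```

Negating twice returns the original number, as it should.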

The 2’s complement method of labelling negative numbers works for any size: bytes, words and long words. But be warned, only you know that the number is -2 and not 14; the computer doesn’t! To help you keep track of what is going on the 68000 has instructions, called signed instructions, which treat the top bit as a sign bit. There are other, unsigned instructions, which treat numbers as positive only. These help, but there are many occasions where the programmer must watch that numbers do not exceed their allotted range and flip sign, usually with pathological consequences.
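The point that the same bit pattern can be read in two ways, and that exceeding the range flips the sign, can be seen in a small sketch. This is Python, not 68000 code, and `as_signed` is a hypothetical helper that mimics the signed reading of a pattern.

```python
def as_signed(value, bits):
    """Interpret a bit pattern as a 2's complement number of the given size."""
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value

print(as_signed(14, 4))              # -2  (nibble 1110)
print(as_signed(0xFFFE, 16))         # -2  (word $FFFE)
# Exceeding the range flips the sign: 7 + 1 in a nibble reads as -8.
print(as_signed((7 + 1) & 0xF, 4))   # -8
```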

In assembly language the different number types are distinguished by their different prefixes:

  denary - none ; binary - % ; hex - $ .

Appendix D: Chip Registers

Below is a brief list of the addresses of the registers used by the special or custom hardware “Agnus”, “Denise” and “Paula”. The addresses are given as offsets from the base address $dff000. In general chip registers are either read only (R) or write only (W) or, in some cases, strobe (S) (triggered by writing to them), and an attempt to do the wrong one will cause trouble. Only a few of the registers are listed below, just those that appear in the programs in this book. For further details consult the Amiga Hardware Reference Manual. Another useful book is “Amiga System Programmer’s Guide”, by Abacus, a Data Becker Book.


REGISTER   ADDRESS   R/W   FUNCTION

BLTAFWM    $44       W     Blitter first word mask for source A
BLTALWM    $46       W     Blitter last word mask for source A
BLTCON0    $40       W     Blitter control register number 0
BLTCON1    $42       W     Blitter control register number 1
BLTSIZE    $58       W     Size of block to blit
BLTCMOD    $60       W     Blitter source C modulo
BLTBMOD    $62       W     Blitter source B modulo
BLTAMOD    $64       W     Blitter source A modulo
BLTDMOD    $66       W     Blitter destination D modulo
BLTCPTH    $48       W     Blitter source C pointer
BLTBPTH    $4C       W     Blitter source B pointer
BLTAPTH    $50       W     Blitter source A pointer
BLTDPTH    $54       W     Blitter destination D pointer
BPL1MOD    $108      W     Bit plane modulo for odd planes
BPL2MOD    $10A      W     Bit plane modulo for even planes
BPLCON0    $100      W     Bit plane control register 0
BPLCON1    $102      W     Bit plane control register 1
BPLCON2    $104      W     Bit plane control register 2
BPL1PTH    $0E0      W     Start of bit plane pointers
COLOR00    $180      W     Start of colour table
COP1LC     $80       W     Copper list 1 address
COP2LC     $84       W     Copper list 2 address
COPJMP1    $88       S     List 1 restart strobe
COPJMP2    $8A       S     List 2 restart strobe
DIWSTRT    $8E       W     Display window start
DIWSTOP    $90       W     Display window stop
DDFSTRT    $92       W     Display data fetch start
DDFSTOP    $94       W     Display data fetch stop
DMACON     $96       W     Set DMA status
VPOSR      $4        R     Read vertical beam position
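Since the table gives offsets, the full address of a register is the base address plus the offset. A minimal sketch of that arithmetic follows, in Python; only four rows of the table are copied in, the register names are spelled as in the Amiga Hardware Reference Manual, and `register_address` is a made-up helper.

```python
CUSTOM_BASE = 0xDFF000   # base address $dff000

# Offsets copied from a few rows of the table above.
OFFSETS = {
    "BLTCON0": 0x040,
    "BLTSIZE": 0x058,
    "COLOR00": 0x180,
    "VPOSR":   0x004,
}

def register_address(name):
    """Full address of a custom chip register: base plus offset."""
    return CUSTOM_BASE + OFFSETS[name]

print(hex(register_address("COLOR00")))   # 0xdff180
print(hex(register_address("VPOSR")))     # 0xdff004
```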

Appendix E: Vectors and Matrices

Vectors and matrices go together. Whatever convention is chosen for vectors determines the convention for matrices.

E.1. Vectors

A vector is a concise way of specifying a position in space. The position is measured from a fixed position called the origin. Since space is 3-dimensional the position is determined by moving specified distances forward, sideways right and up from the origin (negative distances account for backward, left and down respectively). In mathematical language this means measuring all displacements in a Cartesian coordinate system. A position in space is then specified by the distances along the three axes at right angles one has to travel to reach it. The vector notation arises from the way this information is presented. If the displacements along the three axes to the point, P, are x, y and z respectively, then the vector r which stretches from the origin to P, as shown in Figure A5.1, can be expressed in vector notation as

r = xi + yj + zk

It is common to write vectors (which have both size (magnitude) and direction) in boldface to distinguish them from ordinary numbers which have only size. Here i, j and k, called the unit or base vectors, are signposts pointing along the x, y and z axes and the term xi means “go a distance x in the direction of the x axis” and so on. They are vectors in their own right with size (magnitude) equal to unity.

Since i, j and k really serve only to distinguish the three components of the displacement, we could omit them from the scheme providing the order is retained. The three components can be included in order inside brackets ready for multiplication with matrices in the column vector notation:

      | x |
  r = | y |
      | z |

Figure A5.1 A vector in Cartesian coordinates

This is not the only way to represent vectors. In computer graphics it is common to represent them in the row notation

  r = (x y z)

The convention used determines the way matrices are written. In this book column vectors are used because this is more common in science and engineering and therefore likely to be more familiar to the general reader. Switching between the conventions is tiresome but fairly painless.

E.2. Matrices

As a result of rotational transforms which occur frequently in computer graphics, the coordinates of objects change in a particular way. A point P(x,y,z) will move to a new position P'(x',y',z') as a result of a rotation about some axis as shown in Figure A5.2. Each one of the new components is related to all the old components in a set of linear equations:

  x' = M11.x + M12.y + M13.z

  y' = M21.x + M22.y + M23.z

  z' = M31.x + M32.y + M33.z

where the M’s are numbers giving the proportions of the original components and are the elements of a matrix M. The important thing is that the matrix elements are related uniquely to the rotation, so that any other point rotated in an identical way about the same axis would have its new components determined by the same matrix M. Using the rules of multiplication of matrices and vectors, we can emphasise this by disentangling the elements of M from the components x, y and z of the vector. The product is written as:

  | x' |   | M11  M12  M13 |   | x |
  | y' | = | M21  M22  M23 | . | y |
  | z' |   | M31  M32  M33 |   | z |

The matrix product written this way is just shorthand notation for the set of linear equations which really matter when we actually come to work out the new coordinates. But writing it this way makes it clear that, once calculated, the matrix M can be used to rotate any point in the same way. In an even more concise shorthand we can summarise the transformation by:

  r' = M.r

where the product here is the matrix product and not an ordinary product of numbers.

To convert this shorthand product back into the set of equations observe that the vector has three rows and one column and the matrix has three rows and three columns. To form the top row (x') of the transformed vector r', multiply each element in the top row of M by the corresponding row of the vector r (the first element by x, the second by y, the third by z) and add the results. The second row of r' is calculated in the same way from the elements in the second row of M and the rows of r, and so on (if we were working in the row representation of vectors everything would be the other way round). This meaning of matrix multiplication is something that just has to be learned.
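The row-by-row rule can be written out as a short sketch. It is in Python rather than assembler, with the matrix held as a list of rows; `mat_vec` is a hypothetical name and the matrix used is simply an example.

```python
def mat_vec(M, r):
    """Multiply a 3x3 matrix (a list of rows) by a column vector (x, y, z)."""
    return [sum(M[row][col] * r[col] for col in range(3)) for row in range(3)]

M = [[0, -1, 0],
     [1,  0, 0],
     [0,  0, 1]]

print(mat_vec(M, [1, 2, 3]))   # [-2, 1, 3]
```

Each entry of the result is one of the three linear equations: the top entry is M11.x + M12.y + M13.z, and so on down the rows.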

Figure A5.2 Point P transferred to point P'

E.3. Products of Vectors

E.3.1. The Scalar (Dot) Product

Vectors are really just a shorthand and highly suggestive way of doing geometry. A point P(x,y,z) in a Cartesian system looks much more important when represented by a vector r which stretches from the origin to the point P. Another point P'(x',y',z') is similarly represented by the vector r'.

Very often we wish to know the angle, θ, between these two vectors (referring back to the previous section it could be the angle of rotation of the vector r). It turns out that what is simplest to find is the cosine of θ which is

  cosθ = (x.x' + y.y' + z.z') / √((x^2 + y^2 + z^2).(x'^2 + y'^2 + z'^2))

The factors in the denominator look complicated but are just the magnitudes of the two vectors calculated using a 3D version of Pythagoras’ theorem. The numerator is the sum of the products of the corresponding components of the two vectors. Because such a product occurs frequently in geometry it is given a special symbol and name. It is called the scalar or dot product and is written as

  r.r' = x.x' + y.y' + z.z'

It is called the scalar product because it produces a scalar answer from two vectors. Instead of writing the magnitude of a vector as a square root of a sum of squares all the time, which is tiresome, it is usual to represent it by the same symbol as the vector but without boldface. Hence the cosine is given by

  cosθ = (r.r') / (r r')

where, in the denominator, r = |r| = √(x^2 + y^2 + z^2) is the magnitude of r, and likewise for r'.

The operation |r| means ‘the magnitude of r.’

Notice that the scalar product r.r' is proportional to cosθ and, most important, has the same sign as cosθ. The sign of the cosine turns out to be a very useful test of whether two vectors point broadly in the same direction (positive, becoming parallel as cosθ approaches 1) or in broadly opposite directions (negative, becoming antiparallel as cosθ approaches -1), and plays an important part in testing for the visibility of surfaces.
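A small sketch shows the scalar product and the cosine formula at work. It is written in Python for brevity; the names `dot` and `cos_angle` are made up for this illustration.

```python
import math

def dot(a, b):
    """Scalar product: sum of the products of corresponding components."""
    return sum(p * q for p, q in zip(a, b))

def cos_angle(a, b):
    """Cosine of the angle between two non-zero vectors."""
    return dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))

print(cos_angle((1, 0, 0), (1, 1, 0)))   # about 0.707, i.e. 45 degrees
print(dot((0, 0, 1), (0, 0, -1)))        # -1: the vectors point in opposite directions
```

For a sign test alone the division and square root are not needed: the sign of the scalar product is already the sign of cosθ.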

E.3.2. The Vector (Cross) Product

This is a product of two vectors which produces a new vector. Once again it is based on a useful application. In this case it generates the vector which is normal (at right angles) to both the original vectors. Another way of stating this is to say that the new vector is normal to the plane containing the two original vectors. This is shown in Figure A5.3. The new vector r'' and the vector product are defined by:

  r'' = r x r'

The vector r'' is normal to the plane containing r and r' and its magnitude is equal to r r' sinθ, the product of the magnitudes of r and r' and the sine of the angle between them. The components of r'' are

x'' = y.z' - z.y'

y'' = z.x' - x.z'

z'' = x.y' - y.x'

There is one important aspect of vector products which is also true of matrix products, the order of multiplication matters; the product r x r' is not the same as r' x r. In fact

r' x r = -r x r'

The direction of r'' is obtained by twisting r into r' through the smallest angle. The direction in which this is seen as a clockwise rotation is the direction of r''.
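The component formulas, and the change of sign when the order is swapped, can be checked with a brief sketch. It is Python rather than assembler, and `cross` is a hypothetical name.

```python
def cross(a, b):
    """Vector product r x r' using the component formulas above."""
    x, y, z = a
    x1, y1, z1 = b
    return (y * z1 - z * y1,
            z * x1 - x * z1,
            x * y1 - y * x1)

i, j = (1, 0, 0), (0, 1, 0)
print(cross(i, j))   # (0, 0, 1)
print(cross(j, i))   # (0, 0, -1)
```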

Figure A5.3 Vector cross product

The vector product is complicated but very useful in computer graphics. It is used to construct vectors which are normal to surfaces. We discuss this next.

E.3.3. Surface Normal Vectors

It is often necessary to construct a vector which is normal to two other vectors. This occurs in the calculation of surface normal vectors and coordinate transforms. In the case of a surface normal vector the objective is to construct a vector which is normal (at right angles) to the surface.

What this amounts to is forming the vector product of two vectors which lie in the surface, as discussed in the previous section. Usually these two vectors are not presented as such but have themselves to be constructed from polygon vertex coordinate lists. Suppose three consecutive vertices of a convex polygon are P1(x1,y1,z1), P2(x2,y2,z2) and P3(x3,y3,z3) and that these go clockwise round the perimeter. The two vectors which can be multiplied in a cross product to give a vector pointing out of the surface are

r = (x3-x2)i + (y3-y2)j + (z3-z2)k

r' = (x2-x1)i + (y2-y1)j + (z2-z1)k

so

r'' = r x r'
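The construction can be sketched as follows, in Python for brevity. The names `cross` and `surface_normal` and the sample vertices are assumptions made for this illustration; which side of the surface the result points to depends on the vertex order and on the handedness of the axes in use.

```python
def cross(a, b):
    """Vector product from the component formulas of the previous section."""
    x, y, z = a
    x1, y1, z1 = b
    return (y * z1 - z * y1, z * x1 - x * z1, x * y1 - y * x1)

def surface_normal(p1, p2, p3):
    """Normal from three consecutive vertices: (P3 - P2) x (P2 - P1)."""
    r = tuple(c3 - c2 for c2, c3 in zip(p2, p3))     # r  = P3 - P2
    r1 = tuple(c2 - c1 for c1, c2 in zip(p1, p2))    # r' = P2 - P1
    return cross(r, r1)

# Three vertices of a polygon lying in the plane z = 0.
print(surface_normal((0, 0, 0), (0, 1, 0), (1, 1, 0)))   # (0, 0, 1)
```

The result has no x or y component, so it is at right angles to every edge lying in the z = 0 plane.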

E.3.4. Base Vectors

Base vectors are unit vectors which point along the axes of the coordinate system. In Cartesian coordinates, i, j and k are the “base” vectors. They each have magnitude 1, so the only thing that distinguishes them is their direction.

E.4. Matrices

Matrices have already been discussed in Section E.2. In computer graphics they represent a transformation of some kind. The matrices which are most straightforward to deal with are those associated with rotation and are discussed further in Appendix F.

The rule for multiplying two matrices is the same as that for multiplying a matrix and a vector (as discussed in the previous section) where the vector is taken as a matrix having one column and three rows. Adding extra columns to the vector makes it a matrix and produces extra columns in the product. For a product to be possible there must be as many columns in the first matrix as there are rows in the second matrix.

The matrices which describe rotation about the three axes x, y and z all have three rows and three columns (unless they are in homogeneous coordinates): they are 3x3 matrices. A complex rotation is built up from the separate matrices, in some order, by multiplying the matrices together. This is called matrix concatenation. Just as with the vector cross product, the order of the matrix multiplication matters: the matrix farthest to the right is the first rotation and that farthest to the left is the last rotation.
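That the order matters is easy to confirm numerically. The sketch below, in Python, concatenates two 90 degree rotations in both orders; `mat_mul`, `RZ` and `RX` are names chosen for this illustration, and the matrices assume the column vector convention used in this book.

```python
def mat_mul(A, B):
    """Product of two 3x3 matrices: rows of A times columns of B."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

# Rotations by 90 degrees about the z axis and about the x axis.
RZ = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
RX = [[1, 0, 0], [0, 0, -1], [0, 1, 0]]

print(mat_mul(RX, RZ))   # z rotation first, then x
print(mat_mul(RZ, RX))   # x rotation first, then z
```

The two printed matrices differ, so rotating about z and then x is not the same as rotating about x and then z.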

E.5. Homogeneous Coordinates

Unlike rotations, certain types of transform, such as translations and perspectives, cannot be written as 3 x 3 matrices and made to operate on vectors as a product. Since, for the purpose of concatenation, it is desirable to put all transforms on an equal footing, homogeneous coordinates are used to convert all transforms to 4 x 4 matrices which can be multiplied.

This means moving to a 4-D space (not real space, just a mathematical convenience) in which the additional dimension is always 1. The extra degree of freedom this gives is sufficient to convert all transforms to 4 x 4 matrices. Likewise all vectors must have a fourth component, 1. Setting this fourth component to unity means we are working on a “plane” in the 4-D space, the one on which the fourth coordinate equals 1. This “plane” is normal 3-D space.
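As an illustration, a translation becomes a matrix product once the fourth component is added. The sketch below is Python; the layout with the displacement in the fourth column is the usual one for column vectors and is assumed here, and the names `translation` and `transform` are hypothetical.

```python
def translation(tx, ty, tz):
    """4x4 homogeneous matrix that displaces a point by (tx, ty, tz)."""
    return [[1, 0, 0, tx],
            [0, 1, 0, ty],
            [0, 0, 1, tz],
            [0, 0, 0, 1]]

def transform(M, point):
    """Apply a 4x4 matrix to (x, y, z) by appending the fourth component 1."""
    x, y, z = point
    r = (x, y, z, 1)
    out = [sum(M[row][col] * r[col] for col in range(4)) for row in range(4)]
    return tuple(out[:3])   # the fourth component stays 1 and is dropped

print(transform(translation(10, 0, -5), (1, 2, 3)))   # (11, 2, -2)
```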

Appendix F: Geometric and Coordinate Transforms

There are two types of transform used widely in computer graphics: geometric and coordinate transforms. What is confusing is that they are really two aspects of the same thing and it is possible to achieve the same end result by either method. However in order to stay sane it helps greatly to think of them as different, choosing one or the other depending on the problem. Many clever shortcuts become possible once the distinction and connection between them is understood.

Imagine that you are sitting in a swivel chair positioned at the centre of a circular carpet in a room with black featureless walls. Since there is no external reference point (apart from remembering what actually happened) it is not possible to distinguish between rotating the chair to the right on a stationary carpet, or keeping the chair fixed and rotating the carpet to the left. The observer on the chair sees the same relative movement of chair and carpet and his view of the carpet pattern is the same in both cases. But we must be careful to establish a scheme of rotation of either the chair or the carpet which is consistent. Let us decide that left rotations are positive and right rotations are negative. Then we can see that a positive rotation of the chair (the observer) is equivalent to a negative rotation of the carpet (the object): they are said to be the inverse of each other.

Now we come to the formal definitions. Rotating the observer is called a coordinate transform and rotating the object is called a geometric transform. There are many times in computer graphics when we wish to do both of these. When an object is moved in the world frame, it is subject to a geometric transform. When we wish to see the world from a different point of view a coordinate transform must be done. When the observer is controlling his viewpoint orientation by means of a joystick it is useful to exploit the connection between the two transforms.

F.1. Coordinate Systems and Frames of Reference

To some extent these terms are used interchangeably. For the most part the positions and vertices of objects are determined in Cartesian coordinates by a set of three x, y and z axes at right angles. The position of the zero of this set of axes is called the origin of the coordinate system. The whole constitutes a frame of reference to track subsequent motion of the various objects. As we have seen, there are two types of movement: a coordinate transform (when the observer moves) and a geometric transform (when an object moves). When the object moves it is easiest to keep track of what is going on by following the motion of the frame of reference attached to the object itself. We have called this the object frame. In the main text the object-to-world transform was made by selected rotations and a displacement of this object frame. Now we can see exactly how this works.

Figure A6.1 Rotation of an object

image::figure-a6-1.jpg[width='65%']

Imagine a set of axes permanently attached to the object so that when it moves they also move. For simplicity, we consider a rotation by an angle θ about the z axis, as shown in Figure A6.1. A transform matrix is now needed to relate the coordinates after the rotation (x1,y1,z1) to those before (x,y,z). The beauty of this scheme is that we can construct this matrix by observing what happens to the base vectors. Remember, the base vectors are the unit vectors (of size 1) pointing like signposts along the x, y and z axes. The base vectors before the rotation are i, j and k and after the rotation are i1, j1 and k1. Looking at the figure we can see the relations between these:

i1 = cosθ.i + sinθ.j

j1 = -sinθ.i + cosθ.j

k1 = k

leading to a transform matrix for the base vectors:

  | cosθ  sinθ  0 |
  |               |
  | -sinθ cosθ  0 |
  |               |
  |   0     0   1 |

Now this matrix as it stands cannot be used to transform the coordinates (x,y,z) to (x1,y1,z1), but curiously enough, its inverse can. Fortunately, the inverse of a pure rotation is simply obtained by switching (transposing) the rows and columns. In technical language, the inverse of a rotation is its transpose. Doing this yields the matrix:

  | cosθ  -sinθ 0 |
  |               |
  | sinθ  cosθ  0 |
  |               |
  |   0     0   1 |

so that, for example, in a rotation by 90 degrees about the z axis, the point (1,0,0) becomes the point (0,1,0) and the point (0,1,0) becomes (-1,0,0). So we have found a way of rotating an object to a new orientation: perform that reorientation on the object base vectors and express the result in terms of the original base vectors; then transpose the matrix to produce the matrix which transforms the coordinates of the object.
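The recipe of building the base vector matrix and then transposing it can be tried out in a few lines. This is a sketch in Python, not the book's assembler routines; the function names are made up, and floating point rounding means results are only close to whole numbers.

```python
import math

def base_vector_matrix(theta):
    """Rows express the rotated base vectors i1, j1, k1 in terms of i, j, k."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c,   s,   0.0],
            [-s,  c,   0.0],
            [0.0, 0.0, 1.0]]

def transpose(M):
    """Swap rows and columns; for a pure rotation this is also the inverse."""
    return [list(column) for column in zip(*M)]

def mat_vec(M, r):
    """Multiply a 3x3 matrix by a column vector."""
    return [sum(M[row][col] * r[col] for col in range(3)) for row in range(3)]

rotate_object = transpose(base_vector_matrix(math.radians(90)))
moved = mat_vec(rotate_object, (1.0, 0.0, 0.0))
print([round(v, 6) for v in moved])   # close to [0, 1, 0]
```

A point on the x axis is carried onto the y axis, as expected for a positive quarter turn about z.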

Can the original matrix be used for anything? Yes. As it stands, before it is transposed, it is a coordinate transform. If we were to leave the object stationary and just rotate the frame of reference, it gives us the transform to calculate what the object coordinates appear to be in the new rotated frame. This is shown in Figure A6.2. Hence in the rotation of 90 degrees about the z axis, the vertex (1,0,0) appears to be at (0,-1,0), and the vertex (0,1,0) appears to be at (1,0,0) when seen from the rotated frame. Note that in both of these rotations, of the object and reference frame respectively, the sense of the rotation was positive.

Figure A6.2 Rotation of a frame of reference

Now we can see the qualitative discussion concerning the observer on the swivel chair and the carpet expressed mathematically. The transform which calculates the coordinates of the object after its positive rotation is:

  | cosθ  -sinθ 0 |
  |               |
  | sinθ  cosθ  0 |
  |               |
  |   0     0   1 |

and the transform which calculates the new apparent coordinates of the stationary object after the reference frame has been moved in a positive direction is:

  | cosθ    sinθ  0 |
  |                 |
  | -sinθ   cosθ  0 |
  |                 |
  |   0     0     1 |

They are different when both involve a positive rotation but become the same if the reference frame (the chair) is rotated negatively. Then the angle θ is negative and because sin(-θ) = -sinθ but cos(-θ) = cosθ the terms involving sinθ change sign but those involving cosθ don’t.

This is only restating the fact that rotating the reference frame one way gives the same relative motion as rotating the object the other way.
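The equivalence can be confirmed numerically with a short sketch. It is in Python, and the names `geometric` and `coordinate` are labels chosen here for the two matrices given above.

```python
import math

def geometric(theta):
    """Matrix giving the new coordinates of an object rotated by theta about z."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def coordinate(theta):
    """Matrix giving apparent coordinates after the frame is rotated by theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def same_matrix(A, B):
    """Compare two matrices element by element, allowing for rounding."""
    return all(math.isclose(a, b, abs_tol=1e-12)
               for row_a, row_b in zip(A, B)
               for a, b in zip(row_a, row_b))

theta = math.radians(30)
print(same_matrix(geometric(theta), coordinate(-theta)))   # True
print(same_matrix(geometric(theta), coordinate(theta)))    # False
```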

Appendix G: Program Structure

This appendix shows the file content of each chapter so that you can see where new files have been introduced. The file at the left is the main control file and the files to the right of it are the files it includes. For each of these, the files in parentheses underneath are the included files.

G.1. Chapter 3

polydraw.s  core_00.s bss_00.s systm_00.s equates.s data_00.s

G.2. Chapter 4

clipframe.s core_01.s   bss_01.s    systm_00.s    equates.s   data_01.s
            (core_00.s) (bss_00.s)

G.3. Chapter 5

perspect.s   core_02.s   bss_02.s    systm_00.s    data_01.s   data_02.s
           (core_01.s) (bss_01.s)                (data_00.s) equates.s

G.4. Chapter 6

otranw.s   core_03.s   bss_03.s    systm_00.s    data_03.s   data_01.s
           (core_02.s) (bss_02.s)                (data_02.s) data_01.s

G.5. Chapter 7

illhide.s   core_04.s   bss_04.s    systm_00.s    data_03.s   data_01.s
            (core_03.s) (bss_03.s)                (data_02.s) data_00.s
                                                  data_04.s

G.6. Chapter 8

trnsfrms.s  core_05.s   bss_05.s    systm_00.s    data_03.s   data_00.s
            (core_04.s) (bss_04.s)                (data_02.s) data_05.s

G.7. Chapter 9

wrld_scn.s  core_06.s   bss_06.s    systm_00.s    data_03.s   data_00.s
            (core_05.s) (bss_05.s)                (data_02.s) data_05.s

G.8. Chapter 10

wrld_scn.s  core_07.s   bss_07.s    systm_01.s    data_00.s   data_06.s
            (core_06.s) (bss_06.s)  (systm_00.s)  (data_02.s) (data_03.s)
                                                  data_07.s   (data_05.s)
                                                  data_08.s

eulr_scn.s  core_08.s
            (core_07.s)