License
This work is licensed under the Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) License. To view a copy of this license, visit https://creativecommons.org/licenses/by-sa/4.0/ or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.
Copyright 1992 Andrew Tyler
Released under CC BY-SA 4.0, 2018
Preface
“A picture is worth a thousand words”. This statement sums it all up.
A few years ago, when I first opened a book on computer graphics, I was stunned by the beautiful simulations of lifelike objects generated by computers. But these were from state-of-the-art machines, far more powerful than the popular personal microcomputers of the time, which were almost exclusively 8-bit.
With the advent of 16-bit micros things changed markedly. Their extra power and memory had an immediate impact on all graphics applications, from painting programs to fast flight simulators sporting solid 3D primitives (objects). The low price and high power of micros such as the Commodore Amiga meant that anyone could enjoy high quality computer graphics (especially in games) at relatively low cost. But enjoying other people’s programs is only half the fun. Surprisingly, writing them is not really as difficult as it looks. Of course there is a fair amount of technology to be learnt along the way, but a good deal of the dramatic effect comes from the speed of the machines themselves, performing fairly standard algorithms very fast.
When I first became interested in graphics programming and wanted it to be as fast as possible in machine code, it seemed to me that essential information was spread thinly in the literature. There were certainly books on machine code programming and on computer graphics; there were even a few books on machine code graphics programming. But somehow I could never quite find the balance I was looking for. Standard texts on computer graphics seemed amazingly obscure on certain aspects of transforms, in particular how to picture a scene from an arbitrary viewpoint. I felt, quite unreasonably perhaps, that there was a tendency to hide it all behind a smokescreen of professional mystique; certainly it helped considerably to understand the mathematics of vectors and matrices, but surely all this had been worked out years ago and ought to be fairly straightforward? Perhaps it was just me! Anyway I wanted to write 3D solid graphics programs that would run in real time (like a flight simulator), and couldn’t find anyone who would tell me how to do it. The people who wrote commercial games certainly knew, but they weren’t telling — for obvious reasons! There were a few very useful serialised articles in magazines but, by necessity I’m sure, these were often too brief and not exactly what I wanted.
Things came to a head when I was assigned to give a college course on Advanced Microcomputer Software (which was another way of saying “Assembly Language Programming on the 68000”). Teaching programming, especially in assembly language, can be a very sterile pastime unless the application is interesting. What better application than graphics and what better machine (for the price) than the Amiga.
This book arose from my efforts to penetrate the world of computer graphics and make some of the basics understandable (I hope) to non-specialists. It is about fast 3D (so-called vector) graphics in assembly language. There is certainly no guarantee that the programs in this book are the most efficient, most elegant and fastest of their kind. But they are reasonably fast. Certainly as fast as some commercial programs! The astute reader will undoubtedly be able to make improvements (and tell me, I hope).
There is no assumption that the reader has any prior knowledge of any of the following subjects, all of which eventually figure heavily in the graphics process: the Amiga Operating System, vectors and matrices. There are further explanations in the Appendices. That is not to say that the book contains exhaustive discussions of these subjects, only sufficient for the purpose in hand. The enthusiast will undoubtedly wish to add to them.
As regards the assembly language, although an Appendix contains a list of the instruction set and (most important) the addressing modes, it is assumed that the reader who wishes to fully understand what is going on will have on hand a 68000 code reference book (they are available in pocket form very cheaply).
For the writing, assembly, debugging and running of the programs in the book the powerful and friendly Devpac Amiga assembler from Hisoft, which is one of several good commercial assembler/debuggers, has been used. This comes as an integrated package within which all functions can be performed. Further information on the assembler is given in Appendix 2.
The book is laid out in serial form. Each chapter deals with a different topic and illustrates its application with example programs. To the experienced reader the early chapters will seem pedestrian. To the newcomer they will not. There is really no easy introduction to the overall process and so each stage (a somewhat artificial division) is dealt with in detail separately. Each stage of the graphics “pipeline” does a specific task and has its own algorithm and strategy. The chapters are laid out to reflect the build up of the overall process. Each chapter has its own example programs and the programs saved from the earlier chapters are used in later ones so that they don’t have to be entered more than once. In this way the example programs at the end of the book end up being the largest and most complex, though the amount of code you have to enter for each new chapter doesn’t really increase very much. The programs are written for the Amiga but can be modified to run on any 68000-based computer since, with the exception of certain specifics concerned with the screen and operating system, the graphics routines are entirely independent and self-contained.
Computer graphics is a vast subject; a book of this length can only cover a small part. Especially since it is not just descriptive but contains working programs. Techniques such as Ray Tracing and Radiosity methods are perhaps better suited to a future, more powerful generation of personal computers. But that will come; it is likely that many of the software routines discussed here will be replaced in future machines by hardware “geometry engines”.
Until then, 3D graphics will have to be done by “bashing the bytes”.
One last very important word of caution. The experienced programmer knows all about it.
Do make frequent backups of your work — about every two hours. In writing programs of this kind, close to the hardware, there is no safety net! A faulty program can easily crash the system and spew garbage at the disk drive as it goes down. It’s happened to me. You want to lose as little of your recent work as possible. Associated with this is a useful practice — set the disk write-protect tab before you run a new program. It’ll help save the disk if the program crashes.
Good luck.
Andrew Tyler, 1992
1. An Overview
Computer graphics is not a minority interest of computer freaks. It is a multibillion dollar industry. Even in 1982 when Hollywood spent 3 billion dollars on movie production, the world commercial computer graphics industry spent 2 billion dollars and was growing at the rate of 30% a year. In the same year in the U.S. 10 billion dollars were spent on video games. There has been no halt since that time. Computer graphics is very big business indeed.
The microcomputer owner meets some of the best graphics for his machine in games, many of which use advanced concepts straight out of the professional computer journals. For small machines there are always limitations on what can be achieved, determined by the speed of the processor and the size of RAM. But in recent years the popular microcomputer has been extremely good value for money, having considerable computational power at very low price and providing complex graphics at minimal cost. The Amiga is just such a computer. This explosion in the power/price ratio of computer hardware has put immense computing capability in the hands of the popular micro owner and made advanced graphics techniques, which were the domain of the professional, available to anyone.
The aim of this book is to develop fast 3D solid graphics routines which run in real time and include features such as windowing (clipping), hidden surface removal, illumination from a light source, joystick control, and full perspective and rotational transforms, ending up with a flight simulator type program. The programs are written in 68000 machine code to run on an Amiga 500 but the algorithms are valid for any machine. In short, everything needed to get started on a flight simulator.
The programs are written in assembly language for maximum speed and have been tested and run using the Hisoft Devpac Amiga assembler. There are many excellent commercial assemblers available at modest expense, and even some in the public domain. There is nothing more irritating when looking for a persistent and obstinate bug in a program than an unfriendly assembler. The Devpac assembler has been a friendly and helpful companion through the many hours required to develop the programs in this book.
1.1. A New Medium
What is ‘computer graphics’? It is certainly shrouded in mystique to some degree. Because it is still a relatively young subject its evolution is continuing apace, and is intimately linked to the power of current computers and the special graphics hardware incorporated in them. The solutions to many of the problems of yesterday, once based in software, are now provided at great speed in hardware. It is likely that much of the software of the kind developed in this book will be replaced in future machines by dedicated ‘geometry engines’.
1.1.1. Is it Art, or What?
Humans are very good at generating and recognising complex visual patterns but not very good at doing arithmetic. By contrast, digital computers were designed to be perfect at binary arithmetic. What else they can do depends on how well complex mathematical functions can be constructed from basic binary arithmetic. There is a limitation here since numbers in a computer cannot be more accurate than the number of bits assigned to them but, apart from that, it is clear that complex mathematical calculations can be done quickly on even very modest microcomputers.
In computer graphics, the computer adds tremendous speed to any calculation associated with geometry, which is the mathematics of drawing. Because geometry is concerned with the exact mathematical relations between lines and surfaces, it is ideally matched to the way the computer works. This is the good and the bad news of drawing with computers: precise mathematical functions can be expressed graphically at lightning speed but making them look like natural objects requires considerably more work. In fact much of the effort in computer graphics is now concerned with ‘messing up’ the perfect but sterile images of geometry to make them fit for human consumption. Doing this has less to do with computers and more to do with the traditional skills of animation discovered many years ago by Walt Disney.
It is very easy to draw precise mathematical shapes with a computer because such shapes can be generated from a formula. A circle is an example of a simple mathematical function. For a circle centred at the origin of an xy coordinate system the formula is
x^{2} + y^{2} = r^{2}
Such a function is a good starting point for a billiard ball but a poor starting point for an apple, although superficially the difference is not all that great (both have an overall spherical shape with a shiny exterior). Let’s consider how we might use a computer to draw an apple.
First of all there has to be a good starting point. There is no such thing as a mathematical formula for an apple. All apples are different. However, apples do have a typical shape and that is what the human artist knows from experience. But an artist would not draw all the apples in a still life with the same shape, it would be too boring. Programming a computer to avoid repetition and simplicity is difficult.
One way to draw apples would be to use equations of curves having the apple shape. By choosing functions with high powers of x, y and z, as much sharpness or flatness as desired can be included. This is the world of bicubic patches, Bezier functions and beta-splines. This would certainly allow variation, but with considerable computational effort. One approach would be to hold different apple outlines as (x,y) coordinate pairs in a database and then use curve and surface fitting techniques to connect them as in a “join the dots” picture. This is how the famous teapot of Martin Newell, which was a prototype in the early development of modelling solid surfaces, was constructed. In technical language it can be constructed from an outline consisting of three Bezier curves. Since the teapot is symmetrical, its surface (with the exception of the spout) is then generated by rotating the outline about the central vertical axis.
Another way is to avoid curves altogether, and instead subdivide the surface of the apple into many flat facets like a gemstone. The little facets, being flat and many sided, are polygons and the surface of the apple is a polygon mesh. This approach is less time consuming than using curved patches but there remains the problem of disguising the sharp boundary edges between polygons.
This leads to the next level of refinement in producing a convincing image. A mathematical function on its own knows nothing of the laws of physics. These are so familiar to us that we take them for granted: glass is transparent but wood is opaque, metals look bright and shiny but human skin is dull and diffuse. Somehow these subtle but essential clues must be included. The most important first step is to make the rear surfaces of opaque objects invisible. This is called hidden surface removal which, despite the apparent simplicity of the task, turns out to be quite difficult. Much time has been spent investigating efficient and thorough ways of doing this. Next there must be visual clues to the surface structure. One obvious step is to illuminate it with a light source so that one side is brighter than the other.
At the next level of refinement the surface must be textured and patterned in a “natural” way to look real. In this the programmer is aided by the mathematics of fractals, developed and promoted by Benoit Mandelbrot. This is the geometry of self-similar structures and quite different from the geometry of Euclid where structures are built from perfect lines and surfaces. Natural objects appear to have a lot in common with self-similar structures and even if the similarity is not exact, they are convincingly modelled by them. A self-similar structure is one which has the same appearance at any level of magnification. Of course natural objects may only satisfy this definition over a limited range of dimensions but it often produces very convincing results. For example, the side branch of a fern when magnified looks like the main branch and small pebbles under magnification look like boulders. Nature is full of such structures. An additional bonus is that algorithms have been discovered which allow self-similar structures and landscapes to be generated from a relatively small amount of information. This relieves the programmer of carrying a colossal database from which to generate each separate detail of a complex scene.
All of these steps are essential to give a convincing image. The fact that so much visual richness is required to make an image look real testifies to the very advanced pattern recognition capability of human beings.
When all this is done, what have we got? Just a very roundabout way of painting an apple? The difference is that once created in software the graphic entity has an independent existence. The picture on the screen is just the final stage. Even if not being currently displayed, it can evolve according to rules included in the program. There is not even the constraint to create objects which are modelled on real life. It is possible to invent new “lifeforms” inside the computer. In Computer Aided Design (CAD) this is what happens all the time. Machines are designed, built and tested inside the computer long before they exist as material objects. In simulators and games this aspect is pushed as far as possible. Computer games specialise in generating artificial realities; the more exotic the better.
Future developments in input-output devices will undoubtedly have a major impact on what is currently called computer graphics. At the moment the emphasis is on generating realistic images. But images are only computer output designed for human input through the eyes. What will it be called when all of the senses are involved? Already, with the aid of spectacles which give separate input to each eye and tactile stimulation on the hands, it is possible to enter totally into the world inside the computer. This is Virtual Reality or Cyberspace. What will it be like when the computer couples directly into the human nervous system without the need for an intermediate interface? Aside from the minor consideration of feeding the body, it will be possible to live out an entirely artificial existence inside the computer.
Computer graphics is the thin end of a very long wedge which started when computers first produced a visual output in response to human input. Where it will end is unknown, but along the way it is sure to be lots of fun.
1.2. What Can You Do With A 16-bit Micro?
The answer to this question is best illustrated by looking at what is achievable on a powerful commercial system, of which a good example is the Reyes system developed at Lucasfilm Ltd and currently in use at Pixar. This has been used to make a number of well known short film sequences including “The Adventures Of Andre and Wally B”, “Luxo Jr.”, “Red’s Dream” and the animated knight sequence from “Young Sherlock Holmes”. The Reyes system was set up to compute a full length feature film in about a year, incorporating graphics as visually rich as real life. Assuming a movie film lasts about 2 hours and the film runs at 24 frames per second, this means each frame must be computed (rendered) in approximately three minutes.
The basic strategy in this system is to represent each object (geometric primitive) in a scene by a mesh of micropolygons, which are sub-pixel-sized quadrilaterals with an area of 1/4 of a pixel (the smallest visible unit on the screen). All the shading and visibility calculations are done on these micropolygons. The overall picture is constructed like a movie set with only the visible parts actually being drawn. Micropolygons are deemed to be invisible if they lie outside a certain viewing angle or are too close or too far away. The final system includes subtleties such as motion blurring, the effect whereby objects in motion appear to be blurred at their trailing edges. This is one of the devices used to enhance the impression of motion and is another lesson learned from traditional cartoonists.
A very complex picture in this system typically uses slightly less than 7 million micropolygons to render a scene of resolution 1024x612 pixels. With 4 light sources and 15 channels of texture a picture takes about 8 hours of CPU time to compute on a CCI 6/32 computer, which is 4-6 times faster than a VAX-11/780. Frames from “Young Sherlock Holmes” were the same resolution and took an hour per frame to compute. In the final movie all the stored frames are played back as in a conventional film.
But it’s not necessary to go as far as this to produce high quality pictures. There are now “personal” graphics stations available at prices almost within the reach of mortals. The Personal Iris machines manufactured by Silicon Graphics are good examples. They offer 256 colours (8 planes) from a palette of 4096 and, using a hardware “geometry engine”, are able to perform transforms such as scaling, rotation, hidden-line removal and lighting, amongst others, to produce 3D motion in real time. The CPU is a 20MHz R3000 RISC processor with an R3010 FPU (floating point unit). Here RISC technology has been used to maximise the speed, but it is interesting to note that before 1986 Silicon Graphics used the 68000 processor. It will not be long before machines such as these drop into the personal computer market.
What about a micro like the Amiga 500 with 512 kbytes of RAM and a CPU working at 7 MHz? The potential for detailed graphics is somewhat less, especially if frames are to run in real time, sufficiently fast to avoid intolerable flicker. But it is surprising how much can be achieved. For speed, building up solid objects using polygon meshes is most attractive since it only requires that the vertices be stored, and a large object can be described by a very small amount of information.
Moreover, since polygons are sets of vertices joined by straight lines, the most complex algebra involved will be that of simple geometry. This is the strategy we will use.
1.3. Assembled for Speed
There are many computer languages but assembly language gives the best opportunity of getting as close to the hardware as possible and tailoring to the application in hand. All the programs in this book are written in 68000 assembly language and, except for “housekeeping chores” and specific hardware functions in the first chapter, do not use any of the routines in the Amiga operating system. The programs could therefore easily be rewritten to run on a processor other than the 68000 since the most difficult thing is the overall program structure. Language details are secondary.
Assembly language is very exacting and unforgiving with a masochistic charm all of its own. It really has very little grammatical structure beyond the syntax of the instructions themselves, and the main criteria for efficient programming are speed, economic use of registers and memory, and efficient parameter passing. Sometimes there is conflict between these, especially where there is no shortage of memory. Where speed is all important, programs often sacrifice brevity in order to avoid timeconsuming subroutine calls.
The programs in this book have been assembled and run using the Devpac Amiga assembler from Hisoft but any other assembler will do providing changes are made to comply with its format. The simple but powerful INCLUDE directive allows files to be pulled together at assembly time without the need to define global variables. The INCLUDE directive can be nested to any depth that memory will allow so that each chapter can INCLUDE the programs from earlier ones. In this way there is hardly any duplication, and a program file, once entered, can be used later. The overall program therefore grows steadily in size as the book progresses and practically no programming effort is wasted. The final program INCLUDE’s all earlier parts. This is the only linking which needs to be done and it is painless.
Appendix 2 gives a brief description of assembler usage in general and the Devpac assembler in particular, including those commands which have been found to be most useful.
1.4. Writing for a 16-bit Micro
Writing programs in assembler for a 16-bit micro is quite different from writing for an 8-bit micro. Apart from the more powerful addressing modes available, there is a fundamental difference which centres on the ideas embodied in position-dependent and position-independent code. The picture is somewhat confused by other similar sounding terms such as absolute and relocatable code. We shall discuss what these mean because they have a profound effect on how a program is written in assembler.
In an 8-bit micro usually only one program at a time is loaded in RAM and at a fixed location. Of course where an operating system oversees the running of programs, such as CP/M, things are more complicated. But in small micros with built-in BASIC and very little else, the operating system reserves fixed space for its variables area and frees everything else for the current program. Knowing where the program resides in memory makes life simple for the programmer since fixed addresses can be assigned for variables and these will never change. A program which directly addresses fixed memory locations is said to be written in position-dependent or absolute code.
Though such code can be written for computers with operating systems, there is another way of doing things which gives much greater flexibility, and allows several programs to reside in memory simultaneously. A consequence of this is that the actual position in memory of a particular program will not be known until run time. As a result, no actual numerical address can be referred to in the program since it is not fixed until the program is loaded and run.
There are several ways of overcoming this problem. One way is to use an addressing mode of the processor specifically designed to generate position independent code. This is called PC (program counter) relative addressing. What it does is locate an address not as an absolute value but relative to the value of the program counter where the reference is made. The assembled code will tell the processor to calculate the actual address by adding or subtracting a displacement to the current value of the program counter, which will always have a fixed value relative to the start of the program.
Another way is to calculate all addresses from a base address, or pointer, held in an address register. The program will then constantly refer to offsets from the address register but no actual value for the address need be specified when the program is being written. The register cannot, of course, be used for anything else while it is reserved in this way. The special register will have to be set up at the start of the program with the correct pointer. A good pointer is the address of the end of the program.
Another way is to allow the assembler to take care of everything and generate relocatable code. This is code where no reference to specific addresses is made, but instead labels are used. The label name is chosen to be informative and of assistance to the programmer. For example, COLOUR might be the label for the long-word address where the byte-length value of the current colour of a polygon is held. The assembler will mark such a label as relocatable and its address will finally be fixed by the computer operating system when the program is loaded.
All of the programs in this book use relocatable code generated by the Devpac assembler. It is simple to write.
The instruction set of the 68000 is long and complex. To fully appreciate its power and elegance the reader should refer to the Motorola 16-Bit Microprocessor User’s Manual. A brief listing is given in Appendix 1.
1.5. The Programs
The programs in this book have been written using the Devpac assembler and are ready to run. Once a program has been entered all that is necessary is to assemble it from within the editor and it will run as described. The program files all have the extension .s since they are source files. If a program is to run independently it can be assembled to disc with the file extension .prg. In fact for reasons of space, it is likely that all the programs beyond Chapter 6 will have to be assembled to disc.
The programs have all been run extensively to ensure they are as bug free as possible, and the listings have been obtained from within the assembler Editor using the PRINT BLOCK facility to ensure that there are no further stages of transcription during which errors might creep in. However as with all human endeavours, there can be no guarantee that the programs are completely bug free.
The programs are undoubtedly neither the fastest nor most elegant examples of their kind in existence but, in a tutorial of this kind where the emphasis is on teaching, the main point is to understand how things are done. The astute reader will quickly discover clever ways of improving them. In any case the best commercial programs are proprietary and kept secret from us.
1.6. The Amiga Operating System
The Amiga operating system is large and complex and operates at many levels. There are often many ways of doing the same thing depending on the level of entry.
Using the device independent routines ensures that programs are portable, i.e. they are shielded from hardware details and in principle work on any machine with the same operating system. The penalty is one of speed. Generally the closer you get to the hardware, the faster things run.
Apart from this, all the programs are “original” (if there is such a thing in programming) and tailored closely to the graphics applications.
2. Modelling a 3D World
One of the most fascinating aspects of computers is the way they can be used to build lifelike models. The great attraction of realistic computer games and, at the more serious end, simulators stems from the way the computer screen can be made to look like a window onto an invented universe: a Virtual Reality. Some famous scientists, impressed with the similarity to the process of creation, have even gone so far as to consider theories of reality based on a real Universe built up from ‘bits’ of information. Whatever the fundamental significance of it all, the fact remains that computers offer a new dimension for human expression and experience. Simply put, they provide the possibility to create alternative realities where the laws of Nature may or may not apply. All sorts of strange and exotic situations can be invented and investigated. For human beings, who relate most easily to objects and situations met in everyday life (and dreams), what appears on the computer screen should look familiar. Great effort has gone into constructing models of this kind. In a simulator which is supposed to accurately depict reality, the emphasis is on models which obey the laws of Nature precisely.
In this chapter we will look at a way of modelling which provides a very fast and reasonably accurate picture of real objects. For the most part, but not completely, this involves polyhedral structures with polygonal faces as the building blocks, the so-called ‘vector’ graphics. Spheres and other objects with a high degree of symmetry can also be drawn quickly. Actually, to set the record straight, vector graphics originally meant something else. It was a name given to a mode of display where points on the monitor were joined directly by an electron beam that could be switched quickly from one part of the screen to another. This did not require much memory devoted to the screen and gave very fast ‘wireframe’ pictures. The displays on monitors today do not use this technique. Instead, the image is built up from horizontal raster scans from one side of the image to the other. It is called raster scan (or scan conversion) graphics. The speed with which an outline can be filled by raster scans makes it a very useful technique. However the name vector graphics has become commonly used to describe the graphics modelling technique itself, not the display technology. The adjective “vector” here really refers to the extensive use made of vector geometry in the programs.
One other important technique is the Block Image Transfer type of graphics, in which SPRITES play an important role. The Amiga has a piece of hardware on board, the BLITTER (BLock Image TransfER), which handles such operations very quickly. In such graphics, blocks of memory are manipulated as a whole, which is very useful since, once laid out in RAM, scan conversion need not be done a second time. The block of bytes is simply moved to the screen area. Some very clever and fast things can be done this way, particularly with sprites, but the relationships between the parts of the image are essentially determined by how the block is initially laid out in RAM. Sprite graphics is not discussed any further in this book.
Having said that, it is likely that the next generation of popular computers will have hardware implementation of all the common graphics functions including the ‘vector’ graphics we are about to discuss. It is very probable that soon all graphics functions will be done by very fast hardware ‘geometry engines’.
2.1. 3D Modelling
“Real-time” 3D modelling has to be very fast, because humans can spot flicker if the picture changes more slowly than about once every 50 milliseconds. To work in real time, the viewer must be able to enter new data through the keyboard, joystick or mouse and see its effect immediately. The solid 3D structures which can be transformed and drawn most easily on this time scale are polyhedra.
Polyhedra make very good graphics building blocks, or ‘primitives’, for several reasons:
- they are completely defined by their vertices,
- their faces are polygons with straight edges,
- in any transformation only the vertices need to be recalculated,
- a transformed polygon is still a polygon,
- polygons can quickly be filled in to look ‘solid’ using raster scans.
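These properties translate directly into a compact data layout: each vertex is stored once, and each face is just a list of vertex indices, so a transform need touch nothing but the vertex table. A minimal sketch in C (the book's listings use 68000 assembler tables; the names and sizes here are purely illustrative):

```c
#include <assert.h>

/* A polyhedron is completely defined by its vertices; faces are stored
   only as lists of vertex indices, so any transform recalculates the
   vertex table while the face lists never change. */
typedef struct { float x, y, z; } Vertex;

typedef struct {
    int nverts;      /* number of vertices (= edges) in this face */
    int index[8];    /* positions in the shared vertex table      */
} Face;
```

The same separation of geometry (vertices) from topology (index lists) is what the house example later in the chapter relies on.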
What all this really means is that it is very hard to draw and shade curved surfaces unless they have high symmetry (like circles), and the only 3D objects without curved surfaces are polyhedra.
In fact computer graphics does not have a monopoly on the use of polyhedra as basic building blocks. The real world uses them extensively; many houses are made from bricks, which are six-sided polyhedra.
2.2. Transformations and Frames of Reference
All of the above statements concerning polyhedra can be translated into a definite mathematical framework called vector algebra, which is a very elegant and precise formulation of the mathematics of lines and planes. It becomes even more useful when presented in matrix form, and it is this approach which usually appears in textbooks on computer graphics. For someone with little knowledge of advanced mathematics this looks very intimidating. Actually it’s not. Many secondary school syllabuses handle simple rotations using 2x2 matrices, and it really isn’t much more complicated than that. For those of us who do not wish to blaze new trails in the world of mathematics it is simply a case of understanding the general method and taking the results on trust. After all, once you have seen the transforms working you can use them in your programs and forget about them. There’s no need to reinvent the wheel.
For the moment though, in order to see the problem laid out in its entirety, let’s consider all the various stages of transforms, as shown in Figure 2.1. The distinction between the view frame and the world frame, and transformations between them, is discussed in further detail in Appendix 6.
2.2.1. The Object Frame
An object which exists inside the computer has quite a complicated life before it is seen on the screen. Most of this complication arises from the various transforms required to make it ‘lifelike’. But whatever they are (rotations, translations or even something more exotic), the object must preserve its original identity, i.e. its relative dimensions. No calculation can be absolutely precise and, with the picture being recalculated faster than 20 times each second, if the original definition were not continually referenced it would not be long before cumulative errors made the object unrecognisable (this problem crops up in all our calculations which, for speed, are done to only limited accuracy). Therefore it is necessary to refer back constantly to the original data which define the object. We call this place, in which the object is defined, the object frame (there is nothing sacrosanct about this name; other people have invented other names). Of course it doesn’t ‘exist’ in any real sense; it’s just that the numbers which fix the positions of the vertices are coordinates measured from some origin, and this origin is where the object frame is said to be located. The object frame can be positioned so as to reflect the symmetry of the object. For example, the natural object frame of a cube could be a Cartesian (x,y,z) coordinate system centred at the centre of symmetry (centre of gravity) of the cube, with the sides of the cube parallel to the x, y and z axes of the coordinate system, as shown in Figure 2.1.
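In C-like notation, a cube defined in its own object frame might look like this (a hypothetical sketch, not the book's actual data tables):

```c
#include <assert.h>

/* The eight corners of a cube of side 2, defined in its object frame:
   centred on the origin, edges parallel to the x, y and z axes. */
static const float cube[8][3] = {
    {-1,-1,-1}, { 1,-1,-1}, { 1, 1,-1}, {-1, 1,-1},
    {-1,-1, 1}, { 1,-1, 1}, { 1, 1, 1}, {-1, 1, 1},
};

/* The centre of gravity of the vertices coincides with the frame origin,
   reflecting the symmetry of the object. */
static void centre_of_gravity(float out[3])
{
    out[0] = out[1] = out[2] = 0.0f;
    for (int i = 0; i < 8; i++)
        for (int j = 0; j < 3; j++)
            out[j] += cube[i][j] / 8.0f;
}
```

This table is the unchanging reference data; every frame's transforms start from it, which is what keeps cumulative rounding errors from distorting the object.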
There may be several object frames combined together, particularly when a complex object is made up of several simpler objects. The process of sticking together simple objects (primitives) to make a complex one involves just the kind of transforms we have been talking about. These transforms are sometimes referred to as instance transforms.
2.2.2. The World frame
Having constructed a complex object — which can be thought of as an ‘actor’ in the scenario we are about to create — it is necessary to place it in the arena with all the other ‘actors’. This common space, inhabited by all objects, is called the world frame. It is the place where the Laws of Nature play a role. For example, objects which are not subject to any force either remain at rest or move at constant velocity; that is Newton’s First Law. Since this world is our creation we do not have to stick to these laws if we do not wish to. The world frame is also the place where collisions are tested for. We will call the transform which moves an object into its final position in the world frame the object-to-world transform. It will consist of some combination of rotation and translation.
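A rotation followed by a translation can be sketched as follows, in C rather than the 68000 assembler of the book's listings (the function name and matrix layout are illustrative):

```c
#include <assert.h>

/* Object-to-world transform: rotate the object-frame point p by the 3x3
   rotation matrix R (row-major), then translate by t, the object's
   position in the world frame. */
static void object_to_world(const float R[3][3], const float t[3],
                            const float p[3], float out[3])
{
    for (int i = 0; i < 3; i++)
        out[i] = R[i][0]*p[0] + R[i][1]*p[1] + R[i][2]*p[2] + t[i];
}
```

With R set to the identity matrix the transform reduces to a pure translation: the object is simply parked at position t in the world.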
2.2.3. The View Frame
Everyone in the real world has a different view of it, and the same thing applies to the world we are creating inside the computer. The only difference is that there is only one screen and therefore only one viewer. The view of the world depends on where the observer is standing and looking.
The view of the world seen by the observer is most easily represented by the view frame. This is a set of x-, y- and z-axes which follow the gaze of the observer. Usually the z-axis points forward and, in our convention, the x-axis points vertically up. In this picture an object which is straight ahead at a distance of 100 will have the coordinates (0,0,100) in the view frame, and if the observer rotates to the left by 90 degrees it will have view frame coordinates (0,100,0). In general the view frame’s position in the world frame will be changing continuously. In a flight simulator, for example, the view frame is the view from the cockpit.
It might appear at first sight that there is an unnecessary duplication of points of view in all these frames of reference. However they define a natural hierarchy within which the overall picture can be constructed to make it easy to take account of the relative motions of the observer and graphics primitives (objects).
One thing in particular is worth noting. Rotating the view frame to the left or moving the scene to the right results in the same relative motion and gives the same picture on the screen. This suggests that there is a simple connection between the two motions: in the language of mathematics, one is said to be the inverse of the other. We will return to this when we look at the rotations in detail; the point is also examined in Appendix 6.
2.2.4. The Screen
This is the logical screen, the block of RAM on which pictures are drawn before being displayed. It is mapped out following the way RAM is allocated to the screen, which in turn depends on the screen resolution, as described in Chapter 3. This results in the origin (the point with screen coordinates (0,0)) being at the top left-hand corner of the screen. To get from the view frame to the screen we must make a ‘projection’ of the objects we wish to display onto a plane called the view plane. This is the perspective transform, and it must preserve the ordering in space, so that objects which are farther away look smaller. It is done by tracing “rays” from objects to the view point, which is the location of the observer’s eye. The intersection of these rays with the view plane defines the outlines as they will appear on the screen.
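The ray-tracing description boils down to similar triangles, sketched here in C (the view-plane distance d, the screen size and the names are all illustrative assumptions, not the book's routines):

```c
#include <assert.h>

/* Perspective projection by similar triangles: the ray from a point to
   the eye crosses the view plane, a distance d in front of the eye, at
   d*(coordinate/depth). In our convention z points forward and x up; the
   result is offset so screen (0,0) is the top left-hand corner of a
   320x200 display. */
static void project(float d, const float p[3], int *sx, int *sy)
{
    float u = d * p[1] / p[2];   /* horizontal offset: view-frame y      */
    float v = d * p[0] / p[2];   /* vertical offset: view-frame x (up)   */
    *sx = 160 + (int)u;          /* screen centre is (160,100)           */
    *sy = 100 - (int)v;          /* screen y increases downward          */
}
```

Doubling a point's depth p[2] halves u and v, so more distant objects come out smaller, preserving the ordering in space as required.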
The transform to the screen coordinate system is almost the last stage, but not quite: the screen has limits. It may turn out that parts of the picture lie outside the screen RAM, the part of memory allocated to the screen. If no attempt is made to restrict points to the visible screen then the program will try to plot them outside screen RAM, which could lead to a system crash. For this reason, unless it is absolutely certain that no point to be displayed will ever lie outside the screen RAM, only part of what is visible on the view plane is allowed to reach the screen. This is “windowing”. What is not visible must be “clipped” away. The outline which defines the window on the display is called a view port; in recognition of the work that goes into producing the final image it is sometimes also called the clip frame.
There is even a need to clip in three dimensions in the view frame itself. Objects which are a long way away from the observer should not be displayed, and no time should be wasted worrying about them. It is a consequence of having a finite drawing resolution on the screen that small objects become badly distorted. Ultimately all very distant objects will end up as single pixels and the horizon could have a cluster of dots all over it. Sets of parallel lines will ultimately converge to a single line which will then never diminish in intensity. To stop all of this it makes sense to clip out altogether objects which are more than a certain distance from the origin of the view frame.
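The two kinds of clipping, at the view port boundary and at a maximum viewing distance, amount to simple accept/reject tests. A sketch in C (a full clipper must also cut polygon edges at the window boundary; this only rejects whole points, and the limits are illustrative):

```c
#include <assert.h>

/* A point reaches the display only if its screen position lies inside
   the view port. */
static int inside_viewport(int sx, int sy)
{
    return sx >= 0 && sx < 320 && sy >= 0 && sy < 200;
}

/* Objects beyond a chosen maximum distance along the gaze (z) direction
   are clipped out of the view frame entirely. */
static int beyond_far_limit(const float p[3], float zmax)
{
    return p[2] > zmax;
}
```

Rejecting distant objects before projection also saves the time that would otherwise be wasted transforming points destined to collapse into single pixels on the horizon.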
2.3. Coordinate Systems
When we try to put all of these transforms on a mathematical basis we immediately run into a sticky and irritating problem: how to define the coordinate systems. It is standard in engineering, science and most of mathematics to work in right-handed Cartesian coordinates. A right-handed and a left-handed Cartesian coordinate system are both shown in Figure 2.2. In keeping with this convention we will always use a right-handed Cartesian coordinate system. However, be warned, this is not standard in the world of computer graphics. Left-handed systems abound and sometimes both conventions are used at the same time!
There is another frequently used convention within computer graphics which, if we are to stick with it, forces the orientation of the axes in the view frame. It is that the positive z-axis points forward into the picture, along the direction in which the observer is looking.
Putting all this together, we have chosen to end up with the various coordinate systems shown in Figure 2.1. Positive x is up and, in the world frame, the y-z plane defines ground level.
Coordinate systems and frames of reference are also discussed in Appendix 6.
2.4. Vectors and Matrices
For someone who loves computing but not mathematics, the introduction of matrices and vectors is not very welcome. Although it is possible to do all of the required mathematics by straightforward algebra, vectors and matrices establish an elegant and consistent framework within which to work. In addition there are properties of matrices which make them especially useful. An example is when a series of transforms takes place in succession, such as when a rotation of an object about the x-axis is followed by a rotation about the y-axis. Instead of calculating the coordinates of the object twice, once after each rotation, it is possible to concatenate (multiply together) the two transformation matrices and then perform the combined transform once only. This can save a lot of time when there are many points to transform.
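Concatenation is nothing more than one 3x3 matrix multiplication, sketched here in C (illustrative names; the book's transforms are assembler routines):

```c
#include <assert.h>

/* Concatenating transforms: instead of applying rotation B to every
   point and then rotation A to every result, form the product C = A*B
   once and apply C to each point. */
static void matmul3(const float A[3][3], const float B[3][3], float C[3][3])
{
    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++)
            C[i][j] = A[i][0]*B[0][j] + A[i][1]*B[1][j] + A[i][2]*B[2][j];
}
```

For an object with hundreds of vertices this replaces hundreds of second matrix-vector multiplications with a single 3x3 product, which is where the time saving comes from.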
We will discuss the various types of transforms in detail as they come up. Appendix 5 also explains matrices and vectors.
2.4.1. Vectors
Vectors are a mathematical shorthand which tells you how far to go in a given direction, and they go together with matrices. Here again there are two conventions: vectors can be written as row vectors or column vectors. This doesn’t mean very much, except that it changes how a vector looks when it is written down and the arrangement of elements inside the transformation matrices. In the teaching of engineering and science it is more usual to write vectors in column form, and we will adhere to this convention exclusively throughout the book.
2.5. Data Structures
2.5.1. Variables and Labels
One of the most difficult things to get used to when first using assembly language is that there are no algebraic variables, just data stored in registers and at memory locations. You can’t add x to y but you can add the contents of register d0 to the contents of register d1. In a 16-bit system such as the Amiga even memory locations are hard to pin down, because they are not always known when the program is written (except for the addresses of registers used by the Operating System, which are fixed). This is in contrast to simpler 8-bit micros, where PEEKing and POKEing allow access to anywhere in RAM at addresses which are fixed and always available to the program. The problem with a micro with an advanced operating system, like the Amiga, is that until a program is actually loaded in the machine and ready to run, its exact location will not be known. There is a way of forcing the Operating System to load the program at a particular memory location by the use of absolute code (set by the assembler directive ORG), but that builds inflexibility into the program and may lead to clashes with other software. That may not be a problem with a game which will tie up the computer all to itself, though it may fall victim to later modifications of the operating system.
The general philosophy is to produce programs which are insulated from all of this and come as complete self-contained packages which can be located and run anywhere in RAM. At first sight there appear to be insurmountable problems with this approach: how can you set up a table of data and later find it, and how can you set up a table of addresses (jump vectors) of subroutines to execute depending on the outcome of a test? There are various solutions to these problems, some of which utilise particular addressing modes of the processor and others of which rely on the assembler, as we have already mentioned in the discussion of position-independent and relocatable code in Chapter 1. The key is the extensive use of labels, which are temporary substitutes for addresses to be calculated later.
Labels play a very prominent part in any assembler program. The way they appear in the code makes them look like algebraic variables but they are not. A label is a pointer to a memory location where the current value of a variable is held, or it is a pointer to another part of the program. This is where much of the difficulty arises.
2.5.2. Lists
Finding ways of efficiently storing and accessing data has been the subject of intense study in computing. In computer graphics it is very important, particularly where speed matters. The important thing is to store data in a form that is easy to get at for the problem in hand. It may not always be in the best form for all applications all the time, and some manipulation may be required along the way.
In vector graphics, where primitives are modelled by polyhedral structures with polygonal faces, what matters most are lists of vertices (corners) and the straight-line edges joining them. Figure 2.3 illustrates a house modelled in this way. There is more than one way of setting up a data list to describe this structure, but the one we will most commonly use has at its centre the list of connections which describe the surfaces uniquely: the edge list. One thing to avoid is having to repeat the actual coordinates of the vertices more than once. It is better to give each vertex a number and refer to that instead. When the x-, y- and z-coordinates of a vertex are required they can be drawn from the list of coordinates by the powerful indexed addressing modes of the 68000, provided the position in the list is simply related to the vertex number. To make this point clear, here are the lists which are needed to draw the house. There will be other lists as well, containing other attributes such as the colour of each surface and so on, but they are not shown here. The house is not very complicated, but it is sufficiently so to show how long the lists might become for a really complex object.
First the number of polygons in the house as a whole must be specified. Each plane face qualifies: four walls, two sloping roofs, one floor, one door, so we have:
surface number: 8
There is only one entry here but if there were other buildings it would be a list. Then the number of edges in each surface is given, where the entry has the same position as the number (circled) of the surface as shown in the figure:
edge numbers: 5, 4, 5, 4, 4, 4, 4, 4
After this the ordered list of vertex numbers going clockwise round the exterior face makes up the edge list. To make the data most useful to the program, the first vertex for each surface is again repeated at the end of its group to make a closed loop.
edge list: 7,8,9,2,1,7,1,2,3,4,1,4,3,10,5,6,4,6,5,8,7,6,5,10,9,8,5,2,9,10,3,2,1,4,6,7,1,11,12,13,14,11
Finally the actual coordinates, in whatever scale is being used, are given for x, y and z in the order of vertex numbers:
x coordinates: 0,100,100,0,100,0,0,100,150,150,0,50,50,0
y coordinates: 50,50,50,50,50,50,50,50,0,0,50,50,50,50
z coordinates: 100,100,100,100,100,100,100,100,100,100,10,10,10,10
These data would be used to define the house in the object frame. Following the transformation to the world frame, some of the lists (the edge list, the edge numbers and the surface number) would be unchanged, but the coordinates in the world frame would be different.
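In C the same lists might be laid out like this; the book's versions are assembler data tables, so this is only a sketch of the indexing scheme:

```c
#include <assert.h>

/* The house lists transcribed into arrays. Each surface i occupies
   edges_per_surface[i] + 1 entries of edge_list, the extra entry being
   the repeated first vertex that closes the loop. */
static const int edges_per_surface[8] = {5, 4, 5, 4, 4, 4, 4, 4};

static const int edge_list[] = {
    7,8,9,2,1,7,   1,2,3,4,1,   4,3,10,5,6,4,   6,5,8,7,6,
    5,10,9,8,5,    2,9,10,3,2,  1,4,6,7,1,      11,12,13,14,11,
};

/* Where a given surface's loop of vertex numbers begins in edge_list. */
static int loop_start(int surface)
{
    int pos = 0;
    for (int i = 0; i < surface; i++)
        pos += edges_per_surface[i] + 1;
    return pos;
}
```

Because each vertex number indexes into the separate coordinate lists, the coordinates are stored only once however many surfaces share a vertex, which is exactly the economy the edge-list scheme is designed for.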
2.6. Summary
What should be one’s attitude towards these very mathematical aspects of 3D graphics? If you are mathematically inclined, then it makes sense to try to understand what’s going on in detail. This gives you the power to write your own transforms and explore some of the very interesting effects that can be produced. If you are not mathematically inclined then just regard the mathematical transforms as software “black boxes” to be “plugged in” as required. The transforms in this book are structured to allow you to do this. You only have to understand how to present data to them.
3. Drawing on the Screen
In this chapter we look at how the Amiga screen is addressed. This is detail which is highly specific to the Amiga but of great importance for fast graphics since our intention is to draw 3D solid objects in real time. A very important aspect of this will be filling in polygonal shapes quickly.
No matter how complex graphics programs are, ultimately their output must appear on the screen. For the new programmer there are a number of confusing terms associated with producing visible output: playfields, bit planes, screens. These concepts arise because of the power and flexibility built into the Amiga.
Simply put, the Amiga has been designed with a powerful set of tools to implement two-dimensional sprite graphics (the graphics of Pacman, icons and many popular games of the scrolling variety), and this is reflected in the graphics terms. That is not to say that three-dimensional graphics is difficult; on the contrary it is very well catered for but, as we will see, for 3D only a small part of the graphics arsenal included in the Amiga need be used.
First of all remember that the picture that ends up on the monitor screen is a direct “map” of the contents of RAM. The word map, as it is used here, is a bit of mathematician’s jargon. It means that what is in RAM entirely defines what is on the monitor screen, though it will not look at all like it. RAM is simply a series of contiguous (all in a line) bytes of memory which the hardware converts to the 2D picture. To make life easy, we will refer to the part of memory dedicated to displaying pictures by a special name, Video RAM, a term commonly used in computer systems. In the Amiga, Video RAM must lie in the section of memory which can be accessed by all the custom hardware (the chips called Agnus, Paula and Denise) and which is called Chip RAM. The prime requirements of Video RAM are that it must be laid out so as to allow easy drawing and to hold colour information. Understanding how colour is included is the key to understanding the layout of Video RAM.
There is an additional and important complication in the graphics we will be doing. It is called double buffering or screen buffering and is essential for flicker-free pictures. What it amounts to is having two chunks of Video RAM available: one to draw on and the other to display. On the Atari ST these are given the helpful names logical and physical screen, respectively, and we will use the same terms here. They are switched back and forth so that whilst a picture is being drawn on one screen (logical) the other (physical), which holds the last complete picture, is displayed on the monitor. Then when the new picture is completed it is put on display and becomes the physical screen. The old physical screen then becomes the new logical screen and is erased ready for drawing the next picture. It helps to think of each new picture as a frame, in the movie sense, so that the real-time graphics evolves like an interactive movie.
The programmer arranges that the switch from one screen to the other is naturally synchronised to the program; the program doesn’t ask for the switch until the new frame is complete, and the hardware doesn’t change the display until the raster on the screen has reached the bottom right-hand corner and is ready to fly back to the top. The short time for this to occur — called the vertical blank — is more than sufficient for the hardware to switch the screens.
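The swap itself is just an exchange of two pointers followed by an erase, sketched here in C (buffer names and the single-plane simplification are illustrative; the real screens each hold five bit planes):

```c
#include <assert.h>

/* Double buffering: draw on the logical screen while the physical screen
   is displayed, then exchange the two pointers (in practice during the
   vertical blank). 8000 bytes is one low-resolution bit plane. */
#define PLANE_SIZE 8000
static unsigned char buffer_a[PLANE_SIZE], buffer_b[PLANE_SIZE];
static unsigned char *logical_screen  = buffer_a;   /* drawn on  */
static unsigned char *physical_screen = buffer_b;   /* displayed */

static void swap_screens(void)
{
    unsigned char *tmp = logical_screen;
    logical_screen  = physical_screen;
    physical_screen = tmp;
    /* erase the new logical screen ready for the next frame */
    for (int i = 0; i < PLANE_SIZE; i++)
        logical_screen[i] = 0;
}
```

Note that only pointers move; the frame data itself is never copied, which is what makes the switch fast enough to fit inside the vertical blank.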
3.1. The Screen
To understand the problem, think of the differences between the actual monitor screen and the block of Video RAM holding the image. The monitor screen is a rectangular end of a cathode ray tube on which an electron beam writes. To make this look like a picture the beam moves very quickly from left to right and top to bottom in a series of ‘raster’ scans; the picture is made up of closely spaced horizontal lines each made up of units called pixels. There isn’t really a solid picture at all, it just looks that way from a distance. To see this for yourself, inspect the monitor screen closely with a magnifying glass.
Memory, on the other hand, is laid out as a contiguous line of bytes, which are the smallest elements the microprocessor can directly address. Of these, the smallest resolvable unit is the bit (8 bits = 1 byte). Somehow each bit in memory must directly relate to the smallest ‘spot’, or pixel, on the screen. The great flexibility of the Amiga allows many variations depending on the number of lines and the size of a pixel. Taken together these two constitute the screen resolution: the more lines and the smaller the pixel, the greater the resolution. We will examine only the case illustrated in the programs in this book: the so-called low resolution, with 32 colours and a pixel matrix of 320x200 (320 across by 200 down). This mode allows us to produce fast real-time pictures with a good colour range. Other modes may offer higher resolution or more colours but are too slow. The interested reader is referred to the Amiga Hardware Reference Manual for details of the other modes.
3.1.1. Playfields and Sprites
At the heart of the Amiga graphics system is the playfield. The playfield is really nothing more than our logical screen. Playfields go together with sprites to make up the 2D-orientated system which is so powerful for Pacman-like images and icons. Sprites are pre-drawn images, 16 bits (one word) wide and any number of lines high, which can be copied onto the playfield very fast. By first preparing in memory several sprites showing different orientations of an object, and then copying them onto the playfield in succession, it is possible to simulate 3D-type motion. As far as we are concerned this is a cheat. The graphic objects, or primitives, we will create will have an independent existence of their own and only at the instant of drawing will they be converted to a pattern of bytes. Since we will never use sprites they are of no further interest to us.
So the place in chip RAM where the picture is to be drawn is called the playfield. As a consequence of the flexibility built into the Amiga system, the playfield can be too big (or too small) for the monitor screen. This does not lead to a catastrophe, it simply means, if it is too big, that only part of the playfield will be visible at any one time. This is an excellent arrangement for games and displays where the background can be made to scroll across the screen. As a further complication you don’t even have to fill the whole screen. The picture can be “windowed” down to the desired size at the time of display. Once again, we will not utilise these variations. In our case the playfield will exactly fill the monitor and the window will include the whole playfield. This will not in any way limit our graphics.
Having decided on the screen resolution, size and number of colours, we can now calculate the playfield size in RAM. This leads naturally to the idea of bitplanes.
3.1.2. BitPlanes and Colour
A playfield holds the picture which will be displayed on the monitor and, in our case of low resolution, must hold sufficient information to display 200 lines vertically, 320 pixels horizontally and show 32 colours. Colour makes the big complication and is the key to understanding why the playfield looks the way it does.
To get a feel for what is going on, let’s first consider the simplest possibility, that of only 2 colours. Here each pixel can be either on or off. The simplest playfield we could construct to do this is one where the first 40 bytes (40 x 8 pixels) represent the first raster scan line, the second 40 bytes the second scan line, and so on. In this way each bit in memory represents a pixel; if it is set to 1 the pixel is on and if it is set to 0 the pixel is off. To fill the entire screen the playfield would have to be 200 x 40 = 8000 bytes in size. That is in fact how it is done in this case. The array of 8000 bytes is called a bit plane, for that is exactly what it is. In this case the playfield contains a single bit plane. Moving up to 4 colours doesn’t present a problem in this scheme but now 2 bit planes are needed. There is still a one-to-one connection between each pixel on the screen and a particular bit in each bit plane, but now how those bits are set to 1 or 0 fixes the colour. For example, the second pixel in the top row of the screen has coordinates x = 1, y = 0 (it’s a peculiar feature of computer displays that the origin, x = 0, y = 0, is at the top left-hand corner of the screen). This pixel corresponds to the second bit in both the first and second bit planes, which are just two 8000-byte blocks of RAM. Now the way in which colour can be encoded becomes clear. If the bit in both planes is 0 then the pixel has colour 0, if the bit in plane 2 is 0 but the bit in plane 1 is 1 the pixel has colour 1, and so on as shown below:
    bit plane 1   bit plane 2   colour
        0             0           0
        1             0           1
        0             1           2
        1             1           3
It is therefore possible to have 4 different colours for the second pixel depending on how the second bit in each bit plane is set. Clearly there is a pattern to this: 1 bit plane gives 2 colours, 2 bit planes give 4 colours, 3 bit planes give 8 colours, 4 bit planes give 16 colours and 5 bit planes give 32 colours. Expressed mathematically, the formula is:
number of colours = 2^(number of bit planes).
In the Amiga the bit planes are separate 8000 byte blocks of RAM. In the Atari ST there are 4 bit planes but they are interwoven so that it is not so easy to see what a given pixel is doing.
For our purposes, we have 5 bit planes and therefore 32 colours at our disposal. Figure 3.1 shows the 5 bit planes set to fill pixel (x=2,y=3) with colour 13.
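Plotting a single coloured pixel therefore means touching one bit in each of the five planes. A sketch in C, consistent with the pixel (x=2, y=3), colour 13 arrangement of Figure 3.1, on the assumption (stated above) that plane 1 holds the least significant bit of the colour number:

```c
#include <assert.h>

/* Set pixel (x,y) to a colour from 0 to 31 by writing one bit into each
   of the five bit planes: bit n of the colour number goes into plane
   n+1. Each scan line is 40 bytes and the leftmost pixel of a byte is
   its most significant bit. */
#define WIDTH_BYTES 40             /* 320 pixels / 8        */
#define PLANE_SIZE  8000           /* 40 bytes x 200 lines  */

static void set_pixel(unsigned char plane[5][PLANE_SIZE],
                      int x, int y, int colour)
{
    int byte = y * WIDTH_BYTES + x / 8;
    int bit  = 7 - (x % 8);
    for (int n = 0; n < 5; n++) {
        if (colour & (1 << n)) plane[n][byte] |= (unsigned char)(1 << bit);
        else                   plane[n][byte] &= (unsigned char)~(1 << bit);
    }
}
```

Five memory accesses per pixel is exactly why filling polygons bit by bit is slow, and why the blitter's ability to fill whole word-wide spans at once matters so much in the chapters that follow.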
One last question remains. The Amiga 500 has 4096 colours at its disposal; how is this vast range related to the 32 we can display with the 5 bit planes? The answer is that at any one time we can display only 32 out of the total range of 4096. This list of the colours we have selected to currently display (colours 0 to 31) is contained in a special series of registers which, taken together, are called the Colour Table or Colour Palette. It is the programmer’s responsibility to write the values of the colours selected into the table at the start and, if they are changed at any time, to rewrite the table. The codes for the standard palette, which is how the Amiga is set up when it is turned on, are listed in the file data_00.s. The custom hardware takes care of the practical details of converting the 32 numbers in the colour table to colours on the screen. Of course the programmer has to know what number makes which colour in order to write the table; we will discuss this point later in more detail when examining the example program and again in Chapter 7.
3.1.3. Copper Lists
The Copper is a friendly name for the graphics coprocessor which handles nearly all of the graphics system. It works independently of the 68000 processor leaving it free to get on with the program execution. Again it is not our purpose here to savour all the wonderful and exotic functions of the Copper, such as setting up a screen with horizontal slices of different resolution and colour depth. We will discuss only those aspects of the Copper’s function which are used in the example programs.
There is one function for which the Copper is essential for our purposes: mapping the playfield onto the monitor screen. In our case, with double buffering (one screen to draw on and another to display), there is the additional chore of swapping screens during the vertical blank. The Copper must be given the right information at the right time to take care of all this. This information is gathered together in the Copper lists.
All the Copper needs to do its job are the addresses in RAM of the two sets of 5 bit planes that make up the two screens. These addresses are entered, one at a time, together with the corresponding special System addresses (called, unsurprisingly, Bit Plane Pointers) into two lists which the Copper will use to make the display. Of course the Copper must also be told how many bit planes to use, so our Copper list is really nothing more than a series of linkages connecting each bit plane address with its destination System register. In order to construct the display the Copper will, at the vertical blank, move the addresses of the bit planes to be displayed into the bit plane pointers. The hardware will take care of the rest. More details are given with the examples.
3.2. The Blitter and the Screen
The blitter, which is another hardware coprocessor residing in the Agnus chip, is indeed a mighty device. Its presence overshadows the entire operation of the Amiga making possible very high speed graphics. The word BLIT comes from BLock Image Transfer and the sound of the word itself is just right to suggest the powerful operations that can be done.
For our purposes the blitter plays a key role at several basic stages of assembling a picture on a playfield. It can be used to draw outlines, fill them in, copy them to the bit planes and perform fast erasure (though for our purposes we will bypass the line drawing for reasons explained later). In every respect it gives a great speed advantage, performing elementary graphics functions independent of the 68000, leaving it free to get on with the program. In a system without a blitter all the graphics functions would have to be done by the main processor, which slows things down.
The blitter’s main function is to copy bytes from one place to another very quickly. This is especially important in 2D-style graphics where sprites “move” by being copied from one playfield position to another. Remember a sprite is a rectangular word-wide block (one word = 2 bytes); the blitter only handles rectangles of words. In our case it will be used for filling in outlines and transferring images. Let’s have a look at how a picture will be constructed. Once again, this is neither an exhaustive nor a general description of the operations of the blitter. It is quite specifically directed towards explaining how the blitter has been used in our 3D graphics “pipeline”.
3.2.1. The Bit Planes Layout — An Overview
It will be easier to understand what is to follow if the general bit plane strategy employed in the example programs is explained at this stage. It is not claimed by the author that this strategy is the best there is. Other programmers, no doubt, will have better ideas. However the method used relies on correct, documented, use of the custom hardware and provides an excellent insight into its operation at the deepest level.
The drawing of a new picture on the logical screen follows several stages. Each new graphics object (primitive) to be added to the composite picture is drawn from filled polygons on a special bit plane called the mask plane, which is not one of the playfield bit planes. The name “mask” has a special significance since it is projected onto the playfield bit planes in a way which is determined by what is already there and the colour of the object. This all sounds rather complicated, and so it is.
To understand the problem clearly, remember we are using the blitter throughout and must stick to the rules. The blitter likes to work in word-wide rectangles of RAM and so we copy a rectangle which completely encloses the image. Now we have to be careful since if we simply copy the image on the mask plane to the playfield it will blot out a whole rectangle, including everything to the side of the object as well as what is underneath. What we want to do is only cover what is obscured by the object itself, not everything in its blit rectangle. Somehow the “background” itself will have to figure in the copying process.
To make things easy, consider the case where there is only one bit plane, though of course in fact we will always have five. We’ll discuss the added complication of several bit planes afterwards. With one bit plane in the playfield, let us suppose that we are building up a composite picture from a triangle and a square. The square has been copied to the playfield bit plane and the triangle has been drawn on the mask plane and is ready to be copied to the bit plane. Figure 3.2 shows what is going on.
If we simply blitted the rectangular mask containing the triangle to the bit plane it would block out the entire square as shown in Figure 3.3. What we want instead is the result in Figure 3.4.
The blitter can handle this: it just needs to be shown the destination background rectangle and, with a clever bit of logic called a “cookie-cut” function (like cutting out cookies with a shaped cutter), it will combine the triangle and the square without obscuring anything.
There is the additional complication of the 5 bit planes required to encode 32 colours. Now we can consider the triangle to have a colour — colour 5 for example. We don’t know yet what colour this is; it could be anything, depending on what number we choose to set in colour register 5. What colour 5 does mean is that inside the triangle, bits must be set to 1 in bit planes 1 and 3 and cleared to 0 in all the other planes. It isn’t sufficient to only set the bits in planes 1 and 3 and ignore the others. Hence the true meaning of the object as a mask is revealed; the mask, like a stencil, must set bits to 1 in some bit planes and clear bits to 0 in others, depending on the colour.
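The stencil logic can be sketched in Python (the book's real code drives the blitter in 68000 assembly; the function name and the integer-bitmask representation here are illustrative only). Each plane row is treated as an integer whose bits are pixels:

```python
def cookie_cut(mask, colour, planes):
    """Project a 1-bit mask onto the bit planes according to a colour.

    mask:   integer whose set bits are the pixels inside the object
    colour: colour register number (0-31 for five planes)
    planes: list of integers, one per bit plane (plane 1 first)
    """
    result = []
    for i, plane in enumerate(planes):
        if colour & (1 << i):
            # colour has a 1 in this plane: set bits under the mask
            result.append(plane | mask)
        else:
            # colour has a 0 in this plane: clear bits under the mask
            result.append(plane & ~mask)
    return result
```

With colour 5 (%00101), planes 1 and 3 are set under the mask and planes 2, 4 and 5 are cleared, while pixels outside the mask are untouched.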
Taken altogether the above discussion sounds like a lot of lengthy programming and hard work. It isn’t. These complex manoeuvres can be simply accomplished by writing the appropriate codes to blitter registers. Understanding what codes to write is another thing. But we will delay the details until the example program is presented so that you have something to look at.
Finally the blitter can be used to erase the logical screen just before the next frame is drawn. It turns out that erasing five bit planes at a time (40,000 bytes) is a time-consuming operation, helped enormously by the blitter.
3.3. Drawing
At the heart of our fast graphics program are the routines which draw and fill in polygons. Using polyhedra as models for solid 3D objects will produce many polygonal surfaces to fill in. The job is best done in two stages: first an outline is drawn by joining up the vertices at the corners, then the region within the outline is filled in. It turns out that this isn’t quite as simple as it sounds. To understand why, let’s look at how fast line drawing and region filling can be done.
The procedure for fast line drawing on a computer screen has been around for a long time. It is called the Bresenham algorithm. It is fast because it uses only simple arithmetic — in particular it doesn’t use division or multiplication, which are among the slowest instructions in the 68000 set. The blitter itself has the capability to draw lines, and even special lines for outlines to be blitter filled, but our routines do not use the blitter for line drawing. Why is this so? The answer lies in the way the blitter draws outlines and what it expects them to look like for filling in.
The blitter can only fill in a closed polygon by a series of raster scans starting at the top of the screen and working down a line at a time. It expects to find only two pixels set on each line and will then start filling in at the first pixel and stop at the second. Now you see the problem. If it ever finds more than two pixels set per scan line it’s in trouble. Suppose, by accident, there are three pixels set. It will start filling at the first, stop filling at the second and start again at the third. But without a fourth pixel to stop at it will go on filling to the edge of the screen, which is not what is desired.
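The fill rule can be mimicked in a few lines of Python (a sketch only; the real blitter works a word at a time, but the toggling idea is the same):

```python
def fill_scanline(bits, width):
    """Mimic the blitter's fill rule on one scan line: the fill state
    toggles at every set pixel (inclusive fill).  Exactly two set pixels
    give a clean span; an odd count leaves filling switched on all the
    way to the edge of the line."""
    out = 0
    filling = False
    for x in range(width):
        pixel = (bits >> x) & 1
        if pixel:
            filling = not filling
        if filling or pixel:
            out |= 1 << x
    return out
```

Two set pixels produce a span between them; three leave the fill running to the end of the line, which is exactly the failure described above.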
For most carefully chosen situations a special line drawing mode of the blitter can be selected so that there is no problem but, with the dynamic action of 3D graphics, situations occasionally arise where the blitter ends up drawing an outline which has more than two pixels set per scanline. This is not really a fault of the blitter but rather a complication of passing information to it concerning the positions of polygon vertices to be connected in the outline. To solve the problem in the general case becomes laborious, so for the routines in this book the problem has been avoided by using custom routines that always work, instead of the blitter. There is no loss of speed here since drawing outlines is very quick compared to all the other chores.
3.3.1. The Bresenham Line Drawing Algorithm
How can you draw a line between two points without using the equation
y = mx + c
where a multiplication is required to get the y-value for a given x-value?
Fortunately this problem was solved many years ago, in 1962, by J.E. Bresenham. The problem at that time was to control a digital plotter which could neither multiply nor divide. Such operations are available on the 68000 but they are time consuming and we want to avoid them where possible. The great advantage of the Bresenham algorithm is that it can find all the screen coordinates of a line using only additions and subtractions. When described in algebraic terms the Bresenham algorithm looks intimidating but, like all great ideas, is really very simple. Of course some (though not all) commercial programs use algorithms which draw lines and fill polygons faster than the Bresenham method will allow, but having understood it you can try to do better. In any case it is very elegant and very fast.
The problem facing us is to find the (x,y) coordinate pairs along the sides of a polygon so that we can use them as the start and end points for horizontal lines to do a fill. The fill of a very small area, chosen so as to exaggerate the irregularity caused by the pixels, is shown in Figure 3.5. Regarding the boundary as a line, we see that it looks different in different screen resolutions. At the normal resolution, the position of a pixel on the screen is specified by an integer value between 0 and 319 horizontally and between 0 and 199 vertically. With this limitation any line (unless it is either horizontal or vertical) will, under a magnifying glass, look like a staircase. This is shown in Figure 3.6. There is clearly no need for us to try to calculate the coordinates of a point to better accuracy than the screen resolution will allow, which means that integer arithmetic is quite adequate. There is no point in calculating the position of a point on the screen to 4 places of decimals because it can only be plotted to no places of decimals. The Bresenham strategy owes its success to the way it fits in with the pixel layout of the screen. Here is the way it works.
Let us suppose that we are plotting a line on the screen which starts at the point S(x1,y1) and ends at point T(x2,y2) as shown in Figure 3.6. These points will, of course, lie precisely on the line. Now we could take a pencil and ruler and draw an ideal mathematical line between the two end points and then shade in those pixels which lie closest to the line. This is how our line will look on the screen. The result is shown in the figure where the pixels are represented by squares. We want an algorithm to do what the human brain does automatically in deciding which points to shade.
Here is the Bresenham algorithm which does this. To make the picture simpler we replace each pixel by a dot at its centre, which makes very clear the degree to which each pixel misses the ideal line. Suppose we have just reached the point A, which doesn’t lie precisely on the line, and we have to choose which point to plot next. The next point could be B(x+1,y) or C(x+1,y+1). The choice seems obvious: point C, because it is closer. Closer in this sense means a shorter vertical distance from the centre of the pixel to the line at the point E. We can call this the error. On the diagram, error t is less than error s. Notice that somehow we didn’t consider point H in this decision. That’s because the angle of the line is less than 45°. If the angle had been greater than 45°, we would have considered the points H and C. Already it is clear that lines of slope less than 1 (angle less than 45°) are a different case from lines of slope greater than 1 (angle greater than 45°). We will come back to this later.
Well it looks like the problem is solved! Just inspect the next two points ahead, like B and C, calculate the vertical distance of each to the line and choose the shorter. In principle that’s it. If the vertical distance up to the ideal line is taken as a positive error (like s) and a vertical distance down to the line is taken as a negative error (like t) then the overall quantity on which the choice is based is (s - t):

if (s - t) = D is positive, the next point is C
if (s - t) = D is negative, the next point is B.

The quantity (s - t) is called the decision variable D for obvious reasons.
Bresenham’s great innovation was to spot two tricks to make this a simple operation. The first is that since only the sign of (st) matters, any quantity which is proportional to (st) will do. The second is that there is no need to redo this calculation each time. The value of D used for the present choice can be quickly corrected to find the value of D for the next choice.
So it goes like this. The updated decision variable, D, is tested to see if it is positive or negative. If it is negative the next point to set is B. Then D is updated accordingly. If it is positive, the next point to set is C. Then D is updated accordingly. We just have to find out what these updates are and what the value of D at the very start of the line should be.
The key to answering these questions is to look at how to get from A to B or from A to C. To get from A to B do a horizontal move; to get from A to C do a horizontal followed by a vertical move. To calculate the errors associated with the individual horizontal and vertical moves it is simpler to look at point S. From this a horizontal move produces an error of AF, but a simple vertical move to G produces an error of SG (points below the ideal line have a positive error and points above have a negative error). But SG is equal to SA, so we really only have to consider the relative lengths of the vertical and horizontal sides of the triangle SAF. But, very importantly, triangle SAF is similar to the overall triangle SUT and their sides are in proportion:
AF/SA = TU/SU = (y2 - y1)/(x2 - x1) = dy/dx
where dy is the overall distance in y and dx is the overall distance in x from the start to the end of the line.
As we have said, anything in proportion will do, so the errors could be taken as dy and dx. A further factor of 2, which still keeps everything in proportion, will bring us into line with Bresenham’s original scheme:
simple horizontal move: error = 2dy
simple vertical move: error = -2dx
For the actual moves from A to B or from A to C:
horizontal move (AB): error1 = 2dy
horizontal plus vertical move (AC): error2 = 2dy - 2dx
These are the updates which must be made to the decision variable D, for the next choice.
Finally, what value of D should we start with? Everything works fine if we take the starting value D1 as the average error of error1 and error2
D1 = (error1 + error2)/2 = 2dy - dx
To summarise, here’s the algorithm:

1. Initialise the first point to (x1,y1) and the initial value of D to D1.

2. If D is -ve, increment x but don’t increment y and make D = D + error1; if D is +ve, increment both x and y and make D = D + error2.

3. Repeat step 2 until x = x2.
Now what about lines which have a slope greater than 1? The solution is very simple. To see it clearly, just draw a line with slope greater than 1 on a piece of tracing paper and clearly label the x and y axes. Now turn the tracing paper over. With the y axis horizontal and the x axis vertical, it now looks like our original line of slope less than 1 except that the x and y axes have been interchanged. Everything therefore works exactly as before if x and y are interchanged in the formulae.
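As a working summary, here is the whole algorithm in Python (a sketch, not the book's 68000 routine), including the axis interchange for steep lines:

```python
def bresenham(x1, y1, x2, y2):
    """Integer-only line from (x1,y1) to (x2,y2): additions and
    subtractions only.  Slopes greater than 1 are handled by
    interchanging the roles of x and y, and lines running right-to-left
    or bottom-to-top by stepping with the sign of each axis."""
    dx, dy = abs(x2 - x1), abs(y2 - y1)
    sx = 1 if x2 >= x1 else -1
    sy = 1 if y2 >= y1 else -1
    steep = dy > dx
    if steep:
        dx, dy = dy, dx              # interchange the axes
    d = 2 * dy - dx                  # D1 = (error1 + error2) / 2
    x, y = x1, y1
    points = []
    for _ in range(dx + 1):
        points.append((x, y))
        if d < 0:
            d += 2 * dy              # error1: simple move
            if steep:
                y += sy
            else:
                x += sx
        else:
            d += 2 * dy - 2 * dx     # error2: horizontal plus vertical move
            x += sx
            y += sy
    return points
```

Note that only the sign of D is ever tested, and the loop body contains nothing slower than an addition.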
3.3.2. Tailoring Bresenham to the Polygon Fill
The blitter is ideally suited to filling in outlines but, as we have said, it needs those outlines to be drawn in a special way. In particular it wants to find only two pixels set on each scan line: one to start filling at and the other to stop.
The procedure we have described will certainly generate points along a line, but for our purpose we do not need them all. When considering lines of slope less than 1, points which lie on the horizontal part of the “staircase”, such as S and A, all have the same y coordinate but different x coordinates. Only the x-coordinate of the first one, S, is required, since the others, like A, will confuse the blitter. The first point in such a run immediately follows the change in sign of D. Our version of the Bresenham algorithm is modified to generate only the start and end coordinates of horizontal lines for raster scans to fill a convex polygon. It is not exactly a Bresenham algorithm in the usual sense, since the coordinates it generates would, if plotted alone, produce a line full of holes along horizontals. But that’s how the blitter likes it.
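The span-per-scan-line idea can be sketched in Python. For brevity this version computes each edge crossing with a multiply instead of the modified Bresenham stepping, and a dict stands in for the 200 long words of the book's xbuffer; the names are illustrative only:

```python
def scan_edge(x1, y1, x2, y2, xbuf):
    """Walk one polygon edge a scan line at a time, keeping only a
    (start, end) pair of x values per y in xbuf.  However many edges
    are scanned, at most two x values survive per line, which is
    exactly what the blitter fill wants."""
    dy = y2 - y1
    step = 1 if dy >= 0 else -1
    for y in range(y1, y2 + step, step):
        if dy == 0:
            xs = (x1, x2)        # horizontal edge: both ends count
        else:
            # x on the ideal edge at this scan line, rounded to a pixel
            xs = (int(x1 + (x2 - x1) * (y - y1) / dy + 0.5),)
        for x in xs:
            span = xbuf.setdefault(y, [x, x])
            span[0] = min(span[0], x)
            span[1] = max(span[1], x)
```

Scanning the three edges of a triangle leaves one (start, end) span per scan line, ready to be drawn as a blitter-fillable outline.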
3.4. Example Program
There is one example program included in this chapter but it is quite long and it illustrates all of the points included in the above discussion and plenty more besides. It doesn’t do a great deal; it draws two solid triangles one after another with double buffering, but contains all the routines we will need in later stages of the book. The coordinates of the triangles are given at the end of the file polydraw.s.
It is not claimed here that the routines are the best or fastest possible. Other versions may be more elegant and faster. But these programs are fast and do the job adequately. Besides, they do have an educational value, illustrating various aspects of assembly code programming and functions of the Amiga hardware. When you have studied how they work you may wish to make your own improvements.
In order to prevent programs evolving into a disorganised mess, several files have been set up containing subroutines of a similar kind. There will always be a main control program, which will have a different name in each chapter. In addition there will probably be a core file which will contain all the important subroutines, a systm file with “housekeeping” subroutines (like erasing screens), a bss file containing labels of variables and a data file containing numbers. Files generally end with the .s extension to show that they are source files. In general these files are added in at assembly with the powerful include directive so that, once written, they are there for the future.
Here is a discussion of each self-contained program file, with an explanation of its salient features. The files themselves are listed on the succeeding pages. They are ready for assembly by Devpac Amiga, but any 68000 assembler can be used provided changes are made to fit in with its rules of syntax as specified in its manual.
Since this is not a textbook of all the hardware details of the Amiga, it is inevitable that frequently only brief mentions are given to many features which deserve a complete section to themselves. This is regrettable but essential to preserve the flow of the narrative and not to lose sight of the overall objective. The interested reader will find a complete hardware description in the Amiga Hardware Reference Manual.
3.4.1. Polydraw.s
This is the main control program. It calls lots of subroutines which are contained in the other files included at the beginning. This directive makes the assembler insert the whole of the source file referred to at this point in the program when the assembling is done. All of the subroutines called in this file are described in detail in the core_00.s and systm_00.s sections.
First, memory is allocated for screens and Copper lists. Then the Copper lists are constructed, followed by the blitter allocation and the writing of the colour table. There is also a lookup table written to speed access to the mask plane. Then the program enters an infinite loop, which you can only interrupt by resetting the machine. It first draws a triangle on screen 1, the first logical screen, and displays screen 2 which, at the start, is empty. Then it displays the triangle on screen 1 (now the physical screen) and draws an inverted triangle on screen 2, which has now become the next logical screen. Now the bottom of the loop has been reached and with a
bra blit_loop
the cycle is repeated. All the important work is done in the long subroutine poly_fill which does just what it suggests — it fills polygons.
That’s the main program. You could try changing the shapes which are drawn. They are filled polygons and the coordinates of their vertices are given at the end of the file. More about that later.
The program is simple but it illustrates the things we have been talking about and contains all the routines we will use later.
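The shape of the main loop can be sketched in Python (the names here are illustrative only; the real loop in polydraw.s never terminates and swaps screens at the vertical blank):

```python
def frame_loop(frames):
    """Skeleton of the double-buffered main loop: draw on the logical
    screen while the physical one is displayed, then swap the roles.
    A frame count replaces the program's infinite loop so the sketch
    terminates."""
    screens = ["screen1", "screen2"]
    logical, physical = 0, 1
    log = []
    for _ in range(frames):
        log.append(f"draw on {screens[logical]}, show {screens[physical]}")
        # ... wait for the vertical blank, then switch Copper lists ...
        logical, physical = physical, logical
    return log
```

Each pass draws on one screen while the other is displayed, so the viewer never sees a half-drawn picture.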
3.4.2. systm_00.s
This is a “housekeeping” file; it contains utility routines which are frequently used. At the start is the routine to allocate memory space for the screens and the Copper lists; this must be done by calls to routines incorporated in the Amiga Operating System in special libraries. The Amiga uses dynamic memory management in which free space is allocated as required. Unlike the early 8-bit micros, you can’t assume that a particular range of memory addresses is reserved for programs and another range for the screen. You have to ask the Operating System for space and wait to see what you get. If the system cannot find memory to allocate it will return a 0 in the register d0 after the call to allocmem. The program doesn’t check for this condition, but the failure can happen, particularly with large programs, and is something to watch for.
For our purposes the screens must be in chip memory so that they can be accessed by the custom hardware. Remember this is the lower 512k range (if your computer only has this much then it’s all chip memory) and it is possible to specify that the space we want is in this memory. There is another request we can make; we can ask for the memory to be cleared when it is allocated — it’s useful to start off with a clean slate. Both of these requests are made by placing the long word #$10002 in d1 before calling the allocmem routine. Before we look at this in any more detail, let’s see briefly how libraries are used.
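The two requests combine as a simple bitwise OR of the standard Exec memory flags (values as defined in the Exec includes):

```python
# Memory-type flags from the Exec includes.
MEMF_CHIP  = 0x00002   # allocate in chip RAM, visible to the custom chips
MEMF_CLEAR = 0x10000   # zero the block during allocation

flags = MEMF_CLEAR | MEMF_CHIP   # the long word placed in d1: $10002
```

Either flag can be used alone; together they give the cleared chip-RAM block the screens need.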
Libraries
There are several libraries in the Amiga. They are sets of routines of a similar kind provided “free of charge” by the Operating System. The way they are laid out, and the fact that they are collected into libraries, reflects the C language orientation of the system. Of course you could replace them with your own routines, but why bother? (No doubt if the Amiga programmers had included a set of vector graphics routines in the libraries this book would never have been written.)
The way in which a particular library function is accessed in assembler is quite straightforward since each subroutine starts at a fixed distance (offset) in memory from the start, or base, address of the library (each library will have a different base address). So to run a library routine you first get the base address, then set up any entry parameters that are required and finally jump to the offset from the base address. When the function has done its work you are returned to the main program.
In allocmem a particularly important library called Exec library is used to allocate memory for the screens and Copper lists. Exec is a kind of Master Library with functions that are immediately available after reset. Its base address is always the same but in any case it is stored as a long word at memory address 4 and can be placed in a register, such as a6, by the instruction
move.l 4,a6
This is what happens in allocmem except that the constant execbase (defined equal to 4 in equates.s) is used instead as it is more readable.
Looking at allocmem we can see that first the base address of the Exec library is placed in register a6, then space equal to 12 bit planes (the two screens of five planes each, plus the mask plane and the store plane, where the background is stored preceding the cookie-cut) is written into register d0. Then the long word $10002, being a combination of $2 to specify chip RAM and $10000 for clearing memory during allocation, is written into d1. A call to the function allocmem (offset by 198 bytes from the base address) allocates the memory, and this is then partitioned into the various parts. Finally, using further calls to allocmem, space is allocated for the two Copper lists — one for each screen. Notice that the offsets are given the names of the functions they call, as defined in the equates.s file, so as to make the program readable. One other interesting feature is that function offsets are all negative. Positive offsets lead into the ExecBase structure, which is where many important Operating System variables are stored.
Following this, the two Copper lists are set up. The Copper can perform several useful functions, but all we plan to use it for is to switch screens during the vertical blank. The two Copper lists contain the addresses of the system plane pointers, bpl1pt, bpl2pt, etc., side by side with the actual screen plane addresses, in succession. That’s all that is required to make up the Copper lists. The hardware will do the job of displaying the screen pointed to by the active Copper list. There’s one small trick at the end of the lists. The Copper is a processor in its own right and it wants to execute instructions. The last instruction in each of the lists (the long word $fffffffe) tells the Copper to wait until the screen raster gets to the impossible position y = $ff and x = $fe. This is impossible since the largest horizontal position it can record is $e2. So what it does is wait for an event that will never happen. It is put out of its misery when the vertical blank interrupt occurs, at which point we will switch the Copper lists to implement double buffering.
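The list construction can be sketched in Python as a flat sequence of 16-bit words (a simplified sketch: a real Copper MOVE encodes the register offset in its first word and the data in its second, and the first bit plane pointer, BPL1PTH, sits at offset $0E0 from the custom chip base):

```python
BPL1PTH = 0x0E0   # offset of the first bit plane pointer (high word)

def build_copper_list(plane_addresses):
    """Assemble a Copper list as 16-bit words: one MOVE (register word,
    data word) per half of each plane pointer, ended with the impossible
    WAIT for position $ff,$fe."""
    words = []
    for i, addr in enumerate(plane_addresses):
        reg = BPL1PTH + 4 * i
        words += [reg, (addr >> 16) & 0xFFFF]   # MOVE to BPLxPTH
        words += [reg + 2, addr & 0xFFFF]       # MOVE to BPLxPTL
    words += [0xFFFF, 0xFFFE]   # wait for a beam position that never comes
    return words
```

Swapping between two such lists, one per screen, is all the double buffering requires.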
In blitalloc the blitter is reserved for our use alone and task switching is turned off. Let’s discuss what’s going on here.
The blitter can perform many functions; it is even involved in reading the disk drive. Since we intend to use it intensively it makes sense to lock out all other applications. To do this we have to use the OwnBlitter function in the graphics library. There is a problem here in that, unlike ExecBase, the base addresses of other libraries are not fixed; they must be found. There is, however, an Exec library function called OpenLibrary which finds other library addresses. OpenLibrary requires a pointer to the library name in register a1 (an ASCII string will do) and the library version in d0. It returns the library address in d0 (or a zero if it fails) which can then be used as a base address for offsets to its own functions. In the program the base address is placed in a6. Having got the graphics library base address in a6 it is straightforward to jump to the OwnBlitter function, as an offset defined in the equates.s file, and reserve the blitter for our program alone.
Multitasking
The Amiga is a multitasking machine. Despite having only one CPU it can appear to run several programs simultaneously. This means that each task is switched on for a short time and then switched off again until its next go. Each task is slowed down but the overall effect is of multitasking. As far as we are concerned, multitasking should not occur; the last thing we want is another application altering data structures. Multitasking is switched off with the Exec library function Forbid.
DMA
Next in the subroutine init_scrns the playfield structure is set up together with the initial logical/physical screen assignment for completeness (actually the screen assignment needn’t be done here since it will be switched back and forth in the main program loop anyway). Before messing with these important data structures, one final source of outside meddling is eliminated: DMA.
DMA means direct memory access, which is a way of letting various parts of the system read and write memory independently without having to overburden the 68000 processor. Of course, not everything can be allowed to have access to memory simultaneously, and the overall control is managed by a separate processor called the DMA controller, which can be manipulated through its main register, DMACON. DMACON has two separate parts: one read-only and the other write-only. We wish to send instructions to DMACON, so it is the write-only part that concerns us. The highest bit (15) of DMACON has a special function when a word is written to the register: if it’s 1 it sets the written bits, if it’s 0 it clears them. So if you write $8004 to it, bit 2 will be set, but if you write $0004 that bit will be cleared. Only bit 2 is affected in either case; the other bits remain undisturbed in whatever state they are in. In the write-only DMACON the bit assignments are:
BIT    NAME     FUNCTION
15     SET/CLR  set/clear the written bits
14     -        no function
13     -        no function
12-11  -        unassigned
10     BLTPRI   if 1, blitter has priority over the 68000
9      DMAEN    master enable for all bits 0 - 8
8      BPLEN    enable bit plane DMA
7      COPEN    enable Copper DMA
6      BLTEN    enable blitter DMA
5      SPREN    enable sprite DMA
4      DSKEN    enable disk DMA
3-0    AUDxEN   enable channels for audio DMA
To get to DMACON the base address, $dff000, of the chip register structure is put into a5 and the register itself accessed through the offset DMACON defined in equates.s. While the playfield and screen are being set up all DMA is shut off by writing the word $03ff to DMACON.
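The set/clear behaviour is easy to model in Python (a sketch of the write semantics, not of the hardware):

```python
SETCLR = 0x8000   # bit 15: 1 = set the written bits, 0 = clear them

def write_dmacon(current, word):
    """Model the write-only half of DMACON: bits written as 1 are set
    or cleared according to bit 15; all other bits are untouched."""
    bits = word & 0x7FFF
    if word & SETCLR:
        return current | bits
    return current & ~bits
```

Writing $8004 sets bit 2, writing $0004 clears it again, and writing $03ff with bit 15 clear shuts off all the DMA enables at once.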
The Playfield
The playfield initialization which follows next writes to a number of chip registers associated with the playfield hardware. Since we are working with five bit planes in low resolution and a standard 320 x 200 window in non-interlaced mode, setting these registers is about as simple as it can be, though in general the setting up is a fairly complicated operation.
First of all we have to decide how much of the playfield is to be displayed since there is an option here to have a playfield which is larger than the onscreen display. The onscreen display is called the window. Such an option would be useful in sprite graphics with scrolling scenery. We just want the playfield and window to be the same size. There are two registers called DIWSTRT (display window start) and DIWSTOP (display window stop) which have to be written to accordingly. The point of it all is that although the electron beam scans the whole monitor screen, it doesn’t draw on all of it and we have to tell the system where the display part of the screen starts and ends. This is to avoid the edges where the picture is distorted and also to leave space for the blanking gaps. Without further ado, we will use the standard values for low resolution: DIWSTRT = $2c81, DIWSTOP = $f4c1.
In addition to the window position it is necessary to tell the system how to fetch and display data from the bit planes in the window. This information is contained in the DDFSTRT (data fetch start) and DDFSTOP (data fetch stop) registers. Once again, without further ado, we will use the normal low resolution values: DDFSTRT = $0038, DDFSTOP = $00d0.
The job isn’t done yet. Now we have to set up the bit plane control registers BPLCON0, BPLCON1 and BPLCON2 and the modulo registers BPL1MOD and BPL2MOD. Of these really only BPLCON0 matters, as it establishes the number of bit planes and the colour. It is set to the value %0101001000000000, which means low resolution (bit 15 = 0), 5 bit planes (bits 12 to 14), no hold-and-modify (bit 11 = 0), no dual playfield (bit 10 = 0), colour enabled (bit 9 = 1), no genlock audio (bit 8 = 0), and no light pen, interlace or external synchronization (bits 3, 2 and 1 = 0); bits 0 and 4 to 7 are unused. BPLCON1 sets up scrolling, which we aren’t interested in, and BPLCON2 is concerned with sprites, which once again we don’t want.
The BPL1MOD and BPL2MOD registers are concerned with displaying a rectangular fraction of a playfield and would contain what are called modulo values for even and odd numbered bit planes. Modulo values make it possible for the system to know where the rectangular part is relative to the whole playfield. We’ll meet this idea again when we get to the blitter. Right now there is no modulo value to worry about since we’re using the entire playfield.
Finally, with the Copper list directed to screen 1, the Copper is started by a write to the COPJMP1 register, and bit plane, Copper and blitter DMA are enabled.
Colours
The next routine in systm_00.s sets up the colour palette. Remember this is the list of 32 colours (32 out of a possible 4096) which reside in the colour table. It transfers the contents of the colour list at col_tble in the file data_00.s into the 32 chip colour registers starting at COLOR00. The colours are a full set, following the spectrum, but starting with black since the first colour is the background.
What remains in the file systm_00.s is concerned with drawing and is a set of routines to erase the screens and other planes, which we will put to one side for a moment, and the two complementary screen buffering routines drw1_shw2 and drw2_shw1. What these do is point the system register COP1LC at the to-be-displayed Copper list and save the address of the to-be-drawn screen in the log_screen pointer ready for the blitter.
At the very end, in v_blnk, is a routine to wait for the vertical blank — when the electron beam in the monitor flies back up to the top of the screen. In fact, to avoid drawing at the very top of the screen, where there would be distortion of the picture, there are a number of lines which aren’t drawn. Of course that does not mean that we lose the top of the picture; it simply means that the picture is not started until it is some way down the monitor screen. A special chip register VPOSR keeps a record of the vertical position of the raster scan and by reading it we can find out where it is. By switching the screens when the scan is in the “hidden” band at the top we can hide the change. A sufficiently hidden line is number $10. What v_blnk does is wait until this position is reached. VPOSR is really a long word register with the vertical position in the bit range 8 to 16, i.e. it spans two words, so those bits have to be singled out.
3.4.3. core_00.s
Here’s where all the action is. This file contains the important subroutine, poly_fill, which does the drawing.
Notice how the routine blt_chk is used before writing to the blitter registers throughout this large section of code. That is because the blitter runs independently of the 68000 CPU and is likely to be completing the previous task when it is next needed. As a consequence we must wait for it to finish. Any attempt to alter its register contents while it’s still going will lead to strange effects.
Let’s briefly recap what’s going on here. The routine is passed the coordinates of the vertices of a polygon which is to be filled with a particular colour. To do this it first has to draw an outline in a form acceptable for filling and then fill it using the blitter for speed. In addition the blitter will be involved in erasing the screen ready for the next frame. Though the routines for that are in the file systm_00.s, they look very similar to that which fills a polygon. In fact in erasure the screen is simply filled with nothing! Let’s look in detail at what poly_fil contains.
Drawing the Outline
Before starting anything the mask plane is erased.
We have already met the modified Bresenham algorithm that draws outlines ready for the blitter to fill. That is what is done in Part 1 of poly_fill. First the coordinates of the outline are calculated and temporarily stored in a reserved part of RAM (the xbuffer) consisting of 200 long words which starts at xbuf. Each long word refers to a particular y-position on the mask plane and holds in its high and low words the start and end x-coordinates of the outline. This turns out to be a simple way of avoiding setting more than two pixels per scan (y) line. The xbuffer can only hold one start and one end x-value, and that’s an end to it!
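The xbuffer layout can be pictured with a short C sketch (names are illustrative, not taken from the program):

```c
#include <stdint.h>

#define LINES 200            /* one long word per scan line */
static uint32_t xbuf[LINES];

/* One long word per y: start x in the high word, end x in the low word. */
void store_pair(int y, uint16_t x_start, uint16_t x_end)
{
    xbuf[y] = ((uint32_t)x_start << 16) | x_end;
}

uint16_t start_x(int y) { return (uint16_t)(xbuf[y] >> 16); }
uint16_t end_x(int y)   { return (uint16_t)(xbuf[y] & 0xffff); }
```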
Once the outline has been assembled in the xbuffer it is drawn onto the mask plane in Part 2.
Filling In With the Blitter
In Part 3 of poly_fill the outline drawn in the mask plane is filled in and then copied onto each of the five bit planes which make up the logical screen. What appears on each bit plane is determined by the colour and what logic has been included.
During the fill of the outline on the mask plane we can save time by confining the blitter’s attention to a rectangle containing the polygon. This is especially important when only small objects are being drawn. We don’t want the blitter wasting time looking at parts of the screen which are empty. The vertices of the polygon can be used to construct the rectangle.
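Finding that rectangle is just a min/max scan over the vertices; here is a sketch of the idea in C (the function is mine, not part of the program):

```c
/* Sketch: compute the enclosing rectangle of a polygon from its
   vertex list (x0,y0,x1,y1,...), as the fill-blit setup does with
   xmin/xmax/ymin/ymax. */
void bounding_box(const short *xy, int n,
                  short *xmin, short *xmax, short *ymin, short *ymax)
{
    *xmin = *xmax = xy[0];
    *ymin = *ymax = xy[1];
    for (int i = 1; i < n; i++) {
        short x = xy[2 * i], y = xy[2 * i + 1];
        if (x < *xmin) *xmin = x;
        if (x > *xmax) *xmax = x;
        if (y < *ymin) *ymin = y;
        if (y > *ymax) *ymax = y;
    }
}
```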
The settings for the blitter registers now start to make some sense when interpreted in terms of the basic blitting process, i.e. copying data from here to there. To see the blitter in all its glory we will have to wait until the mapping of the mask plane to the screen bit planes is discussed (next).
One curious feature of blitter filling is that it works backwards; the fill occurs in order of descending addresses from right to left in screen coordinates and therefore starts at the highest address on the outline which is the “bottom right-hand” vertex. This address must be given to the two blit pointers A and D. Normally in a simple block image transfer these pointers point respectively at the source and destination addresses. In this case they are one and the same thing.
The BLTxMOD registers are set with the difference between the plane width and the blit rectangle width. These allow the blitter to start at the right place on each successive line.
The contents of the registers BLTCON0 and BLTCON1 now have specific meanings. In BLTCON0 the top bits 12-15 are concerned with shifting which we never use so they are set to $0. Bits 8-11 turn on DMA channels A, B, C and D. In the limited application of filling only A and D are used so the nibble is $9. Bits 0-7 set the logic function which combines the channels in a particular way. In this case to simply copy A to D they are set to $F0 (the setting of these logic functions is discussed in more detail below). In BLTCON1 the top bits 12-15 are again concerned with shifting and not used. Bit 3 is set for an inclusive fill, which means include the boundary lines (bit 4 is set for exclusive). Bit 1 has to be set for descending mode. There are two other registers, BLTAFWM and BLTALWM, which contain masks to filter out unwanted bits within a word but to the side of the object being copied. In filling they are set to all 1s.
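Putting those fields together gives exactly the values written in core_00.s ($09F0 and $000A). A sketch in C, with illustrative constant names of my own:

```c
#include <stdint.h>

/* Bit positions as described in the text; the names are mine. */
enum {
    USEA = 0x0800, USEB = 0x0400, USEC = 0x0200, USED = 0x0100, /* BLTCON0 */
    LF_COPY_A_TO_D = 0x00f0,                                    /* D = A   */
    FILL_INCLUSIVE = 0x0008, DESCENDING = 0x0002                /* BLTCON1 */
};

uint16_t fill_con0(void) { return USEA | USED | LF_COPY_A_TO_D; }
uint16_t fill_con1(void) { return FILL_INCLUSIVE | DESCENDING; }
```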
Copying With the Blitter
Here’s where the blitter excels. What we want to do is copy the filled-in polygon shape from the mask plane to the five bit planes which make up the logical screen. How we go about this is linked to both the colour of the polygon and how the composite picture on the screen is put together. Remember the 32 available colours have their codes entered in the 32 colour registers of the colour table. That’s where the 5 bit planes come from, since 2^5 = 32. The number of the colour register to be used, when converted to binary, tells you what bit planes have to be set and what bit planes have to be cleared to use its colour. So if we want to use the colour in register 7, the first, second and third bit planes have to be set to 1 and the fourth and fifth bit planes have to be cleared over regions where the image is set in the mask plane. As far as the blitter is concerned, this means five block image transfers, which is what it was made for.
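The per-plane decision is a single bit test, sketched here in C (illustrative only):

```c
/* Which planes are set for a given colour register number?
   Colour 7 = %00111: planes 0-2 set, planes 3-4 cleared. */
int plane_is_set(int colour, int plane) /* plane 0..4 */
{
    return (colour >> plane) & 1;
}
```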
The additional complication comes from the way in which the composite picture on the logical screen is put together. Since it is to be a scene of some kind, where distant objects are obscured by near ones, we have to put the distant objects in first and the near ones in last. Hence in copying an image from the mask plane to the bit planes some care must be taken not to mess up what is already there, particularly because what is copied is everything within the rectangle containing the polygon, including the space around the polygon (all this is shown in figures 3.2, 3.3 and 3.4). The key to all this is minterms, which is the way of describing the logical operations which control how the mask and its screen destination interact. What we want to do is the “cookie cut” function. Here’s how it works.
First let’s assume a filled polygon has been drawn in the mask plane. Now we are going to copy the rectangle which just surrounds the polygon to the first bit plane of the logical screen. If the first colour bit is set, what is set 1 in the mask must be set 1 in the first bit plane. But what is set 0 in the mask (i.e. the space around the polygon but within the entire blitted rectangle), must not overwrite what is already in the bit plane. On the other hand, if the first colour bit is not set, we want what is set 1 in the mask to be set 0 in the bit plane. But the border around the polygon to the edge of the blitted rectangle must not alter what is already on the bit plane.
Now it is quite clear that what we are doing is combining the image already on the bit plane with the mask image with bitwise logic. We can think of the process as a logical combination of two source data channels, mask and bit plane, to produce a destination channel, the final bit plane. This is how the blitter treats the data. It has four DMA channels — three sources called A, B and C and a destination channel called D. For our purposes, the mask plane is in channel A, the original bit plane data is in channel B and the final combination is in channel D. Channel C is not used. To avoid a conflict, since the bit plane is both a source and destination, the active rectangle in it is first saved as the storeplane. So in the actual logical combination it is the storeplane which is channel B.
Having labelled the planes A, B and D it is now easy to state what we want to happen in the logical combination:
If the colour bit is set, a bit in D (final bit plane) must be set if it is set in A (mask) OR B (store); any other combination leaves it cleared: D = A OR B
If the colour bit is not set, a bit in D must be set if it is NOT set in A AND it is set in B; any other combination leaves it cleared: D = (NOT A) AND B.
That’s really all there is to it. It’s called a “cookie-cut” function because the mask is cut out and laid on top of the bit plane, I suppose. The logical instruction is given by setting appropriately the LF bits (numbers 0-7) of register BLTCON0. This is done by expanding the logical expression in terms of products involving A, B and C, each of which is called a minterm. Each minterm has one of the LF bits dedicated to it so if that minterm shows up in the expansion, the LF bit is set. The details need not concern us. For the logic expressions above the values to be entered into the LF byte are:
expression       LF byte
A OR B           $FC
(NOT A) AND B    $0C
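Although the details need not concern us, the LF byte is easy to compute: evaluate the desired logic for each of the eight A/B/C combinations and set the corresponding bit. A sketch in C (the helper function is mine):

```c
#include <stdint.h>

/* Bit n of the LF byte corresponds to the minterm n = %ABC. */
uint8_t lf_byte(int (*f)(int a, int b, int c))
{
    uint8_t lf = 0;
    for (int n = 0; n < 8; n++)
        if (f((n >> 2) & 1, (n >> 1) & 1, n & 1))
            lf |= (uint8_t)(1 << n);
    return lf;
}

int f_a_or_b(int a, int b, int c)      { (void)c; return a | b; }
int f_not_a_and_b(int a, int b, int c) { (void)c; return !a && b; }
int f_copy_a(int a, int b, int c)      { (void)b; (void)c; return a; }
```

The straight copy D = A comes out as $F0, matching the value used for the store-plane copy below.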
Now looking back at core_00.s we can see this happening. First the active rectangle (the one containing the polygon in the mask) in the bit plane is saved into the storeplane. This is done as a straight copy for which the minterms give the LF byte $F0. Bits 8-11 in BLTCON0 specify in order which of the four DMA channels are being used; in this straight copy, only A and D.
The entries to the other blitter registers deserve some comment. BLTxMOD (x means each channel) requires the difference between the bit plane width (40 bytes) and the width, in bytes, of the current blit rectangle, which of course will vary with the program. BLTSIZE wants the number of lines in its upper ten bits and the rectangle width in words in its lower six bits.
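The BLTSIZE packing can be sketched in C (an illustration; the function name is mine):

```c
#include <stdint.h>

/* Height (number of lines) in the upper ten bits, width in words in
   the lower six, as described in the text. */
uint16_t pack_bltsize(int lines, int width_words)
{
    return (uint16_t)((lines << 6) | (width_words & 0x3f));
}
```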
Then the logical combination of the mask and the source bit plane occurs depending on the value of the current colour bit. The procedure is repeated for each of the five colour planes.
As a final point, the blitter is also used to clear screens in systm_00.s.
3.4.4. equates.s
This contains the offsets for hardware registers, library functions and other constants.
3.4.5. bss_00.s
This contains the variables which are calculated during the program.
3.4.6. data_00.s
This mainly contains the colours for the standard palette.
*****************************************************************************************
* Polydraw.s *
*****************************************************************************************
* SECTION TEXT
* assembler directive
opt d+ put in labels for debugging
bra main don't try to execute the includes
* all these files are to be included here
include equates.s all the constants
include bss_00.s variables locations
include data_00.s mainly standard palette colours
include systm_00.s a lot of housekeeping routines
include core_00.s the meat
*****************************************************************************************
* here's the main control program
main
bsr alloc_mem allocate memory for screens etc
bsr copr_list set up the copper lists
bsr blit_alloc take over the blitter
bsr colr_set set up the standard palette
bsr wrt_phys_tbl lookup table for fast screen access
* The program cycles here; screen buffering is used
blit_loop:
* first draw a triangle
bsr drw1_shw2 draw on screen 1, display screen 2
lea my_coords,a0 the vertices defining the triangle
move.l a0,coords_lst here's where to find them
move.w #2,colour coloured red
move.w #3,no_in 3 sides to a triangle
bsr poly_fill draw the outline and fill it red
* then an inverted triangle
bsr drw2_shw1 draw on screen 2, display screen 1
lea my_inv_coords,a0 the vertices
move.l a0,coords_lst here they are
move.w #12,colour coloured green
bsr poly_fill draw the outline and fill it green
bra blit_loop repeat the cycle
* The coordinates of the two triangles
my_coords dc.w 100,160,150,70,190,140,100,160
my_inv_coords dc.w 100,80,160,160,170,90,100,80
END
******************************************************************************************
* Core_00.s *
* *
******************************************************************************************
* This fills a polygon.
* It consists of 4 parts:
* 1. the x coords of the boundary are stored in xbuf
* 2. the outline is drawn in the mask plane
* 3. the the outline is filled by the blitter
* 4. the blitter copies the mask to the bitplanes
******************************************************************************************
* Part 1. Fill the buffer with the outline.
* a3 pointer to crds_in coords list (x1,y1...xn,yn,x1,y1)
* a2 pointer to xbuf
* d0=x1: d1=y1: d2=x2: d3=y2: d4=vertex number/decision variable
* d5=lowest y: d6=highest y/the increment: d7=edge counter
* Polygon vertices are ordered anticlockwise
******************************************************************************************
poly_fill
bsr blit_mask clear mask plane
*INITIALISE ALL VARIABLES
filxbuf
move.w no_in,d7
beq fil_end quit if no more edges
move.l coords_lst,a3
subq.w #1,d7 counter of num edges
move.w #MINIMUM_Y,d5
clr.w d6 maximum y to zero
filbuf1
lea xbuf,a2
addq.w #2,a2 point to ascending side
move.w (a3)+,d0 next x1
move.w (a3)+,d1 next y1
move.w (a3)+,d2 next x2
move.w (a3)+,d3 next y2
subq.w #4,a3 point back to x2
*FIND THE HIGHEST AND LOWEST Y VALUES: THE FILLED RANGE OF XBUF
cmp.w d5,d1 test (y1-miny)
bge filbuf3 miny unchanged
move.w d1,d5 miny is y1
filbuf3
cmp.w d1,d6 test (maxy-y1)
bge filbuf5 unchanged
move.w d1,d6 maxy is y1
filbuf5
exg d5,a5 save miny
exg d6,a6 save maxy
clr.w d4 init. decision var
moveq #1,d6 init. increment
* All lines fall into 2 categories: [slope<1], [slope>1].
* The difference is whether x and y are increasing or decreasing.
* See if line is ascending [slope>0] or descending [slope<0].
cmp.w d1,d3 (y2-y1)=dy
beq y_limits ignore horizontal altogether
bgt ascend slope > 0
* It must be descending. Direct output to LHS of buffer. a2 must
* be reduced and we have to reverse the order of the vertices.
exg d0,d2 x1 and x2
exg d1,d3 y1 and y2
subq.w #2,a2 point to left hand buffer
ascend
sub.w d1,d3 now dy is positive
* Set up y1 as index to buffer
lsl.w #2,d1
add.w d1,a2
* Check the sign of the slope
sub.w d0,d2 (x2-x1)=dx
beq vertical special case to deal with
bgt pos_slope
* It must have a negative slope but we deal with this by making the
* increment negative.
neg.w d6 increment is negative
neg.w d2 dx is positive
* Now decide if the slope is High (>1) or Low (<1).
pos_slope
cmp.w d2,d3 test (dy-dx)
bgt hislope slope is > 1
* Slope is < 1 so we want to increment x every time and then
* check whether to increment y. If so this value of x must be saved
* dx is the counter. Initial error D1=2dy-dx.
* If last D -ve, then x=x+inc, don't record x, D=D+err1
* If last D +ve, then x=x+inc, y=y+inc, record this x, D=D+err2
* err1=2dy; err2=2dy-2dx
* d0=x: d2=dx: d3=dy: d6=incx.
move.w d2,d5
subq.w #1,d5 dx-1 is the counter
add.w d3,d3 2dy=err1
move.w d3,d4 2dy
neg.w d2 -dx
add.w d2,d4 2dy-dx = D1
add.w d4,d2 2dy-2dx=err2
move.w d0,(a2) save first x
inc_x
add.w d6,d0 x=x+incx
tst.w d4 what is the decision?
bmi no_stk don't inc y, don't record x
add.w #4,a2 inc y, record x. next buffer place
move.w d0,(a2) save this x
add.w d2,d4 update decision D=D+err2
bra.s next_x
no_stk
add.w d3,d4 D=D+err1
next_x
dbra d5,inc_x increment x again
bra y_limits
* The slope is > 1 so change the roles of dx and dy.
* This time increment y each time and record the value of x after having done so.
* Init error D1 = 2dx-dy
* If last D -ve, then y=y+inc, D=D+err1, record x
* If last D +ve, then x=x+inc, y=y+inc, D=D+err2, record x
* err1=2dx, err2=2(dx-dy)
* d2=dx: d3=dy: d6=inc: d0=x
hislope
move.w d3,d5
subq.w #1,d5 dy-1 is counter
add.w d2,d2 2dx=err1
move.w d2,d4 2dx
neg.w d3 -dy
add.w d3,d4 D1=2dx-dy
add.w d4,d3 2dx-2dy=err2
move.w d0,(a2) save 1st x
inc_y
addq.w #4,a2 next place in buffer
tst.w d4 what is the decision
bmi same_x don't inc x
add.w d6,d0 inc x
add.w d3,d4 D=D+err2
bra.s next_y
same_x
add.w d2,d4 D=D+err1
next_y
move.w d0,(a2) save x value
dbra d5,inc_y
bra y_limits
* The vertical line x is constant. dy is the counter
vertical
move.w d0,(a2) save next x
addq.w #4,a2 next place in buffer
dbra d3,vertical for all y
* Restore the y limits
y_limits
exg d5,a5
exg d6,a6
next_line
dbra d7,filbuf1 do rest of lines (if any left)
* This part ends with min y in d5 and max y d6
move.w d6,ymax
move.w d5,ymin
*****************************************************************************************
* PART 2. Copy the xbuf to the mask plane.
* Set up the pointer
lea xbuf,a0 base address of buffer
move.l maskplane,a1 base address of mask plane
lea msk_y_tbl,a2 mask plane y look up table
sub.w d5,d6 number of scan lines to set
move.w d6,d7 is the counter
beq fil_end quit if all sides horizontal
move.w d5,d2 miny is the start
lsl.w #2,d5 4*min y = offset into xbuf
add.w d5,a0 for the address to start
subq.w #1,d2 reduce initial y
poly2
addq #1,d2 next y
move.w (a0)+,d0 next x1
move.w (a0)+,d1 next x2
cmp.w d0,d1 test (x1-x2)
beq poly4 can't draw a line with one point
move.w d2,d5 pass y
bsr set_pix set the 2 pixels
poly4
dbra d7,poly2 repeat for all y values
*****************************************************************************************
* PART 3. Fill in the outline
* Confine the blit to the rectangle (xmax-xmin)*(ymax-ymin).
* First xmax and xmin are recorded to define the rectangle.
bsr blt_chk
frme
move.w no_in,d7
subq.w #1,d7
movea.l coords_lst,a3 here they are
move.w #MINIMUM_X,xmin initialise xmin
clr.w xmax and xmax
x_test
move.w (a3),d0 next x
cmp.w xmin,d0 test (x1-xmin)
bgt lnblit4 xmin unchanged
move.w d0,xmin this is x min
lnblit4
cmp.w xmax,d0 test (x1-xmax)
blt lnblit5 xmax unchanged
move.w d0,xmax this x is xmax
lnblit5
addq.l #4,a3 increment x pointer
dbra d7,x_test for all x
* Here's the fill blit. Several things must be found.
* Calculate the address of the bottom rh corner of the rectangle
* bltstrt contains its offset in the plane
move.w xmax,d0
lsr.w #SIXTEEN,d0 xmax/16
move.w d0,d2 save it
add.w d2,d2 *2 = byte position in row
move.w ymax,d1
mulu #WIDTH,d1 row address
add.w d2,d1
ext.l d1
move.l d1,bltstrt save offset in the plane
* address to start blit
movea.l maskplane,a0 plane base address
add.l d1,a0 plus offset is where blit starts
move.l #$dff000,a5
move.l a0,bltapt(a5) SOURCE
move.l a0,bltdpt(a5) DESTINATION
* bltmod says how much of plane to blit
move.w xmin,d1
lsr.w #SIXTEEN,d1 xmin/16
sub.w d1,d0 xmax/16 - xmin/16
addq.w #1,d0 word width of window
move.w d0,bltwidth save it
move.w #WIDTH,d2
add.w d0,d0 width in bytes
sub.w d0,d2 blitmod
move.w d2,blitmod
move.w d2,bltamod(a5) SOURCE MODULO
move.w d2,bltdmod(a5) DESTINATION MODULO
* set the control registers for a simple descending fill.
move.w #$09f0,bltcon0(a5) USE A&D D=A (no shift)
move.w #$000a,bltcon1(a5) INCLUSIVE FILL, DESCENDING
move.w #$ffff,bltafwm(a5)
move.w #$ffff,bltalwm(a5)
* set the size and do the blit
move.w ymax,d0
sub.w ymin,d0
addq.w #1,d0
lsl.w #6,d0 set height
add.w bltwidth,d0 and width
move.w d0,blitsize sizeof blit
move.w d0,bltsize(a5) do the fill
* PART 4.
* Copy the mask to the screen bitplanes which must be set or cleared
* depending on its colour bit.
* Only the smallest rectangle is blitted.
* The mask is used in the cookie cut function:
* If the colour bit is set, the masked region is set
* If the colour bit is clear, the masked region is cleared.
pln_cpy
bsr blt_chk
move.w #DEPTH-1,d7 number of planes to blit
move.w colour,d6
move.w #0002,bltcon1(a5) COPY DESCENDING
move.w blitmod,d0
move.w d0,bltamod(a5)
move.w d0,bltdmod(a5)
move.w d0,bltbmod(a5)
IFD DOUBLE_BUFFERING
move.l workplanes,a2 get address of planepointers list
ELSEIF
move.l showplanes,a2 get address of planepointers list
ENDC
; sub.l #WIDTH*HEIGHT,a0 (ready to increment in next part)
; add.l bltstrt,a0 offset to draw at
nxtplane ;LOOP POINT
bsr blt_chk
; add.l #WIDTH*HEIGHT,a0 get next bitplane base address
move.l (a2)+,a0 get next address into a0
add.l bltstrt,a0 and add offset to start drawing at...
* store the destination plane first, (copy to storeplane)
move.l a0,bltapt(a5) SOURCE
move.l storeplane,a1 destination
add.l bltstrt,a1 start position of rectangle
move.l a1,bltdpt(a5) in plane 6
move.w #$09f0,bltcon0(a5) straight copy
move.w blitsize,bltsize(a5) store destination plane
bsr blt_chk
* now mask region and set/clear as colour bit dictates
movea.l maskplane,a1 the mask
add.l bltstrt,a1 start here
move.l a1,bltapt(a5) A IS MASK
move.l storeplane,a1
add.l bltstrt,a1 offset
move.l a1,bltbpt(a5) B IS STOREPLANE
move.l a0,bltdpt(a5) DESTINATION
* do we set or clear the masked region?
lsr.w #1,d6 get colour bit into carry flag
bcc bltclr bit is zero so clear masked region
* we have to set the masked region
move.w #$0dfc,bltcon0(a5) NO SHIFT: USE A,B,D: D=A OR B
bra bltcopy
bltclr
* clear region
move.w #$0d0c,bltcon0(a5) NO SHIFT: USE A,B,D: D=NOT A AND B
bltcopy
move.w blitsize,bltsize(a5) perform the required blit function
dbf d7,nxtplane do all the planes
* done
fil_end
rts
*****************************************************************************************
* Get pixel address and mask to set pixels in the mask plane which
* mark start and end of a scan line.
* d0=x1: d1=x2: d5=y: a0=xbuf: a1=maskplane base: a2=msk y line tbl
set_pix
lsl.w #2,d5 4*y is offset in table
movea.l 0(a2,d5.w),a3 row address in mask plane
move.l a3,a4 save it
* set pixel x1
move.w d0,d3 save x1
lsr.w #EIGHT,d0 byte num in row (/8)
adda.w d0,a3 the byte containing the pixel
andi.w #$0007,d3 pixel num in byte
subi #7,d3
neg.w d3 bit to set
clr.w d0
bset d3,d0 this is a mask
or.b d0,(a3) set the pixel
* set pixel x2
move.l a4,a3 restore row address
move.w d1,d3 save x2
lsr.w #EIGHT,d1 byte num in row (/8)
adda.w d1,a3 the byte containing the pixel
andi.w #$0007,d3 pixel num in byte
subi #7,d3
neg.w d3 bit to set
clr.w d0
bset d3,d0 this is a mask
or.b d0,(a3) set the pixel
rts
*****************************************************************************************
* Get the screen address of a word
* at a0=base: d0=x: d1=y:
scrn_wrd
move.w #WIDTH,d2 plane width
mulu d1,d2 y*width
add.l a0,d2 + base
lsr.w #SIXTEEN,d0 x/16
add.w d0,d0 word pos in row
ext.l d0
add.l d0,d2 address
rts
*****************************************************************************************
* See if last blit is finished
blt_chk
move.l #$dff000,a5
move.l d7,-(sp)
blt_chk1
move.w dmaconr(a5),d7
btst.l #14,d7
bne blt_chk1
move.l (sp)+,d7
rts
************************************************************************************
* BSS_00.s *
* Put all the variables you want to use in here *
************************************************************************************
* POLYGON VARIABLES
colour ds.w 1 current colour
no_in ds.w 1 number of polygon vertices
xmin ds.w 1 limits
xmax ds.w 1 for
ymin ds.w 1 xbuf
ymax ds.w 1
coords_lst ds.l 1
* BITPLANE VARIABLES
maskplane ds.l 1 base addresses
storeplane ds.l 1
log_screen ds.l 1
msk_y_tbl ds.l 200 scan line addresses
xbuf ds.l 200 x buffer
bltstrt ds.l 1
bltwidth ds.w 1
blitsize ds.w 1
blitmod ds.w 1
showplanes_list ds.l DEPTH a list of pointers to each of the bitplanes per playfield
workplanes_list ds.l DEPTH
IFD A500
scrn1_base ds.l 1
scrn2_base ds.l 1
cl1adr ds.l 1
cl2adr ds.l 1
cladr ds.l 1
oldcop ds.l 1
ENDC
gfxversion ds.w 1 lib version
vbi_flag ds.w 1
show_bitmap ds.l 1 BitMap structure pointers
work_bitmap ds.l 1
showlist ds.l 1 Copperlist pointers
worklist ds.l 1
showplanes ds.l 1 VIDEO Ram pointers
workplanes ds.l 1
draw_buffer ds.l 1 Bitplane sized memory to construct objects in
frame_done ds.l 1 flag to indicate frame finished
File_Handle ds.l 1
File_Buffer ds.l 1
DosBase ds.l 1
GrafBase ds.l 1
IntuiBase ds.l 1
LowLevelBase ds.l 1
OldActiView ds.l 1
intHandle ds.l 1 V40 interrupt handle
colormap ds.l 1
ReturnMsg ds.l 1 For system to use when we quit
* SYSTEM STRUCTURES I WANT TO USE.
EVEN
vblank ds.b IS_SIZE ;store interrupt structure here.
EVEN
my_view ds.b v_SIZEOF
EVEN
my_viewport ds.b vp_SIZEOF
EVEN
my_rasinfo ds.b ri_SIZEOF
4. Windowing
If a picture is larger than the limits of the screen then there is a problem with what happens to the excess. Unless some provision is made for this possibility, the program will attempt to write to addresses outside of the section of RAM reserved for the screen, which in our case is the five bit planes called the logical screen. Unless we are sure that everything will always lie within the screen size, some provision must be made to clip off those sections of the picture which lie outside. Confining a picture in this way is called windowing because of the obvious analogy to someone looking out of a window. The screen is a window onto the internal world of the computer. This window could be the maximum allowed on a given resolution or something smaller (one obvious way to make graphics fast is to keep the picture small so that not much has to be drawn). The freedom to vary the size of the visible image can even give rise to special effects — an aperture opening, for example. Because of the ‘clipping off’ of the unwanted parts of the picture that takes place, we shall call the outline of this window the clip frame.
The algorithm we need is one which will handle filled polygons. It is not sufficient to just chop off vertices where they exceed the clip frame. The line left by the chop must become an additional edge to close the polygon. Once again an elegant solution to this problem was found many years ago by Sutherland and Hodgman.
4.1. Sutherland-Hodgman Clipping Algorithm.
The Sutherland-Hodgman algorithm is actually more powerful than we require; it can handle polygons of any shape. In this book, for speed, only convex (round-shaped, all external angles greater than zero) polygons are filled. The requirement to be convex is a consequence of a later constraint; the need to keep the hidden-surface-removal algorithm simple. This is something we will meet at a later stage.
Strictly speaking, Sutherland-Hodgman does not require polygons to be convex nor does it require the clipping frame to be a rectangle. But, for simplicity, the version given here does use a rectangular clipping frame parallel with the monitor screen. The boundaries of the clipping frame are defined by xmin, xmax, ymin and ymax and are shown for a general polygon in Figure 4.1. The Sutherland-Hodgman strategy is to find the intersections in turn of all of the edges of the polygon with each boundary. Since our boundary has four sides this means that four cycles of the polygon will be made. On each cycle some of the original edges may be lost and new ones added.
As each new vertex is examined, various actions are taken which depend on the position of it and the previous vertex. These cases are illustrated in Figure 4.1 and examined below:

If the next vertex is outside the frame, (A), check the position of the previous vertex, (C). If that was in, find the point of intersection, (S), of the edge joining them with the clip frame and save it. Don’t save the next vertex (A).

If the next vertex is inside the frame, (B), check the position of the previous vertex, (A). If that was out, find the point of intersection, (R), of the edge joining them with the clip frame and save it. Also save the next vertex, (B).
This is the algorithm applied to all the vertices going round the polygon.
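One of the four passes can be sketched in C. This version computes the intersection with a multiply and divide rather than the add-and-shift iteration used by the assembly code (described next); the names are illustrative:

```c
/* Clip a closed polygon (vertex list x0,y0,...,n vertices) against
   the boundary x = xmin, keeping the region x >= xmin.
   Returns the number of output vertices written to out. */
int clip_xmin(const int *in, int n, int *out, int xmin)
{
    int m = 0;
    for (int i = 0; i < n; i++) {
        int x1 = in[2 * i],             y1 = in[2 * i + 1];
        int x2 = in[2 * ((i + 1) % n)], y2 = in[2 * ((i + 1) % n) + 1];
        int in1 = (x1 >= xmin), in2 = (x2 >= xmin);
        if (in1 != in2) {               /* edge crosses: save intercept */
            out[2 * m]     = xmin;
            out[2 * m + 1] = y1 + (y2 - y1) * (xmin - x1) / (x2 - x1);
            m++;
        }
        if (in2) {                      /* next vertex inside: save it */
            out[2 * m] = x2; out[2 * m + 1] = y2; m++;
        }
    }
    return m;
}
```

Clipping a unit square that straddles the boundary shows the chopped edge being replaced by two new vertices on the boundary, closing the polygon again.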
Once again it might appear that calculating points of intersection of sloping lines with the clip frame requires a lot of mathematical computation involving divisions and multiplications. Surprisingly this is not so. As usual in assembly language programming, where variables are not abstract algebraic symbols, but contents of memory locations or registers, it is possible to find answers using only addition and subtraction and, where it occurs, to use division and multiplication by powers of two which can be done quickly by right and left shifts.
To illustrate this consider the case where the previous point was outside but the next point is inside the frame limit xmin. This is shown in more detail in Figure 4.2 where the two possible cases, depending on which point is closest to the limit, are examined. As part of the process to determine that B(x2,y2) lies inside and A(x1,y1) lies outside the limit, it is necessary to compare both x1 and x2 with xmin. But instead of just using the COMPARE instruction, the actual differences (xmin-x1) and (xmin-x2) are calculated and the sign of the result used as the basis for decision. Note that (xmin-x1) is positive and (xmin-x2) is negative. Having then decided that there is a point of intersection to determine and save, these differences are used as the starting point for calculating the point of intersection in the following way.
One of the coordinates of the point of intersection is already known; it is xmin, the limit itself; it remains to find the y value at the intercept. This is done iteratively in the following way. The average of A and B is calculated by adding coordinates and dividing by 2. The result T1 is closer to the intercept than either A or B and we can see what side of the boundary it lies by following the sign of the average of (xmin-x1) and (xmin-x2). More important, the average of y1 and y2 will be the intercept value itself if the average of (xmin-x1) and (xmin-x2) is zero, because when this happens the two points are either evenly spaced on either side of the boundary, or coincident with it. This is the basis of the iterative algorithm used in the example program.
What happens the first time is that the average of y1 and y2 and the average of (xmin-x1) and (xmin-x2) are calculated by means of an addition and a shift right (a quick divide by two). This yields the y coordinate of the point T1. If the average of (xmin-x1) and (xmin-x2) is zero then the intercept has been found. If the x-average is negative, as at point T1 in case 1, then it lies inside the boundary and the next average must be taken between (xmin-x1) and (xmin-xT1). Likewise, the next y-average must be taken between y1 and yT1. If, on the other hand, the initial average of (xmin-x1) and (xmin-x2) is positive, as in case 2, the next average must be taken between (xmin-xT1) and (xmin-x2) and the next y-average between yT1 and y2. This iterative process continues until the x-average is zero, at which point the current y-average is the y coordinate of the point of intersection, which is then saved.
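The iteration can be sketched in C; as in the assembly version, only additions and arithmetic right shifts are needed (the function name is mine):

```c
/* Find the y at which the edge A(x1,y1)-B(x2,y2) crosses x = xmin,
   where A is outside (x1 < xmin) and B is inside (x2 >= xmin).
   Repeatedly average the pair straddling the boundary until the
   averaged distance (xmin - x) is zero. */
int intercept_y(int x1, int y1, int x2, int y2, int xmin)
{
    int d1 = xmin - x1;                    /* positive: A outside  */
    int d2 = xmin - x2;                    /* negative: B inside   */
    for (;;) {
        int dm = (d1 + d2) >> 1;           /* average distance     */
        int ym = (y1 + y2) >> 1;           /* average y            */
        if (dm == 0)
            return ym;                     /* midpoint on boundary */
        if (dm < 0) { d2 = dm; y2 = ym; }  /* midpoint inside: replaces B  */
        else        { d1 = dm; y1 = ym; }  /* midpoint outside: replaces A */
    }
}
```

Each pass halves the interval straddling the boundary, so the loop terminates after at most log2 of the initial x-distance iterations.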
4.2. Example Program
The example program clips a polygon using a version of the SutherlandHodgman algorithm and then fills it. The polygon is shown in Figure 4.3.
4.2.1. clipfrme.s
This is the control program plus the data for the polygon vertices. The coordinates in my_data are, as usual, in the order x0,y0,x1,y1,…,x0,y0, with the first coordinate repeated at the end. The clip frame limits are also given in the data and you can change them to suit yourself.
4.2.2. core_01.s
Here is where the actual clipping routine resides (together with all the other routines used so far by means of the include core_00.s directive at the end). Most of the work is done by the subroutine clip. It looks rather long but that is to try to make it more readable. Because many of its parts are very similar, it would be possible to make it shorter with inner subroutine calls, but then it would be harder to follow. It is a complicated routine but that is a consequence of the rather difficult task it does, which has been described above.
It is laid out in the order that it clips against boundaries: xmin first followed by the others. In all, four complete traversals of the data are made with new vertices being added each time. The data for the vertices is input on the first traversal from crds_in and output to crds_out. The next traversal reverses the order. Because there are four traversals, the data ends up back where it started in crds_in, ready for the next part of the program, to follow in later chapters.
4.2.3. bss_01.s
As the number of variables gets larger, so new bss files appear. The earlier ones have to be included.
* clipfrme.s
*
* Program for chapter 4
* A program to clip and fill a polygon to a window (clip frame)
* defined by the limits clp_xmin, clp_xmax, clp_ymin, clp_ymax
*
*SECTION TEXT
opt d+ include labels for debugging
bra main don't execute the includes
inlude equates. constants
inlude systm_00.s housekeeping
include core_01.s important subroutines
main bsr set_up screens, copper, blitter, etc
blit_loop:
bsr drw_shw2
move.w #12-1,d7 six pairs of coords for vertices
lea crds_in,a0 destination
move.l a0,a3 ready for drawing
lea my_data,a1 from here
clp_loop
move.w (a1)+,(a0)+ transfer
dbf d7,clp_loop them all
move.w #5,no_in 5 sides to the polygon
move.w my_colour,colour set the colour
move.w my_xmin,clp_xmin set the
move.w my_xmax,clp_xmax clip
move.w my_ymin,clp_ymin frame
move.w my_ymax,clp_ymax limits
bsr drw2_shw1 draw on screen 2, display 1
bsr clip window it
bsr poly_fill fill it
bsr drw1_shw2 show the drawing
loop_again:
bra loop_again forever
*SECTION DATA
* A pentagon
my_data dc.w 20,100,200,20,300,80,260,180,140,180,20,100
* which is pink
my_colour dc.w 24
* The window limits
my_xmin dc.w 50
my_xmax dc.w 270
my_ymin dc.w 50
my_ymax dc.w 150
*SECTION BSS
include bss_01.s
* SECTION DATA
include data_00.s
END
*****************************************************************************************
* Core_01.s *
* *
* A version of the Sutherland-Hodgman clipping algorithm.
* It goes around the polygon clipping it against one boundary at a time. It goes
* around four times in all.
* a0=crds_in: a1=crds_out: a2=no_out: a3=saved(crds_out):
* d0=current limit: d1=x1: d2=y1: d3=x2: d4=y2: d5=(saved)x2: d6=(saved)y2:
*****************************************************************************************
include core_00.s
* First clip against xmin.
clip
bsr clip_ld1 set up pointers
tst.w d7 any sides to clip?
beq clip_end not this time...
* do first point as a special case.
move.w (a0)+,d5 1st x
move.w (a0)+,d6 1st y
move.w clp_xmin,d0 limit
cmp.w d0,d5 test (x1-xmin)
bge xmin_save inside limit
bra xmin_update outside limit
* do successive vertices in turn
xmin_next
move.w (a0)+,d3 x2
move.w (a0)+,d4 y2
move.w d3,d5 save x2
move.w d4,d6 save y2
* now test for position
sub.w d0,d3 x2-xmin
bge xmin_x2in x2 is in
* x2 is outside, find x1
sub.w d0,d1 x1-xmin
blt xmin_update both x2 and x1 are outside
* x2 is out but x1 is in so find intersection, needs d1=dx1(+ve):d3=dx2(-ve)
* d2=y1: d4=y2:
* find the y intercept and save it.
bsr y_intercept
* but because it's out, don't save x2.
bra xmin_update
xmin_x2in
* x2 is in but where is x1? GOD KNOWS!!
sub.w d0,d1 x1-xmin
bge xmin_save both x1 and x2 are in
* x2 is in but x1 is out so find intercept, but need -ve one in d3, so swap
exg d1,d3
exg d2,d4
bsr y_intercept
xmin_save
move.w d5,(a1)+ save x
move.w d6,(a1)+ save y
addq.w #1,(a2) inc count
xmin_update
move.w d5,d1 x1=x2
move.w d6,d2 y1=y2
dbra d7,xmin_next
* The last point must be the same as the first
movea.l a3,a4 pointer to first x
subq #4,a1 point to last x
cmpm.l (a4)+,(a1)+ check first and last x and y
beq xmin_dec already the same
move.l (a3),(a1) move first to last
bra clip_xmax
xmin_dec
tst.w (a2) if count
beq clip_xmax is not already zero
subq.w #1,(a2) reduce it
* Now clip against xmax. Essentially the same as above except that the order
* of subtraction is reversed so that the same subroutine can be used to find
* the intercept.
clip_xmax
bsr clip_ld2 set up pointers
tst.w d7 any to do?
beq clip_ymin no...
* do first point as a special case.
move.w (a0)+,d5 1st x
move.w (a0)+,d6 1st y
move.w clp_xmax,d0
cmp.w d5,d0 test (xmax-x1)
bge xmax_save inside limit
bra xmax_update outside limit
* do successive vertices in turn
xmax_next
move.w (a0)+,d3 x2
move.w (a0)+,d4 y2
move.w d3,d5 save x2
move.w d4,d6 save y2
* now test for position
sub.w d0,d3
neg.w d3 xmax-x2
bge xmax_x2in x2 is in
* x2 is outside. where is x1?
sub.w d0,d1
neg.w d1 xmax-x1
blt xmax_update both x2 and x1 are out
* x2 is out but x1 is in so find intersection
* needs dx1(+ve) in d1, and dx2(ve) in d3, y1 in d2 and y2 in d4
* find the intercept and save it.
bsr y_intercept
* but because it's out, don't save x2
bra xmax_update
* x2 is in but where is x1
xmax_x2in
sub.w d0,d1
neg.w d1 xmax-x1
bge xmax_save both x1 and x2 are in
* x2 is in but x1 is out so find intercept
* but must have the -ve one in d3, so switch
exg d1,d3
exg d2,d4
bsr y_intercept
xmax_save
move.w d5,(a1)+ save x
move.w d6,(a1)+ save y
addq.w #1,(a2) inc count
xmax_update
move d5,d1 x1=x2
move d6,d2 y1=y2
dbra d7,xmax_next
* the last point must be the same as the first
movea.l a3,a4 pointer to first x
subq #4,a1 point to last x
cmpm.l (a4)+,(a1)+ check 1st and last x and y
beq xmax_dec already the same
move.l (a3),(a1) move first to last
bra clip_ymin
xmax_dec
tst.w (a2) if count
beq clip_ymin is not already zero
subq.w #1,(a2) reduce it
clip_ymin
bsr clip_ld1 set up pointers
tst.w d7 any to do?
beq clip_ymax no...
* do first point as a special case
move.w (a0)+,d5 1st x
move.w (a0)+,d6 1st y
move.w clp_ymin,d0 this limit
cmp.w d0,d6 test (y1-ymin)
bge ymin_save inside limit
bra ymin_update outside limit
* do successive vertices in turn
ymin_next
move.w (a0)+,d3 x2
move.w (a0)+,d4 y2
move d3,d5 save x2
move d4,d6 save y2
* now test for position
sub.w d0,d4 y2-ymin
bge ymin_y2in y2 is in
* y2 is outside where is y1?
sub.w d0,d2 y1-ymin
blt ymin_update both y2 and y1 are out
* y2 is out but y1 is in so find intersection
* needs x1 in d1, x2 in d3, dy1 in d2 and dy2 in d4
* find the intercept and save it
bsr x_intercept
* but because it's out, don't save y2
bra ymin_update
ymin_y2in
* y2 is in but where is y1
sub.w d0,d2 y1-ymin
bge ymin_save both y1 and y2 are in
* y2 is in but y1 is out so find intercept
* but must have the -ve one in d4, so switch
exg d1,d3
exg d2,d4
bsr x_intercept
ymin_save
move.w d5,(a1)+ save x
move.w d6,(a1)+ save y
addq.w #1,(a2) increment no
ymin_update
move d5,d1 x1=x2
move d6,d2 y1=y2
dbra d7,ymin_next
* the last point must be the same as the first
movea.l a3,a4 pointer to first x
subq.w #4,a1 point to last x
cmpm.l (a4)+,(a1)+ check first and last x and y
beq ymin_dec already the same
move.l (a3),(a1) move first to last
bra clip_ymax
ymin_dec
tst.w (a2) if count
beq clip_ymax is not already zero
subq.w #1,(a2) reduce it
clip_ymax
bsr clip_ld2
tst.w d7 any to do?
beq clip_end no...
* do first point as a special case
move.w (a0)+,d5 1st x
move.w (a0)+,d6 1st y
move.w clp_ymax,d0
cmp.w d6,d0 test (ymax-y1)
bge ymax_save
bra ymax_update
* do vertices in turn
ymax_next
move.w (a0)+,d3 x2
move.w (a0)+,d4 y2
move d3,d5 save x2
move d4,d6 save y2
* test for position
sub.w d0,d4
neg.w d4 ymax-y2
bge ymax_y2in
* y2 is outside where is y1?
sub.w d0,d2
neg.w d2 ymax-y1
blt ymax_update both y2 and y1 are out
* y2 is out but y1 is in so find intersection
bsr x_intercept
bra ymax_update
ymax_y2in
*y2 is in but where is y1?
sub.w d0,d2
neg.w d2 ymax-y1
bge ymax_save both y1 and y2 are in
* y2 is in but y1 is out so find intercept
exg d1,d3
exg d2,d4
bsr x_intercept
ymax_save
move.w d5,(a1)+ save x
move.w d6,(a1)+ save y
addq.w #1,(a2) increment num
ymax_update
move.w d5,d1 x1=x2
move.w d6,d2 y1=y2
dbra d7,ymax_next
* the last point must be the same as the first
movea.l a3,a4 pointer to first x
subq.w #4,a1 point to last x
cmpm.l (a4)+,(a1)+ check first and last x and y
beq ymax_dec already the same
move.l (a3),(a1) move first to last
bra clip_end
ymax_dec
tst.w (a2) if count
beq clip_end is not already zero
subq.w #1,(a2) reduce it
clip_end
lea crds_in,a0
move.l a0,coords_lst
rts
clip_ld1
lea crds_in,a0 pointer to vertex coords before
lea crds_out,a1 and after this clip
move.l a1,a3 saved
move.w no_in,d7 this many sides before
lea no_out,a2 where the number after is stored
clr.w no_out
rts
clip_ld2
lea crds_out,a0 pointer to vertex coords before
lea crds_in,a1 and after this clip
move.l a1,a3 saved
move.w no_out,d7 this many sides before
lea no_in,a2 where the number after is stored
clr.w no_in
rts
y_intercept
tst.w d1
beq yint_out
tst.w d3
beq yint_out
movem d5/d6,-(sp)
yint_in
move.w d2,d6
add.w d4,d6
asr.w #1,d6
move.w d1,d5
add.w d3,d5
asr.w #1,d5
beq yint_end
bgt yint_loop
move d5,d3
move d6,d4
bra yint_in
yint_loop
move d5,d1
move d6,d2
bra yint_in
yint_end
move.w d0,(a1)+
move.w d6,(a1)+
addq.w #1,(a2)
movem (sp)+,d5/d6
yint_out
rts
x_intercept
tst.w d2
beq xint_out
tst.w d4
beq xint_out
movem d5/d6,-(sp)
xint_in
move d1,d5 x1
add.w d3,d5 x1+x2
asr.w #1,d5 (x1+x2)/2 = <x> a possible intercept
move d2,d6 dy1
add.w d4,d6 dy1+dy2
asr.w #1,d6 (dy1+dy2)/2 =<dy>
beq xint_end if <dy>=0, boundary reached
bgt xint_loop if not loop again
move d6,d4 unless <dy> is -ve and becomes dy2
move d5,d3 and <x> becomes x2
bra xint_in and try again
xint_loop
move d5,d1 <x> is new x1
move d6,d2 and <dy> is new dy1
bra xint_in
xint_end
move.w d5,(a1)+ store intercept <x>
move.w d0,(a1)+ and the y as new vertex coords
addq.w #1,(a2) and increment the vertex count
movem (sp)+,d5/d6
xint_out
rts next vertex
* Leaves with a list of vertex coords at crds_in
* the number of polygon sides at no_in
set_up:
* set up memory, screens, blitter etc
bsr alloc_mem
bsr copr_lst
bsr blit_alloc
bsr colr_set
bsr wrt_phys_tbl
rts
include core_00.s add on the previous core
*****************************************************************************************
*****************************************************************************************
* BSS_01 *
*****************************************************************************************
include bss_00.s
* Polygon attributes
crds_in ds.w 100 input coords
crds_out ds.w 100 output as above
no_out ds.w 1 output number
colr_lst ds.w 20 list of polygon colours
clp_xmax ds.w 1 clip frame limits
clp_xmin ds.w 1
clp_ymin ds.w 1
clp_ymax ds.w 1
5. Getting Things Into Perspective
It is a curious thing that distant objects look smaller than ones which are close. They aren’t smaller, but they do subtend a smaller angle at the eye. For any scene to look real therefore, the size of primitives must diminish as they recede into the distance. All of this is done by the eye and the brain. Simulating the same effect on the computer screen is what the perspective transform is all about.
You don’t really need to understand much maths to use the transforms in this book. The maths and the transforms have all been worked out; you only have to understand how to feed data to them. The perspective transform is just such an example. However, to understand and use transforms fully requires some understanding of maths and matrices. We will introduce these as the need arises. The Appendices also contain information on these topics.
5.1. The Perspective Transform
The perspective transform is a set of mathematical operations which project an image of an object from the world reference frame onto the screen. This has a similarity to the way in which a shadow is formed, except that in that case the shadow falls behind the object and is larger, whereas in the perspective projection it lies between the view point and the screen and is smaller. This is shown in Figure 5.1.
One aspect that crops up repeatedly in transforms and matrices is the use of homogeneous coordinates. Yet it is possible to avoid using them altogether and in many cases it is an inconvenience to use them at all. What do they mean? Do they matter? In this chapter we find out about homogeneous coordinates and how to use them in the perspective transform which is done using matrix multiplication just to illustrate the method. At the same time it will be clear how to do the transform without using matrix multiplication at all. It just turns out that the perspective transform is a good opportunity to try it out.
Figure 5.1 shows an object, in this case a cube, defined inside the computer in the world frame and projected onto the screen. The screen lies in the xv-yv plane of the view frame and the projected image is defined by the points where the ‘rays’ from the view point (also called the centre of projection, at -d along the zv axis) pierce the view plane. The window is the area of the view plane which is visible on the screen. That’s really all there is to it. The view point plays a very important role in this scheme and could be placed anywhere. Placing it along the z axis makes the algebra simple and centres the projection about the view frame origin. This is a very simple type of projection; draughtsmen use many other kinds. But it works fine and the algebra associated with it is minimal.
To make life simple, take the case where the window entirely fills the monitor screen. Then the distinction between the two disappears. Let’s look at how a very simple object projects onto the screen. This is shown in Figure 5.2. As part of the transform it is also necessary to adjust to the screen coordinate system, where the origin is at the top left-hand corner. There are three coordinate systems shown in the diagram: the view frame (xv,yv,zv), the screen frame (xs,ys), and the projected coordinates (Xv,Yv). This projected coordinate system is an intermediate one, introduced for convenience and centred at the view frame origin.
From the similar triangles ABC and ADE and the similar triangles ABF and ADG we get the results:
Xv/xv = d/(zv+d) and Yv/yv = d/(zv+d)
or
Xv = xv.d/(zv+d) and Yv = yv.d/(zv+d).
It only remains to choose where to centre the projection on the visible screen. If it is to be centred halfway across at the bottom, then in screen coordinates
xs = Yv+Wx/2 and ys = Wy-Xv
where Wx and Wy are the width and height of the screen in the current resolution.
In low resolution Wx=320 and Wy=200. In what follows we shall only consider low resolution, though a conversion from one resolution to another is straightforward.
In low resolution the perspective transform becomes, for display in screen coordinates:
xs = 160+yv.d/(zv+d) and ys = 200-xv.d/(zv+d)
These transforms can be worked out using straightforward algebra. The only thing to look out for is that the denominator doesn’t ever become zero because this will cause a ‘divide by zero’ exception. The program can be set up to watch out for this.
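As a concrete illustration, here is the low-resolution transform in Python with the divide-by-zero guard in place (a hypothetical helper, not the book's code; d defaults to the 100 used later by the example program, and integer division stands in for DIVS):

```python
def perspective(xv, yv, zv, d=100, wx=320, wy=200):
    """Project a view-frame point to low-resolution screen coordinates."""
    denom = zv + d
    if denom == 0:                  # watch for the 'divide by zero' case
        denom = 1
    xs = wx // 2 + yv * d // denom  # xs = 160 + yv.d/(zv+d)
    ys = wy - xv * d // denom       # ys = 200 - xv.d/(zv+d)
    return xs, ys
```

A point at the view frame origin lands at (160, 200), the centre-bottom of the screen, and projected offsets shrink towards that point as zv grows.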
5.2. Homogeneous Coordinates
The perspective transform, above, is quite simple but has a serious disadvantage if it is to be concatenated with several other types of transform. Remember, in the jargon of matrix transforms, concatenation simply means multiplying matrices together. That is the advantage of writing transforms as matrices. Where several transforms (rotations etc.) take place in succession, the overall transform can be constructed by multiplying the individual transforms and then applied to the coordinates in one go. The problem with this perspective transform is that as it stands it cannot be written as a matrix at all.
Basically, a matrix can represent any transform which is linear, which means there is a proportional relation between the initial and the transformed coordinates. What we would like to see for the transforms between Xv,Yv and xv,yv,zv are equations like
Xv = a.xv + b.yv + c.zv
Yv = d.xv + e.yv + f.zv
where the coefficients a,b,c,d,e and f are simple numbers.
Then it could be written as a matrix product (see Appendix 6 for more information on matrices)
| Xv |   | a b c |   | xv |
| Yv | = | d e f | * | yv |
                     | zv |
Unfortunately the perspective transform we have derived does not have this form. What messes it up is the (zv+d) in the denominators; the coordinates themselves have to be in the numerators. Therefore as it stands our transform cannot be put into 3x3 matrix form. The perspective transform isn’t the only one to suffer from this problem. Simple translations do as well. The way out of the problem is to go to homogeneous coordinates.
As far as we are concerned the use of homogeneous coordinates is just a trick to get round this problem. The trick is to introduce another dimension, temporarily, to give more “space”. That’s all this extra dimension does because in this extra dimension all vertices have the same value, 1. In homogeneous coordinates the point (xv,yv,zv) becomes (xv,yv,zv,1).
How does this help? Now the transform can be written as a product but there are penalties to pay: the matrix product will generate an extra term which must be divided into the others. Also all matrices are now bigger (4x4). Here’s how it works.
First do the perspective transform in homogeneous coordinates to give an intermediate result:
| d.xv |   | d 0 0 0 |   | xv |
| d.yv | = | 0 d 0 0 | * | yv |
|    0 |   | 0 0 0 0 |   | zv |
| zv+d |   | 0 0 1 d |   |  1 |
Then divide by the fourth element (zv+d) to give
Xv = xv.d/(zv+d)
Yv = yv.d/(zv+d).
Finally translate to the screen centre (this translation can also be done as a matrix multiplication in homogeneous coordinates but that would be making work for the sake of it):
xs = 160 + yv.d/(zv+d)
ys = 200 - xv.d/(zv+d).
The perspective matrix has zeros for most of its elements and so many of the multiplications are a waste of time. In the program at the end of this section which illustrates the transform, we have used the homogeneous form. It serves as a useful introduction to matrix multiplication in assembly language and allows us to try a few little-used assembler instructions.
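The multiply-then-divide sequence can be sketched as follows (illustrative Python; the full 4x4 product is written out even though most of the products are zero, mirroring the matrix above):

```python
def perspective_homogeneous(xv, yv, zv, d=100):
    """Perspective projection via homogeneous coordinates."""
    m = [[d, 0, 0, 0],      # row 1: gives d.xv
         [0, d, 0, 0],      # row 2: gives d.yv
         [0, 0, 0, 0],      # row 3: unused
         [0, 0, 1, d]]      # row 4: gives zv+d, the divisor
    v = (xv, yv, zv, 1)     # the point in homogeneous coordinates
    hx, hy, _, w = [sum(row[i] * v[i] for i in range(4)) for row in m]
    return hx / w, hy / w   # Xv = xv.d/(zv+d), Yv = yv.d/(zv+d)
```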
5.3. Example program
The example program shows a view of a plane with the letter “A” (an A monolith) sloping forwards in the world frame. When the perspective transform is done (together with windowing and everything else) it appears on the screen like the opening logo in a movie, where the words diminish into the distance. Figure 5.3 shows how the plane is set up in the view frame. Figure 5.4 shows how it looks on the screen.
You can look at the coordinates in the data file and change them if you wish to see how it looks in different orientations. If you want, you can change the data altogether to draw something different, but first read carefully how the data is laid out. This is explained more fully below in the data file. Be careful to join up the characters and label the vertices properly.
5.3.1. perspect.s
This is the control program. Its function is to load up the data, draw the picture and terminate with a key press. The data are stored in the file data_01.s, described below.
5.3.2. data_01.s
This is discussed next because it contains lists of the data. Understanding how these are used is essential to understanding how the program works. Since we start off with an object drawn in 3D in the view frame, each of its vertices must be fixed by three coordinates (xv,yv,zv). The lists of these are held at my_datax, my_datay and my_dataz. There is a scheme to identify each vertex in these lists. Each vertex has a number as shown in Figure 5.5. To find its coordinates simply read in from the start counting the first coordinate as number zero. The number of vertices in each polygon is given at vectors.
More data than this is required to actually draw the picture. The connections between the vertices are specified in my_edglst. For each polygon there is a list of connections in this table. The overall object is split into 6 polygons, all of which lie in the same plane. The vertex connections for these, going clockwise and closing the polygon, are
polygon 0: 0,1,2,3,0
polygon 1: 4,5,6,4
polygon 2: 7,8,9,10,7
polygon 3: 11,12,13,11
polygon 4: 14,15,4,18,14
polygon 5: 16,17,19,6,16
Arranged in this way all the information required to draw the object is readily available. To colour in the polygons a list of individual colours is held at my_colour. Notice that in this picture it was decided to construct the “A” by drawing an outline (polygon 1) and masking out the open parts (polygons 2 and 3) with the background colour, rather than by drawing each segment separately. This is also evident from the actual colour list, my_colour, where it can be seen that the background is gold and the letters are magenta. Doing it this way saves a bit of time but may lead to problems when the boundaries don’t quite match up. To supplement these lists the total number of polygons is given at my_npoly. These are the data blocks that must be loaded up at initialisation. Other variables are calculated by the various parts of the program as it goes.
You can change these lists to draw anything you wish. Just remember it is a 3D object in the view frame and coordinates are easiest to determine from views along the different axes. It must also be placed in front of the view plane as shown in Figure 5.3.
5.3.3. data_02.s
The 4x4 matrix for the perspective transform is stored here, a row at a time, with a view point at -100 on the view frame z axis. It isn’t included with data_01.s since that file will only be used once.
If you can’t follow the matrix multiplication used in the transform, don’t worry. Just think of the transform as a piece of ‘machinery’ to perform a function. If you want to alter the angle of view, change each of the numbers 100 to the new position of the view point. Remember 100 here is the distance of the view point along the negative zv axis.
5.3.4. bss_02.s
This contains a list of the variables used by the programs. Data is loaded into the variable blocks from the data file data_01.s by the control program. What goes where is clear from the control program. It consists of the lists of the x, y, and z coordinates of the vertices in the view frame, and other attributes as described in the previous sections.
5.3.5. core_02.s
This has two parts: the perspective transform, and polydraw which takes care of clipping and the actual drawing.
The perspective transform is done by matrix multiplication in homogeneous coordinates. It could be done by direct algebra but it is done this way to illustrate the use of homogeneous coordinates and matrix multiplication in a very compact way. Also it utilises a useful but little-used assembler instruction, LINK. When invoked, this causes the processor to open a space on the stack, called a frame, where data can be stored without interfering with the main stack. The pointer to the frame, one of the address registers, is declared in the LINK instruction together with the space required. The processor takes care of adjusting the regular stack pointer clear of the frame. In the present case it’s where the intermediate perspective calculations are stored. When finished with, the frame is closed by means of the UNLK instruction and the tidying up of the stack pointer is taken care of by the processor.
The perspective transform calculates the projections of the vertices on the view plane and stores them in two lists: scoordsx and scoordsy.
Polydraw is the final part. It contains all the previous subroutines necessary to complete the drawing. It also contains at the start a test for the visibility of each polygon. This is in anticipation of things to come. The test is to look for a colour number greater than $1f. Such a value would have been set earlier if the polygon was found to be facing away from the view point.
*
* perspect.s
*
*SECTION TEXT
opt d+ labels for debugging
bra main don't execute the includes
include core_02.s core subroutines
include systm_00.s
main bsr set_up allocate memory etc
* Transfer data from the data file to variables locations:
* first the edge numbers and colours
move.w my_npoly,d7 no of polygons?
beq main if none, quit
move.w d7,npoly or becomes
subq.w #1,d7 the counter
move.w d7,d0 save it
lea my_nedges,a0 source
lea snedges,a1 destination
lea my_colour,a2 source
lea col_lst,a3 destination
loop0 move.w (a0)+,(a1)+ transfer edge nos
move.w (a2)+,(a3)+ transfer colours
dbra d0,loop0
* second the edge list and coordinates
move.w d7,d0 restore count
lea my_nedges,a6
clr d1
clr d2
loop1 add.w (a6),d1
add.w (a6)+,d2
addq #1,d2 last one repeated each time
dbra d0,loop1 = total no of vertices
subq #1,d2 the counter
lea my_edglst,a0 source
lea sedglst,a1 destination
loop2 move.w (a0)+,(a1)+ pass it
dbra d2,loop2
move.w d1,vncoords
subq #1,d1
lea vcoordsx,a1
lea my_datax,a0
lea vcoordsy,a3
lea my_datay,a2
lea vcoordsz,a5
lea my_dataz,a4
loop3 move.w (a0)+,(a1)+
move.w (a2)+,(a3)+
move.w (a4)+,(a5)+
dbra d1,loop3
* the clip frame boundaries
move.w my_xmin,clp_xmin ready
move.w my_xmax,clp_xmax for
move.w my_ymin,clp_ymin clipping
move.w my_ymax,clp_ymax clipping
* Calculate the perspective view and draw it
bit_loop:
bsr drw_shw2
bsr perspective
bsr polydraw
bsr drw2_shw1
pers_loop
bra pers_loop forever
*SECTION DATA
include data_01.s
include data_02.s
*SECTION BSS
include bss_02.s
END
******************************************************************************************
* Core_02.s
* Perspective stuff
******************************************************************************************
include core_01.s
perspective
move.w vncoords,d7 any points to do?
beq prs_end
subq.w #1,d7 counter
lea vcoordsx,a0
lea vcoordsy,a1
lea vcoordsz,a2
lea scoordsx,a4
lea scoordsy,a5
link a6,#-32 open 16 word frame
prs_crd
moveq #3,d6
lea persmatx,a3
prs_elmnt
move.w (a0),d0
move.w (a1),d1
move.w (a2),d2
muls (a3)+,d0
muls (a3)+,d1
muls (a3)+,d2
add.l d1,d0
add.l d2,d0
move.w #1,d1
muls (a3)+,d1
add.l d1,d0
move.l d0,-(a6) store this element in the frame
dbra d6,prs_elmnt
move.l (a6)+,d3
bne prs_ok
addq #1,d3
prs_ok
addq.l #4,a6
move.l (a6)+,d4
divs d3,d4
add.w #160,d4
move.w d4,(a4)+
move.l (a6)+,d4
divs d3,d4
sub.w #199,d4
neg.w d4
move.w d4,(a5)+
addq.l #2,a0
addq.l #2,a1
addq.l #2,a2
dbra d7,prs_crd
unlk a6
prs_end
rts
polydraw
move.w npoly,d7
beq polydraw5
subq #1,d7
lea scoordsx,a0
lea scoordsy,a1
lea sedglst,a2
lea snedges,a3
lea col_lst,a4
polydraw2
move.w (a4)+,d0
cmp.w #$1f,d0
ble polydraw3
move.w (a3)+,d0
addq.w #1,d0
add d0,d0
adda.w d0,a2
bra polydraw4
polydraw3
move.w d0,colour
move.w (a3)+,d0
beq polydraw4 skip an empty polygon
move.w d0,no_in
lea crds_in,a5
polydraw1
move.w (a2)+,d1
lsl #1,d1
move.w 0(a0,d1.w),(a5)+
move.w 0(a1,d1.w),(a5)+
dbra d0,polydraw1
movem.l d7/a0-a4,-(sp)
bsr clip
bsr poly_fill
movem.l (sp)+,d7/a0-a4
polydraw4
dbra d7,polydraw2
polydraw5
rts
******************************************************************************************
* BSS_02.s
******************************************************************************************
include bss_01.s
scoordsx ds.w 100 xcoords
scoordsy ds.w 100 ycoords
sedglst ds.w 100 edge connections
snedges ds.w 20 number of edges in each polygon
npoly ds.w 1 number of polygons in this object
col_lst ds.w 20 colours
vcoordsx ds.w 100 viewframe xcoords
vcoordsy ds.w 100
vcoordsz ds.w 100
vncoords ds.w 1
*****************************************************************************************
* Data_01.s
*****************************************************************************************
include data.s
IFND TRANSFORM
my_datax dc.w 115,115,25,25,43,107,43,40,65,65,40,75
dc.w 88,75,34,34,34,34,43,43
my_datay dc.w 100,100,100,100,70,20,73,55,20,30
dc.w 53,8,10,22,40,91,90,20,50,48
my_dataz dc.w 120,120,0,0,24,108,24,20,53,53
dc.w 20,66,84,66,12,12,12,12,24,24
ENDC
my_edglst dc.w 0,1,2,3,0,4,5,6,4,7,8,9,10,7
dc.w 11,12,13,11,14,15,4,18,14,16,17,19,6,16
my_nedges dc.w 4,3,4,3,4,4
my_npoly dc.w 6
my_colour dc.w 5,23,5,5,23,23
my_xmin dc.w 0
my_xmax dc.w 319
my_ymin dc.w 0
my_ymax dc.w 199
* data_02.s
persmatx:
dc.w 100,0,0,0,0,100,0,0,0,0,0,0,0,0,1,100
include equates.s
6. Simple Rotations
What we want to do here is rotate an object in the world frame. In our world model this is part of what happens when an object is moved from its object frame to the world frame. In addition, in general, there will be an associated translation as it is moved to its current location. As an example of simple rotations in action, the object-to-world transform is a good thing to do next. In a complex world with several different objects, each one would have different translations and rotations to bring them all together to make the world picture.
Let’s take a simple world with just one object to start with. We already have a good example to work on: the monolith with the “A” written on it, which was used to illustrate the perspective transform. The data is already entered and ready to go. What we would like to see is the monolith rotating in the centre of the screen. That’s what we’ll do next.
6.1. Geometric Transforms
Geometric transforms are those which change the coordinates of objects. Are there any other kinds? Yes, those which change frames of reference, called coordinate transforms. In mathematical language a geometric transform is the inverse of a coordinate transform (this topic is also discussed in Appendix 6). An example of the latter kind is the transform from world frame to view frame. Remember, the view frame is the set of axes attached to the observer (you) moving through the world frame. Seen from the view frame of an observer on the move, the coordinates of all objects are continuously changing. Although coordinate and geometric transforms are two sides of the same coin, the viewing transform is a bit more difficult to follow and is done later in Chapters 9 and 10. In this section simple rotations about the x, y and z axes are presented without mathematical derivation. Turn to Appendix 6 for an additional mathematical description.
6.2. Rotations About the Principal Axes
A spinning top is a good example of an object undergoing geometric rotation about the vertical axis. As far as we are concerned here, the mathematics used to do this is just ‘heavy machinery’. There is no real need to know how it is derived in order to use it. The transforms we are about to discuss are illustrated in Figure 6.1.
6.2.1. Rotation about the x-axis
This is illustrated in Figure 6.1(1) by a point P with coordinates (x,y,z) being rotated about the x-axis by an angle θ to arrive at the point P' with coordinates (x',y',z'). Representing the points by vectors clearly shows the rotation. Notice how the sense of the rotation is defined. It is clockwise when looking along the positive x-axis from behind the y-z plane. In terms of the column vectors, the transform can be written as a matrix product
| x' |   | 1  0      0    |   | x |
| y' | = | 0  cosθ  -sinθ | * | y |
| z' |   | 0  sinθ   cosθ |   | z |
In simple algebra, with the matrix product multiplied out:
x' = x
y' = y.cosθ - z.sinθ
z' = y.sinθ + z.cosθ
For conciseness, the matrix is abbreviated to R'(θ) and the transform is then abbreviated to
P' = R'(θ).P
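As a numerical check, the x-axis rotation can be written straight from the algebra above (a Python sketch, not the book's code; the angle is in radians):

```python
import math

def rotate_x(p, theta):
    """Rotate (x, y, z) about the x axis, clockwise looking along +x."""
    x, y, z = p
    c, s = math.cos(theta), math.sin(theta)
    return (x, y * c - z * s, y * s + z * c)
```

Rotating a vector along the z axis by 90 degrees, rotate_x((0, 0, 1), math.pi / 2), carries it onto the negative y axis, consistent with the sense of rotation defined above.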
6.2.2. Rotation about the y-axis
In this case the point P is rotated about the y-axis by an angle φ as shown in Figure 6.1(2). As before, the rotation R'(φ) is clockwise looking along the positive y-axis from behind the x-z plane. Expressed as a matrix product, the transform is
| x' |   |  cosφ  0  sinφ |   | x |
| y' | = |  0     1  0    | * | y |
| z' |   | -sinφ  0  cosφ |   | z |
6.2.3. Rotation about the z-axis
In Figure 6.1(3) the point P is rotated about the z-axis by an angle ψ. The rotation R'(ψ) is clockwise looking along the positive z-axis from behind the x-y plane.
| x' |   | cosψ  -sinψ  0 |   | x |
| y' | = | sinψ   cosψ  0 | * | y |
| z' |   | 0      0     1 |   | z |
6.2.4. Composite Rotations
When all three types of rotation are done simultaneously things become a good deal more complicated. This is because the order of rotation matters; rotating first by θ, second by φ and third by ψ does not end up with P in the same place as with any other order. This may seem to be a surprising result. In mathematical jargon, three-dimensional rotations are said to be non-commutative. To illustrate the point look at Figure 6.2.
This has two parts to it. In part 1 a vector which lies along the z axis to start with is first rotated about the x axis by 90° and then about the z axis by 90°. It ends up pointing along the x axis. In part 2 the order of rotations is reversed. Consequently the first rotation does nothing and the second leaves it pointing along the y axis. Clearly, changing the order of rotation alters the end result.
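The experiment of Figure 6.2 is easy to reproduce numerically (a Python sketch; rot_x and rot_z are our names for the rotation matrices given earlier):

```python
import math

def rot_x(p, a):
    x, y, z = p
    c, s = math.cos(a), math.sin(a)
    return (x, y * c - z * s, y * s + z * c)

def rot_z(p, a):
    x, y, z = p
    c, s = math.cos(a), math.sin(a)
    return (x * c - y * s, x * s + y * c, z)

v = (0, 0, 1)                    # a vector along the z axis
q = math.pi / 2                  # 90 degrees
a = rot_z(rot_x(v, q), q)        # part 1: about x first, then about z
b = rot_x(rot_z(v, q), q)        # part 2: about z first, then about x
print([round(c) for c in a])     # [1, 0, 0]  - along the x axis
print([round(c) for c in b])     # [0, -1, 0] - along the y axis
```

Changing the order of the two rotations changes where the vector ends up, exactly as the figure shows.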
A consequence of this is that keeping count of the individual rotations θ, φ and ψ separately provides insufficient information to get to the final position. The order of rotation must also be given. Where the individual rotations are small and frequent, such as in an object following a complex path, a different strategy must be found to keep track of the orientation. This is discussed in Chapter 10.
For the moment this is not such a problem. Performing a simple sequence of rotations in the world frame, or as part of the object-to-world transform, may only require three rotations about the individual axes in a simple order. To have a consistent scheme, we rotate first by ψ, second by φ and third by θ. In shorthand the overall transform when all these rotations take place in this order is:
P' = R'(θ).R'(φ).R'(γ).P
Notice how the first rotation appears next to the original point P, and later rotations appear farther to the left. This is the order of matrix multiplication with column vectors.
There is no need to perform the matrix products on the vector separately. Their product can be found beforehand to produce one resultant matrix, which can then be multiplied by the vector in one single operation. This combined (concatenated) rotation is denoted by R'(θ,φ,γ).
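The scheme is easy to check in a high-level language. The following Python sketch is purely illustrative (the book's own routines are in 68000 assembler, and the function names here are our own): it builds the three single-axis matrices, concatenates them in the order R'(θ).R'(φ).R'(γ), and lets you verify the non-commutativity of Figure 6.2.

```python
import math

def rot_x(t):
    """Rotation about the x axis by t radians."""
    c, s = math.cos(t), math.sin(t)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_y(t):
    """Rotation about the y axis."""
    c, s = math.cos(t), math.sin(t)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]

def rot_z(t):
    """Rotation about the z axis."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def mat_mul(a, b):
    """3x3 matrix product a.b."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def mat_vec(m, v):
    """Apply a 3x3 matrix to a column vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

def composite(theta, phi, gamma):
    """Concatenated rotation: gamma (about z) applied first, then phi,
    then theta, i.e. R = Rx(theta).Ry(phi).Rz(gamma) with column vectors."""
    return mat_mul(rot_x(theta), mat_mul(rot_y(phi), rot_z(gamma)))
```

Rotating a z-axis vector by 90° about x and then 90° about z lands it on the x axis, while the reverse order lands it on the y axis (the sign of which depends on the sense of rotation chosen), exactly as in Figure 6.2; and applying the concatenated matrix once gives the same point as applying the rotations in succession, which is the whole point of concatenation.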
6.3. The Object-to-World Transform
This is a good transform to illustrate what we have been talking about.
The point of this transform is to move an object from its object reference frame to the world frame where it appears in the cluster of all the other objects which make up the world picture. The object-to-world transform is illustrated in Figure 6.3 for the general case of all three rotations and a translation. In this case the angles are specific to the transform and are called oθ, oφ and oγ to distinguish them from other angles which will appear later in other transforms, and the displacement is (Oox,Ooy,Ooz) or, written in vector notation:
| x' |        | x |   | Oox |
| y' | = R' . | y | + | Ooy |
| z' |        | z |   | Ooz |
Notice that the translation has not been implemented as a matrix multiplication, but has been left as a vector addition. Like the perspective transform, the translation can be converted to a matrix product in homogeneous coordinates to put it on the same footing as everything else and allow it to be included in concatenation. This is not done here because it can be incorporated simply as an addition following the rotation transform. Further information on homogeneous coordinates is given in Appendix 6.
One way to think of the object frame is as a set of axes centred on the world frame origin. This is certainly a valid picture since without any rotation or translation, the object would appear at the world frame origin. The translation is essential to avoid superimposing all objects at the world frame origin. If the angles are continuously changed between frames then the object will rotate in the world frame. Since we already have the perspective transform in place from the previous chapter we can watch this happen.
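In outline, then, the object-to-world transform is a rotation followed by a vector addition. A minimal Python sketch (illustrative only; the names follow the text):

```python
def object_to_world(verts, R, Oo):
    """Rotate each object-frame vertex by the 3x3 matrix R, then add
    the world-frame displacement Oo = (Oox, Ooy, Ooz)."""
    out = []
    for (x, y, z) in verts:
        wx = R[0][0]*x + R[0][1]*y + R[0][2]*z + Oo[0]
        wy = R[1][0]*x + R[1][1]*y + R[1][2]*z + Oo[1]
        wz = R[2][0]*x + R[2][1]*y + R[2][2]*z + Oo[2]
        out.append((wx, wy, wz))
    return out
```

With homogeneous coordinates the addition could be folded into a single 4x4 matrix product, as noted above, but the plain rotate-then-add form matches what the assembler routine does.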
6.4. Example Program
This is a program to set up the object-to-world transform and use it to show the A monolith rotating about the z-axis of the world frame. The sines and cosines of the angles must also be calculated for the rotation matrices. How this is done is discussed below with the example programs.
6.4.1. otranw.s
This is the main control program. This time the initialisation is more extensive because a lot of data transfer takes place. The data to draw the A monolith is in the file data_01.s as before, but now it has to be transferred to the object variables list. The rotation takes place as it is transferred from the object frame to the world frame.
At the moment we can only show rotation by an angle oγ about the zw axis. This is because rotations oθ and oφ about the other axes would try to display the rear side of the monolith. This cannot be done because of the way the polygon filling routine is set up to expect polygons in the screen frame to have an anticlockwise connected edge list. The rear side has this order reversed and in trying to cope with this the routine draws garbage. Normally the rear side of an object is not visible and would be dealt with in that way. As yet we do not have the capability to test for visibility. This is done in Chapter 7. If it were desired to show the back of the monolith it would have to be entered in the data as a separate object in a back-to-back arrangement.
The program shows the rotation of the A monolith about the zw axis in the world frame through the range of angles 0° to 360° in 10° steps. You can alter the angular increment between frames and the displacement (Oox,Ooy,Ooz) to see what effect these have. For very large objects it is a good idea to have a small window: only a small fraction of the object then actually gets drawn, so speed is maintained without losing the impression of size. This explains why many games have a very small window, which is the only part that needs to be redrawn each frame, surrounded by a large static control panel which is drawn only once at the beginning.
6.4.2. data_03.s
The rotation transform uses the sines and cosines of the angles oθ, oφ and oγ. For a program operating in Basic these would be calculated to many significant digits using a series approximation. There is no time for that here. We have to resort to the method used before hand calculators were invented: tables. The table in this file contains the sines of all the angles between 0° and 90° in 1° increments, each multiplied by the factor 16384, which is 2^14. The reason for this is straightforward. It shifts the binary point 14 places and allows us to work in units of 1/16384, so that products can be determined to high accuracy. However it must be remembered that at the end of the calculation of a new coordinate the result must be divided by 16384 to restore it to its correct size. There is no point in knowing the final coordinate to greater accuracy than plus or minus 1, since this is the smallest increment which can be displayed on the screen. Also, if the trigonometric functions were not multiplied by 16384, all products would fall in the range 0 to 1 and, rounded to whole numbers in binary, would be approximated by one or other of these values, giving either zero or the same result for all products. The point of choosing 2^14 as a factor is that it can be introduced or removed very quickly by 14 left or right shifts. Greater accuracy could be obtained using a larger factor, but 16384 is quite adequate for our purposes providing steps are taken to correct for errors where they occur.
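The scheme can be sketched in a few lines of Python (for illustration only; data_03.s holds the same scaled values as words):

```python
import math

SCALE = 1 << 14                     # 16384 = 2^14

# the sine table of data_03.s: 0..90 degrees, scaled by 2^14
sintable = [round(math.sin(math.radians(a)) * SCALE) for a in range(91)]

def fixmul(a, b):
    """Product of two 2^14-scaled quantities, rescaled back to 2^14."""
    return (a * b) >> 14

# rotating the point x=100, y=0 by 30 degrees about z:
# x' = x*cos30 - y*sin30, with the division by 16384 left to the end
x, y = 100, 0
c30, s30 = sintable[60], sintable[30]   # cos 30 = sin 60
xp = (x * c30 - y * s30) >> 14
```

Note that xp comes out as 86 rather than 86.6: accuracy of plus or minus 1 is all the screen can show anyway.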
For greatest speed it makes most sense to have separate tables for both sines and cosines. This is not done here mainly to illustrate how the symmetry of sines and cosines allows any value in the entire range 0° to 360° to be calculated from the range 0° to 90°. The time to do this is very small compared, for example, to the time taken to actually fill the polygon, but for greater speed separate tables should be used.
6.4.3. core_03.s
The first part of the subroutine here uses the lookup table in data_03.s to find the sines and cosines of the angles used in the rotation, ready for use in the transform matrix. This uses the result that the sine or cosine of any angle in the range 0° to 360° can be found from that of an equivalent angle in the range 0° to 90°. Finding this equivalent angle is what the start of the first part is all about.
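The folding of the full circle onto the first quadrant can be sketched as follows (an illustrative Python version of what sincos does; the table is rebuilt here so the sketch is self-contained, and the modulo operation stands in for the routine's subtract-360 test):

```python
import math

SCALE = 1 << 14
table = [round(math.sin(math.radians(a)) * SCALE) for a in range(91)]

def sincos(angle):
    """Return (sin, cos) of an integer angle in degrees, both scaled
    by 2^14, using only the 0..90 degree table."""
    a = angle % 360                  # the routine subtracts 360 instead
    if a <= 90:
        return table[a], table[90 - a]
    if a <= 180:                     # second quadrant: cos goes negative
        return table[180 - a], -table[a - 90]
    if a <= 270:                     # third quadrant: both negative
        return -table[a - 180], -table[270 - a]
    return -table[360 - a], table[a - 270]   # fourth: sin negative
```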
In the second part, the matrix is constructed and then used to transform the object coordinates by matrix multiplication, as was done in the earlier perspective transform. Although only the rotation about the z axis is varied in this example, the matrix can handle rotations about all three axes as described above. At the end of the rotational transform, the displacements Oox, Ooy and Ooz are added to place the object at the desired location in the world frame.
6.4.4. bss_03.s
New variable lists.
* otranw.s
* Simple rotations for Chapter 6
*
*SECTION TEXT
opt d+
bra main
include systm_00.s
include core_03.s important subroutines
main bsr set_up allocate memory etc.
* transfer all the data
move.w my_npoly,d7 no. of polygons
move.w d7,npoly pass it
subq.w #1,d7 the counter
move.w d7,d0 save it
lea my_nedges,a0 source
lea snedges,a1 destination
lea my_colour,a2 source
lea col_list,a3 destination
loop0 move.w (a0)+,(a1)+ transfer edge nos.
move.w (a2)+,(a3)+ transfer colours
dbra d0,loop0
* Calculate the number of vertices altogether
move.w d7,d0 restore count
lea my_nedges,a6
clr d1
clr d2
loop1 add.w (a6),d1 no more than this
add.w (a6)+,d2 total number of vertices
addq #1,d2 and last one repeated each time
dbra d0,loop1
* Move the edge list
subq #1,d2 the counter
lea my_edglst,a0 source
lea sedglst,a1 destination
loop2 move.w (a0)+,(a1)+ pass it
dbra d2,loop2
* and the coords list
move.w d1,oncoords
subq #1,d1 the counter
lea ocoordsx,a1
lea my_datax,a0
lea ocoordsy,a3
lea my_datay,a2
lea ocoordsz,a5
lea my_dataz,a4
loop3
move.w (a0)+,(a1)+
move.w (a2)+,(a3)+
move.w (a4)+,(a5)+
dbra d1,loop3
* and the window limits
move.w my_xmin,clp_xmin ready
move.w my_xmax,clp_xmax for
move.w my_ymin,clp_ymin clipping
move.w my_ymax,clp_ymax
* place it in the world frame
move.w #300,Ooy 300 in the air
move.w #200,Ooz 200 in front
clr.w Oox dead centre
* initialise for rotation
clr.w otheta init angles
move.w #50,ophi tilt it up 50 degrees
clr.w ogamma
clr.w screenflag 0=screen 1 draw, 1=screen 2 draw
* Start the rotation about zw axis (can't rotate about others
* or we'll see back of it).
loop5 move.w #360,d7 a cycle
loop4
move.w d7,ogamma next angle gamma
move.w d7,-(sp) save the angle
tst.w screenflag screen 1 or screen2?
beq screen_1 draw on screen 1,display screen2
bsr drw2_shw1 draw on screen 2, display screen1
clr.w screenflag and set the flag for next time
bra screen_2
screen_1:
bsr drw_shw2 draw on 1, display 2
move.w #1,screenflag and set the flag for next time
screen_2:
bsr otranw rotational transfers
* pass on the new coords
move.w oncoords,d7
move.w d7,vncoords
subq.w #1,d7
lea wcoordsx,a0
lea wcoordsy,a1
lea wcoordsz,a2
lea vcoordsx,a3
lea vcoordsy,a4
lea vcoordsz,a5
loop6 move.w (a0)+,(a3)+
move.w (a1)+,(a4)+
move.w (a2)+,(a5)+
dbra d7,loop6
* Complete the picture
bsr perspective perspective
bsr polydraw finish the picture
move.w (sp)+,d7
sub.w #10,d7 reduce the angle by 10 degrees
bgt loop4 next angle
bra loop5 or repeat the cycle
bra main this could go on forever
*SECTION DATA
include data_01.s
include data_03.s
*SECTION BSS
include bss_03.s
END
* data_03.s
* A sine lookup table
*
* table of sines from 0 to 90 degrees in increments of 1 degree
* multiplied by 2^14 (16384). Used to find the sine or cosine
* of any angle
sintable:
dc.w 0,286,572,857,1143,1428,1713,1997,2280,2563,2845,3126
dc.w 3406,3686,3964,4240,4516,4790,5063,5334,5604,5872,6138
dc.w 6402,6664,6924,7182,7438,7692,7943,8192,8438,8682,8923
dc.w 9162,9397,9630,9860,10087,10311,10531,10749,10963,11174
dc.w 11381,11585,11786,11982,12176,12365,12551,12733,12911
dc.w 13085,13255,13421,13583,13741,13894,14044,14189,14330
dc.w 14466,14598,14726,14849,14968,15082,15191,15296,15396
dc.w 15491,15582,15668,15749,15826,15897,15964,16026,16083
dc.w 16135,16182,16225,16262,16294,16322,16344,16362,16374
dc.w 16382,16384
include data_02.s the perspective transfers
*****************************************************************************************
* Core_03.s (subroutines for chapter six). *
*****************************************************************************************
* sincos - returns the sine and cosine of a given angle
* otranw - transforms obj coords to world coords.
*****************************************************************************************
include core_02.s
* The sine and cosine of an angle are found. The sintable covers the positive quadrant *
* 0-90 degrees and can be used to generate any sin or cos in the range 0 - 360 degrees *
* d1=angle in degrees. Returns sin in d2; cos in d3.
sincos
lea sintable,a5
cmp #360,d1 test angle >= 360
bmi less360
sub #360,d1 make it less than 360
less360
cmp #270,d1 test angle >= 270
bmi less270
bsr over270
rts
less270
cmp #180,d1 test angle >= 180
bmi less180
bsr over180
rts
less180
cmp #90,d1
bmi less90
bsr over90
rts
less90
add d1,d1 *2 for offset into table
move.w 0(a5,d1.w),d2 get sine
subi #180,d1 cos(angle)=sin(90-angle)
neg d1 offset into table for cosine
move.w 0(a5,d1.w),d3 cosine
rts
over270
subi #360,d1
neg d1 360-angle
add d1,d1 table offset
move.w 0(a5,d1.w),d2 get sine
neg d2
subi #180,d1 cos(angle)=sin(90-angle)
neg d1 offset into table for cosine
move.w 0(a5,d1.w),d3 cosine
rts
over180
subi #180,d1
add d1,d1 table offset
move.w 0(a5,d1.w),d2 get sine
neg d2
subi #180,d1 cos(angle)=sin(90-angle)
neg d1 offset into table for cosine
move.w 0(a5,d1.w),d3 cosine
neg d3
rts
over90
subi #180,d1
neg d1 180-angle
add d1,d1 table offset
move.w 0(a5,d1.w),d2 get sine
subi #180,d1 cos(angle)=sin(90-angle)
neg d1 offset into table for cosine
move.w 0(a5,d1.w),d3 cosine
neg d3
rts
******************************************************************************************
* The subroutines for transforming object coords to world coords. *
* Includes rotations given by otheta, ophi and ogamma about the world axes wx,wy,wz and *
* a displacement Oox, Ooy, Ooz relative to the world origin. *
* Part 1. Construct the matrix for the rotations. *
******************************************************************************************
* Convert object rotation angles and store for rotation matrix.
otranw
move.w otheta,d1
bsr sincos
move.w d2,stheta
move.w d3,ctheta
move.w ophi,d1
bsr sincos
move.w d2,sphi
move.w d3,cphi
move.w ogamma,d1
bsr sincos
move.w d2,sgamma
move.w d3,cgamma
* construct transform matrix otranw. (all elements end up scaled by 2^14)
lea stheta,a0
lea ctheta,a1
lea sphi,a2
lea cphi,a3
lea sgamma,a4
lea cgamma,a5
lea o_wmatx,a6 matrix
* do element OM11
move.w (a3),d0 cphi
muls (a5),d0 cphi*cgamma
lsl.l #2,d0
swap d0 /2^14
move.w d0,(a6)+ OM11
* do OM12
move.w (a3),d0 cphi
muls (a4),d0 cphi*sgamma
neg.l d0
lsl.l #2,d0
swap d0 /2^14
move.w d0,(a6)+ OM12
* do OM13
move.w (a2),(a6)+ sphi
* do OM21
move.w (a1),d0 ctheta
muls (a4),d0 ctheta*sgamma
move.w (a0),d1 stheta
muls (a2),d1 stheta*sphi
lsl.l #2,d1
swap d1
muls (a5),d1 stheta*sphi*cgamma
add.l d1,d0 stheta*sphi*cgamma + ctheta*sgamma
lsl.l #2,d0
swap d0
move.w d0,(a6)+
* do OM22
move.w (a1),d0 ctheta
muls (a5),d0 ctheta*cgamma
move.w (a0),d1 stheta
muls (a2),d1 stheta*sphi
lsl.l #2,d1
swap d1
muls (a4),d1 stheta*sphi*sgamma
sub.l d1,d0 ctheta*cgamma - stheta*sphi*sgamma
lsl.l #2,d0
swap d0
move.w d0,(a6)+
* do OM23
move.w (a0),d0 stheta
muls (a3),d0 stheta * cphi
neg.l d0
lsl.l #2,d0
swap d0
move.w d0,(a6)+
* do OM31
move.w (a0),d0 stheta
muls (a4),d0 stheta*sgamma
move.w (a1),d1 ctheta
muls (a2),d1 ctheta*sphi
lsl.l #2,d1
swap d1
muls (a5),d1 ctheta*sphi*cgamma
sub.l d1,d0 stheta*sgamma - ctheta*sphi*cgamma
lsl.l #2,d0
swap d0
move.w d0,(a6)+
* do OM32
move.w (a0),d0 stheta
muls (a5),d0 stheta*cgamma
move.w (a1),d1 ctheta
muls (a2),d1 ctheta*sphi
lsl.l #2,d1
swap d1
muls (a4),d1 ctheta*sphi*sgamma
add.l d1,d0
lsl.l #2,d0
swap d0
move.w d0,(a6)+
* do OM33
move.w (a1),d0 ctheta
muls (a3),d0 ctheta*cphi
lsl.l #2,d0
swap d0
move.w d0,(a6)+
*****************************************************************************************
* PART 2: transform object coords to world coords. matrix elements are 2^14 and must be *
* adjusted when we're finished.
move.w oncoords,d7 number
ext.l d7 any to do ?
beq otranw3
subq.w #1,d7 adjust counter for dbra
lea ocoordsx,a0
lea ocoordsy,a1
lea ocoordsz,a2
lea wcoordsx,a3
lea wcoordsy,a4
lea wcoordsz,a5
exg a3,d3 save address (not enough a regs!)
link a6,#-6 stack frame of 3 words
otranw1
moveq.l #2,d6 3 rows in the matrix
lea o_wmatx,a3 point at matrix
* calculate the next wx,wy and wz
otranw2
move.w (a0),d0 ox
move.w (a1),d1 oy
move.w (a2),d2 oz
muls (a3)+,d0 ox*MI1
muls (a3)+,d1 oy*MI2
muls (a3)+,d2 oz*MI3
add.l d1,d0
add.l d2,d0
lsl.l #2,d0
swap d0
move.w d0,-(a6) save it
dbra d6,otranw2 repeat for three elements
move.w (a6)+,d0
add.w Ooz,d0 add displacement
move.w d0,(a5)+ becomes wz
move.w (a6)+,d0
add.w Ooy,d0
move.w d0,(a4)+ becomes wy
exg a3,d3 restore wx, save matrix pointer
move.w (a6)+,d0
add.w Oox,d0
move.w d0,(a3)+ becomes wx
exg a3,d3 save wx restore matrix pointer
addq.l #2,a0 point to next ox
addq.l #2,a1 oy
addq.l #2,a2 oz
dbra d7,otranw1 repeat for all coords
unlk a6
otranw3
rts
* bss_03.s
*
include bss_02.s
* Object frame variables
otheta ds.w 1 rotation of object coords about wx
ophi ds.w 1 ditto wy
ogamma ds.w 1 ditto wz
ocoordsx ds.w 200 vertex x coords
ocoordsy ds.w 200 ditto y
ocoordsz ds.w 200 ditto z
oncoords ds.w 1 number
Oox ds.w 1 object origin x in world frame
Ooy ds.w 1 ditto y
Ooz ds.w 1 ditto z
* World frame variables
wcoordsx ds.w 200
wcoordsy ds.w 200
wcoordsz ds.w 200
* Variables for the o_w transform
o_wmatx ds.w 9 the matrix elements
* General
screenflag ds.w 1 0 display screen 1, 1 for screen 2
stheta ds.w 1 trig functions of current angle
ctheta ds.w 1
sphi ds.w 1
cphi ds.w 1
sgamma ds.w 1
cgamma ds.w 1
7. Hidden Surfaces and Illumination
A computer is a fast number cruncher, but it doesn’t know anything about the real world. When it comes to conveying simple everyday experiences like not being able to see through solid opaque objects, the computer is a real loser. There are no codes in the processor instruction set which allow us to easily convey such information. It seems obvious to us that the rear sides of opaque objects are not visible and that an opaque object will obscure those behind it. Making the computer show this simple fact of life is hard work. It is called the hidden surface problem and it is the basis of some very time-consuming algorithms in computer graphics.
For any micro without dedicated graphics hardware, this becomes a severe problem since the burden of computation falls on the main processor, and of necessity therefore, any strategy we adopt to deal with hidden surfaces cannot be too time-consuming. As a consequence, the geometry of the objects themselves cannot be so complex as to require a time-consuming hidden surface algorithm. The simplest solution is to require that all polyhedra be convex, i.e. each surface polygon looks outward and not towards another polygon. It is possible to deal with simple polyhedra which are not convex but we shall only consider ones which are convex. It is always possible to construct complex objects out of several convex polyhedra and the strategy then is to draw the furthest ones first and the nearest ones last. This is the so-called ‘painter’s’ algorithm, by which objects in the background are naturally obscured by those in the foreground. More of this later.
The procedure for deciding whether a surface is visible combines naturally with the calculation to decide how brightly it is illuminated by a distant light source, a necessary attribute if the object is to look real. Surfaces which face towards the light source must be brighter than those which face away. We shall combine both of these into a single algorithm in this chapter.
7.1. Hidden Surface Removal
In the simple strategy for convex polyhedra adopted here, deciding whether a surface is visible requires a substantial amount of vector algebra (which can be minimised by precalculating certain surface parameters). The procedure is straightforward: a polygonal surface is visible if it faces the view point. The problem is how to convert the word “faces” into a mathematical expression. This is done in the following way.
Each surface has associated with it a vector which points out at right angles from the surface so that the polyhedron as a whole looks like a porcupine. All such vectors have the same length, which is chosen to be unity. They are called surface normal unit vectors. The only difference between two unit vectors is their direction, which reflects the different directions in which the surfaces face as shown in Figure 7.1. Of course, for the purposes of calculation, 1 is not a useful size for a vector and so it is multiplied by the factor 16384 (2^14). This keeps quantities within word size and makes multiplication and division simple.
To see whether a surface is visible from the view point now consists of testing whether its unit vector is in the same or opposite direction to a vector (the view vector) drawn from the viewpoint to the surface. There is a basic vector product which performs this test. It is called the scalar or dot product. Appendix 6 explains products involving vectors. In the language of mathematics, where the view vector is V and the surface normal vector is n, the scalar product will yield a positive result if the surface is hidden and a negative result if it is visible:
hidden: scalar product V.n is positive
visible: scalar product V.n is negative.
The scalar product itself is really nothing more than the distance from the view point to the surface times the cosine of the angle between the view vector and the surface normal. The sign of the product naturally follows therefore from the fact that the cosine of an angle less than 90° is positive whereas the cosine of an angle between 90° and 180° is negative. Figure 7.2 shows the directions of the vectors for a visible and a hidden surface. All this is very satisfactory except for one thing; the surface normal unit vector must be calculated and that is not so simple. Here the unit vector is calculated in view frame coordinates.
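In code the whole test is one scalar product and a sign check. A Python sketch of the rule as stated (the coordinates and names are our own, for illustration only):

```python
def dot(u, v):
    """Scalar (dot) product of two 3D vectors."""
    return u[0]*v[0] + u[1]*v[1] + u[2]*v[2]

def is_visible(viewpoint, vertex, normal):
    """A surface faces the view point when the view vector (from the
    view point to a vertex of the surface) and the outward surface
    normal give a negative scalar product."""
    view = (vertex[0] - viewpoint[0],
            vertex[1] - viewpoint[1],
            vertex[2] - viewpoint[2])
    return dot(view, normal) < 0
```

A surface straight ahead whose normal points back at the viewer passes the test; the same surface with its normal pointing away fails it.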
As a brief digression, it’s worth mentioning that the test for visibility can be done without any reference to vector products. The way that data lists have been set up, with the list of edge connections of a polygon going clockwise when viewed from the front, can be used to give a simple test for visibility. When converted to screen coordinates by the perspective transform, visible polygons have their edge list going anticlockwise. Projected polygons with clockwise screen edge lists will therefore have come from polygons facing away from the screen and which should be hidden. A test for this can easily be constructed.
We choose to use the scalar product here because the normal unit vectors, once calculated, can also be used to determine the level of illumination of each surface.
7.2. Calculating the Surface Normal Unit Vector
The procedure to calculate the normal unit vectors requires quite a lot of vector algebra and time consuming multiplications. It can be minimised by working out some relevant quantities beforehand and storing the data in a list in the usual way. In fact the normal vectors themselves could be completely worked out in the object frame and transformed together with the vertices at each stage. There are substantial advantages to doing it this way.
Instead, we choose to calculate the vectors in view frame coordinates because of the way it fits in nicely with the evolution of our program and the tutorial objective of the book. The particular vector product which allows us to calculate the normal vector is called a cross product. It’s more difficult to understand than the scalar product but it’s precisely what we want. Appendix 5 also covers this topic.
A vector product is illustrated in Figure 7.3 for a single polygon. Going round the perimeter of the polygon, the first two edges we meet are from vertices 1 to 2 and 2 to 3. Let us call the vectors associated with these edges A12 and A23. The normal vector B is then calculated as the cross product between them:
B = A23 x A12.
This shorthand notation is all fairly meaningless until translated into a set of mathematical operations. The x, y and z components of A12 and A23 are:
A12x = x2 - x1, A12y = y2 - y1, A12z = z2 - z1
A23x = x3 - x2, A23y = y3 - y2, A23z = z3 - z2
and the components of B are:
Bx = A12z.A23y - A12y.A23z
By = A12x.A23z - A12z.A23x
Bz = A12y.A23x - A12x.A23y
These multiplications constitute the bulk of the calculation.
There is one final step. What we want is the unit vector. The vector B is in the right direction but its size is too large. To get the unit vector, each of the components must be divided by the magnitude of B. This provides an additional chore because the magnitude of B is calculated from:
B = sqrt(Bx^2 + By^2 + Bz^2)
which requires taking a square root. How this is done is explained in the example program.
Once the magnitude B has been calculated, the components of the unit vector are
bx = Bx/B, by = By/B, bz = Bz/B.
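Putting the steps together, a floating-point Python sketch (illustrative only; the assembler version works throughout in 2^14-scaled integers and uses its own square root routine, described with the example programs):

```python
import math

def unit_normal(p1, p2, p3):
    """Surface normal unit vector from the first three vertices of a
    polygon, using B = A23 x A12 as in the text."""
    a12 = (p2[0] - p1[0], p2[1] - p1[1], p2[2] - p1[2])
    a23 = (p3[0] - p2[0], p3[1] - p2[1], p3[2] - p2[2])
    bx = a12[2]*a23[1] - a12[1]*a23[2]
    by = a12[0]*a23[2] - a12[2]*a23[0]
    bz = a12[1]*a23[0] - a12[0]*a23[1]
    mag = math.sqrt(bx*bx + by*by + bz*bz)   # Pythagoras in 3D
    return (bx/mag, by/mag, bz/mag)
```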
After this the line-of-sight vector (view vector) from the view point to the first vertex of the surface in the edge list is found and the scalar product taken with the normal vector. On the basis of this test, the surface is either flagged as hidden or else its level of illumination is calculated. We discuss illumination next.
7.3. Illumination and Colour
It is possible to employ the most elaborate computations to construct geometrically accurate 3D models, and yet the attributes which make them look real may be very subtle and less obvious. In sprite graphics, the shadow on the ground which follows the motion of a projectile is a small but essential clue to its altitude. In 3D, one of the easiest and most dramatic ways to add realism to a model is illumination by a light source. Facets which face the light source are more brightly illuminated than those which face away. As the object changes its orientation, so the changes in illumination give additional visual clues to its shape and structure. This is what we shall try to simulate next. There are limitations to what can be achieved on the Amiga, not so much a consequence of software constraints, but mainly resulting from the way colour is implemented in the colour palette. The way in which illumination is determined is very similar to the way visibility is tested for, but in this case an actual number must be generated, depending on the angle of the surface to the light source.
The direction of the beam of light emanating from a light source is specified by a vector, called the illumination vector. It would be possible to simulate a diverging or converging beam by having this vector change its direction across the field of illumination, but for simplicity the beam is taken to be parallel. Consequently a single vector is sufficient to define the direction of the beam. Likewise, the intensity of the light is taken to be constant everywhere. These approximations are valid for a distant light source such as the Sun, but the difference for a near light source is hardly noticeable. This illumination vector is also a unit vector (i.e. it has a magnitude of unity).
Because we have already calculated the surface normal unit vectors, everything is set up to find the level of illumination of each facet on the surface. Figure 7.4 illustrates the calculation. It is nothing more than the scalar product of the illumination vector and the normal vectors. This is a realistic calculation since the level of illumination does depend on the cosine of the angle between the two vectors.
There is one minor modification we will use in the calculation. Consider how the earth is illuminated by the Sun: the side which faces the Sun is brightly lit but the side which faces away would be pitch black if it weren’t for the reflected light of the Moon (forgetting the light from the stars). In a room a single light source is sufficient to illuminate everything, though much of this is backreflected light from the walls and all the objects in the room. This is the basis of the Radiosity method of illumination calculation which is used in very advanced graphics to simulate realism to a high degree. We can incorporate a very rudimentary version of this into our method, using the scalar product to set an illumination level even where it is negative, so there is some illumination even on the dark side of objects.
Here then is the method in outline: for each surface, take the scalar product of the illumination vector with the normal unit vector; since both vectors are of magnitude 1, this will yield a result between +1 (minimum illumination) and -1 (maximum illumination). If you’re confused by the sign, remember that in our geometry the illumination vector points away from the light source. Since in our method all unit vectors are multiplied by 2^14 (16384), the scalar product will actually yield a result somewhere in the range -2^28 to +2^28. Adding 2^28 to this result and dividing by 2^24 (by right shifting) reduces this to the range 0 to 32. This result can then be used to index 32 different colour shades. How this is done requires a brief explanation of the colour table again.
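The arithmetic of that outline can be sketched as follows (Python for illustration; both vectors carry the 2^14 scale factor):

```python
SCALE = 1 << 14    # unit vectors carry this factor

def illumination_level(ill_vec, normal):
    """Map the scalar product of two 2^14-scaled unit vectors, which
    lies in the range -2^28..+2^28, onto a shade level 0..32
    (0 = brightest, facing the light; 32 = darkest)."""
    d = (ill_vec[0]*normal[0] + ill_vec[1]*normal[1]
         + ill_vec[2]*normal[2])
    return (d + (1 << 28)) >> 24
```

A surface whose normal points straight back at the light gives level 0, one facing directly away gives 32, and one edge-on to the beam gives 16.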
7.3.1. The Colour Table
In low resolution 32 different colours can be displayed simultaneously out of a possible 4096. This selection of 32 is called the colour table or palette. There are tricks to exceed 32 for the screen as a whole by changing the colour palette frequently whilst a picture is being drawn (during the horizontal blank, for example). We will use the basic 32. For what follows Figure 7.5 will be of assistance. The standard palette settings are listed in the file data_01.s.
Basically, a colour is made by combining red, green and blue, each in any one of sixteen intensities. This means there are 16x16x16 = 4096 possible combinations. At any one time 32 of these 4096 colours can be displayed on the screen simultaneously. Why 32? Because there are 5 colour planes in low resolution, as we have seen in Chapter 3, and each plane is represented by a bit, so that up to 32 combinations are available. The 5-bit value of the colour is used to index a ‘pot’ in the colour palette which contains the colour word.
All that remains is to find out how to generate the colour word in the palette from the red, green and blue settings. In fact a colour word reads directly as its red, green and blue settings when written in hexadecimal. A setting of $0fff (white) means red = $f, green = $f and blue = $f. If you want to write them in decimal, the recipe is:
colour value = 256*(red setting) + 16*(green setting) + 1*(blue setting)
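The recipe is just hexadecimal digit packing, as a one-function sketch shows:

```python
def colour_word(red, green, blue):
    """Pack 0..15 intensity settings into a colour word $0RGB."""
    return 256 * red + 16 * green + 1 * blue
```

For example, colour_word(15, 15, 15) gives $0fff, white.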
The chosen colours must then be loaded into the palette. That is what is done in the example program. For our purposes, in order to simulate lighting, the colours will be different shades of the same colour. There is obviously going to be a trade off here. With a maximum of 32 colours the following combinations are considered here:
mode 0 32 shades of one colour
mode 1 16 shades of 2 colours
mode 2 8 shades of 4 colours
mode 3 4 shades of 8 colours
The one we will use is mode 2, though the others are catered for in the software.
7.4. Example Programs
The example programs show the A monolith in rotation with hidden surface removal and illumination. The program is set up with rotation about the x axis but this can be altered as desired. The monolith is coloured in red and blue but can be changed to green and white by changing its intrinsic colours as described below. It is also good fun to set up alternative palettes in different colours following the colour recipe, above.
7.4.1. illhide.s
This is the control program. It still uses the data for the A monolith to display it rotating about any or all three of the object frame axes. Because we now have hidden surface removal, it doesn’t matter if the angles become large enough to display the back. Nothing will be displayed because the back is hidden. The program is set for rotation about the x-axis of the object frame.
The colour palette has been set up to use 7 shades of blue and 8 shades of red, 8 of green and 8 of white. The first colour in the palette has the value 0 which is black and is used by the system to provide the background. The shading mode is flexible and is set up by means of a key, called illkey, which has a value equal to the mode number, above. The program is set up in mode 2.
7.4.2. core_04.s
This calculates surface normal vectors, determines whether a surface is visible and if so calculates the level of illumination and the final palette colour as outlined in the text. Because of the limitations of word multiplication in the calculation of normal vectors, objects are restricted to linear dimensions of less than about 200.
First of all the surface normal vectors are calculated as described above. In the subroutine nrm_vec the normal vector is converted to a unit vector by dividing each of its components by the magnitude of the vector. The magnitude is calculated by Pythagoras’ theorem in 3D and requires a square root operation which is done in the subroutine sqrt by an iterative process.
The square root algorithm works in the following way. Suppose the square root of a number, N, is known approximately; call it sqrt1. Then a better approximation, sqrt2, can be found by dividing the number by sqrt1, adding the result to sqrt1 and halving, i.e.
sqrt2 = (sqrt1 + N/sqrt1)/2.
sqrt2 is a better approximation than sqrt1. Then, starting with sqrt2, an even better approximation, sqrt3, can be found in the same way. Each of these recalculations is called an iteration. Starting with a modest approximation, only three iterations are needed in the routine to calculate a square root accurate to 1 part in 2^16, i.e. as accurate as a word will allow.
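This averaging iteration (Newton’s method) can be sketched in Python; the starting guess taken from the highest set bit mirrors what the sqrt routine in the listing does, though the helper itself is only an illustration:

```python
def isqrt(n: int) -> int:
    """Integer square root by the averaging iteration sqrt2 = (sqrt1 + N/sqrt1)/2."""
    if n == 0:
        return 0
    # modest starting guess: 2 raised to half the position of the highest set bit
    guess = 1 << (n.bit_length() // 2)
    for _ in range(3):                 # three iterations, as in the routine
        guess = (guess + n // guess) // 2
    return guess
```

With a guess this close, three iterations are enough for word-sized inputs; an arbitrary-precision version would iterate until the value stops changing.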
The line-of-sight vector used to determine visibility in vis_ill is taken from the view point to the first vertex on a surface. There is no ambiguity here since, at the point where a surface just ceases to be visible, all its vertices give a line-of-sight vector perpendicular to the surface normal. The illumination vector is specified by its components ill_vecx, ill_vecy and ill_vecz, each multiplied by 2^14 for accuracy, as usual.
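The visibility test amounts to checking the sign of a scalar product. A sketch of the idea in Python (floating point here; the listing works in scaled word arithmetic):

```python
def visible(normal, vertex, viewpoint):
    """A surface faces the observer when the line-of-sight vector
    (from viewpoint to a surface vertex) makes a negative scalar
    product with the outward surface normal."""
    sight = tuple(v - p for v, p in zip(vertex, viewpoint))
    dot = sum(n * s for n, s in zip(normal, sight))
    return dot < 0
```

For example, a face whose normal points back towards a viewpoint on the negative z-axis gives a negative product and is drawn; the opposite face gives a positive product and is culled.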
If a surface is invisible, the illumination is set to the value $20, which set_colr later treats as a hidden-surface flag. Otherwise the intrinsic colour, 0, 1, 2 or 3 in mode 2 (the mode used here), is combined with the shading to produce a number to index the colour palette. This is a tricky calculation and is best understood by following the algorithm through.
Specifically, let’s look at the case when the colour is 0 or 1, so that the shades from 1 to 7 (blue) (0 is reserved for black, the background) or from 8 to 15 (red) are selected. The actual shading level then fixes which colour in the group is chosen, with the lightest being 1 (blue) and 8 (red) and the darkest being 7 (blue) and 15 (red).
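The combination of shade and intrinsic colour can be modelled as follows. This is a sketch of the set_colr logic, not a transcription of it; the function name and signature are invented for illustration:

```python
def palette_index(illum: int, colour: int, illkey: int = 2) -> int:
    """Combine a 5-bit illumination level (0..31) with an intrinsic
    colour (0..3 in mode 2) to give a palette index, as in set_colr."""
    if illum > 0x1f:
        return 0x20                      # hidden-surface flag passes through
    shade = illum >> illkey              # shift down to shades per group
    base = colour << (5 - illkey)        # group base: 0, 8, 16 or 24 in mode 2
    index = base + shade
    return index if index > 0 else 1     # never index 0, the background
```

So in mode 2, colour 1 with the darkest shading gives index 8 + 7 = 15, the darkest red, while colour 0 at full light would give 0 and is nudged to 1 to avoid the background.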
7.4.3. data_04.s
This contains the illumination vector components, which in this example define a light source shining from right to left in the view frame. This is clearly no good in general since the light source should be fixed in the world frame and transformed like everything else to the view frame.
Following this are the intrinsic colours (blues, reds, greens or greys in this case) corresponding to the four possibilities, 0, 1, 2 or 3 in mode 2. The colours for the palette are listed in hexadecimal as explained above.
7.4.4. bss_04.s
Additional variables.
* illhide.s
* A program illustrating illumination and hidden surface removal
*
*SECTION TEXT
opt d+
bra main
include systm_00.s
include core_04.s illumination, hidden surface removal
main bsr init set up memory and new palette, etc
* transfer all the data from my lists to program lists
bsr transfer
* place it in the world frame
move.w #0,Oox on the ground
move.w #100,Ooz 100 in front
clr.w Ooy dead centre
* Initialise angles for rotation
clr.w otheta
move.w #50,ophi tilt it forward
clr.w ogamma
* Initialize screens
clr.w screenflag 0=screen 1 draw, 1=screen 2 draw
* Start the rotation about the xw axis
loop5 move.w #360,d7 a cycle
loop4 move.w d7,otheta next theta
move.w d7,-(sp) save the angle
tst.w screenflag screen 1 or screen 2?
beq screen_1 draw on screen 1, display screen2
bsr drw2_shw1 draw on screen 2, display screen1
clr.w screenflag and set the flag for next time
bra screen_2
screen_1:
bsr drw1_shw2 draw on 1,display 2
move.w #1,screenflag and set the flag for next time
screen_2:
bsr otranw object-to-world transform
* pass on the new coords
move.w oncoords,d7
move.w d7,vncoords
subq.w #1,d7
lea wcoordsx,a0
lea wcoordsy,a1
lea wcoordsz,a2
lea vcoordsx,a3
lea vcoordsy,a4
lea vcoordsz,a5
loop6 move.w (a0)+,(a3)+
move.w (a1)+,(a4)+
move.w (a2)+,(a5)+
dbra d7,loop6
* Test for visibility and lighting
bsr illuminate if it is visible find the shade
* Complete the drawing
bsr perspective perspective
bsr polydraw finish the picture
move.w (sp)+,d7
sub.w #10,d7 decrement in 10 degree steps
bgt loop4
bra loop5
* SECTION DATA
include data_01.s
include data_03.s
include data_04.s
* SECTION BSS
include bss_04.s
END
*****************************************************************************************
* Core_04.s
*
*****************************************************************************************
include core_03.s
illuminate
calc_nrm
move.w npoly,d7
beq nrm_out
subq #1,d7 counter
lea vcoordsx,a0
lea vcoordsy,a1
lea vcoordsz,a2
lea sedglst,a3
lea snedges,a4
lea snormlst,a5
* calculate the surface normal unit vectors
next_nrm
move.l a5,-(sp) save pointer to normals list
move.w (a3),a5 first vertex of next surface
move.w 2(a3),a6 second vertex
add a5,a5 *2 for offset
add a6,a6 again
move.w 0(a0,a6.w),d1 x2
sub.w 0(a0,a5.w),d1 x2-x1 = A12x
move.w 0(a1,a6.w),d2 y2
sub.w 0(a1,a5.w),d2 y2-y1 = A12y
move.w 0(a2,a6.w),d3 z2
sub.w 0(a2,a5.w),d3 z2-z1 = A12z
move a6,a5
move.w 4(a3),a6 third vertex
add a6,a6 *2 for offset
move.w 0(a0,a6.w),d4 x3
sub.w 0(a0,a5.w),d4 x3-x2 = A23x
move.w 0(a1,a6.w),d5 y3
sub.w 0(a1,a5.w),d5 y3-y2 = A23y
move.w 0(a2,a6.w),d6 z3
sub.w 0(a2,a5.w),d6 z3-z2 = A23z
movea.w d2,a5 save
muls d6,d2
movea.w d3,a6 save
muls d5,d3 ditto
sub.l d2,d3 Bx
move.l d3,-(sp) save to stack
move.w a5,d2 restore
move.w a6,d3 restore
movea.w d3,a5 save
muls d4,d3
movea.w d1,a6
muls d6,d1
sub.l d3,d1 By
move.l d1,-(sp) save it
move.w a6,d1 restore
* last component  no need to save values
muls d5,d1
muls d4,d2
sub.l d1,d2 Bz
move.l d2,-(sp) save it
movem.l (sp)+,d4-d6 Bx in d6, By in d5 and Bz in d4
nrm_cmpt
lsr.l #2,d4 /4 to prevent overspill
lsr.l #2,d5
lsr.l #2,d6
move.w d4,d0
move.w d5,d1
move.w d6,d2
move.l d7,-(sp) save
bsr nrm_vec calculate unit vectors bx, by, bz
move.l (sp)+,d7 restore
move.w d0,d4
move.w d1,d5
move.w d2,d6
move.l (sp)+,a5 restore pointer to normals list
move.w d6,(a5)+ save nx
move.w d5,(a5)+ save ny
move.w d4,(a5)+ save nz
move.w (a4)+,d0 num vertices in this surface
add #1,d0 edge list always repeats the first
add d0,d0 *2 for offset
adda.w d0,a3 adjust pointer to next surface
dbra d7,next_nrm do all surfaces
nrm_out
vis_ill
* Find visibility and level of illumination of surface by taking the scalar
* product of the surface normal vector with the line of sight vector from viewpoint
* and illumination respectively.
move.w npoly,d7
subq.w #1,d7
lea vcoordsx,a0
lea vcoordsy,a1
lea vcoordsz,a2
lea sedglst,a4
lea snedges,a3
lea snormlst,a5
lea slumlst,a6
move.w ill_vecx,d0
move.w ill_vecy,d1
move.w ill_vecz,d2
* line of sight vector is taken between the first vertex on the surface and viewpoint
next_ill
move.w (a4),d6 1st point on next surface
add d6,d6 for offset
move.w 0(a0,d6.w),d3 line of sight x cmpnt, xLs
move.w 0(a1,d6.w),d4 yLs
move.w 0(a2,d6.w),d5 z
sub.w vwpointz,d5 zLs: viewpoint lies on zv axis
muls (a5),d3 nx*sx
muls 2(a5),d4 ny*sy
muls 4(a5),d5 nz*sz
add.l d4,d3
add.l d5,d3 scalar product
bmi visible negative if surface visible
* it is hidden
move.w #$20,(a6)+ set illumination for hidden
ill_tidy
addq.w #6,a5 update normals pointer
move.w (a3)+,d5 current num edges
addq #1,d5 first vertex is repeated
add d5,d5 2 bytes per word
adda.w d5,a4 update edge list pointer
dbra d7,next_ill
bra set_colr
* The surface is visible so find illumination level.
visible
move.w d0,d3 copy illum vector
move.w d1,d4
move.w d2,d5
muls (a5),d3 nx*illx
muls 2(a5),d4 ny*illy
muls 4(a5),d5 nz*illz
add.l d4,d3
add.l d5,d3 -2^28 < scalar prod < +2^28
add.l #$10000000,d3 0 < scalar prod < 2^29
move.w #24,d4
lsr.l d4,d3
cmp.w #$1f,d3 keep in range 0 to $1f
ble vis_1 correct
move.w #$1f,d3 for
bra ill_save errors
vis_1
cmp.w #0,d3
bge ill_save
clr d3
ill_save
move.w d3,(a6)+ save it
bra ill_tidy next ...
*
*
set_colr
move.w npoly,d7
subq.w #1,d7
move.w illkey,d0 how many shades per colour
lea slumlst,a0 levels of illumination
lea srf_col,a1
lea col_lst,a2 colour for display
move.w #5,d6
sub.w d0,d6 5-illkey
next_col
move.w (a0)+,d1 next illumination
cmp.w #$1f,d1 is it hidden
ble set_col no
move.w #$20,(a2)+ it is, set flag
addq.l #2,a1 point to next intrinsic colour
bra set_next
set_col
lsr.w d0,d1 divide by 1, 2 or 4
move.w (a1)+,d2 the intrinsic colour
rol.b d6,d2 base = 0; or 0,16; or 0,8,16,24
add.w d1,d2 illumination + colour base
bgt pass_col
move.w #1,d2 avoid background
pass_col
move.w d2,(a2)+ = final colour
set_next
dbra d7,next_col
rts
*****************************************************************************************
transfer
move.w my_npoly,d7
move.w d7,npoly
subq.w #1,d7 counter
move.w d7,d0
lea my_nedges,a0
lea snedges,a1
lea intr_col,a2 intrinsic colours
lea srf_col,a3 program intrinsic colours
loop0
move.w (a0)+,(a1)+ transfer edge numbers
move.w (a2)+,(a3)+ transfer intrinsic colours
dbra d0,loop0
* calculate the number of vertices altogether
move.w d7,d0
lea my_nedges,a6
clr d1
clr d2
loop1
add.w (a6),d1
add.w (a6)+,d2
addq #1,d2
dbra d0,loop1
* move the edge list
subq #1,d2 counter
lea my_edglst,a0
lea sedglst,a1
loop2
move.w (a0)+,(a1)+
dbra d2,loop2
* and the coords list
move.w d1,oncoords
subq.w #1,d1
lea ocoordsx,a1
lea my_datax,a0
lea ocoordsy,a3
lea my_datay,a2
lea ocoordsz,a5
lea my_dataz,a4
loop3
move.w (a0)+,(a1)+
move.w (a2)+,(a3)+
move.w (a4)+,(a5)+
dbra d1,loop3
* and the window limits
move.w my_xmin,clp_xmin
move.w my_xmax,clp_xmax
move.w my_ymin,clp_ymin
move.w my_ymax,clp_ymax
rts
*****************************************************************************************
* normalise a vector: unormalised components in d0,d1,d2
* return normalised components
nrm_vec
* save the component squares
move d0,d3
move d1,d4
move d2,d5
muls d0,d0
muls d1,d1
muls d2,d2
* sum of squares
add.l d1,d0
add.l d2,d0
* calculate the magnitude
bsr sqrt
* multiply the components by 2^14
move.w #14,d7
ext.l d3
ext.l d4
ext.l d5
lsl.l d7,d3
lsl.l d7,d4
lsl.l d7,d5
* divide by magnitude to derive normalised components
divs d0,d3
divs d0,d4
divs d0,d5
* return normalised components
move.w d3,d0
move.w d4,d1
move.w d5,d2
rts
*****************************************************************************************
* Find the sqrt of a long word N in d0 in three iterations: sqrt2 = (sqrt1 + N/sqrt1)/2
* Approximate starting value found from the highest set bit in d0. Result passed in d0.w
sqrt
tst.l d0
beq sqrt2 quit if zero
move.w #31,d7 31 bits to examine
sqrt1
btst d7,d0 is this bit set?
dbne d7,sqrt1
lsr.w #1,d7 bit is set: 2^d7/2 approx root
bset d7,d7 raise 2 to this power
move.l d0,d1
divs d7,d1 N/sqrt1
add d1,d7 sqrt1 + N/sqrt1
lsr.w #1,d7 /2 gives new trial value
move.l d0,d1 N
divs d7,d1
add d1,d7
lsr.w #1,d7 second result
move.l d0,d1
divs d7,d1
add d1,d7
lsr.w #1,d7 final result
move.w d7,d0
sqrt2
rts
*****************************************************************************************
******************************************************************************************
* data_04.s
******************************************************************************************
include data_03.s
ill_vecx dc.w 100
ill_vecy dc.w 16384 ;LIGHT SHINING FROM +Y TO -Y
ill_vecz dc.w 0
vwpointz dc.w 100
illkey dc.w 2
intr_col dc.w 0,1,0,0,1,1
OTHER_PALETTE EQU 1 ;to use with illumination
*****************************************************************************************
* bss_04.s *
*****************************************************************************************
include bss_03.s
* VARIABLES FOR SURFACE ILLUMINATION AND COLOUR
snormlst ds.w 100
slumlst ds.w 40
srf_col ds.w 40
8. General Transforms in 3D
In this chapter we investigate a number of transforms of various kinds involved in the manipulation of 3D structures.
8.1. Geometric Transforms
Combinations of simple rotations and displacements are extensively used in the construction of a complex scene consisting of several graphics primitives in different locations and with different orientations. Besides these instance transforms, there are other more exotic distortions that can be used. Structures can be manipulated in a variety of ways:
rotation - a change of orientation,
shear - distortion,
scaling - a change in size,
reflection - replacement by a mirror image,
inversion - inside out and back to front.
In general, any 3x3 matrix will produce a combination of scaling and shear. In the special case that there is no change in volume, what results is a pure rotation. Sometimes shears with fixed (simple) matrix elements are used to simulate rotation by fixed angles. The first three of these transforms are illustrated in this chapter, with input and control from the keyboard and joystick.
Transformations of these kinds are easily implemented using matrices and several of them can be combined by concatenation (multiplication) of the individual matrices prior to actually transforming the points. Where a large number of points is concerned, this saves a lot of time compared to performing each transform separately. An example of this is shown in the programs.
8.1.1. Rotations
When the joystick is moved or a key is pressed we want to see a corresponding rotation on the screen. In principle, doing this is very simple. For example, a movement of the joystick to the left could cause a positive rotation about the x-axis and a movement to the right could cause a negative rotation. Other joystick movements could produce rotations about other axes. The matrices for simple rotations about the x, y, and z axes have all been listed in Chapter 6.
Following each movement of the joystick, a new set of object vertices could be generated by multiplying the old vertices by the appropriate rotation matrix. In this way the results of the previous rotation would be used as the starting point for the next. The problem with doing this is that errors in the accuracy with which binary arithmetic is done in the transformations accumulate from frame to frame and eventually reduce the picture to chaos. A solution to this problem is to redraw the object each time from a reference position (like the object frame) with information stored in a set of “signposts” (unit vectors again) which have been continuously rotated with the object to keep up with joystick movements. Then the object is only transformed once each time. This method is essential in the viewing transform when the observer is moving freely. This is discussed extensively in the next chapter.
Alternatively, there is a simple way to implement rotations, but with a motion determined by a scheme similar to that involving lines of longitude and latitude, where rotations about the y and x axes are added up separately and finally put together at the end. In this scheme, several movements of the joystick (say) may have taken place, both left or right (rotation about the x-axis) and up and down (rotation about the y-axis), in any order, but only the separate totals are recorded. A single movement of the joystick may correspond to a 1° increment in that direction.
As an example, suppose the total rotation about the y-axis is 40° and the total rotation about the x-axis is 83°. Then the overall rotation is taken to be a single rotation about the y-axis of 40° followed by a single rotation about the x-axis of 83°. Note that this isn’t the same as rotating about the x-axis first and the y-axis second, which gives a different result. That the order matters is a peculiar property of rotations; and since rotations can be written as matrices, the order of multiplication matters for matrices too.
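The non-commutativity is easy to demonstrate. A small sketch using the coordinate-rotation matrices of Chapter 6 (floating point here, for clarity; the helper names are ours):

```python
from math import cos, sin, radians

def matmul(A, B):
    """Product of two 3x3 matrices held as row-major lists of lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rot_x(t):
    c, s = cos(t), sin(t)
    return [[1, 0, 0], [0, c, s], [0, -s, c]]

def rot_y(p):
    c, s = cos(p), sin(p)
    return [[c, 0, -s], [0, 1, 0], [s, 0, c]]

# y by 40 degrees first, then x by 83 (rightmost matrix acts first) ...
a = matmul(rot_x(radians(83)), rot_y(radians(40)))
# ... is not the same as x first, then y
b = matmul(rot_y(radians(40)), rot_x(radians(83)))
```

Comparing corresponding elements of `a` and `b` shows they differ, so the two orders produce genuinely different orientations.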
Doing a rotation about the y-axis first, followed by a rotation about the x-axis, does provide a recipe for always arriving at the same orientation every time. This is just like finding a position on the globe uniquely using circles of longitude and latitude. The first rotation about the y-axis gives the angle of latitude, and the second rotation about the x-axis gives the angle of longitude. This results in a simple scheme to orientate an object but, as we will see, the joystick response seems strange since what happens on the screen depends on the total current angles of rotation.
If this seems confusing then consider the complementary scheme of leaving the object stationary and moving the observer to different orientations at some fixed distance from the object. This is what has been done in the example program in this chapter. Figure 8.1 illustrates what is going on in the world frame. You can imagine a long pole, AB, between the object and the observer, with the observer looking down the pole towards the object. The rotations which take place change the orientation of the pole. In the example program, movement of the joystick left or right changes θ and movement up or down changes φ. We are now dealing with things the other way round to just rotating the object.
The observer is at the angles shown in the figure and we have to find out what he/she sees. As drawn, the observer is closest to the vertex C and sees it pretty well head-on, so in the observer’s reference frame (where the pole is horizontal) things appear as in Figure 8.2. How can this view be constructed from knowing only the angles θ and φ, and the distance AB? Like most problems involving rotations it is easier than it looks and has a lot to do with the complementary nature of geometric (moving the object) and coordinate (moving the observer) transforms, which are discussed extensively in Appendix 6.
The problem is solved by finding what rotations of the line AB about the world axes bring it back into line with the zw axis. The sequence of rotations to do this is:
1. rotate about xw by (-θ), bringing it into the xw-zw plane,
2. rotate about yw by (-φ), bringing it along the zw axis,
(3. rotate about zw by (-γ) to make xw the “up” direction).
This last step is put in parentheses since it is not actually implemented in the program, i.e. there is no “twist” of the observer involved.
If this sequence of rotations is actually applied to the object with the viewer fixed in position along the world frame zw axis, then the overall result is the same. This is precisely what is done in the example program. The sequence of rotations which must be applied to the object about its centre are, in order (remember that the one on the right acts first):
(  cosγ   sinγ   0 )   ( cosφ   0  -sinφ )   ( 1    0      0    )
( -sinγ   cosγ   0 )   (  0     1    0   )   ( 0   cosθ   sinθ  )
(   0      0     1 )   ( sinφ   0   cosφ )   ( 0  -sinθ   cosθ  )
which when multiplied (concatenated) out give the single matrix whose elements appear in the program. After transforming all the vertices with this matrix, all that remains to do is to add on the distance AB (also called Ovz) to each z coordinate.
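In the listings the sines, cosines and matrix elements are held as words scaled by 2^14; after a muls the 32-bit product carries a scale of 2^28, and the code rescales it with lsl.l #2 followed by swap (shift left two, then take the high word). A Python model of that step, for positive operands (negative values need the signed arithmetic the 68000 does natively):

```python
SCALE = 1 << 14   # sines and cosines are stored times 2^14

def fixmul(a: int, b: int) -> int:
    """Model of the 68000 sequence: muls, lsl.l #2, swap, move.w.
    Takes two 2^14-scaled positive words, returns their 2^14-scaled product."""
    product = a * b                           # scaled by 2^28, fits a long word
    product = (product << 2) & 0xffffffff     # lsl.l #2 -> scale 2^30
    return (product >> 16) & 0xffff           # swap + move.w: high word, scale 2^14

# 0.5 * 0.5 = 0.25 in 2^14 fixed point
assert fixmul(SCALE // 2, SCALE // 2) == SCALE // 4
```

The same rescaling appears after every multiplication in wtranv_1 and wtranv_2 below.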
We will use this particularly simple transform to the observer’s reference frame again in Chapter 10, in a flight simulator, where the angles (called Euler angles) can easily be related to joystick movement. This is fine if you don’t mind the restriction in the way the angles are defined; in general, more freedom may be desired.
8.1.2. Scaling
Scaling is very straightforward. It simply makes the object larger or smaller. The scale change occurs independently along the three axes. For a general scale change, with different scale factors, a, b and c, along the three axes the transformation matrix is
( a  0  0 )
( 0  b  0 )
( 0  0  c )
If both b and c are unity and a is greater than unity, then the resulting distortion is a stretch along the x axis. This is what is implemented in the example program. It is shown in Figure 8.3.
8.1.3. Shear
A shear distortion has the effect of displacing one face relative to its opposite. In the simplest case, one of the coordinates is increased in proportion to one of the others. If x increases in proportion to z, the matrix is:
( 1  0  1 )
( 0  1  0 )
( 0  0  1 )
and both y and z remain unchanged. This is illustrated in Figure 8.4 and included in the example program.
If x increases in proportion to both y and z the distortion becomes more exotic. This is shown in Figure 8.5 and also included in the example program. The matrix is
( 1  1  1 )
( 0  1  0 )
( 0  0  1 )
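Applying these matrices to a point makes their effect concrete. A sketch (the names and the stretch factor of 2 are arbitrary illustrations, not the program’s values):

```python
def apply(matrix, point):
    """Apply a row-major 3x3 matrix to a column vector (x, y, z)."""
    return tuple(sum(matrix[i][j] * point[j] for j in range(3)) for i in range(3))

XSHEAR  = [[1, 0, 1], [0, 1, 0], [0, 0, 1]]   # x' = x + z
YSHEAR  = [[1, 1, 1], [0, 1, 0], [0, 0, 1]]   # x' = x + y + z
STRETCH = [[2, 0, 0], [0, 1, 0], [0, 0, 1]]   # a = 2: stretch along x

print(apply(XSHEAR, (1, 2, 3)))   # x picks up z; y and z unchanged
```

In each case only the x coordinate changes, which is why the distorted cube in the figures keeps its top and bottom faces parallel.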
8.2. Instance Transforms
Up till now, although motion has been 3dimensional, the only structure displayed has been the flat A monolith. Now, six such monoliths are joined together to make an A cube.
Instance transforms are usually taken to mean those changes of orientation and position which set primitives in the world space and we use the term to describe the set of operations which construct the A cube. Once constructed, the cube can be used as a basis to illustrate the transforms we have been discussing.
To construct a cube in this way, a monolith is first laid down in the yw-zw plane and then successively rotated and displaced five more times to make up the other sides.
This is illustrated in Figure 8.6, where the sides are numbered. The angles of rotation and displacements of the six sides are in the lists inst_angles and inst_disp in the data file data_05.s, and are in the order θ, φ, γ and x, y, z.
8.3. Physical Realism
Physical objects have more subtle attributes than shape and colour. This is particularly evident when motion occurs. Real objects do not move instantaneously from one place to another, nor do they achieve their final velocity the instant motion begins. There is an acceleration period whilst the velocity builds up to its maximum value. Likewise a real object cannot reduce its speed to zero instantaneously. A period of deceleration is required. Acceleration and deceleration are both evidence of an additional attribute of a physical object, its inertia or mass. The mass of an object determines how rapidly it can be accelerated or brought to rest. In building realistic computer models of physical objects it is important to pay attention to these details. The role of the mass of a body in determining its motion is really summarised in Newton’s Laws of Motion. In essence, they say that if a body is acted on by a force it will accelerate in proportion to the force and, if there is no force, it remains at constant velocity (or at rest).
In the example programs, some attempt has been made to incorporate these laws by modelling joystick movements as applied forces. The result is that motion of the image does not follow immediately, but with an acceleration determined by its inertia. In addition, the effect of friction is incorporated so that if the applied force is removed the velocity drops to zero, and even when it is constantly applied there is a maximum to the velocity. In the programs, the motion is purely rotational but the same principles hold true.
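One frame of such a model might be sketched like this; the function name and the numbers are illustrative, not the program’s:

```python
MAX_RATE = 25   # maximum rotation rate (compare vtheta_inc in the listing)

def update_rate(rate: int, force: int) -> int:
    """One frame of the joystick-as-force model: the applied force
    accelerates the rotation rate, friction always removes 1 towards
    zero, and the rate saturates at a maximum."""
    rate += force            # acceleration proportional to applied force
    if rate > 0:
        rate -= 1            # built-in frictional slowing
    elif rate < 0:
        rate += 1
    return max(-MAX_RATE, min(MAX_RATE, rate))
```

With no force the rate decays by 1 per frame until the object coasts to a stop; with a constant force of 2 it climbs by a net 1 per frame until it reaches the cap, giving a terminal velocity.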
8.4. Input from the Keyboard and Joystick
To interact properly with the program the observer has to be able to alter the flow of the program. Otherwise, no matter how complex the program is, it is entirely deterministic. What this means is that, started from an initial condition, the end of the program is entirely determined. Even the so-called “random number generator” in the computer is deterministic, even though it has so many possible outcomes that it looks random. The real world seems to be random but nobody knows for sure since you can’t rerun history!
The simplest ways to interact with the program are to input data from the keyboard and mouse, but soon we’ll all be wearing headsets with stereoscopic viewers and tactile sensors. The age of Virtual Reality is upon us. When someone figures out how to connect directly to the brain things will really get interesting. Right now we’ll settle for input from the keyboard and joystick which, because there are specific registers in memory dedicated to the task, is straightforward.
8.4.1. Keyboard
There are several ways of reading the keyboard directly through library functions. We will not use any of them. Because we have direct access to the System Registers, we’ll go directly there for information.
Specifically we want to read the function keys F1 to F7. The information is in the byte at the address $bfec01. When one of these keys is pressed, the byte will be as follows:
KEY   F1   F2   F3   F4   F5   F6   F7
CODE  $51  $5d  $5b  $59  $57  $55  $53
All we have to do is read it and act accordingly.
8.4.2. Joystick
The joystick can also be read directly from an address in memory; in this case it is the word at $dff00c. Reading the joystick is a little more complicated than the keyboard, but only because there are four possibilities: left, up, right or down.
It works this way:
BITS SET   only 8   only 8 and 9   only 0   only 0 and 1
DIRECTION  up       left           down     right
Inspection of the bits provides an easy test of which direction the joystick has been moved.
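A sketch of such a test, following the bit table above (an illustration only: the real joystick register encodes movement in a way that needs more careful decoding than this):

```python
def joy_direction(word: int) -> str:
    """Classify a joystick word by the bits set, per the table above."""
    b0 = (word >> 0) & 1
    b1 = (word >> 1) & 1
    b8 = (word >> 8) & 1
    b9 = (word >> 9) & 1
    if b8 and b9:
        return "left"
    if b8:
        return "up"
    if b0 and b1:
        return "right"
    if b0:
        return "down"
    return "centre"
```

Testing the pairs before the single bits matters: “only 8” and “8 and 9” overlap in bit 8, so the more specific pattern is checked first.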
8.5. Example Program
The program shows a cube, with the letter A written on each face, rotating under the control of the joystick. In addition, the cube can be subjected to shear and scaling transforms whilst the rotation takes place, by pressing the function keys as detailed below.
8.5.1. trnsfrms.s
This is the control program. After initializing variables, it reads the joystick and keyboard settings to choose the rate of rotation, viewing distance and whether a shear or scale change should take place. Both of these latter transforms are accompanied by a size reduction to keep wordsize variables within range.
Once input is complete, the cube is assembled, as yet unrotated and undistorted, in the world frame by the multiple object-to-world transform for all the sides. Following this, the distortion is concatenated with the viewing transform to produce the overall transform, which then converts the vertices for perspective projection.
8.5.2. core_05.s
Here are the new subroutines. The first part is concerned with constructing the rotation transform from the viewing angles vtheta, vphi and vgamma and then using it (after it is concatenated with the shear) to transform the vertices. Following this, the routines are concerned with reading the joystick and keyboard and making adjustments accordingly.
In order to simulate inertia, movements of the joystick are converted not to angles of rotation themselves but to increments to the angles of rotation, up to a maximum. These increments are added to the angles each time to give the total angles to rotate.
In addition, the increments are always decremented by 1 each time to give built-in frictional slowing down. The procedure to implement joystick alternatives uses a vector jump table to the various possible subroutines. This is an elegant way of avoiding testing for each possibility in a long list. The same technique is also used for keyboard input.
There are seven possible keyboard inputs concerned entirely with the function keys F1 to F7:
F1 move closer (continuously) to a minimum distance,
F2 move away (continuously),
F3 implement shear 1 (x increases with z, called x-shear),
F4 implement shear 2 (x increases with y and z, called y-shear),
F5 implement a stretch (y and z reduced by 1/2),
F6 stop movement (of F1 and F2),
F7 quit  reset the system.
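The vector jump technique mentioned above can be sketched in Python with a dictionary standing in for the jump table. The handler bodies are placeholders; the codes are the F-key codes tabulated earlier:

```python
# Handlers stand in for the subroutines the jump table vectors to.
def move_closer():  return "closer"    # F1
def move_away():    return "away"      # F2
def quit_program(): return "quit"      # F7

# The key code indexes the table directly; no chain of comparisons needed.
handlers = {
    0x51: move_closer,
    0x5d: move_away,
    0x53: quit_program,
}

def dispatch(code):
    handler = handlers.get(code)
    return handler() if handler else None   # unknown codes are ignored
```

On the 68000 the same idea is an array of subroutine addresses indexed by a value derived from the key code, so adding a new key means adding one table entry rather than another compare-and-branch.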
Input from F3, F4 and F5 is used to set the bottom three bits of a word-length flag, shearflg, in toggle fashion using the bit-change (bchg) instruction. This simply NOTs the appropriate bit to provide a record of whether the transform should be implemented. The routine which examines which flag bits are set also includes the option of combinations of them which are not actually used for anything, and can be used to try other transforms (providing products do not exceed word size in the concatenation).
Finally the shear and rotation matrices are multiplied to produce the overall transform to act on the cube.
8.5.3. bss_05.s
New variables for this chapter.
8.5.4. data_05.s
New data for this chapter. In particular note that the 3x3 matrices for the shears and stretch are arranged in column order to simplify the matrix concatenation routine.
*
* trnsfrms.s
* Various transforms
*SECTION TEXT
opt d+
bra main
include systm_00.s
include core_05.s motion of the view frame
main bsr init
* transfer all the data
bsr transfer
move.w oncoords,vncoords
move.w vncoords,wncoords
* Initialise dynamical variables
move.w #50,Ovx view frame initial position
move.w #0,Ovy
move.w #150,Ovz
* initialise rotation angles to zero
clr.w vtheta
clr.w vphi
clr.w vgamma
clr.w shearflg set flag to no shear
move.w #25,vtheta_inc initial rotation rates
move.w #25,vphi_inc
clr.w speed
clr.w screenflag 0=screen 1 draw, 1=screen 2 draw
loop4:
* Switch the screens each time round
tst.w screenflag screen 1 or screen 2?
beq screen_1 draw on screen 1, display screen2
bsr drw2_shw1 draw on screen 2, display screen1
clr.w screenflag and set the flag for next time
bra screen_2
screen_1:
bsr drw1_shw2 draw on 1, display 2
move.w #1,screenflag and set the flag for next time
screen_2:
* look for changes in the rotation angles
bsr joy_in
* check function keys for a shear or a change the speed
bsr key_in
* Adjust to new rotation angles and speed
bsr angle_update
bsr speed_adj
* Construct compound object from same face at different position
move.w nparts,d7 how many parts in the object
subq #1,d7
lea inst_angles,a0 list of angles for each part
lea inst_disp,a1 ditto displacements
* Do one face at a time
instance:
move.w d7,-(sp) save the count
move.w (a0)+,otheta next otheta
move.w (a0)+,ophi next ophi
move.w (a0)+,ogamma next ogamma
move.w (a1)+,Oox next displacements
move.w (a1)+,Ooy
move.w (a1)+,Ooz
movem.l a0/a1,-(sp) save position in list
bsr otranw object to world transform
bsr wtranv_1 construct the rotation transform
bsr shear concatenate with shear (if flag set)
bsr wtranv_2 and transform the points
bsr illuminate if it is visible find the shade
bsr perspective
bsr polydraw draw the face
movem.l (sp)+,a0/a1 restore pointers
move.w (sp)+,d7 for all the parts of the object
dbra d7,instance
bra loop4
*SECTION DATA
include data_00.s
include data_03.s
include data_05.s
*SECTION BSS
include bss_05.s
END
******************************************************************************************
* Core_05.s
* A set of subroutines for transforming world coords. Including rotations of vtheta
* vphi and vgamma about the x,y and z axes as well as x, y and z shears.
*
******************************************************************************************
include core_04.s
* The matrix for the rotations is constructed.
* convert rotation angles to sin & cos and store for rotation matrix.
wtranv_1
bsr view_trig find the sines and cosines
* construct transform matrix wtranv.
lea stheta,a0
lea ctheta,a1
lea sphi,a2
lea cphi,a3
lea sgamma,a4
lea cgamma,a5
lea w_vmatx,a6
* do element WM11
move.w (a3),d0 cphi
muls (a5),d0 cphi*cgamma
lsl.l #2,d0
swap d0
move.w d0,(a6)+ WM11
* do element WM12
move.w (a1),d0 ctheta
muls (a4),d0 ctheta*sgamma
move.w (a0),d1 stheta
muls (a2),d1 stheta*sphi
lsl.l #2,d1
swap d1
muls (a5),d1 stheta*sphi*cgamma
add.l d0,d1 stheta*sphi*cgamma + ctheta*sgamma
lsl.l #2,d1
swap d1
move.w d1,(a6)+
* do WM13
move.w (a0),d0 stheta
muls (a4),d0 stheta * sgamma
move.w (a1),d1 ctheta
muls (a2),d1 ctheta*sphi
lsl.l #2,d1
swap d1
muls (a5),d1 ctheta*sphi*cgamma
sub.l d1,d0 stheta*sgamma  ctheta*sphi*cgamma
lsl.l #2,d0
swap d0
move.w d0,(a6)+
* do WM21
move.w (a3),d0 cphi
muls (a4),d0 cphi*sgamma
lsl.l #2,d0
swap d0
neg d0
move.w d0,(a6)+
* do WM22
move.w (a1),d0 ctheta
muls (a5),d0 ctheta*cgamma
move.w (a0),d1 stheta
muls (a2),d1 stheta*sphi
lsl.l #2,d1
swap d1
muls (a4),d1 stheta*sphi*sgamma
sub.l d1,d0 ctheta*cgamma - stheta*sphi*sgamma
lsl.l #2,d0
swap d0
move.w d0,(a6)+
* do WM23
move.w (a0),d0 stheta
muls (a5),d0 stheta*cgamma
move.w (a1),d1 ctheta
muls (a2),d1 ctheta*sphi
lsl.l #2,d1
swap d1
muls (a4),d1 ctheta*sphi*sgamma
add.l d0,d1
lsl.l #2,d1
swap d1
move.w d1,(a6)+
* do WM31
move.w (a2),(a6)+ sphi
* do WM32
move.w (a3),d0 cphi
muls (a0),d0 cphi*stheta
lsl.l #2,d0
swap d0
neg d0
move.w d0,(a6)+
* do WM33
move.w (a1),d0 ctheta
muls (a3),d0 ctheta*cphi
lsl.l #2,d0
swap d0
move.w d0,(a6)+
rts
*****************************************************************************************
* PART 2: Transform the World coords to view coords.
wtranv_2
move.w wncoords,d7
ext.l d7 any to do?
beq wtranv3
subq.w #1,d7
lea wcoordsx,a0
lea wcoordsy,a1
lea wcoordsz,a2
lea vcoordsx,a3
lea vcoordsy,a4
lea vcoordsz,a5
exg a3,d3 save cos we're short of registers
link a6,#6 save 3 words
wtranv1
moveq.l #2,d6 3 rows in matrix
lea w_vmatx,a3 init matx pointer
* calculate the next wx, wy and wz
wtranv2
move.w (a0),d0 wx
move.w (a1),d1 wy
move.w (a2),d2 wz
sub.w #50,d0 wx-50
sub.w #50,d1 wy-50
sub.w #50,d2 wz-50
muls (a3)+,d0 wx*Mi1
muls (a3)+,d1 wy*Mi2
muls (a3)+,d2 wz*Mi3
add.l d1,d0
add.l d2,d0 wx*Mi1+wy*Mi2+wz*Mi3
lsl.l #2,d0
swap d0
move.w d0,-(a6) store on the stack frame
dbra d6,wtranv2 repeat for 3 elements
move.w (a6)+,d0
add.w Ovz,d0
move.w d0,(a5)+ becomes vz
move.w (a6)+,(a4)+
exg a3,d3 restore vx, save matx pointer
move.w (a6)+,d0
add.w #100,d0
move.w d0,(a3)+ becomes vx
exg a3,d3 save vx, restore matx pointer
addq.l #2,a0 point to next wx
addq.l #2,a1 wy
addq.l #2,a2 wz
dbra d7,wtranv1 repeat for all wcoords
unlk a6 close frame
wtranv3
rts
*
* Calculate the sines and cosines of the view angles
view_trig
move.w vtheta,d1 theta
bsr sincos
move.w d2,stheta sine
move.w d3,ctheta cosine
move.w vphi,d1
bsr sincos
move.w d2,sphi
move.w d3,cphi
move.w vgamma,d1 gamma
bsr sincos
move.w d2,sgamma
move.w d3,cgamma
rts
*
* Read jstick and update vars accordingly.
joy_in
move.w $dff00c,d0 read jstick register
* convert value to angle totals
angle_speed
btst #8,d0 up or left?
beq dwn_rt nope
btst #9,d0 left?
beq up
bra left
dwn_rt
btst #0,d0 down or right?
beq joy_out
btst #1,d0 right?
beq down
bra right
joy_out
rts
IFD JOY1
* set up the increments to angles, +/-10 is the limit
up
subq.w #2,vphi_inc
rts
down
addq.w #2,vphi_inc
rts
left
addq.w #2,vtheta_inc
rts
right
subq.w #2,vtheta_inc
rts
ENDC
IFD JOY2
up
move.w #350,vyangle
bsr rot_vy
rts
down
move.w #10,vyangle
bsr rot_vy
rts
left
move.w #10,vxangle
bsr rot_vx
rts
right
move.w #350,vxangle
bsr rot_vx
rts
ENDC
IFD JOY3
up
bsr rot_down
rts
down
bsr rot_up
rts
left
bsr rot_left
rts
right
bsr rot_right
rts
ENDC
IFD JOY4
up
move.w #-5,vphi_inc
rts
down
move.w #5,vphi_inc
rts
left
move.w #5,vtheta_inc
rts
right
move.w #-5,vtheta_inc
rts
ENDC
**************************************************************
angle_update
move.w vtheta_inc,d0
bmi vth_neg
beq chk_phi
subq.w #1,vtheta_inc
cmp.w #25,vtheta_inc
ble chk_phi
move.w #25,vtheta_inc
bra chk_phi
vth_neg
addq.w #1,vtheta_inc
cmp.w #-25,vtheta_inc
bge chk_phi
move.w #-25,vtheta_inc
chk_phi
move.w vphi_inc,d0
bmi vph_neg
beq chk_out
subq.w #1,vphi_inc
cmp.w #25,vphi_inc
ble chk_out
move.w #25,vphi_inc
bra chk_out
vph_neg
addq.w #1,vphi_inc
cmp.w #-25,vphi_inc
bge chk_out
move.w #-25,vphi_inc
chk_out
* update vtheta
move.w vtheta,d0 the previous angle
add.w vtheta_inc,d0 increase by increment
bgt thta_1 check it lies between 0 and 360
add #360,d0
bra thta_2
thta_1
cmp.w #360,d0
blt thta_2
sub #360,d0
thta_2
move.w d0,vtheta becomes the current angle
* update vphi
move.w vphi,d0
add.w vphi_inc,d0
bgt phi_1
add #360,d0
bra phi_2
phi_1
cmp.w #360,d0
blt phi_2
sub #360,d0
phi_2
move.w d0,vphi
rts
*****************************************************************************************
key_in
in_key
clr.w d0
move.b $bfec01,d0
cmp.b #$5f,d0
beq f1
cmp.b #$5d,d0
beq f2
cmp.b #$5b,d0
beq f3
cmp.b #$59,d0
beq f4
cmp.b #$57,d0
beq f5
cmp.b #$55,d0
beq f6
cmp.b #$53,d0
beq f7
rts
IFD JOY3
f1 bsr roll_left
rts
f2 bsr roll_right
rts
f3 move.w #2,speed
rts
f4 move.w #2,speed
rts
f5 move.w #3,speed
rts
f6 move.w #0,speed stop
rts
f7 move.w #QUIT,quitflag
rts
ELSE
f1 move.w #-1,speed reverse
rts
f2 move.w #1,speed forward
rts
f3 bchg.b #2,shearflag toggle x shearflag
rts
f4 bchg.b #1,shearflag toggle yshearflag
rts
f5 bchg.b #0,shearflag toggle z shearflag
rts
f6 move.w #0,speed stop
rts
f7 move.w #QUIT,quitflag
rts
ENDC
******************************************************************************************
* concatenate the shear with the rotation
shear
clr d0
move.b shearflag,d0 flag is lower 3 bits
and #$f,d0
* there are 8 possibilities, 111 - 000, xyz respectively
lea shear_jump,a0
lsl.w #2,d0 get offset
move.l 0(a0,d0.w),a0
jmp (a0)
shear_jump
dc.l null,z,y,user1,x,user2,user3,user4
null
rts
z
lea zshear,a0
lea w_vmatx,a1
bsr concat
rts
y
lea yshear,a0
lea w_vmatx,a1
bsr concat
rts
user1
rts
x
lea xshear,a0
lea w_vmatx,a1
bsr concat
rts
user2 rts
user3 rts
user4 rts
*
* Multiply two 3x3 matrices pointed to by a0 and a1
* order is (a1)x(a0) with result sent to temp store at (a2)
* (a0) is in column order while (a1) and (a2) are in row order, of word length elements.
* Finally (a2) is copied back to (a1).
concat
lea tempmatx,a2
move.w #2,d7 3 rows
conc1
move.w #2,d6
movea.l a0,a3 reset shear pointer
conc2
move.w (a1),d1
ext.l d1
lsr.l #1,d1
move.w 2(a1),d2
ext.l d2
lsr.l #1,d2
move.w 4(a1),d3
ext.l d3
lsr.l #1,d3
muls (a3)+,d1
muls (a3)+,d2
muls (a3)+,d3
add.w d2,d1
add.w d3,d1
move.w d1,(a2)+ next product element
dbra d6,conc2 do all elements in row
addq.w #6,a1 point to next row
dbra d7,conc1 for all rows
* transfer result back to rotation matrix
lea tempmatx,a0
lea w_vmatx,a1
move.w #8,d7 num elements - 1
conloop
move.w (a0)+,(a1)+
dbra d7,conloop
rts
* set the velocity components
speed_adj
move.w speed,d0
lsl.w #3,d0 scale it
move.w Ovz,d1
cmp.w #10,Ovz
bgt adj_out
move.w #10,Ovz
adj_out
add.w d0,Ovz
rts
******************************************************************************************
* bss_05.s
******************************************************************************************
include bss_04.s
* World Frame Variables
wncoords ds.w 1 num vertices in world frame
* View frame vars
vtheta ds.w 1 rotation of view frame about wx
vphi ds.w 1 wy
vgamma ds.w 1 wz
Ovx ds.w 1 view frame x origin in world frame
Ovy ds.w 1
Ovz ds.w 1
* General transform matrices
w_vmatx ds.w 9
tempmatx ds.w 9
* joystick
joy_data ds.w 1
* Dynamic vars
speed ds.w 1
vtheta_inc ds.w 1
vphi_inc ds.w 1
vgamma_inc ds.w 1
shearflag ds.w 1
quitflag ds.w 1
*****************************************************************************************
* Data_05.s
*****************************************************************************************
TRANSFORM EQU 1
include data_04.s
my_datax dc.w 100,100,0,0,20,90,20,15,45,45,15,55
dc.w 70,55,10,10,10,10,20,20
my_datay dc.w 0,100,100,0,15,60,87,25,40,65,74,46
dc.w 55,61,30,5,95,60,25,74
my_dataz dc.w 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
xshear dc.w 1,0,0,0,1,0,1,0,1
yshear dc.w 1,0,0,1,1,0,1,0,1
zshear dc.w 2,0,0,0,1,0,0,0,1
nparts dc.w 6
inst_angles dc.w 0,0,0,90,0,0,180,0,0,270,0,0,0,270,0,0,90,0
inst_disp dc.w 0,0,0,0,100,0,0,100,100,0,0,100,100,0,0,0,0,100
9. Flying Around The World
9.1. Introduction
A flight simulator? Well not exactly, but getting there.
In order to fully implement the simulation of independent motion of the observer, we require a little more vector algebra. The task is to construct a view of the world model from the point of view of an observer free to move in any direction. This is different from the simple procedure we used in the previous chapter. We now wish to operate a joystick and navigate our way through the assembly of objects constructed in the world frame. We want the view on the screen to move up or down when the joystick is pushed forward or pulled back and to move to the left or right when the joystick is moved to the right or the left. In other words, all of the motion on the screen must be relative to the observer’s current position. Even if the pilot of a plane is flying upside down, his perception of “up” is directed towards the roof of the cockpit, which as far as someone on the ground is concerned is “down”. What matters is that all of the movements corresponding to “up”, “down”, “left” and “right” apply to the observer’s reference frame, which we have called the view frame. Unlike rotation by Euler angles, which we used in the previous chapter, here we want the rotations to be about the view frame axes.
To be specific, let’s ask what we expect to happen when the joystick is pulled back. We expect to see the picture move vertically upwards, and this must always happen no matter what the orientation of the observer. Suppose we have got into the position where the aircraft, or whatever it is being controlled, is flying horizontally but with its wings vertically.
Figure 9.1 shows this orientation.
If the joystick is pulled back, object A will come into view at the top of the screen and object B will go out of view at the bottom of the screen. The view seen by the pilot of the plane is shown in Figure 9.2. Herein lies the problem. The pilot has a very definite perception of what is “up” and what is “down” at any given moment and, while this does not change in the cockpit, it is changing continuously with respect to the world outside. In the previous chapter it was easy to relate “up” to an increase of vphi and “left” to an increase in vtheta, but when referenced from the view frame all these motions depend on the orientation of the observer at any given instant.
There is more than one way of solving this problem. One method is to use control matrices to perform rotations of coordinates after they have been transformed to the view frame. The control matrices perform simple rotations about the view x, y and z axes. This method is employed in the next chapter. Another way is to keep a constant record of the position and orientation of the view frame in the world frame and to generate movements of the view frame resulting from movements of the joystick. This second method relies heavily on the notion of a set of view frame axes undergoing rotations and translations following the path of the observer. It also embodies the notion of rotation about an arbitrary axis that we would also like to introduce in this chapter which is very useful for performing rotations about any axis in the world frame.
We could, of course, decide to accept the limitations of Euler angles to fix the view frame orientation in the simpler orbital-like fashion. In Chapter 10, we show how a flight simulator works well by using each of these approaches.
9.2. Coordinate Transforms and Direction Cosines
Here’s a bit of maths. It’s not as hard as it looks.
If you know the coordinates of the vertices of an object in one reference frame and want to know what they are in another, it is necessary to do a coordinate transform. (Remember the other type of transform is called a geometric transform, which is what happens when the object itself is moved inside a single reference frame). If a point has coordinates (xw,yw,zw) in the world frame, it will have coordinates (xv,yv,zv) in the view frame. Thus the point A in Figure 9.3 has coordinates (0,0,50) in the world frame, and coordinates (0,50,0) in the view frame (what is seen on the screen has later to be worked out by means of the perspective transform). As far as rotations are concerned there is always a linear relation between these two sets of coordinates, and for this case we can write in general terms:
xv = n11.xw + n12.yw + n13.zw
yv = n21.xw + n22.yw + n23.zw
zv = n31.xw + n32.yw + n33.zw
where the n’s are numbers that remain to be worked out. This relation can also be written as a matrix product:
| xv |   | n11 n12 n13 |   | xw |
| yv | = | n21 n22 n23 | * | yw |
| zv |   | n31 n32 n33 |   | zw |
The n matrix is the transformation matrix. The elements n11, n12, etc., are specific to the relative orientation of the two reference frames and are called the direction cosines.
To see how the direction cosines are related to the geometry, look at Figure 9.4. The direction cosines are simply the cosines of the angles between the axes of the reference frames. It is quite hard to draw a comprehensive diagram which is not confusingly messy but, for example, n11 is the cosine of the angle between xv and xw, n12 is the cosine of the angle between xv and yw, n13 is the cosine of the angle between xv and zw and so on:
n11 = cos(a), n12 = cos(b), n13 = cos(c).
If these direction cosines can be found, the problem of converting world frame coordinates into view frame coordinates is solved. We are however still left with the problem of converting movements of the joystick into changes in the direction cosines. It is clear that we should solve the problem with a strategy that centres on the direction cosines. Here is one way it can be done.
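As a sketch of what this transform amounts to (the function and type names here are illustrative, not taken from the assembly listings), the matrix of direction cosines is simply applied to each world frame point:

```c
typedef struct { double x, y, z; } Vec3;

/* Apply the 3x3 direction cosine matrix n to convert world frame
   coordinates into view frame coordinates. */
static Vec3 world_to_view(double n[3][3], Vec3 w)
{
    Vec3 v;
    v.x = n[0][0]*w.x + n[0][1]*w.y + n[0][2]*w.z;
    v.y = n[1][0]*w.x + n[1][1]*w.y + n[1][2]*w.z;
    v.z = n[2][0]*w.x + n[2][1]*w.y + n[2][2]*w.z;
    return v;
}
```

As a check against the Figure 9.3 example, a view frame for which yv lies along zw and zv along -yw has the matrix rows (1,0,0), (0,0,1), (0,-1,0), and feeding it the world point (0,0,50) does indeed yield the view point (0,50,0).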
9.3. Base Vectors and Direction Cosines
Just for a moment let’s forget all about the maths. Let’s try to visualise what’s going on from the point of view of a second, stationary, independent observer at rest in the world frame and able to see both the world frame and the moving observer simultaneously. This is the point of view of a man on the ground watching a plane fly past. Think of the plane as the view frame but with the fuselage replaced by the zv axis, the wings replaced by the yv axis and the vertical tail wing in the direction of the xv axis. Although he is not in the plane, the stationary observer can calculate the view according to the pilot if he knows the position and orientation of the plane at any instant.
To see how that view would change when the pilot pulls back the joystick, for example, he has only to rotate the plane about the axis of the wings (the angle depends on how long the joystick is pulled back), which is a rotation about the yv axis. Since the plane is moving forward during the rotation this has the added complication of making it fly upwards. Like the stationary observer, we need to keep a continuous record of the position and orientation of the view frame as it flies around the world.
To do this, imagine three unit vectors in the directions of the view frame axes. In vector geometry these unit vectors are given a special name. They are called base vectors. At the very start of the program let us suppose that the view frame is positioned coincident with the world frame. This is equivalent to having a second set of world frame base vectors at the airfield from where the plane has taken off. (Actually it isn’t really necessary to have them start off coincident and in general they don’t, but it makes the argument easier to visualise).
Now at each stage of the subsequent motion it is necessary to record the position and orientation of the view frame unit vectors. It is not possible simply to keep a running total of how many degrees the plane rotated to the left (about vx) or up (about vy) since we have no way of knowing how to translate this information into the final orientation of the plane after many movements. In the method of Euler angles used in the previous chapter it was possible to keep a running total since the first angle referred to rotation about an axis of the static world frame. But now we are using angles referred to the view frame which is moving all the time.
Here comes the big question. Suppose we can keep a record of the positions of the view frame base vectors, what do they have to do with the original transform? The answer is very simple: the components in the world frame of the view frame base vectors are just the direction cosines that are the elements, n11 to n33, of the world-to-view transform matrix. In other words, where iv, jv and kv are the view frame base vectors and iw, jw and kw are the world frame base vectors, the relation between them is:
iv = n11.iw + n12.jw + n13.kw
jv = n21.iw + n22.jw + n23.kw
kv = n31.iw + n32.jw + n33.kw.
Or, writing the view frame base vectors in terms of their world frame components
     | n11 |        | n21 |        | n31 |
iv = | n12 | , jv = | n22 | , kv = | n32 |
     | n13 |        | n23 |        | n33 |
At the start of the motion, when the view frame and world frame axes were aligned, the view frame base vectors had components
     | 1 |        | 0 |        | 0 |
iv = | 0 | , jv = | 1 | , kv = | 0 |
     | 0 |        | 0 |        | 1 |
If we can keep a record of the view frame base vectors we therefore have the direction cosines immediately available to construct the view from the cockpit. The strategy is straightforward but there are some tricky problems to solve on the way.
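This bookkeeping is trivial to express in code. A minimal C sketch (names illustrative; the dircosines subroutine in core_06.s does the same copying in assembly): the world frame components of the base vectors are written straight in as the rows of the transform matrix.

```c
typedef struct { double x, y, z; } Vec3;

/* The rows of the world-to-view transform matrix are simply the
   world frame components of the view frame base vectors iv, jv, kv. */
static void build_transform(double n[3][3], Vec3 iv, Vec3 jv, Vec3 kv)
{
    n[0][0] = iv.x; n[0][1] = iv.y; n[0][2] = iv.z;
    n[1][0] = jv.x; n[1][1] = jv.y; n[1][2] = jv.z;
    n[2][0] = kv.x; n[2][1] = kv.y; n[2][2] = kv.z;
}
```

With the view frame aligned with the world frame, as at the start of the motion, this produces the identity matrix, as expected.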
9.4. Rotating the Base Vectors: Rotation About an Arbitrary Axis
The base vectors which fix the current orientation of the view frame depend on what movements have already taken place. Suppose at a given instant the view frame is oriented with its base vectors in the positions shown in Figure 9.5. The base vector of the vx axis, iv, has three components in the world frame n11, n12 and n13 (the other unit vectors jv and kv also have components but for clarity these are not shown in the diagram). Now suppose a movement of the joystick occurs corresponding to a rotation about the vy axis. To find the new components of iv and kv (jv remains unchanged in this rotation) we must rotate them about vy. The vy axis is the axis of rotation and is specified in the world frame by its direction cosines. But we are in luck! This problem has already been solved. It is known as rotation about an arbitrary axis. Since at this point yv can be pointing anywhere in the world frame, the axis is very arbitrary. In fact the solution to the problem is given in just the format most useful to us. It is in the form of a matrix for rotation by an angle about an axis specified by its direction cosines. Just the form we want. The transform can also be used for rotation about any other axis in the world frame. All that is required are the three direction cosines.
Once constructed, the rotation matrix can be multiplied by iv and kv to yield the new components of iv and kv, which then replace the old ones and are also used directly to construct the worldtoview transform (there is a catch, which we’ll discuss shortly).
For rotation by an angle 𝛿 about an axis with direction cosines n1, n2 and n3 (a single index on each cosine to show it can refer to any axis), the matrix is
| n1.n1+(1-n1.n1)cos(𝛿)      n1.n2(1-cos(𝛿))-n3.sin(𝛿)   n1.n3(1-cos(𝛿))+n2.sin(𝛿) |
| n1.n2(1-cos(𝛿))+n3.sin(𝛿)  n2.n2+(1-n2.n2)cos(𝛿)       n2.n3(1-cos(𝛿))-n1.sin(𝛿) |
| n1.n3(1-cos(𝛿))-n2.sin(𝛿)  n2.n3(1-cos(𝛿))+n1.sin(𝛿)   n3.n3+(1-n3.n3)cos(𝛿)     |
9.5. Accumulating Errors
Broadly speaking, all the ingredients required to steer the view frame through the world frame controlled by joystick movements are in place. Let us lay out the algorithm as it stands at the moment:

- movement of the joystick specifies a rotation of the view unit vectors about one of the view frame axes,
- construct the rotation matrix to rotate the other two unit vectors about this axis and replace them with their new components,
- use the components of the unit vectors, now called direction cosines, to construct the world-to-view transform,
- perform the transform and display the picture,
- and repeat the cycle.
This is all OK and it works. For a while.
Eventually it will lead to a degenerating picture, or worse a chaotic mess, because of accumulating errors. As it stands the program has a built-in pathological self-destruct. Because calculations are done in integer arithmetic, and sines and cosines are calculated to an accuracy no better than 1 in 16384, given enough transforms, large errors will accumulate in the unit vectors and, as a consequence, the world-to-view transform. In life nothing is perfect and this is a good example of that adage. In addition, the algorithm has feedback in that joystick movements are made on the basis of the picture on the screen that is generated, in turn, from the transform constructed from the joystick movements. This has all the ingredients necessary to create chaos, and so it does.
In order to beat the accumulation of errors, the cycle of error accumulation must be broken. This is achieved by regenerating the base vectors afresh each time. This requires more work but it solves the problem. Figure 9.6 shows the stages in the regeneration of the view frame unit vectors.
The vectors that matter most are kv, the one that points in the direction of motion, and iv, the pointer to the “up” direction. Without these two it is not possible to define either the direction of motion or which way is up as far as the pilot is concerned. Let’s suppose that, because of errors in the last transform, we have three unit vectors iv’, jv’ and kv’ which are slightly wrong. The errors will result in the base vectors not being at right angles to each other and not having size equal to unity. As a first step, the vector kv’ is normalised, i.e. its magnitude is made to be unity. It becomes kv. This at least ensures that if its direction is slightly wrong, its size isn’t. The only effect a slightly wrong direction will have is that the view will be slightly in error, but that hardly matters since the view is being constantly adjusted by the joystick anyway. Second, the vector cross product of kv and iv’ is taken in order to generate a new vector at 90 degrees to them both. A vector cross product has just this property (see Appendix 5). This new vector is in the direction that jv would have if it weren’t in error. The new vector is then normalised, i.e. its magnitude is made to be 1, and it becomes the new jv. Third, the vector cross product of the new kv and the new jv is taken, and normalised, in order to generate a new iv. In this way all three unit vectors are regenerated each frame and errors do not accumulate (it is interesting to remove the regeneration stage in the example program and watch the disintegration take place).
The components of the new unit vectors then become the components of the viewing transform matrix and the cycle is repeated.
The technical details are discussed as they appear in the example programs.
9.6. Clipping in 3D
No part of an object which lies behind the view plane (zv < 0) must be drawn. If this is attempted, the program will not crash but what appears on the screen will be garbage. This is because the polygon drawing routines expect to see the edge list of vertices go clockwise round the perimeter of a polygon and this will be wrong for polygons projected backwards onto the view plane. In addition, objects that lie too far from the view plane should not be drawn either. This is because nothing can be drawn smaller than a pixel, and very distant objects reduce to an incoherent cluster of pixels.
Besides these obvious cases, there is no point in wasting time on objects that lie too far outside the field of view. This field of view is bounded by the frustum (truncated pyramid) formed by the lines of sight from the view point to the viewport boundaries. This is illustrated in Figure 9.7.
In a more leisurely application it would be possible to clip polyhedra to the boundary of the frustum in a 3D generalisation of the way polygons have been clipped to the screen window. In this application that would be too time consuming. Here, the centre of symmetry (Oox,Ooy,Ooz) is used to locate objects in the field of view and the angle of the frustum is increased to lie beyond the screen limit. This means that some time is wasted drawing distant objects which cannot be seen, but objects that are close up are not abandoned the instant their centres pass beyond the field of view. They are marked as visible but only part will appear on the screen as a result of screen clipping.
The top and base of the frustum are called the hither and yon planes. In the example program they are defined by the equations zv=100 (hither) and zv=2000 (yon). The sides of the frustum of the field of view are defined (where the viewport centre coincides with the view frame origin) by the planes
zv + 100 = xv            side A
zv + 100 = -xv           side B
(1.2).(zv + 100) = yv    side C
(1.2).(zv + 100) = -yv   side D
but the actual sides used in the program extend beyond this limit, for reasons explained above, and are described by
8.(zv + 100) = ±xv   sides A and B
8.(zv + 100) = ±yv   sides C and D
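Put together, the whole visibility test on an object centre already transformed to view coordinates looks like this in C (names illustrative; the assembly version is the viewtest routine):

```c
#include <stdbool.h>

/* Accept an object centre (xv,yv,zv), in view coordinates, only if it
   lies between the hither (zv = 100) and yon (zv = 2000) planes and
   inside the deliberately widened sides 8.(zv + 100) = +/-xv, +/-yv. */
static bool in_frustum(long xv, long yv, long zv)
{
    if (zv < 100 || zv > 2000)
        return false;                  /* behind hither or beyond yon */
    long limit = 8 * (zv + 100);
    if (xv > limit || xv < -limit)
        return false;                  /* outside sides A and B */
    if (yv > limit || yv < -limit)
        return false;                  /* outside sides C and D */
    return true;
}
```

An object rejected here is simply not drawn that frame; one accepted near the edge may still be partly cut away later by ordinary 2D screen clipping.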
9.7. Velocity of the Observer
The observer (you) does not only use the joystick to do rotations. The observer also has a velocity that may be changing as time passes. To include velocity, all that has to be done is to increment the observer’s position in the world frame in proportion to the velocity. The velocity is a vector, so it has direction as well as size - speed is the magnitude of the velocity. The procedure is to change each component of the observer’s position, each frame, by an amount proportional to the speed times the relevant component of the base vector kv.
In other words, if the view frame is pointing only in the direction of the zw axis, only Ovz should be incremented each time. On the other hand if the view frame is pointing along the xw axis, only Ovx should be incremented each time. For anything between, Ovx, Ovy and Ovz should be incremented in proportion to the components of kv in those directions. This ensures that the direction in which the observer is looking is the direction of motion. The details are explained in the example program.
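In outline, the per-frame update amounts to the following C sketch (names illustrative; the assembly version is vel_adj, and the components of kv are held as fixed-point numbers scaled by 2^14 as in the listings):

```c
typedef struct { short x, y, z; } FVec3;

/* Advance the view frame origin (Ovx, Ovy, Ovz) each frame by speed
   times the components of the base vector kv, which are fixed-point
   fractions scaled by 2^14. */
static void vel_adj(short *Ovx, short *Ovy, short *Ovz,
                    const FVec3 *kv, short speed)
{
    *Ovx += (short)(((long)speed * kv->x) >> 14);
    *Ovy += (short)(((long)speed * kv->y) >> 14);
    *Ovz += (short)(((long)speed * kv->z) >> 14);
}
```

For example, looking straight along zw means kv = (0, 0, 16384), so a speed of 3 advances Ovz by 3 units per frame and leaves Ovx and Ovy alone.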
9.8. Example Programs
In this program it is possible to fly round the A cube. The program starts with the cube at midscreen and with the observer stationary. Pressing F2 causes the view frame to move towards the cube at constant speed (pressing F1 causes it to retreat). Thereafter motion is controlled by the joystick. It is possible to fly past the cube and then do an about turn to return to it. Because of 3D clipping, the cube is not displayed if it comes closer than 100 or is farther away than 2000 or is outside the field of view (see above). Motion can be stopped by pressing F6 and the program aborted by pressing F7.
9.8.1. wrld_vw.s
This is the control program. Much of it is similar to that of the previous chapter. It draws an A cube that can be flown around under the control of the joystick. This time the joystick performs rotations about the axes of the view frame, i.e. the pilot. When the joystick is pulled back the viewer looks upwards into the world and if there is forward motion he/she follows a rising trajectory. Other motions of the joystick produce corresponding motion as if the viewer were flying through the world frame. In this way it is possible to fly past an object and then sweep through an arc to return to it.
The program follows the sequence described above. First the view frame base vectors are initialized. Following this the joystick is read and immediately the view frame unit vectors in the world frame are rotated. Then the keyboard is read to see if the speed has changed. Following this the new position of the view frame in the world frame is calculated from the speed and the view frame zaxis base vector kv which is now pointing along the new direction of motion. In motion that is not in a straight line, the velocity is changing all the time (the velocity is a vector and so it can change if its direction changes even if its size, the speed, doesn’t). Finally the unit vectors are themselves regenerated to avoid accumulating errors and passed on directly as the elements of the worldtoview transform before drawing the picture of the A cube.
The function keys F1 and F2 are reverse and forward respectively. F6 is stop and F7 resets the system. Be careful to press the keys lightly and not hold them down since the keyboard buffer is not cleared between frames.
There are no subtleties such as inertia in the motion but these could be incorporated along the lines described in the previous chapter.
9.8.2. core_06.s
Here is where all the work is done. The subroutine dircosines regenerates the base vectors and passes the new values to the viewing transform matrix. To do the regeneration requires vector cross products and normalisation (i.e. scaling the size of the vector to unity). To normalise a vector requires dividing each of its components by the magnitude of the vector, which must be calculated as the square root of the sum of the squares of the components. This is dealt with using the nrm_vec routine used previously for the illumination calculation.
In the subroutine in_joy, the joystick is read and action taken immediately to rotate the view frame base vectors about an axis in the world frame, which here is one of the base vectors, but could be any axis defined by its direction cosines. The matrix for rotation is constructed in v_rot_matx. The elements of this matrix are quite large but the overall work is minimised by calculating pairs of elements at a time, exploiting the similarity of elements with their row and column indices interchanged.
In vel_adj the new direction of motion, which is the direction pointed to by the kv vector, is combined with the speed to produce a displacement of the view frame. What this amounts to is simply multiplying the components of kv by the speed and adding them to Ovx, Ovy and Ovz, the current value of the view frame origin in the world.
The test for visibility of objects follows the criteria explained above, where the object frame origin (Oox,Ooy,Ooz) is examined to see if it lies in the frustum defined as the field of view. To do this, the origin itself is first transformed into the view frame where it becomes (Vox,Voy,Voz).
One final routine, scrn_adj, is included to reset the centre of the screen to the origin of the world frame. This is not the same as simply moving the view frame in the world frame since it affects the appearance of perspective. Having the view frame centred on the screen is more natural to “flying around in space” experiences.
9.8.3. bss_06.s
This contains the few new variables introduced in this section: the base vectors and the rotations resulting from movement of the joystick.
* wrld_vw.s
* Joystick control of the view frame for chapter 9
*
*SECTION TEXT
opt d+
bra main
include systm_00.s screens and tables
include core_06.s new subroutines
main bsr init allocate memory etc.
* transfer all the data
bsr transfer
move.w oncoords,vncoords
move.w vncoords,wncoords
* Initalise dynamical variables
move.w #0,Ovx view frame
move.w #0,Ovy starts off
move.w #-200,Ovz 200 behind world frame
* Set up view frame base vectors
* 1. iv
lea iv,a0 align
move.w #$4000,(a0)+ view
clr.w (a0)+ frame
clr.w (a0) axes
* 2 jv
lea jv,a0 with
clr.w (a0)+ the
move.w #$4000,(a0)+ world
clr.w (a0) frame
* 3. kv
lea kv,a0 axes
clr.w (a0)+
clr.w (a0)+
move.w #$4000,(a0)
clr.w speed start at rest
clr.w screenflag 0=screen 1 draw, 1=screen 2 draw
clr.w viewflag
loop4:
* Switch the screens each time round
tst.w screenflag screen 1 or screen2?
beq screen_1 draw on screen 1, display screen2
bsr drw2_shw1 draw on screen 2, display screen1
clr.w screenflag and set the flag for next time
bra screen_2
screen_1:
bsr drw1_shw2 draw on 1, display 2
move.w #1,screenflag
screen_2:
* Look for changes in the view frame angles
bsr in_joy read joystick rotate view frame
* See if the function keys have been pressed to change the speed
bsr key_in
* Adjust to new velocity
bsr vel_adj
* Recalculate view frame base vectors and set up the worldview
* transform matrix
bsr dircosines
* See if the object is within the visible angle of view
bsr viewtest
tst.b viewflag is it visible
beq loop4 no,try again
* Construct compound objects from same face at different positions
move.w nparts,d7 how many parts in the object
subq #1,d7
lea inst_angles,a0 list of angles for each part
lea inst_disp,a1 ditto displacements
* Do one face at a time
instance:
move.w d7,-(sp) save the count
move.w (a0)+,otheta next otheta
move.w (a0)+,ophi next ophi
move.w (a0)+,ogamma next ogamma
move.w (a1)+,Oox next displacements
move.w (a1)+,Ooy
move.w (a1)+,Ooz
movem.l a0/a1,-(sp) save position in the list
bsr otranvw object to world transform
bsr w_tran_v world to view transform
bsr illuminate if not hidden find the shade
bsr perspective perspective
bsr scrn_adj centre window
bsr polydraw draw this face
movem.l (sp)+,a0/a1 restore pointers
move.w (sp)+,d7 restore the parts count
dbra d7,instance for all the parts of object
bra loop4 draw the next frame
*SECTION DATA
include data_00.s
include data_03.s
include data_05.s
*SECTION BSS
include bss_06.s
END
*****************************************************************************************
* Core_06.s *
* subroutines for Chapter 9 *
*****************************************************************************************
CORE6 EQU 1
include Core_05.s
* Find the direction cosines for the transform from the world frame to view frame.
* These are components of the view frame base vectors in the world frame.
* To avoid accumulating errors they are regenerated and normalised to a magnitude of:
* 2^14.
dircosines
lea iv,a0
lea jv,a1
lea kv,a2
* Kv is normalised
move.w (a2),d0
move.w 2(a2),d1
move.w 4(a2),d2
bsr nrm_vec
move.w d0,(a2) new components
move.w d1,2(a2)
move.w d2,4(a2)
* calc jv from the cross product kv x iv using subroutine AxB.
* A pointer in a2: B pointer in a0.
bsr AxB
move.w d0,(a1)
move.w d1,2(a1)
move.w d2,4(a1)
* finally the cross product jv x kv is used to regenerate iv.
lea jv,a2
lea kv,a0
bsr AxB
lea iv,a1
move.w d0,(a1) regenerated iv
move.w d1,2(a1)
move.w d2,4(a1)
* The components of the view frame base vectors in the world frame are the elements
* of the transform matrix required for the world to view transform.
lea w_vmatx,a0
lea iv,a1
lea jv,a2
lea kv,a3
move.w (a1)+,(a0)+ matrix elements of the view transform
move.w (a1)+,(a0)+
move.w (a1)+,(a0)+
move.w (a2)+,(a0)+
move.w (a2)+,(a0)+
move.w (a2)+,(a0)+
move.w (a3)+,(a0)+
move.w (a3)+,(a0)+
move.w (a3)+,(a0)+
rts
*****************************************************************************************
AxB
move.w 2(a2),d0 Ay
muls 4(a0),d0 Bz*Ay
move.w 4(a2),d1 Az
muls 2(a0),d1 By*Az
sub.l d1,d0 Bz*Ay-By*Az
* 2nd component
move.w 4(a2),d1 Az
muls (a0),d1 Bx*Az
move.w (a2),d2 Ax
muls 4(a0),d2 Bz*Ax
sub.l d2,d1 Bx*Az-Bz*Ax
* 3rd component
move.w (a2),d2 Ax
muls 2(a0),d2 By*Ax
move.w 2(a2),d3 Ay
muls (a0),d3 Bx*Ay
sub.l d3,d2 By*Ax-Bx*Ay
* Reduce them to < word size by dividing by 2^14
move #14,d7
lsr.l d7,d0
lsr.l d7,d1
lsr.l d7,d2
* normalise them
bsr nrm_vec
rts
*****************************************************************************************
* Do a rotation of the view frame about one of the view frame axes in the world frame.
* The direction cosines for the axis are the base vector components.
* First a rotation about the view frame xaxis, vx.
rot_vx
lea iv,a0 the axis of rotation
move.w vxangle,d1 the angle to rotate
bsr v_rot_matx construct the rotation matrix
* only jv and kv are affected
lea jv,a0 1st transform
bsr rot_view
lea kv,a0 2nd transform
bsr rot_view
rts
*
rot_vy
lea jv,a0
move.w vyangle,d1
bsr v_rot_matx
* only iv and kv are affected
lea iv,a0 1st transform
bsr rot_view
lea kv,a0 2nd transform
bsr rot_view
rts
*
rot_vz
lea kv,a0
move.w vzangle,d1
bsr v_rot_matx
* only iv and kv are affected
lea iv,a0 1st transform
bsr rot_view
lea jv,a0 2nd transform
bsr rot_view
rts
*
* Rotate a view frame base vector. The vector is pointed to by a0. Since it is
* a unit vector it is specified by three components which are the direction cosines.
* (nx, ny, nz).
rot_view
moveq #2,d6 rows in matrix
lea vrot_matx,a3
link a6,#6
rot_vw1
move.w (a0),d0 nx components
move.w 2(a0),d1 ny
move.w 4(a0),d2 nz
muls (a3)+,d0 nx*Mi1
muls (a3)+,d1 ny*Mi2
muls (a3)+,d2 nz*Mi3
add.l d1,d0
add.l d2,d0
lsl.l #2,d0
swap d0
move.w d0,(a6)
dbra d6,rot_vw1
move.w (a6)+,4(a0) z
move.w (a6)+,2(a0) y
move.w (a6)+,(a0) x
unlk a6
rts
***************************************************************************************
* Construct the rotation matrix for rotations about an arbitrary axis specified by a
* unit vector with components (direction cosines) n1, n2, n3.
* ENTRY: Pointer to direction cosines in a0: Angle in d1.
v_rot_matx
lea vrot_matx,a6
bsr sincos
move.w d2,d6 sine delta
move.w d3,d7 cos delta
* elements M12 and M21
move #16384,d5
move d5,d0
move.w (a0),d1 n1
muls 2(a0),d1 n1*n2
lsl.l #2,d1
swap d1
sub.w d7,d0 1-cosdelta
move d0,d4
muls d1,d0
lsl.l #2,d0
swap d0 n1*n2*(1-cosdelta)
move d0,d2
move.w 4(a0),d1 n3
muls d6,d1 n3*sindelta
lsl.l #2,d1
swap d1
sub.w d1,d0 n1*n2*(1-cosdelta)-n3*sindelta
move.w d0,2(a6) M12
add.w d1,d2 n1*n2*(1-cosdelta)+n3*sindelta
move.w d2,6(a6) M21
* elements M13 and M31
move d4,d0 1-cosdelta
muls (a0),d0 n1*(1-cosdelta)
lsl.l #2,d0
swap d0
muls 4(a0),d0 n1*n3*(1-cosdelta)
lsl.l #2,d0
swap d0
move d0,d2
move.w 2(a0),d1 n2
muls d6,d1 n2*sindelta
lsl.l #2,d1
swap d1
add.w d1,d0 n1*n3*(1-cosdelta)+n2*sindelta
move.w d0,4(a6) M13
sub.w d1,d2 n1*n3*(1-cosdelta)-n2*sindelta
move.w d2,12(a6) M31
* elements M23 and M32
move d4,d0 1-cosdelta
muls 2(a0),d0 n2*(1-cosdelta)
lsl.l #2,d0
swap d0
muls 4(a0),d0 n2*n3*(1-cosdelta)
lsl.l #2,d0
swap d0
move d0,d2
move.w (a0),d1 n1
muls d6,d1 n1*sindelta
lsl.l #2,d1
swap d1
sub.w d1,d0 n2*n3*(1-cosdelta)-n1*sindelta
move.w d0,10(a6) M23
add.w d1,d2 n2*n3*(1-cosdelta)+n1*sindelta
move.w d2,14(a6) M32
* element M11
move.w (a0),d1 n1
muls d1,d1 n1*n1
lsl.l #2,d1
swap d1
move d5,d2 1
sub.w d1,d2 1-n1*n1
muls d7,d2 (1-n1*n1)*cosdelta
lsl.l #2,d2
swap d2
add.w d2,d1 n1*n1+(1-n1*n1)*cosdelta
move.w d1,(a6) M11
* element M22
move.w 2(a0),d1 n2
muls d1,d1 n2*n2
lsl.l #2,d1
swap d1
move d5,d2 1
sub.w d1,d2 1-n2*n2
muls d7,d2 (1-n2*n2)*cosdelta
lsl.l #2,d2
swap d2
add.w d2,d1 n2*n2+(1-n2*n2)*cosdelta
move.w d1,8(a6) M22
* element M33
move.w 4(a0),d1 n3
muls d1,d1 n3*n3
lsl.l #2,d1
swap d1
move d5,d2
sub.w d1,d2 1-n3*n3
muls d7,d2 (1-n3*n3)*cosdelta
lsl.l #2,d2
swap d2
add.w d2,d1 n3*n3+(1-n3*n3)*cosdelta
move.w d1,16(a6) M33
rts
************************************************************************
w_tran_v
move.w wncoords,d7
ext.l d7 any to do?
beq w_tranv3
subq.w #1,d7
lea wcoordsx,a0
lea wcoordsy,a1
lea wcoordsz,a2
lea vcoordsx,a3
lea vcoordsy,a4
lea vcoordsz,a5
exg a3,d3 save cos we're short of registers
link a6,#6 save 3 words
w_tranv1
moveq.l #2,d6 3 rows in matrix
lea w_vmatx,a3 init matrix pointer
* calculate the next vx, vy and vz
w_tranv2
move.w (a0),d0 wx
move.w (a1),d1 wy
move.w (a2),d2 wz
sub.w Ovx,d0
sub.w Ovy,d1
sub.w Ovz,d2
muls (a3)+,d0 wx*Mi1
muls (a3)+,d1 wy*Mi2
muls (a3)+,d2 wz*Mi3
add.l d1,d0
add.l d2,d0 wx*Mi1+wy*Mi2+wz*Mi3
lsl.l #2,d0
swap d0
move.w d0,(a6)
dbra d6,w_tranv2 repeat for 3 elements
move.w (a6)+,(a5)+
move.w (a6)+,(a4)+
exg a3,d3 restore vx, save matx pointer
move.w (a6)+,(a3)+
exg a3,d3 save vx, restore matx pointer
addq.l #2,a0 point to next wx
addq.l #2,a1 wy
addq.l #2,a2 wz
dbra d7,w_tranv1 repeat for all ocoords
unlk a6 close frame
w_tranv3
rts
****************************************************************************************
* Set the velocity components
vel_adj
lea kv,a0
moveq.l #14,d7 ready to divide by 2^14
move.w speed,d0
lsl.w #3,d0 scale it
move d0,d1
move d0,d2
muls (a0),d0 v*VZx
lsr.l d7,d0 /2^14
add.w d0,Ovx xw speed component
muls 2(a0),d1 v*VZy
lsr.l d7,d1
add.w d1,Ovy yw speed component
muls 4(a0),d2 v*VZz
lsr.l d7,d2
add.w d2,Ovz
rts
****************************************************************************************
* Test whether the primitive is visible: see whether its centre (Oox,Ooy,Ooz) lies within
* the angle of visibility. Oox, Ooy and Ooz are transformed to view coords and then tested.
viewtest
moveq.l #2,d6 rows in matrix
lea w_vmatx,a3
link a6,#6
move.w Oox,d3
addi.w #50,d3
move.w Ooy,d4
addi.w #50,d4
move.w Ooz,d5
addi.w #50,d5
sub.w Ovx,d3 OoxOvx relative to the view frame
sub.w Ovy,d4
sub.w Ovz,d5
tran0v
move d3,d0
move d4,d1
move d5,d2
muls (a3)+,d0 *Mi1
muls (a3)+,d1 *Mi2
muls (a3)+,d2 *Mi3
add.l d1,d0
add.l d2,d0 *Mi1+*Mi2+*Mi3
lsl.l #2,d0
swap d0
move.w d0,(a6)
dbra d6,tran0v repeat for three elements
move.w (a6)+,d3 Voz
move.w (a6)+,d2 Voy
move.w (a6)+,d1 Vox
move.w d3,Voz
move.w d2,Voy
move.w d1,Vox
unlk a6
* Clip Voz. For visibility must have 100<Voz<2000
cmpi.w #100,d3 test (Voz-100)
bmi invis
cmpi.w #2000,d3 test (Voz-2000)
bpl invis
* is it within the view angle?
addi.w #100,d3 Voz+100
add.w d3,d3 *2
add.w d3,d3 *4
add.w d3,d3 *8
* First test horizontal position
tst.w d2 is Voy +ve or -ve
bpl pos_y
neg.w d2
pos_y
cmp.w d2,d3 Voy is +ve, test (8*(Voz+100)-Voy)
bmi invis
* Test vertical position
tst.w d1 Vox
bpl pos_x
neg.w d1
pos_x
cmp.w d1,d3 test (8*(Voz+100)-Vox)
bmi invis
* It IS visible
st viewflag
rts
* It is INVISIBLE
invis
sf viewflag
rts
**************************************************************************************
*Adjust screen coords so that view frame (0,0) is at centre
scrn_adj
move.w vncoords,d7
beq adj_end
subq.w #1,d7
lea scoordsy,a0
adj_loop
subi.w #100,(a0)+
dbra d7,adj_loop
adj_end
rts
*****************************************************************************************
* bss_06.s *
*****************************************************************************************
include bss_05.s
* VARIABLES FOR ROTATING THE VIEW FRAME
iv ds.w 3 view frame base vector components in world
jv ds.w 3
kv ds.w 3
vxangle ds.w 1 rotation angles about these axes
vyangle ds.w 1
vzangle ds.w 1
vrot_matx ds.w 9 rotation matrix about an arbitrary axis
* VISIBILITY
viewflag ds.w 1
Vox ds.w 1 object centre in view frame
Voy ds.w 1
Voz ds.w 1
10. A World Scene
In this chapter a world containing many objects is constructed.
The transition from a single graphics primitive to a scene containing several brings a host of new problems. For example, in the complex scene of many objects, spatial relationships must be preserved; objects in the foreground must not be obscured by those in the distance. Some form of depth sorting is required that orders objects for drawing on the basis of their distance from the observer.
Just as important is a sound strategy for ignoring all objects outside the immediate environment of the observer. In a world consisting of hundreds of objects spread out over a landscape, it would be pointlessly time-consuming to attempt to draw them all. As in real life, the observer need only be concerned with those that are close by and affect current decisions. We examine these aspects of the multi-object world in turn.
10.1. A Database
Associated with each object in the complex world will be a list of its attributes (type, position, colour, rotation angles, etc.), and the set of lists of all the objects is a database. It contains all the information needed to draw the view seen by the observer. Exactly how this database is laid out in memory is very important in determining the speed with which it can be accessed for graphics.
To explain this point further, consider the choices available in ordering the objects in the database. Objects could be entered in the database in order of increasing x (world) coordinate or increasing y coordinate or increasing z coordinate, or indeed at random with no spatial order whatsoever. Objects could be listed according to their type, colour or any one of their attributes. Of all the possibilities there will be those that provide fast access to those objects which are going to be drawn, i.e. those in the immediate vicinity of the observer. It is clear that some kind of ordering in position is needed to achieve this.
10.1.1. A Map
The position of an object in the world is specified by its three coordinates in the form (xw,yw,zw). It is clear that ordering the database in any one single coordinate (xw or yw or zw) alone will not provide an immediate picture of where each object is in relation to its neighbours.
What is needed is a database where the objects are arranged in 3D order. This is difficult to visualise until it is realised that what is being described is nothing more than a map. The similarity to an ordinary route map is fairly exact for the world we will construct which consists of objects sitting on a surface, just like the surface of the Earth. The advantage of a map of this kind, (which is a 2D array) is that all the objects that lie in a particular region are immediately obvious in their spatial relations.
What is actually done is shown in Figure 10.1. The world space is divided into a 16*16 array of “tiles” (just like on the bathroom wall) each one of which has the dimensions 256*256. Each tile is a unit of space to be considered for display. It can contain a collection of objects; in the example program it contains just one, for simplicity. Of course this is not a very extensive world, but there is nothing in the method which limits it to these dimensions; it could be as big as you like and the individual tiles as small as you like. But, “wrap” occurs so that when the observer strays off any edge he reappears on the opposite side; in this way the world is effectively “infinite”, like a sphere. For our purposes a 16*16 tile world is sufficient to illustrate the method. Each tile defines a region of space which, for the purposes of display, is a single entity. To construct the view seen by the observer, all that has to be done is to find her/his position on the tile grid, select the nearest-neighbour tiles, find which ones are in front of the observer and draw the objects placed on them.
How can this 2D array be laid out in the 1D contiguous RAM? There is nothing new here. The screen itself is a 2D world which is represented in memory as a 1D database. The pixel is analogous to a tile and the four bits which specify its colour are analogous to the data list specifying the attributes of the object on the tile. An arrangement of information in this way, where each element is linked to its adjacent ones is called a linked list. In this case, the links are permanent and implied by the physical position in the array. The world database is thus a list of 256 bytes, each one holding the attributes of one tile in the 16*16 tile world. In the example program it is held in the file data_08.s. The list starts at map_base and every 16th byte starts a new tile in the z direction. The tile position in the list, mod 16, represents the 16 y values. In this model the world is flat and x does not vary.
There is very little information needed for the attributes, since the position in space is automatically included by the tile’s position in the list. The first nibble gives the colour of the background (1-15) and the second gives the type of object which is to sit on the tile. At present only six are possible (listed in data_06.s), but in principle there is no limit.
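The indexing and nibble decoding just described amount to a few lines of code. Here is a minimal sketch in Python (the program works directly on the bytes at map_base; the function names here are illustrative, not from the book's source):

```python
# A 16*16 tile world stored as 256 contiguous bytes.
# Every 16th byte starts a new row of tiles in the z direction,
# and the position in the list mod 16 gives the y value.

def tile_attribute(map_bytes, ty, tz):
    """Fetch the attribute byte for tile (ty, tz), wrapping at the edges."""
    return map_bytes[(tz % 16) * 16 + (ty % 16)]

def decode_attribute(attr):
    """Split the byte: high nibble = background colour, low nibble = object type."""
    return (attr >> 4) & 0xF, attr & 0xF
```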
10.2. Sorting
As mentioned above, once the visible objects in the near vicinity to the observer have been identified there is the problem of ordering them for drawing so that the more distant ones are drawn first. This is commonly known as the painter’s algorithm, since in painting a picture the last brush stroke overlays earlier ones.
There are many well known algorithms for sorting data in order. Most of the more exotic varieties have been developed to handle large databases with a large number of entries (records). In our case it is necessary to sort a small number (<16) of records in depth order. Sorting at this level is efficiently done by one of the simplest sorting methods, called a bubble sort. Note that at this stage we are referring to the attributes and other accumulated data about the objects to be drawn as records. A record is a set of data of different types where each data type is confined to specific parts or “fields”. This is how data for visible objects is carried around in the example programs. A record is constructed containing all the relevant data to draw in the tile and during depth sorting the records are actually sorted like a deck of cards. That way, although the depth field is the basis for sorting, it carries with it other information for drawing, reducing the retrieval of additional data at a later stage to a minimum. Of course, to avoid slowing things down too much it’s important to keep the record short. In the example program a record consists of 2 long words divided into 7 fields.
10.2.1. A Bubble Sort
Let’s illustrate the bubble sort by direct example from the program. In this we have a short list of records for the visible objects to be displayed. The field on which the sort is based is the second word in the record. It is the distance of the object from the origin of the view frame in the positive z direction, i.e. the direction in which the viewer is looking. The other fields are unimportant for the sorting. Figure 10.2 shows a possible arrangement of simple objects in front of the view frame. The number on each object is its type, which is the content of the second field on its record. A suitable order in which they should be drawn so that objects in the rear lie behind those in the forefront is: 2,1,4,5,6,3. But this is unlikely to be the order in which the tiles have been retrieved from the database. Let us suppose that they have been withdrawn in the order 6,1,3,4,2,5. The sorting now begins.
The procedure in a bubble sort is to go through the list comparing each entry with its successor and making a switch if necessary. In the present case we will order the list with the objects to be drawn first at the top of the list, i.e. the list will run from distant objects down to near objects. In the first sweep, the first pair 6 and 1 are examined, found to be in the wrong order and exchanged. At the same time, to record that the list was found to be out of order, a flag is set. This leaves 1 as the first entry and 6 as the second. Then the next pair 6 and 3 are examined. The order here is O.K. so no switch is made. This is continued through the entire list. Each time a switch is made the flag is set (of course it can only be set once so the following swaps do nothing to the flag). The following lines show the progression of the first sort:
6,1,3,4,2,5 start
1,6,3,4,2,5 1st pair tested
1,6,3,4,2,5 2nd
1,6,4,3,2,5 3rd
1,6,4,2,3,5 4th
1,6,4,2,5,3 5th
Notice how, like bubbles, the distant objects “float” to the top.
At the end of the list the flag is tested to see if a switch was made. If so the entire list is tested again. This is repeated until a pass is made in which the flag was not set, in which case the list is in order and the sort is deemed to be complete.
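The flagged bubble sort just described can be sketched in a few lines (Python, for clarity; the real records are two long words with seven fields, here reduced to (depth, payload) pairs):

```python
def depth_sort(records):
    """Sort records so the most distant (largest depth field) come first.

    Each sweep compares successive pairs and swaps any found out of order,
    setting a flag; sweeps repeat until one completes with no swap.
    """
    swapped = True
    while swapped:
        swapped = False
        for i in range(len(records) - 1):
            if records[i][0] < records[i + 1][0]:
                records[i], records[i + 1] = records[i + 1], records[i]
                swapped = True
    return records
```

Note that the whole record moves with its depth field, so all the data needed for drawing travels through the sort with it, just as the text describes.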
10.2.2. The Viewing Transform
In this chapter we include two different ways of constructing the view seen by the observer. The first uses control matrices and is a simpler version of the view transform used in the previous chapter. The second is altogether different and much simpler; it uses the Euler angles met in Chapter 8 and is widely used in elementary flight simulators. It is slightly limited as a consequence of the way the angles are defined. We discuss the application of control matrices first.
10.2.3. Control Matrices
Let us suppose that we have reached the stage where all the transforms have been done to present a scene from the viewpoint of an observer. The vertices of all visible objects will then be given in the frame of reference of the observer, i.e. the view frame. If, as a consequence, for example, of a movement of the joystick the observer moves his head to the left, all that is required to show the new view is to rotate the vertices to the right. Rotation of the observer about any axis in his reference frame can be implemented by rotating the view frame vertex coordinates in the opposite direction. Such a transform is called a coordinate transform since it calculates the view seen from a different coordinate system, i.e. the rotated coordinate system of the observer.
So it seems that all that is required to show the view of the observer, as he flies through the world, is to multiply the view frame coordinates by the sequence of rotation matrices representing his accumulated motion to date. It won’t work! First a record of the total sequence of rotations would have to be kept and then, for each frame, they would have to be multiplied out in order. Not exactly an efficient algorithm for fast graphics. After a while the picture would stop altogether as hundreds of matrix multiplications were done for each frame. What is the solution?
The solution to this problem is very similar to the method used in the previous chapter where the view frame base vectors were rotated and then used to construct the view transform. In this case the procedure is done backwards. At any instant, as a result of calculations done to display the previous frame, we know the view transform matrix. This is the starting point for the next frame. The sequence of events at the end of the calculations will be to: 1) do the view transform to convert vertices to the view frame, 2) do the rotations about view frame axes we have been talking about, 3) finally, do the perspective transform and everything else that follows. Here now is the solution to the problem. Instead of regarding the view transform, (V), and the view frame rotations, (C), as separate transforms, to be done to the vertices, (PW), in the world frame in sequence to produce first the view frame vertices (PV) and then the rotated vertices (PV’),
(C)(V)(PW) = (C)(PV) = (PV’),
we concatenate (multiply out) (C) and (V) beforehand to produce a rotated view transform, (V’):
(C)(V)(PW) = (V’)(PW) = (PV’).
In this scheme each rotation of the observer is brought about by premultiplying the view transform by a “control” matrix appropriate to the rotation. The control matrices for the separate rotations about the view frame xv, yv and zv axes are:
       | 1      0      0    |
(Cx) = | 0      cosθ   sinθ |
       | 0     -sinθ   cosθ |

       | cosθ   0     -sinθ |
(Cy) = | 0      1      0    |
       | sinθ   0      cosθ |

       | cosθ   sinθ   0    |
(Cz) = | -sinθ  cosθ   0    |
       | 0      0      1    |
Notice that these are exactly the same as the geometric transforms of Chapter 6 except that the sine terms have the opposite sign. This is because
sin(-θ) = -sin(θ)
and shows that the coordinate transforms are the same as geometric transforms with negative angles, i.e. they correspond to backward rotations. This is saying mathematically what we know to be true: rotating the observer’s head to the left achieves the same end result as rotating the scene to the right. (See Appendix 6).
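Concatenating a control matrix into the view transform is one 3*3 fixed-point matrix product per rotation. A sketch in the book's 2.14 convention, where 16384 represents 1 (the function names are illustrative, not from the program):

```python
ONE = 16384  # 2^14, the fixed-point unit used throughout the book

def fixmul(a, b):
    """Multiply two 2.14 fixed-point values and rescale the 32-bit product."""
    return (a * b) >> 14

def concat(c, v):
    """Premultiply the view matrix v by the control matrix c: v' = c*v.
    Both are 3*3 lists of fixed-point words."""
    return [[sum(fixmul(c[i][k], v[k][j]) for k in range(3)) for j in range(3)]
            for i in range(3)]
```

Because the product replaces the stored view matrix, each frame costs one such multiplication per joystick rotation, however many rotations have accumulated since the program started.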
The physical motions corresponding to the rotations are shown in Figure 10.3. They are: yaw (rotation about the x axis), pitch (rotation about the y axis) and roll (rotation about the z axis).
To speed things up the control matrices can be precalculated. If it is accepted that rotations always occur in 1 degree increments then the elements of the matrices will be sin(1) and cos(1) (multiplied by 16384 as usual). This is indeed what is done in the example program file data_07.s where angle increments are taken to be 5 degrees, although here rotations only occur about the xv and yv axes.
There still remains the need to ensure that errors do not accumulate. So, remembering that the rows of the view transform can be visualised as the view frame base vectors, we regenerate the view matrix rows by vector products as was done in Chapter 9.
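That regeneration step can be sketched as follows (floating point here for clarity, where the program keeps everything in 2.14 fixed point scaled by 16384): normalise kv, rebuild jv as kv x iv, then iv as jv x kv, leaving the three rows an orthonormal set again.

```python
import math

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def normalise(v):
    m = math.sqrt(sum(x * x for x in v))
    return [x / m for x in v]

def regenerate(iv, jv, kv):
    """Rebuild an orthonormal view-frame basis from a slightly drifted one."""
    kv = normalise(kv)
    jv = normalise(cross(kv, iv))
    iv = cross(jv, kv)   # already unit length: jv and kv are unit and orthogonal
    return iv, jv, kv
```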
The details of all these stages are shown in the example program, wrld_scn.
10.2.4. Euler Angles
We have already discussed these in section 8.1.1. Euler angles are a way of specifying the orientation of one reference frame with respect to another using only three angles but with some restriction as to how the angles are defined. Most important is that they specify rotations about different axes in a fixed order. There are many combinations possible. The sequence defined below is the one beloved of aeronautical engineers and is called the 3-2-1 sequence because it describes rotations about the x, y, and z axes in order. These correlate with motions of the joystick and so describe yaw (bearing), pitch and roll but note that yaw here, being an initial rotation about the world frame axis, wx, is different from that described in section 10.3.1. The physical rotations of the observer are shown in Figure 10.3.
Here is the sequence of rotations (displacements have already been subtracted off) which carry the world reference frame into the observer’s view frame. It is illustrated in Figure 10.4. Both frames are coincident to begin with and rotations are about view frame axes, wherever they are at the time:
- rotate by θ about the x axis - the same for both frames (yaw)
- rotate by φ about the y axis (pitch)
- rotate by γ about the z axis (roll)
The end product is the orientation of the view frame.
Looking back to section 8.1.1 it will be seen that this is precisely the sequence of rotations done there and so the results, in particular the final matrix product, can be used directly. The results are illustrated in the example program eulr_scn.
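For this sequence about moving axes, the composite world-to-view matrix is simply the product of the three elementary coordinate rotations. A floating-point sketch (the sign convention follows the coordinate transforms discussed above; treat it as an illustration rather than the book's exact matrix):

```python
import math

def rx(t):  # coordinate rotation about x
    c, s = math.cos(t), math.sin(t)
    return [[1, 0, 0], [0, c, s], [0, -s, c]]

def ry(t):  # coordinate rotation about y
    c, s = math.cos(t), math.sin(t)
    return [[c, 0, -s], [0, 1, 0], [s, 0, c]]

def rz(t):  # coordinate rotation about z
    c, s = math.cos(t), math.sin(t)
    return [[c, s, 0], [-s, c, 0], [0, 0, 1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def euler_matrix(theta, phi, gamma):
    """World-to-view coordinate transform for the theta, phi, gamma sequence.
    Rotations about moving axes compose so the first rotation acts on the
    point first: M = Rz(gamma) * Ry(phi) * Rx(theta)."""
    return matmul(rz(gamma), matmul(ry(phi), rx(theta)))
```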
10.3. Running Times
The example program in this chapter allows you to roam around a world containing 256 different graphic entities under the control of the joystick as in a rudimentary flight simulator. There is no limitation here; a larger world database could be constructed with no additional time penalty. A world of this limited size has been used because it is sufficient to illustrate the procedures involved without involving excessively long listings.
Because of the serial way the book has introduced the different stages of getting a moving picture on the screen, and the manner in which programs have been included together to make an overall program of increasing power, there has been an inevitable compromise in speed. The final program in this last chapter could be rationalised and simplified to become substantially faster.
10.4. Example Program
10.4.1. wrld_scn.s and eulr_scn.s
There are two main control programs here. They both allow free flight through a landscape of moving objects but differ in the type of viewing transform used. In one of them, wrld_scn.s, motion is controlled through the joystick and keyboard by means of rotations about the instantaneous axes of the observer’s coordinate frame. In the other, eulr_scn.s, the joystick increments or decrements the Euler angles to vary the orientation of the observer’s reference frame. The detailed controls are:
wrld_scn: up, down, left, right = joystick; roll left = F1, roll right = F2
eulr_scn: up, down, left, right = joystick.
In both cases the other function keys are:
reverse=F3, slow forward=F4, fast forward=F5, stop=F6, abort=F7.
10.4.2. data_06.s
This is the data file of the graphics primitives, which are simple 3D structures. They appear littered about the landscape according to the database in data_08.s where the primitive associated with each tile is specified in the low nibble of the attribute byte. There are 6 types (0-5) vectored from a jump table at the address primitive. There is no limit to the variety or number; to include a new one simply add one more label to the jump vectors and fill in the details at the end of the list. The primary jump vectors at primitive point to a list of secondary vectors, which are the tables of data for each particular type. For a particular type, data is given in a series of lists:
- the secondary pointers,
- the intrinsic colours (0, 1, 2 or 3 for 8 shades of 4 colours),
- the number of faces on each polyhedral object,
- the list of edge numbers on each face,
- the list of vertex connections on all faces in order,
- the three sets of x, y and z coordinates of the vertices,
- the total number of vertices and
- the type of rotation which the object is undergoing.
The type of rotational motion which each type displays is specified in the lowest nibble of the high word of the variable 0n (where n is the type number) and the low word is used by the program to hold the current angle but appears as 0 in the list. The type of rotation is given by the bit which is set in the nibble:
bit 0 - rotation about x axis of object frame
bit 1 - ditto y
bit 2 - ditto z
so that any combination of simultaneous rotations can be included.
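Decoding that nibble is just a matter of testing each bit in turn; a minimal sketch (the constant and function names are illustrative):

```python
# Bit assignments from the text: bit 0 = x, bit 1 = y, bit 2 = z.
ROT_X, ROT_Y, ROT_Z = 1, 2, 4

def active_rotations(mode_nibble):
    """Return which object-frame axes this object rotates about."""
    return [axis for bit, axis in ((ROT_X, 'x'), (ROT_Y, 'y'), (ROT_Z, 'z'))
            if mode_nibble & bit]
```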
10.4.3. data_07.s
Here are the four control matrices for positive and negative rotations about the view frame x and y axes laid out in row order.
10.4.4. data_08.s
Here are the 256 bytes which make up the 16*16 tile world unit. In the program, wraparound occurs so that motion beyond the extreme left boundary returns the viewer to the right boundary. In this sense, like a sphere, the world is “infinite”. In each byte the high nibble gives the actual colour of the background (0-7, no illumination) and the low nibble gives the object type (0-15) sitting on the tile. Only 6 types are used in the program. The reader can easily invent new ones.
10.4.5. core_07.s
The first subroutine in the core, patch_ext, first takes the observer’s current position and normalises it to lie within the world map. This is where the wraparound occurs. Following this the location in tile coordinates (Ty,Tz) is calculated by dividing the y and z positions by 256. Remember there are 16*16 tiles spread out over the yz plane. This is the vertical projection of the observer’s position onto the plane. Then the attributes of the 16 tiles centred about this position are retrieved from the database and, for each tile, stored as the first byte of the first word in the 4-word record which accompanies each one. The offset of each tile from the observer’s position is saved in the second byte of the first record word. This collection of potentially visible tiles is called a patch.
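The normalisation and tile-coordinate steps can be sketched as follows (Python; masking with $FFF keeps the position in 0-4095 and produces the wraparound, and dividing by the 256-unit tile size gives (Ty,Tz) — names illustrative):

```python
WORLD_MASK = 0xFFF   # world positions wrap within 0..4095
TILE_SIZE = 256      # each tile covers 256*256 world units

def tile_coords(oposy, oposz):
    """Project the observer's (wrapped) position onto the 16*16 tile grid."""
    y = oposy & WORLD_MASK
    z = oposz & WORLD_MASK
    return y // TILE_SIZE, z // TILE_SIZE
```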
Following this a visibility test is done on every tile in the patch. The test here does not consider a frustum of visibility, but only whether the centre of the tile lies in front of the observer. The central parameter calculated for each tile during this test is its distance (zv) in front of the observer. This is also saved as the second word in the record for depth sorting later. Less than half the tiles pass the visibility test. The visibility sort, next, simply uses a bubble sort to place the records in order of depth, that is in order of decreasing distance from the observer. The tiles with records at the top of the list will be drawn first since they are farthest away.
The subroutine which follows, drw_it, sets up the data to draw each tile and its resident object in the ordered list of visible tiles, and calls all the earlier subroutines to draw the complete picture. There is a lot going on at this stage. The background on each tile is just a cross of a particular colour so that all the tiles together define a grid on which the objects sit. Since the background is the same for every tile, it is entered directly from the program rather than being stored in a data file. Also since it has a fixed colour without varying illumination, there is no need to call the timeconsuming illumination calculations.
The data lists for each object are pulled in from the data file and before it is drawn its new angle in the world frame is determined for whatever mode of rotation is active.
10.4.6. bss_07.s
New variables
10.4.7. systm_01.s
Just a few routines to set up the system. In particular the view point is moved back a bit to 300 on the zv axis to reduce the perspective distortion and eliminate the possibility of parts of objects falling behind the observer, which would not cause the system to crash but would produce a display of spectacular garbage as the basic drawing routines attempt to cope with drawing backwards.
Also a bit of a cheat. The Amiga is being stretched with this program and it helps to speed things up by reducing the size of the window (clip frame) on the screen so that the picture is smaller (ever wondered why games show a tiny screen surrounded by a lot of static ornamentation looking like a console?).
10.4.8. core_08.s
This is the core file for the Euler angle transform.
10.5. Epilogue
How far have we got? What’s next? For a start the overall program can be speeded up considerably by rationalising the anomalies caused by the serial way in which programs have been introduced in this book.
There also remains the inclusion of the third party (you, the world scene and the alien). So far the graphic entities have been static in the sense that their evolution has been determined by their attributes. To give entities life requires that their actions evolve independently of the deterministic structure of the program. But there is really only one truly random element in this scenario - you, the observer. Hence to create life within the computer it is necessary to make the entities respond to your actions. This is of course what happens in all games. Aliens head for the target. To invent a third party is no more complicated than has already been done in reading the movements of the joystick to follow the motion of the observer. In the case of the third party there are no joystick movements, but rather, the response to world conditions.
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* wrld_scn.s
* A multi-object scene
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* A world scene consisting of various types of graphics primitives
* in motion. The viewer is free to *fly* to any location. At any
* position a patch consisting of 4*4 "tiles" is visible.
* Joystick controls yaw and pitch. F1 and F2 control roll.
* Don't hold keys down, as the keyboard buffer is not cleared.
* SECTION TEXT
opt d+
bra main
include systm_01.s
include core_07.s
main:
* Initialize the system.
bsr init_vars initialize view transform
bsr flg_init initialize flags
loop:
* Read input and make adjustments.
bsr swp_scn swap the screens
bsr dircosines regenerate view matrix
bsr joy_read see which direction to move
bsr in_key update the speed
bsr adj_vel adjust the velocity
* Draw the scene
bsr scne_drw everything to complete the picture
* Draw the next frame
bra loop
* SECTION DATA
include data_00.s
include data_06.s
include data_07.s
include data_08.s
*SECTION BSS
include bss_07.s
END
*****************************************************************************************
* Core_07.s *
* subroutines for chapter 10 *
*****************************************************************************************
include Core_06.s
scne_drw ; draw a scene of several primitives
bsr patch_ext select the local scene
bsr sight_tst select only the visible ones
bsr vis_srt sort in depth order
bsr drw_it draw them in depth order
rts
*****************************************************************************************
* Extract the tile patch. Put the 16 tiles in a list at patch_lst
patch_ext
move.w oposx,d0 observers x pos
move.w oposy,d1
move.w oposz,d2
* Find position in world. Keep to range 4096
andi.w #$fff,d0 range x
andi.w #$fff,d1 range y
andi.w #$fff,d2 range z
move.w d0,oposx restore x etc..
move.w d1,oposy
move.w d2,oposz
move.w d1,d3
move.w d2,d4
* Find coords of patch centre=local world origin
lsr.w #8,d1
move.w d1,Ty y coord. in 16*16 layout
lsr.w #8,d2
move.w d2,Tz z coord
* Coords of view frame, referenced to this origin
lsl.w #8,d1 Ty*256
lsl.w #8,d2 Tz*256
sub.w d1,d3 oposy-Ty*256 = Ovy
move.w d3,Ovy
sub.w d2,d4 oposz-Tz*256 = Ovz
move.w d4,Ovz
move.w oposx,Ovx (the height is universal)
* Fetch the attributes of the 16 surrounding tiles from the map and calculate their world
* coords. Store the data in a record/structure with the format:
* WORD 1 : HI BYTE - graphics attribute
* LO BYTE - clear
* WORD 2 : Voz tile centre z in view frame coords
* WORD 3 : tile y in local world coords
* WORD 4 : ditto z
* Ty & Tz are the patch centre coords = local world origins.
move.w Ty,d0
move.w Tz,d1
* A 4*4 patch of tiles centred on Ty,Tz is retrieved
move.w #2,d5 z offset of start tile
lea map_base,a0
lea patch_lst,a1 the local list of 4*4
move.w #3,d7 4 z values
tile_lp1
move.w #2,d4 reset start yoffset
move.w #3,d6 4 y values
move.w d1,d3 origin Tz
add.w d5,d3 +offset = next z
andi.w #$f,d3 stay in range 015
lsl.w #4,d3 *16
tile_lp2
move d0,d2 origin Ty
add.w d4,d2 +offset = next y
andi.w #$f,d2 stay in range 015
add.w d3,d2 16*z+y = tile address in map
move.b 0(a0,d2.w),d2 fetch attribute in low byte
swap d2 of high word
clr.w d2 0 for low word
lsl.l #8,d2 everything into high word
move.l d2,(a1)+ store the first half of the record
* Calculate the tile local coords: Ooy & Ooz. Coords are offset*256.
movem.l d4/d5,-(sp) stack offsets
lsl #8,d4 yoffset*256
swap d4 in high word
lsl #8,d5 zoffset*256
move.w d5,d4 in low word
move.l d4,(a1)+
movem.l (sp)+,d4/d5 restore offsets
addq #1,d4 next y offset
dbra d6,tile_lp2 for all tiles in this row
addi.w #1,d5 next z offset
dbra d7,tile_lp1 for all rows
rts
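The indexing done by patch_ext can be summarised in a few lines of Python. This is an illustrative sketch only, mirroring the listing above (including its start offset of 2): the observer's position is kept modulo 4096, the patch-centre tile is pos>>8, and a 4*4 block of tiles is fetched from the 16*16 map with mod-16 wraparound.

```python
# Sketch of patch_ext's tile indexing (not the book's code).
def patch_tiles(oposy, oposz, world_map):
    """Return 16 (attribute, y_offset*256, z_offset*256) tile records."""
    oposy &= 0xFFF                        # keep to range 0..4095 (andi.w #$fff)
    oposz &= 0xFFF
    ty, tz = oposy >> 8, oposz >> 8       # patch-centre tile coords
    records = []
    for dz in range(2, 6):                # z offsets, start offset 2 as listed
        z = (tz + dz) & 0xF               # stay in range 0-15
        for dy in range(2, 6):            # y offsets
            y = (ty + dy) & 0xF
            attr = world_map[16 * z + y]  # 16*z+y = tile address in map
            records.append((attr, dy * 256, dz * 256))
    return records
```

The local coordinates stored in each record are simply the tile offsets scaled by 256, exactly as the lsl #8 instructions in the listing produce them.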
******************************************************************************************
sight_tst
lea patch_lst,a0 pointer to source list
lea vis_lst,a1 list of visible tiles
lea vis_cnt,a2 count of previous
clr.w (a2) zero count
move.w #15,d7 16 tiles in a patch
clr.w Oox all tiles are on the ground
sight_tst1
move.w 4(a0),d0
addi.w #128,d0
move.w d0,Ooy tile
move.w 6(a0),d0
addi.w #128,d0
move.w d0,Ooz centres
movem.l d7/a0-a2,-(sp)
bsr testview is tile within field of vision
movem.l (sp)+,d7/a0-a2
tst.b viewflag visible?
beq nxt_tile
addq.w #1,(a2) yes, increment visible count
move.w Voz,2(a0) save the depth for sorting
move.l (a0),(a1)+ transfer 1st half to visible list
move.l 4(a0),(a1)+ 2nd half
nxt_tile
addq #8,a0 point to next record
dbra d7,sight_tst1 for all tiles
rts
******************************************************************************************
*Test whether the primitive is visible.
* Tile centre (Oox, Ooy, Ooz) transformed to view coords then tested. Correct for 2^14.
testview
moveq.l #2,d6 3 rows in matrix
lea w_vmatx,a3 init matrix pointer
link a6,#-6 reserve 3 words of temporary storage
move.w Oox,d3
move.w Ooy,d4
move.w Ooz,d5
sub.w Ovx,d3 Oox-Ovx rel to the view frame
sub.w Ovy,d4 Ooy-Ovy
sub.w Ovz,d5 Ooz-Ovz
tranv0
move d3,d0 restore
move d4,d1
move d5,d2
muls (a3)+,d0 *Mi1
muls (a3)+,d1 *Mi2
muls (a3)+,d2 *Mi3
add.l d1,d0
add.l d2,d0 *Mi1+*Mi2+*Mi3
lsl.l #2,d0
swap d0 /2^14
move.w d0,-(a6) save it
dbra d6,tranv0 repeat for 3 elements
move.w (a6)+,d3 off my stack becomes Voz
move.w (a6)+,d2 off my stack becomes Voy (centre in view frame)
move.w (a6)+,d1 off my stack becomes Vox
move.w d3,Voz
move.w d2,Voy
move.w d1,Vox
unlk a6
* Clip Ovz. To be visible must have 50<Voz<2000
* This test only looks at depth.
cmp.w #50,d3 test(Voz-50)
bmi notvis fail
cmp.w #2000,d3 test(Voz-2000)
bpl notvis fail
st viewflag we can see it
rts
notvis
sf viewflag can't see it
rts
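The arithmetic in testview is worth spelling out. Each matrix element is a signed 2.14 fixed-point number (16384 represents 1.0), so after a muls the 32-bit product carries an extra factor of 2^14; the "lsl.l #2 then swap" trick takes the top word, which is a division by 2^14. A hedged Python sketch (illustrative, not the book's code):

```python
# Sketch of testview's fixed-point transform and depth test.
def fix_mul_row(row, vec):
    """Dot product of a 2.14 fixed-point matrix row with an integer vector."""
    acc = sum(m * v for m, v in zip(row, vec))
    return (acc << 2) >> 16               # same result as lsl.l #2 + swap

def visible(matrix, centre, view_origin, near=50, far=2000):
    """Transform a tile centre into view coords, then depth-test Voz."""
    rel = [c - o for c, o in zip(centre, view_origin)]
    vox, voy, voz = (fix_mul_row(r, rel) for r in matrix)
    return near <= voz < far, voz
```

The listing's two compares implement exactly this window: a tile is visible only when 50 <= Voz < 2000.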
*****************************************************************************************
* Order the visible tiles in order of decreasing Voz (the distance of the tile centre from
* the view frame origin). Largest Voz's (furthest) should be drawn first.
vis_srt
move.w vis_cnt,d7 number to do
beq srt_quit
subq #1,d7
beq srt_quit
subq #1,d7
* Bubble sort
vis_srt1
lea vis_lst+2,a0 pointer to 1st record Voz
movea.l a0,a1
addq.l #8,a1 pointer to 2nd Voz
move d7,d6 reset count
clr.w srt_flg
vis_srt2
cmpm.w (a0)+,(a1)+ test(Voz2-Voz1)
ble no_swap 1st is farther
move.l -4(a0),d0 fetch 1st record
move.l (a0),d1
move.l -4(a1),-4(a0) make
move.l (a1),(a0) 2nd the 1st
move.l d0,-4(a1) & 1st
move.l d1,(a1) 2nd
st srt_flg
no_swap
addq.l #6,a0 point to next record Voz
addq.l #6,a1 and the one following
dbra d6,vis_srt2
tst.w srt_flg
beq srt_quit
bra vis_srt1
srt_quit
rts
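vis_srt is a plain bubble sort on the Voz field, largest (farthest) first, so that nearer tiles are drawn last and overpaint farther ones: the painter's algorithm. The same idea in Python (a sketch, not a transcription of the listing):

```python
# Painter's algorithm ordering: bubble sort records into decreasing depth.
def painter_sort(records):
    """Sort (voz, payload) records farthest-first with a bubble sort."""
    recs = list(records)
    n = len(recs)
    swapped = True
    while swapped:                          # repeat passes until no swap
        swapped = False
        for i in range(n - 1):
            if recs[i + 1][0] > recs[i][0]: # test(Voz2 - Voz1) > 0
                recs[i], recs[i + 1] = recs[i + 1], recs[i]
                swapped = True
    return recs
```

A bubble sort suits this job because the list is short (at most 16 tiles) and is nearly sorted from frame to frame, so it usually terminates after one or two passes.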
*****************************************************************************************
drw_it
* draw the visible tiles
move.w vis_cnt,d7
beq drw_it_out
subq.w #1,d7
lea vis_lst,a0 ptr to list
drw_it1
movem.l d7/a0,-(sp)
bsr set_prim draw next prim
movem.l (sp)+,d7/a0
addq.l #8,a0 next record
dbra d7,drw_it1
drw_it_out
rts
*****************************************************************************************
* Set up next primitive for drawing; pointer to record in a0.
* 1. DO BACKGROUND
set_prim
move.l a0,-(sp) save ptr
bsr ldup_bkg
bsr otranw obj>world
bsr w_tran_v world>view
* Background always visible at a constant illumination level
movea.l (sp)+,a0 restore ptr
move.w (a0),d0 1st word of record
move.l a0,-(sp) save pointer
lsr.w #8,d0 top byte
lsr.w #4,d0 top nibble is colour
move.w d0,col_lst the final colours
move.w d0,col_lst+2
bsr perspective
bsr scrn_adj centre it
bsr polydraw
*2. Draw the object
movea.l (sp)+,a6 restore pointer
bsr ldup_obj
bsr otranw
bsr w_tran_v
bsr illuminate
bsr perspective
bsr scrn_adj
bsr polydraw
rts
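The two-pass structure of set_prim is the key point: the background quad is drawn first at constant illumination, then the object is run through the full pipeline on top of it. The step names below are the book's subroutine labels wrapped as a hypothetical Python step list, purely to show the ordering:

```python
# Illustrative only: the order of stages set_prim runs per tile record.
BACKGROUND_PASS = ["ldup_bkg", "otranw", "w_tran_v",
                   "perspective", "scrn_adj", "polydraw"]
OBJECT_PASS = ["ldup_obj", "otranw", "w_tran_v", "illuminate",
               "perspective", "scrn_adj", "polydraw"]

def draw_primitive(run_step):
    """Draw one tile: flat-lit background first, then the lit object on top."""
    for step in BACKGROUND_PASS + OBJECT_PASS:
        run_step(step)
```

Because the object pass always follows the background pass, the object overpaints its own tile background without any extra depth test.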
*****************************************************************************************
* Load background data as program data. Background is a grid.
ldup_bkg
move.w #2,npoly 2 rectangles
move.l #$40004,snedges 4 edges in each
lea sedglst,a2 edgelist 0,1,2,3,0,4,5,6,7,4
move.l #1,(a2)+ edges 0,1
move.l #$20003,(a2)+ 2,3
move.l #4,(a2)+ 0,4
move.l #$50006,(a2)+ 5,6
move.l #$70004,(a2)+ 7,4
* The background vertices define a cross. All x coords are zero.
lea ocoordsx,a2 vertex coords x =
move.l #0,(a2)+ 0,0
move.l #0,(a2)+ 0,0
move.l #0,(a2)+ 0,0
move.l #0,(a2) 0,0
lea ocoordsy,a2 y =
move.l #$ff800080,(a2)+ -128,128
move.l #$80ff80,(a2)+ 128,-128
move.l #$fffcfffc,(a2)+ -4,-4
move.l #$40004,(a2) 4,4
lea ocoordsz,a2
move.l #$40004,(a2)+ 4,4
move.l #$fffcfffc,(a2)+ -4,-4
move.l #$ff800080,(a2)+ -128,128
move.l #$80ff80,(a2)+ 128,-128
move.w #8,oncoords
move.w #8,vncoords
move.w #8,wncoords
* The tile centre in the world frame is Oox=0 and the contents of the 3rd & 4th
* words of the records.
move.w #0,Oox
move.w 4(a0),Ooy 3rd word
addi.w #128,Ooy
move.w 6(a0),Ooz 4th word
addi.w #128,Ooz
clr.w otheta no orientation
clr.w ophi
clr.w ogamma
rts
****************************************************************************************
* This has no label in the book and therefore it seems unlikely that it will ever be used.
move.w #1,npoly 1 rectangle
move.l #$4,snedges 4 edges
lea sedglst,a2 edgelist 0,1,2,3,0
move.l #1,(a2)+ edges 0,1
move.l #$20003,(a2)+ 2,3
move.l #0,(a2)+ 0,0
* The background vertices are the corners of the tile.
lea ocoordsx,a2 vertex coords x =
move.l #0,(a2)+ 0,0
move.l #0,(a2)+ 0,0
move.l #0,(a2)+ 0,0
move.l #0,(a2)+ 0,0
lea ocoordsy,a2 y =
clr.l (a2)+
move.l #$ff00ff,(a2)
lea ocoordsz,a2
move.l #$ff,(a2)+ 0,255
move.l #$ff0000,(a2)+ 255,0
move.w #4,oncoords
move.w #4,vncoords
move.w #4,wncoords
* The tile centre in the world frame is Oox=0 and the contents of the 3rd & 4th
* words of the records.
move.w #0,Oox
move.w 4(a0),Ooy 3rd word
move.w 6(a0),Ooz 4th word
clr.w otheta no orientation
clr.w ophi
clr.w ogamma
rts
****************************************************************************************
ldup_obj
* Find out what type of object it is.
move.w (a6),d0 top word
lsr.w #8,d0 top byte
andi.w #$f,d0 low nibble is type (call it n)
lsl.w #2,d0 *4 for offset
lea primitive,a5 ptr to vector table
movea.l 0(a5,d0.w),a5 ptr to type n lists
movea.l 4(a5),a2 pointer to npolyn
move.w (a2),d7 got it
move.w d7,npoly
subq.w #1,d7
move d7,d0
movea.l 8(a5),a0 ptr to nedge list
movea.l a0,a4 saved
lea snedges,a1 destination
move.l (a5),a2 ptr to intrinsic colours
lea srf_col,a3 dest
obj_lp1
move.w (a0)+,(a1)+ transfer edge numbers
move.w (a2)+,(a3)+ transfer intrinsic colours
dbra d0,obj_lp1
* Calculate total number of edges
move.w d7,d0 restore count
clr d1
clr d2
obj_lp2
add.w (a4)+,d2 number of edges
addq #1,d2 and with last repeated
dbra d0,obj_lp2
* Move the edge list
subq #1,d2 counter
movea.l 12(a5),a0 edglstn, the source
lea sedglst,a1 dest
obj_lp3
move.w (a0)+,(a1)+ pass it
dbra d2,obj_lp3
* and the coords list
movea.l 28(a5),a0 ptr to num vertices
move.w (a0),d1 num vertices
move.w d1,oncoords
move.w d1,vncoords
move.w d1,wncoords
subq #1,d1 counter
movea.l 16(a5),a0 ptr to object x
lea ocoordsx,a1
movea.l 20(a5),a2 object y
lea ocoordsy,a3
movea.l 24(a5),a4 object z
movea.l a5,a6
lea ocoordsz,a5
obj_lp4
move.w (a0)+,(a1)+
move.w (a2)+,(a3)+
move.w (a4)+,(a5)+
dbra d1,obj_lp4
* Increment the rotation angle
bsr next_rot
addi.w #128,Ooy
addi.w #128,Ooz
rts
*****************************************************************************************
* Increment the rotation of the object.
next_rot
movea.l 32(a6),a0 ptr to angle and flag
move.l (a0),d0 top word is flag, bottom is angle
move.l d0,d1
andi.l #$ffff,d0 the angle
addi.w #2,d0 increment it
cmp #360,d0
blt obj_lp5
subi #360,d0
obj_lp5
move.w d0,2(a0) next angle
* see what angles to rotate
swap d1
andi.w #$f,d1 flag in lo nib
* flags are set:bit 0= xrot 1=yrot 2=zrot
lsl.w #2,d1 offset
lea rot_vec,a0 ptr to jump table
move.l 0(a0,d1.w),a0
jmp (a0)
rot_vec
dc.l no_rot,rotx,roty,rotxy,rotz,rotxz,rotyz,rotxyz
no_rot rts
rotx
move.w d0,otheta
rts
roty
move.w d0,ophi
rts
rotxy
move.w d0,otheta
move.w d0,ophi
rts
rotz
move.w d0,ogamma
rts
rotxz
move.w d0,otheta
move.w d0,ogamma
rts
rotyz
move.w d0,ophi
move.w d0,ogamma
rts
rotxyz
move.w d0,otheta
move.w d0,ophi
move.w d0,ogamma
rts
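The jump table rot_vec is an unrolled form of a three-bit test: bit 0 of the flag nibble selects rotation about x (otheta), bit 1 about y (ophi), bit 2 about z (ogamma), and the running angle is advanced by 2 degrees modulo 360. A Python sketch of the same logic (illustrative only):

```python
# Sketch of next_rot's flag dispatch and angle wrap.
def next_angle(angle, step=2):
    """Advance the running rotation angle, wrapping at 360 degrees."""
    return (angle + step) % 360

def apply_rotation(flag, angle):
    """Return (otheta, ophi, ogamma): the angle where the flag bit is set."""
    return (angle if flag & 1 else None,   # bit 0: rotate about x
            angle if flag & 2 else None,   # bit 1: rotate about y
            angle if flag & 4 else None)   # bit 2: rotate about z
```

Checking against the table: flag 5 (bits 0 and 2) lands on rotxz, which sets otheta and ogamma, just as the sketch does.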
****************************************************************
* These are the rotation handlers the joystick reader calls.
rot_down
lea rot_y_neg,a0 ptr to ctrl matrix
bsr ctrl_view
rts
rot_up
lea rot_y_pos,a0 ptr to ctrl matrix
bsr ctrl_view
rts
rot_left
lea rot_x_pos,a0 ptr to ctrl matrix
bsr ctrl_view
rts
rot_right
lea rot_x_neg,a0 ptr to ctrl matrix
bsr ctrl_view
rts
roll_left
lea rot_z_neg,a0 ptr to ctrl matrix
bsr ctrl_view
rts
roll_right
lea rot_z_pos,a0 ptr to ctrl matrix
bsr ctrl_view
rts
ctrl_view
* multiply the control matrix pointed to by a0 by the view matrix
* to calculate the new elements of the view base vectors.
* 1.base vector iv
lea w_vmatx,a1 ptr to view matrix
lea iv,a2 ptr to view frame base vector
move.w #2,d6 3 elements to iv
movea.l a1,a3 set view ptr
iv_loop
move.w (a3),d1 next view elements
move.w 6(a3),d2
move.w 12(a3),d3
muls (a0),d1
muls 2(a0),d2
muls 4(a0),d3
add.l d2,d1
add.l d3,d1
lsl.l #2,d1
swap d1
move.w d1,(a2)+ next element in base vector
addq.l #2,a3 next column in base vector
dbra d6,iv_loop
*2. No need to do jv; it's calculated from the other two.
*3. base vector kv
lea kv,a2
move.w #2,d6
movea.l a1,a3
kv_loop
move.w (a3),d1
move.w 6(a3),d2
move.w 12(a3),d3
muls 12(a0),d1
muls 14(a0),d2
muls 16(a0),d3
add.l d2,d1
add.l d3,d1
lsl.l #2,d1
swap d1
move.w d1,(a2)+ next element in base vector
addq.l #2,a3 next column in base vector
dbra d6,kv_loop
rts
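What ctrl_view computes is a premultiplication: each element of the new base vector is one row of the control matrix dotted with one column of the view matrix, all in 2.14 fixed point (the jv base vector, as noted above, is rebuilt from the other two). A hedged Python sketch of one base vector update:

```python
# Sketch of ctrl_view's fixed-point row-times-column products (illustrative).
def new_base_vector(ctrl_row, view_cols):
    """One control-matrix row times the three view-matrix columns (2.14)."""
    out = []
    for col in view_cols:                   # columns of w_vmatx
        acc = sum(c * v for c, v in zip(ctrl_row, col))
        out.append((acc << 2) >> 16)        # /2^14 via the lsl #2 + swap trick
    return out
```

With the view matrix at identity, feeding in a control row such as (16322, 1428, 0) returns that row unchanged, which is how the first joystick nudge tilts the view frame by exactly the control matrix's 5 degrees.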
*****************************************************************************************
* Set the velocity components
adj_vel
lea kv,a0
move.w #14,d7
move.w speed,d0
lsl.w #4,d0
move d0,d1
move d0,d2
muls (a0),d0 v*VZx
asr.l d7,d0 /2^14, sign preserved
add.w d0,oposx xw speed component
bpl adj1
clr.w oposx oposx must be > 0
adj1
muls 2(a0),d1 v*VZy
asr.l d7,d1
add.w d1,oposy yw speed component
muls 4(a0),d2 v*VZz
asr.l d7,d2
add.w d2,oposz zw speed component
rts
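adj_vel projects the speed along the view frame's z axis: the kv base vector holds that axis in 2.14 fixed point, so each world velocity component is speed (scaled up by 16) times a kv component, shifted back down by 14 bits. A sketch in Python (illustrative; it uses an arithmetic shift so negative components scale correctly):

```python
# Sketch of adj_vel: world velocity from speed along the view z axis.
def velocity(kv, speed):
    """World-frame velocity components; kv is in 2.14 fixed point."""
    v = speed << 4                     # lsl.w #4 in the listing
    return tuple((v * k) >> 14 for k in kv)
```

Flying straight along the world x axis (kv = (16384, 0, 0)) at speed 8 therefore moves the observer 128 units per frame in x and nothing in y or z.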
*****************************************************************************************
* bss_07.s *
* variables for chapter 10 *
*****************************************************************************************
include bss_06.s
* Observer's position in the world (mod 4096)
oposx ds.w 1
oposy ds.w 1
oposz ds.w 1
* Tile offset in 16*16 patch
Ty ds.w 1
Tz ds.w 1
* Tile lists
patch_lst ds.l 32 records of 16 tiles in patch
vis_lst ds.l 32 records of visible tiles
* List variables
vis_cnt ds.w 1 number of visible tiles
srt_flg ds.w 1 set during depth sorting
*****************************************************************************************
* System_01.s *
*****************************************************************************************
init_vars:
* set up the screens
bsr init
* Set up the view point
move.w #100,oposx
clr.w oposy
clr.w oposz
* and the clip frame
move.w #50,clp_xmin
move.w #270,clp_xmax
move.w #30,clp_ymin
move.w #170,clp_ymax
* Set up view frame base vectors
*1. iv
lea iv,a0 align view frame axes
move.w #$4000,(a0)+
move.w #0,(a0)+
move.w #0,(a0)
*2. jv
lea jv,a0 with the world frame
clr.w (a0)+
move.w #$4000,(a0)+
clr.w (a0)
*3.kv
lea kv,a0
move.w #0,(a0)+
clr.w (a0)+
move.w #$4000,(a0)
flg_init:
* Initialize flags and other variables
clr.w speed start at rest
clr.w screenflag 0=screen 1 draw, 1=screen 2 draw
clr.w viewflag
* Move the view point to 300 on the view frame z axis
lea persmatx,a0
move.w #300,d0
move.w d0,(a0)
move.w d0,10(a0)
move.w d0,30(a0)
rts
swp_scn:
tst.w screenflag screen 1 or screen2?
beq screen_1 draw on screen 1, display screen2
bsr drw2_shw1 draw on screen 2, display screen1
clr.w screenflag and set the flag for next time
bra screen_2
screen_1:
bsr drw1_shw2 draw on 1, display 2
move.w #1,screenflag and set the flag for next time
screen_2:
rts
*****************************************************************************************
* data_06.s *
* Data for chapter 10 *
*****************************************************************************************
include data_05.s (ensure we include data_03.s as well).
* The vector table of graphics primitives in 8 shades of 4 colours.
primitive:
dc.l prim0,prim1,prim2,prim3,prim4,prim5
* Now follow the vector tables for each primitive.
prim0 ; A simple block
dc.l colrs0,npoly0,nedg0,edglst0,prm0x,prm0y,prm0z,npts0,theta0
colrs0 dc.w 1,1,1,1,1
npoly0 dc.w 5
nedg0 dc.w 4,4,4,4,4
edglst0 dc.w 0,1,2,3,0,3,2,4,5,3,5,4,6,7,5,7,6,1,0,7,1,6,4,2,1
prm0x dc.w 0,50,50,0,70,0,70,0
prm0y dc.w 6,6,6,6,6,6,6,6
prm0z dc.w 6,6,6,6,6,6,6,6
npts0 dc.w 8
theta0 dc.l $10000
prim1 ; An inverted pyramid
dc.l colrs1,npoly1,nedg1,edglst1,prm1x,prm1y,prm1z,npts1,theta1
colrs1 dc.w 2,2,2,2,3
npoly1 dc.w 5
nedg1 dc.w 3,3,3,3,4
edglst1 dc.w 0,1,2,0,0,2,3,0,0,3,4,0,0,4,1,0,1,4,3,2,1
prm1x dc.w 0,75,75,75,75
prm1y dc.w 0,32,32,32,32
prm1z dc.w 0,32,32,32,32
npts1 dc.w 5
theta1 dc.l $10000
prim2 ; A nugget.
dc.l colrs2,npoly2,nedg2,edglst2,prm2x,prm2y,prm2z,npts2,theta2
colrs2 dc.w 1,1,0,1,0,0,1,0,1,1,0,1,0,1
npoly2 dc.w 14
nedg2 dc.w 4,4,4,4,4,4,4,4,4,4,4,4,4,4
edglst2 dc.w 1,6,4,2,1,0,1,2,3,0,3,2,4,5,3,4,6,7,5,4,6,1,0,7,6,8,0,3,11,8,3
dc.w 5,10,11,3,5,7,9,10,5,7,0,8,9,7,8,11,13,12,8,11,10,14,13,11,10,9
dc.w 15,14,10,9,8,12,15,9,12,13,14,15,12
prm2x dc.w 40,60,60,40,60,40,60,40,20,20,20,20,0,0,0,0
prm2y dc.w 30,10,10,30,10,30,10,30,30,30,30,30,10,10,10,10
prm2z dc.w 30,10,10,30,10,30,10,30,30,30,30,30,10,10,10,10
npts2 dc.w 16
theta2 dc.l $70000
prim3 ; A Tee.
dc.l colrs3,npoly3,nedg3,edglst3,prm3x,prm3y,prm3z,npts3,theta3
colrs3 dc.w 2,2,2,2,2,2,2,2,2,2
npoly3 dc.w 10
nedg3 dc.w 4,4,4,4,4,4,4,4,4,4
edglst3 dc.w 0,1,2,3,0,3,2,4,7,3,4,5,6,7,4,5,1,0,6,5
dc.w 8,11,14,15,8,13,14,11,10,13,12,13,10,9,12,8,15,12,9,8
dc.w 12,15,14,13,12,10,11,8,9,10
prm3x dc.w 0,45,45,0,45,45,0,0,70,45,45,70,45,45,70,70
prm3y dc.w 10,10,10,10,10,10,10,10,128,128,128,128,128,128,128,128
prm3z dc.w 10,10,10,10,10,10,10,10,10,10,10,10,10,10,10,10
npts3 dc.w 16
theta3 dc.l $10000
prim4 ; A roller
dc.l colrs4,npoly4,nedg4,edglst4,prm4x,prm4y,prm4z,npts4,theta4
colrs4 dc.w 1,0,1,0,1,0,1,1
npoly4 dc.w 8
nedg4 dc.w 4,4,4,4,4,4,6,6
edglst4 dc.w 1,2,8,7,1,0,1,7,6,0,5,0,6,11,5,4,5,11,10,4,3,4,10,9,3
dc.w 2,3,9,8,2,4,3,2,1,0,5,4,6,7,8,9,10,11,6
prm4x dc.w 0,40,40,0,40,40,0,40,40,0,40,40
prm4y dc.w 8,8,8,8,8,8,8,8,8,8,8,8
prm4z dc.w 45,20,20,45,20,20,45,20,20,45,20,20
npts4 dc.w 12
theta4 dc.l $20000
prim5 ; Another roller
dc.l colrs5,npoly5,nedg5,edglst5,prm5x,prm5y,prm5z,npts5,theta5
colrs5 dc.w 3,2,3,2,3,2,3,3
npoly5 dc.w 8
nedg5 dc.w 4,4,4,4,4,4,6,6
edglst5 dc.w 1,2,8,7,1,0,1,7,6,0,5,0,6,11,5,4,5,11,10,4,3,4,10,9,3
dc.w 2,3,9,8,2,4,3,2,1,0,5,4,6,7,8,9,10,11,6
prm5x dc.w 0,40,40,0,40,40,0,40,40,0,40,40
prm5y dc.w 8,8,8,8,8,8,8,8,8,8,8,8
prm5z dc.w 45,20,20,45,20,20,45,20,20,45,20,20
npts5 dc.w 12
theta5 dc.l $40000
*****************************************************************************************
* data_07.s *
* Control matrices for rotation *
*****************************************************************************************
* +ve rotation about the view frame x axis (LEFT) by 5 degrees.
rot_x_pos:
dc.w 16384,0,0,0,16322,-1428,0,1428,16322
* -ve rotation about the xv axis (RIGHT)
rot_x_neg:
dc.w 16384,0,0,0,16322,1428,0,-1428,16322
* +ve rotation about the yv axis (UP)
rot_y_pos:
dc.w 16322,0,1428,0,16384,0,-1428,0,16322
* -ve rotation about the yv axis (DOWN)
rot_y_neg:
dc.w 16322,0,-1428,0,16384,0,1428,0,16322
* +ve rotation about the zv axis (ROLL RIGHT)
rot_z_pos:
dc.w 16322,-1428,0,1428,16322,0,0,0,16384
* -ve rotation about the zv axis (ROLL LEFT)
rot_z_neg:
dc.w 16322,1428,0,-1428,16322,0,0,0,16384
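Where do 16384, 16322 and 1428 come from? On the assumption stated throughout the book that matrix elements are 2.14 fixed point, they are 1.0, cos 5° and sin 5° scaled by 2^14:

```python
# Derivation of the control matrix constants (2.14 fixed point, 5 degrees).
import math

def fx(value):
    """Scale a real number into 2.14 fixed point."""
    return round(value * 16384)

c5 = fx(math.cos(math.radians(5)))   # cosine entry
s5 = fx(math.sin(math.radians(5)))   # sine entry
```

Running this reproduces the table entries exactly: fx(1.0) is 16384, c5 is 16322 and s5 is 1428, so a different rotation step could be generated the same way.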
*****************************************************************************************
* data_08.s *
* The world map for chapter 10. Each byte gives the attribute of a size 256*256 tile in *
* a 16*16 tile world. The attributes' composition is thus: *
* High Nibble : Background colour (1-7) *
* Low Nibble : Primitive type (0-5) *
*****************************************************************************************
map_base
dc.b $62,$62,$62,$50,$41,$35,$35,$35
dc.b $35,$35,$35,$43,$45,$54,$54,$64
dc.b $62,$62,$62,$55,$42,$33,$35,$35
dc.b $35,$35,$32,$44,$45,$54,$54,$64
dc.b $52,$52,$52,$52,$44,$35,$34,$35
dc.b $35,$30,$35,$41,$44,$54,$54,$64
dc.b $45,$41,$42,$42,$42,$35,$22,$23
dc.b $23,$20,$25,$25,$44,$44,$40,$65
dc.b $33,$35,$30,$32,$32,$22,$25,$25
dc.b $25,$23,$24,$24,$35,$32,$35,$31
dc.b $35,$32,$35,$35,$32,$22,$11,$11
dc.b $10,$10,$24,$24,$33,$35,$32,$34
dc.b $20,$25,$25,$25,$20,$21,$13,$13
dc.b $13,$13,$20,$25,$25,$25,$20,$25
dc.b $24,$25,$25,$25,$21,$21,$13,$13
dc.b $13,$13,$20,$20,$25,$25,$20,$25
dc.b $20,$25,$25,$25,$22,$22,$13,$13
dc.b $13,$13,$14,$24,$25,$25,$22,$23
dc.b $25,$23,$25,$25,$23,$22,$13,$13
dc.b $13,$13,$14,$23,$25,$25,$25,$25
dc.b $31,$35,$30,$35,$31,$21,$22,$22
dc.b $20,$20,$20,$35,$35,$34,$20,$33
dc.b $45,$40,$40,$40,$41,$41,$22,$22
dc.b $22,$25,$30,$40,$40,$42,$45,$41
dc.b $40,$40,$41,$41,$44,$45,$30,$35
dc.b $35,$35,$32,$45,$40,$50,$55,$55
dc.b $61,$61,$61,$51,$53,$45,$35,$32
dc.b $35,$35,$31,$45,$40,$50,$60,$60
dc.b $61,$61,$61,$52,$55,$44,$33,$35
dc.b $33,$35,$30,$45,$40,$50,$60,$60
dc.b $61,$61,$61,$55,$51,$45,$30,$35
dc.b $32,$35,$35,$41,$45,$50,$60,$60
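Decoding a map byte follows directly from the header comment above data_08.s: the high nibble is the background colour, the low nibble the primitive type. As a one-line Python sketch:

```python
# Split a world-map attribute byte into (background colour, primitive type).
def decode_tile(attr):
    """High nibble = background colour (1-7), low nibble = type (0-5)."""
    return attr >> 4, attr & 0xF
```

So the first map byte, $62, is a colour-6 background carrying primitive type 2 (the nugget), which is exactly what ldup_obj's "andi.w #$f" and sight_tst's shifts extract in the listings.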
* eulr_scn.s
* A multi-object scene
*****************************
* A world scene consisting of various types of graphics primitives
* in motion. The viewer is free to "fly" to any location with
* flight simulator type control from the joystick. At any
* position a patch consisting of 4*4 tiles is visible.
* SECTION TEXT
opt d+
bra main
include systm_01.s
include core_08.s
main:
* Initialize the system
bsr init_vars initialize view transform
bsr flg_init initialize flags
loop:
* Read input and make adjustments
bsr swp_scn swap the screens
bsr joy_look see which directions to move
bsr angle_update change the euler angles
bsr wtranv_l construct the view transform
bsr vtran_move move it to the base vectors
bsr in_key update the speed
bsr adj_vel adjust the velocity
* Draw the scene
bsr scne_drw everything to complete the picture
* Draw the next frame
bra loop
* SECTION DATA
include data_00.s
include data_06.s
include data_07.s
include data_08.s
* SECTION BSS
include bss_07.s
END
*****************************************************************************************
* Core_08.s *
* Subroutines for euler_scn Chapter 12 *
*****************************************************************************************
include core_07.s previous subroutines
joy_look:
* Change the Euler angles etheta and ephi (vtheta and vphi from chapter 10 are the same).
* Read the joystick and update the variables accordingly
move.w $dff00c,d0 joystick 2
btst #8,d0
beq eright_dn
btst #9,d0
beq eup_jy
bra eleft_jy
eright_dn:
btst #0,d0
beq eout_jy
btst #1,d0
beq edown_jy
bra eright_jy
eout_jy rts
eup_jy:
bsr erot_down rotate view frame down about vy axis
rts
edown_jy:
bsr erot_up rotate up about vy axis
rts
eleft_jy:
bsr erot_left rotate left about vx axis
rts
eright_jy:
bsr erot_right rotate right about vx axis
rts
erot_down:
* Rotate down about the yv axis. Decrement ephi (same as vphi)
move.w #-5,vphi_inc
rts
erot_up:
* Rotate up about the yv axis. Increment ephi (same as vphi)
move.w #5,vphi_inc
rts
erot_left:
* Rotate left about the xv axis. Increment etheta
move.w #5,vtheta_inc
rts
erot_right:
* Rotate right about the xv axis. Decrement etheta
move.w #-5,vtheta_inc
rts
vtran_move:
* move the view transform matrix to the base vectors
* really just a change of label
lea iv,a0
lea jv,a1
lea kv,a2
lea w_vmatx,a3
move.w (a3)+,(a0)+ all
move.w (a3)+,(a0)+ iv
move.w (a3)+,(a0)+
move.w (a3)+,(a1)+ all
move.w (a3)+,(a1)+ jv
move.w (a3)+,(a1)+
move.w (a3)+,(a2)+ all
move.w (a3)+,(a2)+ kv
move.w (a3),(a2)
rts
Appendix A: 68000 Instruction Set
Entire books have been written concerning the 68000 instruction set. There is insufficient space here to do more than outline the essentials. A succinct but thorough discussion is given in the Motorola 16-Bit User's Manual.
The central feature of assembly language programming is that there are no abstract algebraic variables as in regular mathematics or high level languages such as BASIC. It is not possible to make statements such as
LET x=y+z
though it is possible to effect equivalent manipulations of data.
In assembly language, names such as x, y or z are labels representing addresses in RAM. At these addresses can be found binary numbers which are the current values of the parameters associated with the labels. There is a similarity to algebraic variables but at every stage it is the binary number itself which is manipulated either in memory or in the processor registers. The addressing modes of the 68000 are designed to deal with all the ways data needs to be addressed or directed through the system during the execution of the various instructions.
The 68000 instruction set is extensive and powerful. It has two important aspects: the instructions themselves and their addressing modes, which form the basic framework for data acquisition and manipulation.
A.1. Registers
The 68000 processor has eight 32-bit data registers (D0-D7) dedicated to data, seven 32-bit address registers (A0-A6) which can be used for data and addresses, two 32-bit stack pointers (both called A7 but used separately, one for the system and one for the user) set to point to last-in, first-out temporary storage areas of RAM (stacks), one 32-bit program counter to keep count of program progress and one 16-bit status register of flags to record the results of operations. The 32-bit registers can be used to handle the five basic data types: bits, BCD digits, bytes (8 bits), words (16 bits) and long words (32 bits).
A.2. Addressing Modes
Each instruction is concerned with the manipulation of data of some kind somewhere in the microcomputer system: in the processor, in memory or from external hardware. The addressing modes are designed for the many ways data is accessed. There are six basic types: Register Direct, Register Indirect, Absolute, Immediate, Program Counter Relative and Implied, which encompass the 14 modes listed below. For each instruction, the data (which can be an address) which is about to be manipulated is located somewhere in the system. The addressing modes give the ways this location is to be found. In its most general form this to-be-determined address is called an effective address (ea).
A.2.1. Addressing Modes
Immediate Data Addressing
Immediate: the data is the next word
Quick Immediate: the data is included within the instruction
Implied: ea = SR, SP or PC
Register Direct
Address Register Direct: ea = An (data contained in the named address register)
Data Register Direct: ea = Dn (data contained in the named data register)
Absolute Data Addressing
Absolute Short: ea = (next word) (data is at the address given in the word following the instruction)
Absolute Long: ea = (next 2 words)
Register Indirect Addressing
Register Indirect: ea = (An) (data is at the address given in the named address register)
Postincrement Register Indirect: ea = (An)+ (as (An), then increment the register)
Predecrement Register Indirect: ea = -(An) (as (An), but decrement the register first)
Register Indirect with Offset: ea = d16(An) (as (An) plus a word-length displacement)
Indexed Register Indirect with Offset: ea = d8(An,Xn) (as (An) plus a byte-length displacement together with the contents of an address or data register acting as an index)
An important version of register indirect addressing is PC relative, where the program counter is used instead of An in d16(An) and d8(An,Xn). This allows reference to memory locations relative to the current program counter and is used to generate position independent code. It is not used in this book since the assembler generates relocatable code which achieves the same end.
A.3. Instruction Set
In general, instructions have associated with them a source operand and a destination operand. What these actually mean depends on the specific instruction; for example in a MOVE instruction they do exactly what they imply: supply the source and destination effective addresses. In an ADD instruction they give the addresses of the two numbers to be added. These operands follow the instruction, on the same line. The instruction itself is like the verb of the sentence.
In addition the instruction has attributes. These are the permitted data sizes, which can be one or more of the types: byte, word or long word depending on the instruction. Also as a consequence of the instruction certain flags will be set or cleared in the condition code (status) register.
The list below gives the assembler mnemonics for the main instruction types.
Mnemonic Action
ABCD add decimal with extend
ADD add
AND logical AND
ASL arithmetic shift left
ASR arithmetic shift right
Bcc branch conditionally
BCHG bit test and change
BCLR bit test and clear
BRA branch always
BSET bit test and set
BSR branch to subroutine
BTST bit test
CHK check register against bounds
CLR clear operand
CMP compare
DBcc test condition, decrement and branch
DIVS signed divide
DIVU unsigned divide
EOR exclusive OR
EXG exchange registers
EXT sign extend
JMP jump
JSR jump to subroutine
LEA load effective address
LINK link stack
LSL logical shift left
LSR logical shift right
MOVE move
MOVEM move multiple registers
MOVEP move peripheral data
MULS signed multiply
MULU unsigned multiply
NBCD negate decimal with extend
NEG negate
NOP no operation
NOT ones complement
OR logical OR
PEA push effective address
RESET reset external devices
ROL rotate left
ROR rotate right
ROXL rotate left with extend
ROXR rotate right with extend
RTE return from exception
RTR return and restore
RTS return from subroutine
SBCD subtract decimal with extend
Scc set conditionally
STOP stop
SUB subtract
SWAP swap data register halves
TAS test and set operand
TRAP trap
TRAPV trap on overflow
TST test
UNLK unlink
A list of condition codes is shown below:
A.3.1. Condition Codes
CC carry clear
CS carry set
EQ equal
F false (never true)
GE greater or equal
GT greater than
HI high
LE less or equal
LS low or same
LT less than
MI minus
NE not equal
PL plus
T always true
VC no overflow
VS overflow
The condition codes follow instructions such as DBcc and Bcc, but be careful! The codes test the result of a calculation in the order (destination operand) - (source operand), placing the result (if any) in the destination.
DBcc (which is used for loop processing) will go to the next instruction if the condition is true, whereas Bcc (used for a straight branch) will branch if the condition is true (and go to the next instruction if it is false).
The most obvious loop instruction, DBRA (decrement a counter and branch until it reaches -1), is actually absent from the 68000 set, but DBF (decrement and branch on false, a condition which is never true) achieves the same result. Most assemblers implement DBRA anyway (converting it to DBF on assembly), as a service to mankind.
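The practical consequence of the -1 terminator is that a dbra loop started with counter N executes its body N+1 times, which is why the listings in this book load a count of 3 for "4 z values". A small Python sketch of the rule (assuming a non-negative starting counter):

```python
# Sketch of the DBF/DBRA loop rule: body runs (initial counter + 1) times.
def dbf_iterations(n):
    """Count body executions of a dbra loop started with counter n (n >= 0)."""
    count = 0
    d = n
    while True:
        count += 1          # the loop body
        d -= 1              # decrement the counter
        if d == -1:         # fall through only when the counter hits -1
            return count
```

This matches, for example, tile_lp1 above: move.w #3,d7 followed by dbra gives exactly four passes.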
A.4. Variations of Instruction Types
Here are additional variations of the main types. Most important are the endings Q and I which refer to faster “Quick” and “Immediate” versions; Quick being the faster of the two.
ADDA add address
ADDI add immediate
ADDQ add quick
ADDX add with extend
ANDI AND immediate
ANDI to CCR AND immediate to condition codes
ANDI to SR AND immediate to status register
CMPA compare address
CMPI compare immediate
EORI exclusive OR immediate
EORI to CCR exclusive OR immediate to condition codes
EORI to SR exclusive OR immediate to status register
MOVEA move address
MOVEQ move quick
MOVE to CCR move to condition codes
MOVE to SR move to status register
MOVE from SR move from status register
MOVE to USP move to user stack pointer
NEGX negate with extend
ORI OR immediate
ORI to CCR OR immediate to condition codes
ORI to SR OR immediate to status register
SUBA subtract address
SUBQ subtract quick
SUBI subtract immediate
SUBX subtract with extend
Appendix B: Devpac Assembler
There are many good assemblers available. The Devpac Amiga 2 Assembler/Debugger by HiSoft has been used to develop the programs in this book. What is included in this appendix is the small subset which has been found to be especially useful.
It provides for editing, assembling, running and debugging a program all within the one environment. This gives the speediest development of programs.
B.1. GenAm2
This is the combined editor, assembler and debugger. You can write programs, then run and debug them, all within GenAm2.
B.1.1. The Editor
This is a friendly screen editor, allowing you to roam freely through the entire program. Tabs can be set to convenient column positions in the instruction line which will consist of the following fields separated by spaces:
label mnemonic operand(s) comment
The label is actually an address in RAM though it appears in the program as a user-friendly word, usually having a meaning which is relevant to the program. For example, if it is the point to which the program returns in a repetitive loop, it might be simply “loop”. Instruction mnemonics and operands have been discussed in Appendix A. The comment field should explain in an informative way what is going on so that the progress of the program can be easily understood. An example might be
loop move.w d0,(a0) save the flag
B.1.2. Moving About the File
Gross movements about a file are easily done by using the Amiga key (the outline A on the right-hand side). To go to the start (top) or end (bottom) of a file press Amiga+T or Amiga+B, respectively.
The cursor keys can be used to control movement within the screen.
B.1.3. Editing Text
Whole lines can be deleted by pressing Control+Y, and restored by pressing Control+U (useful for repeating lines). Deleting within a line can be done by pressing Backspace (backwards) or Delete (forwards).
B.1.4. Text Movement
Among the most useful facilities are those which handle blocks of text. First move the cursor to the start of the block and press F1. Go to the end of the block and press F2. A marked block can be manipulated in several ways (Help lists these):
F3 saves a block; F4 copies it (to where the cursor is),
Shift+F4 saves it to the block buffer from where it can be pasted into the next file,
Shift+F3 or Shift+F5 deletes it (but also saves it in the block buffer in case you made a mistake!),
F5 pastes in the block (at the cursor).
Amiga+W prints it out.
B.1.5. Assembly
A program can be assembled in several ways. Just to see whether it will assemble, choose the Output to None option. This is the best thing to try on the first attempt. To run and debug a program choose the Output to Memory option. To save the assembled program to run independently choose the Output to Disk option and name it with the file extension .PRG. For the programs in this book beyond Chapter 6 it is probably best to assemble them to disk to avoid running out of space. They can then be run as executable programs from the CLI, for example.
B.1.6. Options
There are many options available which affect how the assembly should take place. The option OPT D+ (written at the top of the source file but after a BRA to the actual program) is very useful: it retains labels in the debugger, which helps enormously in following the program.
B.1.7. Directives
Assembler directives, which have a similar appearance to assembler instruction mnemonics but which are unique to the assembler, are fairly standard. The common ones, such as EQU (or =), DC, DS, used to fix the values of labels, set up (tables of) constants and to set up variables space, respectively, are used extensively throughout the example programs. Also used extensively to pull in files at assembly is the INCLUDE directive. This has made it possible to build up the book and the overall program by stages. The programs themselves show best how the directives are used.
B.1.8. Debugging
All assembly language programs have errors. Often, more time is spent debugging programs than writing them and so it helps to have a good debugger.
The debugger is actually called MonAm and is available as a free-standing program or within the Editor. Using it within the Editor makes the cycle of editing, assembling, running and debugging complete. Most likely you will want to single step through a program and watch what happens in the 68000 registers and in memory. Three windows display the register contents, a disassembled section of program around the current address of the program counter, and the contents of a selected part of memory. A fourth small window passes messages. For the purpose of changing addresses and register contents, any one of the display windows can be made active by toggling Tab.
B.1.9. Executing Programs
There are many ways of monitoring a program. Here are some of them:
Ctrl+Z or Ctrl+Y   single step; every instruction executed
Ctrl+T             single step; skips BSRs, JSRs, Line-A calls and Traps
Ctrl+A             single step; places a breakpoint after the next instruction
                   (useful for bypassing DBFs (DBRAs))
Run                produces a prompt for the type of run, e.g.
G                  run at full speed to the next breakpoint
B.1.10. Breakpoints
These allow you to stop the program at specific addresses. They control the flow of the program in the different running modes. Here are simple controls:
Amiga+B   set a breakpoint at an address
Ctrl+K    clear all set breakpoints
U         asks for an address to run to
Help      shows Help and the breakpoints
B.1.11. Miscellaneous
Control+C   terminate MonAm
L           list labels
P           print out (active window)
M           modify address
Amiga+A     set the starting address (active window)
Amiga+R     change contents of a named register
B.1.12. Hunting for Bugs
This is a skill learned through experience. The most useful tip is to check programs thoroughly before trying them. Try to construct programs in a structured way, in modules, each of which can be thoroughly tested independently before joining them all together. Do not rely on the Debugger to find the mistakes. By that time you’ll have forgotten what each part of the program was for. Don’t be in a hurry; don’t spend one hour “bugging” and ten hours debugging!
One of the most common errors is a bus error. This occurs when the program counter finds itself pointing at a wrong part of memory. It is often caused by the Stack getting out of order, particularly when a return address from a subroutine is required. Look to see how you have been using the Stack during the subroutine.
Appendix C: Number Systems
C.1. Binary
Computers are made from electronic switches which are either off (0) or on (1). The number system which can be constructed out of such units is called binary (base 2); the system which goes in powers of 10 is called denary (base 10). In the binary system numbers are assembled from powers of 2. For example:
13_{10} = 1*2^{3} + 1*2^{2} + 0*2^{1} + 1*2^{0}
Instead of writing numbers out in this long form it is usual to arrange only the coefficients of the powers of 2 in columns. The column number, labelled from the right, gives the power of 2. Hence the number 13 is written as
13_{10} = 1101_{2}
Each one of the units in the binary number is called a binary digit, or bit for short. A group of four bits is called a “nibble”, especially loved by assembly language programmers, who have frequent use for it.
A group of 8 bits also has a special name, a “byte”, whose common use largely dates from the age of 8-bit microcomputers, which transferred data in bytes. In more recent 16-bit microprocessors (this microprocessor labelling scheme refers to the size of the data bus) such as the 68000, groups of 16 and 32 bits are commonly used; these are called “words” and “long words” respectively.
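The worked example above can be checked with a short snippet. This is an illustrative sketch in Python, not the book's own 68000 assembly:

```python
# 13 assembled from powers of 2, exactly as in the text.
value = 1*2**3 + 1*2**2 + 0*2**1 + 1*2**0
print(value)               # 13

# Writing only the coefficients in columns gives the binary form.
print(format(13, "04b"))   # 1101
print(int("1101", 2))      # back to 13
```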
C.2. Hexadecimal (hex for short)
Humans count in powers of 10 (probably because they have 10 fingers), and find it unnatural to count in powers of 2. But some link with the binary system is necessary for assembly language programmers, especially when memory locations are being inspected. To this end the hexadecimal number system is commonly used. In it nibbles are abbreviated into single symbols. For the values up to 9 ordinary denary numbers are used but for the values 10 to 15 (the maximum value of a nibble) new symbols are needed. Here a great opportunity has been lost. Instead of inventing new computer age symbols, the letters of the alphabet A, B, C, D, E, F have been hijacked. Hexadecimal means base 16.
In the three systems binary, denary and hexadecimal respectively, the equivalence is:
Binary   Denary   Hexadecimal
0000     0        0
0001     1        1
0010     2        2
0011     3        3
0100     4        4
0101     5        5
0110     6        6
0111     7        7
1000     8        8
1001     9        9
1010     10       A
1011     11       B
1100     12       C
1101     13       D
1110     14       E
1111     15       F
C.3. Negative Numbers
Negative numbers in binary are hard to get the hang of. This is because there is no special symbol reserved for the minus sign and it must be encoded within the number itself. It is done in the following way.
For simplicity, suppose we are working only in nibble-size numbers (in fact there aren’t any instructions to handle numbers of only this size on the 68000; a nibble must be part of a larger number). To deal in negative numbers the total possible range, 0 to 15, is split equally. The interval 0 to 7 inclusive (8 numbers) is reserved for positives and the interval 15 down to 8 inclusive (also 8 numbers) is reserved for negatives (the range -1 to -8). It’s not as daft as it sounds. A negative number is obtained by counting backwards from 0. If there is nothing below 0 the next best thing to do is to go to the top and count down. In a practical sense this is a good method because all the negative numbers have their top bit set. The top bit is like a minus sign turned vertical. There is a fancy name for this convention: 2’s complement. There is a simple recipe for getting the negative of a number: write it in binary, switch all the 1’s to 0’s and 0’s to 1’s and then add 1. Let’s try it. We know that -2 is in fact 14, so here’s the check:
Step 1: +2 is 0010.
Step 2 (2’s complement): change the bits to 1101 and add 1 to give 1110, which is 14 and therefore correct.
The 2’s complement method of labelling negative numbers works for any size: bytes, words and long words. But be warned, only you know that the number is -2 and not 14; the computer doesn’t! To help you keep track of what is going on the 68000 has instructions, called signed instructions, which treat the top bit as a sign bit. There are other, unsigned instructions, which treat numbers as positive only. These help, but there are many occasions where the programmer must watch that numbers do not exceed their allotted range and flip sign, usually with pathological consequences.
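The flip-the-bits-and-add-one recipe can be sketched in a few lines. This is an illustrative Python sketch (the book's programs are in 68000 assembly, and the name `negate_4bit` is mine, not from the book):

```python
def negate_4bit(n):
    """Two's complement negate inside a nibble: flip the bits, add 1, keep 4 bits."""
    return (~n + 1) & 0xF

print(negate_4bit(2))        # 14 -- the pattern 1110, read as -2 by signed instructions
print(negate_4bit(14))       # 2  -- negating twice gets back the original
print(negate_4bit(2) >> 3)   # 1  -- the top bit is set, marking the number negative
```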
In assembly language the different number types are distinguished by their different prefixes:
denary: none; binary: %; hex: $.
Appendix D: Chip Registers
Below is a brief list of the addresses of the registers used by the special or custom hardware chips “Agnus”, “Denise” and “Paula”. The addresses are given as offsets from the base address $dff000. In general chip registers are either read-only (R) or write-only (W) or in some cases strobe (S) (triggered by writing to them), and an attempt to do the wrong one will cause trouble. Only a few of the registers are listed below, just those that appear in the programs in this book. For further details consult the Amiga Hardware Reference Manual. Another useful book is “Amiga System Programmer’s Guide” by Abacus, a Data Becker book.
REGISTER   ADDRESS   R/W   FUNCTION
BLTAFWM    $44       W     Blitter first word mask for source A
BLTALWM    $46       W     Blitter last word mask for source A
BLTCON0    $40       W     Blitter control register 0
BLTCON1    $42       W     Blitter control register 1
BLTSIZE    $58       W     Size of block to blit
BLTCMOD    $60       W     Blitter source C modulo
BLTBMOD    $62       W     Blitter source B modulo
BLTAMOD    $64       W     Blitter source A modulo
BLTDMOD    $66       W     Blitter destination D modulo
BLTCPTH    $48       W     Blitter source C pointer
BLTBPTH    $4C       W     Blitter source B pointer
BLTAPTH    $50       W     Blitter source A pointer
BLTDPTH    $54       W     Blitter destination D pointer
BPL1MOD    $108      W     Bit plane modulo for odd planes
BPL2MOD    $10A      W     Bit plane modulo for even planes
BPLCON0    $100      W     Bit plane control register 0
BPLCON1    $102      W     Bit plane control register 1
BPLCON2    $104      W     Bit plane control register 2
BPL1PTH    $0E0      W     Start of bit plane pointers
COLOR00    $180      W     Start of colour table
COP1LC     $80       W     Copper list 1 address
COP2LC     $84       W     Copper list 2 address
COPJMP1    $88       S     List 1 restart strobe
COPJMP2    $8A       S     List 2 restart strobe
DIWSTRT    $8E       W     Display window start
DIWSTOP    $90       W     Display window stop
DDFSTRT    $92       W     Display data fetch start
DDFSTOP    $94       W     Display data fetch stop
DMACON     $96       W     Set DMA status
VPOSR      $04       R     Read vertical beam position
Appendix E: Vectors and Matrices
Vectors and matrices go together. Whatever convention is chosen for vectors determines the convention for matrices.
E.1. Vectors
A vector is a concise way of specifying a position in space. The position is measured from a fixed position called the origin. Since space is 3-dimensional, the position is determined by moving specified distances forward, sideways right and up from the origin (negative distances account for backward, left and down respectively). In mathematical language this means measuring all displacements in a Cartesian coordinate system. A position in space is then specified by the distances along the three axes at right angles one has to travel to reach it. The vector notation arises from the way this information is presented. If the displacements along the three axes to the point, P, are x, y and z respectively, then the vector r which stretches from the origin to P, as shown in Figure A6.1, can be expressed in vector notation as
r = xi + yj + zk
It is common to write vectors (which have both size (magnitude) and direction) in boldface to distinguish them from ordinary numbers which have only size. Here i, j and k, called the unit or base vectors, are signposts pointing along the x, y and z axes and the term xi means “go a distance x in the direction of the x axis” and so on. They are vectors in their own right with size (magnitude) equal to unity.
Since i, j and k really serve only to distinguish the three components of the displacement, we could omit them from the scheme providing the order is retained. The three components can be included in order inside brackets ready for multiplication with matrices in the column vector notation:
    | x |
r = | y |
    | z |
This is not the only way to represent vectors. In computer graphics it is common to represent them in the row notation
r = (x y z)
The convention used determines the way matrices are written. In this book column vectors are used because this is more common in science and engineering and therefore likely to be more familiar to the general reader. Switching between the conventions is tiresome but fairly painless.
E.2. Matrices
As a result of rotational transforms which occur frequently in computer graphics, the coordinates of objects change in a particular way. A point P(x,y,z) will move to a new position P'(x',y',z') as a result of a rotation about some axis as shown in Figure A6.2. Each one of the new components is related to all the old components in a set of linear equations:
x' = M11.x + M12.y + M13.z
y' = M21.x + M22.y + M23.z
z' = M31.x + M32.y + M33.z
where the M’s are numbers giving the proportions of the original components and are the elements of a matrix M. The important thing is that the matrix elements are related uniquely to the rotation, so that any other point rotated in an identical way about the same axis would have its new components determined by the same matrix M. Using the rules of multiplication of matrices and vectors, we can emphasise this by disentangling the elements of M from the components x, y and z of the vector. The product is written as:
| x' |   | M11 M12 M13 |   | x |
| y' | = | M21 M22 M23 | . | y |
| z' |   | M31 M32 M33 |   | z |
The matrix product written this way is just shorthand notation for the set of linear equations which really matter when we actually come to work out the new coordinates. But writing it this way makes it clear that, once calculated, the matrix M can be used to rotate any point in the same way. In an even more concise shorthand we can summarise the transformation by:
r' = M.r
where the product here is the matrix product and not an ordinary product of numbers.
To convert this shorthand product back into the set of equations, observe that the vector has three rows and one column and the matrix has three rows and three columns. To form the top row (x') of the transformed vector r', multiply in turn each of the elements in the top row of M by each of the rows of the vector r and add them. The second row of r' is calculated from the product of each of the elements in the second row of M with the rows of r, and so on (if we were working in the row representation of vectors everything would be the other way round). This meaning of matrix multiplication is something that just has to be learned.
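The row-times-column recipe just described can be sketched in a few lines. This is an illustrative Python sketch (the book's own programs are in 68000 assembly; the function name `mat_vec` is mine):

```python
def mat_vec(M, r):
    # Row i of the result is the sum of M[i][j] * r[j]: each element of
    # row i of M multiplied by the matching row (component) of r.
    return [sum(M[i][j] * r[j] for j in range(3)) for i in range(3)]

M = [[0, -1, 0],
     [1,  0, 0],
     [0,  0, 1]]                  # a rotation by 90 degrees about the z axis

print(mat_vec(M, [1, 0, 0]))      # [0, 1, 0]
print(mat_vec(M, [0, 1, 0]))      # [-1, 0, 0]
```

The same M rotates any point in the same way, which is the whole point of separating the matrix from the vector.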
E.3. Products of Vectors
E.3.1. The Scalar (Dot) Product
Vectors are really just a shorthand and highly suggestive way of doing geometry. A point P(x,y,z) in a Cartesian system looks much more important when represented by a vector r which stretches from the origin to the point P. Another point P'(x',y',z') is similarly represented by the vector r'.
Very often we wish to know the angle, θ, between these two vectors (referring back to the previous section, it could be the angle of rotation of the vector r). It turns out that what is simplest to find is the cosine of θ, which is
cosθ = (x.x' + y.y' + z.z') / √((x^2 + y^2 + z^2).(x'^2 + y'^2 + z'^2))
The factors in the denominator look complicated but are just the magnitudes of the two vectors calculated using a 3D version of Pythagoras’ theorem. The numerator is the sum of the products of the components of the two vectors taken together. Because such a product occurs frequently in geometry it is given a special symbol and name. It is called the scalar or dot product and is written as
r.r' = x.x' + y.y' + z.z'
It is called the scalar product because it produces a scalar answer from two vectors. Instead of writing the magnitude of a vector as a square root of a sum of squares all the time, which is tiresome, it is usual to represent it by the same symbol as the vector but without boldface. Hence the cosine is given by
cosθ = (r.r')/(r r')

where r = |r| = √(x^2 + y^2 + z^2) and likewise for r'. The notation |r| means “the magnitude of r”.
Notice that the scalar product r.r' is proportional to cosθ and, most important, has the same sign as cosθ. The sign of the cosine turns out to be a very useful test of whether two vectors are parallel (pointing in the same direction) or antiparallel (pointing in opposite directions) and plays an important part in testing for the visibility of surfaces.
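The dot product and its sign test can be sketched as follows. This is an illustrative Python sketch (the names `dot` and `mag` are mine, not from the book):

```python
import math

def dot(r, rp):
    x, y, z = r
    xp, yp, zp = rp
    return x*xp + y*yp + z*zp

def mag(r):
    return math.sqrt(dot(r, r))          # 3D Pythagoras

r, rp = (1, 0, 0), (1, 1, 0)
cos_theta = dot(r, rp) / (mag(r) * mag(rp))
print(round(math.degrees(math.acos(cos_theta))))   # 45

# The sign of the dot product matches the sign of cos(theta):
print(dot((1, 0, 0), (2, 0, 0)) > 0)    # True  -- roughly parallel
print(dot((1, 0, 0), (-2, 0, 0)) < 0)   # True  -- roughly anti-parallel
```

It is this sign test, not the angle itself, that the visibility check uses, so the square roots can often be skipped entirely.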
E.3.2. The Vector (Cross) Product
This is a product of two vectors which produces a new vector. Once again it is based on a useful application. In this case it generates the vector which is normal (at right angles) to both the original vectors. Another way of stating this is to say that the new vector is normal to the plane containing the two product vectors. This is shown in Figure A6.3. The new vector r'' and the vector product are defined by:
r'' = r x r'
The vector r'' is normal to the plane containing r and r' and its magnitude is equal to r.r'.sinθ, where r and r' here are the magnitudes of the two vectors. The components of r'' are
x'' = y.z' - z.y'
y'' = z.x' - x.z'
z'' = x.y' - y.x'
There is one important aspect of vector products which is also true of matrix products, the order of multiplication matters; the product r x r' is not the same as r' x r. In fact
r' x r = -(r x r')
The direction of r'' is obtained by twisting r into r' through the smallest angle. The direction in which this is seen as a clockwise rotation is the direction of r''.
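The component formulas and the order-dependence can be sketched as follows. This is an illustrative Python sketch (the name `cross` is mine, not from the book):

```python
def cross(r, rp):
    x, y, z = r
    xp, yp, zp = rp
    # Components exactly as in the text: x'' = y.z' - z.y', etc.
    return (y*zp - z*yp, z*xp - x*zp, x*yp - y*xp)

i, j = (1, 0, 0), (0, 1, 0)
print(cross(i, j))   # (0, 0, 1) -- normal to both i and j
print(cross(j, i))   # (0, 0, -1) -- reversing the order flips the sign
```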
The vector product is complicated but very useful in computer graphics. It is used to construct vectors which are normal to surfaces. We discuss this next.
E.3.3. Surface Normal Vectors
It is often necessary to construct a vector which is normal to two other vectors. This occurs in the calculation of surface normal vectors and coordinate transforms. In the case of a surface normal vector the objective is to construct a vector which is normal (at right angles) to the surface.
What this amounts to is forming the vector product of two vectors which lie in the surface, as discussed in the previous section. Usually these two vectors are not presented as such but have themselves to be constructed from polygon vertex coordinate lists. Suppose three consecutive vertices of a convex polygon are
P1(x1,y1,z1), P2(x2,y2,z2) and P3(x3,y3,z3), and that these go clockwise round the perimeter. The two vectors which can be multiplied in a cross product to give a vector pointing out of the surface are

r = (x3-x2)i + (y3-y2)j + (z3-z2)k
r' = (x2-x1)i + (y2-y1)j + (z2-z1)k
so
r'' = r x r'
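Putting the edge vectors and the cross product together gives a surface normal routine. This is an illustrative Python sketch of the construction described above (the function names are mine; the book's version is in 68000 assembly):

```python
def cross(r, rp):
    x, y, z = r
    xp, yp, zp = rp
    return (y*zp - z*yp, z*xp - x*zp, x*yp - y*xp)

def surface_normal(p1, p2, p3):
    # p1, p2, p3 are three consecutive vertices taken clockwise round the polygon.
    r  = tuple(a - b for a, b in zip(p3, p2))   # r  = P3 - P2
    rp = tuple(a - b for a, b in zip(p2, p1))   # r' = P2 - P1
    return cross(r, rp)

# A face in the xy plane, vertices clockwise when viewed from the +z side:
print(surface_normal((0, 0, 0), (0, 1, 0), (1, 1, 0)))   # (0, 0, 1)
```

The normal points towards the viewer for whom the vertices run clockwise, which is what the visibility test needs.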
E.3.4. Base Vectors
Base vectors are unit vectors which point along the axes of the coordinate system. In Cartesian coordinates, i, j and k are the “base” vectors. They each have magnitude 1, so the only thing that distinguishes them is their direction.
E.4. Matrices
Matrices have already been discussed in the previous section. In computer graphics they represent a transformation of some kind. The matrices which are most straightforward to deal with are those associated with rotation and are discussed further in Appendix F.
The rule for multiplying two matrices is the same as that for multiplying a matrix and a vector (as discussed in the previous section), where the vector is taken as a matrix having one column and three rows. Adding extra columns to the vector makes it a matrix and produces extra columns in the product. For a product to be possible there must be as many columns in the first matrix as there are rows in the second matrix.
The matrices which describe rotation about the three axes x,y and z all have three rows and three columns (unless they are in homogeneous coordinates): they are 3x3 matrices. The act of building up a complex rotation from the separate matrices in some order is accomplished by multiplying the matrices together. This is called matrix concatenation. Just as with the vector cross product, the order of the matrix multiplication matters: the matrix farthest to the right is the first rotation and that closest to the left is the last rotation.
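Concatenation and its order-dependence can be sketched numerically. This is an illustrative Python sketch (the function names are mine, not from the book):

```python
import math

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def mat_mul(A, B):
    # Element (i,j) of the product is row i of A times column j of B.
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

q = math.pi / 2
AB = mat_mul(rot_z(q), rot_x(q))   # rotate about x first, then about z
BA = mat_mul(rot_x(q), rot_z(q))   # rotate about z first, then about x

# Round away floating-point fuzz before comparing.
ra = [[round(v, 6) for v in row] for row in AB]
rb = [[round(v, 6) for v in row] for row in BA]
print(ra == rb)   # False -- the order of concatenation matters
```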
E.5. Homogeneous Coordinates
Unlike rotations, certain types of transform, such as translations and perspectives, cannot be written as 3 x 3 matrices and made to operate on vectors as a product. Since, for the purpose of concatenation, it is desirable to put all transforms on an equal footing, homogeneous coordinates are used to convert all transforms to 4 x 4 matrices which can be multiplied.
This means moving to a 4D space (not real space, just a mathematical convenience) in which the additional dimension is always 1. The extra degree of freedom this gives is sufficient to convert all transforms to 4 x 4 matrices. Likewise all vectors must have a fourth component, 1. Setting this fourth dimension to unity means we are working on a “plane” in the 4D space which has the intercept 1. The “plane” is normal 3D space.
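The payoff is that a translation, which cannot be a 3 x 3 matrix product, becomes an ordinary matrix product in 4D. This is an illustrative Python sketch (the function names are mine, not from the book):

```python
def translate(tx, ty, tz):
    # A translation written as a 4x4 matrix, as homogeneous coordinates allow.
    return [[1, 0, 0, tx],
            [0, 1, 0, ty],
            [0, 0, 1, tz],
            [0, 0, 0, 1]]

def apply(M, v):
    v4 = list(v) + [1]   # the fourth component is always 1
    out = [sum(M[i][j] * v4[j] for j in range(4)) for i in range(4)]
    return out[:3]       # drop the fourth component to return to 3D

print(apply(translate(5, -2, 0), (1, 1, 1)))   # [6, -1, 1]
```

Because the translation is now a matrix, it can be concatenated with 4 x 4 rotation and perspective matrices by ordinary matrix multiplication.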
Appendix F: Geometric and Coordinate Transforms
There are two types of transform used widely in computer graphics: geometric and coordinate transforms. What is confusing is that they are really two aspects of the same thing and it is possible to achieve the same end result by either method. However in order to stay sane it helps greatly to think of them as different, choosing one or the other depending on the problem. Many clever shortcuts become possible once the distinction and connection between them is understood.
Imagine that you are sitting in a swivel chair positioned at the centre of a circular carpet in a room with black featureless walls. Since there is no external reference point (apart from remembering what actually happened) it is not possible to distinguish between rotating the chair to the right on a stationary carpet, or keeping the chair fixed and rotating the carpet to the left. The observer on the chair sees the same relative movement of chair and carpet and his view of the carpet pattern is the same in both cases. But we must be careful to establish a scheme of rotation of either the chair or the carpet which is consistent. Let us decide that left rotations are positive and right rotations are negative. Then we can see that a positive rotation of the chair (the observer) is equivalent to a negative rotation of the carpet (the object): they are said to be the inverse of each other.
Now we come to the formal definitions. Rotating the observer is called a coordinate transform and rotating the object is called a geometric transform. There are many times in computer graphics when we wish to do both of these. When an object is moved in the world frame, it is subject to a geometric transform. When we wish to see the world from a different point of view a coordinate transform must be done. When the observer is controlling his viewpoint orientation by means of a joystick it is useful to exploit the connection between the two transforms.
F.1. Coordinate Systems and Frames of Reference
To some extent these terms are used interchangeably. For the most part the positions and vertices of objects are determined in Cartesian coordinates by a set of three x, y and z axes at right angles. The position of the zero of this set of axes is called the origin of the coordinate system. The whole constitutes a frame of reference to track subsequent motion of the various objects. As we have seen, there are two types of movement: a coordinate transform (when the observer moves) and a geometric transform (when an object moves). When the object moves it is easiest to keep track of what is going on by following the motion of the frame of reference attached to the object itself. We have called this the object frame. In the main text the objecttoworld transform was made by selected rotations and a displacement of this object frame. Now we can see exactly how this works.
image::figurea61.jpg[width='65%']
Imagine a set of axes permanently attached to the object so that when it moves they also move. For simplicity, we consider a rotation by an angle θ about the z axis, as shown in Figure A6.1. A transform matrix is now needed to relate the coordinates after the rotation (x1,y1,z1) to those before (x,y,z). The beauty of this scheme is that we can construct this matrix by observing what happens to the base vectors. Remember, the base vectors are the unit vectors (of size 1) pointing like signposts along the x, y and z axes. The base vectors before the rotation are i, j and k and after the rotation are i1, j1 and k1. Looking at the Figure we can see the relations between these:
i1 = cosθ.i + sinθ.j
j1 = -sinθ.i + cosθ.j
k1 = k
leading to a transform matrix for the base vectors:
|  cosθ  sinθ  0 |
| -sinθ  cosθ  0 |
|  0     0     1 |
Now this matrix as it stands cannot be used to transform the coordinates (x,y,z) to (x1,y1,z1), but curiously enough, its inverse can. Fortunately, the inverse of a pure rotation is simply obtained by switching (transposing) the rows and columns. In technical language, the inverse of a rotation is its transpose. Doing this yields the matrix:
| cosθ  -sinθ  0 |
| sinθ   cosθ  0 |
| 0      0     1 |
so that, for example, in a rotation by 90 degrees, the point (0,1,0) becomes the point (-1,0,0) and the point (1,0,0) becomes (0,1,0). So we have found a way of rotating an object to a new orientation: perform that reorientation on the object base vectors and express the result in terms of the original base vectors; then transpose the matrix to produce the geometric transform matrix.
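The transpose-is-inverse property of a pure rotation can be checked numerically. This is an illustrative Python sketch (the function names are mine, not from the book):

```python
import math

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def transpose(M):
    return [[M[j][i] for j in range(3)] for i in range(3)]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

R = rot_z(0.7)
I = mat_mul(R, transpose(R))   # a rotation times its transpose
print([[round(v, 6) for v in row] for row in I])
# [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]] -- the identity
```

Since the product is the identity, the transpose undoes the rotation, which is exactly why transposing converts between the two transforms.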
Can the original matrix be used for anything? Yes. As it stands, before it is transposed, it is a coordinate transform. If we were to leave the object stationary and just rotate the frame of reference, it gives us the transform to calculate what the object coordinates appear to be in the new rotated frame. This is shown in Figure A6.2. Hence in the rotation of 90 degrees, the vertex (0,1,0) appears to be at (1,0,0), and the vertex (-1,0,0) appears to be at (0,1,0) when seen from the rotated frame. Note that in both of these rotations, of the object and reference frame respectively, the sense of the rotation was positive.
Now we can see the qualitative discussion concerning the observer on the swivel chair and the carpet expressed mathematically. The transform which calculates the coordinates of the object after its positive rotation is:
| cosθ  -sinθ  0 |
| sinθ   cosθ  0 |
| 0      0     1 |
and the transform which calculates the new apparent coordinates of the stationary object after the reference frame has been moved in a positive direction is:
|  cosθ  sinθ  0 |
| -sinθ  cosθ  0 |
|  0     0     1 |
They are different when both involve a positive rotation but become the same if the reference frame (the chair) is rotated negatively. Then the angle θ is negative, and because sin(-θ) = -sinθ but cos(-θ) = cosθ, the terms involving sinθ change sign but those involving cosθ don’t.
This is only restating the fact that rotating the reference frame one way gives the same relative motion as rotating the object the other way.
Appendix G: Program Structure
This appendix shows the file content of each chapter so that you can see where new files have been introduced. The file at the left is the main control file and the files to the right of it are the files it includes. For each of these, the files in parentheses underneath are their own included files.
G.1. Chapter 3
polydraw.s core_00.s bss_00.s systm_00.s equates.s data_00.s
G.2. Chapter 4
clipframe.s core_01.s bss_01.s systm_00.s equates.s data_01.s (core_00.s) (bss_00.s)
G.3. Chapter 5
perspect.s core_02.s bss_02.s systm_00.s data_01.s data_02.s (core_01.s) (bss_01.s) (data_00.s) equates.s
G.4. Chapter 6
otranw.s core_03.s bss_03.s systm_00.s data_03.s data_01.s (core_02.s) (bss_02.s) (data_02.s) data_01.s
G.5. Chapter 7
illhide.s core_04.s bss_04.s systm_00.s data_03.s data_01.s (core_03.s) (bss_03.s) (data_02.s) data_00.s data_04.s
G.6. Chapter 8
trnsfrms.s core_05.s bss_05.s systm_00.s data_03.s data_00.s (core_04.s) (bss_04.s) (data_02.s) data_05.s
G.7. Chapter 9
wrld_scn.s core_07.s bss_06.s systm_00.s data_03.s data_00.s (core_05.s) (bss_05.s) (data_02.s) data_05.s
G.8. Chapter 10
wrld_scn.s core_07.s bss_07.s systm_01.s data_00.s data_06.s (core_06.s) (bss_06.s) (systm_00.s) (data_02.s) (data_03.s) data_07.s (data_05.s) data_08.s eulr_scn.s core_08.s (core_07.s)