The Basic Ideas behind GGI

written by Steffen Seeger
last update was on $Date: 1998/09/25 05:36:29 $


This document explains the basic terminology used with GGI and introduces the concepts of input devices, displays and devices.

Terminology

As with any other hardware, we need some help from the kernel to provide secure and efficient access to graphics hardware. A part of GGI therefore needs to be implemented in kernel space, but this part is restricted to security-critical operations, such as setting video modes, loading font tables etc., that may affect 'normal' operation or the ability to recover from application faults. The main parts of GGI that run in kernel space - in the current design - are:

- the input device drivers
- the display device drivers
- the administrative code, called the KGI manager
- the graphics device file driver
- the console device file driver
- the terminal drivers

So, compared to the Linux-2.0 design, GGI adds the display drivers, the administrative code - called the KGI manager - and the graphics device files to the kernel. Together these make up the Kernel Graphics Interface, or KGI for short. All other parts were already present in the Linux-2.0 kernel; they are merely replaced by new implementations that fit the GGI point of view. The kernel also does not get bloated with drivers, because only the administrative code and the graphics and console device file drivers need to be compiled into the kernel. The input and display device drivers can be loaded as modules at runtime. The terminal drivers are not loadable yet, but they can easily be made loadable. (Though not implemented yet, we will also try to make the graphics device file driver loadable.)

Note, however, that the display drivers do not implement drawing functions that can be implemented in user space without compromising security. Unfortunately this means that on badly designed hardware, such as the VGA, we may have to implement drawing functions in kernel space, because we will not make any compromise in terms of security, and the design of GGI is, and will be, oriented towards modern hardware designs.

The other part of GGI runs in user space and hides the actual hardware from the applications. You may still build your own API and/or application that accesses the display devices directly, because in the planned design the /dev/graph?? special files will provide the lowest level of access possible. Whenever possible, we will try to abstract the things you want to do from the things you have to do with a certain piece of hardware.

The abstractions used by GGI to deal with graphics and user feedback are quite straightforward. But to avoid misunderstandings, let's first define some terms used later in the text.

image is the picture seen in the real world. We call the image seen the 'visible image', which may actually be a part of a larger 'virtual image'. Which part of the virtual image is seen is determined by the origin of the visible image, which is given relative to the origin of the virtual image.
pixel or 'picture element' refers to a small rectangular part of an image. It has some attributes associated with it, e.g. a certain color, texture etc., which are assumed to be constant over the whole area it covers. All pixels are assumed to have the same shape and size. A pixel is the smallest element of a picture whose attributes can be controlled independently of the other pixels.
dot is the smallest unit that can be addressed in an image. For uniform pixels this may be the pixel itself; for textured pixels, such as character glyphs, a dot is a pixel of the texture, which is assumed to be uniform. We will use this term to refer to sub-pixel coordinates, e.g. for a graphical pointer in text mode. As with pixels, dots are assumed to have the same shape and size, and a uniform color and intensity over the whole area they cover.
pixelvalue an unsigned integer used to represent the attributes of a pixel, such as its color, the texture to use etc.
frame buffer a 2D array of pixelvalues corresponding to certain picture elements.
frame buffer layout is the rule that determines how a certain address into the frame buffer maps to a certain part of the image.
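
To make the last two definitions a bit more tangible, here is a minimal sketch in C of a linear, row-major frame buffer layout. The names (pixelvalue_t, fb_offset) and the 32-bit pixelvalue size are illustrative assumptions made for this example, not part of GGI/KGI:

    #include <stddef.h>
    #include <stdint.h>
    #include <stdio.h>

    /* A pixelvalue is just an unsigned integer (here: 32 bits). */
    typedef uint32_t pixelvalue_t;

    /* One possible frame buffer layout: linear and row-major with a given
     * stride (pixelvalues per row of the virtual image). The layout is the
     * rule that maps a pixel position (x, y) to an address in the buffer. */
    static size_t fb_offset(size_t x, size_t y, size_t stride)
    {
        return y * stride + x;
    }

    int main(void)
    {
        enum { WIDTH = 8, HEIGHT = 4 };
        pixelvalue_t fb[WIDTH * HEIGHT] = { 0 };

        /* Store a pixelvalue for the picture element at (3, 2). */
        fb[fb_offset(3, 2, WIDTH)] = 0x00ff00;

        printf("pixelvalue at (3,2) = %#x\n", (unsigned) fb[fb_offset(3, 2, WIDTH)]);
        return 0;
    }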

Before we have a look at the hardware used to do computer graphics, we should examine what attributes a pixel may have and how GGI cares about them.

colorvalue is a triple of unsigned integers, uniquely identifying the impression of color and intensity.
CLUT is the abbreviation used for Color Look Up Table. This term will be used for a table or - equivalently - a function that gives the color associated with a pixel, taking the pixelvalue as an index or argument, respectively.
fonttable will be used to refer to a table (or function) that gives the texture associated with a pixel, taking the pixelvalue as an index (or argument).
z-value an unsigned integer used in 3D graphics to estimate the distance of the pixel from the viewer.
attributes will generally refer to other properties a pixel may have, like transparency, blinking etc.
attribute table will be used as a synonym for the function that associates other attributes with a pixelvalue.

Thus an image you see on your computer screen is internally represented as a simple frame buffer that has a certain layout and stores pixelvalues. The pixelvalues themselves describe all the attributes of the pixel corresponding to the address of the pixelvalue. The actual appearance of the pixel in the image depends on the color look up, font, and attribute tables. If you can't tell for sure that you understand the above sentences, the following picture might help to make things clearer:



[frame buffer to image mapping]
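
The following C sketch illustrates the mapping in the picture above for a text-mode-like setup: the pixelvalue is split into a glyph index into the font table and an attribute index into the CLUT. The names, the 16-bit pixelvalue and the 8x16 glyph size are assumptions made only for this example, not the actual KGI data structures:

    #include <stdbool.h>
    #include <stdint.h>

    typedef uint16_t pixelvalue_t;

    struct color { uint8_t r, g, b; };   /* a colorvalue                          */
    static struct color clut[16];        /* color look up table                   */
    static uint8_t font[256][16];        /* font table: 8x16 textures, 1 bit/dot  */

    /* Resolve the color of one dot (dx, dy) inside the pixel described by pv. */
    static struct color resolve_dot(pixelvalue_t pv, int dx, int dy)
    {
        uint8_t glyph = pv & 0xff;       /* index into the font table             */
        uint8_t attr  = pv >> 8;         /* index into the CLUT / attribute table */
        bool set = (font[glyph][dy] >> (7 - dx)) & 1;

        /* foreground color if the dot is set, background color otherwise */
        return clut[set ? (attr & 0x0f) : (attr >> 4)];
    }

    int main(void)
    {
        clut[7] = (struct color){ 170, 170, 170 };            /* light grey        */
        font['A'][4] = 0x18;                                  /* two dots of an 'A' */

        pixelvalue_t pv = (pixelvalue_t)((0x07 << 8) | 'A');  /* grey 'A' on black  */
        struct color c = resolve_dot(pv, 3, 4);
        return c.r == 170 ? 0 : 1;
    }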

Any computer hardware used to do framebuffer-oriented graphics will somehow fit into the above terminology, be it a very simple monochrome text-only display or a $10000 virtual reality gadget. Either way, most operations to be carried out on these devices are common to all of them and can be abstracted, no matter how the low level access is actually done.

To allow applications to share physical devices, most things need to be virtualized too. We will need to distinguish between physical and virtual graphics devices and therefore use the term 'display' only for physical displays, that is, any hardware/driver combination used to control a real world image. In contrast to this, the term 'device' is used to refer to a combination of a virtual display and (optionally) an input facility, which is - if present - at least a keyboard. Each device is 'attached to' a display, meaning that if the device becomes 'real' and visible, its image will be shown on the display it is attached to. If the virtual image of a device is displayed on the physical display it is attached to, we say the device is 'mapped on' that particular display.
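
The distinction between displays and devices can be summed up in a small structural sketch (purely illustrative C, not the actual KGI data structures):

    /* A physical display: real hardware plus its driver. */
    struct display;

    /* Character input, possibly with further inputs attached to it. */
    struct keyboard;

    /* A device: a virtual display plus an optional input facility. */
    struct device {
        struct display  *attached_to;  /* where the image goes when mapped       */
        struct keyboard *keyboard;     /* optional, but if present at least this */
        int              mapped;       /* nonzero: the virtual image is visible  */
        /* ... virtual frame buffer, mode description, etc. ... */
    };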

Now that we have a general overview of what we see, how to describe it, and how the description is transformed into what we see, let's examine how user feedback is handled.

Input Devices

The abstractions used to handle user input with GGI are keyboards, input devices and events. GGI allows up to MAX_NR_KEYBOARDS keyboards to handle character input. This is an arbitrarily chosen number and is currently set to 32 by default. There can be any number of input devices, each of which is logically attached to one keyboard. Both keyboards and input devices can be registered and unregistered dynamically, so GGI is well prepared for the "hot plug & pray it works" technology to come. There is, however, one constraint: input devices can only be registered to an existing keyboard. Thus one has to register the keyboard before the input devices associated with it. Before we give a more detailed explanation, again some terminology:

event will be used for the internal representation of something that happened. We will care about events by collecting information about what happened, when it happened and how it happened.
input will generally refer to any kind of hard- and software combination that converts user actions into events, such as keystrokes, mouse movements, button presses or anything else you like.
keyboard will refer to an input device that is used to input alphanumerical characters.
pointer refers to an input device used to control a (2D) position and actions to be done with or at that position.

We will have to specify these terms more exactly later on, but for now just memorize that all user actions are converted into events by inputs, which may be keyboards, pointers or any other kind you can imagine.
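
As a rough illustration, an event could be represented like this (the field names and sizes are assumptions for this sketch, not the real KGI event layout):

    #include <stdint.h>

    /* An event records what happened, when it happened and how it happened. */
    struct event {
        uint16_t type;      /* what: key press, pointer motion, button, ...    */
        uint64_t time;      /* when: a timestamp, e.g. in milliseconds         */
        int32_t  value[4];  /* how: key code, relative movement, button state  */
    };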

As interaction is the most common way to work with computers, we need to spend some lines on what interaction means. Whether you type a text, draw a picture or play a game - from a general point of view, always the same things happen: you do something that is converted into events, the applications handle these events and somehow alter the image you see. If we allow for more than one image your computer can control and more than one input that reports events - just as GGI does - there needs to be a mechanism to ensure a user gets some feedback.

Each keyboard, and the input devices registered to it, is given one - and only one - device to which all events are reported: the focus. Actually there may be more than one application reading events from a given device, which in turn may be listening on other devices too. But because a device is not only an input - remember, it was defined as a combination of a keyboard, its inputs and a virtual display - the device needs to be mapped on its display and be visible. This raises some constraints on the possible devices a keyboard may focus on. One note for future development is that not only inputs and displays, but also sound should somehow be related to the focus and/or the displays a certain keyboard may focus on. E.g. if you have two sound cards installed, two users might want to play two different games, and if both want sound, there should be at least a conceptual way to allow this.
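
The focus mechanism boils down to a very simple routing rule, sketched here in C (the structures and the queue_event() helper are hypothetical and only serve to illustrate the idea):

    struct event;                      /* see the sketch above                     */
    struct device;                     /* a virtual display plus input facility    */

    struct keyboard {
        struct device *focus;          /* the one device receiving all events      */
    };

    struct input {
        struct keyboard *keyboard;     /* every input is attached to one keyboard  */
    };

    /* Hypothetical helper: append an event to a device's event queue. */
    void queue_event(struct device *dev, struct event *ev);

    /* Whatever an input reports ends up at the focus of its keyboard. */
    void deliver_event(struct input *in, struct event *ev)
    {
        queue_event(in->keyboard->focus, ev);
    }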

After the kernel has finished booting and the displays are initialized, the KGI manager attempts to detect the keyboards connected to the machine. On an i386-type PC this simply probes for the keyboard and registers it with the KGI manager. For future versions it is planned to make the keyboard code loadable, like all the other input drivers.

This should be enough about input handling. If you would like to know more about the relevant parts of the KGI, it is recommended that you read the Input Driver Writers Guide.

Displays

Exactly one display driver has to be registered for every real world image that can - or should - be controlled independently. Currently up to 32 displays are possible.

Generally speaking, a display transforms the framebuffer representation of an image into the real world image seen. A display itself may consist of a video card and a monitor, but almost any gadget that transforms a framebuffer representation into a real world image can be supported by GGI. Basically, displays can be registered and unregistered dynamically just like keyboards, but there are many more constraints to obey in order to avoid locking up the machine.

After kernel initialization, a minimal boot display driver is registered for every display found and supported. A boot display driver can be replaced afterwards, but the new driver needs to be able to support the same mode(s) as the boot driver it replaces. Thus the boot driver should have only the minimal, text-output functionality common to all hardware, because any extended functionality can be loaded at runtime. A full driver may also be implemented and registered at boot, but this option should only be chosen if one can be sure that exactly the hardware supported by this driver is present on all platforms this kernel may boot on. Additional displays not detected during boot can be registered without any constraints, except the obvious one that they should work properly afterwards.
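
The constraint on replacing a boot display driver can be pictured as a simple check like the following C sketch (the types and the check_mode() callback are made up for illustration; the real KGI interfaces may look different):

    struct mode;                                    /* a video mode description   */

    struct display_driver {
        const char *name;
        int (*check_mode)(const struct mode *m);    /* 0 if the mode is supported */
    };

    /* Only allow the replacement if every mode currently in use keeps working. */
    int may_replace(const struct display_driver *new_drv,
                    const struct mode *modes_in_use, int n)
    {
        for (int i = 0; i < n; i++)
            if (new_drv->check_mode(&modes_in_use[i]) != 0)
                return 0;                           /* keep the boot driver       */
        return 1;
    }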

Unregistering a display is a bit trickier and needs some special treatment. To ensure that the machine stays usable even when all drivers are to be unregistered, at least one driver has to provide an alias driver. This alias is a minimal boot display driver provided by the kernel and therefore might not be able to handle the modes required by the devices attached to the current display. Thus a display driver can be loaded easily, but needs to be prepared for unloading before it is actually unloaded. Unloading display drivers is not fully implemented yet; sorry for this, but this area is still under construction! Thanks for your patience.
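
Conceptually, unregistering then means falling back to the alias driver, roughly like this (again an illustrative sketch, not the real - and, as said, not yet finished - implementation):

    struct display_driver;                    /* as in the sketch above         */

    struct display_slot {
        struct display_driver *current;       /* the driver currently in charge */
        struct display_driver *alias;         /* minimal boot/fallback driver   */
    };

    /* Refuse to unregister if no alias driver could take over the display. */
    int unregister_display_driver(struct display_slot *d)
    {
        if (!d->alias)
            return -1;
        d->current = d->alias;                /* the alias takes over           */
        return 0;
    }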

If you are curious about the display driver internals or the relevant parts of the KGI, it is a good idea to read the Display Driver Writers Guide.

Devices

As stated before, a device is an abstraction from the actual hardware and can be looked at as a virtual display. The features supported by a device depend on the display it is attached to. For example, the split screen feature or support for several frames or virtual framebuffers may be missing. Which features are possible with a given device is determined by the display it is attached to. Currently the following features are possible:

The GGI console layer uses devices to implement basic terminal routines, such as scrolling up and down etc., in a way heavily optimized for several 'levels' of hardware support. It should thus be possible to move the console code completely to user space, but we need a reasonably stable implementation of the graphics devices first. The actual terminal behavior is implemented in a hardware-independent way in the terminal parsers, which interface only to the console layer and not to the display hardware directly. If you would like to implement a terminal type other than the current implementation, which is chosen to be pretty close to xterm, you should read the Terminal Driver Writers Guide before you start coding.
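
To give an idea of what 'levels of hardware support' means for the console layer, here is a C sketch of scrolling one line up: if the display can move the origin of the visible image within a larger virtual image, scrolling is almost free, otherwise the pixelvalues have to be copied. The structure and field names are illustrative only:

    #include <string.h>

    struct console {
        unsigned short *fb;      /* frame buffer of pixelvalues (text cells)     */
        int width, height;       /* visible size in character cells              */
        int can_pan;             /* display can move the visible image origin    */
        int origin_y;            /* current origin of the visible image          */
    };

    void scroll_up_one_line(struct console *c)
    {
        if (c->can_pan) {
            c->origin_y++;       /* cheap: just reprogram the origin in hardware */
        } else {
            memmove(c->fb, c->fb + c->width,            /* slow: copy everything */
                    (size_t)(c->height - 1) * c->width * sizeof(*c->fb));
        }
    }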


Copyright (c) 1997 Steffen Seeger - All rights of this work, especially publishing, translation and all kinds of reproduction are reserved by the author. You may, however, print a copy for personal use provided that the name(s) of the authors are included with the copy.