Video Drivers Development

Source:
aux/vga: /sys/src/cmd/aux/vga/*
kernel drivers: /sys/src/9/pc/vga*

Extracted from the 9fans archives: http://lists.cse.psu.edu/archives/9fans/2002-February/015741.html

The following is a sort of theory of operation for aux/vga and the kernel vga drivers.

--- aux/vga and basic kernel drivers

Aux/vga consists of a number of modules each of which conforms to an interface called a Ctlr. The Ctlr provides functions snarf, options, init, load, and dump, which are explained in more detail below. Video cards are internally represented as just a collection of Ctlrs. When we want to run one of the functions (snarf, etc.) on the whole card, we run it on each Ctlr piece in turn.
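As a rough illustration of the "run it on each Ctlr piece in turn" idea, here is a minimal sketch in portable C. It is not the real aux/vga code: the struct layouts, the trace buffer, and the names runall and tracesnarf are made up for the example, and only the five function names come from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct Vga Vga;
typedef struct Ctlr Ctlr;

/* Simplified stand-in for the Ctlr interface described above. */
struct Ctlr {
	const char *name;
	void (*snarf)(Vga*, Ctlr*);
	void (*options)(Vga*, Ctlr*);
	void (*init)(Vga*, Ctlr*);
	void (*load)(Vga*, Ctlr*);
	void (*dump)(Vga*, Ctlr*);
	Ctlr *link;		/* next Ctlr of the same card, in vgadb order */
};

struct Vga {
	Ctlr *ctlr;		/* hypothetical: first Ctlr of the card */
	char trace[128];	/* hypothetical: call log used only by this demo */
};

enum { Snarf, Options, Init, Load, Dump };

/* Run one operation on the whole card: each Ctlr in turn, skipping empty slots. */
void
runall(Vga *vga, int op)
{
	Ctlr *c;
	void (*f)(Vga*, Ctlr*);

	assert(vga != NULL);
	for(c = vga->ctlr; c != NULL; c = c->link){
		switch(op){
		case Snarf:	f = c->snarf; break;
		case Options:	f = c->options; break;
		case Init:	f = c->init; break;
		case Load:	f = c->load; break;
		case Dump:	f = c->dump; break;
		default:	f = NULL; break;
		}
		if(f != NULL)
			f(vga, c);
	}
}

/* Demo snarf routine: record which Ctlr was visited. */
void
tracesnarf(Vga *vga, Ctlr *c)
{
	size_t room;

	room = sizeof vga->trace - strlen(vga->trace) - 1;
	strncat(vga->trace, c->name, room);
	room = sizeof vga->trace - strlen(vga->trace) - 1;
	strncat(vga->trace, " ", room);
}
```

Chaining four such Ctlrs named vga, ics2494a, et4000, and stg1602 and calling runall(&vga, Snarf) visits them in exactly that order, which is the behaviour the vgadb ordering relies on.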

In the beginning of aux/vga, it was common for video cards to mix and match different VGA controller chips, RAMDACs, clock generators, and sometimes even hardware cursors. The original use for vgadb was to provide a recipe for how to deal with each card. The ordering in the ctlr sections was followed during initialization, so that if you said

	ctlr
		0xC0076="Tseng Laboratories, Inc. 03/04/94 V8.00N"
		link=vga
		clock=ics2494a-324
		ctlr=et4000-w32p
		ramdac=stg1602-135

when aux/vga wanted to run, say, snarf on this card it would call the snarf routines for the vga, ics2494a, et4000, and stg1602 Ctlrs, in that order.

The special Ctlrs vga and ibm8514 take care of the generic VGA register set and the extensions to that register set introduced by the IBM 8514 chip. Pretty much all graphics cards these days still use the VGA register set with some extensions. The only exceptions currently in vgadb are the Ticket to Ride IV and the Neomagic (both LCD cards). The S3 line of chips tends to have the IBM 8514 extensions.

This "mix and match" diversity has settled down a bit, with one chip now usually handling everything. As a result, vgadb entries have become a bit more formulaic, usually listing only the vga link, a controller, and a hardware cursor. For example:

	ctlr
		0xC0039="CL-GD540"
		link=vga
		ctlr=clgd542x
		hwgc=clgd542xhwgc

On to the controller functions themselves. The functions mentioned earlier are supposed to do the following.

	void snarf(Vga *vga, Ctlr *ctlr)

Read the ctlr's registers into memory, storing them either in the vga structure (if there is an appropriate place) or into a privately allocated structure, a pointer to which can be stored in vga->private (sic). (The use of vga->private rather than ctlr->private betrays the fact that private data was added only after we got down to having cards with basically a single controller.)

	void options(Vga *vga, Ctlr *ctlr)

This step prepares to edit the in-memory copy of the registers to implement the mode given in vga->mode. It's really the first half of init, and is often empty. Basically, something goes here if you need to influence one of the other init routines and can't depend on being called before it. For example, the virge Ctlr rounds the pixel line width up to a multiple of 16 in its options routine. This is necessary because the vga Ctlr uses the pixel line width. If we set it in virge.init, vga.init would already have used the wrong value.

	void init(Vga *vga, Ctlr *ctlr)

Edit the in-memory copy of the registers to implement the mode given in vga->mode.

	void load(Vga *vga, Ctlr *ctlr)

Write all the ctlr's registers, using the in-memory values. This is the function actually used to switch modes.

	void dump(Vga *vga, Ctlr *ctlr)

Print (to the Biobuf stdout) a description of all the in-memory controller state. This includes the in-memory copy of the registers but often includes other calculated state like the intended clock frequencies, etc.
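The options-before-init ordering is easiest to see with numbers. The sketch below is a toy, not the virge or vga code: Mode, chipoptions, and genericinit are hypothetical names, and the pitch calculation is an assumption about what a generic init might derive from the width. It only shows why the rounding has to happen in a pass that runs before any init.

```c
#include <assert.h>

typedef struct Mode Mode;
struct Mode {
	int x;		/* pixels per line requested by the mode */
	int z;		/* bits per pixel */
	int pitch;	/* hypothetical: bytes per line, derived by the generic init */
};

/* Round n up to the next multiple of 16. */
int
round16(int n)
{
	assert(n >= 0);
	return (n + 15) & ~15;
}

/* Hypothetical chip-specific options step: fix the line width early. */
void
chipoptions(Mode *m)
{
	m->x = round16(m->x);
}

/* Hypothetical generic init step: consumes the (already rounded) width. */
void
genericinit(Mode *m)
{
	m->pitch = m->x * m->z / 8;
}
```

With a 1400-pixel-wide 8-bit mode, running chipoptions and then genericinit yields a width and pitch of 1408. Had the rounding lived in a chip init routine called after genericinit, the pitch would have been computed from 1400.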

Now we have enough framework to explain what aux/vga does. It's easiest to present it as a commented recipe.

1. We sniff around in the BIOS memory looking for a match to any of the strings given in vgadb. (In the future, we intend also to use the PCI configuration registers to identify cards.)

2. Having identified the card and thus made the list of controller structures, we snarf the registers and, if the -p flag was given, dump them.

3. If the -i or -l flag is given, aux/vga then locates the desired mode in the vgadb and copies it into the vga structure. It then does any automatic frequency calculations if they need doing. (See the discussion of defaultclock in vgadb(6).)

For a good introduction to video modes, read Eric Raymond's XFree86 Video Timings HOWTO, which, although marked as obsolete for XFree86, is still a good introduction to what's going on between the video card and the monitor. http://en.tldp.org/HOWTO/XFree86-Video-Timings-HOWTO/

4. Having copied the vgadb mode parameters into the vga structure, aux/vga calls the options and then the init routines to twiddle the in-memory registers appropriately.

5. Now we are almost ready to switch video modes. We dump the registers to stdout if we're being verbose.

6. We tell the kernel (via the "type" vga ctl message) what sort of video card to look for. Specifically, the kernel locates the named kernel vga driver and runs its enable function.

7. If we're using a frame buffer in direct-mapped linear mode (see the section below), we express this intent with a "linear" vga ctl message. In response, the kernel calls the vga driver's linear function. This should map the video memory into the kernel's address space. Conventionally, it also creates a named memory segment for use with segattach so that user-level programs can get at the video memory. If there is a separate memory-mapped i/o space, it too is mapped and named. These segments are only used for debugging, specifically for debugging the hardware acceleration routines from user space before putting them into the kernel.

8. We tell the kernel the layout of video memory in a "size" ctl message. The arguments are the screen image resolution and the pixel channel format string.

9. Everything is set; we disable the video card, call the loads to actually set the real registers, and reenable the card.

At this point there should be a reasonable picture on the screen. It will be of random memory contents and thus could be mostly garbage, but there should be a distinct image on the screen rather than, say, funny changing patterns due to having used an incorrect sync frequency.

10. We write "drawinit" into #v/vgactl, which will initialize the screen and make console output from now on appear on the graphics screen instead of being written to the CGA text video memory (as has been happening). This calls the kernel driver's drawinit function, whose only job is to initialize hardware accelerated fills and scrolls and hardware blanking if desired.

11. We write "hwgc <hwgcname>" into #v/vgactl, which calls the enable function on the named kernel hwgc driver. (Plan 9 does not yet support software graphics cursors.)

12. We set the actual screen size with an "actualsize" ctl message. The virtual screen size (which was used in the "size" message in step 8) controls how the video memory is laid out; the actual screen size is how much fits on your monitor at a time. Virtual screen size is sometimes larger than actual screen size, either to implement panning (which is really confusing and not recommended) or to round pixel lines up to some boundary, as is done on the ViRGE and Matrox cards. The only reason the kernel needs to know the actual screen size is to make sure the mouse cursor stays on the actual screen.

13. If we're being verbose, we dump the vga state again.
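To keep the order of the ctl messages in steps 6 through 12 straight, here is a small sketch that just formats them into a buffer instead of writing them to #v/vgactl. The message names come from the recipe above; the exact argument layouts (in particular for "linear" and "size") are guesses for illustration, and vga(3) is the authority. Setup and ctlmsgs are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum { Maxmsg = 6, Msglen = 64 };

typedef struct Setup Setup;
struct Setup {
	const char *type;	/* kernel vga driver name */
	const char *hwgc;	/* kernel hwgc driver name */
	const char *chan;	/* pixel channel format string */
	int vx, vy, z;		/* virtual screen size and depth */
	int x, y;		/* actual screen size */
	unsigned long apsize;	/* linear aperture size in bytes; 0 means soft screen */
};

/* Fill msgs with the ctl messages in the order they would be written; return the count. */
int
ctlmsgs(const Setup *s, char msgs[][Msglen])
{
	int n;

	assert(s != NULL && msgs != NULL);
	n = 0;
	snprintf(msgs[n++], Msglen, "type %s", s->type);		/* step 6 */
	if(s->apsize != 0)
		snprintf(msgs[n++], Msglen, "linear %lu", s->apsize);	/* step 7; argument format assumed */
	snprintf(msgs[n++], Msglen, "size %dx%dx%d %s",
		s->vx, s->vy, s->z, s->chan);				/* step 8; argument format assumed */
	/* step 9, loading the real registers, happens here; it is not a ctl message */
	snprintf(msgs[n++], Msglen, "%s", "drawinit");			/* step 10 */
	snprintf(msgs[n++], Msglen, "hwgc %s", s->hwgc);		/* step 11 */
	snprintf(msgs[n++], Msglen, "actualsize %dx%d", s->x, s->y);	/* step 12 */
	return n;
}
```

For a soft-screen card the function produces five messages; setting apsize adds the "linear" message between "type" and "size".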

--- hardware acceleration and blanking

Hardware drawing acceleration is accomplished by calling the kernel-driver-provided fill and scroll routines rather than doing the memory operations ourselves. For pixel depths greater than 8 bits, the lack of hardware acceleration is noticeable. For typical Plan 9 applications, accelerating fill and scroll has been fast enough that we haven't worried about doing anything else.

The kernel driver's drawinit function should sniff the card and decide whether it can use accelerated fill and scroll functions. If so, it fills in the scr->fill and scr->scroll function pointers with functions that implement the following:

	int fill(VGAscr *scr, Rectangle r, ulong val);

Set every pixel in the given rectangle to val. Val is a bit pattern already formatted for the screen's pixel format (rather than being an RGBA quadruple). Do not return until the operation has completed (meaning video memory has been updated). Usually this means a busy wait looping for a bit in some status register. Although slightly inefficient, the net effect is still much faster than doing the work ourselves. It's a good idea to break out of the busy loop after a large number of iterations, so that if the driver or the card gets confused we don't lock up the system waiting for the bit. Look at any of the accelerated drivers for the conventional method.
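The bounded busy wait might look something like the following sketch. The status bit, the loop bound, and the name waitidle are all made up here; a real driver would read a card-specific register and should follow whatever the existing accelerated drivers do.

```c
#include <assert.h>
#include <stddef.h>

enum {
	Busy	= 1<<0,		/* hypothetical "engine busy" status bit */
	Maxloop	= 1<<20,	/* arbitrary bound on the number of polls */
};

/*
 * Poll the status register until Busy clears.
 * Returns 1 once the engine is idle, 0 if we gave up after Maxloop polls.
 */
int
waitidle(volatile unsigned long *status)
{
	long i;

	assert(status != NULL);
	for(i = 0; i < Maxloop; i++)
		if((*status & Busy) == 0)
			return 1;
	return 0;
}
```

A fill or scroll routine would kick off the operation and then call something like waitidle before returning, so a confused card costs a bounded delay rather than a hung machine.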

	int scroll(VGAscr *scr, Rectangle r, Rectangle sr);

Set the pixels in rectangle r to the pixels in sr. r and sr are allowed to overlap, and the correct thing must be done, just like memmove. Like fill, scroll must not return until the operation has completed.
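When testing a hardware scroll it helps to have a software version to compare against. The sketch below is one way to get memmove-like behaviour for overlapping rectangles on an 8-bit frame buffer: copy rows top-down when moving up and bottom-up when moving down. Rect and softscroll are hypothetical names, not the kernel's Rectangle or any existing routine.

```c
#include <assert.h>
#include <string.h>

typedef struct Rect Rect;
struct Rect {
	int minx, miny;		/* upper left corner, inclusive */
	int maxx, maxy;		/* lower right corner, exclusive */
};

/*
 * Copy the pixels in sr to r within an 8-bit frame buffer that has
 * width bytes per line. r and sr must be the same size and may overlap.
 */
void
softscroll(unsigned char *fb, int width, Rect r, Rect sr)
{
	int dx, dy, y;

	dx = r.maxx - r.minx;
	dy = r.maxy - r.miny;
	assert(dx == sr.maxx - sr.minx && dy == sr.maxy - sr.miny);

	if(r.miny <= sr.miny){
		/* moving up or sideways: top row first, so no unread source row is overwritten */
		for(y = 0; y < dy; y++)
			memmove(fb + (r.miny+y)*width + r.minx,
				fb + (sr.miny+y)*width + sr.minx, dx);
	}else{
		/* moving down: bottom row first */
		for(y = dy-1; y >= 0; y--)
			memmove(fb + (r.miny+y)*width + r.minx,
				fb + (sr.miny+y)*width + sr.minx, dx);
	}
}
```

Using memmove for each row takes care of horizontal overlap within a line; the row ordering takes care of vertical overlap.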

Russ Cox <rsc@plan9.bell-labs.com> has a user-level scaffold for testing fill and scroll routines before putting them into the kernel. You can mail him for it.

Finally, drawinit can set scr->blank to a hardware blanking function. On 8-bit displays we can set the colormap to all black to get a sort of blanking, but for true-color displays we need help from the hardware.

	int blank(VGAscr *vga, int isblank);

If isblank is set, blank the screen. Otherwise, restore it. Implementing this function on CRT-based cards is known to mess up the registers coming out of the blank. We've had better luck with LCD-based cards although still not great luck. But there it is.

--- linear mode and soft screens

In the bad old days, the entire address space was only 1MB, but video memory (640x480x1) was only 37.5kB, so everything worked out. It got its own 64kB segment and everyone was happy. When screens got deeper and then bigger, the initial solution was to use the 64kB segment as a window onto a particular part of video memory. The offset of the window was controlled by setting a register on the card. This works okay but is a royal pain, especially if you're trying to copy from one area of the screen to another and they don't fit in the same window. When we are forced to cope with cards that require accessing memory through the 64kB window, we allocate our own copy of the screen (a so-called soft screen) in normal RAM, make changes there, and then flush the changed portions of memory to video RAM through the window. To do this, we call the kernel driver-provided page routine:

	int pageset(VGAscr *scr, int page);

Set the base offset of the video window to point page*64kB into video memory.
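Flushing a changed span of the soft screen through the window amounts to chopping it at 64kB boundaries and re-pointing the window for each piece. The sketch below simulates that with an ordinary array standing in for video memory. Card, setpage, window, and flush are hypothetical names for the example; a real driver would write a card register in its page routine and copy through the fixed 64kB window.

```c
#include <assert.h>
#include <string.h>

enum { Pagesize = 64*1024 };

typedef struct Card Card;
struct Card {
	unsigned char *vram;	/* simulated video memory */
	unsigned long nvram;	/* its size in bytes */
	int page;		/* page the window currently shows */
};

/* Stand-in for the driver's page routine: select which 64kB page the window shows. */
void
setpage(Card *c, int page)
{
	c->page = page;
}

/* The window: Pagesize bytes of video memory starting at the selected page. */
unsigned char*
window(Card *c)
{
	return c->vram + (unsigned long)c->page*Pagesize;
}

/* Copy len bytes at offset off from the soft screen to video memory, one window at a time. */
void
flush(Card *c, const unsigned char *soft, unsigned long off, unsigned long len)
{
	unsigned long pgoff, n;

	assert(off <= c->nvram && len <= c->nvram - off);
	while(len > 0){
		pgoff = off % Pagesize;
		n = Pagesize - pgoff;	/* room left in this page */
		if(n > len)
			n = len;
		setpage(c, off / Pagesize);
		memmove(window(c) + pgoff, soft + off, n);
		off += n;
		len -= n;
	}
}
```

A four-byte flush starting two bytes before a page boundary ends up as two copies with a page switch in between, which is exactly the awkwardness linear mode avoids.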

With the advent of 32-bit address spaces, we can map all of video memory and avoid the soft screen. We call this running the card in linear mode, because the whole video memory is mapped linearly into our address space. Aux/vga is in charge of deciding whether to do this. (In turn, aux/vga more or less respects vgadb, which controls it by having or not having "linear=1" in the controller entry.) If not, aux/vga doesn't do anything special, and we use a soft screen. If so, aux/vga writes "linear" and an address space size into vgactl in step #7 above. In response the kernel calls the kernel driver's linear function, whose job was described in step #7.

Most drivers only implement one or the other interface: if you've got linear mode, you might as well use it and ignore the paging capabilities of the card. Paging is typically implemented only when necessary.

--- from here

If you want to write a VGA driver, it's fairly essential that you get documentation for the video chipset. In a pinch, you might be able to get by with the XFree86 driver for the chipset instead. (The NVidia driver was written this way.) Another alternative is to use documentation for a similar but earlier chipset and then tweak registers until you figure out what is different. (The SuperSavage parts of the virge driver got written this way, starting with the Savage4 parts, which in turn were written by referring to the Savage4 documentation and the Virge parts.)

Even if you do get documentation, the XFree86 driver is good to have to double check. Sometimes the documentation is incomplete, misleading, or just plain wrong, whereas the XFree86 drivers, complicated beasts though they are, are known to work most of the time.

Another useful method for making sure you understand what is going on is dumping the card's registers under another system like XFree86 or Microsoft Windows. The Plan 9 updates page contains an ANSI/POSIX port of aux/vga that is useful only for dumping registers on various systems. It has been used under Linux, FreeBSD, and Windows 95/98. It's not clear what to do on systems like Windows NT or Windows 2000 that both have reasonable memory protection and are hardware programmer-unfriendly.

If you're going to write a driver, it's much easier with a real Plan 9 network or at least with a do-everything cpu/auth/file server terminal, so that you can have an editor and compiler going on a usable machine while you continually frotz and reboot the machine with the newfangled video card. Booting this latter machine from the network rather than its own disk makes life easier for you (you don't have to explicitly copy aux/vga from the compiling machine to the testing machine) and doesn't wreak havoc on the testing machine's local kfs.

It's nice sometimes to have a command-line utility to poke at the vga registers you care about. We have one that perhaps we can clean up and make available. Otherwise, it's not hard to roll your own.

The first step in writing an aux/vga driver is to write the snarf and dump routines for the controller. Then you can run aux/vga -p and see whether the values you are getting match what you expect from the documentation you have.

A good first resolution to try to get working is 640x480x8, as it can use one of the standard clock modes rather than require excessive clock fiddling.

/sys/src/cmd/aux/vga/template.c is a template for a new vga controller driver. There is no kernel template but any of the current drivers is a decent template. /sys/src/9/pc/vga3dfx.c is the smallest one that supports linear addressing mode.