Benchmarks

CONFIG.SYS BUFFERS – What Are They?

The CONFIG.SYS BUFFERS statement appeared with MS-DOS 2, which, amongst other big changes, added fixed disk support.  It also made two changes to the way that DOS handled IO:

  • The file allocation table (FAT) was no longer always held in memory, instead being treated like any other sector; and
  • Instead of a single DOS disk sector buffer, the BUFFERS CONFIG.SYS statement enabled the user to select between 1 and 99 sector buffers.
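In CONFIG.SYS this is just a single line – for example, to ask DOS for twenty 512-byte sector buffers (FILES shown alongside as a typical companion setting):

```
BUFFERS=20
FILES=20
```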

Since previously (in DOS 1.x) the FAT had been held entirely in memory, no disk IO was needed to find the sectors being requested by open file IO.  But with DOS 2, this could need several IOs – Tim Paterson, an original architect of MS-DOS, published a superb article describing the details in Byte Magazine in 1983, now available here (and cached here).

But why does this matter?

In testing old disks with my simple DiskTest utility, I couldn’t help but notice a wealth of full-stroke IO going on, especially with the 40MB WD-384R, since its stepper is so vocal.  The reason is clear: with FAT-16 (DOS 4), the FAT is over 80KB (a list of 40,000 clusters, each 16 bits wide), and with each buffer being 512 bytes, there simply isn’t enough buffer space for the FAT.  So, as the random IO test runs, buffers containing FAT data are victimised by file data on its way through, and hence DOS needs to seek back to the FAT.

This led to near full-stroke IO in this case, as the drive happened to be nearly full when I ran the test, so the majority (or all) of the test file was near the opposite end of the drive from the FAT.

Tuning

But this leads to a tuning opportunity for the PC/XT disk system, particularly for random IO applications such as databases.

Dividing the drive(s) into smaller partitions doesn’t really help, since DOS selects a cluster size so that the FAT (on a FAT-16 volume) tends to be between 64 and 128KB – so with 99 buffers available (49.5KB), the FAT will never completely fit in the buffer space.  And in any case, with a maximum of only 640KB available on the PC/XT, there simply isn’t much space for a disk cache (each buffer consumes 528 bytes of RAM).
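That sizing argument is easy to sketch with a few lines of Python, using the simplified rule that FAT-16 picks the smallest cluster size keeping the cluster count within 16-bit limits (real FORMAT implementations differ in detail, so treat this as an approximation):

```python
SECTOR = 512
BUFFER_OVERHEAD = 16   # each buffer costs 512 bytes + ~16 bytes of header = 528
MAX_BUFFERS = 99

def fat16_sizing(partition_bytes):
    """Pick the smallest cluster size that keeps the cluster count
    within FAT-16's 16-bit limit, then size the FAT itself."""
    cluster = SECTOR
    while partition_bytes // cluster > 0xFFF0:   # ~65,520 usable clusters
        cluster *= 2
    clusters = partition_bytes // cluster
    fat_bytes = clusters * 2                     # 16 bits per entry
    return cluster, fat_bytes

buffer_space = MAX_BUFFERS * SECTOR                       # 50,688 bytes = 49.5KB
buffer_ram = MAX_BUFFERS * (SECTOR + BUFFER_OVERHEAD)     # 52,272 bytes of RAM

for mb in (33, 40, 64, 128):
    cluster, fat = fat16_sizing(mb * 1024 * 1024)
    print(f"{mb}MB partition: {cluster} byte clusters, FAT = {fat // 1024}KB, "
          f"fits in 99 buffers: {fat <= buffer_space}")
```

For the 40MB partition this gives 1KB clusters and an 80KB FAT – well beyond the 49.5KB of data space that even 99 buffers provide.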

Here’s the difference that buffers makes to the WD-384R under DOS 4 with a single 40MB FAT-16 partition:

For random IO, the system performs at over five times the rate with 99 buffers compared with a single buffer, and 40% better than with 16 buffers.

For a PC/XT, with its ST-412 drive, DOS 2, and a single 10MB FAT-12 partition, the impact of buffers is rather different: the FAT is only about 4KB, so it can fit comfortably in just 8 buffers:
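The FAT-12 arithmetic works out like this (assuming the 4KB clusters DOS 2 used on a 10MB partition – cluster sizes varied between format versions, so this is a sketch):

```python
SECTOR = 512

# A 10MB FAT-12 partition, assuming 4KB clusters
partition = 10 * 1024 * 1024
cluster = 4 * 1024
clusters = partition // cluster           # 2,560 clusters
fat_bytes = clusters * 3 // 2             # 12 bits = 1.5 bytes per entry
buffers_needed = -(-fat_bytes // SECTOR)  # ceiling division

print(f"FAT size: {fat_bytes} bytes, fits in {buffers_needed} buffers")
```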

This all leads me to a few conclusions:

  • For FAT-16, use 99 buffers, unless the memory is really needed by programs
  • If space permits (up to 16MB), use a FAT-12 partition for database files, since fewer buffers are needed for optimum performance
  • On later (AT) machines with even a little XMS, a disk caching utility such as SmartDrv should offer significant further gains
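On such a machine, the DOS 4-era driver is loaded from CONFIG.SYS along these lines, where 256 is a cache size in KB (the exact syntax varies by DOS version, so treat this as a sketch) – with a cache in place, BUFFERS can be wound back down:

```
DEVICE=C:\DOS\SMARTDRV.SYS 256
BUFFERS=20
```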

Western Digital WD-384R, or is it a Tandon Drive?

Cracking open the Amstrad PC2286, I was surprised to find a Western Digital WD-384R RLL 40MB disk, as the few references I can find refer to Seagate disks.  Most likely the original disk was replaced, but it was an unexpected gem to find the WD disk.  There’s a good history of disks at http://redhill.net.au/d/d-a.html, but basically Western Digital, then a controller manufacturer, bought Tandon to gain a disk division and develop drives with integrated controllers – what would ultimately become IDE drives.  Tandon’s disks were simply re-stickered as Western Digital immediately after the takeover, in this case the TM364 becoming the WD-384.

3.5″ Form-Factor

The TM364’s 20MB 2-head sister, the TM262, was one of the first 3.5″ form-factor drives.  It was a pretty reliable classic RLL stepper-motor drive with somewhat relaxed performance – the specifications quoted an 85ms average seek, but measured today with Norton Calibrate, even that proves somewhat optimistic.  Still, this unit is running well 22 years on, which says a lot about its quality.

Bad Sectors and Interleave

Bad sectors were a reality of 1980s hard drives – internal relocation hadn’t been thought of, and the drives carried a handwritten list of known bad sectors on the label.  Once low-level and high-level formatted, a number of bad sectors would almost always be present; the Amstrad handbook states that up to 1% of drive capacity is acceptable.  With about 100KB of bad sectors, this drive is still performing within its as-new spec today.

Another complication was the interleave, which spaces out sectors so that the CPU has a minimal wait between sectors on sequential operations.  This disk, when operating through the Amstrad PC2286 with its 80286 CPU and Western Digital 1006 controller, supports an interleave of 1:1, meaning that an entire track can be read in one disk revolution.
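The effect of interleave on peak sequential throughput is a little arithmetic (assuming the 26 sectors per track typical of RLL encoding – MFM drives used 17):

```python
RPM = 3600
SECTORS_PER_TRACK = 26   # typical RLL; MFM would be 17
SECTOR_BYTES = 512

rev_ms = 60_000 / RPM                          # 16.67ms per revolution
track_bytes = SECTORS_PER_TRACK * SECTOR_BYTES # 13KB per track

for interleave in (1, 2, 3):
    # An interleave of N:1 needs N revolutions to read a full track
    track_ms = rev_ms * interleave
    kb_per_s = track_bytes / 1024 / (track_ms / 1000)
    print(f"{interleave}:1 interleave -> {track_ms:.1f}ms per track, "
          f"{kb_per_s:.0f}KB/s peak")
```

At 1:1 this works out to a 780KB/s peak off the platter – real-world throughput lands well below that once head switches and DOS overhead are paid for.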

Performance

Hooking up the drive to Intel’s IOMeter was possible by using DOS networking (LAN Manager) to share the volume to a Windows 2003 Server, but the performance was awful: the share gave only about 60KB/s and 2 IOPS.  The LAN Manager client performs better, achieving over 20 IOPS from a network share.

Because of this, I devised a simple DOS app to do the same types of tests, DiskTest, which simply performs the three core tests (32K sequential read and write, and 8K random) with a 4MB test file and displays the results.
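The test logic is simple enough to sketch.  This Python version is my guess at the structure (the real DiskTest is a DOS program), running the same three passes against a scratch file:

```python
import os
import random
import tempfile
import time

FILE_SIZE = 4 * 1024 * 1024   # 4MB test file
SEQ_BLOCK = 32 * 1024         # 32K sequential blocks
RND_BLOCK = 8 * 1024          # 8K random blocks
READ_MIX = 0.7                # 70% reads in the random pass

def disk_test(path, rnd_ops=200):
    block = os.urandom(SEQ_BLOCK)
    results = {}

    # Pass 1: sequential write, filling the file in 32K blocks
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(FILE_SIZE // SEQ_BLOCK):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())
    results["seq_write_kbs"] = FILE_SIZE / 1024 / (time.perf_counter() - start)

    # Pass 2: sequential read
    start = time.perf_counter()
    with open(path, "rb") as f:
        while f.read(SEQ_BLOCK):
            pass
    results["seq_read_kbs"] = FILE_SIZE / 1024 / (time.perf_counter() - start)

    # Pass 3: random 8K operations, 70% read / 30% write
    start = time.perf_counter()
    with open(path, "r+b") as f:
        for _ in range(rnd_ops):
            f.seek(random.randrange(FILE_SIZE // RND_BLOCK) * RND_BLOCK)
            if random.random() < READ_MIX:
                f.read(RND_BLOCK)
            else:
                f.write(block[:RND_BLOCK])
    results["rnd_iops"] = rnd_ops / (time.perf_counter() - start)
    return results

if __name__ == "__main__":
    path = os.path.join(tempfile.gettempdir(), "disktest.tmp")
    try:
        print(disk_test(path))
    finally:
        os.remove(path)
```

On a modern OS the figures mostly measure the page cache rather than the disk, of course – on DOS, the same access pattern hits the hardware (and the FAT) directly.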

When testing the random workload, the limitations of the DOS FAT file system were quickly evident, with much slower throughput than I was expecting, at just 3 IOPS.  The reason was that DOS kept having to seek to the FAT to find the blocks, and since the FAT(s) are at one end of the disk, this created a heavy full-stroke IO load as the test file happened to be near the other end.  Increasing buffers to 99 solved this, as would installing SMARTDRV with even a small read cache.  With 99 buffers, the results are:

  • 32K Sequential Read – 449KB/s
  • 32K Sequential Write – 282KB/s
  • 8K Random 70% read – 7.1 IOPS

Only 7 IOPS still seems low at first sight, but at these transfer rates and with 70% reads, this drive takes on average 21ms just to transfer the 8K, plus 105ms to find it, and, at 3,600 RPM, another 8ms of rotational latency for the data to arrive – 134ms in total, on average.  At the measured 7.1 IOPS (141ms per operation), the remaining 6ms or so is probably because some operations span two tracks, adding another 15ms, plus the odd seek to the file allocation table.
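That arithmetic can be checked directly from the measured figures, weighting the two sequential transfer rates by the 70/30 read mix:

```python
SEQ_READ_KBS = 449    # measured 32K sequential read
SEQ_WRITE_KBS = 282   # measured 32K sequential write
SEEK_MS = 105         # measured average seek
RPM = 3600
BLOCK_KB = 8
READ_MIX = 0.7

# Average time to transfer 8K, weighted 70% read / 30% write
transfer_ms = 1000 * (READ_MIX * BLOCK_KB / SEQ_READ_KBS
                      + (1 - READ_MIX) * BLOCK_KB / SEQ_WRITE_KBS)

# Average rotational latency is half a revolution
rotation_ms = 60_000 / RPM / 2

total_ms = SEEK_MS + rotation_ms + transfer_ms
print(f"transfer {transfer_ms:.0f}ms + seek {SEEK_MS}ms + "
      f"rotation {rotation_ms:.1f}ms = {total_ms:.0f}ms "
      f"-> {1000 / total_ms:.1f} IOPS")
```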

The performance with real apps really does feel lethargic on this drive, but adding a small cache with BUFFERS=99 or SMARTDRV makes a huge difference, and software from the time, such as Windows 2, then chugs along at a usable pace.