Welcome1.gif (9491 bytes)

How to Recognize, Repair and
Prevent Database Corruption

by Alec Mulvey

This two-part article appears in the November and
December 1998 issues of The Quick Answer

Database corruption is the unseen enemy lurking in the shadows when you’re working with your Q&A databases. It’s probably the most common cause of Q&A data loss—and it's getting more prevalent. In this article, you'll learn how to recognize a corrupt database, what you can do about it, and how to prevent it.

What is database corruption?
Database corruption is damage to a Q&A database that upsets its integrity and usually impairs its functionality. Database corruption isn’t confined to Q&A. Databases of all kinds suffer corruption. That’s why you’ll find equivalents to Q&A's Recover Database facilities in other database management programs.

What causes corruption?
There are many ways a database can become corrupt. Some are user-related, others aren’t. There are hardware and/or environmental causes such as these:

•  Hard disk error
•  Memory error
•  Network error
•  Power glitch (power loss, dip, or surge)
•  Lightning

User actions like these can corrupt a database:

•  Rebooting or switching off PC when a database is open
•  PC crashing (freezing) when a database is open

With Q&A 5.0 there are additional causes of database corruption:

•  Upgrading a Q&A 4.0 database to Q&A 5.0 without performing a few preliminaries
•  Copying the database design only, without first resaving all Specs
•  Expanding a Retrieve Spec field during a mailmerge or mailing label operation if default fonts have been saved for the word processor (Write).

Of these, the most common cause of corruption is the PC being switched off when a database is open. This can be a user turning off the PC or a power loss or system crash. The latter cause is more common these days as more people are running Q&A under various flavors of Windows, though it’s becoming less common as later versions of Windows become more stable (where one application crashing is less likely to result in the PC having to be rebooted).

Another frequent cause of database corruption these days is network users running Q&A in Windows and inadvertently starting a second (or more!) session of Q&A. With another Q&A session running in the background, a database file can be left open and exposed to corruption all day. On one occasion, I saw a user with eight Q&A sessions going at once! And the user tried to deny it!

How do you recognize corruption?
A corrupt database will usually (but not always) show symptoms—sometimes extreme symptoms such as one or more of the following:

•  Database can’t be browsed, causing a system error
•  Not all records can be retrieved
•  Unable to perform various tasks such as Recover Database or Copy Database
•  Hugely overblown database file size

Intermediate symptoms like these might appear:

•  Uncertainty over just how many records in the database
•  Error messages when creating new reports, for example
•  Thrown back to Main menu and password cleared when attempting, for example, to access the Program Spec or Restrict Spec

"Mild" symptoms like the following might appear:

•  Report doesn’t print—No records found message
•  Spurious characters in data fields or Retrieve Specs
•  Database file inexplicably omitted from backup

A database can show no symptoms at all and still be corrupted.

Ironically, the most dramatic symptoms can be the result of minor damage and the most innocuous symptoms can be evidence of irreparable corruption. A database that can’t even be browsed might be fully repaired by recovering it. On the other hand, a very severely damaged database might show no ill effects whatever during day to day use, but be damaged beyond recovery, copying, or modification.

What can you do about corruption?
Rule No. 1 when you suspect you have a damaged database is:

Set aside any old backups of the database—DO NOT overwrite them.
Then, make a new backup of the database and put it in a safe place.

This way you’re always working on a copy, or have backup copy available in case any of your actions further damage the database (which is quite likely).

It’s absolutely vital to frequently backup your databases. How frequent? Each day, or several times each day, you should backup all the data you can’t afford to lose or re-enter. What’s more, you should retain older backups (not just overwrite one backup with another) in case a database has gone bad over time without you noticing it. One day, all your attempts at data recovery might fail, in which case all you’ll have is your most recent viable backup. By viable I mean that which can be effectively restored. So you should perform test restores from time to time to ensure your backups are okay.

The data recovery options available to you depend on what tools you have.

Assuming you just have Q&A, you can run Recover ( File / Utilities / Recover database). This is the primary tool for most users. Recovery often remedies database problems.

Recovery will either run its course and display the message, Recovery was successful, or it will fail. If it fails, it will either hang or terminate with the message, Recovery failed - use a backup copy of the database. Attempting Recovery on a badly damaged database might crash Q&A, or it might not even allow Recovery to start.

If Recovery succeeds, immediately make a new backup of the database and keep it in a safe place. Unfortunately, a "successful" recovery is no guarantee of a sound database (though it's a good start). Be especially suspicious if Recovery seems to proceed too fast for the file size (for example, 10 seconds for an indexed 5M database). If Recovery fails, the database is almost certainly wrecked. In this case, delete the file from your hard drive, make another copy of your backup file, then try Recovery again—up to three times. If all attempts fail, try some of the other methods I’ll suggest.

Once Recovery is started, it can’t be stopped. If the file is large and your PC is slow, it’s wise to copy the file to a local disk and/or to a faster PC to perform Recovery. Recovering a large database on a slow network can take hours. You’ll also need to have sufficient free disk space—at least twice the size of the database file plus its index file—to perform a Recovery.

Recovery is done in phases. Phase 1 is fairly fast. Phase 2 is usually very quick, especially if there aren’t many reports. Phase 3 is related to the amount of data and number of indexed fields. For an "untaught" database, it’s usually the slowest. If there are no indexed fields, then Phase 3 is instantaneous. Phases 4 to 8 relate to the Intelligent Assistant and Query Guide. If these are untaught, then Phases 4 to 8 are omitted. If the Query Guide or Intelligent Assistant has been taught, then phase 3 is omitted and the slow phases come later.

Repairing indexes
Some database corruption problems have to do with indexing (Speedy fields). A quick check is to retrieve all records, then move to the last record by pressing (Ctrl-End). The record number on the status line should equal the total number of records. If it doesn’t, then the indexing is faulty. Recovery will often fix indexing problems. If it doesn’t, you can reindex the database or replace the index file.

To reindex the file:

1. Go to the Speed-up Spec ( File / Design / Customize / Speed up Searches).
2. Make a note of which fields are indexed.
3. Clear the Spec with F3, then press F10 to save it.
4. Return to the Spec and index no more than two fields at a time, starting with the most important field(s) first.

The reason for indexing the database two fields at a time is that Q&A handles the exercise more efficiently this way. The first field to be indexed should be the one that defines a unique value for each record—a key field. Every database should have such a field.

To replace a damaged index file (or to use a Q&A database with a missing index file) do the following. (Let’s say the database with the damaged index file is named Badindex.dtf)

1. Outside of Q&A, delete Badindex.idx
2. Outside of Q&A, make a copy of the .idx file belonging to any other database, naming it Badindex.idx. (The database won’t be usable at this stage. Attempting to browse it will produce a system error.)
3. In Q&A, Recover Badindex.dtf

The database should now spring back to life with its former indexing. (The Speedy fields Spec is stored in the .dtf file.)

If you don't have another database, simply create a one-field database for the purpose.

The tasks of deleting the index file and renaming an index file from another database can’t be done in Q&A’s File module because there Q&A performs file management tasks on the .dtf and .idx files in tandem. (You can do it in Q&A only by using Main menu Utilities / DOS File Facilities.) Because of the danger of accidentally deleting the database itself, it’s safer to perform these tasks using your favorite file manager (for example, Windows 3.1 File Manager, or Windows 95/98 Explorer or My Computer).

Symantec technical notes on this subject describe a very much more complicated procedure. You'll find, however, that the above will do the trick.

Copying the database
For Q&A 4.0 databases, copying the database can be useful repair tool. It should be done after a successful Recovery and so, by definition, after you’ve made a backup:

1. Select File / Copy / Copy Design only.
2. Specify a new name for the database copy. (For example, copy Customer to Custom2.)
3. Select File / Copy / Copy Selected Records and copy all the records to the new database.

This rebuilds the database in its entirety.

Do not do this on a Q&A 5.0 database without first opening and saving all the Specs associated with the database. (See "Preventing Corruption" below.)

The copy database’s properties will be identical to the earlier one with a single exception: its sharing mode will default to Automatic. So this might need to be reset to Allow (File / Design File / Secure a File / Declare Sharing Mode) if the database is used on a peer-to-peer network.

Saving Specs
Corruption can reside in numerous parts of the database—in the data or the database structure. Within the structure, corruption can reside in various places, but is most commonly found in reports, Print Specs, and some of the Specs on the Customize menu. If you find or suspect corruption and the above techniques don’t work, try opening and resaving each of these Specs. For reports and Print Specs, this means going through each of the three stages, then declining the offer to print. (A macro can be created to speed this process.) You might find you get an error message at a particular Spec and can’t save it. In this case, you’ve found your corruption and clearing that Spec is the solution. On the other hand, opening the damaged Spec can crash Q&A. In this case (say it’s a corrupted report), you’ll have to delete the report (assuming Q&A will let you) or use some other means to repair the database.

That’s about all you can do if all you have is Q&A. However, there are other tools available.

Data extraction utility
QEXTRACT is a Symantec utility you can download from The Quick Answer’s Web site (Free Downloads page). It lets you extract the data, form design and programming from a Q&A database. (It works with all versions of Q&A from 3.0 onward.)

When you run QEXTRACT, you’re told it’s a beta version. Well, it’s been a "beta" version forever. Don’t worry about this—it is the latest version!

QEXTRACT respects Q&A security, so you'll need to provide a valid user ID and password to extract data from a secured database

QEXTRACT allows you to extract the data from a database into an ASCII file. It also lets you extract the form design and programming into respective ASCII files. If you were to rebuild a database from these, you’d still lose the field types, Navigation Spec, any saved Retrieve Specs, Reports, Print Specs, Speedy fields, Custom Help, Restrict Spec, and so on. What’s more, the programming, though extracted, can’t easily be transferred back to the database—plan on retyping most of it from a printout of the ASCII file. So, relying on QEXTRACT to perform a complete recovery is unwise.

QEXTRACT also has three further unexpected shortcomings:

It usually extracts somewhat fewer than all the records. Of course, you don’t know how many records are missing or which ones!

If you’ve redesigned the database, QEXTRACT will create one output file for each generation, named 86.STR, 89.STR, and so on, indicating the number of fields. These will need to be separately imported into a rebuilt database using Merge Specs.

What’s a Database "Generation"?
When you make a change to a database design, the new version of the database isn’t just the current version of the database as far as Q&A is concerned. Q&A separately stores each set of changes to the database design—the generation information—in the .dtf file. It’s clear that the more generations there are, the more information Q&A has to remember, and the greater the risk that Q&A will get confused over them at some stage.

Worse still, if two generations of the database have the same number of fields (say if two fields swap places), then records saved in each of these generations will go to a single file. As an ASCII file is simply a list of field values (first field, second field, and so on), the data entered into, say, the 9th field will depend on which field was the 9th field at the time the data was entered. You might not realize that some of the fields extracted by QEXTRACT have actually been transposed.

This likelihood of (minor) data loss and risk of data transposition renders QEXTRACT unsuitable as a reliable data extraction tool.

So far we’ve looked at several ways of handling a corrupted Q&A database. First, using the facilities built in to Q&A itself, then using a utility program. Once you’ve exhausted these you don’t have to give up, but you’ll have to get your checkbook out.

Have the doctor look at your database
DTFDOCTR is a superb suite of Q&A diagnostic and repair utilities written by John Dow. Using it you can:

•  Obtain in an instant useful information about a database such as the number of fields and records in it, how many generations there are, whether the Intelligent Assistant has been taught, and how much of the finite programming space remains.
•  Perform analyses of data and Specs (Program, Initial Values, reports, Print, and so forth) to find out if they’re damaged
•  Extract data from a damaged database (or a good one for that matter).
•  Transfer reports between databases
•  Repair damaged databases

As a data extraction tool, DTFDOCTR far surpasses QEXTRACT. It extracts all the records that can be extracted, it doesn’t transpose fields, and you don’t have to create multiple Merge specs if your data is more than one generation. DTFDOCTR is significantly faster, too. With its database repair tools, you might not need to extract your data at all.

This isn’t the place for a detailed evaluation of DTFDOCTR (see "Database Corruption Revisited" in the June 1997 issue and check under DTFDOCTR and DTFSPECS in the Topic Index at www.quickanswer.com). DTFDOCTR offers a host of database repair and extraction tools, as well as a number of other facilities (such as transferring reports and determining generations) which would be invaluable by themselves. Anyone who uses Q&A seriously should have a copy of this program. Certainly, it’s an essential for all Q&A developers and database administrators.

If all that fails
If everything we’ve discussed so far fails, there’s not much else you can do. There are some types of problems (beyond the scope of this article) that can be fixed by editing the file at the hex level, but this is problematic and best left to experienced professionals. Many of these tricks can be performed much more safely by DTFDOCTR. Once you’ve tried all these techniques without success on your unusable Q&A database, it’s time, as the saying goes, to "revert to your most recent backup," or send the database to Symantec’s data recovery service.

How to avoid corruption
Corruption isn’t quite inevitable, but it’s certainly common. There is much, though, that you can do to avoid it.

Hardware & environmental measures

•  Keep your hard disk in good order by running Scandisk and Defrag regularly. (These utilities are supplied with DOS since version 6.2 and Windows 95 and later.) Don't, however, set up Scandisk to run attended, automatically repair errors and not write to a log (or neglect perusing the log). If you do, you won’t be aware if Scandisk has been repairing errors on a regular basis—a sure sign that a hard disk replacement is in the offing.
•  Ensure your network is set up correctly, and protect network cables from physical damage.
•  Attach a UPS (Uninterruptible Power Supply) to the file server, and consider getting them for workstations, too. This particularly applies to Q&A 5.0.
•  Don’t use databases during a thunderstorm. Ideally, switch off and unplug the computers (but don't touch power leads during a thunderstorm for safety reasons). To take this one stage further, unplug the telephone cords connecting the modem(s), too, to avoid telephone-system-borne power problems. This is clearly impractical for a network but is a consideration for home office. (An excuse for an early lunch, perhaps!)

User measures

•  Ensure Q&A is set up correctly.
•  Ensure good practice is observed as regards exiting Q&A properly, not leaving databases open unnecessarily, and so on.
•  Don’t casually alter the design of your databases. (See "More on Redesign" below.)

Q&A 5.0-specific measures

•  Don’t upgrade a Q&A 4.0 database to Q&A 5.0 without first performing the steps shown in the sidebar, "Before You Upgrade a Q&A 4.0 Database to Q&A 5.0."
•  Don’t set a default font for Write. (When you choose Write / Type / Edit and press Ctrl-F9 (Assign fonts), the screen should be blank.)
•  Don’t copy a Q&A 5.0 database without first bringing the data and its structure up to one generation. Backup the database (onto the C: drive is sufficient), Recover it, resave all Specs (including reports and Print Specs), and perform a null Mass Update (a Mass Update of all the records using a blank Update Spec).

Before You Upgrade a Q&A 4.0
Database to Q&A 5.0

1. Back up the database.
2. Delete any reports and Print Specs you no longer use.
3. Change the global date format to (1).
4. Remove any @Date and @Time Initial values.
5. Note and remove any Speedy field codes (S, SU, SE).
6. Run Recover database.
7. Perform a null Mass Update. (Update all records leaving the Update Spec blank).
8. Open the database in Q&A 5.0 to upgrade it.
9. Change the global date format as desired.
10. Reinstate any @Date and @Time Initial Values.
11. Reinstate Speedy fields.
12. Set the File Sharing Mode as appropriate (Allow for a peer-to-peer network).
13. Change the form colors, if desired.
14. Recover the database again.
15. Run a null Mass Update.
16. Make another backup and keep both backups in a safe place and don’t overwrite them.

Some of these steps aren’t absolutely essential but are certainly desirable. Steps 14 and 15, for example, could be omitted, but they’ll do no harm and might do some good.

More on Redesign
Q&A makes it easy to add a field to a database. But doing so represents a significant change to the database’s structure and needs to be done with care.

Before any database redesign, first make a backup of the database (onto the C: drive is sufficient). Run Recover on it. This might correct a minor problem that could well have been exacerbated by the redesign. Then make your design changes.

You’re not done yet, though. When you add or delete field(s), you’re creating a new generation of the database. You should then take steps to reduce the generation overhead of Q&A. At a minimum, you need to bring all the data to a single generation. This is done by performing a null Mass Update, which effectively rewrites the data. However, you’d be well advised to bring all Specs to the current generation as well. Resaving them does this. Save all Specs on the Customize menu, all Specs on the Programming menu, all the reports, and finally all the Print Specs.

Once you've gone to all this trouble, it’s very little extra work to go that one step further and create a new first generation database (File / Copy / Copy Design only). This produces a new database with no data. Give the new database a suitable description (press F6 at the List of Files screen), then make another copy of this blank database design (F5 at the List of Files screen) and store it in a safe place.

Next, copy all the records in the new database, using blank Retrieve and Merge Specs. The resulting database will then be as sound as you can make it and will not be prone to generation-related problems. The new database, once checked, can be renamed with the name of the original database, overwriting it in the process.

If you intend to make a first-generation database, you can omit the null Mass Update following the redesign. You can't omit saving the Specs though. Copying a Q&A 5.0 database with Specs from a previous generation can introduce spurious "garbage" characters in the Retrieve Specs and perhaps the Column/Sort Specs of your reports.

Q&A 5.0 is particularly sensitive to multiple generations. Some commentators (Messrs Meigs and Halpern, for instance) say that there’s only one good generation of a Q&A 5.0 database—the first one.

I’ve found that there’s a direct relationship between the incidence of database corruption and the frequency of clients modifying their databases. The sites where I get no corruption at all for many years are the ones where they simply leave their databases alone. This is another reason why database corruption is becoming more prevalent: databases are getting older and have endured more redesigns without the preventative maintenance they need.

What Defines a "New Generation"?
Imagine the definition of a Q&A database as a list of the field codes seen in form design view, like this:

<AA> <AB> <AC> <AD> <AE>

Anything that changes that list creates a new generation. Thus adding a field, deleting a field, or swapping fields creates a new generation. Changing field labels or field names, or changing the size of fields does not.

Changing a field type
Changing a field type (at the Format Spec) should not be undertaken lightly. If you change a field from text to number, or from number to date, for example, then Q&A will accept the change, but you might be left with report Retrieve Specs having conflicting parameters (for example, .. Oct 96 in what is now a date field). This is an invalid Retrieve Spec, but it’s stored in the database and could facilitate corruption at a later date.

Proactive data protection
If you really want to safeguard your data, you need to actively look after it. The following needs to be done regularly to all databases:

Backup, Recover, Null Mass Update

Do these at frequent intervals, especially after a form redesign as described earlier. A macro-driven procedure can be created to perform these tasks automatically on a regular basis, and scheduled to run during the night when the databases aren’t in use.

Moreover, the following can be done to ensure the integrity of your databases:

•  Backup
•  Recover the database
•  Null mass update
•  Save all Specs, including reports
•  Remove indexes
•  Copy design / Copy selected records to the new database
•  Reinstate indexes in the new database
•  Reset Sharing Mode to Allow if necessary

This is as much as you can possibly do to ensure you have a sound database. There is no way to guarantee a Q&A database is free of corruption.

Keep a copy of each database’s design
Lastly, you should always keep a copy of the database design (not necessarily the data). You should make a new copy every time you redesign the database, and periodically thereafter. This way, if the worse happens, you at least have a structure into which you can import the data. Extracting data from a damaged database is always easier than extracting the database design.

Conclusion
Your data is always at risk. For no apparent reason, your database can become unusable. Having adequate and viable backups is the first and most important safeguard, but isn’t enough. There’s much you can do to help prevent corruption occurring. If it does occur, there’s much you can do to remedy it. But if these fail, it’s comforting to know that you have a recent backup to fall back on. Accordingly, the last words on the subject must be, Backup! Backup!! Backup!!!

Alec Mulvey owns Keyword Software & Consultancy based in Ascot, near London, England. Alec has been building Q&A applications and performing data recovery for nine years, and is the current president of the (Inter)National Q&A User Group. Keyword Software is also the UK distributor for the International English edition of Q&A (the "Definitive Edition" CD-ROM) and also John Dow's Utilities. Fax +44-1344-884-111, email alec@KeywordSoftware.com   http://www.KeywordSoftware.com.

Useful Contacts

John Dow
6560 Rosemoor Street
Pittsburgh, PA 15217-3026
412-973-9473
Email: sales@johntdow.com
Web site: http://www.johntdow.com

Symantec data recovery
175 West Broadway
Eugene, OR 97401-3003
541-984-7910
Email: datarec@symantec.com
Web site: http://www.symantec.com/techsupp/recovery/qa/qa_index.html

Q&A User Group
Web site: http://www.qaug.com

Comments? Looking for something? Write us at inbox@quickanswer.com
Want to submit an article to The Quick Answer? Write to office@quickanswer.com
Phone -  800-780-5474 or 949-722-9127 / Fax - 949-722-9127
Marble Publications, 1927A Harbor Blvd., Costa Mesa, California 92627
Last modified: December 11, 2006