How to Recognize, Repair and
Prevent Database Corruption
by Alec Mulvey
This two-part article appears in the November
December 1998 issues of The Quick Answer
Database corruption is the unseen enemy lurking in the shadows when youre working with your Q&A databases. Its probably the most common cause of Q&A data lossand it's getting more prevalent. In this article, you'll learn how to recognize a corrupt database, what you can do about it, and how to prevent it.
What is database corruption?
Database corruption is damage to a Q&A database that upsets its integrity and usually impairs its functionality. Database corruption isnt confined to Q&A. Databases of all kinds suffer corruption. Thats why youll find equivalents to Q&A's Recover Database facilities in other database management programs.
What causes corruption?
There are many ways a database can become corrupt. Some are user-related, others arent. There are hardware and/or environmental causes such as these:
Hard disk error
Power glitch (power loss, dip, or surge)
User actions like these can corrupt a database:
Rebooting or switching off PC when a database is open
PC crashing (freezing) when a database is open
With Q&A 5.0 there are additional causes of database corruption:
Upgrading a Q&A 4.0 database to Q&A 5.0 without performing a few preliminaries
Copying the database design only, without first resaving all Specs
Expanding a Retrieve Spec field during a mailmerge or mailing label operation if default fonts have been saved for the word processor (Write).
Of these, the most common cause of corruption is the PC being switched off when a database is open. This can be a user turning off the PC or a power loss or system crash. The latter cause is more common these days as more people are running Q&A under various flavors of Windows, though its becoming less common as later versions of Windows become more stable (where one application crashing is less likely to result in the PC having to be rebooted).
Another frequent cause of database corruption these days is network users running Q&A in Windows and inadvertently starting a second (or more!) session of Q&A. With another Q&A session running in the background, a database file can be left open and exposed to corruption all day. On one occasion, I saw a user with eight Q&A sessions going at once! And the user tried to deny it!
How do you recognize corruption?
A corrupt database will usually (but not always) show symptomssometimes extreme symptoms such as one or more of the following:
Database cant be browsed, causing a system error
Not all records can be retrieved
Unable to perform various tasks such as Recover Database or Copy Database
Hugely overblown database file size
Intermediate symptoms like these might appear:
Uncertainty over just how many records in the database
Error messages when creating new reports, for example
Thrown back to Main menu and password cleared when attempting, for example, to access the Program Spec or Restrict Spec
"Mild" symptoms like the following might appear:
Report doesnt printNo records found message
Spurious characters in data fields or Retrieve Specs
Database file inexplicably omitted from backup
A database can show no symptoms at all and still be corrupted.
Ironically, the most dramatic symptoms can be the result of minor damage and the most innocuous symptoms can be evidence of irreparable corruption. A database that cant even be browsed might be fully repaired by recovering it. On the other hand, a very severely damaged database might show no ill effects whatever during day to day use, but be damaged beyond recovery, copying, or modification.
What can you do about corruption?
Rule No. 1 when you suspect you have a damaged database is:
Set aside any old backups of the databaseDO NOT
Then, make a new backup of the database and put it in a safe place.
This way youre always working on a copy, or have backup copy available in case any of your actions further damage the database (which is quite likely).
Its absolutely vital to frequently backup your databases. How frequent? Each day, or several times each day, you should backup all the data you cant afford to lose or re-enter. Whats more, you should retain older backups (not just overwrite one backup with another) in case a database has gone bad over time without you noticing it. One day, all your attempts at data recovery might fail, in which case all youll have is your most recent viable backup. By viable I mean that which can be effectively restored. So you should perform test restores from time to time to ensure your backups are okay.
The data recovery options available to you depend on what tools you have.
Assuming you just have Q&A, you can run Recover ( File / Utilities / Recover database). This is the primary tool for most users. Recovery often remedies database problems.
Recovery will either run its course and display the message, Recovery was successful, or it will fail. If it fails, it will either hang or terminate with the message, Recovery failed - use a backup copy of the database. Attempting Recovery on a badly damaged database might crash Q&A, or it might not even allow Recovery to start.
If Recovery succeeds, immediately make a new backup of the database and keep it in a safe place. Unfortunately, a "successful" recovery is no guarantee of a sound database (though it's a good start). Be especially suspicious if Recovery seems to proceed too fast for the file size (for example, 10 seconds for an indexed 5M database). If Recovery fails, the database is almost certainly wrecked. In this case, delete the file from your hard drive, make another copy of your backup file, then try Recovery againup to three times. If all attempts fail, try some of the other methods Ill suggest.
Once Recovery is started, it cant be stopped. If the file is large and your PC is slow, its wise to copy the file to a local disk and/or to a faster PC to perform Recovery. Recovering a large database on a slow network can take hours. Youll also need to have sufficient free disk spaceat least twice the size of the database file plus its index fileto perform a Recovery.
Recovery is done in phases. Phase 1 is fairly fast. Phase 2 is usually very quick, especially if there arent many reports. Phase 3 is related to the amount of data and number of indexed fields. For an "untaught" database, its usually the slowest. If there are no indexed fields, then Phase 3 is instantaneous. Phases 4 to 8 relate to the Intelligent Assistant and Query Guide. If these are untaught, then Phases 4 to 8 are omitted. If the Query Guide or Intelligent Assistant has been taught, then phase 3 is omitted and the slow phases come later.
Some database corruption problems have to do with indexing (Speedy fields). A quick check is to retrieve all records, then move to the last record by pressing (Ctrl-End). The record number on the status line should equal the total number of records. If it doesnt, then the indexing is faulty. Recovery will often fix indexing problems. If it doesnt, you can reindex the database or replace the index file.
To reindex the file:
1. Go to the Speed-up Spec ( File / Design / Customize / Speed up Searches).
2. Make a note of which fields are indexed.
3. Clear the Spec with F3, then press F10 to save it.
4. Return to the Spec and index no more than two fields at a time, starting with the most important field(s) first.
The reason for indexing the database two fields at a time is that Q&A handles the exercise more efficiently this way. The first field to be indexed should be the one that defines a unique value for each recorda key field. Every database should have such a field.
To replace a damaged index file (or to use a Q&A database with a missing index file) do the following. (Lets say the database with the damaged index file is named Badindex.dtf)
1. Outside of Q&A, delete Badindex.idx
2. Outside of Q&A, make a copy of the .idx file belonging to any other database, naming it Badindex.idx. (The database wont be usable at this stage. Attempting to browse it will produce a system error.)
3. In Q&A, Recover Badindex.dtf
The database should now spring back to life with its former indexing. (The Speedy fields Spec is stored in the .dtf file.)
If you don't have another database, simply create a one-field database for the purpose.
The tasks of deleting the index file and renaming an index file from another database cant be done in Q&As File module because there Q&A performs file management tasks on the .dtf and .idx files in tandem. (You can do it in Q&A only by using Main menu Utilities / DOS File Facilities.) Because of the danger of accidentally deleting the database itself, its safer to perform these tasks using your favorite file manager (for example, Windows 3.1 File Manager, or Windows 95/98 Explorer or My Computer).
Symantec technical notes on this subject describe a very much more complicated procedure. You'll find, however, that the above will do the trick.
Copying the database
For Q&A 4.0 databases, copying the database can be useful repair tool. It should be done after a successful Recovery and so, by definition, after youve made a backup:
1. Select File / Copy / Copy Design only.
2. Specify a new name for the database copy. (For example, copy Customer to Custom2.)
3. Select File / Copy / Copy Selected Records and copy all the records to the new database.
This rebuilds the database in its entirety.
Do not do this on a Q&A 5.0 database without first opening and saving all the Specs associated with the database. (See "Preventing Corruption" below.)
The copy databases properties will be identical to the earlier one with a single exception: its sharing mode will default to Automatic. So this might need to be reset to Allow (File / Design File / Secure a File / Declare Sharing Mode) if the database is used on a peer-to-peer network.
Corruption can reside in numerous parts of the databasein the data or the database structure. Within the structure, corruption can reside in various places, but is most commonly found in reports, Print Specs, and some of the Specs on the Customize menu. If you find or suspect corruption and the above techniques dont work, try opening and resaving each of these Specs. For reports and Print Specs, this means going through each of the three stages, then declining the offer to print. (A macro can be created to speed this process.) You might find you get an error message at a particular Spec and cant save it. In this case, youve found your corruption and clearing that Spec is the solution. On the other hand, opening the damaged Spec can crash Q&A. In this case (say its a corrupted report), youll have to delete the report (assuming Q&A will let you) or use some other means to repair the database.
Thats about all you can do if all you have is Q&A. However, there are other tools available.
Data extraction utility
QEXTRACT is a Symantec utility you can download from The Quick Answers Web site (Free Downloads page). It lets you extract the data, form design and programming from a Q&A database. (It works with all versions of Q&A from 3.0 onward.)
When you run QEXTRACT, youre told its a beta version. Well, its been a "beta" version forever. Dont worry about thisit is the latest version!
QEXTRACT respects Q&A security, so you'll need to provide a valid user ID and password to extract data from a secured database
QEXTRACT allows you to extract the data from a database into an ASCII file. It also lets you extract the form design and programming into respective ASCII files. If you were to rebuild a database from these, youd still lose the field types, Navigation Spec, any saved Retrieve Specs, Reports, Print Specs, Speedy fields, Custom Help, Restrict Spec, and so on. Whats more, the programming, though extracted, cant easily be transferred back to the databaseplan on retyping most of it from a printout of the ASCII file. So, relying on QEXTRACT to perform a complete recovery is unwise.
QEXTRACT also has three further unexpected shortcomings:
It usually extracts somewhat fewer than all the records. Of course, you dont know how many records are missing or which ones!
If youve redesigned the database, QEXTRACT will create one output file for each generation, named 86.STR, 89.STR, and so on, indicating the number of fields. These will need to be separately imported into a rebuilt database using Merge Specs.
|Whats a Database
When you make a change to a database design, the new version of the database isnt just the current version of the database as far as Q&A is concerned. Q&A separately stores each set of changes to the database designthe generation informationin the .dtf file. Its clear that the more generations there are, the more information Q&A has to remember, and the greater the risk that Q&A will get confused over them at some stage.
Worse still, if two generations of the database have the same number of fields (say if two fields swap places), then records saved in each of these generations will go to a single file. As an ASCII file is simply a list of field values (first field, second field, and so on), the data entered into, say, the 9th field will depend on which field was the 9th field at the time the data was entered. You might not realize that some of the fields extracted by QEXTRACT have actually been transposed.
This likelihood of (minor) data loss and risk of data transposition renders QEXTRACT unsuitable as a reliable data extraction tool.
So far weve looked at several ways of handling a corrupted Q&A database. First, using the facilities built in to Q&A itself, then using a utility program. Once youve exhausted these you dont have to give up, but youll have to get your checkbook out.
Have the doctor look at your database
DTFDOCTR is a superb suite of Q&A diagnostic and repair utilities written by John Dow. Using it you can:
Obtain in an instant useful information about a database such as the number of fields and records in it, how many generations there are, whether the Intelligent Assistant has been taught, and how much of the finite programming space remains.
Perform analyses of data and Specs (Program, Initial Values, reports, Print, and so forth) to find out if theyre damaged
Extract data from a damaged database (or a good one for that matter).
Transfer reports between databases
Repair damaged databases
As a data extraction tool, DTFDOCTR far surpasses QEXTRACT. It extracts all the records that can be extracted, it doesnt transpose fields, and you dont have to create multiple Merge specs if your data is more than one generation. DTFDOCTR is significantly faster, too. With its database repair tools, you might not need to extract your data at all.
This isnt the place for a detailed evaluation of DTFDOCTR (see "Database Corruption Revisited" in the June 1997 issue and check under DTFDOCTR and DTFSPECS in the Topic Index at www.quickanswer.com). DTFDOCTR offers a host of database repair and extraction tools, as well as a number of other facilities (such as transferring reports and determining generations) which would be invaluable by themselves. Anyone who uses Q&A seriously should have a copy of this program. Certainly, its an essential for all Q&A developers and database administrators.
If all that fails
If everything weve discussed so far fails, theres not much else you can do. There are some types of problems (beyond the scope of this article) that can be fixed by editing the file at the hex level, but this is problematic and best left to experienced professionals. Many of these tricks can be performed much more safely by DTFDOCTR. Once youve tried all these techniques without success on your unusable Q&A database, its time, as the saying goes, to "revert to your most recent backup," or send the database to Symantecs data recovery service.
How to avoid corruption
Corruption isnt quite inevitable, but its certainly common. There is much, though, that you can do to avoid it.
Hardware & environmental measures
Keep your hard disk in good order by running Scandisk and Defrag regularly. (These utilities are supplied with DOS since version 6.2 and Windows 95 and later.) Don't, however, set up Scandisk to run attended, automatically repair errors and not write to a log (or neglect perusing the log). If you do, you wont be aware if Scandisk has been repairing errors on a regular basisa sure sign that a hard disk replacement is in the offing.
Ensure your network is set up correctly, and protect network cables from physical damage.
Attach a UPS (Uninterruptible Power Supply) to the file server, and consider getting them for workstations, too. This particularly applies to Q&A 5.0.
Dont use databases during a thunderstorm. Ideally, switch off and unplug the computers (but don't touch power leads during a thunderstorm for safety reasons). To take this one stage further, unplug the telephone cords connecting the modem(s), too, to avoid telephone-system-borne power problems. This is clearly impractical for a network but is a consideration for home office. (An excuse for an early lunch, perhaps!)
Ensure Q&A is set up correctly.
Ensure good practice is observed as regards exiting Q&A properly, not leaving databases open unnecessarily, and so on.
Dont casually alter the design of your databases. (See "More on Redesign" below.)
Q&A 5.0-specific measures
Dont upgrade a Q&A 4.0 database to Q&A 5.0 without first performing the steps shown in the sidebar, "Before You Upgrade a Q&A 4.0 Database to Q&A 5.0."
Dont set a default font for Write. (When you choose Write / Type / Edit and press Ctrl-F9 (Assign fonts), the screen should be blank.)
Dont copy a Q&A 5.0 database without first bringing the data and its structure up to one generation. Backup the database (onto the C: drive is sufficient), Recover it, resave all Specs (including reports and Print Specs), and perform a null Mass Update (a Mass Update of all the records using a blank Update Spec).
|Before You Upgrade a Q&A 4.0
Database to Q&A 5.0
1. Back up
Some of these steps arent absolutely essential but are certainly desirable. Steps 14 and 15, for example, could be omitted, but theyll do no harm and might do some good.
More on Redesign
Q&A makes it easy to add a field to a database. But doing so represents a significant change to the databases structure and needs to be done with care.
Before any database redesign, first make a backup of the database (onto the C: drive is sufficient). Run Recover on it. This might correct a minor problem that could well have been exacerbated by the redesign. Then make your design changes.
Youre not done yet, though. When you add or delete field(s), youre creating a new generation of the database. You should then take steps to reduce the generation overhead of Q&A. At a minimum, you need to bring all the data to a single generation. This is done by performing a null Mass Update, which effectively rewrites the data. However, youd be well advised to bring all Specs to the current generation as well. Resaving them does this. Save all Specs on the Customize menu, all Specs on the Programming menu, all the reports, and finally all the Print Specs.
Once you've gone to all this trouble, its very little extra work to go that one step further and create a new first generation database (File / Copy / Copy Design only). This produces a new database with no data. Give the new database a suitable description (press F6 at the List of Files screen), then make another copy of this blank database design (F5 at the List of Files screen) and store it in a safe place.
Next, copy all the records in the new database, using blank Retrieve and Merge Specs. The resulting database will then be as sound as you can make it and will not be prone to generation-related problems. The new database, once checked, can be renamed with the name of the original database, overwriting it in the process.
If you intend to make a first-generation database, you can omit the null Mass Update following the redesign. You can't omit saving the Specs though. Copying a Q&A 5.0 database with Specs from a previous generation can introduce spurious "garbage" characters in the Retrieve Specs and perhaps the Column/Sort Specs of your reports.
Q&A 5.0 is particularly sensitive to multiple generations. Some commentators (Messrs Meigs and Halpern, for instance) say that theres only one good generation of a Q&A 5.0 databasethe first one.
Ive found that theres a direct relationship between the incidence of database corruption and the frequency of clients modifying their databases. The sites where I get no corruption at all for many years are the ones where they simply leave their databases alone. This is another reason why database corruption is becoming more prevalent: databases are getting older and have endured more redesigns without the preventative maintenance they need.
|What Defines a "New Generation"?
Imagine the definition of a Q&A database as a list of the field codes seen in form design view, like this:
<AA> <AB> <AC> <AD> <AE>
Anything that changes that list creates a new generation. Thus adding a field, deleting a field, or swapping fields creates a new generation. Changing field labels or field names, or changing the size of fields does not.
Changing a field type
Changing a field type (at the Format Spec) should not be undertaken lightly. If you change a field from text to number, or from number to date, for example, then Q&A will accept the change, but you might be left with report Retrieve Specs having conflicting parameters (for example, .. Oct 96 in what is now a date field). This is an invalid Retrieve Spec, but its stored in the database and could facilitate corruption at a later date.
Proactive data protection
If you really want to safeguard your data, you need to actively look after it. The following needs to be done regularly to all databases:
Backup, Recover, Null Mass Update
Do these at frequent intervals, especially after a form redesign as described earlier. A macro-driven procedure can be created to perform these tasks automatically on a regular basis, and scheduled to run during the night when the databases arent in use.
Moreover, the following can be done to ensure the integrity of your databases:
Recover the database
Null mass update
Save all Specs, including reports
Copy design / Copy selected records to the new database
Reinstate indexes in the new database
Reset Sharing Mode to Allow if necessary
This is as much as you can possibly do to ensure you have a sound database. There is no way to guarantee a Q&A database is free of corruption.
Keep a copy of each databases design
Lastly, you should always keep a copy of the database design (not necessarily the data). You should make a new copy every time you redesign the database, and periodically thereafter. This way, if the worse happens, you at least have a structure into which you can import the data. Extracting data from a damaged database is always easier than extracting the database design.
Your data is always at risk. For no apparent reason, your database can become unusable. Having adequate and viable backups is the first and most important safeguard, but isnt enough. Theres much you can do to help prevent corruption occurring. If it does occur, theres much you can do to remedy it. But if these fail, its comforting to know that you have a recent backup to fall back on. Accordingly, the last words on the subject must be, Backup! Backup!! Backup!!!
Alec Mulvey owns Keyword Software & Consultancy based in Ascot, near London, England. Alec has been building Q&A applications and performing data recovery for nine years, and is the current president of the (Inter)National Q&A User Group. Keyword Software is also the UK distributor for the International English edition of Q&A (the "Definitive Edition" CD-ROM) and also John Dow's Utilities. Fax +44-1344-884-111, email alec@KeywordSoftware.com http://www.KeywordSoftware.com.
6560 Rosemoor Street
Pittsburgh, PA 15217-3026
Web site: http://www.johntdow.com
Symantec data recovery
175 West Broadway
Eugene, OR 97401-3003
Web site: http://www.symantec.com/techsupp/recovery/qa/qa_index.html
Q&A User Group
Web site: http://www.qaug.com
Comments? Looking for something? Write us at firstname.lastname@example.org
Want to submit an article to The Quick Answer? Write to email@example.com
Phone - 800-780-5474 or 949-722-9127 / Fax - 949-722-9127
Marble Publications, 1927A Harbor Blvd., Costa Mesa, California 92627
Last modified: December 11, 2006