[ Database tutorial | Text databases ]

Unique identifiers, and generating ID numbers

It is a really good idea to have some sort of unique identifier that goes with every record in your datafile, even if you can't think of any readily apparent reason for one. When you come to delete or edit your records, you need some way to refer to a record so that it cannot possibly get confused with any other record (a unique identifier).

Generally, I give every record in a database an integer ID number, incrementing by one with every record that I add to the database, never reusing the same ID number, even after a record has been removed. Now, most commercial databases have an "autonumber" feature, or something like that, which automatically assigns a new ID number to each record for you, but when you are working with text files, you don't have anything that nice.

While there is more than one way to do this, here's a suggested way that works consistently for me.

I create an additional datafile, called ID, or something similar, and put it in the same directory as my data files. This is just a text file with one line in it. To initialize it, I put a number in it, such as 1. If you want to start your ID numbers somewhere else for some reason, you can put any number in there that you like. Note that the function below increments the number before returning it, so the first ID handed out is one more than the number you put in the file.
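If you would rather not create the ID file by hand, a small setup routine can do it. The following is only a sketch: InitID is a hypothetical helper name (it is not part of the tutorial's code), and it assumes it is run once at setup time, not from several processes at once.

```perl
use strict;
use warnings;

# InitID is a hypothetical helper, not part of the original tutorial.
# It creates the ID file with a starting number, unless the file is
# already there. Returns 1 if it created the file, 0 if it left it alone.
sub InitID {
    my ($file, $start) = @_;
    $start = 1 unless defined $start;

    # Never overwrite an existing counter, or old IDs would be reused
    return 0 if -e $file;

    open(my $fh, '>', $file) or die "Can't create $file: $!";
    print $fh "$start\n";
    close($fh) or die "Can't close $file: $!";
    return 1;
}
```

Calling InitID('employee.id') would leave a one-line file containing 1. Calling it again later changes nothing, so it should be safe to leave in a setup script.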

I then use a function like the following to get a new ID number out of that file for each record.

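use Fcntl qw(:flock); # Provides LOCK_EX and LOCK_UN

sub GetID	{
	my ($file) = @_; # Read in parameters
	my ($id, $lock); # Declare additional local variables

	# Get an exclusive lock on the lock file first, so that no other
	# process can read the ID file until we have written the new number
	$lock = $file . ".lock";
	open (LOCK, '>', $lock) or die "Can't open $lock: $!";
	flock (LOCK, LOCK_EX) or die "Can't lock $lock: $!";

	# Open the ID file and read in the last number used
	open (FILE, '<', $file) or die "Can't read $file: $!";
	($id) = <FILE>;
	close FILE;
	chomp $id if defined $id;

	# Increment the ID number
	$id++;

	# Write the new ID out to the file
	open (FILE, '>', $file) or die "Can't write $file: $!";
	print FILE "$id\n";
	close FILE or die "Can't close $file: $!";

	# Release lock
	flock (LOCK, LOCK_UN);
	close LOCK;

	# Return the ID number
	return $id;
}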
Then, from my program, I can call this function to get the next available ID number for my data.

$id = GetID('employee.id');
Several comments on this (in addition to the ones in the code).
  • I pass in the file name so that if I have several datafiles, I can have an ID file for each one, and still use the same function.
  • Notice that since the ID file is a one line file, the line
    ($id) = <FILE>
    reads the contents of the file into the list ($id), which contains only one element. The parentheses put <FILE> in list context, so it returns every line of the file, and the first (and only) line ends up in $id. I mention this because it sometimes seems strange to some beginning Perl programmers that $id is a scalar, but ($id) is a list of one element.
  • The file locking might seem like overkill, but if two hits ever did arrive at the same moment and both got the same ID number, it would defeat the whole purpose of doing this. See the section on file locking.
  • I opened the ID file twice, which may seem like wasted disk access. Remember that when you open a file for writing with one > character, you are clobbering the file - that is, you are removing the entire contents of the file, and rewriting it from scratch. So the old number has to be read, and the file closed, before the file is reopened for writing.
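
If the double open described in the last comment bothers you, one alternative is to open the ID file once in read/write mode, which does not clobber it, and to lock the ID file itself in place of a separate lock file. This is a sketch of a different technique, not the function above: GetIDSingleOpen is a hypothetical name, and it assumes the ID file already exists and that flock works on your platform.

```perl
use strict;
use warnings;
use Fcntl qw(:flock SEEK_SET);

# GetIDSingleOpen is a hypothetical name, not part of the original tutorial.
# It assumes the ID file already exists.
sub GetIDSingleOpen {
    my ($file) = @_;

    # '+<' opens for reading and writing without clobbering the contents
    open(my $fh, '+<', $file) or die "Can't open $file: $!";
    flock($fh, LOCK_EX)       or die "Can't lock $file: $!";

    # Read the last number used (an empty file counts as 0)
    my $id = <$fh>;
    $id = 0 unless defined $id;
    chomp $id;
    $id++;

    # Rewind, write the new number, and cut off anything left after it
    seek($fh, 0, SEEK_SET)   or die "Can't rewind $file: $!";
    print {$fh} "$id\n";
    truncate($fh, tell($fh)) or die "Can't truncate $file: $!";

    # Closing the handle also releases the lock
    close($fh) or die "Can't close $file: $!";
    return $id;
}
```

The read, the increment and the write all happen while the lock is held, which is what keeps two simultaneous callers from getting the same number. The trade-off is that this version will not work if the ID file is missing, so it has to be created beforehand.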