Artificial Intelligence

Tuesday, January 26, 2010

Knowledge-Based System Task Three

In this task, I suggested alternatives for characters that were misidentified in Neural Network Task Three. The table below shows some of the digit characters that were incorrectly identified as letters.

Rules were written so that if any of the letters in the table above is entered in char3 or char4 (to simulate misidentification), the letter is changed to the corresponding digit. The variable mark is set to altered whenever a character has been changed. This fires the rule altered_mark, and the amended registration mark is printed to the console.
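The substitution rule can be sketched in Python roughly as follows. This is not the Flex code from GET_MARK.KSL. The name amend_mark and the letter-to-digit pairs in SUBSTITUTIONS are assumptions made up for illustration, since the original table is not reproduced in this post.

```python
# Hypothetical substitution table: these letter-to-digit pairs are
# illustrative assumptions, not the pairs from the original table.
SUBSTITUTIONS = {"O": "0", "I": "1", "S": "5", "B": "8"}


def amend_mark(mark):
    """Return (amended_mark, altered) for a seven-character mark.

    Only char3 and char4 (indices 2 and 3) are examined, because those
    positions must hold the two-digit age identifier.
    """
    chars = list(mark)
    altered = False
    for i in (2, 3):
        if chars[i] in SUBSTITUTIONS:
            chars[i] = SUBSTITUTIONS[chars[i]]
            altered = True  # plays the role of setting mark to altered
    return "".join(chars), altered
```

Calling amend_mark on a mark whose third character is the letter O would return the mark with a zero in that position and the flag set to True, which corresponds to the altered_mark rule firing.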
Finally, the code for all three tasks was integrated and added to the file GET_MARK.KSL. An overview of how the program works under forward chaining is shown in the flowchart below.

Knowledge-Based System Task Two

In the second task, rules were written to interpret where and when a vehicle was registered, given a valid registration mark. The first two characters of the mark form the local memory tag, which identifies the region and the DVLA office where the vehicle was registered. The first rule, at_England, identifies vehicles registered in England from char1: any valid uppercase letter in char1 other than C or S indicates that the vehicle was registered in England. If C or S is entered in char1, the vehicle was registered in Wales or Scotland respectively, and the following eight rules identify the regions and local offices.

Next, rules were written to interpret the age identifier, that is, the third and fourth characters of the registration mark. The rule mar_aug handles age identifiers from 02 to 49, so char3 must be between 0 and 4. I wrote a function get_val(Char) to convert a digit character into its numeric value. Age identifier 50 has a separate rule, mar_aug_2050, so that a char3 of 5 does not overlap with the age identifiers from 51 to 59. The rules sep_feb0, sep_feb1 and sep_feb2 handle age identifiers 51 to 58, 59, and 60 to 99 respectively.
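As a rough illustration of the interpretation logic, here is a minimal Python sketch. It is not the Flex rules themselves: the function names nation and registration_period are hypothetical, and the year arithmetic assumes the standard DVLA scheme in which identifiers 02 to 50 mean March to August of 2000 plus the identifier, and 51 to 99 mean the September to February period starting in 2000 plus the identifier minus 50.

```python
def get_val(ch):
    """Convert one digit character to its numeric value, like get_val(Char)."""
    if len(ch) != 1 or not ch.isdigit():
        raise ValueError(f"not a digit character: {ch!r}")
    return ord(ch) - ord("0")


def nation(char1):
    """Map char1 of an already validated mark to a nation."""
    if char1 == "C":
        return "Wales"
    if char1 == "S":
        return "Scotland"
    return "England"


def registration_period(char3, char4):
    """Return (period, starting_year) for the two age-identifier characters."""
    age = 10 * get_val(char3) + get_val(char4)
    if 2 <= age <= 50:
        # Covers mar_aug (02 to 49) and mar_aug_2050 (50).
        return ("March-August", 2000 + age)
    if 51 <= age <= 99:
        # Covers sep_feb0, sep_feb1 and sep_feb2 (51 to 99).
        return ("September-February", 2000 + age - 50)
    raise ValueError("age identifiers 00 and 01 are not valid")
```

In Python a single range test covers what the rule base splits across several rules. The split in the original follows from testing char3 and char4 separately.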

Monday, January 25, 2010

Knowledge-Based System Task One

After recognizing characters with neural networks, the next step was to use a knowledge-based system (KBS) to validate vehicle registration marks. In this task, a rule-based program was written to determine whether the set of characters entered constitutes a valid vehicle registration mark. I added my Flex code to the file GET_MARK.KSL. Fifteen rules were written in total. The first five rules check that every letter is entered in uppercase. For example, the rule range_char1 states that if char1 is outside the range A to Z, a message is printed saying that char1 must be an uppercase letter. Every rule that detects an invalid condition sets the variable code to invalid. This causes the last rule in this task, error_code, to fire and print the error message ‘INVALID CODE’. The rule invalid_char1 rejects letters that are not allowed in char1, and the same was done for char2, char5, char6 and char7. char3 and char4 must each be a digit between 0 and 9. The combination of char3 and char4 must not be 00 or 01, because the years 2000 and 2001 do not appear in the age identifier table for the months March to August.
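The checks described above can be sketched in Python as follows. This is only an approximation of the rule base, not the fifteen Flex rules. The name validate_mark, the error messages and the disallowed parameter are hypothetical, and the letters that are not allowed in each position are left for the caller to supply because they are not listed in this post.

```python
import string

LETTER_POSITIONS = (0, 1, 4, 5, 6)  # char1, char2, char5, char6, char7
DIGIT_POSITIONS = (2, 3)            # char3, char4


def validate_mark(mark, disallowed=None):
    """Return a list of error messages; an empty list means the mark is valid.

    disallowed maps a 1-based character position to a string of letters
    that may not appear there (supplied by the caller, an assumption here).
    """
    disallowed = disallowed or {}
    if len(mark) != 7:
        return ["mark must contain exactly seven characters"]
    errors = []
    for i in LETTER_POSITIONS:
        ch = mark[i]
        if ch not in string.ascii_uppercase:
            errors.append(f"char{i + 1} must be an uppercase letter")
        elif ch in disallowed.get(i + 1, ""):
            errors.append(f"char{i + 1} may not be {ch}")
    for i in DIGIT_POSITIONS:
        if mark[i] not in string.digits:
            errors.append(f"char{i + 1} must be a digit")
    if mark[2:4] in ("00", "01"):
        errors.append("age identifier must not be 00 or 01")
    return errors
```

A non-empty result corresponds to the variable code being set to invalid, at which point the original program prints ‘INVALID CODE’.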

Sunday, January 24, 2010

Neural Network Task Four

The fourth task involved using a Hopfield network to recognize digit characters. I created a training file containing only two digit characters, 7 and 8, from the data file IDEALTRN.NNA. These two characters were chosen because the Hamming distance between them is large. The larger the Hamming distance between stored patterns, the more likely the network is to recall them correctly. I also created a testing file containing five examples of each of the characters 7 and 8 from the data file IDEALTST.NNA. A Hopfield network was then created with 48 input neurons, 48 hidden neurons and 48 output neurons, as there are 48 bits in each medium-resolution character. The input and output layers act as buffers, and only the hidden layer does the processing. The network was trained and recalled for one complete pass through the file. The output values were written to the result file and compared with the input values contained in the training file to count the correct outputs. All results were recorded.

I repeated the experiment with another pair of digit characters, 1 and 3, which have a smaller Hamming distance. This pair gave poor results.
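To show why the Hamming distance between stored patterns matters, here is a minimal textbook Hopfield sketch in plain Python. It uses Hebbian weights and asynchronous sign updates, which is an assumption on my part and not necessarily how the NeuralWorks network is implemented. The 8-bit patterns are toy data, not the 48-bit characters from IDEALTRN.NNA.

```python
def hamming(a, b):
    """Number of positions at which two equal-length patterns differ."""
    return sum(x != y for x, y in zip(a, b))


def train_hopfield(patterns):
    """Hebbian weight matrix for bipolar (+1/-1) patterns, zero diagonal."""
    n = len(patterns[0])
    w = [[0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j]
    return w


def recall(w, state, max_sweeps=10):
    """Update neurons one at a time until a full sweep changes nothing."""
    state = list(state)
    n = len(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(n):
            total = sum(w[i][j] * state[j] for j in range(n))
            new = 1 if total >= 0 else -1
            if new != state[i]:
                state[i] = new
                changed = True
        if not changed:
            break
    return state


# Toy stored patterns that differ in 4 of 8 bits.
PATTERN_A = [1, 1, 1, 1, -1, -1, -1, -1]
PATTERN_B = [1, -1, 1, -1, 1, -1, 1, -1]
```

Training on PATTERN_A and PATTERN_B and then recalling a copy of PATTERN_A with its first bit flipped settles back to PATTERN_A. When stored patterns are closer together, their contributions to the weights interfere more, which is consistent with the poorer results seen for 1 and 3.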

Neural Network Task Three

The third task was to test the RBF network on the full character set using medium-resolution data, that is, 0 to 9, A to Z and a space character, 37 characters in total. An RBF network was created with 24 input neurons, 50 prototype neurons and 37 output neurons. The network took longer to train, learning for 80,000 iterations before the RMS error converged. The network was then improved by increasing the number of prototype neurons. Four different numbers of prototype neurons (50, 70, 100 and 180) were tried, and all results were recorded. Because there are more output classes, more prototype neurons are needed for clustering.

Neural Network Task Two

The second task compared networks trained on data of different resolutions. The data set was the same as in the previous task, the characters A to J. In the previous task the data file was low resolution, with a total of 11 scans per character. For this task, I used medium-resolution data (24 scans) and high-resolution data (47 scans). The previous task showed that the RBF network with 50 prototype neurons performed slightly better than the MLP network, so the RBF network was used here. The settings were exactly the same as before, with 50 prototype neurons, except that the number of input neurons was changed to 24 for medium resolution and 47 for high resolution. The network was trained for 20,000 iterations at each resolution before the RMS error converged. The same scoring method was used, and all results were recorded for both medium- and high-resolution data.

Neural Network Task One

In the first neural network task, I investigated training two different kinds of network, an MLP and an RBF network, on small data sets containing only the characters A to J. The training file consists of the first five examples of each of the ten characters A to J, and the testing file consists of the last five examples. All data files used in this task are low-resolution data.

First, using the NeuralWorks software, I created an MLP network with 11 input neurons, 10 hidden neurons and 10 output neurons. Low-resolution data has only 6 horizontal scans and 5 vertical scans, a total of 11 scans, so I used 11 input neurons. Since there were only 10 classes (A to J), I considered 10 to 20 hidden neurons reasonable for classification. The minimum and maximum table was selected because the input values were not between 0 and 1. The network was trained for 20,000 iterations before the RMS error converged to a stable value. I ran the network three times with 10, 15 and 20 hidden neurons and recorded the RMS error for each run. Next, I evaluated the performance of the network more formally using a scoring system. A recall was made and the output values were recorded in the ‘nnr’ file. Since most of the output values were below 0.5, I decided to use the ‘winner takes all’ strategy, in which the classification is determined by the highest output value. An output scores 1 if it is correctly classified and 0 if it is misclassified.
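The input scaling and the scoring method can be sketched in a few lines of Python. The function names are hypothetical, and the scaling shown is ordinary min-max scaling, which I am assuming is what the minimum and maximum table does. The example assumes each recalled output is a list of 10 values and each target is the index of the correct class.

```python
def min_max_scale(values):
    """Linearly rescale a list of numbers into the range 0 to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant input: nothing to spread
    return [(v - lo) / (hi - lo) for v in values]


def winner_takes_all(outputs):
    """Index of the largest output value, used as the predicted class."""
    return max(range(len(outputs)), key=lambda i: outputs[i])


def score(recalled_outputs, target_classes):
    """Score 1 for each correct classification and 0 otherwise."""
    return sum(
        1
        for outputs, target in zip(recalled_outputs, target_classes)
        if winner_takes_all(outputs) == target
    )
```

Because the winner is picked by rank and not by a fixed threshold, an output vector whose values are all below 0.5 still yields a classification.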

After training the MLP network, the next step was to train an RBF network and compare the performance of the two networks. The same numbers of input and output neurons were used, 11 and 10 respectively, since the same data files were used for training and testing. For the prototype layer, I ran the network with different numbers of prototype neurons (10, 30, 50 and 70). Although there are only 10 classes, I found that the results improved as the number of prototype neurons increased. The network also took about 20,000 iterations of training before its RMS error converged. The same scoring method was used to evaluate the performance of the network, and all results were recorded.
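For readers unfamiliar with prototype neurons, the sketch below shows the forward pass of a generic Gaussian RBF network in plain Python. It is an assumption-laden illustration, not the NeuralWorks implementation: the Gaussian kernel, the single shared width and the function names are all my own choices, and training of the prototypes and output weights is left out.

```python
import math


def rbf_activations(x, prototypes, width):
    """Gaussian response of each prototype neuron to the input vector x."""
    activations = []
    for centre in prototypes:
        dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, centre))
        activations.append(math.exp(-dist_sq / (2 * width ** 2)))
    return activations


def rbf_forward(x, prototypes, width, weights):
    """Output layer: one weighted sum of prototype activations per class.

    weights[k][m] is the weight from prototype m to output neuron k.
    """
    hidden = rbf_activations(x, prototypes, width)
    return [sum(w * h for w, h in zip(row, hidden)) for row in weights]
```

Each prototype responds most strongly to inputs near its centre, so adding prototypes lets the network cover more variations of each character, which fits the observation that results improved with more prototype neurons.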