In the first task on neural networks, I investigated the training of two different kinds of network using small data sets containing only the characters A to J. The networks used were an MLP and an RBF network. The data file used for training consists of the first five examples of each of the ten characters A to J, and the data file used for testing consists of the last five examples. All data files used in this task contain low-resolution data.
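The train/test split described above can be sketched as follows, assuming the raw data has been parsed into a mapping from each character to its ten examples (each example being a vector of scan values). All names here are illustrative, not the actual NeuralWorks file format.

```python
def split_examples(examples_by_char):
    """First five examples per character go to training, last five to testing."""
    train, test = [], []
    for char in sorted(examples_by_char):
        examples = examples_by_char[char]
        train += [(char, ex) for ex in examples[:5]]
        test += [(char, ex) for ex in examples[5:10]]
    return train, test

# Dummy data: ten characters, ten examples each, 11 scan values per example.
data = {c: [[i] * 11 for i in range(10)] for c in "ABCDEFGHIJ"}
train_set, test_set = split_examples(data)
print(len(train_set), len(test_set))  # 50 training and 50 testing examples
```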
First, using the software NeuralWorks, I created an MLP network with 11 input neurons, 10 hidden neurons and 10 output neurons. The low-resolution data has only 6 horizontal scans and 5 vertical scans, a total of 11 scans, so I used 11 input neurons. Since there are only 10 classes (A to J), I considered 10 to 20 hidden neurons a reasonable range for classification. The min/max table was selected because the input values were not between 0 and 1. The network was trained for 20,000 iterations before the RMS error converged to a stable value. I ran the network three times using three different numbers of hidden neurons: 10, 15 and 20. The RMS error of each of the three runs was recorded. Next, I evaluated the performance of the network more formally using a scoring system. A recall was made and the output values were recorded in the 'nnr' file. Since most of the output values were below 0.5, I decided to use the 'winner takes all' strategy, that is, the classification was determined by the highest output value. Correctly classified outputs score 1 and misclassifications score 0.
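The 'winner takes all' scoring used above can be sketched as a short function: the predicted class is simply the output neuron with the highest value, regardless of whether any output exceeds 0.5. Function names here are illustrative, not part of NeuralWorks.

```python
def winner_takes_all(outputs):
    """Return the index of the highest output value."""
    return max(range(len(outputs)), key=lambda i: outputs[i])

def score(recall_outputs, true_classes):
    """Score 1 for each correct classification, 0 for each misclassification."""
    correct = sum(1 for out, true in zip(recall_outputs, true_classes)
                  if winner_takes_all(out) == true)
    return correct, len(true_classes)

# Even though every output here is below 0.5, the winner is still class 2:
outputs = [0.1, 0.05, 0.4, 0.2, 0.1, 0.0, 0.1, 0.0, 0.05, 0.0]
print(winner_takes_all(outputs))  # 2
```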
After training the MLP network, the next step was to train an RBF network and compare the performance of the two networks. The same numbers of input and output neurons were used, 11 and 10 respectively, since the same data files were used for training and testing. As for the prototype neurons, I ran the network several times using different numbers of prototype neurons (10, 30, 50 and 70). Although there are only 10 classes, I found that the results improved as the number of prototype neurons was increased. This network also took about 20,000 iterations of training before its RMS error converged. The same scoring method was used to evaluate its performance, and all results were recorded.
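The role of the prototype neurons can be illustrated with a minimal sketch of an RBF forward pass, assuming Gaussian basis functions. In practice the prototype centres, widths and output weights come from training (NeuralWorks handles this internally); all names and values below are illustrative.

```python
import math

def rbf_forward(x, centres, widths, weights):
    """x: input vector; centres: one prototype vector per hidden neuron;
    widths: one Gaussian spread per prototype; weights: [n_outputs][n_prototypes]."""
    # Hidden layer: Gaussian response of each prototype neuron to the input.
    hidden = []
    for c, s in zip(centres, widths):
        d2 = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        hidden.append(math.exp(-d2 / (2 * s ** 2)))
    # Output layer: linear combination of the hidden responses.
    return [sum(w * h for w, h in zip(row, hidden)) for row in weights]

# Tiny example: 2 prototypes, 2 outputs. An input equal to the first
# centre activates that prototype fully (response 1.0), while the far
# prototype responds only weakly.
centres = [[0.0, 0.0], [1.0, 1.0]]
widths = [0.5, 0.5]
weights = [[1.0, 0.0], [0.0, 1.0]]
out = rbf_forward([0.0, 0.0], centres, widths, weights)
```

More prototype neurons give the hidden layer more localized regions with which to cover the input space, which is consistent with the improvement observed above as the prototype count was increased.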