Image processing can be considered one of the trending areas of machine learning, and it is ubiquitous in today's world. Years ago I studied certain aspects of the subject using algorithms that were based purely on mathematics rather than on what we now think of as machine learning. Although the two eventually tell the same story, their approaches are different. Today, after so many years, I had the pleasure of working in the field again as a result of a discussion in my ANN class.

ANN, which stands for Artificial Neural Network, is a learning method that dates back to the 1940s and proved more practical in the late 20th century. ANNs are now among the most commonly used machine learning methods and have been applied to many different types of problems. They have even turned into a separate course at many institutions, one that I happen to be taking this semester.

Today, as some of the older ANNs were being covered in class, I started programming them just for good old fun, since these networks are no longer practical. Within a matter of minutes my code was working remarkably well, classifying 50x50-pixel input images, each containing an English letter, into two classes: similar to X or similar to C. Still, it was just an example of what would have been awe-inspiring maybe 35 years ago!
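To give a feel for the kind of older network mentioned above, here is a minimal sketch of a single-layer perceptron. The post does not say which classic network was actually used, so the perceptron itself, the names train_perceptron and predict, and the tiny 3x3 stand-ins for the letters are all assumptions made for illustration. A real 50x50 image would simply be flattened into a list of 2500 pixel values.

```python
def train_perceptron(samples, epochs=20, learning_rate=1.0):
    """Train on (pixels, target) pairs; pixels is a flat 0/1 list, target is +1 or -1."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for pixels, target in samples:
            activation = bias + sum(w * p for w, p in zip(weights, pixels))
            predicted = 1 if activation >= 0 else -1
            if predicted != target:
                # Classic perceptron rule: nudge the weights toward the target.
                weights = [w + learning_rate * target * p
                           for w, p in zip(weights, pixels)]
                bias += learning_rate * target
                errors += 1
        if errors == 0:  # every sample classified correctly, so stop early
            break
    return weights, bias


def predict(weights, bias, pixels):
    """Return +1 (X-like) or -1 (C-like) for a flat list of 0/1 pixels."""
    activation = bias + sum(w * p for w, p in zip(weights, pixels))
    return 1 if activation >= 0 else -1


# Hypothetical 3x3 stand-ins for the letters, flattened row by row.
x_pattern = [1, 0, 1,
             0, 1, 0,
             1, 0, 1]
c_pattern = [1, 1, 1,
             1, 0, 0,
             1, 1, 1]

weights, bias = train_perceptron([(x_pattern, 1), (c_pattern, -1)])
print(predict(weights, bias, x_pattern))  # 1
print(predict(weights, bias, c_pattern))  # -1
```

With only two training patterns the network memorizes them almost immediately, which is part of why such a toy is fun but not practical.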

All of that aside, the process is really interesting: the input data first has to be modeled, then formed into a data-set and fed to an ANN. Later, once the ANN has successfully created a model, it can be used to recognize new images and patterns.

For instance, the modeling in this case has to be in the following form:

  • Rotating and cropping the input letter image so as to make it recognizable and similar to the others. This can be helpful since a letter such as A can be narrower than O, so it needs to be centered precisely in the picture, matching the placement O follows, to ensure correct learning.
  • The inputs are effectively black and white, so we can convert them to grayscale, turning a 3-dimensional matrix (or, in some cases, a tensor) of color values into a regular matrix whose cells contain brightness levels from 0 to 255.
  • The images will then need to be resized to the exact same size, in this particular case, 50x50.
  • Since the inputs have no depth, they can be thresholded, making each pixel either completely white or completely black. As a result, the matrix can be converted to a binary matrix.
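The last three steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: an image is taken to be a list of rows of (r, g, b) tuples, the grayscale weights are the common BT.601 luma weights, resizing uses nearest-neighbour sampling, and the threshold of 128 is arbitrary. The function names are made up for this sketch, and rotating and centering the letter is left out.

```python
def to_grayscale(rgb_image):
    """Collapse each (r, g, b) pixel into one brightness value from 0 to 255."""
    # BT.601 luma weights: an assumed, commonly used choice.
    return [
        [int(round(0.299 * r + 0.587 * g + 0.114 * b)) for (r, g, b) in row]
        for row in rgb_image
    ]


def resize_nearest(gray, new_height, new_width):
    """Resize a 2-D grayscale image with nearest-neighbour sampling."""
    old_height, old_width = len(gray), len(gray[0])
    return [
        [gray[y * old_height // new_height][x * old_width // new_width]
         for x in range(new_width)]
        for y in range(new_height)
    ]


def binarize(gray, threshold=128):
    """Map every pixel to 1 (bright) or 0 (dark) around an assumed threshold."""
    return [[1 if value >= threshold else 0 for value in row] for row in gray]


def preprocess(rgb_image, size=50, threshold=128):
    """Grayscale, resize to size x size, then binarize."""
    gray = to_grayscale(rgb_image)
    resized = resize_nearest(gray, size, size)
    return binarize(resized, threshold)


# Example: a 100x100 image whose left half is black and right half is white.
black, white = (0, 0, 0), (255, 255, 255)
image = [[black] * 50 + [white] * 50 for _ in range(100)]
binary = preprocess(image)
print(len(binary), len(binary[0]))  # 50 50
```

In practice a library such as Pillow or TensorFlow would handle these steps, but writing them out shows how little is going on underneath.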

Finally, after all the filtering, our input images can be fed to a learning algorithm, in this case an ANN. To do that, each file must be labeled with its class, since this is a classification problem. The program must then shuffle the data and split it into three parts: a fraction used for learning, called the training set; a fraction used to evaluate learning while training is running, called the evaluation set (often also called the validation set); and a final fraction, called the test set, which helps show the overall loss or, in this case, accuracy. On a very small data-set with a very basic ANN, one can achieve an accuracy of 100%, which is rarely a good sign, since it often points toward over-fitting, a known problem in machine learning. However, as mentioned, this was nothing but a sample to help me personally, and it may be able to help others understand image processing at a very basic level.
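The shuffle-and-split step described above might look like the following sketch. The 70/15/15 proportions, the fixed seed, and the names split_dataset and accuracy are assumptions chosen for illustration, not values from my actual program.

```python
import random


def split_dataset(samples, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle labeled samples and split them into three disjoint sets."""
    shuffled = list(samples)               # copy, so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded, so the split is repeatable
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    evaluation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # whatever remains is the test set
    return train, evaluation, test


def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    return sum(p == t for p, t in zip(predictions, labels)) / len(labels)


# Hypothetical data-set: 20 file names, each labeled "X" or "C".
samples = [(f"img_{i}.png", "X" if i % 2 else "C") for i in range(20)]
train, evaluation, test = split_dataset(samples)
print(len(train), len(evaluation), len(test))  # 14 3 3
```

With a test set of only three images, scoring 100% says very little, which is exactly why such a result on a tiny data-set should be read with suspicion.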

Having moved on to Python and TensorFlow, I'll be exploring this and other areas soon, and I'll be sure to write whenever possible.

Stay tuned!