Artificial Intelligence is getting better – latest news and trends in AI concerning image processing

Artificial intelligence keeps getting better and is becoming part of new, more useful applications. In this blog post we present some of these new and interesting AI apps. Starting with this post, every couple of months we will also show and discuss news and trends in the image processing field, including new papers, research and applications!

And now, let’s start with news from our favorite, NVIDIA. What is NVIDIA up to?

AI can Detect Open Parking Spaces

With as many as 2 billion parking spaces in the United States, finding an open spot in a major city can be complicated. To help city planners and drivers more efficiently manage and find open spaces, MIT researchers developed a deep learning-based system that can automatically detect open spots from a video feed.

“Parking spaces are costly to build, parking payments are difficult to enforce, and drivers waste an excessive amount of time searching for empty lots,” the researchers stated in their paper.

Article from:

New AI Imaging Technique Reconstructs Photos with Realistic Results

Researchers from NVIDIA, led by Guilin Liu, introduced a state-of-the-art deep learning method that can edit images or reconstruct a corrupted image, one that has holes or is missing pixels. The method can also be used to edit images by removing content and filling in the resulting holes. The method, which performs a process called “image inpainting”, could be implemented in photo editing software to remove unwanted content, while filling it with a realistic computer-generated alternative.

“Our model can robustly handle holes of any shape, size, location, or distance from the image borders. Previous deep learning approaches have focused on rectangular regions located around the center of the image, and often rely on expensive post-processing,” the NVIDIA researchers stated in their research paper.

Article from:

AI Can Now Fix Your Grainy Photos by Only Looking at Grainy Photos

What if you could take your photos that were originally taken in low light and automatically remove the noise and artifacts? Have grainy or pixelated images in your photo library and want to fix them? This deep learning-based approach has learned to fix photos by simply looking at examples of corrupted photos only. The work was developed by researchers from NVIDIA, Aalto University, and MIT, and was presented at the International Conference on Machine Learning in Stockholm, Sweden.

Recent deep learning work in the field has focused on training a neural network to restore images by showing example pairs of noisy and clean images. The AI then learns how to make up the difference. This method differs because it only requires two input images with the noise or grain.

Without ever being shown what a noise-free image looks like, this AI can remove artifacts, noise, grain, and automatically enhance your photos.

“It is possible to learn to restore signals without ever observing clean ones, at performance sometimes exceeding training using clean exemplars,” the researchers stated in their paper.

Article from:

AI Model Can Generate Images from Natural Language Descriptions

To potentially improve natural language queries, including the retrieval of images from speech, researchers from IBM and the University of Virginia developed a deep learning model that can generate objects and their attributes from natural language descriptions.

“We show that under minor modifications, the proposed framework can handle the generation of different forms of scene representations, including cartoon-like scenes, object layouts corresponding to real images, and synthetic images,” the researchers stated in their paper.

Article from:

Now, some new research papers from different fields that rely on AI as well as image processing:

Digital image analysis in breast pathology—from image processing techniques to artificial intelligence 


Abstract: Breast cancer is the most common malignant disease in women worldwide. In recent decades, earlier diagnosis and better adjuvant therapy have substantially improved patient outcome. Diagnosis by histopathology has proven to be instrumental to guide breast cancer treatment, but new challenges have emerged as our increasing understanding of cancer over the years has revealed its complex nature. As patient demand for personalized breast cancer therapy grows, we face an urgent need for more precise biomarker assessment and more accurate histopathologic breast cancer diagnosis to make better therapy decisions. The digitization of pathology data has opened the door to faster, more reproducible, and more precise diagnoses through computerized image analysis. Software to assist diagnostic breast pathology through image processing techniques have been around for years. But recent breakthroughs in artificial intelligence (AI) promise to fundamentally change the way we detect and treat breast cancer in the near future. Machine learning, a subfield of AI that applies statistical methods to learn from data, has seen an explosion of interest in recent years because of its ability to recognize patterns in data with less need for human instruction. One technique in particular, known as deep learning, has produced groundbreaking results in many important problems including image classification and speech recognition. In this review, we will cover the use of AI and deep learning in diagnostic breast pathology, and other recent developments in digital image analysis.

Predicting tool life in turning operations using neural networks and image processing


Abstract: A two-step method is presented for the automatic prediction of tool life in turning operations. First, experimental data are collected for three cutting edges under the same constant processing conditions. In these experiments, the parameter of tool wear, VB, is measured with conventional methods and the same parameter is estimated using Neural Wear, a customized software package that combines flank wear image recognition and Artificial Neural Networks (ANNs). Second, an ANN model of tool life is trained with the data collected from the first two cutting edges and the subsequent model is evaluated on two different subsets for the third cutting edge: the first subset is obtained from the direct measurement of tool wear and the second is obtained from the Neural Wear software that estimates tool wear using edge images. Although the complete-automated solution, Neural Wear software for tool wear recognition plus the ANN model of tool life prediction, presented a slightly higher error than the direct measurements, it was within the same range and can meet all industrial requirements. These results confirm that the combination of image recognition software and ANN modelling could potentially be developed into a useful industrial tool for low-cost estimation of tool life in turning operations.

Automatic food detection in egocentric images using artificial intelligence technology 



Objective: To develop an artificial intelligence (AI)-based algorithm which can automatically detect food items from images acquired by an egocentric wearable camera for dietary assessment.

Design: To study human diet and lifestyle, large sets of egocentric images were acquired using a wearable device, called eButton, from free-living individuals. Three thousand nine hundred images containing real-world activities, which formed eButton data set 1, were manually selected from thirty subjects. eButton data set 2 contained 29 515 images acquired from a research participant in a week-long unrestricted recording. They included both food- and non-food-related real-life activities, such as dining at both home and restaurants, cooking, shopping, gardening, housekeeping chores, taking classes, gym exercise, etc. All images in these data sets were classified as food/non-food images based on their tags generated by a convolutional neural network.

Results: A cross data-set test was conducted on eButton data set 1. The overall accuracy of food detection was 91·5 and 86·4 %, respectively, when one-half of data set 1 was used for training and the other half for testing. For eButton data set 2, 74·0 % sensitivity and 87·0 % specificity were obtained if both ‘food’ and ‘drink’ were considered as food images. Alternatively, if only ‘food’ items were considered, the sensitivity and specificity reached 85·0 and 85·8 %, respectively.

Conclusions: The AI technology can automatically detect foods from low-quality, wearable camera-acquired real-world egocentric images with reasonable accuracy, reducing both the burden of data processing and privacy concerns.
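
For readers less familiar with the metrics quoted in this abstract: sensitivity and specificity both follow from the counts of a confusion matrix. Here is a minimal sketch of that arithmetic. The function name and the counts are made up for illustration and are not taken from the paper.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of real food images detected as food
    specificity = tn / (tn + fp)  # share of non-food images detected as non-food
    return sensitivity, specificity


# Illustrative counts only (hypothetical, not the paper's data).
sens, spec = sensitivity_specificity(tp=85, fn=15, tn=858, fp=142)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

A detector can score high on one of the two and low on the other, which is why the abstract reports both.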

Bioinformatics and Image Processing—Detection of Plant Diseases 



This paper gives an idea of how a combination of image processing along with bioinformatics detects deadly diseases in plants and agricultural crops. These kinds of diseases are not recognizable by bare human eyesight. First occurrence of these diseases is microscopic in nature. If plants are affected with such kind of diseases, there is deterioration in the quality of production of the plants. We need to correctly identify the symptoms, treat the diseases, and improve the production quality. Computers can help to make correct decision as well as can support industrialization of the detection work. We present in this paper a technique for image segmentation using HSI algorithm to classify various categories of diseases. This technique can also classify different types of plant diseases as well. GA has always proven itself to be very useful in image segmentation.

And, finally, some news from the public sector and applied algorithms:

China Now has Facial Recognition Based Toilets 

China has integrated facial recognition into toilets across the country. Citizens now need WeChat or a face scan to get toilet paper. A person stands on the yellow recognition spot and brings their face close to the face identification machine. After about three seconds, 90 centimeters of toilet paper comes out. The person can then go in and use the toilet, but only for a limited time, as an alarm will buzz if someone occupies it for too long. Inside the toilet, sensors assess the amount of ammonium and spray a deodorant if required. The two bathrooms were fitted with face scanners in order to be “clean and convenient” and to “reduce toilet paper waste.”

Read more here: 

Apple’s Camera-Toting Watch Band Uses Facial Recognition For Flawless FaceTime Calls 

The U.S. Patent and Trademark Office has granted Apple a patent which indicates that the tech titan wants to widen the feature set of its wearable by integrating a camera system that can automatically crop subject matter, track objects such as the user’s face, and produce angle-adjusted avatars for FaceTime calls. Apple’s U.S. Patent No. 10,129,503, “Image-capturing watch”, describes a software and hardware solution for a camera-equipped Apple Watch that is both handy and feasible. With a camera-equipped Watch, consumers could put aside a heavy handheld device while playing sports, exercising or doing other energetic activities. However, a feasible smartwatch camera solution is hard to accomplish. The camera captures motion data, which the watch processes and maps onto a computer-generated picture that imitates the user’s facial movements and expressions in real time. Alternatively, the source movement data can be used to drive non-human avatars such as Apple’s Memoji and Animoji. It remains unknown whether Apple actually intends to integrate its Apple Watch camera band technology.

Read more here:

Metropolitan Police London is to Integrate Face Recognition Tech 

London’s police will trial face recognition technology for two days. In the Leicester Square, Piccadilly Circus, and Soho areas of London, the technology will examine faces in the crowd and compare them with the database of individuals wanted by the courts and the Metropolitan Police. If the technology finds a match, the police officers on site will analyze it and perform further checks to confirm the identity of that individual.

Read more here:

That’s all for now, folks. But tell me, what do you think: in which areas is AI going to bring the most benefits? In your opinion, which areas have room for more research? Can you actually believe that it is possible to have AI solutions in everyday life?

All news items are quoted from the mentioned sites, where you can find the full text on each topic.

Outliers in data and how to deal with them

Term explanation

In image processing, as an area of signal processing, modelling the data and its expected values is very important in all kinds of applications. The data represent the problem that needs to be addressed. That is why we need to know what kind of data to expect, and which values are the result of measurement errors, faulty data or erroneous procedures, or simply come from areas where a certain theory might not be valid. So, to improve the model and get better results from our applications, we must recognize and deal with outliers in the data.

What is an outlier?

In statistics, an outlier is a data point that differs significantly from other observations. Outliers in the data can be very dangerous, since they change the classical data statistics, such as mean value and variance of the data. This affects the results of an algorithm of any kind (image processing, machine learning, deep learning algorithm…). So, when modeling, it is extremely important to clean the data sample to ensure that the observations best represent the problem.
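
To make this concrete, here is a minimal sketch of how a single faulty value distorts the mean and variance while barely touching the median. The numbers are invented for illustration, and `summarize` is a hypothetical helper name.

```python
import statistics


def summarize(data):
    """Classical statistics (mean, variance) next to the median."""
    return {
        "mean": statistics.mean(data),
        "variance": statistics.variance(data),
        "median": statistics.median(data),
    }


# Made-up measurements around 10, plus one faulty reading.
clean = [10.1, 9.8, 10.0, 10.2, 9.9, 10.0]
with_outlier = clean + [100.0]

for name, data in [("clean", clean), ("with outlier", with_outlier)]:
    print(name, summarize(data))
```

One bad reading out of seven is enough to move the mean far away from where almost all of the data sit.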

How to deal with outliers in the data

The one thing we know about outliers is that they do not fit the model we assumed. We don’t know anything else about them: neither when they will appear nor what values they will take. We just know that we must stop them from messing with our results. But how?

  • The first step in determining the outliers is getting to know the data for the specific application. So, we must have some test dataset and start from there.
  • The next step is to find the data distribution (according to the available dataset), which can sometimes be a tricky task. Let us assume that the data have a normal (Gaussian) distribution.
Normal (Gaussian) distribution
  • Once we are familiar with the distribution of the data, we can identify outliers more easily. There is no precise way to define and identify outliers in general, so we must know how to define them for our specific application.
  • We can now use statistical methods to identify observations that appear to be rare or unlikely given the available data. Outliers can occur by chance in any distribution, but they often indicate either measurement error or that the population has a heavy-tailed distribution.
  • In the former case one wishes to discard them or use statistics that are robust to outliers, while in the latter case they indicate that the distribution has high skewness and that one should be very cautious in using tools or intuitions that assume a normal distribution.
  • In most larger samplings of data, some data points will be further away from the sample mean than what is deemed reasonable. This can be due to incidental systematic error or flaws in the theory that generated an assumed family of probability distributions, or it may be that some observations are far from the center of the data. In large samples, a small number of outliers is to be expected (and not due to any anomalous condition).
  • Now, we can deal with outliers. We can remove them from our dataset if we are dealing with offline applications. On the other hand, if we are dealing with real-time online processing, then we must use some procedures to make our application more robust.
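
The detection step in the list above can be sketched in a few lines. This is only one common choice, not the only way to do it: the modified z-score, which measures distance from the median in units of the median absolute deviation (MAD). The function name is hypothetical, and the 0.6745 factor and the 3.5 threshold are conventional values that assume roughly Gaussian "typical" data.

```python
import statistics


def mad_outliers(data, threshold=3.5):
    """Return the values whose modified z-score exceeds `threshold`."""
    med = statistics.median(data)
    mad = statistics.median([abs(x - med) for x in data])
    if mad == 0:
        # More than half of the values are identical; this sketch
        # does not try to rank the remaining ones.
        return []
    # 0.6745 makes the score comparable to an ordinary z-score
    # when the typical data are Gaussian.
    return [x for x in data if 0.6745 * abs(x - med) / mad > threshold]


# Made-up measurements with one faulty reading.
readings = [10.1, 9.8, 10.0, 10.2, 9.9, 10.0, 100.0]
print(mad_outliers(readings))
```

The median and the MAD are used here instead of the mean and the standard deviation because an outlier inflates the standard deviation and can hide itself, especially in small samples.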


One might think that a simple way to handle outliers is to detect them and remove them from the data set. Deleting an outlier, although better than doing nothing, still poses a number of problems:

  • When is deletion justified? Deletion requires a subjective decision.
  • When is an observation “outlying enough” to be deleted?
  • The user or the author of the data may think that “an observation is an observation” (i.e., observations should speak for themselves) and hence feel uneasy about deleting them.
  • Since there is generally some uncertainty as to whether an observation is really atypical, there is a risk of deleting “good” observations, which results in underestimating data variability.

Since the results depend on the user’s subjective decisions, it is difficult to determine the statistical behavior of the complete procedure.

Robust Statistics

Let’s say something about the normal distribution assumption. The most widely used model formalization, very common in all kinds of engineering problems, is the assumption that the observed data have a normal (Gaussian) distribution. This assumption has been present in statistics as well as engineering for two centuries and has been the framework for all the classical methods in regression, analysis of variance and multivariate analysis. The main justification for assuming a normal distribution is that it gives an approximate representation to many real data sets, and at the same time is theoretically quite convenient because it allows one to derive explicit formulas for optimal statistical methods such as maximum likelihood, likelihood ratio tests, etc. We refer to such methods as classical statistical methods and note that they rely on the assumption that normality holds exactly. The classical statistics are by modern computing standards quite easy to compute. Unfortunately, theoretical and computational convenience does not always deliver an adequate tool for the practice of statistics and data analysis. It often happens in practice that an assumed normal distribution model (e.g., in the standard Kalman filter) holds approximately in that it describes the majority of observations, but some observations follow a different pattern or no pattern at all.

Now, we know that such atypical data are called outliers, and even a single outlier can have a large distorting influence on a classical statistical method that is optimal under the assumption of normality or linearity. The kind of “approximately” normal distribution that gives rise to outliers is one that has a normal shape in the central region but has tails that are heavier or “fatter” than those of a normal distribution. One might naively expect that if such approximate normality holds, then the results of using a normal distribution theory would also hold approximately. This is unfortunately not the case.

The robust approach to statistical modeling and data analysis aims at deriving methods that produce reliable parameter estimates and associated tests and confidence intervals, not only when the data follow a given distribution exactly, but also when this happens only approximately in the sense just described.

Robust methods fit the bulk of the data well: if the data contain no outliers the robust method gives approximately the same results as the classical method, while if a small proportion of outliers are present the robust method gives approximately the same results as the classical method applied to the “typical” data. As a consequence of fitting the bulk of the data well, robust methods provide a very reliable method of detecting outliers, even in high-dimensional multivariate situations.
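
As an illustration of this behaviour, here is a minimal sketch of one classical robust estimator, the Huber M-estimate of location, computed by iteratively reweighted averaging. The function name, the tuning constant k = 1.345, the fixed MAD-based scale and the sample values are assumptions made for this example, not something prescribed by this post.

```python
import statistics


def huber_location(data, k=1.345, tol=1e-9, max_iter=100):
    """Huber M-estimate of location with a fixed MAD-based scale."""
    mu = statistics.median(data)  # robust starting point
    scale = 1.4826 * statistics.median([abs(x - mu) for x in data])
    if scale == 0:
        return mu
    for _ in range(max_iter):
        # Points within k*scale of the estimate keep full weight;
        # points further away are down-weighted in proportion.
        weights = []
        for x in data:
            r = abs(x - mu) / scale
            weights.append(1.0 if r <= k else k / r)
        new_mu = sum(w * x for w, x in zip(weights, data)) / sum(weights)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu


# Made-up measurements, without and with one faulty reading.
typical = [10.1, 9.8, 10.0, 10.2, 9.9, 10.0]
contaminated = typical + [100.0]
print(statistics.mean(typical), huber_location(typical))
print(statistics.mean(contaminated), huber_location(contaminated))
```

On the typical data the two estimates nearly coincide. On the contaminated data the mean is dragged towards the outlier, while the Huber estimate stays close to the bulk of the observations.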

We note that one approach to dealing with outliers is the diagnostic approach. Diagnostics are statistics generally based on classical estimates that aim at giving numerical or graphical clues for the detection of data departures from the assumed model. There is a considerable literature on outlier diagnostics, and a good outlier diagnostic is clearly better than doing nothing. However, these methods present two drawbacks. One is that they are in general not as reliable for detecting outliers as examining departures from a robust fit to the data. The other is that, once suspicious observations have been flagged, the actions to be taken with them remain the analyst’s personal decision, and thus there is no objective way to establish the properties of the result of the overall procedure.

Robust methods have a long history that can be traced back at least to the end of the nineteenth century. But the first great steps forward occurred in the 1960s, and the early 1970s with the fundamental work of John Tukey (1960, 1962), Peter Huber (1964, 1967) and Frank Hampel (1971, 1974). The applicability of the new robust methods proposed by these researchers was made possible by the increased speed and accessibility of computers.

In this post we will not talk about Robust Statistics any more. If you want to find out more, a new post will be published soon, or you can get some information from the references given at the end. This was just a beginning and a warm up for those who need more information about getting started in designing more robust applications.

What can we actually do with Deep Learning in Image Processing

Computers today can do almost everything we imagine with images and videos. Deep Learning, a powerful technique in the area of Artificial Intelligence, can be a very helpful state-of-the-art tool in Image Processing.

Application areas are numerous: from civil and industrial applications such as the mobile phone industry, augmented reality, home automation, gaming, retail and infotainment to some serious surveillance applications. If you have ever wondered what is behind some of the Facebook, Amazon or Pinterest algorithms for image classification and search, it is Deep Learning.

Here are some concrete ideas of what we can actually accomplish with this tool:

Different stages of image and video processing examples

Note: If you are not familiar with the theory behind Deep Learning, read our next text about Deep Learning theory and Convolutional Neural Networks (CNN).

Pre-processing stage

Pre-processing is the stage where we prepare our images for later use (from preparing them for posting on Instagram or displaying on surveillance monitors, to preparing them for use in complex applications such as tracking, video stabilization…).

1. Noise Reduction for Image Restoration:

The process can go like this:

  • Prepare dataset, which consists of clean and corrupted image pairs (used to train a CNN)
  • Given a noisy image, predict clean image (using CNN)
  • Learn how to map corrupted image patches to clean ones, implicitly capturing the characteristic appearance of noise in natural images
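
The dataset preparation step from the list above can be sketched as follows. This is only an illustration under stated assumptions: the noise is synthetic Gaussian noise added to clean images, images are plain nested lists with pixel values in [0, 1], and `make_training_pairs` is a hypothetical helper. The CNN training itself is not shown.

```python
import random


def make_training_pairs(clean_images, sigma=0.1, seed=0):
    """Return (noisy, clean) pairs by adding Gaussian noise to clean images.

    Each image is a list of rows; pixel values are floats in [0, 1].
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    pairs = []
    for clean in clean_images:
        noisy = [
            # Add zero-mean noise, then clip back into the valid range.
            [min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row]
            for row in clean
        ]
        pairs.append((noisy, clean))
    return pairs


# A tiny made-up 2x3 grayscale "image".
image = [[0.2, 0.5, 0.8], [0.1, 0.4, 0.9]]
noisy, clean = make_training_pairs([image])[0]
print(noisy)
```

A network trained on such pairs takes the noisy image as input and the clean image as target. In practice the noise model should match the sensor at hand as closely as possible.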

Here is an example of thermal image denoising:

Noise reduction in thermal image

2. Image dehazing:

  • The presence of haze (dust, smoke, fog, mist, rain, snow…) directly affects the visibility of the scene by reducing contrast and obscuring objects. In severe haze conditions the image can lose most of its visual information.
  • This is a problem that powerful optics cannot solve, so digital image processing is a must.
  • Video dehazing based on the haze imaging model includes estimating the haze transmission map, which then needs to be “subtracted” (removed) from the hazy image.
Removing haze from image
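
The “subtraction” mentioned above can be sketched with the commonly used haze imaging model I = J·t + A·(1 − t), where I is the hazy pixel, J the haze-free pixel, t the transmission and A the airlight. The sketch below only inverts this model for a grayscale image. It assumes the transmission map and the airlight are already known, and estimating them is the hard part that is not shown. The function name and the lower bound `t_min` are assumptions for illustration.

```python
def dehaze(hazy, transmission, airlight, t_min=0.1):
    """Invert I = J*t + A*(1 - t) per pixel for a grayscale image in [0, 1]."""
    restored = []
    for hazy_row, t_row in zip(hazy, transmission):
        row = []
        for i, t in zip(hazy_row, t_row):
            # Bound t from below so that dense haze does not blow up noise.
            j = (i - airlight) / max(t, t_min) + airlight
            row.append(min(1.0, max(0.0, j)))  # clip to the valid range
        restored.append(row)
    return restored


# One made-up pixel: scene value 0.2 seen through t = 0.5 with A = 0.9
# gives a hazy value of 0.2*0.5 + 0.9*0.5 = 0.55.
print(dehaze([[0.55]], [[0.5]], airlight=0.9))
```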

3. Image Enhancement:

  • Improve visibility in night conditions with CNNs
Improving visibility in night conditions
  • Improve visibility in low-contrast conditions with CNNs
Improve contrast in image for better visibility

4. Artefact Reduction

  • An efficient neural network can be used for seamless attenuation of different compression artefacts
  • Reduce JPEG compression artefacts
  • Reduce Twitter compression artefacts

Image processing stage

This is the stage where we actually use the information from images to artificially create content and add meaning to groups of pixels.

5. Boundary Detection

  • Detect object boundaries by using CNNs
Boundary detection

6. Feature learning – context encoders

  • Context Encoders – a CNN trained to generate the contents of an arbitrary image region conditioned on its surroundings
  • Context encoders need to both understand the content of the entire image, as well as produce a plausible hypothesis for the missing part(s).
Retrieval of a missing part of an image

7. Object Detection

Here we will explain the terms object classification, object localization, object detection and image segmentation:

  • Object detection models identify single or multiple relevant objects in a single image
  • Unlike image classification, object detection also provides the location of each object
Classification, localization, detection and segmentation explained

Object detection is a VERY popular topic in the scientific community nowadays, so several datasets have been released for object detection challenges:

     – PASCAL Visual Object Classes (PASCAL VOC) dataset

     – ImageNet

     – Common Objects in COntext (COCO) dataset

Researchers publish results of their algorithms applied to these challenges.

Overview of the scores on the 2007, 2010, 2012 PASCAL VOC and 2015, 2016 COCO datasets, by using different networks
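
Scores on detection benchmarks like these are typically built on how well a predicted bounding box overlaps the ground-truth box, measured as intersection over union (IoU). PASCAL VOC, for instance, counts a detection as correct when the IoU is at least 0.5. Here is a minimal sketch of that overlap measure. The function name and the (x1, y1, x2, y2) box format are assumptions for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap (zero if the boxes do not intersect).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Made-up prediction and ground-truth boxes.
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```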

8. Image Segmentation

  • The real time demo of Image Segmentation with SegNet can be seen here:

  • SegNet is a deep encoder-decoder architecture for multi-class pixelwise segmentation researched and developed by members of the Computer Vision and Robotics Group at the University of Cambridge, UK.

9. Object recognition

Object recognition is explained with the image below:

Detection, action classification, image captioning explained
  • Convolutional Neural Networks (CNN) have become an important tool for object recognition.
  • Convolutional Neural Networks for Visual Recognition example in real time can be found here:

Video processing

Video processing is essentially image processing applied to a set of frames that form a sequence.

Processing can be offline or online. This means the sequence can either be processed in real time, frame by frame (in which case we should monitor the processing time between frames), or be recorded first and processed afterwards, so that there is no need to worry about processing time.

In surveillance and monitoring, real-time video processing is necessary, which limits the complexity of the algorithms used. On the other hand, when we post-process, for example, our recorded videos, the algorithms can be more complex.
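
Monitoring the processing time per frame can be sketched like this. The function name, the default of 25 frames per second and the injectable `clock` parameter are assumptions for illustration. The check simply compares each frame's processing time with the time available between two frames.

```python
import time


def find_late_frames(frames, process, fps=25.0, clock=time.perf_counter):
    """Process frames one by one; return the indices that exceeded 1/fps."""
    budget = 1.0 / fps  # time available per frame, in seconds
    late = []
    for index, frame in enumerate(frames):
        start = clock()
        process(frame)
        if clock() - start > budget:
            late.append(index)
    return late


# Usage with a stand-in "algorithm" that does almost nothing.
print(find_late_frames(range(5), lambda frame: None))
```

In an offline setting the same loop can run without the time check, which is exactly why heavier algorithms are affordable there.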

10. Object Tracking

  • Object tracking has always been a challenging problem in the field of computer vision.
  • Popular challenges such as the Visual Object Tracking (VOT) challenge and the Visual Object Tracking in Thermal Infrared (VOT-TIR) challenge are proof that object tracking remains a demanding, ongoing problem.
  • Deep Learning methods can also be applied in the task of single or multiple object tracking.

11. Digital Video Stabilization

  • Digital video stabilization is the task of removing jitter and shaking from video caused by unwanted camera motion (because the camera is hand-held or the platform on which it is mounted is shaking).
  • It can be performed on surveillance or monitoring cameras, and therefore in real time, or we might want to remove hand shake from our pre-recorded videos (offline stabilization).
  • Deep Learning frameworks enable faster algorithms, which can be applied successfully in real-time processing.
(a) Shaky video sequence from Matlab examples
(b) Stabilized video sequence
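
To give a feeling for what stabilization does, here is a minimal sketch of a classical (not deep learning) approach: accumulate the estimated frame-to-frame shifts into a camera trajectory, smooth that trajectory with a moving average, and shift each frame by the difference. The sketch is one-dimensional for brevity and assumes the per-frame shifts have already been estimated. The function name and the `radius` parameter are hypothetical.

```python
def stabilizing_offsets(frame_shifts, radius=2):
    """Return per-frame corrections from estimated per-frame camera shifts.

    frame_shifts[i] is the shift between frame i-1 and frame i (1-D here).
    """
    # Accumulate the shifts into the camera trajectory.
    trajectory = []
    position = 0.0
    for shift in frame_shifts:
        position += shift
        trajectory.append(position)

    # Smooth the trajectory with a moving average
    # (the window shrinks at both ends of the sequence).
    smoothed = []
    for i in range(len(trajectory)):
        window = trajectory[max(0, i - radius): i + radius + 1]
        smoothed.append(sum(window) / len(window))

    # Moving each frame by this amount puts it on the smooth path.
    return [s - t for s, t in zip(smoothed, trajectory)]


# Made-up shaky shifts: the camera jitters back and forth by one pixel.
print(stabilizing_offsets([1, -1, 1, -1, 1, -1], radius=1))
```

Intentional, steady camera motion is left almost untouched by the moving average, while frame-to-frame jitter is largely cancelled.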

12. Video enhancement

  • Image enhancement techniques, mentioned in the pre-processing section, can be applied to a video sequence in the same way.
  • The only concern is the complexity of the algorithm in the case of real-time processing.

This is all for today’s post, but if you want to study the topic further, here are some scientific papers that cover the mentioned algorithms.