All about machine learning and artificial intelligence applied to image and video processing. Learn and join the discussions about real implementation problems. Let’s overcome all the obstacles together!
We know that our eyes see reflected light, so it is easy for us to understand how an image is formed by visual sensors (daylight and night-vision cameras). But if there is not enough light, it is impossible for us, or for such a camera, to see. This is not the case in the thermal imaging domain. Thermal cameras measure the temperature and emissivity of objects in the scene. In thermal infrared technology, most of the captured radiation is emitted by the observed objects, in contrast to the visual and near-infrared bands, where most of the radiation is reflected. Thus, knowing or assuming material and environmental properties, temperatures can be measured using a thermal camera (i.e., the camera is said to be radiometric). But,
let’s not forget: “Thermal cameras detect more than just heat though; they
detect tiny differences in heat – as small as 0.01°C – and
display them as shades of grey or with different colors.” 
A thermal image is different from a visual camera image and cannot be treated as a grayscale visual image. In thermal infrared there are no shadows, and the noise characteristics are different than in the visual domain. There are also no color patterns as in the visual domain; instead, patterns arise from variations in the material or temperature of objects.
The infrared wavelength band is usually divided into different sub-bands, according to their different properties: near infrared (NIR, wavelengths 0.7–1 µm), shortwave infrared (SWIR, 1–3 µm), midwave infrared (MWIR, 3–5 µm), and longwave infrared (LWIR, 7.5–12 µm). These bands are separated by regions where the atmospheric transmission is very low (i.e., the air is opaque) or where sensor technologies have their limits. LWIR, and sometimes MWIR, is commonly referred to as thermal infrared (TIR). TIR cameras should not be confused with NIR cameras that are dependent on illumination and in general behave in a similar way as visual cameras. Thermal cameras are either cooled or uncooled. Images are typically stored as 16 bits per pixel to allow a large dynamic range. Uncooled cameras give noisier images at a lower frame rate, but are smaller, silent, and less expensive. [2,3]
1. What is the biggest difference between a high-cost and a low-cost thermal camera?
The biggest difference is typically
resolution. The higher the resolution, the better the picture clarity. This
translates to a better picture at a greater distance as well, similar to the
megapixels of a regular digital camera.
2. Can thermal imaging
cameras see through objects?
No. Thermal imaging cameras only detect
heat; they will not “see” through solid objects, clothing, brick walls, etc.
They see the heat coming off the surface of the object.
3. Is there a difference
between night vision and thermal imaging?
Night vision relies on at least a very low level of light (less than the human
eye can detect) in order to amplify it so that it can produce a picture. Night
vision will not work in complete darkness whereas thermal imaging will
because it only “sees” heat.
4. Can rain and heavy fog limit the range of thermal imaging cameras?
Rain and heavy fog can severely limit the range of thermal imaging cameras because infrared radiation is scattered and absorbed by droplets of water.
Applications of thermal vision are numerous, in the civil as well as the military sector, but here we will focus on civil applications that can help in everyday life. So, this technology can be used, for example, to observe and analyze human activities from a distance in a noninvasive manner. Traditional computer vision relies on RGB cameras, but one problem with this sensor is its dependency on light. Thermal cameras operate independently of light and measure the radiated infrared waves representing the temperature of the scene. To showcase the possibilities, both indoor and outdoor applications that use thermal imaging only are presented.
Surveillance: People counting in urban environments
Human movement can be automatically registered and analyzed. For both
real-time and long-term perspectives, this knowledge can be beneficial in
relation to urban planning and for shopkeepers in the city. Information in
real-time can be used for analyzing the current flow and occupancy of the city,
while long-term analysis can reveal trends and patterns related to specific
days, time or events in the city.
Security: Analyzing the use of sports arenas
The interest in analyzing and optimizing the use of public facilities in
cities has a large variety of applications in both indoor and outdoor spaces.
Here, the focus is on sports arenas, but other possible applications could be
libraries, museums, shopping malls, etc. The aim is to estimate the occupancy
of sports arenas in terms of the number of people and their positions in real
time. Potential uses of this information include online booking systems and post-processing of the data for analyzing the general use of the facilities. For
the purpose of analyzing the use of the facilities, we also try to estimate the
type of sport observed based on people’s positions.
In indoor spaces, the temperature is often kept constant and cooler than the human body temperature. Foreground segmentation can therefore be accomplished by automatically thresholding the image. In some cases, unwanted hot objects, such as hot water pipes and heaters, can appear in the scene. In these situations, background subtraction can be utilized.
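To make this concrete, here is a minimal sketch of such a segmentation step, assuming an 8-bit single-channel thermal frame saved to disk; the file name, the Otsu threshold choice and the morphological clean-up are illustrative assumptions, not a prescribed pipeline.

```python
import cv2
import numpy as np

# Load a thermal frame as a single-channel 8-bit image (hypothetical file name).
frame = cv2.imread("thermal_frame.png", cv2.IMREAD_GRAYSCALE)

# People are warmer than the indoor background, so Otsu's method can pick a
# threshold that separates warm foreground from the cooler scene.
_, mask = cv2.threshold(frame, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Optional clean-up: remove small speckles and fill small holes.
kernel = np.ones((5, 5), np.uint8)
mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)

# Count connected warm blobs as a rough person count. If fixed hot objects
# (pipes, heaters) are present, a background-subtraction step would replace
# the simple threshold above.
num_labels, _ = cv2.connectedComponents(mask)
print("Detected warm regions:", num_labels - 1)  # label 0 is the background
```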
Health and safety: Gas leaking location and event alert
Public buildings of interest can be monitored with thermal cameras, so that gas or water leakage can be discovered before a hazardous situation develops.
Localizing a suspected leak in a building can turn out to be delicate, sometimes requiring operations to be stopped, if not probing walls or floors. Whatever the mix of construction materials, thermal imaging can be the right answer: in most cases, a leak translates into an abnormal temperature pattern. Thermal imaging is de facto a non-contact operation, increasing inspector safety, and capable of visualizing fluid leakage as well as electrical dysfunction. Thermal imaging can of course also detect thermal bridges and, as such, is a key tool for generating property investigation reports.
Water leaks can be both hot and cold, and thermal imagers can catch them both. It can sometimes be close to impossible to spot a water leak on your own, especially when it is behind a wall. That is why thermal cameras can help prevent such leaks from going unnoticed until serious damage is done.
Traffic control: Traffic monitoring and specific event alert
For monitoring heterogeneous traffic, thermal imaging can be a valuable camera type, reducing overall system costs and increasing reliability. In contrast to visible and NIR-based detectors, LWIR cameras are not affected by the lighting conditions of the scene, e.g., night vs. day or sun orientation. This remains true over long distances, enabling the detection of a child, a biker, a car, or a truck. Coupled with relevant processing, LWIR cameras turn out to be a key asset of intelligent transportation systems (ITS), reducing the number of cameras while increasing alarm reliability. This helps the manager on duty to quickly take the right decision in case of, e.g., obstacle detection, a wrong-way vehicle, or an abnormal traffic jam, ensuring road users’ safety as well as optimal commuting time.
Energy saving: Building occupancy
Monitoring building occupancy turns out to be highly relevant for the management of commercial complexes or public infrastructure: optimal adjustment of the energy supply, scheduling of maintenance services, as well as the comfort and health of occupants.
It is also useful for sizing security services, and of crucial importance in case of an event requiring building evacuation. Advanced solutions rely on thermal imaging: low-resolution detectors (detecting presence / human activity) and/or a high-resolution thermal camera covering relevant doorways (for people counting / human activity characterization).
This time, our goal was to explain more of the science behind thermal cameras and their applications. If there are any additional questions or anything else you would like to know about this topic, feel free to ask via mail or comments.
Air pollution is caused by solid and liquid particles and certain gases
that are suspended in the air. These particles and gases can come from car and
truck exhaust, factories, dust, pollen, mold spores, volcanoes and wildfires.
The solid and liquid particles suspended in our air are called aerosols.
Certain gases in the atmosphere can cause air pollution. For example, in
cities, a gas called ozone is a major cause of air pollution. Ozone is also a
greenhouse gas that can be both good and bad for our environment. It all
depends where it is in Earth’s atmosphere.
Ozone high up in our atmosphere is a good thing. It helps block harmful
energy from the Sun, called radiation. But, when ozone is closer to the ground,
it can be really bad for our health. Ground level ozone is created when
sunlight reacts with certain chemicals that come from sources of burning fossil
fuels, such as factories or car exhaust.
The major outdoor pollution sources include vehicles, power generation,
building heating systems, agriculture/waste incineration and industry. In
addition, more than 3 billion people worldwide rely on polluting technologies
and fuels (including biomass, coal and kerosene) for household cooking, heating
and lighting, releasing smoke into the home and leaching pollutants outdoors.
Air quality is closely linked to earth’s climate and ecosystems
globally. Many of the drivers of air pollution (i.e. combustion of fossil
fuels) are also sources of high CO2 emissions. Some air pollutants such as
ozone and black carbon are short-lived climate pollutants that greatly
contribute to climate change and affect agricultural productivity. Policies to
reduce air pollution, therefore, offer a “win-win” strategy for both climate
and health, lowering the burden of disease attributable to air pollution, as
well as contributing to the near- and long-term mitigation of climate change.
Air pollution can be significantly reduced by expanding access to clean household fuels and technologies, as well as by prioritizing rapid urban transit, walking and cycling networks, energy-efficient buildings and urban design, improved waste management, and electricity production from renewable power sources.
How does air pollution
affect our health?
Breathing in polluted air can be very bad for our health. Long-term
exposure to air pollution has been associated with diseases of the heart and
lungs, cancers and other health problems. That’s why it’s important for us to monitor air quality.
AI might be used to improve urban sustainability and quality of life. It
is about time that Artificial Intelligence is used for something important for
the whole planet. That is why we will talk about AI solutions that address the problem
of air pollution.
Air pollution – AI solutions
Artificial Intelligence for cleaner air in Smart Cities
In Singapore, where air pollution and related health costs are particularly high, a team of researchers investigated the possibility to combine the power of sensor technologies, Internet of things (IoT) and AI to get reliable and valid environmental data and feed better, greener policy-making. As reported by The Business Times, through the computation of real-time IoT sensor data measuring spatial and temporal pollutants, user-friendly air quality heat maps and executive dashboards can be created, and the most severe pollution hotspots can be determined with the help of machine learning algorithms for predictive modelling. This is the first step to take proactive actions towards further decarbonizing the economy, including incentives for virtuous businesses, the development of wiser land use plans, the revitalization of urban precincts, and more. (https://www.pdxeng.ch/2019/03/28/artificial-intelligence-for-cleaner-air-in-smart-cities/)
An Artificial Intelligence-Based Environment Quality Analysis System
The paper describes an environment quality analysis system based on a combination of some artificial intelligence techniques, artificial neural networks and rule-based expert systems. Two case
studies of the system use are discussed: air
pollution analysis and flood
forecasting with their impact on the environment and on the population
health. The system can be used by an environmental
decision support system in order to manage various environmental critical
situations (such as floods and environmental pollution), and to inform the
population about the state of the environment quality. (An Artificial
Intelligence-Based Environment Quality Analysis System – https://link.springer.com/chapter/10.1007/978-3-642-23957-1_55)
AI non-profit to track air pollution from every power plant in the world and make data public
A nonprofit artificial intelligence firm called WattTime is going to use satellite
imagery to precisely track the air pollution (including carbon emissions) coming out of every single power plant in
the world, in real time. And it’s going to make the data public. This system promises to effectively eliminate
poor monitoring and gaming of emissions data.
The plan is to use data from
satellites that make theirs publicly available, as well as data from a few
private companies that charge for their data. The images will be processed by
various algorithms to detect signs of
emissions. Google.org, Google’s philanthropic wing, is getting the project
off the ground…with a $1.7 million grant. WattTime made a splash earlier this
year with Automated Emissions Reduction.
AER is a program that uses real-time
grid data and machine learning to determine exactly when the grid is producing
the cleanest electricity.
Author: David Roberts, Vox,
Published on: 8 May 2019
A fresher breeze: How AI can help improve air quality
As part of our AI for Earth commitment, Microsoft supports projects from Germany in the areas of environmental protection, biodiversity and sustainability. In the next few weeks, we will introduce the project teams and their innovative ideas that made the leap into our global programme and group of AI for Earth grantees.
AI for Earth
The AI for Earth program helps researchers
and organizations to use artificial intelligence to develop new approaches to
protect water, agriculture, biodiversity and the climate. Over the next
five years, Microsoft will invest $50 million in “AI for Earth.” To become part of the “AI for Earth” program, developers, researchers and organizations can apply with their idea for a so-called “Grant”. If you manage to convince the jury of Microsoft representatives, you’ll receive financial and technological support and also benefit from knowledge transfer and contacts within the global AI for Earth network. As part of Microsoft Berlin’s EarthLab and beyond, five ideas have been convincing and will be part of our “AI for Earth” program in the future in order to further promote their environmental projects.
Artificial Intelligence For Air Quality Control Systems: A Holistic Approach
Recent environmental regulations introduced by the United States
environmental protection agency such as the Mercury Air Toxics Standards and
Hazardous Air Pollution Standards have challenged environmental particulate
control equipment especially the electro-static precipitators to operate beyond
their design specifications. The impact is exacerbated in power plants burning
a wide range of low and high-ranking fossil fuels relying on co-benefits from
upstream processes such as the selective catalytic reactor and boilers. To alleviate
and mitigate the challenge, this manuscript presents the utilization of modern and novel algorithms in machine learning and
artificial intelligence for improving the efficiency and performance of
electrostatic precipitators reflecting a holistic approach by considering
upstream processes as model parameters. In addition, the paper discusses
input relevance algorithms for neural networks and random forests such as
partial derivatives, input perturbation and GINI importance comparing their
performance and applicability for our case study. Our approach comprises applying random forest and neural network algorithms to an electrostatic precipitator, extending the model to include upstream process parameters such as the selective catalytic reactor and the air heaters. To study variable importance differences and model generalization performance between the employed algorithms, we developed a statistical approach to compare the impact of feature data distributions on input relevance.
Artificial intelligence based approach to forecast PM2.5 during haze episodes: A case study of Delhi, India
•Neural network and fuzzy
logic are combined for forecasting of PM2.5 during haze conditions.
•The haze occurs when the
level of PM2.5 is more than 50 μg/m3 and relative humidity is less than 90%.
•The neuro-fuzzy model is capable of better forecasting of haze episodes over the urbanized area than the ANN and MLR models.
Delhi has been listed as the worst performer across the world with
respect to the presence of alarmingly high level of haze episodes, exposing the
residents here to a host of diseases including respiratory disease, chronic
obstructive pulmonary disorder and lung cancer. This study aimed to analyze the haze episodes in a year and
to develop the forecasting methodologies
for it. The air pollutants, e.g.,
CO, O3, NO2, SO2, PM2.5 as well as meteorological parameters (pressure,
temperature, wind speed, wind direction index, relative humidity, visibility,
dew point temperature, etc.) have been used in the present study to analyze the
haze episodes in Delhi urban area. The nature of these episodes, their possible
causes, and their major features are discussed in terms of fine particulate
matter (PM2.5) and relative humidity. The correlation matrix shows that
temperature, pressure, wind speed, O3, and dew point temperature are the
dominating variables for PM2.5 concentrations in Delhi. The hour-by-hour analysis of past data patterns at different monitoring stations suggests that haze hours occurred in approximately 48% of the total observed hours in the year 2012 over the Delhi urban area. The haze hour forecasting models in terms of
PM2.5 concentrations (more than 50 μg/m3) and relative humidity (less than 90%)
have been developed through artificial
intelligence based Neuro-Fuzzy (NF) techniques and compared with the other
modeling techniques e.g., multiple linear regression (MLR), and artificial
neural network (ANN). The haze hour data for nine months, i.e., from January to September, were chosen for training, and the remaining three months, i.e., October to December of the year 2012, were chosen for validation of the developed models. The forecasted results
are compared with the observed values with different statistical measures,
e.g., correlation coefficients (R), normalized mean square error (NMSE),
fractional bias (FB) and index of agreement (IOA). The analysis indicated R values of 0.25 for MLR, 0.53 for ANN and 0.72 for NF between the observed and predicted PM2.5 concentrations during haze hours in the validation period. The results show that the artificial intelligence implementations have a more reasonable agreement with the observed values. Finally, it can be concluded that the artificial intelligence based NF model is capable of better forecasting of haze episodes in the Delhi urban area than the ANN and MLR models.
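As a tiny illustration of the haze-hour definition quoted above (PM2.5 above 50 μg/m3 with relative humidity below 90%), one could flag haze hours in a table of hourly measurements like this; the column names and values are made up for the example and are not from the cited study.

```python
import pandas as pd

# Hypothetical hourly measurements (PM2.5 in µg/m3, relative humidity in %).
data = pd.DataFrame({
    "pm25": [35.0, 62.0, 80.0, 48.0],
    "rh":   [95.0, 70.0, 85.0, 60.0],
})

# Haze hour: high fine particulate matter while humidity stays below fog levels.
data["haze_hour"] = (data["pm25"] > 50.0) & (data["rh"] < 90.0)
print("Fraction of haze hours:", data["haze_hour"].mean())
```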
Artificial intelligence modeling to evaluate field performance of photocatalytic asphalt pavement for ambient air purification
In recent years, the application of titanium dioxide (TiO2) as a
photocatalyst in asphalt pavement has received considerable attention for purifying
ambient air from traffic-emitted pollutants via photocatalytic processes. In
order to control the increasing deterioration of ambient air quality, urgent
and proper risk assessment tools are deemed necessary. However, in practice,
monitoring all process parameters for various operating conditions is difficult
due to the complex and non-linear nature of air pollution-based problems.
Therefore, the development of models to
predict air pollutant concentrations is very useful because it can provide early warnings to the population and
also reduce the number of measuring
sites. This study used artificial
neural network (ANN) and neuro-fuzzy (NF) models to predict NOx
concentration in the air as a function of traffic count (Tr) and climatic
conditions including humidity (H), temperature (T), solar radiation (S), and
wind speed (W) before and after the application of TiO2 on the pavement
surface. These models are useful for modeling because of their ability to be
trained using historical data and because of their capability for modeling
highly non-linear relationships. To build these models, data were collected
from a field study where an aqueous nano-TiO2 solution was sprayed on a 0.2-mile section of asphalt pavement in Baton Rouge, LA. Results of this study showed
that the NF model provided a better fitting to NOx measurements than the ANN
model in the training, validation, and test steps. Results of a parametric
study indicated that traffic level, relative humidity, and solar radiation had
the most influence on photocatalytic efficiency.
Neuro Fuzzy Modeling Scheme for the Prediction of Air Pollution
The techniques of artificial intelligence based on fuzzy logic and neural networks are frequently applied together.
The reasons to combine these two paradigms come out of the difficulties and
inherent limitations of each isolated paradigm. Hybrids of Artificial Neural Networks (ANN) and Fuzzy Inference Systems (FIS) have attracted growing interest from researchers in various scientific and engineering areas due to the growing need for adaptive intelligent systems to solve
real world problems. ANN learns from scratch by adjusting the
interconnections between layers. FIS is a popular computing framework based on
the concept of fuzzy set theory, fuzzy if-then rules, and fuzzy reasoning. The
structure of the model is based on three-layered neural fuzzy architecture with
back propagation learning algorithm. The main objective of this paper is twofold. The first objective is to develop a fuzzy controller scheme for the prediction of changes in NO2 or SO2 over urban zones based on the measurement of NO2 or SO2 over defined industrial sources. The second objective is to develop a neural network (NN) scheme for the prediction of O3 based on NO2 and SO2 concentrations.
Sensing the Air We Breathe — The OpenSense Zurich Dataset
Monitoring and managing
urban air pollution is a significant challenge for the sustainability of our
environment. We quickly survey the air
pollution modeling problem, introduce a
new dataset of mobile air quality measurements in Zurich, and discuss the
challenges of making sense of these data.
This article is good for
getting started and gives a dataset to work with!
Development of artificial intelligence based NO2 forecasting models at Taj Mahal, Agra
The statistical regression
and specific computational intelligence based models are presented in this paper for the forecasting of hourly NO2 concentrations at a historical monument
Taj Mahal, Agra. The model was developed for the purpose of public health oriented air quality
forecasting. Analysis of the last ten years of air pollution data reveals that the concentrations of air pollutants have increased significantly. It is also observed that the pollution levels are always higher during the month of November at and around the Taj Mahal, Agra. Therefore, the hourly observed data during November
were used in the development of air
quality forecasting models for Agra, India. Firstly, multiple linear
regression (MLR) was used for
building an air quality–forecasting model to forecast the NO2 concentrations at
Agra. Further, a novel approach based on regression models and principal component analysis (PCA) was used to find the correlations of different predictor variables between meteorology and air pollutants. Then, the significant variables were taken as the input parameters for a reliable artificial neural network (ANN) multilayer perceptron model for forecasting air pollution in Agra. The MLR and PCA–ANN models were evaluated
through statistical analysis. The correlation coefficients (R) were 0.89 and
0.91 respectively, for PCA–ANN and were 0.69 and 0.89 respectively for MLR in
the training and validation periods. Similarly, the values of normalized mean
square error (NMSE), index of agreement (IOA) and fractional bias (FB) were in
good agreement with the observed values. It was concluded that the PCA–ANN model performs better and can be used for forecasting air pollution at the Taj Mahal, Agra.
A Novel Air Quality Early-Warning System Based on Artificial Intelligence
The problem of air pollution is a persistent issue for mankind that has become increasingly serious in recent years and has drawn worldwide attention. Establishing a scientific and effective air quality early-warning system is significant and important. Regretfully, previous research has not thoroughly explored either air pollutant prediction or air quality evaluation, and relevant research work is still scarce, especially in China. Therefore, a novel air quality
early-warning system composed of prediction
and evaluation was developed in this
study. Firstly, the advanced data preprocessing technology Improved Complete Ensemble Empirical Mode Decomposition with Adaptive
Noise (ICEEMDAN) combined with the powerful swarm intelligence algorithm Whale Optimization Algorithm (WOA) and
the efficient artificial neural network
Extreme Learning Machine (ELM) formed the prediction model. Then the
predictive results were further analyzed by the method of fuzzy comprehensive evaluation, which offered intuitive air quality
information and corresponding measures. The proposed system was tested in the
Jing-Jin-Ji region of China, a representative research area in the world, and
the daily concentration data of six main air pollutants in Beijing, Tianjin,
and Shijiazhuang for two years were used to validate the accuracy and
efficiency. Therefore, the proposed system is believed to play an important role in air pollution control and smart city construction all over the world in the future.
How AI and IoT could help people combat air pollution issues
It is with little surprise that the UN’s 2019 World Environment Day is a call to action to #beatairpollution.
IT, as a sector, influences air quality
in terms of the energy used to drive our electronics, data centers and, indeed,
through business travel. With a large-scale industry presence in Asia, home to
some of the most polluted cities in the world, we need to do what we can to
minimize these impacts.
But technology can also be
part of the solution. Last year, Capgemini
announced a new global ambition to leverage technology to help organizations
with their sustainability challenges, recognizing that this is the biggest
impact we can make. Technology can be an
enabler to help address prevention at source, helping organizations optimize
their operations and reduce their impact. But with 4.2 million deaths every
year as a result of exposure to ambient outdoor air pollution, how can we also
leverage technology to monitor, inform, and ultimately change the behaviors of
those most affected as they head into our many cities?
The advances in technology give us the opportunity to reach people
directly and build a more sophisticated
monitoring and communication network. We could leverage both artificial intelligence (AI) and the internet of things
(IoT) with the capabilities of an increasing range of personal devices, whether it be the 2.5 billion smartphones or the estimated 278 million smartwatches
in the world. Indeed, the wearable health and fitness technology sector is
set to grow 10–20% in the next five years, with an expanding set of
capabilities. These devices measure elements such as heart rate, blood
pressure, and breathing rate, which are indicators of overall health and are
also measurables that change with exposure to air pollutants such as PM,
nitrogen oxide and sulfur oxides. Yet they also monitor spatial and GPS data,
which if combined could demonstrate the impact of the external environment on
health factors, and better inform people of the issues. Data from different
sources and AI technology could allow us to drill down on very local issues.
If we overlay current air quality monitoring data sources onto an
individual, it would allow us to give a very precise prediction of local air
quality issues. We could then integrate
AI, to both refine and include a wider range of factors such as weather
conditions and traffic levels. Added
to this, if automatic number plate recognition (ANPR) is integrated, we could
discern the proportion of vehicle fuel types being used in specific locations.
This is important because diesel vehicles emit 90% of particulate matter.
Data analytics over time would allow people to
understand impacts on their health – and change behavior.
Over time, as an individual’s health and diagnostics data are inputted
into a data analytics model alongside their own spatial data and air pollution
exposure data, they could receive an analysis of how air pollution is impacting
their physiology. Based on this, they could receive tailored suggestions for actions to take as well. The ability to overlay a Google Map of your walk to school or work onto the air quality data around you could, instead of highlighting traffic congestion, show air quality issues and provide options to re-route around them, or suggest alternative times to start a journey.
So, this time we listed some novel AI solutions for addressing the air pollution problem. Next time we talk about this topic, expect to see how we plan to include Smart Imaging and AI in a Smart City solution for cleaner air. Do you have any suggestions?
Artificial intelligence is now a part of new, more useful applications, and it is getting better. In this blog post we will present some of these new and interesting AI apps. And let us just inform you that, starting with this blog post, every couple of months we will show and discuss news and trends in the image processing field, including new papers, research and applications!
And now, let’s start with news from our favorite, NVIDIA. What is NVIDIA up to?
AI can Detect Open Parking Spaces
With as many as 2 billion parking spaces in the United States, finding an open spot in a major city can be complicated. To help city planners and drivers more efficiently manage and find open spaces, MIT researchers developed a deep learning-based system that can automatically detect open spots from a video feed.
“Parking spaces are costly to build, parking payments are difficult to enforce, and drivers waste an excessive amount of time searching for empty lots,” the researchers stated in their paper.
New AI Imaging Technique Reconstructs Photos with Realistic Results
Researchers from NVIDIA, led by Guilin Liu, introduced a state-of-the-art deep learning
method that can edit images or
reconstruct a corrupted image, one
that has holes or is missing pixels. The method can also be used to edit
images by removing content and filling in the resulting holes. The method,
which performs a process called “image inpainting”, could be implemented in
photo editing software to remove unwanted content, while filling it with a
realistic computer-generated alternative.
“Our model can robustly handle holes of any shape, size, location, or distance from the image borders. Previous deep learning approaches have focused on rectangular regions located around the center of the image, and often rely on expensive post-processing,” the NVIDIA researchers stated in their research paper.
AI Can Now Fix Your Grainy Photos by Only Looking at Grainy Photos
What if you could take your photos that were originally taken in low light and automatically remove the noise and artifacts? Have grainy or pixelated images in your photo library and want to fix them? This deep learning-based approach has learned to fix photos by simply looking at examples of corrupted photos only. The work was developed by researchers from NVIDIA, Aalto University, and MIT, and was presented at the International Conference on Machine Learning in Stockholm, Sweden.
Previous deep learning work in the field has focused on training a neural network to
restore images by showing example pairs of noisy and clean images. The AI then
learns how to make up the difference. This method differs because it only
requires two input images with the noise or grain.
Without ever being shown what a noise-free image looks like,
this AI can remove artifacts, noise, grain, and automatically enhance your photos.
“It is possible to learn to restore signals without ever observing clean ones, at performance sometimes exceeding training using clean exemplars,” the researchers stated in their paper.
AI Model Can Generate Images from Natural Language Descriptions
To potentially improve natural language queries, including the retrieval of images from speech, Researchers from IBM and the University of Virginia developed a deep learning model that can generate objects and their attributes from natural language descriptions.
“We show that under minor modifications, the proposed framework can handle the
generation of different forms of scene representations, including cartoon-like
scenes, object layouts corresponding to real images, and synthetic images,”
the researchers stated in their paper.
Abstract: Breast cancer is the most common malignant disease in women worldwide. In recent decades, earlier diagnosis and better adjuvant therapy have substantially improved patient outcome. Diagnosis by histopathology has proven to be instrumental to guide breast cancer treatment, but new challenges have emerged as our increasing understanding of cancer over the years has revealed its complex nature. As patient demand for personalized breast cancer therapy grows, we face an urgent need for more precise biomarker assessment and more accurate histopathologic breast cancer diagnosis to make better therapy decisions. The digitization of pathology data has opened the door to faster, more reproducible, and more precise diagnoses through computerized image analysis. Software to assist diagnostic breast pathology through image processing techniques have been around for years. But recent breakthroughs in artificial intelligence (AI) promise to fundamentally change the way we detect and treat breast cancer in the near future. Machine learning, a subfield of AI that applies statistical methods to learn from data, has seen an explosion of interest in recent years because of its ability to recognize patterns in data with less need for human instruction. One technique in particular, known as deep learning, has produced groundbreaking results in many important problems including image classification and speech recognition. In this review, we will cover the use of AI and deep learning in diagnostic breast pathology, and other recent developments in digital image analysis.
Predicting tool life in turning operations using neural networks and image processing
Abstract: A two-step method is presented for the automatic prediction of tool life in turning operations. First, experimental data are collected for three cutting edges under the same constant processing conditions. In these experiments, the parameter of tool wear, VB, is measured with conventional methods and the same parameter is estimated using Neural Wear, a customized software package that combines flank wear image recognition and Artificial Neural Networks (ANNs). Second, an ANN model of tool life is trained with the data collected from the first two cutting edges and the subsequent model is evaluated on two different subsets for the third cutting edge: the first subset is obtained from the direct measurement of tool wear and the second is obtained from the Neural Wear software that estimates tool wear using edge images. Although the complete-automated solution, Neural Wear software for tool wear recognition plus the ANN model of tool life prediction, presented a slightly higher error than the direct measurements, it was within the same range and can meet all industrial requirements. These results confirm that the combination of image recognition software and ANN modelling could potentially be developed into a useful industrial tool for low-cost estimation of tool life in turning operations.
Automatic food detection in egocentric images using artificial intelligence technology
Objective:To develop an artificial intelligence (AI)-based algorithm which can automatically detect food items from images acquired by an egocentric wearable camera for dietary assessment.
Design:To study human diet and lifestyle, large sets of egocentric images were acquired using a wearable device, called eButton, from free-living individuals. Three thousand nine hundred images containing real-world activities, which formed eButton data set 1, were manually selected from thirty subjects. eButton data set 2 contained 29 515 images acquired from a research participant in a week-long unrestricted recording. They included both food- and non-food-related real-life activities, such as dining at both home and restaurants, cooking, shopping, gardening, housekeeping chores, taking classes, gym exercise, etc. All images in these data sets were classified as food/non-food images based on their tags generated by a convolutional neural network.
Results: A cross data-set test was conducted on eButton data set 1. The overall accuracy of food detection was 91.5 and 86.4%, respectively, when one-half of data set 1 was used for training and the other half for testing. For eButton data set 2, 74.0% sensitivity and 87.0% specificity were obtained if both ‘food’ and ‘drink’ were considered as food images. Alternatively, if only ‘food’ items were considered, the sensitivity and specificity reached 85.0 and 85.8%, respectively.
Conclusions: The AI technology can automatically detect foods from low-quality, wearable camera-acquired real-world egocentric images with reasonable accuracy, reducing both the burden of data processing and privacy concerns.
Bioinformatics and Image Processing—Detection of Plant Diseases
This paper gives an idea of how a combination of image processing along with bioinformatics detects deadly diseases in plants and agricultural crops. These kinds of diseases are not recognizable by bare human eyesight. First occurrence of these diseases is microscopic in nature. If plants are affected with such kind of diseases, there is deterioration in the quality of production of the plants. We need to correctly identify the symptoms, treat the diseases, and improve the production quality. Computers can help to make correct decision as well as can support industrialization of the detection work. We present in this paper a technique for image segmentation using HSI algorithm to classify various categories of diseases. This technique can also classify different types of plant diseases as well. GA has always proven itself to be very useful in image segmentation.
And, at the end, some news from public sector and applied algorithms:
China Now has Facial Recognition Based Toilets
China has integrated facial recognition into toilets across the country. Citizens now need WeChat or face scans to get toilet paper. People stand in the yellow recognition spot and bring their face near the face identification machine. Then, after about three seconds, 90 centimeters of toilet paper comes out. People then go in and use the toilet, but only for a limited time, as an alarm will buzz if someone occupies it for too long. In the toilet, sensors assess the ammonium level and spray a deodorant if required. The two bathrooms were integrated with face scanners for being “clean and convenient,” and for “reducing toilet paper waste.”
Apple’s Camera-Toting Watch Band Uses Facial Recognition For Flawless FaceTime Calls
The U.S. Patent and Trademark Office granted a patent to Apple which suggests that the tech titan wants to widen the set of capabilities of its wearable by integrating an original camera system with the ability to automatically crop subject matter, trace objects such as the user’s face, and produce angle-adjusted avatars for FaceTime calls. Apple’s “Image-capturing watch,” U.S. Patent No. 10,129,503, describes a software and hardware solution that creates a camera-toting Apple Watch that is both handy and feasible. Using a camera-toting Watch, consumers can put aside a heavy handheld
device while playing sports, exercising or doing other energetic activities.
However, a feasible smartwatch solution is hard to accomplish. The camera
captures the motion data and then the watch processes it, after which it is
mapped onto the computer produced picture, which imitates a consumer’s facial
movements and expressions in real time. On the other hand, the source movement data can be utilized to drive the motion of non-human avatars such as Apple’s Memoji and Animoji. It still remains unknown whether Apple actually intends to integrate its Apple Watch camera band tech into a shipping product.
Metropolitan Police London is to Integrate Face Recognition Tech
London’s police will integrate face recognition tech as an experiment for two days. In the areas of Leicester Square, Piccadilly Circus, and Soho in London, the technology will examine crowds’ faces and compare them with the database of individuals wanted by the courts and the Metropolitan Police in London. If the tech finds a match, police officers in the field will analyze it and perform further checks to confirm the identity of that individual.
That’s all for now, folks. But tell me, what do you think: in which areas is AI going to bring the most benefits? In which areas, in your opinion, is there space for more research? Can you believe that it is possible to have AI solutions in everyday life?
All news items are cited from the mentioned sites, where you can find the full text on each topic.
In image processing, as an area of signal processing, modelling the data and the expected values is very important in all kinds of applications. The data represent the problem that needs to be addressed. That is why it is necessary to know what kind of data to expect, which values are the result of measurement errors, faulty data or erroneous procedures, and in which areas a certain theory might not be valid. So, to improve the model and get better results from our applications, we must recognize and deal with outliers in the data.
In statistics, an outlier is a
data point that differs significantly from other observations. Outliers in the data can be very dangerous, since they change the classical data statistics,
such as mean value and variance of the
data. This affects the results
of an algorithm of any kind (image processing, machine learning, deep learning algorithm…).
So, when modeling, it is extremely important to clean the data sample to ensure
that the observations best represent the problem.
How to deal with outliers in the data
The thing we know about outliers is that they do not fit the model we assumed, but we don’t know anything else about them: when they will appear or what values they will have. We just know that we must stop them from messing with our results. But how?
The first step in determining the outliers is getting to know the data for the specific application. So, we must have some test dataset and start from there.
The next step is to find the data distribution (according to the available dataset), which can sometimes be a tricky task. Let us assume that the data have a normal (Gaussian) distribution.
When we are familiar with the distribution of the data, we can identify outliers more easily. There is no precise way to define and identify outliers in general, but we must know how to define them for our specific application.
We can now use statistical methods to identify observations that appear to be rare or unlikely given the available data. Outliers can occur by chance in any distribution, but they often indicate either measurement error or that the population has a heavy-tailed distribution.
In the former case one wishes to discard them or use statistics that are robust to outliers, while in the latter case they indicate that the distribution has high skewness and that one should be very cautious in using tools or intuitions that assume a normal distribution.
In most larger samplings of data, some data points will be further away from the sample mean than what is deemed reasonable. This can be due to incidental systematic error or flaws in the theory that generated an assumed family of probability distributions, or it may be that some observations are far from the center of the data. In large samples, a small number of outliers is to be expected (and not due to any anomalous condition).
Now, we can deal with outliers. We can remove them from our dataset if we are dealing with offline applications. But, on the other hand, if we are dealing with real-time online processing, then we must use some procedures in order to make our application more robust.
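For example, under an assumed Gaussian model, a simple (classical) way to flag and drop outliers is the three-sigma rule; the synthetic data below are only for illustration.

```python
import numpy as np

# Synthetic Gaussian data with a few gross errors injected.
rng = np.random.default_rng(0)
data = rng.normal(loc=10.0, scale=2.0, size=500)
data[:5] = [40.0, -25.0, 55.0, 38.0, -30.0]

# Classical statistics (note: these are themselves distorted by the outliers).
mean, std = data.mean(), data.std()
z_scores = (data - mean) / std
outliers = np.abs(z_scores) > 3.0   # three-sigma rule

print("Flagged", outliers.sum(), "of", data.size, "samples as outliers")
clean = data[~outliers]             # offline case: simply drop them
```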
One might think that a simple way to handle outliers is to detect them and remove them from the data set.
Deleting an outlier, although better than doing nothing, still poses a number of problems:
When is deletion justified?
Deletion requires a subjective decision.
When is an observation
“outlying enough” to be deleted?
The user or the author of
the data may think that “an observation is an observation” (i.e., observations
should speak for themselves) and hence feel uneasy about deleting them.
Since there is generally some uncertainty as to whether an observation is really atypical, there is a risk of deleting “good” observations, which results in underestimating the data variability.
Since the results depend on the user’s subjective decisions, it is difficult to determine the statistical behavior of the complete procedure.
Let’s say something about the normal distribution assumption. It is very common to assume a Gaussian distribution in different kinds of engineering problems. The most widely used model
formalization is the assumption that the observed data have a normal (Gaussian) distribution. This
assumption has been present in statistics as well as engineering for two
centuries and has been the framework for all the classical methods in
regression, analysis of variance and multivariate analysis. The main justification for assuming a
normal distribution is that it gives an approximate
representation to many real data sets, and at the same time is theoretically quite convenient because
it allows one to derive explicit formulas for optimal statistical methods such
as maximum likelihood, likelihood ratio tests, etc. We refer to such methods as
classical statistical methods and
note that they rely on the assumption that normality holds exactly. The classical statistics are by modern
computing standards quite easy to
compute. Unfortunately, theoretical and computational convenience does not always
deliver an adequate tool for the practice of statistics and data analysis. It often happens in practice that an
assumed normal distribution model (e.g., Standard
Kalman filter) holds approximately in
that it describes the majority of observations, but some observations
follow a different pattern or no pattern at all.
Now, we know that such atypical
data are called outliers, and even a single outlier can have a large
distorting influence on a classical statistical method that is optimal
under the assumption of normality or linearity. The kind of “approximately”
normal distribution that gives rise to outliers is one that has a normal shape
in the central region but has tails that are heavier or “fatter” than those of
a normal distribution. One might naively
expect that if such approximate normality holds, then the results of using a
normal distribution theory would also hold approximately. This is unfortunately
not the case.
The robust approach to statistical modeling and data analysis aims
at deriving methods that produce reliable parameter estimates and associated
tests and confidence intervals, not only when the data follow a given
distribution exactly, but also when this happens only approximately in the
sense just described.
Robust methods fit the bulk
of the data well: if the data contain no outliers the robust method gives
approximately the same results as the classical method, while if a small
proportion of outliers are present the robust method gives approximately the
same results as the classical method applied to the “typical” data. As a consequence of fitting the bulk of the
data well, robust methods provide a very reliable method of detecting outliers,
even in high-dimensional multivariate situations.
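As a small illustration of the robust idea, one can replace the mean and standard deviation with the median and the median absolute deviation (MAD), so the estimates themselves are not pulled around by the outliers; this is only a sketch, not a full robust fitting procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=10.0, scale=2.0, size=500)
data[:5] = [40.0, -25.0, 55.0, 38.0, -30.0]   # same gross errors as before

# Robust location and scale estimates.
median = np.median(data)
mad = np.median(np.abs(data - median))
robust_scale = 1.4826 * mad   # consistent with sigma for Gaussian data

# Outlier scores based on the robust fit to the bulk of the data.
robust_scores = np.abs(data - median) / robust_scale
outliers = robust_scores > 3.0
print("Robustly flagged", outliers.sum(), "outliers")
```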
We note that one approach to dealing with outliers is the diagnostic approach. Diagnostics are
statistics generally based on classical estimates that aim at giving numerical
or graphical clues for the detection of data departures from the assumed model.
There is a considerable literature on outlier diagnostics, and a good outlier diagnostic
is clearly better than doing nothing. However, these methods present two drawbacks. One is that they are in
general not as reliable for detecting outliers as examining departures from a
robust fit to the data. The other is that, once suspicious observations
have been flagged, the actions to be taken with them remain the analyst’s personal
decision, and thus there is no objective
way to establish the properties of the result of the overall procedure.
Robust methods have a long
history that can be traced back at
least to the end of the nineteenth century. But the first great steps forward
occurred in the 1960s and early 1970s, with the fundamental work of John Tukey (1960, 1962), Peter Huber (1964, 1967) and Frank Hampel (1971, 1974). The
applicability of the new robust methods proposed by these researchers was made possible
by the increased speed and accessibility of computers.
In this post we will not talk about Robust Statistics any more. If you want to find out more, a new post will be published soon, or you can get some information from the references given at the end. This was just a beginning and a warm up for those who need more information about getting started in designing more robust applications.
If you are a beginner in the area of image and video processing, you may often hear the term real-time processing. In this post, we will try to explain the term and list some typical concerns related to it.
Real-time image processing is related to the typical frame rate. The current standard for capture is typically 30 frames per second. Real-time processing would require processing all the frames as soon as they are captured. So, broadly speaking, if the capture rate is 30 FPS, then 30 frames need to be processed in one second. That comes to around 33 milliseconds per frame (1000 ms / 30 frames = 33 ms/frame). A similar calculation can be done for any frame rate to get the required processing time per frame.
In image and video processing, the source of our signal is a camera. So, what real time image processing really means is: produce output simultaneously with the input. What is actually meant is that the algorithm will run at the rate of the source (e.g. a camera) supplying the images, so the algorithm can process images at the frame rate of the camera.
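A minimal sketch of this budget check might look as follows, assuming a 30 FPS source and using a simple blur as a stand-in for the actual processing step; the camera index, the filter size and the 100-frame test run are illustrative choices.

```python
import time
import cv2

FPS = 30
budget_ms = 1000.0 / FPS   # ~33 ms available per frame at 30 FPS

cap = cv2.VideoCapture(0)  # hypothetical camera index
for _ in range(100):       # check the budget over a short run
    ok, frame = cap.read()
    if not ok:
        break
    start = time.perf_counter()
    processed = cv2.GaussianBlur(frame, (9, 9), 0)   # placeholder processing step
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    if elapsed_ms > budget_ms:
        print(f"Frame took {elapsed_ms:.1f} ms, over the {budget_ms:.1f} ms budget")
cap.release()
```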
Just out of curiosity, let’s see how human vision works:
The first thing to understand is that we perceive different aspects of vision differently. Detecting motion is not the same as detecting light.
Another thing is that different parts of the eye perform differently. The center of vision is good at different things than the periphery. And another thing is that there are natural, physical limits to what we can perceive. It takes time for the light that passes through your cornea to become information on which your brain can act, and our brains can only process that information at a certain speed.
Another important concept:
the whole of what we perceive is greater than what any one
element of our visual system can achieve. This point is fundamental to
understanding our perception of vision.
The temporal sensitivity and resolution of human vision varies depending on the type and characteristics of visual stimulus, and it differs between individuals. The human visual system can process 10 to 12 images per second and perceive them individually, while higher rates are perceived as motion. Modulated light (such as a computer display) is perceived as stable by the majority of participants in studies when the rate is higher than 50 Hz through 90 Hz. This perception of modulated light as steady is known as the flicker fusion threshold. However, when the modulated light is non-uniform and contains an image, the flicker fusion threshold can be much higher, in the hundreds of hertz. Regarding image recognition, people have been found to recognize a specific image in an unbroken series of different images, each of which lasts as little as 13 milliseconds. Persistence of vision sometimes accounts for very short single-millisecond visual stimulus having a perceived duration of between 100 ms and 400 ms. Multiple stimuli that are very short are sometimes perceived as a single stimulus, such as a 10 ms green flash of light immediately followed by a 10 ms red flash of light perceived as a single yellow flash of light.
The real-time aspect is critical in many real-world devices or products such as mobile phones, digital
still/video/cell-phone cameras, portable media players, personal digital
assistants, high-definition television, video surveillance systems, industrial
visual inspection systems, medical imaging devices, vision-assisted intelligent
robots, spectral imaging systems, and many other embedded image or video
With the increasing capabilities of imaging systems like cameras with very high-density captures having 16 or more megapixels, it is extremelydifficult to get real time performance for many applications.
What applications need real time performance and
what applications do not:
When talking about the numerous applications of image and video processing, we can say that some applications in some systems need real time processing, and some don’t. That is why we will talk about online (real time) and offline processing.
Offline processing is the processing of an already recorded video sequence or image. So, digital video stabilization, video enhancement, video coloring, or any other application can work with already prepared video.
These applications can be found in marketing, industry, medical imaging, film
industry or in some ordinary commercial applications, such as a user that wants
to stabilize and enhance some video from the phone library.
Offline processing enables the use of more complex and computationally demanding algorithms, and therefore usually gives better results than real-time processing. That is why offline processing
tools are used a lot in academic research and in some kinds of challenges.
Some of Deep Learning tools for offline processing (on CPU) are:
On the other hand, some applications have a demand for real time processing. For example, traffic monitoring,
target tracking in military applications, surveillance and monitoring, real
time video games, etc. are applications that demand real-time feedback and a processed image from the sensor.
The algorithms that work in real time do not have the luxury of high complexity, since the processing time for each frame is determined by the source frame rate and resolution. New hardware solutions nowadays offer better processing speeds, but there are still limitations, depending on the specific application.
Systems with multiple complex applications
working in parallel:
Sometimes the application demands multiple
complex algorithms working in parallel. That is the time when not only the
complexity of the algorithms is considered, but also which algorithm will be processed
first and how this affects the desired performance of the application. One good
example is when video enhancement and
digital video stabilization algorithm work in parallel.
Video stabilization and video dehazing algorithms in the same video processing pipeline can affect the results of each other. This interesting topic is described in a paper [Dehazing Algorithms Influence on Video Stabilization Performance] given in references at the end of the post. When there is no severe haze, noise or low contrast in the scene, it is important to perform video stabilization algorithm prior to video dehazing algorithm. On the other hand, when the feature level in the scene is low, which happens because of severe haze or low contrast in image, the stabilization algorithm cannot perform well, since it cannot calculate global motion accurately. That is why, for the sake of the better stabilization performance, the proposed pipeline performs video dehazing algorithm prior to video stabilization.
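The ordering rule described above could be sketched like this, assuming hypothetical dehaze() and stabilize() functions and an arbitrary feature-count threshold; this is only an illustration of the idea, not the exact method from the cited paper.

```python
import cv2

def enough_features(gray, min_features=200):
    # Count trackable corners; heavy haze or low contrast leaves few of them.
    corners = cv2.goodFeaturesToTrack(gray, maxCorners=500,
                                      qualityLevel=0.01, minDistance=7)
    return corners is not None and len(corners) >= min_features

def process_frame(frame, dehaze, stabilize):
    # dehaze() and stabilize() are hypothetical placeholders for the two algorithms.
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    if enough_features(gray):
        return dehaze(stabilize(frame))   # normal conditions: stabilize first
    return stabilize(dehaze(frame))       # severe haze / low contrast: dehaze first
```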
At the end, we will mention
some of the possibilities for real time
image processing platforms:
FPGA – very good for complex parallel operations; an example application is given in the paper [High-performance electronic image stabilization for shift and rotation correction] listed in the references.
Nvidia Jetson TX1, TX2, Xavier –
“Get real-time Artificial Intelligence (AI) performance where you need it most with the high-performance, low-power NVIDIA Jetson AGX systems. Processing of complex data can now be done on-board edge devices. This means you can count on fast, accurate inference in everything from robots and drones to enterprise collaboration devices and intelligent cameras. Bringing AI to the edge unlocks huge potential for devices in network-constrained environments.” – from Nvidia site, given in references.
Computers today can do almost everything we imagine with images and videos. Deep Learning, as a powerful tool in the area of Artificial Intelligence, can be a very helpful, state-of-the-art tool in Image Processing.
Application areas are numerous: from civil and industrial applications such as the mobile phone industry, augmented reality, home automation, gaming, retail and infotainment to serious surveillance applications. If you ever wondered what is behind some of Facebook’s, Amazon’s or Pinterest’s algorithms for image classification and search, it is Deep Learning.
Here are some concrete ideas of what we can actually accomplish with this tool:
Note: If you are not familiar with the theory behind Deep Learning, read our next text about Deep Learning theory and Convolutional Neural Networks (CNN).
Pre-processing is the stage where we prepare our images for some future application (from preparation to post on Instagram or to show on surveillance monitors, to preparation for use in more complex applications, such as tracking and other video processing tasks).
1. Noise Reduction for Image Restoration:
The process can go like this:
Prepare a dataset, which consists of clean and corrupted image pairs (used to train a CNN)
Given a noisy image, predict the clean image (using the CNN)
Learn how to map corrupted image patches to clean ones, implicitly capturing the characteristic appearance of noise in natural images
Here is an example of denoising applied to a thermal image; a minimal training sketch of the clean/noisy pairing idea follows below.
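The sketch below uses synthetic patches and a tiny network purely for illustration; it is not a recommended architecture, only the clean/corrupted pairing idea from the list above.

```python
import numpy as np
from tensorflow.keras import layers, models

# Synthetic "clean" patches and their noisy counterparts (stand-ins for a real dataset).
clean = np.random.rand(256, 32, 32, 1).astype("float32")
noisy = np.clip(clean + np.random.normal(0, 0.1, clean.shape), 0, 1).astype("float32")

# A tiny fully convolutional network that maps a noisy patch to a clean one.
model = models.Sequential([
    layers.Conv2D(32, 3, padding="same", activation="relu", input_shape=(32, 32, 1)),
    layers.Conv2D(32, 3, padding="same", activation="relu"),
    layers.Conv2D(1, 3, padding="same"),   # predict the clean patch
])
model.compile(optimizer="adam", loss="mse")
model.fit(noisy, clean, epochs=2, batch_size=32)   # toy training run

# Given a noisy image (patch), predict the clean one.
denoised = model.predict(noisy[:1])
```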
2. Image dehazing:
The presence of haze (dust, smoke, fog, mist, rain, snow…) directly influences the visibility of the scene by reducing contrast and obscuring objects. In severe haze conditions, the image can lose most of its visual information.
This is a problem where powerful optics cannot provide a solution, and digital image processing is a must.
Video dehazing based on the haze imaging model includes estimation of the haze transmission map, which needs to be “subtracted” or removed from the hazy image.
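One common way to estimate such a transmission map is the dark channel prior; the sketch below is a simplified illustration of that idea (the window size and constants are arbitrary choices), not the specific algorithm discussed in this post.

```python
import cv2
import numpy as np

def dehaze(img, patch=15, omega=0.95, t_min=0.1):
    img = img.astype(np.float32) / 255.0
    # Dark channel: per-pixel minimum over color channels and a local window.
    dark = cv2.erode(img.min(axis=2), np.ones((patch, patch), np.uint8))
    # Atmospheric light: mean color of the brightest dark-channel pixels.
    flat = dark.ravel()
    idx = np.argsort(flat)[-max(1, flat.size // 1000):]
    A = img.reshape(-1, 3)[idx].mean(axis=0)
    # Transmission map estimated from the normalized dark channel.
    norm = (img / A).min(axis=2)
    t = 1.0 - omega * cv2.erode(norm, np.ones((patch, patch), np.uint8))
    t = np.clip(t, t_min, 1.0)[..., None]
    # Recover the scene radiance by "removing" the haze.
    J = (img - A) / t + A
    return np.clip(J * 255, 0, 255).astype(np.uint8)

# Hypothetical usage on a single hazy frame.
result = dehaze(cv2.imread("hazy_frame.png"))
```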
3. Image Enhancement:
Improve the visibility in night conditions
Improve the visibility in low-contrast conditions (a minimal contrast-enhancement sketch is shown below)
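As one possible low-contrast enhancement technique, here is a minimal CLAHE (contrast-limited adaptive histogram equalization) sketch; the file names and parameters are illustrative assumptions, not a prescribed setting.

```python
import cv2

# Load a low-contrast frame as grayscale (hypothetical file name).
frame = cv2.imread("low_contrast_frame.png", cv2.IMREAD_GRAYSCALE)

# CLAHE boosts local contrast while limiting noise amplification.
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
enhanced = clahe.apply(frame)

cv2.imwrite("enhanced_frame.png", enhanced)
```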
This actually means the sequence can be processed in real time (real-time processing), frame by frame (and we should monitor the processing time between frames), or the sequence can be recorded and then processed, so there is no need to worry about processing time.
In surveillance and
monitoring, it is necessary to have real
time video processing, which affects
the complexity of the algorithm used. On the other hand, when we need post-processing of, for example, our recorded videos, the complexity of the algorithms can rise.
10. Object Tracking
Object tracking has always been a challenging problem in the field of computer vision.
Popular challenges or contests like the Visual Object Tracking (VOT) challenge and the Visual Object Tracking in Thermal Infrared (VOT-TIR) challenge are proof that object tracking is an ongoing, demanding problem.
Deep Learning methods can also be applied to the task of single or multiple object tracking.
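Classical, non-deep-learning trackers are also readily available; here is a minimal single-object tracking sketch using OpenCV's CSRT tracker (available with the opencv-contrib package), with an illustrative video path and a manually drawn initial box.

```python
import cv2

cap = cv2.VideoCapture("input_video.mp4")   # hypothetical video path
ok, frame = cap.read()

# Draw a box around the target in the first frame.
bbox = cv2.selectROI("Select target", frame)
tracker = cv2.TrackerCSRT_create()
tracker.init(frame, bbox)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    ok, bbox = tracker.update(frame)
    if ok:
        x, y, w, h = map(int, bbox)
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("Tracking", frame)
    if cv2.waitKey(1) & 0xFF == 27:   # press Esc to quit
        break
cap.release()
cv2.destroyAllWindows()
```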
11. Digital Video Stabilization
Digital video stabilization is the task of removing jitter and shaking from a video caused by unwanted camera motion (because the camera is held in the hand, or the platform where the camera is mounted is shaking).
Digital video stabilization can be performed on surveillance or monitoring cameras, and therefore in real time, or we might want to remove hand shake from our pre-recorded videos (offline stabilization).
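A minimal sketch of the classical approach, estimating the global camera motion between consecutive frames from tracked features, might look like this; the video path is illustrative, and only the motion-estimation step is shown (trajectory smoothing and warping would complete the stabilizer).

```python
import cv2
import numpy as np

cap = cv2.VideoCapture("shaky_video.mp4")   # hypothetical input path
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
transforms = []   # per-frame (dx, dy, dangle) of the global camera motion

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Track corner features from the previous frame into the current one.
    pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                                  qualityLevel=0.01, minDistance=30)
    if pts is None:
        prev_gray = gray
        continue
    new_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, pts, None)
    good_old = pts[status.flatten() == 1]
    good_new = new_pts[status.flatten() == 1]
    # Rigid (rotation + translation) transform between consecutive frames.
    m, _ = cv2.estimateAffinePartial2D(good_old, good_new)
    if m is not None:
        dx, dy = m[0, 2], m[1, 2]
        da = np.arctan2(m[1, 0], m[0, 0])
        transforms.append((dx, dy, da))
    prev_gray = gray
cap.release()

# Smoothing the accumulated trajectory (e.g. with a moving average) and warping
# each frame with cv2.warpAffine toward the smoothed path would complete the stabilizer.
```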
Deep Learning frameworks enable faster algorithms, which can be applied successfully in real-time processing.
12. Video enhancement
Image enhancement techniques, mentioned in the pre-processing section, can also be applied to a video sequence in the same way.
The only concern is the complexity of the algorithm in the case of real-time processing.
This would be all for the
post today, but if you want to study the topic more, here are some scientific
papers that talk about the mentioned algorithms.