Accelerating Operational Intelligence and Production Using Computer Vision
Discover how Computer Vision works and how it can help upgrade operational intelligence and production, thus saving companies millions annually.
Computer vision has found applications in several domains, most noticeably robotics. However, applying it in cyber-physical systems requires accounting for "perceptual" and "semantic" ambiguities, which arise from the limitations of computer vision and related disciplines.
This article reviews some of these ambiguities, how they affect the performance and success of a computer vision system (particularly one implemented for cyber-physical systems), and possible methods to overcome them.
What is Computer Vision?
Computer vision refers to the construction of a model that describes the world in terms of visual primitives, such as edges or surfaces. Image processing is involved in transforming an image into one of these visual primitives.
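To make the idea of a visual primitive concrete, here is a minimal sketch of turning an image into one kind of primitive, a vertical edge. It assumes a grayscale image stored as a list of rows and uses a simple central-difference gradient; the function names and the threshold are illustrative choices, not part of any particular library.

```python
def horizontal_gradient(image):
    """Absolute central difference along each row; border columns stay 0."""
    height, width = len(image), len(image[0])
    grad = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(1, width - 1):
            grad[y][x] = abs(image[y][x + 1] - image[y][x - 1])
    return grad


def edge_columns(image, threshold):
    """Columns where every row has a strong gradient, i.e. a vertical edge."""
    grad = horizontal_gradient(image)
    width = len(image[0])
    return [x for x in range(width) if all(row[x] > threshold for row in grad)]


# Synthetic 4x8 image: dark left half (0), bright right half (255).
image = [[0] * 4 + [255] * 4 for _ in range(4)]
edges = edge_columns(image, threshold=100)
print(edges)
```

On this synthetic image the edge is reported at columns 3 and 4, the two columns straddling the dark-to-bright transition. Real images are noisier, which is where the ambiguities discussed in this article start to matter.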
Semantic ambiguity arises when no single interpretation of a primitive is clearly preferred over another. For instance, consider images in which shadows fall on objects whose colors are similar to or darker than the shadows themselves. In this case, there are several plausible interpretations of the object's actual color. Other examples include ambiguous diagonal lines and curves, which can be misinterpreted as "t"-shaped objects.
Consider segmenting an image containing text written horizontally, vertically, or at some angle θ relative to the horizontal axis using standard segmentation techniques. Suppose the angle of the text is θ = 22.5 degrees. The boundaries between letters are then no longer vertical; they appear inclined by this angle, which degrades the performance of a system trying to identify them.
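The effect of a 22.5 degree inclination can be sketched with a toy example. The code below assumes one common "standard" technique, a vertical projection profile that looks for empty columns between letters, and approximates the inclination with a horizontal shear rather than a true rotation. The stroke sizes and helper names are hypothetical.

```python
import math


def make_strokes(height, n_strokes, stroke_w, gap_w):
    """Binary image of upright vertical strokes standing in for letters."""
    width = n_strokes * (stroke_w + gap_w) + gap_w
    row = [0] * width
    for i in range(n_strokes):
        start = gap_w + i * (stroke_w + gap_w)
        for x in range(start, start + stroke_w):
            row[x] = 1
    return [row[:] for _ in range(height)]


def slant(image, theta_deg):
    """Shear rows so that vertical boundaries lean by theta_deg from vertical."""
    height, width = len(image), len(image[0])
    shift_per_row = math.tan(math.radians(theta_deg))
    max_shift = int(round((height - 1) * shift_per_row))
    out = [[0] * (width + max_shift) for _ in range(height)]
    for y in range(height):
        shift = int(round((height - 1 - y) * shift_per_row))  # top rows move most
        for x in range(width):
            if image[y][x]:
                out[y][x + shift] = 1
    return out


def interior_gap_columns(image):
    """Count empty columns between the first and last inked columns."""
    profile = [sum(column) for column in zip(*image)]
    inked = [x for x, value in enumerate(profile) if value > 0]
    return sum(1 for x in range(inked[0], inked[-1] + 1) if profile[x] == 0)


upright = make_strokes(height=20, n_strokes=4, stroke_w=3, gap_w=3)
slanted = slant(upright, theta_deg=22.5)
upright_gaps = interior_gap_columns(upright)
slanted_gaps = interior_gap_columns(slanted)
print(upright_gaps, slanted_gaps)
```

With these assumed dimensions the upright image has nine empty columns separating the strokes, while the slanted one has none, so a column-based segmenter can no longer find any boundary. Correcting the slant before segmentation is the usual remedy.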
Perceptual ambiguity arises when one interpretation of the primitive is preferred over another; it also affects how well a computer vision system detects primitives in images.
Consider text that is perfectly straight but whose color changes from black to white. In this case, the vertical boundaries between letters can appear tilted rather than perfectly vertical, and this apparent tilt is a perceptual ambiguity.
This will not affect the performance of a system in identifying these boundaries because there is only one interpretation for these boundaries.
In general, a computer vision system is an intelligent system capable of understanding its environment and executing actions to maximize performance or optimize a task. However, it also faces various challenges due to ambiguity in images captured by video cameras.
Such ambiguities present a unique set of problems that diminish the performance and success of computer vision applications such as scene reconstruction, traffic surveillance, autonomous navigation, and robot localization.
In most cases, visual information from cameras is processed offline, under controlled conditions and with limited interaction with the environment. This makes it possible to augment the data before processing, which helps reduce ambiguity by providing additional information about the objects observed in images.
Perceptual and Semantic Ambiguity in 3D Reconstruction
3D reconstruction is a computer vision process involving generating data describing the geometry and appearance of objects within an observed scene. This often involves estimating 3D representations from 2D images captured by video cameras.
This process faces various challenges because of both semantic and perceptual ambiguities in images. To capture 3D information, multiple images should be available for each object of interest.
However, real-world conditions such as untextured, transparent, or reflective surfaces can cause severe problems for 3D reconstruction algorithms. In addition, changes in viewpoint can further increase ambiguity when using stereo vision.
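Why a lack of texture hurts stereo vision can be shown with a small sketch. It assumes a rectified stereo pair, matches a window along one scanline with a sum of absolute differences, and converts disparity to depth with the standard pinhole relation Z = f * B / d. The pixel values, focal length, and baseline are made-up illustrative numbers.

```python
def matching_costs(left_row, right_row, x, half_window, max_disparity):
    """Sum of absolute differences between a left window centred at x and
    right windows shifted left by each candidate disparity."""
    costs = []
    for d in range(max_disparity + 1):
        cost = 0
        for k in range(-half_window, half_window + 1):
            cost += abs(left_row[x + k] - right_row[x + k - d])
        costs.append(cost)
    return costs


def best_disparities(costs):
    """All disparities that share the lowest matching cost."""
    lowest = min(costs)
    return [d for d, c in enumerate(costs) if c == lowest]


def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Pinhole stereo relation for rectified cameras: Z = f * B / d."""
    return focal_px * baseline_m / disparity_px


# Textured scanline: the left view is the right view shifted by 3 pixels.
right_textured = [10, 50, 20, 90, 30, 70, 40, 80, 60, 15, 95, 25, 55, 35, 75, 45]
left_textured = [0, 0, 0] + right_textured[:-3]
textured_best = best_disparities(
    matching_costs(left_textured, right_textured, x=8, half_window=1, max_disparity=4))

# Textureless scanline: every pixel has the same intensity in both views.
flat = [128] * 16
flat_best = best_disparities(
    matching_costs(flat, flat, x=8, half_window=1, max_disparity=4))

# Assumed camera: 700 px focal length, 0.1 m baseline.
depth_m = depth_from_disparity(focal_px=700.0, baseline_m=0.1,
                               disparity_px=textured_best[0])
print(textured_best, flat_best, depth_m)
```

The textured scanline yields a single best disparity of 3, whereas the flat one ties across every candidate from 0 to 4, so the depth of the untextured surface is undetermined from this pair alone.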
There are two types of ambiguous geometries, namely semantic and perceptual. Semantic ambiguity is the one that involves different interpretations for a single observed primitive.
For example, a straight line segment in an image can be interpreted as a curve or an edge depending on its orientation relative to the camera. In general, semantic ambiguities affect the performance of 3D reconstruction algorithms since they lead to multiple possible primitives that must be considered while processing images.
Conversely, perceptual ambiguities are caused by ambiguous shapes leading to ambiguous/multiple interpretations of actual object surfaces from images.
For example, a white circular patch of an object on a blue background would be perceived as an ellipse rather than a circle. Perceptual ambiguities result in more severe problems for 3D reconstruction algorithms since they lead to multiple possible interpretations that must be considered together with semantic ambiguity.
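One geometric reason a circle can show up as an ellipse is foreshortening. The sketch below is an illustration under a simplifying assumption (orthographic projection of a circle tilted about the horizontal axis), not a model of the specific example above; the function names are hypothetical.

```python
import math


def project_tilted_circle(radius, tilt_deg, n_points=360):
    """Orthographic image of a circle tilted about the x-axis by tilt_deg."""
    tilt = math.radians(tilt_deg)
    points = []
    for i in range(n_points):
        t = 2 * math.pi * i / n_points
        x = radius * math.cos(t)
        y = radius * math.sin(t) * math.cos(tilt)  # foreshortened axis
        points.append((x, y))
    return points


def semi_axes(points):
    """Half-extent of the projected outline along x and along y."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (max(xs) - min(xs)) / 2, (max(ys) - min(ys)) / 2


semi_major, semi_minor = semi_axes(project_tilted_circle(radius=1.0, tilt_deg=60.0))
print(semi_major, semi_minor)
```

A unit circle tilted by 60 degrees projects to an outline with semi-axes of about 1.0 and 0.5. From a single image, that outline is indistinguishable from a frontal ellipse with the same axes, which is exactly the kind of multiple interpretation a reconstruction algorithm has to resolve.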
The combined effects of task and environment (static vs. dynamic) lead to different levels of ambiguity which can change over time. Thus, perceptual and semantic ambiguities depend on the task under consideration and specific environmental conditions.
What is Operational Intelligence?
Operational intelligence aims to support strategic and tactical decision-making in the operations of modern organizations. It involves the use of software agents that analyze multiple data sources, such as text messages, emails, and sensor feeds, and extract pertinent information related to domain-specific problems in real time.
Computer vision and machine learning techniques have been extensively used in various fields such as navigation, autonomous robots, games, surveillance systems, etc.
These techniques also offer significant potential in operational intelligence, which requires understanding dynamic environments by analyzing and predicting the actions humans perform within them.
This enables the development of operational intelligence applications such as computer vision search engines, event detection systems, or collaborative filtering services.
Vision Meets Robotics
Combining the benefits of robotics and computer vision can be a very effective approach to solving problems in fields such as navigation, surveillance, and object recognition. For example, robots with cameras can take images and videos of their environment, making it easier for them to create an accurate map.
However, precise localization of objects within these maps is still a challenging task since there are many possible interpretations of the visual information provided by the robot's sensors.
Computer vision techniques can help since they provide automatic interpretations for this visual information that might otherwise require human intervention.
The availability of real-time data from various sensors on mobile robots provides an opportunity to develop efficient systems with location awareness capabilities. These systems are known as cognitive mapping systems.
How Computer Vision Can Streamline Operations
Modern organizations have an overabundance of data that needs to be processed and analyzed to make meaningful decisions. There is a dire need for systems that can efficiently process this large amount of information available from various sources in today's competitive market.
This would enable the development of new business models by providing real-time insights into business operations such as customer behavior or financial situation.
In addition, these systems can also help with various operational tasks such as tracking products throughout the supply chain, energy management, etc.
This leads to improved efficiency and fewer human errors, since a significant number of decisions are based on visual information received from different sensors.
There is also a need for effective and efficient ways of integrating existing systems into an overall operational intelligence system. This would allow organizations to make proper use of the data they generate and save on costs associated with human intervention.
Computer vision can help make this possible through capabilities such as semantic segmentation, accurate object detection, and retrieval.
Moreover, computer vision can improve operational efficiency by letting machines take over decisions that require only low-level analytic capacities, such as inspection and surveillance. In fact, machine vision algorithms have already been used successfully in several industrial sectors, including agriculture, automotive, and security.
1. Mining
With the world's mining sector facing complex challenges, computer vision can play a crucial role in helping with various operational tasks. For example, cameras can be used to monitor stockpiles of raw materials and detect cracks or any other anomalies, allowing early detection of potential risks.
These systems can also help with the management of large mines since they enable automatic tracking of vehicles throughout the site and provide location awareness capabilities for mobile robots. This would reduce reliance on manual labor and save significant amounts of money by reducing risk exposure.
The technology has reportedly been tested successfully at Rio Tinto mines, where it is said to have reduced costs associated with human intervention by around $15 million annually.
2. Transportation and Logistics
Computer vision-based intelligent transportation systems (ITS) can help improve logistics efficiency by providing an automated means of tracking fleets of trucks within distribution centers or warehouses. Older ITS deployments typically use magnetic markers to identify location, but vision-based systems are increasingly used as well.
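As a rough illustration of vision-based fleet tracking, the sketch below links vehicle positions detected in one frame to existing tracks by nearest distance. It assumes an upstream detector already supplies (x, y) centroids; the greedy matching rule, the distance limit, and the coordinates are illustrative assumptions rather than a description of any deployed system.

```python
import math


def associate(tracks, detections, max_distance):
    """Greedy nearest-neighbour association.

    tracks maps a track id to its last (x, y) position; detections is a list
    of (x, y) centroids from the current frame. Unmatched detections start
    new tracks. Tracks with no detection are dropped in this simple sketch.
    """
    updated = {}
    unmatched = list(range(len(detections)))
    for track_id, (tx, ty) in sorted(tracks.items()):
        best, best_dist = None, max_distance
        for i in unmatched:
            dx, dy = detections[i]
            dist = math.hypot(dx - tx, dy - ty)
            if dist <= best_dist:
                best, best_dist = i, dist
        if best is not None:
            updated[track_id] = detections[best]
            unmatched.remove(best)
    next_id = max(tracks, default=0) + 1
    for i in unmatched:
        updated[next_id] = detections[i]
        next_id += 1
    return updated


tracks = {1: (10.0, 10.0), 2: (50.0, 40.0)}
detections = [(52.0, 41.0), (12.0, 11.0), (90.0, 90.0)]
tracks = associate(tracks, detections, max_distance=10.0)
print(tracks)
```

Here trucks 1 and 2 keep their identities as they move a few units, and the detection at (90, 90) becomes a new track 3. Production trackers typically add motion prediction and handle missed detections, which this sketch deliberately leaves out.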
3. Industrial Facilities
Vision sensors have been successfully deployed in various industrial facilities for predictive maintenance, process monitoring, and data collection using mobile robots. For example, the technology is being widely used in the agricultural sector for crop yield prediction, cattle management, etc.
This has helped save on costs related to human intervention, since machine vision algorithms can perform low-level analytic tasks, such as inspection and surveillance, that would otherwise require people.
4. Security
Computer vision has been used successfully in the security industry to track objects and people in real time, tag images and videos, and perform automatic detection. This is an important application, since it has far-reaching implications for financial institutions, transportation, and public safety.
Use-Cases For the Future
Research is currently underway to develop systems that can automatically extract information from videos, including events, activities, and associated data (e.g., location and time). For example, a vision system can be trained using images depicting specific events such as an accident or crime scene to generate reports presented to human operators.
Also, researchers are working on semantic segmentation algorithms which separate different objects within the field of view so that analysts do not need to manually comb through hours of video footage in search of specific items or events.
Such vision systems can also be used to help generate alerts based on abnormalities detected within the monitored environment, which would result in reduced reliance on security staff.
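A very simple form of such abnormality alerting is frame differencing against a reference view. The sketch below assumes grayscale frames stored as lists of rows and a fixed camera; both thresholds are arbitrary illustrative values that a real deployment would need to tune.

```python
def changed_fraction(reference, frame, pixel_threshold):
    """Fraction of pixels whose intensity differs from the reference frame."""
    total = 0
    changed = 0
    for ref_row, row in zip(reference, frame):
        for ref_px, px in zip(ref_row, row):
            total += 1
            if abs(px - ref_px) > pixel_threshold:
                changed += 1
    return changed / total


def should_alert(reference, frame, pixel_threshold=30, area_threshold=0.05):
    """Raise an alert when more than area_threshold of the view has changed."""
    return changed_fraction(reference, frame, pixel_threshold) > area_threshold


# Reference view of an empty scene, a slightly brighter but empty frame,
# and a frame where a bright 4x4 object has appeared.
reference = [[100] * 10 for _ in range(10)]
quiet = [[102] * 10 for _ in range(10)]
intruder = [row[:] for row in reference]
for y in range(3, 7):
    for x in range(3, 7):
        intruder[y][x] = 220

quiet_alert = should_alert(reference, quiet)
intruder_alert = should_alert(reference, intruder)
print(quiet_alert, intruder_alert)
```

The small global brightness change does not trigger an alert, while the new object, covering 16 percent of the view, does. Lighting changes and camera shake make this naive rule unreliable on its own, which is one reason human supervision remains part of the loop.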
Vision sensors are already being used extensively in the industrial sector for applications like predictive maintenance, process monitoring, and mobile robot data collection. These intelligent systems can also automatically detect and track vehicle locations throughout distribution centers and warehouses.
According to Grand View Research Inc. forecasts, the market is expected to grow from $1 billion in 2015 to $12 billion by 2022, at a CAGR of 27.9%.
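The quoted forecast can be sanity-checked with the usual compound annual growth rate formula, CAGR = (end / start)^(1 / years) - 1. The sketch below simply applies that formula to the figures above, taking 2015 to 2022 as seven years of growth, which is an assumption about how the forecast period is counted.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value, rate, years):
    """Value reached after compounding at a fixed annual rate."""
    return start_value * (1 + rate) ** years


years = 2022 - 2015
implied_rate = cagr(1.0, 12.0, years)        # billions of dollars
projected_2022 = project(1.0, 0.279, years)  # billions of dollars
print(round(implied_rate * 100, 1), round(projected_2022, 1))
```

Under that assumption, growing from $1 billion to $12 billion implies roughly 42.6% per year, while 27.9% per year would reach only about $5.6 billion. The quoted endpoints and growth rate therefore do not reconcile as stated; the underlying report may use a different base year or market definition, so the numbers are best read as indicative.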
The above use-cases provide just a glimpse of the myriad ways that various industrial verticals are using computer vision. Although the technology has been in existence for decades, we are now witnessing its increased adoption at industrial facilities due to technological advancements which have resulted in reduced costs and improved accuracy.
However, it is important to note that low-level analytics cannot completely replace humans since machine learning algorithms still lack situational awareness and need constant user supervision or support. This means that industrial vision systems may require human assistance, especially when multiple events occur simultaneously.
According to market research by Tractica, the worldwide robotic vision systems market is expected to grow from $1 billion in 2015 to $12 billion by 2022, at a CAGR of 27.9%.
To conclude, we can say that the future looks bright for computer vision, and it is on its way to becoming a ubiquitous technology. A few years from now, we will be seeing these algorithms being used at facilities of all sizes, including homes.