Ghost imaging is a computational imaging technique that, combined with human vision, can image an object the viewer cannot see directly. It is a new development in using artificial intelligence to enhance human vision.
Daniele Faccio from the University of Glasgow in the UK presented the new findings at the Optica Imaging and Applied Optics Congress, in a talk titled “Non-Line-of-Sight (NLoS) Imaging and Imaging through Scattering Media.”
Imaging is possible by correlating a projected light pattern that interacts with the object with a reference pattern that does not. This is the first time researchers have used the human visual system in ghost imaging, having a real person view the light patterns instead of a camera. The brain’s visual response is recorded and used as feedback for an algorithm that determines how to reshape the projected light patterns and reconstructs the final image.
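To make the correlation step concrete, here is a minimal sketch of how a ghost image can be reconstructed computationally. The array names and the simulated bucket signal are illustrative assumptions, not the authors’ code.

```python
import numpy as np

# Minimal sketch of the correlation step in computational ghost imaging.
# Assumptions (not from the paper): `patterns` is an (N, H, W) stack of
# projected light patterns and `signals` holds the N intensity values
# measured after each pattern interacts with the object.
def ghost_image(patterns: np.ndarray, signals: np.ndarray) -> np.ndarray:
    # Weight each pattern by its mean-subtracted signal so that
    # patterns uncorrelated with the object average out to zero.
    weights = signals - signals.mean()
    return np.tensordot(weights, patterns, axes=1) / len(signals)

# Toy usage: a 16x16 hidden object probed with random binary patterns.
rng = np.random.default_rng(0)
obj = np.zeros((16, 16))
obj[4:12, 6:10] = 1.0
patterns = rng.integers(0, 2, size=(5000, 16, 16)).astype(float)
signals = (patterns * obj).sum(axis=(1, 2))  # simulated bucket signal
reconstruction = ghost_image(patterns, signals)
```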
An EEG technique was used to estimate the intensity of light transmitted by the object and diffused from the white wall; this information was fed into a neurofeedback loop used to reconstruct the image. When the EEG signal fell below a certain threshold, the system concluded that the light pattern did not overlap with the object, and that pattern could be automatically removed, or “carved out,” in real time.
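The carving step can be sketched as a simple filtering loop. Here, `eeg_response` and the threshold value are hypothetical stand-ins for the paper’s neurofeedback measurements.

```python
import numpy as np

# Hedged sketch of the carving step described above. `eeg_response` is
# a hypothetical stand-in for the viewer's measured EEG amplitude when
# shown a pattern; the threshold is illustrative, not from the paper.
def carve_patterns(patterns, eeg_response, threshold=0.2):
    """Keep patterns whose evoked EEG signal clears the threshold,
    i.e. patterns presumed to overlap the hidden object."""
    return [p for p in patterns if eeg_response(p) >= threshold]

# Toy usage with a simulated response function.
rng = np.random.default_rng(1)
patterns = [rng.random((16, 16)) for _ in range(100)]

def simulated_response(p):
    return p.mean()  # placeholder for a real EEG reading

kept = carve_patterns(patterns, simulated_response)
```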
Using this technique, the researchers successfully reconstructed 16×16-pixel images of simple objects that could not be seen directly. They also demonstrated that the carving-out process reduced the observation time needed for image reconstruction to about one minute.
“We believe that this work provides ideas that one day might be used to bring together human and artificial intelligence. The next steps in this work range from extending the capability to provide 3D depth information to looking for ways to combine multiple information from multiple viewers at the same time,” says Daniele Faccio, Professor of Quantum Technologies, School of Physics and Astronomy, University of Glasgow.
Image segmentation can provide more accurate descriptions of objects than picture classification
For many years, computer vision datasets have provided the accurate annotations that are the foundation of many Artificial Intelligence (AI) models. They have performed admirably enough to satisfy the demands of machine perception systems. However, AI has reached a stage where it requires precise outputs from computer vision models in order to enable delicate human-machine interaction and immersive virtual experiences. Image segmentation, one of the most fundamental computer vision tasks, is crucial for assisting robots in understanding and perceiving their environment.
It can provide more accurate descriptions of objects than picture classification and object identification for a variety of applications, such as image editing, augmented reality (AR), medical image processing, 3D reconstruction, satellite image analysis, and robot manipulation. We can categorize these applications as “light” or “heavy” based on how directly they affect actual objects. Examples of “light” applications are image analysis and photo editing, while “heavy” applications act directly on physical objects (as with manufacturing and surgical robots).
The “light” applications may be more tolerant of segmentation failures and defects, because these issues largely raise labor and time costs, which is usually acceptable. Defects or failures in “heavy” applications, however, are more likely to have catastrophic effects, such as physical damage to items or injuries that could be fatal to humans and other animals. The models for these applications therefore need to be precise and trustworthy. Most segmentation algorithms remain unsuitable for such “heavy” applications because of limits in accuracy and robustness, which keeps segmentation methods from playing a larger role in broader applications.
Dichotomous image segmentation (DIS), as researchers call it, is the task of segmenting exceedingly precise objects from natural images. The goal is a framework that can handle both “heavy” and “light” applications. Current image segmentation problems, however, focus mainly on segmenting objects with specified characteristics, such as visible, veiled, or detailed objects, or specific categories. Essentially, all of these tasks are dataset-dependent: most use the same input/output formats and rarely employ strategies explicitly designed for segmenting their targets.
In contrast to semantic segmentation, the proposed DIS task typically focuses on images with one or a few targets, which makes it simpler to obtain more detailed, accurate information about each target. Creating a category-agnostic DIS task for accurately segmenting objects of varying structural complexity, regardless of their attributes, is therefore highly promising.
The researchers propose the following novel contributions:
1. DIS5K, a sizable, extensible DIS dataset that combines 5,470 high-resolution images with accurate binary segmentation masks.
2. IS-Net, a novel baseline built with intermediate supervision, which prevents over-fitting in high-dimensional feature spaces by mandating direct feature synchronization (see the sketch after this list).
3. A newly created human correction efforts (HCE) metric, which counts the number of human interventions needed to correct the mis-segmented regions.
4. The most thorough DIS analysis to date, provided by the DIS benchmark built on the new DIS5K.
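As a rough illustration of the intermediate-supervision idea in contribution 2, the following PyTorch sketch combines a standard mask loss with a feature-synchronization term. All function and variable names here are assumptions for illustration, not IS-Net’s actual code.

```python
import torch.nn.functional as F

# Hedged sketch in the spirit of intermediate supervision: alongside
# the usual mask loss, intermediate features of the segmentation
# network are synchronized with features produced by an encoder
# trained on the ground-truth masks.
def intermediate_supervision_loss(seg_feats, gt_feats, pred_logits, gt_mask):
    # Standard per-pixel segmentation loss on the predicted mask.
    mask_loss = F.binary_cross_entropy_with_logits(pred_logits, gt_mask)
    # Feature-synchronization term: match each intermediate feature map
    # of the segmentation network to its ground-truth-encoder counterpart.
    feat_loss = sum(F.mse_loss(s, g) for s, g in zip(seg_feats, gt_feats))
    return mask_loss + feat_loss
```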
Let’s see the top 6 open-source datasets for computer vision that deserve to be explored.
Computer vision is one of the top tech trends booming today. The technology is accelerating every domain in the industry, helping organizations revolutionize the way machines are used. To build a robust deep learning model for computer vision, one should use high-quality datasets in the training phase.
ImageNet: It is an ongoing research effort that aims to provide researchers with an easily accessible image database. It is one of the most well-known image databases, popular with researchers and learners alike, and provides an average of 1,000 images to illustrate each synset.
CIFAR-10 and CIFAR-100: CIFAR-10 and CIFAR-100 are collections of images used to train machine learning and computer vision algorithms, well suited to beginners in the field. They are also some of the most popular machine learning datasets for quick comparison of algorithms, as they expose strengths and weaknesses without putting much burden on the parameter-tuning process (see the loading snippet after this list).
MS COCO: The MS COCO dataset, short for Microsoft Common Objects in Context, consists of 328K images. It provides annotations for object detection, keypoint detection, panoptic segmentation, captioning, and dense human pose estimation.
MPII Human Pose: This dataset is used for the evaluation of articulated human pose estimation. It consists of around 25K images containing over 40K people with annotated body joints. Each image is extracted from a YouTube video and provided with preceding and following unannotated frames. Overall, the dataset covers around 410 human activities, and each image is labelled with an activity.
Berkeley DeepDrive: This dataset is used for autonomous vehicle training. It comprises over 100K video sequences with diverse kinds of annotations, such as object bounding boxes, drivable areas, image-level tags, lane markings, and much more. It also represents a wide variety of geographic, environmental, and weather conditions.
CityScapes: It is a database containing a diverse set of stereo video sequences recorded in street scenes from 50 different cities. It includes semantic, instance-wise, dense pixel annotations for 30 classes grouped into 8 categories. CityScapes provides pixel-level annotations for 5,000 frames plus 20,000 coarsely annotated frames.
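As a practical starting point, several of these datasets can be loaded in a few lines of code. The snippet below shows one common route for CIFAR-10 via torchvision; the download path and transform are illustrative choices, and CIFAR-100 works the same way with datasets.CIFAR100.

```python
import torchvision
import torchvision.transforms as T

# Illustrative snippet: loading CIFAR-10 with torchvision (one common
# route, not the only one). The dataset is downloaded on first use.
transform = T.ToTensor()
train_set = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True, transform=transform)
test_set = torchvision.datasets.CIFAR10(
    root="./data", train=False, download=True, transform=transform)
print(len(train_set), len(test_set))  # 50000 10000
```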
A new technique in computer vision may enhance our 3D understanding of 2D images.
3D models are collections of points in 3D space, so they have length, width, and depth; 2D images have only length and width. Recovering the missing dimension has long been an important part of computer vision research. Computer vision is a field of artificial intelligence that enables computers to derive information from images, videos, and other inputs. The problem is difficult for several reasons, one being that information is inevitably lost when a scene that takes place in 3D is reduced to a 2D representation.
3D understanding of 2D images:
There are some 3D modeling programs that can help you sculpt or create 3D models from single 2D images, though they require a bit of time and patience. No software yet can take a single two-dimensional image and produce a robust three-dimensional model. However, a three-dimensional model can be built from a series of two-dimensional images through a process called photogrammetry: the art, science, and technology of obtaining reliable information about physical objects and the environment by recording, measuring, and interpreting photographic images and patterns of recorded radiant electromagnetic energy and other phenomena.
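To make the idea concrete, the sketch below shows the core triangulation step behind photogrammetry-style pipelines: recovering one 3D point from its projections in two images. The projection matrices are assumed inputs; this is a textbook linear method, not any specific product’s implementation.

```python
import numpy as np

# Hedged sketch of triangulating a 3D point from the same feature seen
# in two images. P1 and P2 are assumed 3x4 camera projection matrices;
# x1 and x2 are the matched pixel coordinates in each image.
def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point from two views."""
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The homogeneous 3D point is the right singular vector of A with
    # the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]  # de-homogenize
```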
There are some well-established strategies for recovering 3D information from multiple 2D images, but each has limitations. A newer idea is virtual correspondences (VCs): a pair of pixels from two images whose camera rays intersect in 3D. Like classic correspondences, VCs conform to epipolar geometry; unlike classic correspondences, VCs do not need to be co-visible across views.
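The epipolar relationship mentioned above can be checked numerically: given a fundamental matrix F relating two views, a valid correspondence (classic or virtual) should make the residual below approximately zero. The matrix and pixel values are assumed inputs for illustration.

```python
import numpy as np

# Minimal sketch of the epipolar constraint that both classic and
# virtual correspondences must satisfy: x2^T F x1 = 0, where F is the
# fundamental matrix relating the two views.
def epipolar_residual(F: np.ndarray, x1, x2) -> float:
    """Residual of the epipolar constraint for a candidate pixel pair."""
    x1_h = np.array([x1[0], x1[1], 1.0])  # homogeneous coordinates
    x2_h = np.array([x2[0], x2[1], 1.0])
    return float(x2_h @ F @ x1_h)  # near zero for a valid correspondence
```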
Virtual correspondences offer a way to carry things further: they can link views that share no visible surface, as when one photo is taken from the left side of a rabbit and another from the right side. The researchers want computers that can understand the three-dimensional world the way humans do, which means developing systems that can interpret not only still images but also short video clips and, eventually, full-length movies.