High-Speed Way to Detect Location, Size, and Category of Multiple Objects

Scientists have developed a new high-speed method to detect the size, location, and category of multiple objects without acquiring images or performing complex scene reconstruction.

Researchers have developed a new high-speed way to detect the location, size, and category of multiple objects without acquiring images or requiring complex scene reconstruction. Image Credit: Lintao Peng, Beijing Institute of Technology

Because the new method greatly reduces the computing power needed for object detection, it could be useful for identifying hazards while driving.

Our technique is based on a single-pixel detector, which enables efficient and robust multi-object detection directly from a small number of 2D measurements.

Liheng Bian, Research Team Leader, Beijing Institute of Technology in China

Bian added, “This type of image-free sensing technology is expected to solve the problems of heavy communication load, high computing overhead, and low perception rate of existing visual perception systems.”

Currently available image-free perception techniques can only perform classification, single-object recognition, or tracking. To accomplish all three at once, the scientists developed a method called image-free single-pixel object detection (SPOD).

In the Optica Publishing Group journal Optics Letters, the scientists report that SPOD can achieve an object detection precision of just over 80%.

The SPOD method builds on the research group’s earlier work on developing image-free sensing as an effective scene perception technology. Their previous work includes image-free classification, segmentation, and character recognition based on a single-pixel detector.

For autonomous driving, SPOD could be used with lidar to help improve scene reconstruction speed and object detection accuracy. We believe that it has a high enough detection rate and accuracy for autonomous driving while also reducing the transmission bandwidth and computing resource requirements needed for object detection.

Liheng Bian, Research Team Leader, Beijing Institute of Technology in China

Detection Without Images

Automating advanced visual tasks, whether to navigate a vehicle or to track a moving plane, generally requires detailed images of a scene in order to extract the features needed to identify an object.

However, this requires either complex imaging hardware or complex reconstruction algorithms, resulting in long running times, high computational costs, and heavy data transmission loads.

Image-free sensing techniques based on single-pixel detectors have the potential to reduce the computational power required for object detection.

Rather than using a pixelated detector such as a CMOS or CCD sensor, single-pixel imaging illuminates the scene with a sequence of structured light patterns and records the transmitted light intensity for each pattern to capture spatial information about the objects. This data is then used to computationally reconstruct the object or to calculate its properties.
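The measurement step described above can be illustrated with a small simulation. This is only a toy sketch, not the researchers' implementation: the random binary patterns, the 8 x 8 scene, and the function names are all assumptions made for illustration, and the sampling rate is assumed to mean the number of measurements divided by the number of scene pixels, as is common in single-pixel imaging.

```python
import random

def make_patterns(num_patterns, height, width, seed=0):
    """Generate random binary illumination patterns.

    Assumption: random 0/1 masks stand in for the optimized structured
    light patterns used by SPOD, which the article does not specify.
    """
    rng = random.Random(seed)
    return [
        [[rng.randint(0, 1) for _ in range(width)] for _ in range(height)]
        for _ in range(num_patterns)
    ]

def single_pixel_measure(scene, pattern):
    """Return the total intensity a single-pixel detector would record:
    the sum over all pixels of scene value times pattern value."""
    return sum(
        s * p
        for scene_row, pattern_row in zip(scene, pattern)
        for s, p in zip(scene_row, pattern_row)
    )

# Hypothetical 8 x 8 scene: dark background with one bright 2 x 2 object.
HEIGHT, WIDTH = 8, 8
scene = [[0.0] * WIDTH for _ in range(HEIGHT)]
for row in range(2, 4):
    for col in range(5, 7):
        scene[row][col] = 1.0

# A 5% sampling rate on 64 pixels gives only about 3 measurements.
SAMPLING_RATE = 0.05
num_patterns = max(1, round(SAMPLING_RATE * HEIGHT * WIDTH))

patterns = make_patterns(num_patterns, HEIGHT, WIDTH)
measurements = [single_pixel_measure(scene, p) for p in patterns]

print(f"{num_patterns} patterns -> measurements: {measurements}")
```

Each pattern yields a single number, so the detector output is far smaller than a full image. In an image-free approach, a short vector like `measurements` would be passed directly to a detection model instead of being used to reconstruct the scene.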

For SPOD, the scientists used a small, optimized structured light pattern to rapidly scan the entire scene and obtain 2D measurements.

These measurements are fed into a deep learning model, a transformer-based encoder, which extracts the meaningful high-dimensional features of the scene. The features are then fed into a decoder based on a multi-scale attention network, which outputs the class, location, and size of all targets in the scene simultaneously.

Compared to the full-size pattern used by other single-pixel detection methods, the small, optimized pattern produces better image-free sensing performance. Also, the multi-scale attention network in the SPOD decoder reinforces the network’s attention to the target area in the scene. This allows more efficient extraction of scene features, enabling state-of-the-art object detection performance.

Lintao Peng, Group Member, Beijing Institute of Technology

Proof-of-Concept Demonstration

To demonstrate SPOD experimentally, the scientists built a proof-of-concept setup. Images randomly selected from the PASCAL VOC 2012 test dataset were printed on film and used as target scenes.

At a sampling rate of 5%, the average time to complete spatial light modulation and image-free object detection for each scene with SPOD was just 0.016 seconds. This is much faster than performing scene reconstruction first (0.05 seconds) and then object detection (0.018 seconds). SPOD showed an average detection precision of 82.2% across all object classes in the test dataset.
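The timing comparison can be worked through in a few lines. The three durations are the figures reported above; the variable names are illustrative, and the sketch assumes the reconstruction and detection steps run one after the other, so their times simply add.

```python
# Per-scene times reported in the article, in seconds.
spod_time_s = 0.016            # image-free detection with SPOD
reconstruction_time_s = 0.05   # scene reconstruction alone
detection_time_s = 0.018       # object detection on the reconstructed image

# Assumption: the conventional pipeline runs both steps sequentially.
pipeline_time_s = reconstruction_time_s + detection_time_s
speedup = pipeline_time_s / spod_time_s
scenes_per_second = 1.0 / spod_time_s

print(f"Reconstruct-then-detect: {pipeline_time_s:.3f} s per scene")
print(f"SPOD speedup: about {speedup:.2f}x")
print(f"SPOD throughput: about {scenes_per_second:.1f} scenes per second")
```

Under that assumption, the conventional route takes about 0.068 seconds per scene, so SPOD is roughly four times faster and could process on the order of 60 scenes per second.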

Peng added, “Currently, SPOD cannot detect every possible object category because the existing object detection dataset used to train the model only contains 80 categories. However, when faced with a specific task, the pre-trained model can be fine-tuned to achieve image-free multi-object detection of new target classes for applications such as pedestrian, vehicle, or boat detection.”

Next, the scientists plan to extend the image-free perception technology to other types of detectors and computational acquisition systems to achieve reconstruction-free sensing.

Journal Reference:

Okamura, S., et al. (2023) Ultrafast measurement of vector spatial modes by using two-dimensional linear optical sampling. Optics Letters. doi.org/10.1364/OL.490009.
