ICIP 2023 Challenge Session
Challenge Title: Object detection under uncontrolled acquisition environment and scene context constraints
Image acquisition conditions can significantly affect high-level computer vision tasks such as object detection, object recognition, object segmentation, depth estimation, scene understanding, and object tracking, to name a few. Improvements in sensor quality and deep learning methods have increased robustness against distortions, allowing many computer vision algorithms to reach suitable performance. However, even with new sensor technologies and deep learning approaches, performance remains limited in real applications where the visual scene contains both local and global distortions, as in autonomous vehicles, video surveillance, or medical robotics.

Several object detection benchmark datasets have been proposed [1-3], the most popular being MS-COCO. The performance of object detection models is generally evaluated using the mean Average Precision (mAP) metric. However, only global image distortions are considered in these experiments. To better assess the robustness of object detection models, it is important to also consider local distortions and the complexity of the observed scenes in real environments; databases that include such scenarios are more realistic and reliable. To this end, we built a database containing images with various global and local distortions, taking relevant scene-context features into account to make the images more realistic. Our dedicated dataset comprises original and distorted images from the well-known MS-COCO dataset; the synthetic distortions are generated with several types and severity levels, with respect to the scene context.

Important: The selected teams will be invited to be part of a joint paper, summarizing the top proposed solutions, to be submitted for publication in an IEEE Transactions journal.
The CD-COCO database is available to ICIP 2023 challenge competitors who have registered via this link, so they can test their methods against different distortions and severity levels. The proposed methods must localize objects as accurately as possible and determine their classes in a reasonable time; the duration of the detection process is a parameter in the performance evaluation. Participants are required to submit an easy-to-read, commented implementation of their algorithm (preferably in Matlab or Python) along with a document summarizing their method and its steps. The code should include an executable script and a corresponding readme file allowing us to test the solution on our CD-COCO test set. Illustrative results may also be submitted to demonstrate the efficiency of the solution. Participants must also report the execution time of their solution and their system configuration, so that execution times can be normalized across competitors (a minimal timing sketch is given after the list below). Thus, the submitted methods must aim at the following goals:
- Detect the presence of objects
- Determine which class they belong to
- Determine their location as precisely as possible via bounding boxes
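Since detection speed factors into the evaluation, participants may want to measure a per-image execution time to report alongside their system configuration. The Python sketch below is a minimal, hypothetical harness: `detect_objects` is a placeholder for the participant's own detector, and the folder name is illustrative.

```python
import time
from pathlib import Path

import cv2  # used here only to load images


def detect_objects(image):
    """Placeholder for the participant's detector.

    Replace with your model; expected to return a list of
    (class_id, score, x, y, w, h) detections.
    """
    return []


def mean_detection_time(image_dir):
    """Run the detector over a folder and return the mean time per image."""
    paths = sorted(Path(image_dir).glob("*.jpg"))
    total = 0.0
    for p in paths:
        image = cv2.imread(str(p))
        start = time.perf_counter()
        detect_objects(image)
        total += time.perf_counter() - start
    return total / max(len(paths), 1)


if __name__ == "__main__":
    # Report this figure together with the CPU/GPU configuration used.
    seconds = mean_detection_time("cd_coco_test_images")  # hypothetical folder
    print(f"Mean detection time: {seconds * 1000:.1f} ms per image")
```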
Evaluation Criteria
The submitted methods will be assessed with the official COCO mAP metric, which characterizes precision through the ability to detect objects and locate them accurately. Accuracy and speed will be combined into a ratio describing the efficiency of each proposed solution. Furthermore, all proposed methods will be tested on our lab computer with the same GPU to normalize execution times. The distorted test sets contain images subjected to various distortions, either at random severity levels or with severity levels increasing progressively from set to set. The evaluation thus has two parts (an evaluation sketch follows the list):
- A general test set with all distortion types at random severity levels.
- Test sets for each distortion type, with a specific severity level increasing linearly from set to set.
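For reference, the official COCO mAP can be reproduced locally with the pycocotools package. The sketch below assumes ground-truth annotations in COCO JSON format and detections in the standard COCO results format; the file names are placeholders, not files distributed with the challenge.

```python
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

# Placeholder paths: a COCO-format annotation file and a results file
# (a JSON list of {"image_id", "category_id", "bbox", "score"} entries).
coco_gt = COCO("annotations.json")
coco_dt = coco_gt.loadRes("detections.json")

evaluator = COCOeval(coco_gt, coco_dt, iouType="bbox")
evaluator.evaluate()    # match detections to ground truth per image/category
evaluator.accumulate()  # build precision/recall over IoU thresholds
evaluator.summarize()   # print AP@[.50:.95], AP50, AP75, etc.

# stats[0] is the primary COCO metric: AP averaged over IoU
# thresholds 0.50:0.05:0.95, all categories and object sizes.
print("mAP:", evaluator.stats[0])
```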
The CD-COCO dataset used in this challenge session is derived from the famous MS-COCO dataset, which contains 164K images split into three sets: a training set of 118K images, a validation set of 5K images, and a test set of 41K images. We applied dedicated distortion types at specific severity levels to the training set according to the scene context of each image. The choice of distortion type is correlated with the scene type (indoor/outdoor) and the scene context (the objects present and the scene depth). Likewise, the distortion severity level is assigned according to the object type and position (pixel and depth) for local distortions and for atmospheric distortions (rain and haze). For example, haze and rain cannot occur in indoor scenes, and object motion blur should be correlated with the object's velocity, which depends on the object type and its position in the scene. Thus, a severity level based on the object type should consider the object's sensitivity to a given distortion, while the object position and scene depth account for the scene's specificity, making the distortions more coherent with the scene context and type. Important: The link to access the dataset will only be provided to registered participants.
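The exact generation pipeline is not reproduced here, but the scene-context logic described above can be sketched as follows. Every rule, class prior, and threshold in this snippet is an illustrative assumption, not the actual CD-COCO calibration.

```python
import random

ATMOSPHERIC = {"rain", "haze"}  # excluded from indoor scenes
DISTORTIONS = ["compression", "noise", "contrast", "rain", "haze",
               "camera_motion_blur", "defocus_blur", "local_backlight",
               "local_motion_blur", "local_defocus_blur"]


def pick_distortion(scene_type):
    """Choose a distortion compatible with the scene type (indoor/outdoor)."""
    candidates = [d for d in DISTORTIONS
                  if scene_type == "outdoor" or d not in ATMOSPHERIC]
    return random.choice(candidates)


def local_motion_blur_level(object_class, depth, max_level=10):
    """Assumed rule: faster object classes and nearer objects get stronger blur."""
    # Hypothetical per-class velocity prior, normalized to [0, 1].
    velocity = {"car": 1.0, "bicycle": 0.6, "person": 0.3}.get(object_class, 0.2)
    nearness = 1.0 - min(max(depth, 0.0), 1.0)  # depth assumed normalized to [0, 1]
    return max(1, round(max_level * velocity * nearness))
```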
Table 1: CD-COCO dataset parameters.

Parameter | Value |
---|---|
Number of images per set | 118K (training), 5K (validation) |
Acquisition conditions | 2 (day, night) |
Scene type | 2 (indoor, outdoor) |
Image resolution | 640 x 480 |
Image category | RGB images |
Number of distortions | 10 |
Number of distortion levels | 10 |
Distortion types | (D1) Image compression, (D2) Noise (additive white Gaussian noise), (D3) Contrast changing, (D4) Rain, (D5) Haze, (D6) Motion blur (camera motion), (D7) Defocus blur, (D8) Local backlight illumination, (D9) Local motion blur, (D10) Local defocus blur |
Object types | 80 object categories from the COCO dataset (person, bicycle, car, etc.) |
Our CD-COCO dataset comprises local distortions such as motion blur, defocus blur, and backlight illumination applied to objects or specific areas. It is worth noticing that the weighting and magnitude of each distortion are adjusted according to the position of the object in the observed scene: both the 2D spatial position and the depth are taken into account when applying the synthetic distortions. The database also contains global distortions related to camera parameters and characteristics, such as noise sensitivity, defocus, or instabilities, and those related to acquisition conditions, such as atmospheric turbulence, lossy compression artifacts, motion blur, or uncontrolled lighting. Among the atmospheric and weather factors affecting image acquisition quality, we consider rain and haze. The other factors, related to camera sensor limitations, are mainly noise sensitivity, contrast sensitivity, and spatial resolution. Global blur may result from camera motion and/or optical defocus, whereas local motion blur results from moving objects. The dataset is detailed in Tables 1 and 2.
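To make the severity grading concrete, the snippet below applies two of the global distortions, additive white Gaussian noise (D2) and camera motion blur (D6), with NumPy and OpenCV. The mapping from severity level to noise sigma and kernel length is an assumption chosen for illustration, not the calibration used to build CD-COCO.

```python
import cv2
import numpy as np


def add_awgn(image, level, max_level=10):
    """D2: additive white Gaussian noise; sigma grows with severity (assumed scale)."""
    sigma = 50.0 * level / max_level
    noisy = image.astype(np.float64) + np.random.normal(0.0, sigma, image.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)


def camera_motion_blur(image, level):
    """D6: uniform horizontal motion blur; streak length grows with severity (assumed)."""
    length = 1 + 2 * level  # odd kernel size, up to 21 px at level 10
    kernel = np.zeros((length, length))
    kernel[length // 2, :] = 1.0 / length  # horizontal streak
    return cv2.filter2D(image, -1, kernel)


# Example: a mid-severity noisy image with strong camera motion blur.
img = cv2.imread("example.jpg")  # hypothetical input path
distorted = camera_motion_blur(add_awgn(img, level=4), level=8)
```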
Table 2: Distortion characteristics.

Distortion | Category | Scene type influence | Depth influence | Object type influence |
---|---|---|---|---|
Image compression | Global/Acquisition | No | No | No |
Noise | Global/Acquisition | No | No | No |
Contrast changing | Global/Atmospheric | Yes | Yes | No |
Rain | Global/Atmospheric | Yes | Yes | No |
Haze | Global/Atmospheric | Yes | Yes | No |
Motion blur | Global/Camera conditions | No | No | No |
Defocus blur | Global/Camera conditions | No | No | No |
Local backlight illumination | Local/Scene conditions | No | Yes | No |
Local motion blur | Local/Scene conditions | No | Yes | Yes |
Local defocus blur | Local/Scene conditions | No | Yes | No |
The following team will run the challenge session:

University Paris Saclay, France
- Ayman Beghdadi, PhD candidate
- Malik Mallem, Professor
- Lotfi Beji, Associate Professor

Norwegian University of Science and Technology (NTNU), Norway
- Faouzi Alaya Cheikh, Professor
- Mohib Ullah, Postdoc Fellow
- Adane N. Tarekegn, Postdoc Fellow

University Sorbonne Paris Nord, France

Milestone | Date |
---|---|
Registration opening | January 22, 2023 |
Training data available | January 31, 2023 |
Testing data available | March 1, 2023 |
Challenge paper submission (optional) | April 26, 2023 |
Solutions/codes submission | April 30, 2023 |
Challenge paper acceptance notification | June 21, 2023 |
Camera-ready submission of accepted challenge papers (optional) | July 5, 2023 |
Announcement of the winners | At IEEE ICIP 2023 |
References

[1] A. Beghdadi, M. Mallem, and L. Beji, "Benchmarking performance of object detection under image distortions in an uncontrolled environment," in 2022 IEEE International Conference on Image Processing (ICIP), IEEE, 2022.
[2] A. Beghdadi, M. A. Qureshi, B. E. Dakkar, H. H. Gillani, Z. A. Khan, M. Kaaniche, M. Ullah, and F. Alaya Cheikh, "A new video quality assessment dataset for video surveillance applications," in 2022 IEEE International Conference on Image Processing (ICIP), pp. 1521-1525, IEEE, 2022.
[3] I. Bezzine, Z. A. Khan, A. Beghdadi, N. Almaadeed, M. Kaaniche, S. Almaadeed, A. Bouridane, and F. Alaya Cheikh, "Video quality assessment dataset for smart public security systems," in Proceedings of the 23rd IEEE INMIC, Bahawalpur, Pakistan, November 5-7, 2020.
[4] C. Michaelis, B. Mitzkus, R. Geirhos, E. Rusak, O. Bringmann, A. S. Ecker, et al., "Benchmarking robustness in object detection: Autonomous driving when winter is coming," arXiv preprint arXiv:1907.07484, 2019.
[5] D. Hendrycks and T. Dietterich, "Benchmarking neural network robustness to common corruptions and perturbations," arXiv preprint arXiv:1903.12261, 2019.
[6] T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, et al., "Microsoft COCO: Common objects in context," in European Conference on Computer Vision (ECCV), pp. 740-755, Springer, 2014.