Packed Grocery Items Dataset

1. Overview

The Packed Grocery Items Dataset is designed to support the development of computer vision models for identifying, categorizing, and assessing the quality of grocery items.

Total Images: 9,540
Classes: 49 unique grocery items
Captured Conditions: Multiple backgrounds, varying lighting, and mixed object configurations

2. Dataset Structure

The dataset is divided into three subsets to ensure effective training, validation, and testing:

Training Set: 88% (8,349 images)

Validation Set: 8% (793 images)

Test Set: 4% (398 images)
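The stated percentages are rounded. As a quick sanity check, the sketch below recomputes each split's share from the image counts listed above (roughly 87.5%, 8.3%, and 4.2%) and confirms that the three subsets add up to the 9,540 total. The dictionary keys and function name are illustrative only and are not part of the dataset.

```python
# Split sizes as stated in the dataset description.
SPLITS = {"train": 8349, "validation": 793, "test": 398}
TOTAL_IMAGES = 9540


def split_percentages(splits, total):
    """Return each split's share of the total as a percentage."""
    return {name: 100.0 * count / total for name, count in splits.items()}


shares = split_percentages(SPLITS, TOTAL_IMAGES)

# The three subsets should account for every image exactly once.
assert sum(SPLITS.values()) == TOTAL_IMAGES

for name, pct in shares.items():
    print(f"{name}: {pct:.2f}%")
```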

[Figure: Top 10 Product Distribution (distribution of the top 10 product categories)]

[Figure: Dataset Split (train/validation/test sets)]

Class Name	Total Count	Training	Validation	Test	Percentage

3. Data Preprocessing

The following preprocessing steps were applied to all images:

Auto-Orient: Corrected image orientation.
Resize: Stretched all images to 640x640 pixels.
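Because the resize step stretches images rather than padding them, the horizontal and vertical scale factors differ whenever the source image is not square, and any bounding box drawn on the original image has to be scaled the same way. The following is a minimal sketch of that mapping, assuming boxes are given as (x_min, y_min, x_max, y_max) in pixels. The function name and the 1280x960 source size are hypothetical and chosen only for illustration.

```python
TARGET_SIZE = 640  # Output width and height after the stretch resize.


def stretch_box(box, orig_width, orig_height, target=TARGET_SIZE):
    """Map a pixel box from the original image onto the stretched image.

    box is (x_min, y_min, x_max, y_max). X coordinates scale by
    target / orig_width and y coordinates by target / orig_height.
    """
    x_min, y_min, x_max, y_max = box
    return (
        x_min * target / orig_width,
        y_min * target / orig_height,
        x_max * target / orig_width,
        y_max * target / orig_height,
    )


# Hypothetical 1280x960 source image with a single box.
scaled = stretch_box((320, 240, 960, 720), orig_width=1280, orig_height=960)
print(scaled)
```

A box that was wider than tall in the source image becomes square here, which shows how stretching changes object aspect ratios.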

4. Data Augmentation

To increase data variability and robustness during model training, the following augmentations were applied:

Effect	Details
Flip	Horizontal and vertical flips
Rotation	Random rotation: -45° to +45°; 90° rotations: clockwise, counter-clockwise, and upside down
Shear	Horizontal: ±13°; Vertical: ±14°
Color Adjustments	Saturation: ±28%; Brightness: ±13%; Exposure: ±14%
Blur	Up to 2.5px
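To make the ranges in the table concrete, the sketch below draws one random set of augmentation parameters within those limits. It is not the pipeline that produced the dataset: the key names, the uniform sampling, the 50% flip probability, and the "none" option for 90° rotations are all assumptions made for illustration.

```python
import random

# Ranges copied from the augmentation table. Key names are hypothetical.
AUGMENTATION_RANGES = {
    "rotation_deg": (-45.0, 45.0),
    "shear_horizontal_deg": (-13.0, 13.0),
    "shear_vertical_deg": (-14.0, 14.0),
    "saturation_pct": (-28.0, 28.0),
    "brightness_pct": (-13.0, 13.0),
    "exposure_pct": (-14.0, 14.0),
    "blur_px": (0.0, 2.5),
}

# "none" is an assumed option meaning no 90-degree rotation is applied.
ROTATE_90_CHOICES = ["none", "clockwise", "counter_clockwise", "upside_down"]


def sample_augmentation(rng):
    """Draw one parameter set, sampling each range uniformly (an assumption)."""
    params = {
        name: rng.uniform(low, high)
        for name, (low, high) in AUGMENTATION_RANGES.items()
    }
    # Flip probability of 50% is assumed; the source does not state it.
    params["horizontal_flip"] = rng.random() < 0.5
    params["vertical_flip"] = rng.random() < 0.5
    params["rotate_90"] = rng.choice(ROTATE_90_CHOICES)
    return params


params = sample_augmentation(random.Random(0))
for name, value in params.items():
    print(name, value)
```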

5. Annotation Details

All images were annotated with bounding boxes for the 49 product classes. The annotations provide:

Class Labels: Each object's product name.
Bounding Box Coordinates: The location of each annotated object in the image.
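As an illustration of how a class label and box coordinates can be combined into a single record, the sketch below converts a pixel-coordinate box into the normalized text line used by YOLO-style label files (class index, box center, then width and height, each divided by the image size). The BoxAnnotation type, its field names, and the example class index are hypothetical, since the dataset's actual annotation schema is not described here.

```python
from dataclasses import dataclass

IMAGE_SIZE = 640  # Images are 640x640 after preprocessing.


@dataclass
class BoxAnnotation:
    """Hypothetical in-memory record for one annotated object."""
    class_id: int
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def to_yolo_line(box, image_size=IMAGE_SIZE):
    """Format a box as 'class cx cy w h' with coordinates normalized to [0, 1]."""
    cx = (box.x_min + box.x_max) / 2 / image_size
    cy = (box.y_min + box.y_max) / 2 / image_size
    width = (box.x_max - box.x_min) / image_size
    height = (box.y_max - box.y_min) / image_size
    return f"{box.class_id} {cx:.6f} {cy:.6f} {width:.6f} {height:.6f}"


# Example: a 320x320 pixel box centered in the image, with assumed class index 3.
line = to_yolo_line(BoxAnnotation(3, 160, 160, 480, 480))
print(line)
```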

6. Collection Methodology

The dataset was curated with the following considerations:

Environmental Variety: Images were collected in diverse conditions, including different backgrounds and lighting.
Object Variability: Includes individual products and mixed configurations to enhance model robustness.

7. Technical Specifications

Image Format: JPEG/PNG
Resolution: 640x640 pixels (after resizing)
Annotations: JSON or XML format, compatible with common object detection tooling such as YOLO, TensorFlow, and PyTorch.

8. Dataset Usage

This dataset can be used for:

Object Detection: Identifying and localizing products within images.
Classification: Categorizing grocery items into one of the 49 classes.
Model Training and Testing: The predefined training, validation, and test splits support consistent model evaluation.

9. Example Visualizations

Below are sample images with annotations, highlighting the bounding boxes for each detected object.

10. Dataset Challenges

Lighting Conditions: Varying light intensities may affect model predictions.
Mixed Objects: Some images include overlapping objects, posing challenges for detection models.
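One common way to quantify how much two boxes overlap is intersection over union (IoU), which detection pipelines typically use both for matching predictions to ground truth and for suppressing duplicate detections. The sketch below is a minimal version, assuming boxes are (x_min, y_min, x_max, y_max) tuples in pixels. The function name and the example boxes are illustrative and not taken from the dataset.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    # Corners of the overlapping rectangle, if any.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp to zero when the boxes do not overlap.
    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    intersection = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection

    return intersection / union if union > 0 else 0.0


# Two hypothetical 100x100 boxes offset by 50 pixels in each direction.
overlap = iou((0, 0, 100, 100), (50, 50, 150, 150))
print(f"IoU: {overlap:.4f}")
```

Heavily overlapping products yield high IoU between distinct objects, which is where duplicate suppression can mistakenly discard a valid detection.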

11. Conclusion

The Packed Grocery Items Dataset provides 9,540 annotated images across 49 classes for training deep learning models that identify grocery items under realistic conditions, including varied backgrounds, lighting, and object configurations. Developers can use it to build object detection and classification systems for grocery products.