Object detection, category classification and material classification for IPALM

Part of the IPALM project, this repository fuses a MobileNetV3 trained on the smaller MINC2500 dataset with the default Detectron2 instance segmentor trained on COCO, ShopVRB, YCB and a few custom images.

The repository for training MobileNet on MINC2500 is here (gitlab).
The repository for creating and evaluating the test data for this project, which produces the confusion matrix from which the precision is calculated, is here (github).
Files added for the ipalm project, such as the material classification script, are located in ipalm/.

TODO

remove personal references => remove Andrej ;-;
add examples of how an input image is processed
clarify the diagrams
formalize the text
list the materials that are used from MINC
make a backup of this repo and make a fork of it from the original detectron2 repository
etc

Prerequisites

Tested on Ubuntu and Debian
Requires Linux with CUDA
Versions of packages used:
- torchvision/0.9.1-fosscuda-2019b-PyTorch-1.8.0
- OpenCV/3.4.8-fosscuda-2019b-Python-3.7.4
- scikit-learn/0.21.3-fosscuda-2019b-Python-3.7.4
- scikit-image/0.16.2-fosscuda-2019b-Python-3.7.4
- and dependencies

How to install:

Go to some folder A and run: git clone https://github.com/Hartvi/Detectron2-mobilenet
- This will create a folder called `Detectron2-mobilenet`.
Rename Detectron2-mobilenet to detectron2: mv Detectron2-mobilenet detectron2
In folder A, run: python -m pip install -e detectron2 --user
???
Profit 1 billion dollars

All in one: git clone https://github.com/Hartvi/Detectron2-mobilenet && mv Detectron2-mobilenet detectron2 && python -m pip install -e detectron2 --user

Short demo:

from os import listdir
from os.path import isfile, join

# assumption: CatmatPredictor and quick_plot_bboxes are defined in andrej_logic
from detectron2.andrej_logic import CatmatPredictor, quick_plot_bboxes

megapredictor = CatmatPredictor(threshold=0.6)
# folder with images: "images_input/[some_images]"
input_imgs = [join("images_input", f) for f in listdir("images_input") if isfile(join("images_input", f))]
# CatmatPredictor.get_andrej(raw_image[arr]/image_path[str]) returns a list of dictionaries
for inp_img in input_imgs:
    # this is a list of dicts in andrej format, see ipalm/andrej_output_format
    # optional argument: output_target="your_file_name.json" to save the dicts in json format
    predictions = megapredictor.get_andrej(inp_img)
    # plot the predicted bounding boxes over the input image:
    quick_plot_bboxes(predictions, inp_img)

Output interface

The outputs are processed further downstream, where both the precision and the probabilities of each prediction are required.

The COCO style dataset compiled during the VIR project has labels in the format integer: "category - material" in human readable terms. However, the MINC materials do not exactly correspond to the materials from the COCO-style datasets, so we selected 8 materials from MINC to use for this project.

The required precision is then calculated from the confusion matrix obtained by running the networks on the test dataset that was used to train the basic Detectron2 for VIR. The first row and column of the matrix are ignored because they correspond to the cases where the bounding boxes did not contain any object in the image.
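As a rough illustration of the calculation described above, the following sketch computes a per-class precision from a confusion matrix after dropping the first row and column. The function name `per_class_precision` is hypothetical, and the matrix orientation (rows are ground truth, columns are predictions, as in scikit-learn) is an assumption that is not stated in this README; the evaluation repository may do this differently.

```python
def per_class_precision(confusion):
    """Per-class precision from a square confusion matrix (list of lists).

    Assumptions (not stated in the README): rows are ground-truth classes,
    columns are predicted classes, and index 0 is the "no object" class.
    """
    # Drop the first row and column (bounding boxes that contained no object).
    trimmed = [row[1:] for row in confusion[1:]]
    precisions = []
    for j in range(len(trimmed)):
        predicted_as_j = sum(row[j] for row in trimmed)  # column sum
        correct = trimmed[j][j]                          # diagonal entry
        # A class that was never predicted gets precision 0.0 in this sketch.
        precisions.append(correct / predicted_as_j if predicted_as_j else 0.0)
    return precisions
```

With the orientation assumed here, each value is the fraction of predictions of a class that were correct, counted only over boxes that contained an object.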

Information flow

For a summary of the contents of the files added for the ipalm project, see the ipalm/README.md

The high-level structure of the project is as follows. The input image is fed into Detectron2, which first locates objects of interest, and its output data is saved. The bounding boxes obtained from the first pass are then extracted and fed into Detectron2 again and also into MobileNet.

Figure 1. Information flow in project structure.
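The flow in Figure 1 can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in ipalm/: `process_image`, `crop`, the `detect` and `classify_material` callables and the `"bbox"` key are all hypothetical stand-ins for the Detectron2 and MobileNet wrappers, and boxes are assumed to be `(x0, y0, x1, y1)` pixel coordinates.

```python
import numpy as np


def crop(image: np.ndarray, box) -> np.ndarray:
    """Cut the (x0, y0, x1, y1) box out of an H x W x C image array."""
    x0, y0, x1, y1 = (int(round(v)) for v in box)
    return image[y0:y1, x0:x1]


def process_image(image, detect, classify_material):
    """Two-pass flow: detect, crop each box, then re-detect and classify material.

    `detect` and `classify_material` are hypothetical callables standing in
    for the Detectron2 and MobileNet models; `detect` is assumed to return a
    list of dicts that each contain a "bbox" entry.
    """
    results = []
    for first in detect(image):              # first pass: locate objects
        patch = crop(image, first["bbox"])
        results.append({
            "first_pass": first,                      # saved first-pass output
            "second_pass": detect(patch),             # Detectron2 again, on the crop
            "material": classify_material(patch),     # MobileNet on the same crop
        })
    return results
```

Passing the models in as callables keeps the sketch independent of how the predictors are actually constructed in this repository.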

The following picture explains how categories are weighted. Each bounding box passes through Detectron2 twice in total: the bounding box from the first pass is fed back into Detectron2 to obtain further, nested bounding boxes. The weight of the class initially detected by Detectron2 is then area_of_first_bbox*first_probability + area_of_nested_bbox*second_probability. The weight of every other class i is simply area_of_nested_bbox_of_class_i*second_probability_of_class_i.

Figure 2. Category and material probability calculation.
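The category weighting described above can be written out as a short sketch. The names `bbox_area` and `category_weights` and the `(class, box, probability)` tuple layout are hypothetical, boxes are assumed to be `(x0, y0, x1, y1)`, and summing the contributions when several nested boxes share a class is an assumption; the text only gives the formula for one nested box per class.

```python
from collections import defaultdict


def bbox_area(box):
    """Area of an (x0, y0, x1, y1) box; degenerate boxes count as zero."""
    x0, y0, x1, y1 = box
    return max(0.0, x1 - x0) * max(0.0, y1 - y0)


def category_weights(first_pass, second_pass):
    """Area-weighted class scores from the two Detectron2 passes.

    first_pass:  one (class_name, box, probability) tuple from the first pass.
    second_pass: list of such tuples for the nested boxes of the second pass.
    """
    weights = defaultdict(float)
    cls, box, prob = first_pass
    weights[cls] += bbox_area(box) * prob        # area_of_first_bbox * first_probability
    for cls, box, prob in second_pass:
        weights[cls] += bbox_area(box) * prob    # area_of_nested_bbox * second_probability
    return dict(weights)
```

For example, a first-pass "mug" box of area 100 with probability 0.9 and a nested "mug" box of area 64 with probability 0.5 give the class "mug" a weight of 100*0.9 + 64*0.5 = 122. The weights are left unnormalized here because the text does not say how they are turned into probabilities.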

Making of this project

The base of the project is the instance segmentation backbone of the detectron2 framework by facebookresearch. Initially we tried to add a material segmentation/classification head to the instance segmentation network; however, that proved exceedingly confusing because of the structure of the project. See the gitlab progress log for details on why it was infeasible (for me) given the time constraints. In short, detectron2 is structured in such a way that I could not find any modifications of it that were not made by Facebook employees.

We ended up using just the default instance segmentor from VIR, retrained because something changed internally in a newer version of either detectron2 or PyTorch. The bounding boxes gathered from detectron2 are then fed back into detectron2, this time used as a category classifier, and into a MobileNetV3 material classifier trained on the MINC2500 dataset.

Encountered challenges

1. Detectron2 project structure

I spent about 3 work weeks trying to add a ROIHead to detectron2 that would also classify the material properties of the object inside the bounding box; however, adding a head to detectron2 required a non-trivial modification of the project. Neither the IDEs nor searching through the raw text helped much in determining where the program was being executed. The relevant pieces are scattered across the project: a rough ROIHead definition in a yaml file, which calls the python class by its string name; some configuration in the python script; some configuration that is completely hidden; the actual ROIHead code as a python class; backprop functions; dataset processing and formatting; nesting the classification into the bounding box; etc.

2. Dataset formatting

The dataset formatting in COCO was done by Michal Pliska in the VIR subject. Formatting MINC2500 raised only one minor issue: 2 of the 57500 images were single-channel black and white, which crashed the training seemingly at random.
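One way to guard against such single-channel images is to normalize every image to three channels at load time. The sketch below is an assumption about how this could be handled, not the fix actually used for this project; `ensure_three_channels` is a hypothetical helper and images are assumed to be NumPy arrays in H x W or H x W x C layout.

```python
import numpy as np


def ensure_three_channels(image: np.ndarray) -> np.ndarray:
    """Return an H x W x 3 array, replicating a single grey channel if needed."""
    if image.ndim == 2:
        # H x W greyscale: add an explicit channel axis first.
        image = image[:, :, np.newaxis]
    if image.shape[2] == 1:
        # Copy the single channel three times so the network sees an RGB-shaped input.
        image = np.repeat(image, 3, axis=2)
    return image
```

Applying such a check to every sample in the dataset loader would make the two greyscale images behave like the rest, instead of failing only when one of them happens to be drawn into a batch.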

License

Detectron2 is released under the Apache 2.0 license.

Citing Detectron2

If you use Detectron2 in your research or wish to refer to the baseline results published in the Model Zoo, please use the following BibTeX entry.

@misc{wu2019detectron2,
  author =       {Yuxin Wu and Alexander Kirillov and Francisco Massa and
                  Wan-Yen Lo and Ross Girshick},
  title =        {Detectron2},
  howpublished = {\url{https://github.com/facebookresearch/detectron2}},
  year =         {2019}
}
