Commit bcc505b7 authored by 李聪聪

reconstruction

parent 20ad751a
+3 −1
@@ -28,3 +28,5 @@ dist/
# project dirs
/datasets
/models

.DS_Store
+21 −0
@@ -33,7 +33,28 @@ backbone | type | lr sched | im / gpu | train mem(GB) | train time (s/iter) | to
-- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
R-50-FPN | Keypoint | 1x | 2 | 5.7 | 0.3771 | 9.4 | 0.10941 | 53.7 | 64.3 | 9981060

### Light-weight Model baselines

We provide pre-trained models for selected FBNet models.
* All the models are trained from scratch with BN, using the training schedule specified below.
* Evaluation is performed on a single NVIDIA V100 GPU with `MODEL.RPN.POST_NMS_TOP_N_TEST` set to `200`.

The following inference time is reported:
  * inference total batch=8: Total inference time including data loading, model inference and pre/post processing, using 8 images per batch.
  * inference model batch=8: Model inference time only and using 8 images per batch.
  * inference model batch=1: Model inference time only and using 1 image per batch.
  * inference caffe2 batch=1: Model inference time for the model in Caffe2 format using 1 image per batch. The Caffe2 models fuse BN into Conv and run purely on C++/CUDA, using Caffe2 ops for rpn/detection post processing.
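The BN-to-Conv fusion mentioned above is a closed-form rewrite of the convolution weights and bias. The sketch below illustrates the identity generically; it is not the repo's actual export code:

```python
import numpy as np

# Folding BatchNorm into the preceding Conv:
#   bn(conv(x)) = gamma * (W*x + b - mean) / sqrt(var + eps) + beta
# is equivalent to a single Conv with rescaled weights and a shifted bias.
def fuse_conv_bn(w, b, gamma, beta, mean, var, eps=1e-5):
    scale = gamma / np.sqrt(var + eps)          # per-output-channel scale
    w_fused = w * scale[:, None, None, None]    # rescale each output filter
    b_fused = (b - mean) * scale + beta         # fold mean/shift into the bias
    return w_fused, b_fused
```

At inference time this removes the BatchNorm op entirely, which contributes to the lower Caffe2 latencies reported in the table.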

The pre-trained models are available via the links in the model id column.

backbone | type | resolution | lr sched | im / gpu | train mem(GB) | train time (s/iter) | total train time (hr) | inference total batch=8 (s/im) | inference model batch=8 (s/im) | inference model batch=1 (s/im) | inference caffe2 batch=1 (s/im) | box AP | mask AP | model id
-- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
[R-50-C4](configs/e2e_faster_rcnn_R_50_C4_1x.yaml) (reference) | Fast | 800 | 1x | 1 | 5.8 | 0.4036 | 20.2 | 0.0875 | **0.0793** | 0.0831 | **0.0625** | 34.4 | - | f35857197
[fbnet_chamv1a](configs/e2e_faster_rcnn_fbnet_chamv1a_600.yaml) | Fast | 600 | 0.75x | 12 | 13.6 | 0.5444 | 20.5 | 0.0315 | **0.0260** | 0.0376 | **0.0188** | 33.5 | - | [f100940543](https://download.pytorch.org/models/maskrcnn/e2e_faster_rcnn_fbnet_chamv1a_600.pth)
[fbnet_default](configs/e2e_faster_rcnn_fbnet_600.yaml) | Fast | 600 | 0.5x | 16 | 11.1 | 0.4872 | 12.5 | 0.0316 | **0.0250** | 0.0297 | **0.0130** | 28.2 | - | [f101086388](https://download.pytorch.org/models/maskrcnn/e2e_faster_rcnn_fbnet_600.pth)
[R-50-C4](configs/e2e_mask_rcnn_R_50_C4_1x.yaml) (reference) | Mask | 800 | 1x | 1 | 5.8 | 0.452 | 22.6 | 0.0918 | **0.0848** | 0.0844 | - | 35.2 | 31.0 | f35858791
[fbnet_xirb16d](configs/e2e_mask_rcnn_fbnet_xirb16d_dsmask_600.yaml) | Mask | 600 | 0.5x | 16 | 13.4 | 1.1732 | 29 | 0.0386 | **0.0319** | 0.0356 | - | 30.7 | 26.9 | [f101086394](https://download.pytorch.org/models/maskrcnn/e2e_mask_rcnn_fbnet_xirb16d_dsmask.pth)
[fbnet_default](configs/e2e_mask_rcnn_fbnet_600.yaml) | Mask | 600 | 0.5x | 16 | 13.0 | 0.9036 | 23.0 | 0.0327 | **0.0269** | 0.0385 | - | 29.0 | 26.1 | [f101086385](https://download.pytorch.org/models/maskrcnn/e2e_mask_rcnn_fbnet_600.pth)

## Comparison with Detectron and mmdetection

+69 −7
# Data Priming Network for Automatic Check-Out

## Introduction
@@ -18,14 +17,71 @@ visual item tallying network.

![DPNet](demo/DPNet.png)

## Installation

Check [INSTALL.md](INSTALL.md) for installation instructions.

## Inference

Run inference with the pre-trained models using the command below. Images annotated with boxes, labels and scores will
be saved to the `rpc_results` folder.

```bash
python demo/rpc_demo.py --config-file configs/e2e_faster_rcnn_R_101_FPN_1x_rpc_xxx.yaml --images_dir /path/to/test2019
```

## Prepare dataset

Use the `toolboxes` to extract masks, train a [Salient Object Detection](https://github.com/AceCoooool/DSS-pytorch)
model, and render with [CycleGAN](https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix). Then modify `maskrcnn_benchmark/config/paths_catalog.py`
so that the dataset paths are correct.
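For orientation, a catalog entry looks roughly like the sketch below. The directory and annotation file names here are placeholders, not the real layout; adapt them to your local copy of `maskrcnn_benchmark/config/paths_catalog.py`:

```python
# Hypothetical excerpt of maskrcnn_benchmark/config/paths_catalog.py — the
# img_dir / ann_file values below are assumed placeholders.
class DatasetCatalog:
    DATA_DIR = "datasets"
    DATASETS = {
        "rpc_2019_train_render": {
            "img_dir": "rpc/train2019_render",
            "ann_file": "rpc/annotations/instances_train2019_render.json",
        },
        "rpc_2019_val": {
            "img_dir": "rpc/val2019",
            "ann_file": "rpc/annotations/instances_val2019.json",
        },
    }
```

The dataset names used here must match the `DATASETS.TRAIN` / `DATASETS.TEST` tuples in the config files.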

## Single GPU training

Most of the configuration files that we provide assume that we are running on 4 GPUs.
In order to run them on fewer GPUs, there are a few possibilities:

**1. Run the following without modifications**

```bash
python tools/train_net.py --config-file "/path/to/config/file.yaml"
```
This should work out of the box and is very similar to what we would do for multi-GPU training.
The drawback is that it will use much more GPU memory: the configuration files specify a
global batch size that is divided over the number of GPUs, so with only a single GPU the
per-GPU batch size will be 8x larger, which might lead to out-of-memory errors.

If you have a lot of memory available, this is the easiest solution.

**2. Modify the cfg parameters**

If you experience out-of-memory errors, you can reduce the global batch size. But this means that
you'll also need to change the learning rate, the number of iterations and the learning rate schedule.

Here is an example for Mask R-CNN R-50 FPN with the 1x schedule:
```bash
python tools/train_net.py --config-file "configs/e2e_mask_rcnn_R_50_FPN_1x.yaml" SOLVER.IMS_PER_BATCH 2 SOLVER.BASE_LR 0.0025 SOLVER.MAX_ITER 720000 SOLVER.STEPS "(480000, 640000)" TEST.IMS_PER_BATCH 1
```
This follows the [scheduling rules from Detectron](https://github.com/facebookresearch/Detectron/blob/master/configs/getting_started/tutorial_1gpu_e2e_faster_rcnn_R-50-FPN.yaml#L14-L30).
Note that we have multiplied the number of iterations by 8x (as well as the learning rate schedules),
and we have divided the learning rate by 8x.
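The linear scaling rule applied here can be sketched as a small helper. Assuming the usual 8-GPU 1x baseline of `BASE_LR 0.02`, 90k iterations and steps at 60k/80k (the standard maskrcnn-benchmark defaults), scaling by the batch-size factor k = 8 reproduces the values in the command above:

```python
# Linear scaling rule: shrinking the global batch size by a factor k means
# dividing the base learning rate by k and stretching the schedule by k.
def scale_schedule(base_lr, max_iter, steps, k):
    return base_lr / k, max_iter * k, tuple(s * k for s in steps)

lr, max_iter, steps = scale_schedule(0.02, 90000, (60000, 80000), 8)
# lr = 0.0025, max_iter = 720000, steps = (480000, 640000)
```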

We also changed the batch size during testing, but that is generally not necessary because testing
requires much less memory than training.


## Multi-GPU training
We use `torch.distributed.launch` internally in order to launch
multi-GPU training. This utility function from PyTorch spawns as many
Python processes as the number of GPUs we want to use, and each
process uses a single GPU.

```bash
export NGPUS=4
python -m torch.distributed.launch --nproc_per_node=$NGPUS tools/train_net.py --config-file "path/to/config/file.yaml"
```
## Results

![DPNet](demo/results.png)

@@ -35,3 +91,9 @@ Results
|   medium | Syn+Render (DPNet) | 80.68% | 97.38% | 0.32 | 0.03 | 98.07% | 77.25% |
|     hard | Syn+Render (DPNet) | 70.76% | 97.04% | 0.53 | 0.03 | 97.76% | 74.95% |
| averaged | Syn+Render (DPNet) | 80.51% | 97.33% | 0.34 | 0.03 | 97.91% | 77.04% |

## Citations
Please consider citing this project in your publications if it helps your research. The following is a BibTeX reference. The BibTeX entry requires the `url` LaTeX package.
```
TODO
```
+2 −3
@@ -22,11 +22,10 @@ MODEL:
    PREDICTOR: "FPNPredictor"
    NUM_CLASSES: 201
DATASETS:
  TRAIN: ("rpc_2019_train_render",)
  TEST: ("rpc_2019_val",)
DATALOADER:
  SIZE_DIVISIBILITY: 32
  ASPECT_RATIO_GROUPING: False
SOLVER:
  BASE_LR: 0.01
  WEIGHT_DECAY: 0.0001
@@ -36,4 +35,4 @@ SOLVER:
TEST:
  IMS_PER_BATCH: 4

OUTPUT_DIR: 'outputs_rpc_2019_train_render'
+3 −3
@@ -21,12 +21,12 @@ MODEL:
    FEATURE_EXTRACTOR: "FPN2MLPFeatureExtractor"
    PREDICTOR: "FPNPredictor"
    NUM_CLASSES: 201
  DENSITY_ON: True
DATASETS:
  TRAIN: ("rpc_2019_train_render_density_map",)
  TEST: ("rpc_2019_val",)
DATALOADER:
  SIZE_DIVISIBILITY: 32
  ASPECT_RATIO_GROUPING: False
SOLVER:
  BASE_LR: 0.01
  WEIGHT_DECAY: 0.0001
@@ -36,4 +36,4 @@ SOLVER:
TEST:
  IMS_PER_BATCH: 4

OUTPUT_DIR: 'outputs_rpc_2019_train_render_density_map'