A Dataset for multi-task learning of Monocular Depth Estimation and Mask Prediction. The data was generated by taking:
- ~100 random background images
- ~110 random foreground images + flipped copies of these originals (Total = 220)
- Overlaying these foreground images on each of the backgrounds at 20 different positions (Yup! sometimes you could see people flying)
- Using the DenseDepth model to generate depth maps of the overlaid images
MonoDepthMask Dataset consists of the following images:
- bg_image: Images with only a background, e.g. malls, classrooms, college outdoors, lobbies etc.
- fg_bg_images: Images with an object/person overlaid randomly on a background
- mask_images: Ground-truth masks of the foreground object/person.
- depth_images: Ground-truth depth maps generated from fg_bg_images.
- DepthMapDataSet.csv: CSV file for the dataset; it contains the following columns:

Column Name | Column Description |
---|---|
ImageName | fg_bg_image |
MaskName | mask_image |
Depthname | depth_image |
BGImageName | bg_image |
BaseImageFName | Zip file containing fg_bg_images and mask_images |
DepthImageFName | Zip file containing depth_images |
BGType | Class to which the bg_image belongs |
BGImageFName | Zip file containing bg_images |
ImageType | Count | Dimension | Channel Space | Channelwise Mean | Channelwise StdDev | Link |
---|---|---|---|---|---|---|
fg_bg_images | 484320 | 250x250x3 | RGB | [0.56632738, 0.51567622, 0.45670792] | [0.1076622, 0.10650349, 0.12808967] | https://github.com/rajy4683/MonoMaskDepth/blob/master/README.md#fg_bg_images-and-mask_images |
bg_images | 484320 | 250x250x3 | RGB | [0.57469445, 0.52241555, 0.45992244] | [0.11322354, 0.11195428, 0.13441683] | https://github.com/rajy4683/MonoMaskDepth/blob/master/README.md#bg_images |
mask_images | 484320 | 250x250x1 | Grayscale | [0.0579508] | [0.001662] | https://github.com/rajy4683/MonoMaskDepth/blob/master/README.md#fg_bg_images-and-mask_images |
depth_images | 484320 | 320x240x1 | Grayscale | [0.3679109] | [0.03551773] | https://github.com/rajy4683/MonoMaskDepth/blob/master/README.md#depth_images |
All the above data is indexed in the below CSVs:
- FullDataSet (~480K)
- Training Data (~340K)
- Test Data (~150K)
- Sample Data (500)
How to create transparent foreground images:
- Download PNG/JPG images of people with any background
- Upload individual images to https://www.remove.bg/upload
- Since I was using the free version, images had to be processed one at a time
- Download and save the transparent images
How to create masks for the above foreground images:
- Used a simple OpenCV-based conversion:

```python
import cv2
import numpy as np
# cv2_imshow is Colab's display helper: from google.colab.patches import cv2_imshow

def generate_mask(img, debug=False):
    # img is a 4-channel (RGBA) foreground image; any pixel whose
    # channels all fall within [lower_white, upper_white] becomes mask
    lower_white = np.array([1, 1, 1, 4])
    upper_white = np.array([255, 255, 255, 4])
    mask = cv2.inRange(img, lower_white, upper_white)
    if debug:
        cv2_imshow(img)
        cv2_imshow(mask)
    return mask
```
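For reference, the range check that `cv2.inRange` performs can be sketched in plain NumPy (a simplified illustration, not the repository's code; the sample array values are made up):

```python
import numpy as np

def in_range(img, lower, upper):
    """Return a uint8 mask: 255 where every channel of a pixel lies
    within [lower, upper] (inclusive), 0 elsewhere -- the same
    semantics as cv2.inRange."""
    within = np.logical_and(img >= lower, img <= upper).all(axis=-1)
    return (within * 255).astype(np.uint8)

# Tiny 1x2 RGBA image: first pixel inside the range, second outside
img = np.array([[[10, 10, 10, 4], [0, 0, 0, 0]]], dtype=np.uint8)
mask = in_range(img, np.array([1, 1, 1, 4]), np.array([255, 255, 255, 4]))
# mask -> [[255, 0]]
```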
How were fg images overlaid on bg images to create 20 variants:
- Please refer to this notebook for the end-to-end flow
- Primarily used albumentations to generate flipped images and to resize images to fit the background. Code can be found here
- The main advantage of albumentations is that it applies the same operation to masks/bboxes as well
- FG images were of size (125, 125) or (64, 64)
- A range of 20 random positions within ((0, height_bg - height_fg), (0, width_bg - width_fg)) was used to prevent images from being cropped at the edge. Code can be found here
- A CSV file with a tuple of every background with 40 positions (flipped + regular) was created.
- Slices of this CSV file were run in parallel on 4 Colab instances to generate the 4 files listed in this section.
- To overcome disk space and Colab file-handling issues:
  - All the input files were copied to the Colab instance's local directory at the start of the run
  - All the files generated were stored locally on the Colab instance
  - The files were later zipped and saved back to Google Drive
- Currently analyzing how to make this process faster and more streamlined
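The random-placement step above can be sketched with NumPy (a simplified illustration; the actual code uses albumentations and the notebooks linked above, so the function and variable names here are illustrative):

```python
import random
import numpy as np

def overlay_at_random(bg, fg, rng=random):
    """Paste fg onto a copy of bg at a random top-left corner chosen
    from (0, h_bg - h_fg) x (0, w_bg - w_fg), so the foreground never
    spills past the background's edges."""
    h_bg, w_bg = bg.shape[:2]
    h_fg, w_fg = fg.shape[:2]
    top = rng.randint(0, h_bg - h_fg)    # inclusive bounds, as in the text
    left = rng.randint(0, w_bg - w_fg)
    out = bg.copy()
    out[top:top + h_fg, left:left + w_fg] = fg
    return out, (top, left)

# 250x250 background, 64x64 foreground (sizes from the dataset description)
bg = np.zeros((250, 250, 3), dtype=np.uint8)
fg = np.full((64, 64, 3), 255, dtype=np.uint8)
composite, (top, left) = overlay_at_random(bg, fg)
```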
How did you create your depth images?
- The base model used was DenseDepth
- The test utilities were modified to handle the following:
- From the input zip files generated above, directly read ~300 images at a time
- Resize these images to 480x640 using albumentations
- Run the model on the inputs and save the output depth data with the 'plasma' cmap. This will be changed to grayscale.
- Similar to the step above, this step was also run on 4 Colab instances in parallel to generate the respective depth images
- Code for this handling can be found here
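Reading image bytes straight out of a zip archive without extracting it, as the modified test utilities do, can be sketched with Python's standard `zipfile` module (a minimal illustration with made-up file names, not the repository's code):

```python
import io
import zipfile

def iter_zip_images(zip_source, batch_size=300):
    """Yield lists of (name, raw_bytes) pairs straight from the archive,
    avoiding an on-disk extraction -- useful on Colab where disk space
    is tight."""
    with zipfile.ZipFile(zip_source) as zf:
        names = [n for n in zf.namelist() if n.endswith((".jpg", ".png"))]
        for i in range(0, len(names), batch_size):
            yield [(n, zf.read(n)) for n in names[i:i + batch_size]]

# Build a tiny in-memory archive to demonstrate
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    for k in range(5):
        zf.writestr(f"img_{k}.jpg", b"fake-image-bytes")
batches = list(iter_zip_images(buf, batch_size=2))
# 5 images with batch_size=2 -> batches of 2, 2, 1
```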
How did you calculate the mean and stddev?
- Code for the computation can be found in this notebook
- A PyTorch-based DepthDataset class was created
- This allows either PyTorch DataLoaders or plain iterators to be used over the entire dataset.
- Using Knuth's online algorithm (Welford's method), the mean and stddev were calculated over each channel of all the images.
- The dataset loading and iteration is currently very slow and will need to be improved drastically.
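The running-statistics computation (Knuth/Welford's single-pass algorithm) can be sketched in plain Python — a minimal single-channel version; the notebook linked above applies it per channel across the whole dataset:

```python
import math

class RunningStats:
    """Welford's online algorithm: numerically stable single-pass
    mean and standard deviation."""
    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0   # sum of squared deviations from the running mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def std(self):
        # population standard deviation
        return math.sqrt(self.m2 / self.n) if self.n else 0.0

stats = RunningStats()
for pixel in [0.2, 0.4, 0.6, 0.8]:
    stats.update(pixel)
# stats.mean -> 0.5
```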
- https://drive.google.com/file/d/1---bC2E22KCE7g6X0lqVaPZsaJu1sGHr/view?usp=sharing
- https://drive.google.com/file/d/1--mweX6AYhvQnCyUfRaEbWqFQEPHRcUL/view?usp=sharing
- https://drive.google.com/file/d/1EpcRuBvlXJP2t4GS5zuf5iEXFbpnYlk5/view?usp=sharing
- https://drive.google.com/file/d/1ctsr5LOe3-P6SZfV_U5NFTqDTn126V8c/view?usp=sharing
- https://drive.google.com/file/d/1-LlJX-As3b0IMBOLZ0Li_0qdlwu15TMQ/view?usp=sharing
- https://drive.google.com/file/d/1-CaGwfdNp9kDzAdVO_AUvL42_wicpwqm/view?usp=sharing
- https://drive.google.com/file/d/1-RkySbCztvrLrgfNc4a64L6JdA9tw1kW/view?usp=sharing
- https://drive.google.com/file/d/1-8dVuLds3_WiO1IC2MKgLUGAbmBrcDyS/view?usp=sharing