🎉 Yet another great project!!
fuqianya committed Jan 27, 2022
1 parent cd2c914 commit d12dceb
Showing 29 changed files with 2,028 additions and 9 deletions.
28 changes: 19 additions & 9 deletions README.md
@@ -44,15 +44,16 @@
</tr>
<tr align="center">
<td>Reproduced accuracy</td>
<td>89.8</td>
<td>98.8</td>
<td>99.7</td>
<td>78.2</td>
<td>95.8</td>
<td>98.3</td>
<td>90.4</td>
<td>98.5</td>
<td>99.8</td>
<td>78.1</td>
<td>96.2</td>
<td>98.2</td>
</tr>
</table>


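## 3. Dataset

This project uses the [COCO2014](https://cocodataset.org/) dataset. It contains 123,287 images, each paired with 5 captions. The training, validation, and test splits contain 113,287, 5,000, and 5,000 images respectively, together with their captions. This project uses pre-extracted `bottom-up` features, which can be downloaded from [here](https://biglmdiag.blob.core.windows.net/oscar/datasets/coco_ir.zip).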
@@ -83,11 +84,18 @@ cd Oscar-Paddle
pip install -r requirements.txt
```

### step3: Download the data
### step3: Mount the data

```bash
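# Download the dataset and features
bash ./download_dataset.sh
# The dataset has been uploaded to Aistudio
# Details: https://aistudio.baidu.com/aistudio/datasetdetail/124153

# The pretrained weights in paddle format have also been uploaded to Aistudio
# Details: https://aistudio.baidu.com/aistudio/datasetdetail/124186

# After downloading or mounting the dataset and the pretrained weights,
# update these parameters in the config files (configs/retrieval_train.yaml and configs/retrieval_test.yaml):
# DATASET.DATA_DIR (dataset directory) and PRETRAINED.DIR (path to the pretrained weights)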
```

### step4: Training
@@ -97,6 +105,8 @@ export PYTHONPATH=$PWD:$PYTHONPATH
CUDA_VISIBLE_DEVICES='0, 1, 2, 3' python -m paddle.distributed.launch tools/train_retrieval.py --cfg_file configs/retrieval_train.yaml
```

**Before running, manually update these parameters in the config file (configs/retrieval_train.yaml): DATASET.DATA_DIR (dataset path) and PRETRAINED.DIR (path to the pretrained weights).**

### step5: Testing

```bash
Empty file added config/__init__.py
Empty file.
Binary file added config/__pycache__/__init__.cpython-37.pyc
Binary file not shown.
Binary file added config/__pycache__/default.cpython-37.pyc
Binary file not shown.
84 changes: 84 additions & 0 deletions config/default.py
@@ -0,0 +1,84 @@
#! /usr/bin/env python
# -*- coding: utf-8 -*-

from yacs.config import CfgNode as CN

# Create a Node
__C = CN()

# ========================== INPUT =========================
__C.INPUT = CN()
__C.INPUT.BERT_MODEL = 'bert-base-uncased'
__C.INPUT.MAX_REGION = 50
__C.INPUT.MAX_SEQ_LEN = 70
__C.INPUT.IMG_FEATURE_DIM = 2054
__C.INPUT.IMG_FEATURE_TYPE = 'frcnn'
# Whether to add object detection labels as input
__C.INPUT.ADD_OD_LABEL = True
__C.INPUT.DO_LOWER_CASE = True
__C.INPUT.ATT_MASK_TYPE = 'CLR'
# Sample this number of captions for each image
__C.INPUT.NUM_CAPTIONS_PER_IMAGE_TRN = 5
__C.INPUT.NUM_CAPTIONS_PER_IMAGE_DEV = 5

# ========================== DATASET =========================
__C.DATASET = CN()
__C.DATASET.NAME = 'COCO'
__C.DATASET.DATA_DIR = ''
__C.DATASET.TRAIN = 'train'
__C.DATASET.DEV = 'minival'
__C.DATASET.TEST = 'test'

# ========================== OUTPUT =========================
__C.OUTPUT = CN()
__C.OUTPUT.SAVE_NAME = ''
# Save checkpoint frequency (epochs)
__C.OUTPUT.SAVE_FREQ = 1
__C.OUTPUT.NUM_LABELS = 2
__C.OUTPUT.CHECKPOINT_DIR = './exp'

# ========================== OPTIMIZATION =========================
__C.OPTIMIZATION = CN()
__C.OPTIMIZATION.LR = 1e-5
__C.OPTIMIZATION.EPSILON = 1e-8
__C.OPTIMIZATION.LOSS_TYPE = 'sfmx'
__C.OPTIMIZATION.BATCH_SIZE = 16
__C.OPTIMIZATION.WARMUP_STEPS = 0
__C.OPTIMIZATION.LR_SCHEDULER = 'linear'
__C.OPTIMIZATION.WEIGHT_DECAY = 0.05
__C.OPTIMIZATION.EPOCHS = 30
# Clip gradients at this value
__C.OPTIMIZATION.CLIP_MAX_NORM = 1.0
__C.OPTIMIZATION.OPTIMIZER = 'adamw'
# Gradient accumulation steps
__C.OPTIMIZATION.GRADIENT_ACCUMULATION_STEPS = 4

# ========================== MONITOR =========================
__C.MONITOR = CN()
# Print training log frequency (steps)
__C.MONITOR.PRINT_STEP = 100
# Evaluation frequency (epochs)
__C.MONITOR.EVAL_FREQ = 1

# ========================== PRETRAINED =========================
__C.PRETRAINED = CN()
__C.PRETRAINED.DIR = ''
__C.PRETRAINED.RESUME = ''

# ========================== EVAL =========================
__C.EVAL = CN()
__C.EVAL.CHECKPOINT_DIR = ''
__C.EVAL.EVAL_CROSS_IMAGE = False
__C.EVAL.EVAL_IMG_KEYS_FILE = ''
__C.EVAL.EVAL_CAPTION_INDEX_FILE = ''

# ========================== MISC =========================
__C.MISC = CN()
__C.MISC.SEED = 123
__C.MISC.NUM_WORKERS = 8


def get_cfg_defaults():
    """Get a yacs CfgNode object with default values."""
    # Return a clone so that the defaults will not be altered
    return __C.clone()
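
Reviewer note: the defaults above set `OPTIMIZATION.BATCH_SIZE = 16` and `OPTIMIZATION.GRADIENT_ACCUMULATION_STEPS = 4`, and the README launches training on four GPUs. The sketch below works out how many samples would feed a single optimizer update under two assumptions that this commit does not confirm: that `BATCH_SIZE` is counted per process, and that one process runs per visible GPU. The helper name `effective_batch_size` is hypothetical.

```python
# Values copied from config/default.py in this commit.
BATCH_SIZE = 16
GRADIENT_ACCUMULATION_STEPS = 4

# Assumption: one process per GPU listed in CUDA_VISIBLE_DEVICES='0, 1, 2, 3'.
NUM_GPUS = 4


def effective_batch_size(batch_size, accumulation_steps, num_gpus):
    """Samples contributing to one optimizer update.

    Assumes `batch_size` is a per-process value (not confirmed by the commit).
    """
    return batch_size * accumulation_steps * num_gpus


per_update = effective_batch_size(BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS, NUM_GPUS)
print(per_update)  # 256
```

If `BATCH_SIZE` turns out to be a global value instead, the GPU factor drops out and the result would be 64.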
22 changes: 22 additions & 0 deletions configs/retrieval_test.yaml
@@ -0,0 +1,22 @@
INPUT:
  NUM_CAPTIONS_PER_IMAGE_TRN: 5
  NUM_CAPTIONS_PER_IMAGE_DEV: 5

DATASET:
  NAME: 'COCO'
  DATA_DIR: 'coco_ir_paddle/'
  TRAIN: 'train'
  DEV: 'minival'
  TEST: 'test'

OUTPUT:
  SAVE_NAME: 'finetune_retrieval'
  NUM_LABELS: 2

OPTIMIZATION:
  BATCH_SIZE: 32

EVAL:
  CHECKPOINT_DIR: 'exp/finetune_retrieval_22Y_01M_02D_23H/checkpoint-30'
  EVAL_CROSS_IMAGE: True
  EVAL_IMG_KEYS_FILE: test_img_keys_1k.tsv
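
Reviewer note: this YAML file only lists the keys that differ from `config/default.py`; everything else keeps its default. With yacs this layering is normally done by calling `merge_from_file` on the node returned by `get_cfg_defaults()`. The sketch below imitates that behavior with plain dictionaries so it runs without yacs installed. It is an illustration of the override semantics, not the yacs implementation, and `merge_overrides` is a hypothetical helper. Only a subset of the keys is reproduced.

```python
import copy

# Subset of the defaults from config/default.py, written as plain dicts.
DEFAULTS = {
    "INPUT": {"NUM_CAPTIONS_PER_IMAGE_TRN": 5, "NUM_CAPTIONS_PER_IMAGE_DEV": 5},
    "OPTIMIZATION": {"BATCH_SIZE": 16, "EPOCHS": 30},
    "EVAL": {"CHECKPOINT_DIR": "", "EVAL_CROSS_IMAGE": False, "EVAL_IMG_KEYS_FILE": ""},
}

# Subset of configs/retrieval_test.yaml as it would look after YAML parsing.
TEST_OVERRIDES = {
    "OPTIMIZATION": {"BATCH_SIZE": 32},
    "EVAL": {"EVAL_CROSS_IMAGE": True, "EVAL_IMG_KEYS_FILE": "test_img_keys_1k.tsv"},
}


def merge_overrides(defaults, overrides):
    """Return a copy of `defaults` with `overrides` applied recursively.

    Unknown keys are rejected so that a typo in the YAML fails loudly.
    """
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if key not in merged:
            raise KeyError(f"unknown config key: {key}")
        if isinstance(value, dict) and isinstance(merged[key], dict):
            merged[key] = merge_overrides(merged[key], value)
        else:
            merged[key] = value
    return merged


cfg = merge_overrides(DEFAULTS, TEST_OVERRIDES)
print(cfg["OPTIMIZATION"]["BATCH_SIZE"])  # 32 (overridden by the YAML)
print(cfg["OPTIMIZATION"]["EPOCHS"])      # 30 (default kept)
```

Working on a deep copy mirrors the intent of `get_cfg_defaults()`, which returns a clone so the module-level defaults are never mutated.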
21 changes: 21 additions & 0 deletions configs/retrieval_train.yaml
@@ -0,0 +1,21 @@
INPUT:
  NUM_CAPTIONS_PER_IMAGE_TRN: 5
  NUM_CAPTIONS_PER_IMAGE_DEV: 20

DATASET:
  NAME: 'COCO'
  DATA_DIR: '/mnt/disk6T/Data/Research/Multi-Modal-Pretraining/2020-Oscar-ECCV/data/coco_ir_paddle/'
  TRAIN: 'train'
  DEV: 'minival'
  TEST: 'test'

OUTPUT:
  SAVE_NAME: 'finetune_retrieval'
  NUM_LABELS: 2

PRETRAINED:
  DIR: '/mnt/disk6T/Data/Research/Multi-Modal-Pretraining/2020-Oscar-ECCV/pretrained_model/paddle_version'

EVAL:
  EVAL_CAPTION_INDEX_FILE: 'minival_caption_indexs_top20.pd'
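
Reviewer note: `DATASET.DATA_DIR` and `PRETRAINED.DIR` in this file are absolute paths from the author's machine, which is why the README asks users to edit them before training. A small pre-flight check along the following lines could catch a stale path before a multi-GPU job starts. It is a sketch only: `find_missing_dirs` is a hypothetical helper that is not part of this commit, and the demo uses a temporary directory in place of the real dataset location.

```python
import os
import tempfile


def find_missing_dirs(paths):
    """Return the names of config entries whose directory does not exist."""
    return [name for name, path in paths.items() if not os.path.isdir(path)]


# Demo with a temporary directory standing in for the real dataset location.
with tempfile.TemporaryDirectory() as data_dir:
    paths = {
        "DATASET.DATA_DIR": data_dir,
        "PRETRAINED.DIR": os.path.join(data_dir, "does_not_exist"),
    }
    missing = find_missing_dirs(paths)

print(missing)  # ['PRETRAINED.DIR']
```

In a real run the dictionary would be filled from the loaded config, and a non-empty result would abort the launch with a message naming the entries to fix.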

Binary file not shown.