http://host.robots.ox.ac.uk/pascal/VOC/
PASCAL(Pattern Analysis, Statistical Modeling and Computational Learning)
VOC(Visual Object Classes)
Pascal VOC Challenges 2005-2012
Classification/Detection Competitions
- Classification
For each of the twenty classes, predicting presence/absence of an example of that class in the test image.
- Detection
Predicting the bounding box and label of each object from the twenty target classes in the test image.
Segmentation Competition
- Segmentation
Generating pixel-wise segmentations giving the class of the object visible at each pixel, or "background" otherwise.
Action Classification Competition
- Action Classification
Predicting the action(s) being performed by a person in a still image.
ImageNet Large Scale Visual Recognition Competition
To estimate the content of photographs for the purpose of retrieval and automatic annotation using a subset of the large hand-labeled ImageNet dataset (10,000,000 labeled images depicting 10,000+ object categories) as training.
Person Layout Taster Competition
- Person Layout
Predicting the bounding box and label of each part of a person (head, hands, feet).
Data
Folder hierarchy
VOC20XX
├─ Annotations
├─ ImageSets
├─ JPEGImages
├─ SegmentationClass
└─ SegmentationObject
- Annotations: XML files named after the original images in the JPEGImages folder; the ground-truth data
- ImageSets: lists that group images by purpose (test, train, trainval, val), including which images contain a given class
- JPEGImages: image files with the *.jpg extension; the input data
- SegmentationClass: label images for training semantic segmentation
- SegmentationObject: label images for training instance segmentation
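Because Annotations and JPEGImages share file names, an image and its ground truth can be paired from an image ID alone. The following is a minimal Python sketch, assuming the split lists are text files with one image ID per line stored under `ImageSets/Main/` (that subfolder, and the helper names `read_image_ids` and `paired_paths`, are assumptions for illustration and are not stated above):

```python
from pathlib import Path


def read_image_ids(split_file):
    """Return the image IDs listed in a split file, one ID per line."""
    ids = []
    for line in Path(split_file).read_text().splitlines():
        line = line.strip()
        if line:
            # Assumption: some lists carry extra columns after the ID
            # (for example a per-class flag); keep only the first column.
            ids.append(line.split()[0])
    return ids


def paired_paths(voc_root, image_ids):
    """Build (image path, annotation path) pairs that share the same file stem."""
    root = Path(voc_root)
    return [
        (root / "JPEGImages" / f"{i}.jpg", root / "Annotations" / f"{i}.xml")
        for i in image_ids
    ]
```

For example, `paired_paths("VOC20XX", read_image_ids("VOC20XX/ImageSets/Main/train.txt"))` would yield the training pairs if the files are laid out as assumed. The functions only build paths; they do not check that the files exist.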
XML file structure
<annotation>
<folder>VOC2012</folder>
<filename>2007_000027.jpg</filename>
<source>
<database>The VOC2007 Database</database>
<annotation>PASCAL VOC2007</annotation>
<image>flickr</image>
</source>
<size>
<width>486</width>
<height>500</height>
<depth>3</depth>
</size>
<segmented>0</segmented>
<object>
<name>person</name>
<pose>Unspecified</pose>
<truncated>0</truncated>
<difficult>0</difficult>
<bndbox>
<xmin>174</xmin>
<ymin>101</ymin>
<xmax>349</xmax>
<ymax>351</ymax>
</bndbox>
<part>
<name>head</name>
<bndbox>
<xmin>169</xmin>
<ymin>104</ymin>
<xmax>209</xmax>
<ymax>146</ymax>
</bndbox>
</part>
<part>
<name>hand</name>
<bndbox>
<xmin>278</xmin>
<ymin>210</ymin>
<xmax>297</xmax>
<ymax>233</ymax>
</bndbox>
</part>
<part>
<name>foot</name>
<bndbox>
<xmin>273</xmin>
<ymin>333</ymin>
<xmax>297</xmax>
<ymax>354</ymax>
</bndbox>
</part>
<part>
<name>foot</name>
<bndbox>
<xmin>319</xmin>
<ymin>307</ymin>
<xmax>340</xmax>
<ymax>326</ymax>
</bndbox>
</part>
</object>
</annotation>
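- size: width, height, and depth (channel count) of the image that corresponds to the XML file
  - width
  - height
  - depth
- segmented: whether the image has segmentation labels (0 = no, 1 = yes)
- object
  - name: class name
  - pose: viewing orientation of the object (e.g. Frontal, Rear, Left, Right), or Unspecified
  - truncated: 0 = the object is fully contained in the image, 1 = only partly contained
  - difficult: 0 = easy to recognize, 1 = difficult to recognize
  - bndbox
    - xmin: x coordinate of the top-left corner
    - ymin: y coordinate of the top-left corner
    - xmax: x coordinate of the bottom-right corner
    - ymax: y coordinate of the bottom-right corner
  - part: used only for person objects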
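The fields above map naturally onto a small dictionary. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`, assuming every object has a `bndbox` with all four coordinates; the function names `parse_annotation` and `_read_box` and the output layout are illustrative choices, not part of the dataset tooling:

```python
import xml.etree.ElementTree as ET


def _read_box(node):
    """Read the <bndbox> child of node as an (xmin, ymin, xmax, ymax) tuple of ints."""
    box = node.find("bndbox")
    # int(float(...)) also accepts coordinates written as decimals, e.g. "174.0".
    return tuple(int(float(box.findtext(k))) for k in ("xmin", "ymin", "xmax", "ymax"))


def parse_annotation(xml_text):
    """Parse the text of one annotation file into a plain dictionary."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    objects = []
    for obj in root.findall("object"):
        objects.append({
            "name": obj.findtext("name"),
            "pose": obj.findtext("pose"),
            "truncated": obj.findtext("truncated") == "1",
            "difficult": obj.findtext("difficult") == "1",
            "bndbox": _read_box(obj),
            # <part> entries are optional; the list is empty when none exist.
            "parts": [(p.findtext("name"), _read_box(p)) for p in obj.findall("part")],
        })
    return {
        "filename": root.findtext("filename"),
        "size": tuple(int(size.findtext(k)) for k in ("width", "height", "depth")),
        "segmented": root.findtext("segmented") == "1",
        "objects": objects,
    }
```

Passing the contents of an annotation file as a string returns the image size, the `segmented` flag, and one entry per object. The lookups use plain tag names, which match direct children only, so an object's own `name` and `bndbox` are not confused with those of its nested `part` elements.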