http://host.robots.ox.ac.uk/pascal/VOC/

 


PASCAL (Pattern Analysis, Statistical Modeling and Computational Learning)

VOC (Visual Object Classes)

 

PASCAL VOC Challenges 2005-2012

 

Classification/Detection Competitions

  1. Classification
    For each of the twenty classes, predicting presence/absence of an example of that class in the test image.
  2. Detection
    Predicting the bounding box and label of each object from the twenty target classes in the test image.

20 classes

Segmentation Competition

  • Segmentation
    Generating pixel-wise segmentations giving the class of the object visible at each pixel, or "background" otherwise.

Action Classification Competition

  • Action Classification
    Predicting the action(s) being performed by a person in a still image.

10 action classes + "other"

 

ImageNet Large Scale Visual Recognition Competition

To estimate the content of photographs for the purpose of retrieval and automatic annotation using a subset of the large hand-labeled ImageNet dataset (10,000,000 labeled images depicting 10,000+ object categories) as training.

 

 

Person Layout Taster Competition

  • Person Layout
    Predicting the bounding box and label of each part of a person (head, hands, feet).

 

Data

Folder hierarchy

VOC20XX
 ├─ Annotations
 ├─ ImageSets
 ├─ JPEGImages
 ├─ SegmentationClass
 └─ SegmentationObject
  • Annotations: xml files with the same names as the original images in the JPEGImages folder; the ground-truth data
  • ImageSets: image group information by purpose (test, train, trainval, val), including which images contain a given class
  • JPEGImages: image files with the *.jpg extension; the input data
  • SegmentationClass: label images for training semantic segmentation
  • SegmentationObject: label images for training instance segmentation
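
As a rough illustration of how these folders relate to one another, the sketch below maps one image ID to its files. The helper name `voc_paths` is hypothetical, and the `.png` extension for the segmentation labels is an assumption rather than something stated above.

```python
from pathlib import Path


def voc_paths(root, image_id):
    """Return the per-image file paths for one image ID (hypothetical helper)."""
    root = Path(root)
    return {
        # Ground-truth xml shares its base name with the image.
        "annotation": root / "Annotations" / f"{image_id}.xml",
        "image": root / "JPEGImages" / f"{image_id}.jpg",
        # Assumption: segmentation label images are stored as PNG files.
        "segmentation_class": root / "SegmentationClass" / f"{image_id}.png",
        "segmentation_object": root / "SegmentationObject" / f"{image_id}.png",
    }


paths = voc_paths("VOC2012", "2007_000027")
print(paths["annotation"].as_posix())
print(paths["image"].as_posix())
```

Keeping the lookup in one function means the image and its annotation cannot drift apart when the dataset root changes.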

XML file structure

<annotation>
  <folder>VOC2012</folder>
  <filename>2007_000027.jpg</filename>
  <source>
    <database>The VOC2007 Database</database>
    <annotation>PASCAL VOC2007</annotation>
    <image>flickr</image>
  </source>
  <size>
    <width>486</width>
    <height>500</height>
    <depth>3</depth>
  </size>
  <segmented>0</segmented>
  <object>
    <name>person</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>174</xmin>
      <ymin>101</ymin>
      <xmax>349</xmax>
      <ymax>351</ymax>
    </bndbox>
    <part>
      <name>head</name>
      <bndbox>
        <xmin>169</xmin>
        <ymin>104</ymin>
        <xmax>209</xmax>
        <ymax>146</ymax>
      </bndbox>
    </part>
    <part>
      <name>hand</name>
      <bndbox>
        <xmin>278</xmin>
        <ymin>210</ymin>
        <xmax>297</xmax>
        <ymax>233</ymax>
      </bndbox>
    </part>
    <part>
      <name>foot</name>
      <bndbox>
        <xmin>273</xmin>
        <ymin>333</ymin>
        <xmax>297</xmax>
        <ymax>354</ymax>
      </bndbox>
    </part>
    <part>
      <name>foot</name>
      <bndbox>
        <xmin>319</xmin>
        <ymin>307</ymin>
        <xmax>340</xmax>
        <ymax>326</ymax>
      </bndbox>
    </part>
  </object>
</annotation>
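  • size: width, height, and depth (number of channels) of the image that corresponds to the xml file
    • width
    • height
    • depth
  • segmented: whether the image has segmentation annotations (0 = no, 1 = yes)
  • object
    • name: class name
    • pose: viewing direction of the object (Unspecified in the example above)
    • truncated: 0 = the object is fully inside the image, 1 = only part of the object is inside the image
    • difficult: 0 = easy to recognize, 1 = difficult to recognize
    • bndbox
      • xmin: x coordinate of the top-left corner
      • ymin: y coordinate of the top-left corner
      • xmax: x coordinate of the bottom-right corner
      • ymax: y coordinate of the bottom-right corner
    • part: used only for person objects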
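
The fields above can be read with Python's standard `xml.etree.ElementTree` module. This is a minimal sketch, not an official loader: `SAMPLE_XML` is a shortened copy of the example annotation, `read_box` and `parse_annotation` are hypothetical helper names, and the code assumes that every coordinate is an integer.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example annotation (one object, one part).
SAMPLE_XML = """
<annotation>
  <filename>2007_000027.jpg</filename>
  <size><width>486</width><height>500</height><depth>3</depth></size>
  <segmented>0</segmented>
  <object>
    <name>person</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox><xmin>174</xmin><ymin>101</ymin><xmax>349</xmax><ymax>351</ymax></bndbox>
    <part>
      <name>head</name>
      <bndbox><xmin>169</xmin><ymin>104</ymin><xmax>209</xmax><ymax>146</ymax></bndbox>
    </part>
  </object>
</annotation>
"""


def read_box(node):
    """Read (xmin, ymin, xmax, ymax) from an element with a direct <bndbox> child."""
    box = node.find("bndbox")
    # Assumption: coordinates are stored as integers.
    return tuple(int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax"))


def parse_annotation(xml_text):
    """Parse one annotation into a plain dict (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    objects = []
    for obj in root.findall("object"):
        objects.append({
            "name": obj.findtext("name"),
            "truncated": obj.findtext("truncated") == "1",
            "difficult": obj.findtext("difficult") == "1",
            "bbox": read_box(obj),
            # <part> elements are listed only when the object has them.
            "parts": [(p.findtext("name"), read_box(p)) for p in obj.findall("part")],
        })
    return {
        "filename": root.findtext("filename"),
        "size": tuple(int(size.findtext(t)) for t in ("width", "height", "depth")),
        "objects": objects,
    }


ann = parse_annotation(SAMPLE_XML)
print(ann["size"], ann["objects"][0]["bbox"])
```

Because `find` and `findall` with a plain tag name only look at direct children, the object's own `bndbox` is never confused with the `bndbox` nested inside a `part`.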
