  • SPS
    Members: Free
    IEEE Members: $11.00
    Non-members: $15.00
Poster 11 Oct 2023

Fine-grained visual classification (FGVC), which aims to identify objects at the subcategory level, is challenging because intra-class differences are large while inter-class differences are subtle. To address these issues, this paper proposes CLIP-FG, a patch selection model for FGVC that builds on CLIP. Unlike the original CLIP, which matches text against whole images only, CLIP-FG computes the similarity between labels and individual image patches. The top-k image patches are selected and their indices are fed into a Vision Transformer, which uses them to pick out discriminative regions and improve fine-grained classification performance. Quantitative evaluations show that CLIP-FG performs competitively against mainstream methods.
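The patch selection step described above can be illustrated with a minimal sketch. It assumes that each image patch and the label text have already been embedded into the same vector space, and that cosine similarity is the scoring function; the abstract specifies neither. The function names, the toy 2-D embeddings, and the use of a single label vector are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_top_k_patches(
    patch_embeddings: Sequence[Sequence[float]],
    label_embedding: Sequence[float],
    k: int,
) -> List[int]:
    """Return the indices of the k patches most similar to the label.

    Assumption: similarity is cosine similarity in a shared embedding
    space. The paper may score and aggregate patches differently.
    """
    scores = [cosine_similarity(p, label_embedding) for p in patch_embeddings]
    # Rank patch indices from most to least similar to the label.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Toy example: four patch embeddings and one label embedding (made-up values).
patches = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7], [-1.0, 0.0]]
label = [1.0, 0.0]
top_indices = select_top_k_patches(patches, label, k=2)
print(top_indices)
```

In a full pipeline, the returned indices would tell the Vision Transformer which patch tokens to treat as discriminative regions; how CLIP-FG consumes them is not detailed in the abstract.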
