04 May 2020

Understanding the semantic meaning of content on the web through the lens of a taxonomy has many practical advantages. However, when building large-scale content classification systems, practitioners face unique challenges in finding the best ways to leverage the scale and variety of data available on internet platforms. We present lessons learned from building a content classification system for multiple document types at Facebook using Multi-modal Transformers. We empirically demonstrate the effectiveness of multi-lingual, multi-modal, and cross-document-type learning. We describe effective strategies for exploiting weakly supervised signals as a pre-training step and show that they lead to significant gains in downstream classification accuracy. We also discuss label collection schemes that help minimize the amount of noise in the collected data.
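
To make the multi-modal classification setup concrete, the sketch below shows one common way text tokens and pre-extracted image features can be fused in a single Transformer encoder for taxonomy classification. This is not Facebook's actual architecture; all dimensions, layer counts, and the [CLS]-style pooling are illustrative assumptions.

```python
# Minimal multi-modal Transformer classifier sketch (PyTorch).
# Hypothetical sizes and names; not the system described in the paper.
import torch
import torch.nn as nn

class MultiModalClassifier(nn.Module):
    def __init__(self, vocab_size=30000, image_dim=2048, d_model=256,
                 num_layers=4, num_classes=20):
        super().__init__()
        self.text_embed = nn.Embedding(vocab_size, d_model)
        # Project visual features into the same embedding space as text.
        self.image_proj = nn.Linear(image_dim, d_model)
        # Learned classification token prepended to the fused sequence.
        self.cls_token = nn.Parameter(torch.zeros(1, 1, d_model))
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=8,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.head = nn.Linear(d_model, num_classes)

    def forward(self, token_ids, image_feats):
        # token_ids: (batch, seq_len) ids from a multilingual tokenizer
        # image_feats: (batch, n_regions, image_dim) pre-extracted features
        text = self.text_embed(token_ids)
        image = self.image_proj(image_feats)
        cls = self.cls_token.expand(token_ids.size(0), -1, -1)
        # One sequence over both modalities; self-attention mixes them.
        fused = torch.cat([cls, text, image], dim=1)
        encoded = self.encoder(fused)
        # Classify from the [CLS] position.
        return self.head(encoded[:, 0])

# Usage with random inputs
model = MultiModalClassifier()
logits = model(torch.randint(0, 30000, (2, 32)), torch.randn(2, 4, 2048))
print(logits.shape)  # torch.Size([2, 20])
```

In a weakly supervised pre-training step as described in the abstract, the same backbone could first be trained on noisy, automatically derived labels before fine-tuning the classification head on the curated taxonomy labels; the split between backbone and head above is what makes that reuse straightforward.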