Contrastive Learning of Sentence Embeddings in Product Search
Bo-Wen Zhang (Beijing Academy of Artificial Intelligence); Yan Yan (CUMTB); Jiapei Yu (Alibaba Group)
SPS
Contrastive learning has been widely applied to learn effective sentence embeddings during continued pretraining in domain-specific scenarios. Recent approaches in the product search domain have mainly focused on embedding-based retrieval, for which the state-of-the-art contrastive learning approach, unsupervised SimCSE, is usually employed. In general, supervised SimCSE outperforms unsupervised SimCSE because it incorporates annotated pairs as positives and hard negatives. In this paper, we propose WS-SimCSE, a weak supervision approach based on graph neural networks that uses user behavior data to model the relevance relationships between queries and items in a heterogeneous graph. Using the neighborhood information of nodes of the same type, positive pairs and hard negatives can be constructed for the training objective of supervised contrastive learning. Comparative experiments on several downstream benchmarks from a real-world online tourism platform demonstrate the robustness and effectiveness of WS-SimCSE.
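For readers unfamiliar with the supervised contrastive objective the abstract refers to, the following is a minimal NumPy sketch of a SimCSE-style loss with in-batch positives and hard negatives. It is an illustration only, not the WS-SimCSE implementation: the function name `supervised_simcse_loss`, the array shapes, and the temperature of 0.05 are assumptions, and the graph-based construction of positives and hard negatives is not shown because the abstract does not specify it.

```python
import numpy as np

def supervised_simcse_loss(anchors, positives, hard_negatives, temperature=0.05):
    """Hypothetical sketch of a supervised SimCSE-style contrastive loss.

    All inputs are (N, d) arrays of sentence embeddings; row i of `positives`
    and `hard_negatives` is the positive / hard negative for row i of `anchors`.
    The temperature value is an assumption, not taken from the paper.
    """
    def normalize(x):
        # Unit-normalize rows so dot products equal cosine similarities.
        return x / np.linalg.norm(x, axis=1, keepdims=True)

    a, p, n = normalize(anchors), normalize(positives), normalize(hard_negatives)

    # Each anchor is scored against every positive and every hard negative
    # in the batch, giving an (N, 2N) matrix of scaled similarities.
    logits = np.concatenate([a @ p.T, a @ n.T], axis=1) / temperature

    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The correct "class" for anchor i is its own positive, at column i.
    idx = np.arange(a.shape[0])
    return float(-log_prob[idx, idx].mean())
```

The loss is low when each anchor is closer to its own positive than to any other positive or hard negative in the batch. In a weakly supervised setting such as the one described above, the positives and hard negatives would come from user behavior rather than manual annotation.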