Graph Convolutional Neural Networks for Web-Scale Recommender Systems

ABSTRACT

Recent advancements in deep neural networks for graph-structured data have led to state-of-the-art performance on recommender system benchmarks. However, making these methods practical and scalable to web-scale recommendation tasks with billions of items and hundreds of millions of users remains a challenge. Here we describe a large-scale deep recommendation engine that we developed and deployed at Pinterest. We develop a data-efficient Graph Convolutional Network (GCN) algorithm PinSage, which combines efficient random walks and graph convolutions to generate embeddings of nodes (i.e., items) that incorporate both graph structure as well as node feature information. Compared to prior GCN approaches, we develop a novel method based on highly efficient random walks to structure the convolutions and design a novel training strategy that relies on harder-and-harder training examples to improve robustness and convergence of the model. We also develop an efficient MapReduce model inference algorithm to generate embeddings using a trained model. We deploy PinSage at Pinterest and train it on 7.5 billion examples on a graph with 3 billion nodes representing pins and boards, and 18 billion edges. According to offline metrics, user studies and A/B tests, PinSage generates higher-quality recommendations than comparable deep learning and graph-based alternatives. To our knowledge, this is the largest application of deep graph embeddings to date and paves the way for a new generation of web-scale recommender systems based on graph convolutional architectures.

SUMMARY

PinSage paper

• Constructing convolutions via random walks
  ◦ Nodes are sampled to reduce the computational load
  ◦ Short random walks are used to build the computation graph
• Importance pooling
  ◦ Neighbor nodes are weighted during aggregation using a random-walk similarity measure
  ◦ 46% performance improvement
• Curriculum training
  ◦ Training proceeds with progressively harder samples (hard examples)
  ◦ 12% performance improvement
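
The three ideas above can be illustrated with a minimal sketch. It is not the PinSage implementation: the toy pin/board graph, the feature vectors, and the function names (`importance_neighbors`, `importance_pool`, `negatives_for_epoch`) are hypothetical. The sketch assumes that neighbor importance is the normalized visit count over short random walks, and that the curriculum adds one more hard negative per epoch, starting with none in the first epoch.

```python
import random
from collections import Counter


def importance_neighbors(adj, node, num_walks=200, walk_len=2, top_t=3, seed=0):
    """Pick the top_t most-visited nodes over short random walks from `node`.

    Returns (neighbor, weight) pairs; weights are visit counts normalized
    to sum to 1 over the selected neighbors.
    """
    rng = random.Random(seed)
    visits = Counter()
    for _walk in range(num_walks):
        cur = node
        for _step in range(walk_len):
            nbrs = adj.get(cur, [])
            if not nbrs:
                break
            cur = rng.choice(nbrs)
            if cur != node:
                visits[cur] += 1
    top = visits.most_common(top_t)
    total = sum(count for _, count in top)
    return [(n, count / total) for n, count in top] if total else []


def importance_pool(features, weighted_neighbors, dim):
    """Aggregate neighbor features as a weighted mean (importance pooling)."""
    pooled = [0.0] * dim
    for n, w in weighted_neighbors:
        for i, x in enumerate(features[n]):
            pooled[i] += w * x
    return pooled


def negatives_for_epoch(easy, hard_ranked, epoch):
    """Curriculum schedule (assumed): epoch is 1-based, epoch n adds n-1 hard negatives."""
    return easy + hard_ranked[: max(0, epoch - 1)]


# Toy bipartite graph of pins and boards (hypothetical data).
adj = {
    "pin_a": ["board_1", "board_2"],
    "pin_b": ["board_1"],
    "pin_c": ["board_2"],
    "pin_d": ["board_2"],
    "board_1": ["pin_a", "pin_b"],
    "board_2": ["pin_a", "pin_c", "pin_d"],
}
# Two-dimensional features; the second component is constant on purpose.
features = {name: [float(i), 1.0] for i, name in enumerate(adj)}

neighbors = importance_neighbors(adj, "pin_a")
pooled = importance_pool(features, neighbors, dim=2)
print(neighbors)
print(pooled)
print(negatives_for_epoch(["easy_1", "easy_2"], ["hard_1", "hard_2", "hard_3"], epoch=3))
```

Sampling a fixed number of neighbors per node keeps the cost of each convolution bounded regardless of node degree, and the same visit counts serve as pooling weights, so no separate similarity computation is needed in this sketch.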