ViG (Vision GNN) is a vision architecture that represents an image as a graph of nodes rather than as a sequence or a grid. It splits the image into patches, treats each patch as a node, and connects each node to its nearest neighbors. The graph is then processed by Grapher modules, which apply graph convolution, and FFN modules, which transform node features. Both isotropic and pyramid architectures are supported. Published at NeurIPS 2022.
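
The pipeline described above (patches as nodes, nearest-neighbor edges, graph convolution) can be illustrated with a minimal NumPy sketch. It is not the released implementation: the helper names `patchify`, `knn_graph`, and `graph_conv` are hypothetical, the patch size, `k`, and output width are arbitrary, and the max-over-neighbor-differences aggregation is one assumed choice of graph convolution, since the description does not say which one is used.

```python
import numpy as np

def patchify(image, patch):
    """Split an (H, W, C) image into non-overlapping patches, one flat vector per patch."""
    h, w, c = image.shape
    gh, gw = h // patch, w // patch
    x = image[:gh * patch, :gw * patch].reshape(gh, patch, gw, patch, c)
    return x.transpose(0, 2, 1, 3, 4).reshape(gh * gw, patch * patch * c)

def knn_graph(nodes, k):
    """Return an (N, k) array of neighbor indices, nearest first, in feature space."""
    sq = (nodes ** 2).sum(axis=1)
    dist = sq[:, None] + sq[None, :] - 2.0 * nodes @ nodes.T  # squared Euclidean
    np.fill_diagonal(dist, np.inf)  # a node is not its own neighbor
    return np.argsort(dist, axis=1)[:, :k]

def graph_conv(nodes, neighbors, weight):
    """Assumed aggregation: max over (neighbor - node), concatenated with the node, then linear."""
    diff = nodes[neighbors] - nodes[:, None, :]  # (N, k, D)
    agg = diff.max(axis=1)                       # (N, D)
    return np.concatenate([nodes, agg], axis=1) @ weight  # (N, D_out)

rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))      # toy image, not real data
nodes = patchify(image, patch=8)     # 16 nodes with 192 features each
neighbors = knn_graph(nodes, k=4)
weight = rng.standard_normal((2 * nodes.shape[1], 64)) * 0.01
out = graph_conv(nodes, neighbors, weight)
print(nodes.shape, neighbors.shape, out.shape)
```

In this sketch the graph is built from raw pixel patches; a real model would first embed the patches and would follow each graph convolution with a feed-forward transformation of the node features.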

Outputs 2

ViG

model

Vision GNN: An Image is Worth Graph of Nodes

paper

arXiv: 2206.00272

vision, graph-neural-network, architecture, open-source