VisDiff: SDF-Guided Polygon Generation for Visibility Reconstruction, Characterization and Recognition

Robotic Sensor Network Lab, University of Minnesota and UT Austin
Accepted to Neural Information Processing Systems (NeurIPS) 2025
arXiv · Polygon Dataset · GitHub

Abstract

The ability to capture rich representations of combinatorial structures has enabled the application of machine learning to tasks such as the analysis and generation of floorplans, terrains, images, and animations. Recent work has primarily focused on structures with well-defined features, neighborhoods, or underlying distance metrics, while those lacking such characteristics remain largely unstudied. Polygons are an example of such a combinatorial structure: a small change in vertex locations can cause a significant rearrangement of the combinatorial structure, expressed as a visibility or triangulation graph. Current representation learning approaches fail to capture structures without well-defined features and distance metrics.

In this paper, we study the open problem of Visibility Reconstruction: given a visibility graph G, construct a polygon P whose visibility graph is G. We introduce VisDiff, a novel diffusion-based approach that generates a polygon from the input visibility graph. The main novelty of our approach is that, rather than generating the polygon's vertex set directly, we first estimate the signed distance function (SDF) associated with the polygon. The SDF is then used to extract the vertex locations representing the final polygon. We show that going through the SDF allows the visibility relationship to be learned much more effectively than generating vertex locations directly. To train our model, we create a carefully curated dataset. We use this dataset to benchmark our method and achieve a 26% improvement in F1 score over standard methods as well as state-of-the-art approaches. We also provide preliminary results on the harder visibility graph recognition problem, in which the input is not guaranteed to be a visibility graph. To demonstrate the applicability of VisDiff beyond visibility graphs, we extend it to the related combinatorial structure of triangulation graphs. Lastly, leveraging these capabilities, we show that VisDiff can perform high-diversity sampling over the space of all polygons. In particular, we highlight its ability to perform both polygon-to-polygon and graph-to-graph interpolation, enabling diverse sampling across the polygon space.


Main Idea

Architecture


VisDiff Architecture consists of two main components: a U-Net SDF Diffusion block and a Vertex Extraction block. The diffusion block takes a noisy signed distance function (SDF) sampled from a Gaussian distribution and iteratively denoises it through transformer cross-attention, where spatial CNN features act as queries and the input visibility graph provides keys and values. From the recovered clean SDF, an initial vertex set is extracted via contouring. The vertex extraction block then refines these vertices by combining pixel-aligned and global SDF features to predict the final polygon structure. The model is trained with supervision from both ground-truth SDFs and polygons, while at test time it requires only the visibility graph as input.
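As a rough sketch of the conditioning mechanism described above, the following shows a single-head cross-attention step in which flattened spatial (SDF feature-map) pixels act as queries and per-vertex embeddings of the visibility graph supply keys and values. The shapes, random projections, and single-head form are illustrative simplifications, not the paper's exact architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(spatial_feats, graph_tokens, d_k=32, rng=None):
    """One cross-attention step: spatial CNN features attend to graph tokens.

    spatial_feats: (P, C) flattened SDF feature-map pixels (queries)
    graph_tokens:  (V, C) per-vertex embeddings of the visibility graph
                   (keys and values). Random projections stand in for
                   learned weights in this sketch.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    C = spatial_feats.shape[1]
    Wq, Wk, Wv = (rng.standard_normal((C, d_k)) / np.sqrt(C) for _ in range(3))
    Q, K, V = spatial_feats @ Wq, graph_tokens @ Wk, graph_tokens @ Wv
    attn = softmax(Q @ K.T / np.sqrt(d_k))  # (P, V) attention weights
    return attn @ V                         # graph-conditioned spatial features
```

In the actual model this step is repeated inside the U-Net denoiser with learned projections and multiple heads; the sketch only conveys the query/key/value roles.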



Dataset


We study visibility characterization and reconstruction, which require datasets in which multiple polygons map to the same visibility graph and graph diversity is high. No existing dataset satisfies these properties. To address this, we generate 60,000 polygons with 25 vertices each, sampled uniformly in [-1,1]² and arranged into anticlockwise simple polygons using the 2-opt algorithm. The 2-opt move algorithm exhibited non-uniformity with respect to the link diameter of the visibility graph, so we rebalance the dataset by resampling based on link diameter, resulting in 18,500 polygons. We additionally create 20 augmentations of each polygon, each preserving its visibility graph while varying vertex positions, thereby producing 370,000 polygons. The final dataset can be found here.
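The point-set-to-simple-polygon step can be sketched as follows: repeatedly find two properly crossing edges of the tour and reverse the segment between them (each such 2-opt move shortens the tour, so the loop terminates), then orient the result anticlockwise. Function names and the pass-based loop are our own; the paper's exact 2-opt implementation may differ.

```python
import numpy as np

def _orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a)."""
    return np.sign((b[0]-a[0])*(c[1]-a[1]) - (b[1]-a[1])*(c[0]-a[0]))

def segments_cross(p1, p2, p3, p4):
    """True if segments (p1,p2) and (p3,p4) properly cross."""
    return (_orient(p1, p2, p3) * _orient(p1, p2, p4) < 0 and
            _orient(p3, p4, p1) * _orient(p3, p4, p2) < 0)

def untangle_2opt(pts, max_passes=1000):
    """Turn a random point set into a simple anticlockwise polygon."""
    n = len(pts)
    order = list(range(n))
    for _ in range(max_passes):
        crossed = False
        for i in range(n):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # edges share a vertex, skip
                a, b = pts[order[i]], pts[order[(i + 1) % n]]
                c, d = pts[order[j]], pts[order[(j + 1) % n]]
                if segments_cross(a, b, c, d):
                    # 2-opt move: reverse the tour between the crossing edges
                    order[i + 1:j + 1] = order[i + 1:j + 1][::-1]
                    crossed = True
        if not crossed:
            break
    poly = np.asarray(pts)[order]
    # shoelace signed area > 0 means anticlockwise; flip otherwise
    area = 0.5 * np.sum(poly[:, 0] * np.roll(poly[:, 1], -1)
                        - np.roll(poly[:, 0], -1) * poly[:, 1])
    return poly if area > 0 else poly[::-1]
```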


Comparison to Real-World Datasets

Dataset      Average Link Diameter   Max Link Diameter
Ours         4.4 / 2.2               9.0
MNIST        1.83 / 0.50             4.1
COCO 2017    1.32 / 0.25             3.823
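For vertices of a simple polygon, link distance coincides with shortest-path distance in the visibility graph, so the link diameter reported above can be computed as the graph diameter via breadth-first search. A small sketch (our own helper, assuming an undirected, connected graph given as adjacency lists):

```python
from collections import deque

def graph_diameter(adj):
    """Graph diameter = max over vertices of BFS eccentricity.

    adj: list of neighbor lists for an undirected, connected graph.
    """
    n = len(adj)
    best = 0
    for s in range(n):
        dist = [-1] * n
        dist[s] = 0
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if dist[v] < 0:
                    dist[v] = dist[u] + 1
                    q.append(v)
        best = max(best, max(dist))
    return best
```

For a convex polygon the visibility graph is complete and the link diameter is 1; higher values indicate heavier occlusion.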
Quantitative Metrics and Qualitative Results

We compare VisDiff with baselines on visibility reconstruction using the test split of the curated dataset. We evaluate the F1 score between the visibility graph of the generated polygon and that of the ground-truth polygon.
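The F1 metric treats the off-diagonal entries of the two adjacency matrices as binary edge predictions. A minimal sketch of this computation (our own helper, not the paper's evaluation code):

```python
import numpy as np

def visibility_f1(pred_adj, gt_adj):
    """Precision, recall, and F1 over the off-diagonal edge indicators
    of two same-size visibility-graph adjacency matrices."""
    mask = ~np.eye(len(gt_adj), dtype=bool)   # ignore self-visibility
    pred = pred_adj[mask].astype(bool)
    gt = gt_adj[mask].astype(bool)
    tp = np.sum(pred & gt)
    prec = tp / max(pred.sum(), 1)
    rec = tp / max(gt.sum(), 1)
    f1 = 2 * prec * rec / max(prec + rec, 1e-12)
    return prec, rec, f1
```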


Method                             Acc ↑   Prec ↑   Rec ↑   F1 ↑
(a) Vertex Diffusion [41]          0.777   0.773    0.716   0.724
(b) Variational Autoencoder [42]   0.740   0.718    0.704   0.702
(c) Graph Neural Network [43]      0.730   0.786    0.686   0.674
(d) VisDiff                        0.924   0.914    0.911   0.912
(e) MeshGPT [9]                    0.747   0.739    0.723   0.712

Interpolation Visualization


We demonstrate VisDiff’s diversity with two interpolations: polygon-to-polygon, where we fix a visibility graph and linearly interpolate between two diffusion seeds over 50 steps to obtain valid intermediate polygons; and graph-to-graph, where we move along a flip-graph path between two valid triangulations and generate smooth intermediate polygon sequences.
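The polygon-to-polygon case amounts to sweeping the diffusion seed while the visibility-graph condition stays fixed. A sketch of the seed schedule (the conditional denoiser that maps each seed to a polygon is omitted; `interpolate_seeds` is a hypothetical helper, not the released code):

```python
import numpy as np

def interpolate_seeds(z0, z1, steps=50):
    """Linearly interpolate between two Gaussian diffusion seeds.

    Each intermediate seed would be denoised, conditioned on the fixed
    visibility graph, to yield a valid intermediate polygon.
    """
    ts = np.linspace(0.0, 1.0, steps)
    return [(1 - t) * z0 + t * z1 for t in ts]
```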


Graph-to-Graph Interpolation

Polygon-to-Polygon Interpolation

Applications


  • Navigation benchmark in high-occlusion settings: using the VisDiff dataset to evaluate motion planners under heavy occlusion.
  • Polygon data augmentation: sampling diverse, valid polygons conditioned on graph structure.
  • Privacy-aware floorplan generation: the findings of this work can be used to incorporate room-to-room visibility (visibility graph analysis) into polygon-based floorplan models to balance connectivity and privacy.

To Cite Our Work

@misc{moorthy2025visdiffsdfguidedpolygongeneration,
  title        = {VisDiff: SDF-Guided Polygon Generation for Visibility Reconstruction and Recognition},
  author       = {Rahul Moorthy and Jun-Jee Chao and Volkan Isler},
  year         = {2025},
  eprint       = {2410.05530},
  archivePrefix= {arXiv},
  primaryClass = {cs.CG},
  url          = {https://arxiv.org/abs/2410.05530},
}