‘Generative Benchmark Creation for Table Union Search’

“Data management has traditionally relied on synthetic data generators to generate structured benchmarks … where we can control important parameters like data size and its distribution precisely. … Our current methods for creating benchmarks involve the manual curation and labeling of real data. These methods are not robust or scalable and … it is not clear how robust the created benchmarks are. We propose to use generative AI models to create structured data benchmarks for table union search. We present a novel method for using generative models to create tables with specified properties.”

Find the paper and the full list of authors on arXiv.
