In summary, the contributions of our work are as follows:
- We present a novel vision-based relation detection approach, named ViRED, for predicting relations among non-textual components in complex documents. We implement it for circuit-to-table relation matching in electrical design drawings.
- We develop a dataset of electrical engineering drawings derived from industrial design data, and we annotate the instances and their relationships within the dataset.
- We evaluate our method with several metrics on the electrical engineering drawing dataset and compare its performance against existing approaches.
- We perform extensive ablation studies to assess the impact of different model architectures, hyperparameters, and training methods on overall performance, and we refine our model architecture based on the results.