method | model name | accuracy | overall score | consistency |
---|---|---|---|---|
DRAGON | gpt-3.5-turbo | 4.058 | 3.632 | 3.735 |
DRAGON | gpt-4 | 3.97 | 3.567 | 3.689 |
DRAGON | nous-hermes-13b | 3.776 | 3.389 | 3.566 |
curator | human | 4.326 | 4.069 | 4.13 |
- A comparison of DRAGON's base performance on definition generation against existing editor-provided ontology definitions. Evaluator scores are shown for three categories (accuracy, consistency, and overall score). Evaluators rated definitions generated by three different models alongside the existing ontology definitions, and were not shown the source of each definition until after evaluation.