Use-cases in Archaeology and English Heritage
Keith May (English Heritage)
Exploring the Use of Semantic Technologies for Cross-Search of Archaeological Grey Literature and Data
English Heritage has been using and developing the CIDOC CRM ontology to model the archaeological processes, data and conceptual relationships involved in excavation recording and analysis. This modelling has been used to bring together a range of archaeological datasets, originating from a number of separate organisations, so that they can be cross-searched using semantic technologies. Achieving this level of interoperability for otherwise unintegrated data is itself a valuable step. Further work in the Semantic Technologies for Archaeological Resources (STAR) project explored the possibility of mapping elements of descriptive free text to the Conceptual Reference Model, thereby making aspects of the archaeological reports cross-searchable alongside the other datasets.
This work first required detailed annotation of a sample set of reports taken from the corpus of OASIS ‘grey literature’ available from the ADS online library. A methodology was developed to identify which parts of the reports were best suited to information extraction and how to extract from them using Natural Language Processing (NLP) techniques. A series of rule-based Information Extraction (IE) routines was built, using GATE software, to handle both the grammatical and ontological ‘rules’ needed to process the text in the reports.
This presentation covers key elements of this work and discusses various issues encountered in trying to extract information about Events, Places, Objects and Materials.
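The rule-based extraction described in this abstract can be illustrated with a minimal Python sketch. It is not the STAR pipeline and does not use GATE: the term lists, the function names `annotate` and `material_object_pairs`, and the single "Material followed by Object" rule are all hypothetical, chosen only to show how vocabulary (ontological) lookups and a simple grammatical pattern might combine.

```python
import re

# Hypothetical mini-vocabularies; a real system would draw on
# controlled thesauri rather than hand-written lists.
GAZETTEER = {
    "Object": ["nail", "coin", "brooch", "sherd"],
    "Material": ["iron", "bronze", "flint", "ceramic"],
}


def annotate(text):
    """Return (start, end, label, matched_text) tuples in document order."""
    annotations = []
    for label, terms in GAZETTEER.items():
        alternatives = "|".join(re.escape(term) for term in terms)
        # Allow a simple plural ending ("coins", "brooches").
        pattern = r"\b(?:" + alternatives + r")(?:e?s)?\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            annotations.append(
                (match.start(), match.end(), label, match.group(0))
            )
    return sorted(annotations)


def material_object_pairs(text):
    """Apply one grammatical rule: a Material term directly before an Object term."""
    annotations = annotate(text)
    pairs = []
    for first, second in zip(annotations, annotations[1:]):
        gap = text[first[1]:second[0]]
        # Only whitespace may separate the two annotated terms.
        if first[2] == "Material" and second[2] == "Object" and gap.isspace():
            pairs.append((first[3].lower(), second[3].lower()))
    return pairs


sample = "An iron nail and two bronze brooches lay beside a flint scatter."
print(material_object_pairs(sample))
```

In this sketch "flint" is annotated as a Material but yields no pair, because no Object term follows it directly; cases like this are where a richer set of rules would be needed.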
Paul Cripps (University of South Wales)
GeoSemantic Technologies for Archaeological Resources
The semantics of heritage data is a growing area of interest, with ontologies such as the CIDOC-CRM providing semantic frameworks and exemplary projects such as STAR and STELLAR demonstrating what can be done when semantic technologies are applied to archaeological resources. In the world of the Semantic Web, advances in geosemantics have emerged that extend research more fully into the spatio-temporal domain, for example by extending the SPARQL standard to produce GeoSPARQL. Importantly, the use of semantic technologies, particularly the graph structure of RDF, aligns with graph and network based approaches, providing a rich fusion of techniques for geospatial analysis of heritage data expressed in this manner.
This paper gives an overview of the ongoing G-STAR research project (GeoSemantic Technologies for Archaeological Resources) with reference to broader sectoral links, particularly to commercial archaeology. Particular attention is paid to examining the integration of spatial data into the heritage Global Graph and the relationship between Spatial Data Infrastructure (SDI) and Linked Data, moving beyond notions of ‘location’ as simple nodes, placenames and coordinates towards fuller support for complex geometries and advanced spatial reasoning. Finally, the potential impacts of such research are discussed with particular reference to the current practice of commercial archaeology, access to and publishing of (legacy, big) data, and leveraging network models to better understand and manage change within archaeological information systems.
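The step from ‘location’ as a coordinate to a spatial relationship over geometries can be sketched in a few lines of plain Python. This is only a stand-in for what a GeoSPARQL-capable triple store would do with a "within"-style spatial filter; it is not G-STAR code. The triples, the identifiers (`find:1`, `context:100`), the predicates `foundIn` and `hasPoint`, and the `TRENCH` polygon are all hypothetical.

```python
# Graph data as (subject, predicate, object) triples; all identifiers
# and predicates here are hypothetical illustrations.
TRIPLES = [
    ("find:1", "foundIn", "context:100"),
    ("find:2", "foundIn", "context:200"),
    ("context:100", "hasPoint", (2.0, 2.0)),
    ("context:200", "hasPoint", (8.0, 3.0)),
]

# A hypothetical trench outline as a closed ring of (x, y) vertices.
TRENCH = [(0.0, 0.0), (5.0, 0.0), (5.0, 5.0), (0.0, 5.0)]


def point_in_polygon(point, polygon):
    """Ray-casting test: count polygon edges crossed by a ray cast to the right."""
    x, y = point
    inside = False
    count = len(polygon)
    for i in range(count):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % count]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def finds_within(polygon, triples):
    """Follow find -> context -> point links and keep finds inside the polygon."""
    points = {s: o for s, p, o in triples if p == "hasPoint"}
    return sorted(
        s
        for s, p, o in triples
        if p == "foundIn" and o in points and point_in_polygon(points[o], polygon)
    )


print(finds_within(TRENCH, TRIPLES))
```

The query combines a graph traversal (find to context to point) with a geometric test, which is the kind of fusion of network and spatial reasoning the abstract points towards; a production system would delegate both to the triple store rather than to hand-written code.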
Image from the English Heritage extension of the CIDOC-CRM, STAR and STELLAR projects