Integrated access to cultural heritage resources through representation and alignment of controlled vocabularies
The Authors
Antoine Isaac, Department of Computer Science, Vrije Universiteit Amsterdam, Amsterdam, Noord-Holland, The Netherlands
Stefan Schlobach, Department of Computer Science, Vrije Universiteit Amsterdam, Amsterdam, Noord-Holland, The Netherlands
Henk Matthezing, Koninklijke Bibliotheek, The Hague, The Netherlands
Claus Zinn, Max-Planck Institute for Psycholinguistics, Nijmegen, The Netherlands
Acknowledgements
This paper is based on a talk given at “Information Access for the Global Community, An International Seminar on the Universal Decimal Classification” held on 4-5 June 2007 in The Hague, The Netherlands. An abstract of this talk will be published in Extensions and Corrections to the UDC, an annual publication of the UDC consortium.
Abstract
Purpose – To show how semantic web techniques can help address semantic interoperability issues in the broad cultural heritage domain, allowing users an integrated and seamless access to heterogeneous collections.
Design/methodology/approach – This paper presents the heterogeneity problems to be solved. It introduces semantic web techniques that can help in solving them, focusing on the representation of controlled vocabularies and their semantic alignment. It gives pointers to some previous projects and experiments that have tried to address the problems discussed.
Findings – Semantic web research provides practical technical and methodological approaches to tackle the different issues. Two contributions of interest are the simple knowledge organisation system model and automatic vocabulary alignment methods and tools. These contributions were demonstrated to be usable for enabling semantic search and navigation across collections.
Research limitations/implications – The research aims at designing different representation and alignment methods for solving interoperability problems in the context of controlled subject vocabularies. Given the variety and technical richness of current research in the semantic web field, it is impossible to provide an in-depth account or an exhaustive list of references. Every aspect of the paper is, however, given one or several pointers for further reading.
Originality/value – This article provides a general and practical introduction to relevant semantic web techniques. It is of specific value for the practitioners in the cultural heritage and digital library domains who are interested in applying these methods in practice.
Article Type:
Conceptual paper
Keyword(s):
Worldwide web; Archives management; Digital libraries.
Journal:
Library Review
Volume:
57
Number:
3
Year:
2008
pp:
187-199
Copyright ©
Emerald Group Publishing Limited
ISSN:
0024-2535
Introduction: the semantic interoperability problem
In the digital age, cultural heritage (CH) institutions have the opportunity to, and face the challenge of, using the World Wide Web to make accessible the digital artefacts of their collections, together with their metadata. Web-based access to digitised images and their descriptions, at any time and from anywhere, lowers the barriers for access to information resources. Once there is digital access to the content of museums, libraries and archives, there is also the tremendous opportunity to merge collections from different locations into virtual, federated institutions, thus increasing access across collections and institutional boundaries.
In stark contrast to the vast amount of existing digital resources on the World Wide Web, CH assets from libraries, museums and archives are very well described. Over many generations, librarians, curators and archivists have developed knowledge organisation systems (KOSs) – controlled vocabularies such as thesauri, classification schemes and ontologies – to organise and manage their collections. The organisation and access to CH, and the human capacity to deal with information and knowledge, is a valuable achievement in itself. It helps us to grasp our past and present, and this understanding must be exploited to facilitate its access at a grander scale.
The move toward cross-institutional CH portals is well under way, as, for instance, The European Library[1] and the Memory of the Netherlands[2] testify. In this paper, we describe how CH expertise can be combined with knowledge and technology from the Semantic Web (SW) community to deliver portals that provide seamless and unified access to different collections via semantic search and navigation.
Figure 1 illustrates the problem that needs to be solved in a networked environment. Consider two collections, each of which is indexed by its dedicated knowledge organisation system. Instead of using one single conceptual vocabulary for querying or browsing the objects of both collections simultaneously, users are expected and required to use the terminology of the first KOS to identify objects of the first collection, and the second KOS to identify those of the second collection.
We say that these two KOSs are not interoperable at the semantic level. In the given example, when searching for objects showing a “Madonna” one will only retrieve objects that were indexed using this specific subject description (the statue in the upper right); one will not find the manuscript illumination (in the lower right) that was indexed as “Virgin Mary”, which is clearly a conceptually similar subject description, but stems from another controlled vocabulary.
Not taking care of the semantic heterogeneity of their respective KOSs when merging collections clearly hampers the ease of accessibility. The burden of search is indeed transferred to users who then need to perform two well-formulated queries (using the respective correct terminology) to obtain the desired objects from the two collections.
Two heterogeneity problems must be solved to enhance the interoperability of controlled vocabularies and, hence, of the systems and collections that use them:
- Representational heterogeneity: vocabularies often come in different formats; some will be encoded in XML while others will come as plain text. Moreover, the models guiding their design might not be directly compatible. They might mirror different general information needs (e.g. thesauri contain “terms” while classification schemes contain “classes”), and different KOS might have different kinds of notes and labels attached to conceptual entities, for instance.
- Conceptual heterogeneity: any two vocabularies will usually contain concepts that have identical or similar meanings but different labels or names (e.g. “Virgin Mary” and “Madonna”). Also, there will be concepts that are more general than others (e.g. “Mother” and “Virgin Mary”). Such similarity and subsumption links have to be determined and exploited so that an integrated system can provide users with seamless access to joint content described by several vocabularies.
In this paper, we show how these two problems can be addressed using techniques that are currently being investigated in the Semantic Web research domain. In section 2, we describe the basic elements of the Semantic Web infrastructure, and illustrate how the simple knowledge organisation systems (SKOS) standard model can be used to represent different KOSs homogeneously. In section 3, we show how the different vocabularies, once represented in the common SKOS format, can be semantically aligned to enable a semantic integration of different collections. Finally, in section 4, we demonstrate how we solved a real-life problem with a combination of SW techniques, and briefly describe the resulting prototype.
Semantic Web techniques and controlled vocabulary representation
The Semantic Web (Berners-Lee et al., 2001) is a proposed extension of the existing web, where information found on the web is augmented with machine-accessible knowledge[3]. The basic building blocks of the Semantic Web, as introduced by the Resource Description Framework (RDF)[4], are resources, which denote any element that can be identified on (or even outside) the web. These resources are described by three-part statements that link them together. Each statement has a subject resource which is linked to an object resource via a property resource. Together, several such triples form a graph, such as the one represented in Figure 2[5]. These graphs can contain:
- Factual knowledge: the third paragraph of the described document is about “Amsterdam”; the type of the described document is “Article”; and the selected paragraph “par3” is part of a (larger) file called “file1”.
- Ontological knowledge: the Semantic Web is concerned with the way resources can be grouped in conceptual classes. These classes are introduced in ontologies that contain formally expressed knowledge about them. Here, “Article” is a class more specific than (or a subclass of) “Document”.
The information contained in ontologies is important, since it provides material for automated reasoning on the resources which populate the classes. For example, from the information found in Figure 2 for “file1”, “Article” and “Document”, an automated reasoning engine can infer that “file1” is also an instance of the “Document” class, which will yield more answers for queries containing “Document”.
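The inference described above can be illustrated with a minimal sketch. It is not a real RDF toolkit: the triples are plain Python tuples, the property names `myVoc1:about` and `myVoc1:partOf` and the namespace assigned to each resource are assumptions made for illustration, and only the subclass rule is implemented.

```python
# Toy triple store: each statement is a (subject, property, object) tuple.
# Resource names loosely mirror Figure 2; the property names and the choice
# of namespace prefixes are assumptions, not taken from the figure.
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

triples = {
    ("myVoc1:par3", "myVoc1:about", "myVoc2:Amsterdam"),  # factual
    ("myVoc1:par3", "myVoc1:partOf", "myVoc1:file1"),     # factual
    ("myVoc1:file1", RDF_TYPE, "myVoc1:Article"),         # factual
    ("myVoc1:Article", SUBCLASS_OF, "myVoc1:Document"),   # ontological
}


def infer_types(graph):
    """Return the graph plus every rdf:type statement implied by subclass links."""
    inferred = set(graph)
    changed = True
    while changed:  # repeat until no new statement can be derived
        changed = False
        for (s, p, o) in list(inferred):
            if p != RDF_TYPE:
                continue
            for (sub, p2, sup) in list(inferred):
                if p2 == SUBCLASS_OF and sub == o:
                    new = (s, RDF_TYPE, sup)
                    if new not in inferred:
                        inferred.add(new)
                        changed = True
    return inferred


def instances_of(graph, cls):
    """List the resources typed with cls, sorted for stable output."""
    return sorted(s for (s, p, o) in graph if p == RDF_TYPE and o == cls)


closed = infer_types(triples)
print(instances_of(triples, "myVoc1:Document"))  # query on the asserted graph
print(instances_of(closed, "myVoc1:Document"))   # query after inference
```

Querying the asserted graph for “Document” finds nothing, whereas the same query on the inferred graph returns “file1”, which is the additional answer the text refers to.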
It should be noted that the RDF framework is designed to allow different sources of knowledge to co-exist with each other, inhabiting the same space. This means that Semantic Web data can merge and operate with resources coming from different information spaces. In our example, the objects and links in Figure 2 come from different namespaces, either user-defined (myVoc1:, myVoc2:) or predefined (rdf:). The resource “Amsterdam” in myVoc2: may indeed refer to the capital of The Netherlands (as the RDF graphs in which it occurs would show), while some other resource with the same name, but from a different vocabulary space, may refer to a city in the state of New York, USA. Because the two resources stem from different namespaces, they remain distinct, and the contexts in which each occurs further define and constrain its intended meaning.
RDF “triples” are the basic building blocks for translating KOS into a homogeneous format. Also, in order to mirror a KOS' modelling elements (e.g. the “broader than”, or “narrower than” relation types of thesauri), additional constructs are necessary. RDF-Schema (Brickley and Guha, 2004), in short RDF-S, is a simple representation language that allows users to define their models, introducing different types for RDF resources and links. One can also express, for instance, that the source and target of a relation are of a specific type, e.g. that the relation type “has painted” requires a subject of type “painter” (or “artist”), and an object of type “painting” (or “drawing”). The current standard web ontology language is called OWL (McGuinness and Harmelen, 2004). This formal language is more expressive than RDF-S, allowing users to define a variety of different properties of classes and relations between them. A more detailed discussion of OWL is beyond the scope of this paper.
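As a hedged illustration of typing the source and target of a relation, the sketch below applies the RDF-S reading of `rdfs:domain` and `rdfs:range`, under which a statement using a property lets a reasoner infer the types of its subject and object. All `ex:` names are hypothetical and the tuples stand in for real RDF data.

```python
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"
RDFS_RANGE = "rdfs:range"

# Hypothetical schema: "has painted" links a painter to a painting.
schema = {
    ("ex:hasPainted", RDFS_DOMAIN, "ex:Painter"),
    ("ex:hasPainted", RDFS_RANGE, "ex:Painting"),
}

# Hypothetical data statement using that relation type.
data = {("ex:Rembrandt", "ex:hasPainted", "ex:NightWatch")}


def apply_domain_range(schema, data):
    """Infer rdf:type statements for subjects (domain) and objects (range)."""
    domains = {p: c for (p, k, c) in schema if k == RDFS_DOMAIN}
    ranges = {p: c for (p, k, c) in schema if k == RDFS_RANGE}
    inferred = set()
    for (s, p, o) in data:
        if p in domains:
            inferred.add((s, RDF_TYPE, domains[p]))
        if p in ranges:
            inferred.add((o, RDF_TYPE, ranges[p]))
    return inferred


for statement in sorted(apply_domain_range(schema, data)):
    print(statement)
```

Note that this sketch infers types rather than rejecting ill-typed statements; a validation-style check would be a different design choice.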
To support experts in converting their KOSs into the RDF-based formats, but also to facilitate the future exchange of such formats, the World Wide Web Consortium (W3C) has initiated the development of SKOS[7], a standard model that allows CH practitioners (and other terminologists) to homogeneously represent the basic features of KOSs. SKOS introduces a set of constructs for RDF, which mainly allow for the description of concepts and concept schemes (Miles and Brickley, 2005).
Concept description
SKOS has chosen a concept-based approach for the representation of controlled vocabularies. As opposed to a term-based approach, where terms from natural language are the first-order elements of a KOS, SKOS describes abstract concepts that may have a different materialisation in language (lexicalisations). SKOS introduces a special construct skos:Concept[8] to properly characterise the (web) resources that denote such KOS elements. To further specify these conceptual resources, SKOS features:
- Labelling properties, e.g. skos:prefLabel and skos:altLabel, to link a concept to the terms that represent it in language. The prefLabel value is a non-ambiguous term that uniquely identifies the concept, and can be used as a descriptor in an indexing system. The altLabel property is used to introduce alternative entries, such as synonyms, abbreviations, and so forth. SKOS allows concepts to be linked to prefLabels and altLabels in different languages. SKOS concepts can thus be used seamlessly in multilingual environments.
- Semantic properties are used to represent the structural relationships between concepts, which are usually at the core of controlled vocabularies like thesauri. The construct skos:broader denotes the generalisation link (BT in standard thesauri), while skos:narrower denotes its reciprocal link (NT), and skos:related the associative relationship (RT).
- Documentation properties. Often, informal documentation plays an important role in a KOS. SKOS introduces explanatory notes – skos:scopeNote, skos:definition, skos:example – and management notes – skos:changeNote, skos:historyNote, etc.
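The three groups of properties listed above can be sketched on a single concept. This is an illustrative model only: the `http://example.org/voc#` namespace, the Dutch label, the alternative label and the scope note are invented for the example, and language-tagged literals are approximated by `(text, language)` tuples.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
EX = "http://example.org/voc#"  # hypothetical vocabulary namespace

triples = [
    (EX + "virginMary", "rdf:type", SKOS + "Concept"),
    # Labelling properties, one preferred label per language
    (EX + "virginMary", SKOS + "prefLabel", ("Virgin Mary", "en")),
    (EX + "virginMary", SKOS + "prefLabel", ("Maagd Maria", "nl")),
    (EX + "virginMary", SKOS + "altLabel", ("Blessed Virgin", "en")),
    # Semantic property (BT in a standard thesaurus)
    (EX + "virginMary", SKOS + "broader", EX + "mother"),
    # Documentation property
    (EX + "virginMary", SKOS + "scopeNote",
     ("Use for depictions of Mary, mother of Jesus.", "en")),
    (EX + "mother", "rdf:type", SKOS + "Concept"),
    (EX + "mother", SKOS + "prefLabel", ("Mother", "en")),
]


def labels(concept, lang):
    """Return (preferred label, [alternative labels]) of a concept in one language."""
    pref = None
    alts = []
    for s, p, o in triples:
        if s != concept or not isinstance(o, tuple) or o[1] != lang:
            continue
        if p == SKOS + "prefLabel":
            pref = o[0]
        elif p == SKOS + "altLabel":
            alts.append(o[0])
    return pref, alts


def narrower(concept):
    """skos:narrower is the inverse of skos:broader, so derive it by reversal."""
    return [s for s, p, o in triples if p == SKOS + "broader" and o == concept]


print(labels(EX + "virginMary", "en"))
print(labels(EX + "virginMary", "nl"))
print(narrower(EX + "mother"))
```

Storing only `skos:broader` and deriving the narrower direction on demand is one possible choice; a converter could equally assert both directions explicitly.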
Concept scheme description
A KOS as a whole also has to be represented and described. SKOS coins a skos:ConceptScheme construct for this. It also introduces specific properties to represent the links between different KOSs and the concepts they contain. The property skos:inScheme asserts that a given concept is part of a given concept scheme, while skos:hasTopConcept states that a KOS contains a concept as the root of (one of) its constituent hierarchical tree(s) (i.e. a concept without a broader concept).
Conversion from a KOS native representation to SKOS RDF data requires the analysis of the original model of the KOS, and the linking of the elements of this model to the SKOS ones that fit them most (Assem et al., 2005). One can, for instance, decide to represent a “class” in a classification scheme as a resource of type skos:Concept. Based on such a specification, it is then possible to implement an appropriate conversion program – e.g. an XSL stylesheet when the vocabulary is natively encoded in XML – to automatically convert the initial representation to a SKOS one.
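The conversion step can be sketched as follows. The text mentions an XSL stylesheet; Python's standard `xml.etree.ElementTree` module is used here instead, purely for illustration. The native XML layout (`term`, `label`, `uf`, `bt`), the base URI and the mapping decisions are all assumptions, since each real KOS requires its own analysis.

```python
import xml.etree.ElementTree as ET

SKOS = "http://www.w3.org/2004/02/skos/core#"
BASE = "http://example.org/kos/"  # hypothetical base URI for minted concepts

# Hypothetical native encoding of a two-term thesaurus.
NATIVE_XML = """
<thesaurus>
  <term id="t1"><label>Animals</label></term>
  <term id="t2"><label>Birds</label><uf>Fowl</uf><bt ref="t1"/></term>
</thesaurus>
"""


def convert(xml_text):
    """Map each native <term> to SKOS triples according to a fixed specification."""
    root = ET.fromstring(xml_text.strip())
    triples = []
    for term in root.iter("term"):
        uri = BASE + term.get("id")
        triples.append((uri, "rdf:type", SKOS + "Concept"))
        triples.append((uri, SKOS + "prefLabel", term.findtext("label")))
        for uf in term.findall("uf"):   # "used for" entries become altLabels
            triples.append((uri, SKOS + "altLabel", uf.text))
        for bt in term.findall("bt"):   # "broader term" links become skos:broader
            triples.append((uri, SKOS + "broader", BASE + bt.get("ref")))
    return triples


for triple in convert(NATIVE_XML):
    print(triple)
```

The mapping table (which native element becomes which SKOS property) is the part that requires human analysis; once fixed, the conversion itself is mechanical.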
As an example, a subject 11F coming from the Iconclass concept scheme[9], “the Virgin Mary”, identified by the (as yet fictive) resource http://www.iconclass.nl/s_11F, could be partly represented by the graph in Figure 3.
Vocabulary alignment as a solution to the interoperability problem
Having unified and linkable representations of the concepts contained in different collections' vocabularies helps managing them in a single framework. However, this is not sufficient for solving the semantic interoperability problem. One still has to determine semantic similarity links between the elements of the different vocabularies – to align[10] them (Doerr, 2001). Figure 4 illustrates that if a search engine “knew” that a SKOS concept C from a thesaurus T1 is semantically equivalent to a SKOS concept D from thesaurus T2, then it could return all the objects that were indexed against D for a query for objects described using C. The objective is therefore to align as many concepts of one thesaurus to their semantic equivalents in the other thesaurus. Where such equivalency cannot be established, it may be possible to establish links between concepts of one thesaurus and concepts of the second thesaurus that are either more specific or more general, and to exploit such “narrower than” and “broader than” relations for query processing.
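A minimal sketch of how a search engine could exploit such links is given below. The object identifiers, the relation names `equivalent` and `has_narrower`, and the tiny index are hypothetical; the concepts reuse the “Madonna”, “Virgin Mary” and “Mother” examples from the introduction.

```python
# Hypothetical index: concept -> objects described with that concept.
index = {
    "T1:Madonna": {"statue_01"},
    "T1:Mother": {"portrait_03"},
    "T2:VirginMary": {"illumination_07"},
}

# Hypothetical alignment between thesaurus T1 and thesaurus T2.
alignment = [
    ("T1:Madonna", "equivalent", "T2:VirginMary"),
    ("T1:Mother", "has_narrower", "T2:VirginMary"),
]


def search(concept, include_narrower=False):
    """Return objects indexed with the concept or with concepts aligned to it."""
    targets = {concept}
    for source, relation, target in alignment:
        if relation == "equivalent":
            # Equivalence is followed in both directions.
            if source == concept:
                targets.add(target)
            elif target == concept:
                targets.add(source)
        elif relation == "has_narrower" and include_narrower and source == concept:
            targets.add(target)
    results = set()
    for c in targets:
        results |= index.get(c, set())
    return results


print(sorted(search("T1:Madonna")))
print(sorted(search("T1:Mother", include_narrower=True)))
```

Following narrower links is left optional here because it widens recall at some cost in precision; whether to enable it is an application decision.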
Such an approach has been investigated for subject vocabularies in projects such as HILT (Macgregor et al., 2007). The alignment of these vocabularies is however a labour-intensive task that requires considerable expertise in the concerned thesauri. Manual alignment has been approached by several projects, notably, CARMEN (Krause, 2003), Renardus (Day et al., 2005), KoMoHe[11], AOS (Liang and Sini, 2006) or the ongoing CRISSCROSS[12], MACS[13] (Landry, 2004) and MSAC (Balikova, 2005). These projects have yielded very interesting results such as the development of tools to support manual alignment, the deployment of search engines that exploit resulting alignments, and the contribution of initial methodological ideas. However, they also demonstrated the complexity, difficulty and cost of manually aligning large vocabularies (usually containing many thousands of concepts) in realistically-sized collections and settings. Given that manual labour is expensive and that vocabularies evolve over time, it is clear that the construction and maintenance of alignments constitute an important issue that needs to be addressed. There is a need for developing advanced, computer-based tools that can identify candidate mappings between two vocabularies, and that can then propose them to the human expert for consideration. Alignment would thus become a semi-automatic task where thesaurus experts' work would be assisted, and where the integration of collections would become more cost-efficient.
Recently, the Semantic Web community has produced alignment tools that address the specific problem of formal ontology matching (Shvaiko and Euzenat, 2005). However, the techniques they employ and the goals they advertise make them deployable in a more general context, including thesauri and other similar KOSs.
Although most of the existing ontology alignment tools rely on sophisticated methods (Euzenat and Shvaiko, 2007), they can be classified and described by the basic techniques they build upon and the different sources of information they exploit: the lexical information attached to the concepts of the vocabularies, the structure of vocabularies, the collection objects described by vocabularies, or other (external) knowledge sources.
Lexical alignment techniques
In these techniques the lexical materialisations of the concepts are compared to each other. If a significant similarity is found, then we can establish a semantic link between the concerned concepts. A straightforward example is when two concepts have the same label. But one can also search for string inclusion patterns or more complex techniques relying for instance on lemmatisers – getting normalised forms of labels, e.g. “tree” for “trees” – and syntactical analysis tools. A concept labelled “(map of) the North Pole” can be detected as more specific than a concept “Charts, maps”. These lexical methods exploit the preferred labels of concepts, but they can also turn to their lexical variants or their associated definitions and scope notes.
Clearly, such approaches encounter the same problems as humans when dealing with words taken out of context. Polysemy and homonymy, for instance, are common sources of errors. This has to be compensated with contextual information.
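The simplest lexical comparisons can be sketched as below. This is a deliberately naive stand-in for real lemmatisers and syntactic analysers: the stop-word list, the plural-stripping rule and the relation names are assumptions, and the token-inclusion heuristic (a label with extra modifiers is probably more specific) is only one of many possible rules.

```python
import re

STOPWORDS = {"of", "the", "and"}  # hypothetical, minimal stop-word list


def normalise(label):
    """Lowercase, tokenise, drop stop words and strip a plural 's' (very naive)."""
    tokens = re.findall(r"[a-z]+", label.lower())
    stems = set()
    for token in tokens:
        if token in STOPWORDS:
            continue
        if len(token) > 3 and token.endswith("s"):
            token = token[:-1]
        stems.add(token)
    return stems


def compare(label_a, label_b):
    """Suggest a candidate relation between two labels, or None."""
    a, b = normalise(label_a), normalise(label_b)
    if not a or not b:
        return None
    if a == b:
        return "equivalent"
    if b < a:   # a has all of b's tokens plus modifiers
        return "a_narrower"
    if a < b:
        return "b_narrower"
    if a & b:
        return "overlap"  # shared token only: needs deeper analysis or review
    return None


print(compare("Trees", "tree"))
print(compare("Oak trees", "Trees"))
print(compare("(map of) the North Pole", "Charts, maps"))
```

On the “North Pole” example the sketch only reports a shared token; deciding that the first concept is more specific requires the syntactic analysis mentioned in the text. Every suggestion remains a candidate, since a shared word may hide polysemy or homonymy.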
Structural alignment techniques
The first kind of context is provided by the vocabulary itself, as it contains hierarchical and associative links between concepts. These links, especially those concerning hierarchical generalisation and specialisation, are useful to constrain the natural interpretation of a concept: “bank” will be understood differently if it is a narrower term of “finance” or “geography”. Some tools will analyse this semantic context, either to check similarities obtained by other techniques or to derive new similarities from existing ones. If two concepts from different vocabularies are semantically equivalent, this equivalence will positively influence the alignment tool when it examines the children of these concepts to find similarities between them.
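The “bank” example can be turned into a small sketch of structural reinforcement. The concept identifiers, the initial lexical scores and the size of the bonus are hypothetical; actual tools use considerably more elaborate propagation schemes.

```python
# Hypothetical hierarchies: concept -> its broader concept.
broader_v1 = {"v1:bank": "v1:finance"}
broader_v2 = {"v2:bank_money": "v2:finance", "v2:bank_river": "v2:geography"}

# Lexical comparison alone cannot separate the two candidates.
lexical_scores = {
    ("v1:bank", "v2:bank_money"): 1.0,
    ("v1:bank", "v2:bank_river"): 1.0,
}

# Equivalences already established between the two vocabularies.
aligned = {("v1:finance", "v2:finance")}


def rescore(scores, broader_a, broader_b, aligned_pairs, bonus=0.5):
    """Raise the score of a candidate pair whose broader concepts are aligned."""
    rescored = {}
    for (a, b), score in scores.items():
        parents = (broader_a.get(a), broader_b.get(b))
        rescored[(a, b)] = score + bonus if parents in aligned_pairs else score
    return rescored


scores = rescore(lexical_scores, broader_v1, broader_v2, aligned)
best = max(scores, key=scores.get)
print(best, scores[best])
```

Here the aligned parents break the tie in favour of the financial sense of “bank”.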
Extensional alignment techniques
The second kind of context comes from the actual usage of the concepts in real-life applications. For instance, a class from a classification scheme will be used to categorise a number of objects in a collection (e.g. books). Accessing this information will provide an “extensional” characterisation of the class' intended meaning – akin to its literary warrant. When documents are described using two different vocabularies[14], statistical techniques can be employed to compare the sets of documents described by the concepts from these vocabularies (Figure 5). A high degree of overlap between these sets will yield a high similarity between corresponding concepts. Several such techniques have already been tested in the KOS field, for example by Zhang (2006) and Isaac et al. (2007).
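As a hedged sketch of the overlap computation, the code below uses the Jaccard coefficient, which is one common overlap measure chosen here for simplicity and not necessarily the measure used in the cited studies. The concept names, book identifiers and the 0.4 threshold are hypothetical.

```python
# Hypothetical extensions: concept -> set of dually indexed books it describes.
extension_v1 = {"v1:birds": {"b1", "b2", "b3"}}
extension_v2 = {
    "v2:ornithology": {"b2", "b3", "b4"},
    "v2:law": {"b5"},
}


def jaccard(a, b):
    """Size of the intersection divided by size of the union (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def candidate_mappings(ext_a, ext_b, threshold=0.4):
    """Return (concept_a, concept_b, score) for pairs whose overlap is high enough."""
    found = []
    for ca, docs_a in ext_a.items():
        for cb, docs_b in ext_b.items():
            score = jaccard(docs_a, docs_b)
            if score >= threshold:
                found.append((ca, cb, score))
    return sorted(found, key=lambda item: -item[2])


print(candidate_mappings(extension_v1, extension_v2))
```

With these toy sets, “birds” and “ornithology” share two of four books and pass the threshold, while “birds” and “law” share none.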
Background knowledge-based alignment techniques
A final group of alignment methods relies on knowledge sources that are external to the application and the vocabularies being considered. Sources of different kinds can be used, for instance general-purpose ontologies like CYC[15] or semantic networks like Wordnet (Miller, 1995). These sources can contribute KOS-external knowledge to compensate for the lack of KOS-internal lexical or structural information. For example, a concept “calendar” from one thesaurus can be aligned to the more general concept “publication” from another thesaurus, using the hypernymy relation that holds between the two corresponding terms in Wordnet (Figure 6).
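The “calendar” example can be sketched with a toy hypernym table standing in for an external source such as Wordnet. The table below is invented for illustration and does not reproduce Wordnet's actual hierarchy or any Wordnet API.

```python
# Toy background knowledge: term -> its direct hypernym (hypothetical entries).
HYPERNYM = {
    "calendar": "publication",
    "publication": "work",
    "painting": "artwork",
}


def broader_terms(term):
    """Walk up the hypernym chain and return every more general term found."""
    chain = []
    seen = set()
    while term in HYPERNYM and term not in seen:  # guard against cycles
        seen.add(term)
        term = HYPERNYM[term]
        chain.append(term)
    return chain


def align_to_broader(label, target_labels):
    """Return the closest hypernym of label that is a label in the other thesaurus."""
    for candidate in broader_terms(label):
        if candidate in target_labels:
            return candidate
    return None


other_thesaurus = {"publication", "sculpture"}
print(align_to_broader("calendar", other_thesaurus))
```

The result would be recorded as a “broader than” link between the two concepts, to be confirmed by a human expert like any other candidate mapping.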
Integrated collection access: an example
To illustrate the potential of the described technology, we used it for creating integrated access to two collections belonging to two different Dutch CH institutions, the Rijksmuseum, and the National Library of the Netherlands (Gendt et al., 2006). The manuscripts collection contains 10,000 medieval illuminations, which are annotated by subject indices describing the content of the image. These indices come from the Iconclass classification scheme, a vocabulary of 25,000 elements designed for iconographical analysis. The masterpieces collection contains 700 objects such as paintings and sculptures, and its subjects are indexed using the Amsterdam Rijksmuseum Inter Actief (ARIA) “catalogue”, a vocabulary conceived mainly as a resource for hierarchical browsing.
Both vocabularies were translated into SKOS, and mappings between them were calculated with existing state-of-the-art mapping tools, namely, Falcon (Jian et al., 2005) and S-Match (Giunchiglia et al., 2005). Falcon uses a mixture of lexical and structural techniques. In addition to lexical techniques, S-Match uses Wordnet as background knowledge, and exploits “semantic reasoning” using a logical interpretation of the concepts based on the structure of the vocabularies.
We implemented a faceted browser, in which the mappings and the vocabularies' Semantic Web representations are exploited to provide integrated access to the collections, offering three different views: single, combined, and merged view.
The single view presents the integrated collections from the perspective of just one of the vocabularies. In the screen capture (Figure 7), the first four pictures come from the Rijksmuseum, the others are illuminated manuscripts. Browsing is done solely using the ARIA catalogue (i.e. these illuminations have been selected exploiting the mapping between the currently selected ARIA concept “Animal Pieces” and the Iconclass concept “25F:animals”).
The combined view provides simultaneous access to the collections through their respective vocabularies in parallel. This allows us to browse through the integrated collections as if they were a single collection indexed against two vocabularies. In Figure 8, we made a subject refinement to ARIA “Animal pieces”, and narrowed down our search with Iconclass to the subject “Classical Mythology and Ancient History”.
Finally, the merged view gives access to the collections through a merged thesaurus combining both original vocabularies into a single facet, based on the links found between them in the automatic mapping process. If we select the ARIA concept “Animal pieces”, the view provides both ARIA concepts (such as “Birds”) and Iconclass concepts (such as “29A:animals acting as human beings”) for further refining our search.
Discussion and conclusion
Existing alignment tools have been reported to perform poorly on real-life cases such as CH thesaurus alignment (Gendt et al., 2006). In fact, alignment is still an open research problem as no single technique is universally applicable, or will return satisfactory results. In practice, different techniques have to be carefully selected and combined, depending on the characteristics of the case at hand, such as the richness of the semantic structures of vocabularies, their lexical coverage and the existence of collections simultaneously described by several vocabularies. It should be noted, however, that a continuous improvement of techniques and tools can lead to significant improvements, as witnessed in the regular evaluation campaigns organised by the research community (Euzenat et al., 2006).
The Semantic Web-inspired methods and tools described in this paper still require further experimentation in practical applications, and a greater availability of vocabularies. Nevertheless, current representation and alignment techniques can be employed to build demonstrators that showcase the integration of collections at the semantic level, leading the way from separate islands of collections and vocabularies to better connected networks of CH knowledge.
One such demonstrator is described in the previous section of this paper; this faceted browser gives unified access to two collections, one of illuminated manuscripts and one of museum masterpieces, via either of their respective vocabularies. Other examples of web portals that illustrate the use of Semantic Web techniques in the CH domain can be seen on the websites of the MuseumFinland[15] and eCulture[16] projects. These projects, even if not focusing on semantic alignment, demonstrate the possible benefits of using Semantic Web technologies: the use of the SKOS representation format, the development of innovative interfaces to access Cultural Heritage collections, and the exploitation of automated reasoning techniques over RDF-based metadata.
Other portals are being created with enhanced functionality and usability, as the synergy between CH and SW communities increases. One example is the ongoing eCulture project, which was given the Semantic Web Challenge[17] award in 2006. In fact, the richness and high quality of CH data is very attractive to researchers of the Semantic Web community, as they have many tools but little real-life metadata to show their true potential. On the other hand, the CH domain (including digital libraries) could profit from techniques and tools developed by the SW community in creating a web of CH that delivers high quality content via easily accessible semantic search and navigation. Compared to current web searching techniques that are based on full text search (i.e. matching strings), semantic search – matching meanings – represents a huge advance.
Notes
- See www.theeuropeanlibrary.org (accessed 18 December 2007).
- See www.geheugenvannederland.nl (accessed 18 December 2007).
- The following is a simplified introduction to the Semantic Web. For further detail, the reader is encouraged to consult the Semantic Web Primer (Antoniou and Harmelen, 2004).
- See www.w3.org/RDF (accessed 18 December 2007).
- Figure 2 is an abstract representation of an RDF graph. Such a graph will usually be serialized in the form of an XML file, according to the RDF/XML syntax specified by the W3C.
- Nodes in the graph are RDF resources; labelled edges represent assertions of a property between the linked elements. The rdf: namespace stands for http://www.w3.org/1999/02/22-rdf-syntax-ns#, rdfs: for http://www.w3.org/2000/01/rdf-schema#, myVoc1: for http://example.org/voc1#, and myVoc2: for http://example.org/voc2#.
- SKOS is currently under scrutiny by the W3C Semantic Web Deployment Working Group and is planned to be published as a W3C Proposed Recommendation in 2008. See www.w3.org/2004/02/skos (accessed 18 December 2007).
- In the following “skos:” stands for http://www.w3.org/2004/02/skos/core#. (accessed 18 December 2007).
- See www.iconclass.nl (accessed 18 December 2007).
- In this paper, alignment refers to the creation of semantic relationships (e.g. equivalence) between concepts coming from different KOSs in order to solve interoperability problems. This notion approximates what is referred to in the KOS community by vocabulary mapping, crosswalk or reconciliation, and in the Semantic Web community by ontology alignment, mapping or matching.
- See www.gesis.org/en/research/information_technology/komohe.htm (accessed 18 December 2007).
- See www.d-nb.de/wir/projekte/crisscross.htm (accessed 18 December 2007).
- See http://macs.cenl.org (accessed 18 December 2007).
- See www.opencyc.org (accessed 18 December 2007).
- See www.museosuomi.fi (accessed 18 December 2007).
- See http://e-culture.multimedian.nl (accessed 18 December 2007).
- See http://challenge.semanticweb.org (accessed 18 December 2007).
Figure 1 Semantic heterogeneity hampers collection access
Figure 2 A Semantic Web RDF graph[6]
Figure 3 A SKOS graph partly representing the Iconclass subject 11F
Figure 4 Using vocabulary alignment for integrated access to different collections
Figure 5 Using object-level information to align vocabularies
Figure 6 Using background knowledge to align vocabularies
Figure 7 Single view: using the ARIA thesaurus to browse the two collections
Figure 8 Combined view: using ARIA and Iconclass to browse the two collections
References
Antoniou, G., Harmelen, F. van (2004), Semantic Web Primer, MIT Press, Cambridge, MA.
Balikova, M. (2005), "Multilingual subject access to catalogues of national libraries (MSAC), Czech Republic's collaboration with Slovakia, Slovenia, Croatia, Macedonia, Lithuania and Latvia", in Proceedings of the 71st IFLA General Conference and Council “Libraries – A voyage of discovery”.
Berners-Lee, T., Hendler, J., Lassila, O. (2001), "The Semantic Web", Scientific American, Vol. 284 No. 5, pp. 34-43.
Brickley, D., Guha, R.V. (2004), RDF Vocabulary Description Language 1.0: RDF Schema, W3C Recommendation, 10 February 2004.
Day, M., Koch, T., Neuroth, H. (2005), "Searching and browsing multiple subject gateways in the Renardus service", in Proceedings of the Sixth International Conference on Social Science Methodology.
Doerr, M. (2001), "Semantic problems of thesaurus mapping", Journal of Digital Information, Vol. 1 No. 8.
Euzenat, J., Shvaiko, P. (2007), Ontology Matching, Springer, Berlin.
Euzenat, J., Mochol, M., Shvaiko, P., Stuckenschmidt, H., Svab, O., Svatek, V., Hage, W. van, Yatskevich, M. (2006), "Results of the ontology alignment evaluation initiative 2006", in Proceedings of the First International Workshop on Ontology Matching, 5th International Semantic Web Conference.
Giunchiglia, F., Shvaiko, P., Yatskevich, M. (2005), "Semantic schema matching", in Proceedings of the 13th International Conference on Cooperative Information Systems (CoopIS 2005).
Harmelen, F. van (2005), "Ontology mapping: a way out of the medical tower of babel?", in Artificial Intelligence in Medicine: Proceedings of the 10th Conference on Artificial Intelligence in Medicine (AIME 2005).
Isaac, A., van der Meij, L., Schlobach, S., Wang, S. (2007), "An empirical study of instance-based ontology matching", in Proceedings of the 6th International Semantic Web Conference (ISWC 2007), Busan, Korea, 2007.
Jian, N., Hu, W., Cheng, G., Qu, Y. (2005), "Falcon-AO: aligning ontologies with Falcon", in Proceedings of the K-CAP Workshop on Integrating Ontologies, Banff, Canada, 2005.
Krause, J. (2003), "Standardization, heterogeneity and the quality of content indexing: a key conflict of digital libraries and its solution", in World Library and Information Congress: 69th IFLA General Conference and Council.
Landry, P. (2004), "Multilingual subject access: the linking approach of MACS", Cataloging and Classification Quarterly, Vol. 37 Nos 3-4, pp. 177-91.
Liang, A., Sini, M. (2006), "Mapping AGROVOC and the Chinese agricultural thesaurus: definitions, tools, procedures", New Review in Hypermedia and Multimedia, Vol. 12 No. 1, pp. 51-62.
Macgregor, G., McCulloch, E., Nicholson, D. (2007), "Terminology server for improved resource discovery: analysis of model and functions", in Proceedings of the Second International Conference on Metadata and Semantics Research.
McGuinness, D.L., Harmelen, F. van (2004), OWL Web Ontology Language Overview, W3C Recommendation, 10 February 2004.
Miles, A., Brickley, D. (2005), SKOS Core Guide, W3C Working Draft, 2 November 2005 (Work in progress).
Miller, G. (1995), "Wordnet: a lexical database for English", Communications of the ACM, Vol. 38 No. 11, pp. 39-41.
Shvaiko, P., Euzenat, J. (2005), "Ontology matching", D-Lib Magazine, In Brief, Vol. 11 No. 12.
van Assem, M., Malaise, V., Miles, A., Schreiber, G. (2005), "A method to convert thesauri to SKOS", in Proceedings of the Third European Semantic Web Conference, Budva.
van Gendt, M., Isaac, A., van der Meij, L., Schlobach, S. (2006), "Semantic Web techniques for multiple views on heterogeneous collections: a case study", in Proceedings of the 10th European Conference on Research and Advanced Technology for Digital Libraries (ECDL 2006).
Zhang, X. (2006), "Concept integration of document databases using different indexing languages", Information Processing and Management, Vol. 42, pp. 121-35.
Corresponding author
Antoine Isaac can be contacted at: aissac@few.vu.nl