The KIM Technology Watch Report: http://metadaten-twr.org

International Standard Name Identifier: An introduction

Author: Juha Hakala

Director, IT development, The National Library of Finland

juha.hakala@helsinki.fi

Member of the ISNI working group

Abstract:
The International Organization for Standardization (ISO) [1] is developing the International Standard Name Identifier (ISO 27729) as a standard identifier for public identities of parties. At the time of writing (December 2009) the document has reached Draft International Standard status. Much information is available at http://www.isni.org/ [2], including a useful FAQ [3]. This article describes the standard and potential implementation challenges.

Parties and their public identities

The main aim of the ISNI standard is to enable the disambiguation of public identities that might otherwise be confused. For instance, there are six different Juha Hakalas in the Finnish National Bibliography, each with his own ordinal number (see [6]). Because both the year of birth and year of death are considered confidential for 100 years after the death of the person, this information can neither be shown in OPACs nor included in the bibliographic and authority records delivered by the national library to partners such as OCLC.

In the Finnish National Bibliography, the Juha Hakalas are disambiguated, though not in an informative way. In WorldCat all Juha Hakalas are merged into one. The ISNI, a globally unique and persistent identifier, will be an efficient tool for disambiguating names. An ISNI carries no confidential personal data, so it should be possible to include ISNIs in bibliographic and authority records. Although the ISNI is currently just an emerging standard, ISNIs will eventually be understood by most bibliographic information systems.

ISNI usage rules are simple, at least in principle. But it is important to understand the application scope of “standard identifier of public identities of parties”. “Party” is a generic term referring to a natural or legal person or a fictional character, or any group of these entities. Both past and present natural persons are included; ISNIs can be applied retrospectively to authors such as Homer. “Public identity” is an identity a party has chosen to use for publishing purposes. For example, Frederic Dannay and Manfred B. Lee used the identity Ellery Queen to publish 1,238 works in 3,390 publications in 31 languages (see [7]).

Alternative spellings of the same public identity will get the same ISNI. This also applies to character set and transliteration variants. There are many ways to write the name of Anton Pavlovic Cehov (a long but not complete list of variants can be found in the OCLC identity record available at [8]), but these variants all refer to the same public identity and must get the same ISNI.

A party may have multiple public identities. For instance, early in his literary career Anton Cehov chose to publish under the pseudonym Antosa Cehonte in order to hide his true identity. Antosa Cehonte will get its own ISNI, unlike for instance Anton Tšehov, the most common variant spelling of Cehov’s name in Finland. There are many of those variants depending on the transliteration system and national traditions applied. With the help of a fully developed ISNI reference database (see below), a single search will retrieve all these variant forms, including other public identities of a party. The intention is to link such identities to one another in the ISNI database.
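
The relationship between a party, its public identities and their variant spellings can be sketched as a small data model. This is only an illustration under stated assumptions: the class names, the function names and the PLACEHOLDER identifiers are invented for this example and are not taken from the ISNI draft. Each name maps to a list of hits, so that homonyms such as the six Juha Hakalas would not overwrite one another.

```python
from dataclasses import dataclass, field


@dataclass
class PublicIdentity:
    isni: str                                    # placeholder label, not a real ISNI
    preferred_name: str
    variants: set = field(default_factory=set)   # alternative spellings, same ISNI


@dataclass
class Party:
    identities: list = field(default_factory=list)  # one entry per public identity


def build_index(parties):
    """Map every known spelling to the (party, identity) pairs that use it."""
    index = {}
    for party in parties:
        for identity in party.identities:
            for name in {identity.preferred_name, *identity.variants}:
                index.setdefault(name.casefold(), []).append((party, identity))
    return index


def search(index, name):
    """Return (matched identity, other identities of the same party) per hit."""
    results = []
    for party, identity in index.get(name.casefold(), []):
        others = [i for i in party.identities if i is not identity]
        results.append((identity, others))
    return results


# Hypothetical sample data based on the example in the text.
cehov = Party([
    PublicIdentity("PLACEHOLDER-1", "Anton Pavlovic Cehov", {"Anton Tšehov"}),
    PublicIdentity("PLACEHOLDER-2", "Antosa Cehonte"),
])
index = build_index([cehov])
for identity, others in search(index, "Anton Tšehov"):
    print(identity.isni, [o.preferred_name for o in others])
```

A search on the Finnish variant spelling finds the identity that owns it and also reports the pseudonym as another public identity of the same party, which is the behaviour described above.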

An ISNI could in principle be given to every natural and legal person on earth, but in practice the scope of the system will be limited to people and organizations involved with artistic creation in one form or another. All six Juha Hakalas mentioned earlier have written one or more textual works and may therefore receive an ISNI, but other Juha Hakalas may not, unless they have created other kinds of works. Due to their limited scope, ISNIs will neither compete with, nor replace, existing personal identifiers such as social security numbers.

The idea of developing an identifier for public identities of parties in the area of cultural creation is not new. For instance, the International Standard Authority Data Number (ISADN) was first proposed in 1984. However, as Patton describes, libraries chose not to develop a new identifier system and concentrated instead on defining the functional requirements for authority records [4]. Disambiguation of names has a much longer history, during which sophisticated rules have been developed (see for instance the German guidelines available at [9]). Although ISNI cannot re-use existing name identifiers (ISNI syntax does not allow this), authority databases such as the German National Library’s Personennamendatei (PND) can supply metadata for the ISNI system.

ISNI syntax

The working group developing the ISNI had to choose between two options: making the identifier semantic (with an internal structure with embedded meaning) or non-semantic (“dumb”). Of the existing identifiers, the ISBN represents the former, while the ISSN is an example of the latter. The strongest option for a semantic ISNI was to start with a country code. This would have enabled decentralized ISNI assignment, but such functionality might have been a mixed blessing, as two or more countries could have given an ISNI to a public identity they “shared”.

A dumb ISNI assigned centrally does not imply any nationality or region, so assigning an ISNI to famous artists of diverse cultural background does not pose a problem. Austria, Hungary and Germany need not argue to whom Franz Liszt – or Ferenc Liszt – “belongs”, since the ISNI avoids the entire question: all variants of the name — the two listed here, and other forms preferred elsewhere — are equal, and the ISNI itself does not make any statements on nationality, at least not at the level of the syntax.

Each ISNI consists of 16 characters. The last one is a check character, so only 15 digits carry the number itself, which leaves room for 10^15 (one thousand trillion) identifiers. The working group is, however, confident that, unlike the ISBN, the ISNI will not run out of identifiers after just a few decades of use.

When displayed, an ISNI will look like this:

ISNI 1422 4586 3573 0476

or like this:

ISNI 3456 7890 3456 666X
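
The article does not say how the check character is calculated. The sketch below assumes the ISO 7064 MOD 11-2 scheme, which yields a digit or the letter X (standing for the value 10). The function names are invented for this illustration.

```python
DIGITS = "0123456789"


def check_character(first15):
    """Check character for 15 digits, assuming ISO 7064 MOD 11-2."""
    if len(first15) != 15 or any(ch not in DIGITS for ch in first15):
        raise ValueError("expected exactly 15 decimal digits")
    total = 0
    for ch in first15:
        total = (total + int(ch)) * 2
    value = (12 - total % 11) % 11
    return "X" if value == 10 else str(value)


def compact(isni):
    """Strip the 'ISNI' label and the spaces used in the display form."""
    text = isni.strip().upper()
    if text.startswith("ISNI"):
        text = text[4:]
    return text.replace(" ", "")


def is_valid(isni):
    """True if the string has 16 characters and a matching check character."""
    text = compact(isni)
    if len(text) != 16 or any(ch not in DIGITS for ch in text[:15]):
        return False
    return check_character(text[:15]) == text[15]


def display(text16):
    """Format 16 characters in the display form: label plus four groups of four."""
    return "ISNI " + " ".join(text16[i:i + 4] for i in range(0, 16, 4))


print(is_valid("ISNI 1422 4586 3573 0476"))
print(display("1422458635730476"))
```

Under this assumption the first display example above validates. The second example shows the X form of the check character, but it does not pass this particular check and is presumably meant to illustrate the display format only.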

ISNI governance

The ISNI system will be governed by a Registration Authority (RA). The RA will be responsible for creating and maintaining the ISNI reference database. It will also take care of the administration and governance of the ISNI Standard. The ISNI RA has not yet been elected, but there is a candidate consortium with considerable expertise in this area. ISO TC 46 SC9 will elect the candidate, and the nominated RA will be submitted to ISO’s Technical Management Board for appointment.

In addition to the ISNI RA, there will be ISNI Registration Agencies (RAGs). They will be appointed by the ISNI International Agency (operating as RA), and their main job is to act as interfaces between the users and the Registration Authority. Individual authors must be represented by a RAG; they cannot address the RA directly. Any entity with a proven interest in the scope of ISNI will be eligible to become a RAG. No exclusivity will be granted either on a territorial basis, or on a market segment basis such as the Book Publishing industry or the Music Industry. However, the ISNI RA will define different classes of registrants and rules governing their authority to act on behalf of or in respect of parties in applying for an ISNI [5, p. 6]. It is not clear how these classes will be defined and what rights they will have.

The functioning of the entire ISNI system will be heavily dependent on the quality of the reference database. From the very beginning it ought to be as exhaustive and as free of errors as possible. RAGs will only have read access to the data; the RA and the RA alone is entitled to add, change or delete the ISNI metadata, based on applications made by RAGs.

The founding candidate members of the International Agency which operates as the ISNI Registration Authority are:

  • CISAC – International Confederation of Societies of Authors and Composers
  • IFRRO – International Federation of Reproduction Rights Organisations
  • IPDA – International Performers’ Database Association
  • Bowker
  • OCLC – Online Computer Library Center
  • Bibliothèque Nationale de France and the British Library

These organizations already maintain large registers of public identities — or authority databases as we prefer to call them in libraries. The creation of the ISNI database and the initial assignment of identifiers will be a massive task, but with these candidates the system will get off to a good start. The rest will depend also on RAGs. The first RAGs may be nominated in 2010, if the ISNI draft is approved. The ballot will be completed in March 2010.

ISNI metadata

When RAGs make an application for an ISNI, they must send ISNI metadata to the RA for each public identity to be identified. This metadata must be detailed enough to enable disambiguation in the ISNI reference database. RAGs must also check from the reference database that the public identity in question has not yet received an ISNI. Such a check may save a lot of time and also money: ISNIs may not be free, although the RA (and the RAGs themselves) will operate under the not-for-profit model applicable to all ISO standards. Even if an ISNI has already been assigned, RAGs may still want to provide additional variant spellings of the name, or to modify or extend the ISNI metadata.

Problems will inevitably occur: one public identity may erroneously get several ISNIs when variant forms of the name are not matched properly in the ISNI reference database, and there may be a need to split one public identity (such as Juha Hakala) into several identities. The RA will be responsible for fixing these problems.

There are only three mandatory metadata elements: name of public identity, type of party (natural person, legal person, etc.), and one or more of the following elements:

  1. URI, the machine-readable link to an external set of data relating to activities in the creation class and/or role defined in 2) and 3).
  2. Creation class, definition of repertoire picked from an allowed value list published by the ISNI Registration Authority (e.g. musical work, literary work, audio-visual work).
  3. Role the party has played in the production of the creation, picked from a value list published by the ISNI Registration Authority (e.g. author, publisher, director).

For instance, Juha Hakala is the author of several literary works, including the one at hand.
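
The mandatory elements listed above can be expressed as a simple completeness check on an application record. This is a minimal sketch, not the RA's actual schema: the field names are invented, and the value lists contain only the examples mentioned in this article, since the real lists are to be published by the ISNI Registration Authority.

```python
from dataclasses import dataclass, field

# Placeholder value lists built from the examples in the text (assumption).
PARTY_TYPES = {"natural person", "legal person", "fictional character", "group"}
CREATION_CLASSES = {"musical work", "literary work", "audio-visual work"}
ROLES = {"author", "publisher", "director"}


@dataclass
class Application:
    name: str                                             # name of public identity
    party_type: str                                       # type of party
    uris: list = field(default_factory=list)              # links to external data
    creation_classes: list = field(default_factory=list)
    roles: list = field(default_factory=list)


def problems(app):
    """Return a list of reasons why the application is incomplete."""
    found = []
    if not app.name.strip():
        found.append("name of public identity is missing")
    if app.party_type not in PARTY_TYPES:
        found.append("unknown type of party: " + repr(app.party_type))
    if not (app.uris or app.creation_classes or app.roles):
        found.append("need at least one URI, creation class or role")
    for value in app.creation_classes:
        if value not in CREATION_CLASSES:
            found.append("unknown creation class: " + repr(value))
    for value in app.roles:
        if value not in ROLES:
            found.append("unknown role: " + repr(value))
    return found


hakala = Application("Juha Hakala", "natural person",
                     creation_classes=["literary work"], roles=["author"])
print(problems(hakala))
```

The record for Juha Hakala passes the check, yet it would be identical for all six Juha Hakalas in the Finnish National Bibliography.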

It is obvious that this metadata would often be insufficient for disambiguation. The ISNI RA is responsible for adding to the ISNI metadata record, possibly on request of one or more RAGs, two more data elements: date with type of date, and place with type of place. A date can be year of birth or year of death or both for people, and year of registration and year of dissolution for organisations. A place may be, for instance, country of birth, death or the place where the artist flourished. Lists will be provided for the values allowed.

There are two additional elements that can be used to complement the authority records sent to the RA. The ISNI RA can specify the following, on its own initiative or at the request of one or more RAGs:

  • Related ISNI, the ISNI of another public identity related to the party associated with the current public identity.
  • Relationship, the allowed value describing the nature of the relationship as published by the ISNI Registration Authority (e.g. is the pseudonym of).

These elements provide a means for linking related public identities to one another.
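
One way to picture this linking is as a graph in which each Related ISNI / Relationship pair forms an edge between two public identities. The sketch below is an assumption about how such links could be followed, not a description of the planned database; the identifiers are placeholders and the function names are invented.

```python
from collections import defaultdict

# (public identity, relationship, related identity); placeholder identifiers.
links = [
    ("ID-CEHONTE", "is the pseudonym of", "ID-CEHOV"),
]


def build_graph(link_rows):
    """Store every link in both directions so it can be followed from either end."""
    graph = defaultdict(set)
    for source, _relationship, target in link_rows:
        graph[source].add(target)
        graph[target].add(source)
    return graph


def linked_identities(graph, start):
    """Collect all identities reachable from start through any chain of links."""
    seen = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        for neighbour in graph.get(current, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    seen.discard(start)
    return seen


graph = build_graph(links)
print(linked_identities(graph, "ID-CEHOV"))
```

Following the links transitively means that a party with three or more public identities is retrieved in full even if each record only points to one of the others.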

Compared with some existing systems such as the German PND, ISNI metadata is somewhat simpler. Whether this may lead to disambiguation problems is not clear.

The ISNI RA will develop guidelines for updating the metadata in the reference database. This task is challenging, since the ISNI RA must correct errors and resolve conflicts between ISNI applications sent by RAGs. It is not a simple task to find all alternative spellings of a public identity, since problems often begin at a national level. Kustaa Mauri Armfelt and Gustav Mauritz Armfelt are the same person, but these and other variant spellings of Armfelt’s name may initially get separate ISNIs.

Developing algorithms for automatic mapping of variants can be demanding, especially when the differences get more pronounced. Most Germans can guess who Michaelis Schumacheris is, although the Latvian spelling of the name differs from the original, but an algorithm may miss even an alternative spelling which is obvious for a human. Chinese or Japanese spelling variants might be hard to compare even for humans when transliterated back to the Latin script.
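
A naive form of such matching can be sketched with the Python standard library: strip diacritics and case, then compare the strings by generic similarity. This is not the method the ISNI RA will use, which has not been published; the function names and the threshold of 0.8 are assumptions made for this illustration.

```python
import difflib
import unicodedata


def normalise(name):
    """Lower-case the name, strip diacritics and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(stripped.casefold().split())


def similarity(a, b):
    """Similarity between 0 and 1 after normalisation (generic string measure)."""
    return difflib.SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def candidate_matches(name, known_names, threshold=0.8):
    """Known names similar enough to be reviewed as possible variants."""
    return [k for k in known_names if similarity(name, k) >= threshold]


print(similarity("Kustaa Mauri Armfelt", "Gustav Mauritz Armfelt"))
print(candidate_matches("Anton Pavlovic Cehov",
                        ["Anton Pavlovič Čehov", "Juha Hakala"]))
```

A measure like this catches differences in diacritics and small spelling differences, but it knows nothing about transliteration systems, so variants that differ systematically from the original would still need dedicated rules or human review.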

Conclusion

Even before development of ISNI started in ISO, the library community had discussed the possibility of establishing an identifier for names. These talks were not fruitful as the technical challenges related to establishing such a system seemed too difficult to resolve.

In the end it took the concerted effort of authors’ and composers’ societies, the book trade and libraries to get the standardization effort started. The ISNI standard is not yet an accepted ISO standard, and the required technical infrastructure has not been built, but as a member of the ISNI working group I am optimistic that something highly useful will emerge from this effort. The ISNI registry does not need to be perfect to be useful. There will be errors and omissions in the registry database when it is published, but close co-operation between the RA and the RAGs will gradually improve the quality and coverage of the system.

In those countries where personal data security is of major importance, it is important to clarify, before the ISNI system becomes operational, what kind of personal data can be supplied to the ISNI RA without violating local data protection laws. The National Library of Finland is not allowed to send any data concerning the author’s birth or death to its peers, but the ISNI central database can be treated differently. The ISNI database itself will not be publicly available in the same way as most OPACs are, since only RAGs will have read access to it (no technical details have yet been published on exactly how this will happen). If necessary, some metadata elements could, at least in theory, be hidden from all RAGs except the one which supplied the data.

References:
[1] http://www.iso.org/iso/home.htm

[2] http://www.isni.org/

[3] ISNI Frequently Asked Questions. Version 2.0 November 2009. Electronic resource, available at http://www.isni.org/docs/isni_faq.pdf. (checked 2009-11-22).

[4] Patton, Glenn: FRANAR: A Conceptual Model for Authority Data. Electronic resource, available at http://www.sba.unifi.it/ac/relazioni/patton_eng.pdf (checked 2009-12-14).

[5] ISO/DIS 27729. International Standard Name Identifier. Geneva, International Organization for Standardization, 2009.

[6] https://fennica.linneanet.fi/

[7] http://www.worldcat.org/wcidentities/lccn-n79-139599

[8] http://www.worldcat.org/wcidentities/lccn-n79-130807

[9] http://www.d-nb.de/standardisierung/pdf/praxisregel_individualisierung_911.pdf


More information about the author: Juha Hakala



3 Responses to “International Standard Name Identifier: An introduction”

  1. Thom Hickey Says:

    There seem to be typos in both the WorldCat Identities URIs. Here are pointers to the production versions of Ellery Queen and Anton Chekov, along with pointers to the corresponding VIAF records:

    Ellery Queen:
    http://www.worldcat.org/wcidentities/lccn-n79-139599
    http://www.viaf.org/viaf/7376791
    Chekov:
    http://www.worldcat.org/wcidentities/lccn-n79-130807
    http://viaf.org/viaf/95216565

    –Th

  2. Andreas Gros Says:

    Thank you very much, Thom! We’ve changed the URIs.

    – Andreas

  3. Janifer Gatenby Says:

    Note that the ISNI database will only contain representative resources, usually only one, and that all an ISNI can guarantee is that the metadata associated with the ISNI is correct. There is no control of databases applying existing ISNIs to creators of resources and there will be some errors in this regard that are beyond the reach of the ISNI system. This will mean that most “splitting” is done outside of ISNI maintenance and will be manifested as new ISNI requests.
    –Janifer
