Answers to Questions on Metadata

Answers to the questions posed in the Consultation Paper on Developing Common Standards Metadata Scheme for Websites in the Legal and Advice Sectors of November 2000

Question 1: Do you agree that the CLS should take overall responsibility for the structure and content of a Metadata Scheme for websites in the Legal and Advice Sectors?

Answer: Coordination in this field is important. Whether the CLS is the right body, I cannot judge.

Question 2: Do you agree that there should be a Metadata Scheme covering all websites in the Legal and Advice Sectors?

Answer: A good aim, but perhaps too ambitious to achieve. More likely, the Metadata Scheme to be developed will coexist with other metadata schemes. It should be ensured that the schemes supplement each other and communicate as well as possible.

Question 3: Do you agree that the LAMS Metadata Scheme should conform closely to "Simple" Dublin Core?

Answer: Yes

Question 4: Do you agree with the proposed modifications of "Simple" Dublin Core to meet the needs of the Legal and Advice Sectors?

Answer: Yes

Question 5: Do you agree that website owners should be responsible for the creation and maintenance of metadata on their websites?

Answer: Yes

Question 6: Do you agree that XML should be adopted as the standard for metadata definition?

Answer: Yes, in particular RDF. See also my general remarks.

Question 7: Do you agree that the process outlined in paragraph 2.19.1 above should be adopted for use in the Legal and Advice Sectors?

Answer: Yes, but further possibilities should be taken advantage of. I refer to my general remarks.

Question 8: Do you agree that content classification scheme(s) with supporting thesaurus are required to support the proposed metadata scheme?

Answer: Yes

Question 9: Do you agree that the CLS should commission its own content classification scheme with supporting thesaurus for the Subject metadata element?

Answer: Yes. The Paper is not clear about the format in which the content classification scheme and thesaurus will be described. If RDF is used to describe the metadata, I would recommend that RDF also be used to define the content classification scheme and thesaurus. See also my general remarks.

Question 10: Do you agree that ISO standards should be adopted where appropriate for other metadata elements?

Answer: Yes.

General Remarks:

I support the aims of the Project. I will describe two implications of the Project which, from my perspective, may and should have an impact on its planning and execution. One is "downstream", the other "upstream".

Downstream

The Project is about metadata, so in principle not about the data within a document (in the traditional sense, not in the sense of an HTML document), but about the data "outside" the document. Some metadata may, however, be extracted from the document itself, such as the date (of creation) and the author. The Project envisages a content classification scheme and thesaurus (together referred to as "the Classification Scheme"). The Classification Scheme will provide many keywords and the relationships between those keywords. Traditionally these relationships are of a hierarchical nature. If RDF Schema is used, the keywords may be structured with the help of classes and subclasses, where needed with multiple inheritance. If the metadata are described in RDF, it seems only logical that the Classification Scheme be described in the same language, to allow for easy interaction between the two schemes. By structuring the Classification Scheme with RDF, a so-called "RDF dictionary" would come into existence. For a good example of an RDF dictionary I refer to www.dataconsortium.org, in particular DCN DCD v1.0 (XML) (version of 8 October 2000; a next version has been announced).
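
To illustrate what classes and subclasses with multiple inheritance could mean for searching, here is a minimal sketch in Python. It does not use RDF itself; it only models the subclass relationship as a plain table. All keyword names are hypothetical and are not taken from any actual classification scheme.

```python
# Hypothetical keywords: each keyword lists the broader keywords
# (its "superclasses") directly above it in the scheme.
SUBCLASS_OF = {
    "LegalTopic": [],
    "Housing": ["LegalTopic"],
    "Debt": ["LegalTopic"],
    "RentArrears": ["Housing", "Debt"],  # multiple inheritance
}


def ancestors(keyword, scheme=SUBCLASS_OF):
    """Return every broader keyword reachable via subclass links."""
    seen = set()
    stack = list(scheme.get(keyword, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(scheme.get(parent, []))
    return seen


def matches(document_keyword, query_keyword, scheme=SUBCLASS_OF):
    """A document matches a query on its own keyword or any broader one."""
    return (document_keyword == query_keyword
            or query_keyword in ancestors(document_keyword, scheme))
```

Under these assumptions, a document tagged "RentArrears" would be found by a search for "Housing" as well as by a search for "Debt", which a strictly single-parent hierarchy could not express.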

The actual information the Project wishes to make more accessible to citizens is contained in documents. A number of initiatives exist in the legal world to mark up legal documents with XML elements, capturing the information content in a structure. I refer to www.legalxml.org and its European code law counterpart, www.lexml.de. LegalXML is developing standard DTDs for various legal documents. Further information and examples may be found on its homepage. Lexml is a forum where several continental European XML initiatives in the legal field meet. For an example of a DTD for German-language judicial decisions see "der Saarbrücker Standard" (http://edvgt.jura.uni-sb.de/Tagung00/ak00/XML.htm). Capturing the structure of information ideally starts right from the beginning, i.e. at the moment a document is written. That way metadata are far more easily attached to a document. Instead of having someone read the document, decide which metadata should be attached and then enter these metadata manually, the metadata can be extracted largely automatically from the XML mark-up. It will take some time for the practice to evolve in which, for instance, courts deliver their judgements with XML mark-up. At the rate XML is now conquering the web, this future may, however, not be too far away. In fact, the Project could accelerate this process by encouraging the various players in the legal market to deliver their documents with XML mark-up.
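
The automatic extraction described above might look roughly as follows. This is a sketch only: the element names in the sample judgement and the table relating them to Dublin Core elements are assumptions made for illustration, not part of any existing DTD or of the proposed Scheme.

```python
import xml.etree.ElementTree as ET

# Hypothetical mark-up of a judgement; the element names are invented.
SAMPLE = """<judgment>
  <court>Example Court</court>
  <date>2000-11-30</date>
  <author>Example Judge</author>
  <body>Text of the decision.</body>
</judgment>"""

# Assumed relation between mark-up elements and "Simple" Dublin Core elements.
ELEMENT_TO_DC = {"court": "publisher", "date": "date", "author": "creator"}


def extract_metadata(xml_text):
    """Collect metadata values from the mark-up without manual entry."""
    root = ET.fromstring(xml_text)
    metadata = {}
    for tag, dc_name in ELEMENT_TO_DC.items():
        node = root.find(tag)
        if node is not None and node.text:
            metadata[dc_name] = node.text.strip()
    return metadata
```

Calling extract_metadata(SAMPLE) yields the publisher, date and creator directly from the document, which is the step that would otherwise require someone to read the text and key in the values.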

Many DTDs and schemas (for instance XML Schema, as an alternative to a DTD) will be developed. These DTDs and schemas will have to be able to exchange structured information. They need a common source through which they can communicate. This common source could be exactly the RDF dictionary which the Project could deliver. The keywords used in the RDF dictionary may serve as element names for the DTDs and schemas. Even if the RDF keywords do not exactly fit the element names needed for a particular mark-up, the DTD or schema may refer to the RDF dictionary by explaining (in RDF) what the relationship between the RDF keyword and the element name is. This is the concept of "mapping", which will be of central importance in the years to come when providing information on the web.
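
The idea of mapping through a common dictionary can be sketched as follows. Two hypothetical DTDs use different element names, and each declares how its names relate to a keyword of the shared dictionary. In practice that relationship would be expressed in RDF; here it is reduced to simple tables, and every name is invented for illustration.

```python
# Each hypothetical DTD maps its own element names to dictionary keywords.
DTD_A_TO_KEYWORD = {"claimant": "Plaintiff", "ruling_date": "DecisionDate"}
DTD_B_TO_KEYWORD = {"Klaeger": "Plaintiff", "Entscheidungsdatum": "DecisionDate"}


def translate(record, source_map, target_map):
    """Rename the fields of a record from one DTD to another via the keywords."""
    keyword_to_target = {kw: name for name, kw in target_map.items()}
    out = {}
    for name, value in record.items():
        keyword = source_map.get(name)           # element name -> keyword
        target = keyword_to_target.get(keyword)  # keyword -> other element name
        if target is not None:
            out[target] = value                  # unmapped fields are dropped
    return out
```

The point of the design is that each DTD needs only one mapping, to the dictionary, rather than a separate mapping to every other DTD it may ever exchange information with.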

Upstream

British law, whether one likes it or not, is to an ever greater extent made not in London but in Brussels. A significant part of the information which is the subject of the Project has its source in Brussels. The positive side to this: Great Britain has this source in common with the other EU member states, which creates a potential for exchange. In these countries, too, legal RDF dictionaries will emerge. It would be of great advantage if these RDF dictionaries could "communicate" with one another. A search with English concepts in the French and German corpora of legal information would be the resulting gain. This matters not only where foreign judgements influence the interpretation of British law based on EU law: a British exporter, for instance, could be guided on agency regulations in Italy, and a British citizen planning to take a job on the continent would have easier access to the foreign legal information relevant to his plans. The other way around, a Dutch company preparing an investment in Great Britain would be able to gather preliminary information by searching with the concepts it is used to in the Netherlands. It would mean missing a historic chance not to take these possibilities into consideration. Great Britain is probably one of the first EU members to start an XML initiative at government level in the legal field. If Great Britain sets a good example, other countries may follow, to the advantage of all parties involved.

RDF itself provides the appropriate mechanism for mapping between RDF dictionaries. In effect, one creates an RDF dictionary one layer of abstraction higher. Lexml and LegalXML have just started a joint experiment with an RDF dictionary of higher order, which is geared towards the goal of automated exchange of structured legal information across jurisdictional and language borders.
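
A dictionary of higher order can be pictured, in a much simplified form, as a table that links one abstract concept to the terms used in each national dictionary. The sketch below is illustrative only: the concept identifier is invented, the terms are merely examples, and nothing here reflects the actual structure of the Lexml and LegalXML experiment.

```python
# Hypothetical higher-order dictionary: one abstract concept, with the
# term each national dictionary uses for it (illustrative terms only).
HIGHER_ORDER = {
    "concept:commercial-agent": {
        "en": "commercial agent",
        "de": "Handelsvertreter",
        "nl": "handelsagent",
    },
}


def equivalents(term, language, dictionary=HIGHER_ORDER):
    """Find the terms other languages use for the same abstract concept."""
    for labels in dictionary.values():
        if labels.get(language) == term:
            return {lang: label for lang, label in labels.items()
                    if lang != language}
    return {}
```

A search entered with the English term could then be rerun against German or Dutch sources using the equivalent terms, without the searcher needing to know them.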

If the above leads the Project to perceive itself as potentially playing a central role in the development of the "semantic web" in the legal field, my wish will have been fulfilled.

Berlin, January 2001

Murk Muller
Attorney at law, admitted in the Netherlands and Germany
Initiator of Lexml

Schuchardtweg 5
14109 Berlin
Tel: +49-30-80602491
Fax: +49-30-80602492
e-mail: mm@mmrecht.com