A corpus of Australian contract language: Description, profiling and analysis

Michael Curtotti*, Eric C. McCreath

*Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

    16 Citations (Scopus)

    Abstract

    Written contracts are a fundamental framework for economic and cooperative transactions in society. Little work has been reported on the application of natural language processing or corpus linguistics to contracts. In this paper we report the design, profiling and initial analysis of a corpus of Australian contract language. This corpus enables a quantitative and qualitative characterisation of Australian contract language as an input to the development of contract drafting tools. Profiling of the corpus is consistent with its suitability for use in language engineering applications. We provide descriptive statistics for the corpus and show that document length and document vocabulary size approximately follow log-normal distributions. The corpus conforms to Zipf's law, and comparative type-to-token ratios are consistent with lower term sparsity, as expected for legal language. We highlight distinctive term usage in Australian contract language. Results derived from the corpus indicate greater prepositional phrase depth in sentences in contract rules extracted from the corpus than in other corpora.

    Original language: English
    Title of host publication: 13th International Conference on Artificial Intelligence and Law, ICAIL 2011 - Proceedings of the Conference
    Pages: 199-208
    Number of pages: 10
    Publication status: Published - 2011
    Event: 13th International Conference on Artificial Intelligence and Law, ICAIL 2011 - Pittsburgh, PA, United States
    Duration: 6 Jun 2011 to 10 Jun 2011

    Publication series

    Name: Proceedings of the International Conference on Artificial Intelligence and Law

    Conference

    Conference: 13th International Conference on Artificial Intelligence and Law, ICAIL 2011
    Country/Territory: United States
    City: Pittsburgh, PA
    Period: 6/06/11 to 10/06/11
