Empirical Evaluation of Latest Generation Language Models for Anonymisation of Singaporean Court Decisions and Case Files

Project Information

The increasing availability of large language models (LLMs) presents opportunities for the judiciary to enhance data accessibility while maintaining confidentiality. This study evaluates whether current-generation LLMs can reliably and safely anonymise judicial data, such as court decisions and case files, to facilitate controlled data sharing for research, legal technology development, and transparency.

To achieve this, we will conduct systematic experiments using both local computing infrastructure and secure cloud services. Domain experts will assess the anonymisation performance of LLMs by comparing it against human anonymisation efforts. The study will also explore federated learning and human-in-the-loop architectures to assess whether they can maintain high privacy standards while improving automation efficiency. In addition, a literature review of best practices in anonymisation and privacy standards will help align the work with global regulatory and ethical frameworks. The final research findings will be published in a renowned academic journal to contribute to both judicial and scholarly discourse.

This research is highly relevant to the judiciary, as it provides empirical insights into anonymisation efficacy, potentially enabling safer and more scalable data-sharing frameworks. By improving access to anonymised judicial data, this initiative could enhance legal research, case analysis, and AI-driven tools for drafting judgments and supporting self-represented litigants. Furthermore, the project aligns with Singapore’s goal of being a global leader in judicial technology, reinforcing its commitment to transparency and access to justice.

Given the sensitive nature of judicial data, strict data governance protocols, secure infrastructure, and trained personnel will be employed to ensure confidentiality. If necessary, judicial computing resources can be used to mitigate security risks. The Empirical Judicial Research Grant Scheme provides an ideal framework for carrying out this study to the highest ethical and security standards.


