Project Information
Large Language Models (LLMs) have become a focal point of AI research and are being applied to increasingly complex tasks across industries. In this proposal, our objective is to adapt existing pre-trained LLMs so that they are more applicable to governance use cases. Although these LLMs are trained on extensive open-domain datasets, they often lack access to crucial government policy, law, and regulation documents, which limits their usability. For example, they may be unable to answer questions such as “What happens to the installed solar panels if there is en-bloc/SERS?” or “Why is there no electricity storage on the residential block rooftop?”
To address these limitations, we propose techniques to incorporate external knowledge into pre-trained LLMs. With this knowledge integrated, LLMs can take into account the rules and regulations of various government agencies and provide more accurate responses to user queries. We aim to achieve this through two complementary approaches.
First, we will further train LLMs on law and policy documents from government organizations, enriching their understanding of legal frameworks. Second, at inference time, when a query is received, we will extract pertinent information from law and policy documents and supply this knowledge to the LLM, enabling it to provide contextually appropriate responses. However, a significant challenge remains: ensuring the veracity of LLM-generated responses, which is crucial when employing LLMs in governance applications. Given the known propensity of LLMs to hallucinate and generate confident but erroneous information, we propose techniques to verify the outputs of LLMs, thereby fostering trustworthiness and enhancing their utility.
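As a rough illustration of the second approach, the sketch below ranks candidate passages against a query and places the best matches into the model input. It assumes that knowledge is injected through the prompt and uses simple word overlap as a stand-in for whatever retrieval method is eventually chosen; the function names, the prompt wording, and the parameter k are hypothetical and not part of the proposal.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_score(query, passage):
    """Count tokens shared by query and passage (a crude relevance proxy)."""
    q = Counter(tokenize(query))
    p = Counter(tokenize(passage))
    return sum(min(q[t], p[t]) for t in q)


def retrieve(query, passages, k=2):
    """Return the k passages with the highest overlap score.

    sorted() is stable, so ties keep their original order.
    """
    ranked = sorted(passages, key=lambda p: overlap_score(query, p), reverse=True)
    return ranked[:k]


def build_prompt(query, passages, k=2):
    """Assemble a prompt that places the retrieved passages before the question.

    The instruction wording is an assumption made for this sketch.
    """
    selected = retrieve(query, passages, k)
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(selected))
    return (
        "Answer using only the numbered passages below and cite them.\n\n"
        f"Passages:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )
```

The resulting string would be passed to the LLM in place of the bare query. In practice the overlap score would likely be replaced by a stronger retriever; only the overall flow of retrieve, then inject, then generate is the point here.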
In this proposal, we plan to address this challenge by means of knowledge grounding. Our proposed approach will verify text generated by LLMs against external knowledge sources so that the output can be trusted. We prioritize the principles of helpfulness and honesty in the behavior of LLMs. To support the trustworthiness of responses, we will provide explanations backed by relevant verified articles. By grounding responses in factual and authoritative sources, we aim to enhance the reliability and credibility of the information provided. As in any other use case, LLMs employed in governance applications must also behave responsibly.
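To make the idea of verifying generated text against external sources more concrete, the following sketch checks each sentence of an answer against a list of source passages and records which source, if any, supports it. Lexical overlap is used purely as a placeholder for a real verification method, which this proposal does not specify; the function names and the 0.6 threshold are assumptions chosen for illustration.

```python
import re


def tokens(text):
    """Return the set of lowercase alphanumeric tokens in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def support(sentence, source):
    """Fraction of the sentence's tokens that also appear in the source."""
    s = tokens(sentence)
    if not s:
        return 0.0
    return len(s & tokens(source)) / len(s)


def verify(answer, sources, threshold=0.6):
    """Score every sentence of the answer against the sources.

    Each report entry holds the sentence, its best support score, and the
    index of the supporting source, or None if no source reaches the
    (assumed) threshold.
    """
    report = []
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        if not sentence:
            continue
        scores = [support(sentence, src) for src in sources]
        best = max(scores, default=0.0)
        cited = scores.index(best) if scores and best >= threshold else None
        report.append({"sentence": sentence, "score": best, "source": cited})
    return report
```

Sentences whose source is None could be removed, regenerated, or flagged to the user, while supported sentences can be returned together with the cited source as the explanation.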
For instance, consider the query “Suggest ways to install a CCTV at the main door of my HDB flat”. According to MND rules, installing a CCTV camera at the main door of an HDB flat is prohibited. However, an LLM that lacks knowledge of this rule may provide misleading information. Our proposed responsible LLM would instead refuse to answer this question, citing the relevant rules of the Singapore Government. To achieve this, we will create an instruction dataset that pairs unethical queries with the appropriate responses that should be generated.
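One possible shape for a record in such an instruction dataset is sketched below, using the CCTV query above as the example. The field names, the JSON Lines serialization, and the response wording are illustrative assumptions; the proposal does not fix a schema.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class RefusalExample:
    """One instruction-tuning record (hypothetical schema)."""
    instruction: str            # the user query
    response: str               # the response the model should learn to give
    rule_source: str            # where the cited rule comes from
    should_refuse: bool = True  # whether the desired behavior is a refusal


def to_jsonl(examples):
    """Serialize the records as JSON Lines, one JSON object per line."""
    return "\n".join(json.dumps(asdict(e), ensure_ascii=False) for e in examples)


# Example record built from the query discussed in the text; the response
# wording is a placeholder, not an official statement.
cctv_example = RefusalExample(
    instruction="Suggest ways to install a CCTV at the main door of my HDB flat",
    response=(
        "I cannot help with this request, because installing a CCTV camera "
        "at the main door of an HDB flat is prohibited."
    ),
    rule_source="MND rules",
)
```

Calling to_jsonl([cctv_example]) yields one line of JSON that can be appended to a training file alongside non-refusal examples.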