Tutorial 14 Apr 2024

- Overview of traditional (non-LLM) trustworthy machine learning, based on the presenter's book "Trustworthy Machine Learning"
  - Definitions of trustworthiness and safety in terms of aleatoric and epistemic uncertainty
  - AI fairness
  - Human-centered explainability
  - Adversarial robustness
  - Control-theoretic view of transparency and governance
- What are the new risks
  - Information-related risks
    - Hallucination, lack of factuality, and lack of faithfulness
    - Lack of source attribution
    - Leakage of private information
    - Copyright infringement and plagiarism
  - Interaction-related risks
    - Hateful, abusive, and profane language
    - Bullying and gaslighting
    - Inciting violence
    - Prompt injection attacks
- Brief discussion of moral philosophy
- How to change the behavior of LLMs
  - Data curation and filtering
  - Supervised fine-tuning
  - Parameter-efficient fine-tuning, including low-rank adaptation (illustrated in the first sketch after this outline)
  - Reinforcement learning from human feedback (see the reward-model sketch after this outline)
  - Model reprogramming and editing
  - Prompt engineering and prompt tuning (see the soft-prompt sketch after this outline)
- How to mitigate risks in LLMs and make them safer
  - Methods for training data source attribution based on influence functions (see the influence-function sketch after this outline)
  - Methods for in-context source attribution based on post hoc explainability methods
  - Equi-tuning, the fair infinitesimal jackknife, and fairness reprogramming
  - Aligning LLMs to unique user-specified values and constraints stemming from use-case requirements, social norms, laws, industry standards, etc., via policy elicitation, parameter-efficient fine-tuning, and red-team audits
  - Orchestrating multiple, possibly conflicting, values and constraints
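
To make the parameter-efficient fine-tuning item concrete, here is a minimal sketch of low-rank adaptation (LoRA) in plain PyTorch. The `LoRALinear` wrapper, the rank, and the scaling values are illustrative assumptions, not code from the tutorial: the pretrained weight is frozen, and only the low-rank factors A and B are trained.

```python
# A minimal LoRA sketch, assuming plain PyTorch; names and hyperparameters are illustrative.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen linear layer with a trainable low-rank update: W x + (alpha/r) * B A x."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pretrained weights
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no update at start
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)

# Usage: only the low-rank factors receive gradients.
layer = LoRALinear(nn.Linear(768, 768))
out = layer(torch.randn(4, 768))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 2 * 8 * 768 parameters instead of 768 * 768
```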
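
For reinforcement learning from human feedback, the first stage is typically training a reward model on human preference pairs. Below is a minimal sketch of that stage's pairwise (Bradley-Terry style) loss; the tiny feed-forward scorer and fixed-size feature vectors are stand-ins for a language-model backbone and are assumptions for illustration only.

```python
# A minimal sketch of the pairwise preference loss used to train an RLHF reward model.
# The scorer is a toy stand-in for an LLM backbone; the loss is -log sigmoid(r_chosen - r_rejected).
import torch
import torch.nn as nn

reward_model = nn.Sequential(nn.Linear(16, 32), nn.Tanh(), nn.Linear(32, 1))

def preference_loss(chosen_feats: torch.Tensor, rejected_feats: torch.Tensor) -> torch.Tensor:
    r_chosen = reward_model(chosen_feats)
    r_rejected = reward_model(rejected_feats)
    # Push the score of the human-preferred response above the dispreferred one.
    return -torch.nn.functional.logsigmoid(r_chosen - r_rejected).mean()

# One training step on a batch of 8 preference pairs (random stand-in features).
opt = torch.optim.Adam(reward_model.parameters(), lr=1e-3)
loss = preference_loss(torch.randn(8, 16), torch.randn(8, 16))
opt.zero_grad()
loss.backward()
opt.step()
```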
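
Prompt tuning differs from prompt engineering in that the prompt is a small set of trainable embedding vectors rather than text. A minimal sketch, assuming a frozen embedding table as a stand-in for a pretrained model's input embeddings (dimensions are illustrative):

```python
# A minimal prompt-tuning sketch: trainable "soft prompt" vectors are prepended
# to frozen input embeddings. The frozen table is a stand-in, not a specific LLM's API.
import torch
import torch.nn as nn

d_model, n_prompt = 64, 10
frozen_embed = nn.Embedding(1000, d_model)
frozen_embed.weight.requires_grad = False  # pretrained embeddings stay fixed
soft_prompt = nn.Parameter(torch.randn(n_prompt, d_model) * 0.02)  # the only trainable part

def embed_with_prompt(token_ids: torch.Tensor) -> torch.Tensor:
    """Prepend the soft prompt to each sequence in the batch."""
    tok = frozen_embed(token_ids)                                   # (batch, seq, d_model)
    prompt = soft_prompt.unsqueeze(0).expand(tok.size(0), -1, -1)   # broadcast over the batch
    return torch.cat([prompt, tok], dim=1)                          # (batch, n_prompt + seq, d_model)

x = embed_with_prompt(torch.randint(0, 1000, (2, 5)))
print(x.shape)  # torch.Size([2, 15, 64])
```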
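
For training data source attribution, influence functions estimate how much each training example contributed to a given prediction via I(z_i, z_test) = -∇L(z_test)ᵀ H⁻¹ ∇L(z_i). A minimal sketch of that formulation, assuming a logistic-regression model small enough to invert the Hessian exactly; at LLM scale this inverse must be approximated.

```python
# A minimal influence-function sketch (Koh & Liang style) for training data attribution.
# Exact Hessian inversion only works because the model here is tiny.
import torch

torch.manual_seed(0)
X = torch.randn(20, 3)
y = (X[:, 0] > 0).float()
w = torch.zeros(3, requires_grad=True)

def loss_fn(w_, X_, y_):
    return torch.nn.functional.binary_cross_entropy_with_logits(X_ @ w_, y_)

# Fit with a few gradient steps (stand-in for training).
opt = torch.optim.SGD([w], lr=0.5)
for _ in range(200):
    opt.zero_grad()
    loss_fn(w, X, y).backward()
    opt.step()

# Influence of training example i on the loss at a test point:
#   I(z_i, z_test) = -grad L(z_test)^T  H^{-1}  grad L(z_i)
x_test, y_test = torch.randn(1, 3), torch.tensor([1.0])
g_test = torch.autograd.grad(loss_fn(w, x_test, y_test), w)[0]
H = torch.autograd.functional.hessian(lambda w_: loss_fn(w_, X, y), w.detach())
H_inv = torch.linalg.inv(H + 1e-3 * torch.eye(3))  # damping for numerical stability

influences = []
for i in range(len(X)):
    g_i = torch.autograd.grad(loss_fn(w, X[i:i+1], y[i:i+1]), w)[0]
    influences.append(-(g_test @ H_inv @ g_i).item())
print(max(range(len(X)), key=lambda i: influences[i]))  # index of the most influential example
```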