Name: Security, Ethics, and Adversarial Resilience
Price: 59 INR
Rating: 0

1

Adversarial Prompt Attacks and Jailbreaking

This topic surveys the landscape of adversarial prompting, where attackers craft inputs to bypass safeguards or manipulate outputs. It introduces examples such as instructing the model to ignore previous instructions, role-playing scenarios that justify harmful actions, or encoding instructions in hidden formats. Learners review analyses from security researchers documenting how...

This topic surveys the landscape of adversarial prompting, where attackers craft inputs to bypass safeguards or manipulate outputs. It introduces examples such as instructing the model to ignore previous instructions, role-playing scenarios that justify harmful actions, or encoding instructions in hidden formats. Learners review analyses from security researchers documenting how prompt attacks can cause models to reveal internal prompts, generate prohibited content, or execute unauthorized tool calls. The topic classifies attacks by technique and objective, such as prompt leaking, policy evasion, or data exfiltration, and introduces mental models for thinking like an attacker so that defenses can be better designed.

2

Information Leakage and Data Protection

This topic explores the privacy and confidentiality dimensions of prompt engineering. It explains how secrets such as API keys, personal identifiers, or confidential business data may enter prompts, be stored in logs, or be memorized by models over time. Learners review data protection principles such as minimization (sending only what...

This topic explores the privacy and confidentiality dimensions of prompt engineering. It explains how secrets such as API keys, personal identifiers, or confidential business data may enter prompts, be stored in logs, or be memorized by models over time. Learners review data protection principles such as minimization (sending only what is necessary), pseudonymization, redaction, and differential access controls. The topic compares architectural options, from public SaaS APIs to private, self-hosted models running within a secure perimeter. It also touches on vendor policies regarding training on customer data and how to configure data retention and logging options in AI platforms. Learners design prompt templates that avoid embedding secrets and define operational controls for cleaning or anonymizing content before it is sent to external services.

3

Guardrail Implementation and Safety Filters

This topic introduces guardrail systems that sit before or after the main LLM call. Before-call guardrails may rewrite prompts, block certain inputs, or require additional authorization for sensitive actions. After-call guardrails inspect model outputs for policy violations, such as hate speech, self-harm instructions, or explicit content, and either block, redact,...

This topic introduces guardrail systems that sit before or after the main LLM call. Before-call guardrails may rewrite prompts, block certain inputs, or require additional authorization for sensitive actions. After-call guardrails inspect model outputs for policy violations, such as hate speech, self-harm instructions, or explicit content, and either block, redact, or re-route responses. Learners examine commercial and open-source guardrail libraries and learn how to encode organizational policies in machine-readable form. The topic emphasizes that guardrails are part of a layered defense strategy alongside responsible prompting and model-level safety mechanisms, and discusses trade-offs such as false positives, user friction, and explainability of blocked outputs.

4

Bias Detection and Mitigation in Prompting

This topic addresses how biases present in training data and in human-authored prompts can lead to discriminatory or unrepresentative outputs. Learners examine examples of gender, race, and socio-economic bias in text and images and practice designing evaluation prompts that surface such patterns. They then explore mitigation techniques including neutral or...

This topic addresses how biases present in training data and in human-authored prompts can lead to discriminatory or unrepresentative outputs. Learners examine examples of gender, race, and socio-economic bias in text and images and practice designing evaluation prompts that surface such patterns. They then explore mitigation techniques including neutral or counter-stereotypical prompt wording, explicit fairness goals in instructions, balanced and diverse few-shot examples, and the use of bias detection tools that scan outputs. The topic emphasizes that mitigation is an ongoing process requiring monitoring, stakeholder input, and organizational accountability, not a one-time fix.

5

Transparency, Documentation, and Incident Response

This topic addresses the operational side of responsible AI. Learners create model cards or system cards for their prompt-based systems, describing purposes, capabilities, limitations, and known risks. They document prompt templates, evaluation methods, guardrail settings, and change histories. The topic describes incident response processes for AI-specific issues, including how to...

This topic addresses the operational side of responsible AI. Learners create model cards or system cards for their prompt-based systems, describing purposes, capabilities, limitations, and known risks. They document prompt templates, evaluation methods, guardrail settings, and change histories. The topic describes incident response processes for AI-specific issues, including how to detect, triage, investigate, and remediate problems such as harmful outputs, data leaks, or policy violations. Learners design communication plans for internal and external stakeholders and connect incident learning back into prompt and system improvements. The topic reinforces that transparency and responsiveness are critical for building and maintaining trust in AI systems.

6

Prompt Injection and Goal Hijacking Defenses

This topic focuses on defensive design patterns. Learners study guidelines such as never treating untrusted user input as instructions, keeping system prompts private, and sanitizing or delimiting user content so the model can distinguish it from instructions. They examine structured formats like XML or JSON that label which parts of...

This topic focuses on defensive design patterns. Learners study guidelines such as never treating untrusted user input as instructions, keeping system prompts private, and sanitizing or delimiting user content so the model can distinguish it from instructions. They examine structured formats like XML or JSON that label which parts of the prompt are data versus directives, and how model APIs increasingly support function/tool calling to constrain behaviors. The topic explains layered defenses including static scanning of prompts, pattern-based filters, and runtime checks that validate tool call arguments. Learners design example system prompts that instruct the model to treat user text as data only, to refuse to follow meta-instructions that conflict with safety rules, and to log suspicious interactions for analysis.

7

Privacy, Regulation, and Responsible AI Governance

This topic situates prompt engineering within broader regulatory and governance contexts. It introduces concepts such as lawful basis for data processing, data subject rights, records of processing, and impact assessments. Learners map AI use cases to regulatory categories (e.g., high-risk vs. low-risk under EU AI legislation) and identify what this...

This topic situates prompt engineering within broader regulatory and governance contexts. It introduces concepts such as lawful basis for data processing, data subject rights, records of processing, and impact assessments. Learners map AI use cases to regulatory categories (e.g., high-risk vs. low-risk under EU AI legislation) and identify what this implies for transparency and oversight. They design documentation for prompts and systems that records intended use, limitations, training data assumptions, and risk mitigations. The topic emphasizes cross-functional collaboration with legal, security, and compliance teams and highlights the need for governance structures that can adapt to rapidly changing AI regulations.

Security, Ethics, and Adversarial Resilience

Description

Learning Objectives

Topics (7)