Rangitoto · 12 min

AI Safety & Ethics

What you'll learn

  • Identify common AI hallucination patterns
  • Understand bias in AI outputs
  • Know when NOT to rely on AI
  • Practice responsible AI use

The Lesson Everyone Skips (But Shouldn't)

This is the lesson most people skim past because they want to get to the "cool stuff." We get it. But if you skip this lesson, you will eventually learn these concepts the hard way — by trusting AI output that turns out to be wrong at the worst possible moment.

Everything we have covered so far has been about how to use AI effectively. This lesson is about how to use AI responsibly. Both matter equally.

Let us start with the biggest trap.

Hallucinations: When AI Makes Things Up

In the first lesson, we defined hallucination as "when an AI generates information that sounds confident and plausible but is factually wrong." Now let us dig into what that actually looks like in practice, because it is more subtle than you might expect.

What Hallucination Looks Like

Here is the dangerous thing about AI hallucinations: they do not look like errors. They look like perfect, confident, well-written answers. The AI does not say "I am guessing here" or "I made this up." It presents fabricated information with exactly the same tone and confidence as accurate information.

Some real-world examples of hallucination:

  • Fake citations. Ask an AI to cite sources and it might generate plausible-looking academic paper titles, complete with author names and journal names — that do not exist. The format looks perfect. The papers are entirely fabricated.
  • Invented statistics. "According to a 2023 study, 73% of remote workers prefer asynchronous communication." That sounds real. The AI wrote it with confidence. But unless you verify it, you have no idea if that study or that number is real.
  • Fictional people and events. AI can generate biographies of people who do not exist, or describe historical events that never happened, with convincing detail.
  • Incorrect technical details. An AI might describe an API function that does not exist, or explain a medical condition with symptoms that are slightly wrong. Close enough to sound right, far enough to be dangerous.

⚠️ The confidence trap

The most dangerous hallucinations are not the obviously wrong ones — those are easy to spot. The dangerous ones are the ones that are 90% correct with one critical detail wrong. They slip past your defenses because most of the information checks out.

Why Hallucinations Happen

Remember from Lesson 1: LLMs are pattern-matching engines. They generate text that looks like a correct response based on patterns they learned during training. When the model encounters a question where it does not have strong patterns to draw from, it does not say "I do not know." Instead, it generates text that fits the pattern of what an answer should look like.

Think of it this way: if someone asked you to write a fake Wikipedia article about a made-up country, you could do it. You know what Wikipedia articles look like. You know the format, the tone, the structure. You could produce something convincing. That is essentially what the AI is doing when it hallucinates — it is producing text that matches the pattern of a correct answer, even when it does not have the actual correct answer.

How to Protect Yourself from Hallucinations

Here are practical strategies:

  1. Verify claims independently. If the AI tells you a fact, statistic, or quote, look it up. Do not assume it is correct because it sounds right.

  2. Be skeptical of specifics. The more specific the claim (exact numbers, dates, names, citations), the more likely it needs verification. General concepts are usually more reliable than precise details.

  3. Ask the AI about its confidence. "How confident are you in this response? What parts are you least sure about?" LLMs can often identify their own uncertainty when asked directly.

  4. Cross-check with a second source. Run important claims through a different AI tool, a search engine, or a human expert.

  5. Watch for "too good to be true" responses. If the AI gives you a perfectly complete, beautifully detailed answer to a question you expected to be hard, that is a yellow flag. Hard questions should produce nuanced, hedged responses.

🛠️ Catch the Hallucination

Try this exercise to calibrate your hallucination radar:

  1. Ask your AI tool: "Tell me about the history of [a small town near you]." Check the response against what you actually know or can quickly verify. How much is accurate?

  2. Ask your AI tool: "Can you cite three academic papers about [any topic you know well]?" Look up whether those papers actually exist. Check the author names, journal names, and dates.

  3. Ask your AI tool to solve a specific math problem and check the answer with a calculator.

This exercise is not meant to make you distrust AI entirely. It is meant to show you where the boundaries are so you can use AI wisely within them.

Bias in AI Outputs

Hallucination gets the most attention, but bias is arguably a bigger problem because it is harder to spot.

Where Bias Comes From

LLMs are trained on text from the internet and published sources. That text reflects all the biases of the people who wrote it. If the training data contains more content from certain perspectives, demographics, cultures, or viewpoints, the model's outputs will skew that way.

This is not a theoretical concern. It shows up in practical ways:

  • Cultural bias. Ask an AI to describe a "typical family" and it might default to Western, middle-class assumptions. Ask for a "professional outfit" and it might skew toward specific cultural norms.
  • Gender bias. AI might default to assuming doctors are male and nurses are female, or assign leadership qualities more readily to male-coded descriptions.
  • Recency and popularity bias. AI tends to favor information that was popular or frequently discussed online, which might not be the most accurate or representative information.
  • English-language bias. Most training data is in English, from English-speaking countries. Information about non-English-speaking cultures may be less accurate or nuanced.

💡 Bias is not a bug to be fixed — it is a reality to manage

It is tempting to think that AI companies can just "remove the bias." But bias in AI reflects bias in the data it was trained on, which reflects bias in human society. The companies are working on reducing harmful biases, but the problem will never be fully "solved." Your job is to be aware of it and compensate for it.

How to Manage Bias in Practice

  • Be explicit about the perspective you want. Instead of "Write about leadership styles," try "Write about leadership styles, including perspectives from different cultures and management traditions."
  • Ask for multiple viewpoints. "Give me three different perspectives on this issue" produces more balanced output than asking for one answer.
  • Notice what is missing. When the AI gives you a list, ask yourself: whose perspective is not represented? What assumptions are being made?
  • Challenge defaults. If the AI makes assumptions about gender, age, culture, or background, push back. "You assumed the user is male — rewrite this in a gender-neutral way."

When NOT to Rely on AI

There are specific situations where relying on AI output without verification is genuinely risky. Knowing these boundaries is as important as knowing how to use AI well.

High-Risk Categories

Medical information. AI can provide general health information, but it should never replace professional medical advice. An AI might miss important context about your specific situation, medications, or medical history. Always consult a healthcare professional for medical decisions.

Legal information. AI can explain general legal concepts, but it can hallucinate about specific laws, regulations, or precedents. Legal advice requires a licensed professional who understands your jurisdiction and specific circumstances.

Financial decisions. AI can help you think through financial concepts, but do not rely on it for specific investment advice, tax calculations, or financial planning. One wrong number could be costly.

Safety-critical systems. If you are writing code that controls physical systems, medical devices, or anything where a bug could cause harm, AI-generated code needs especially rigorous review and testing.

Personal information about real people. AI might generate incorrect or outdated information about real people. Do not rely on AI to tell you someone's current role, contact information, or background without verifying independently.

⚠️ Warning

The general rule: the higher the stakes, the more verification you need. For a casual email, a quick skim is enough. For a legal contract, a medical recommendation, or production code, verify everything with appropriate human expertise.

The "Three-Source Rule"

For anything important, we recommend the three-source rule: verify the AI's output with at least two additional independent sources. If you asked the AI, check with a search engine and a human expert. If all three agree, you are probably safe. If any disagree, investigate further.
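If you like thinking in code, the three-source rule can be sketched as a tiny decision helper. This is purely an illustration: the function name, the source names, and the return strings are our own, not part of any tool.

```python
def three_source_check(answers: dict) -> str:
    """Apply the three-source rule: the AI plus at least two independent sources.

    `answers` maps a source name (e.g. "AI", "search engine", "expert")
    to the claim that source gave you. All names here are illustrative.
    """
    if len(answers) < 3:
        return "not enough sources: gather more"
    if len(set(answers.values())) == 1:
        return "all sources agree: probably safe"
    return "sources disagree: investigate further"

print(three_source_check({
    "AI": "founded in 1892",
    "search engine": "founded in 1892",
    "local historian": "founded in 1892",
}))  # all sources agree: probably safe
```

The point of writing it out is to see that the rule is mechanical: agreement is not proof, but disagreement is always a signal to dig deeper.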

Privacy and Data Considerations

When you use an AI tool, you are sending your data to that company's servers. This has real implications you should think about.

What to Consider

  • Do not paste sensitive data into AI tools unless you understand the provider's data policy. Company secrets, personal information about employees or customers, passwords, API keys, and proprietary code all deserve caution.
  • Read the privacy policy. Most AI tools say they do not use your conversations to train their models, but policies vary and can change. Claude, for example, does not train on your conversations by default.
  • Be careful with client data. If you are working with client information, check whether your agreement with them allows you to process their data through third-party AI services.
  • Use the appropriate tier. Business and enterprise plans from AI providers typically come with stronger data protections and contractual guarantees than free or consumer plans.

💡 Tip

A simple rule of thumb: do not paste anything into an AI tool that you would not be comfortable seeing on a public website. If it is sensitive, either anonymize it first or use an AI tool with appropriate enterprise-grade data protections.
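If you choose the "anonymize it first" route, even a short script can help as a first pass. Here is a minimal sketch in Python, assuming you are comfortable running a script; the two regex patterns are illustrative only and will miss many kinds of identifiers, so a human review is still required before pasting.

```python
import re

# Illustrative patterns only: real redaction needs more patterns
# (names, addresses, account numbers, ...) and a human double-check.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def redact(text: str) -> str:
    """Replace obvious identifiers with placeholders before pasting into an AI tool."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label.upper()}]", text)
    return text

print(redact("Contact jane.doe@example.com or +1 (555) 123-4567."))
```

Treat the output as a draft: scan it yourself before sharing, since no pattern list catches everything.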

The "Automation Complacency" Problem

There is a psychological trap that comes with using AI: the more you use it, the less critically you evaluate its output. Researchers call this "automation complacency" — the tendency to over-trust automated systems.

You have probably experienced this in other areas. Think about GPS navigation: the first time you used it, you probably double-checked every turn. After years of use, you follow it blindly — even when it tells you to turn onto a road that is clearly wrong.

The same thing happens with AI. In the beginning, you will read every response carefully. After weeks of good responses, you will start skimming. After months, you might stop checking entirely. That is when mistakes slip through.

How to Fight Complacency

  • Build verification into your workflow. Do not rely on willpower to check things. Create a process: before publishing any AI-generated content, run it through a verification checklist.
  • Keep a "mistake log." When AI gets something wrong, write it down. Reviewing your mistake log periodically reminds you that errors are real and ongoing, not a thing of the past.
  • Rotate your verification methods. Sometimes verify by checking sources manually. Sometimes ask a different AI. Sometimes ask a colleague. Variety keeps you sharp.
  • Stay curious. When the AI gives you an answer, occasionally ask "How do you know that?" or "What are you unsure about?" Maintaining a questioning posture is the best defense against complacency.
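A mistake log does not need special software; a notebook works fine. For those who prefer something scriptable, here is a minimal sketch in Python. The filename and column names are our own choices, not a standard.

```python
import csv
import datetime
import pathlib

# Illustrative filename; pick whatever location suits your workflow.
LOG = pathlib.Path("ai_mistake_log.csv")

def log_mistake(tool: str, claim: str, what_was_wrong: str) -> None:
    """Append one AI error to a CSV file for periodic review."""
    is_new_file = not LOG.exists()
    with LOG.open("a", newline="") as f:
        writer = csv.writer(f)
        if is_new_file:
            writer.writerow(["date", "tool", "claim", "what_was_wrong"])
        writer.writerow([
            datetime.date.today().isoformat(),
            tool,
            claim,
            what_was_wrong,
        ])

log_mistake("example-chatbot", "cited a 2023 study", "the study does not exist")
```

Skimming this file once a month is usually enough to keep the "errors are real and ongoing" lesson fresh.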

🐾 Haku says

Even the most well-behaved cat still knocks things off the table sometimes. Not because they are bad cats, but because curiosity is in their nature and even the most graceful climber can misjudge a leap. AI is the same: reliable most of the time, but always worth keeping an eye on.

Responsible AI Use: A Framework

Let us wrap up with a simple framework for responsible AI use that you can carry with you through the rest of this course and beyond:

  1. Verify before you trust. Check important facts. Do not assume accuracy because it sounds right.
  2. Disclose when appropriate. If you used AI to help write something, consider whether your audience should know that. In many professional contexts, transparency about AI assistance is becoming expected.
  3. Protect sensitive data. Think before you paste. Keep private information private.
  4. Consider the impact. Before publishing or sharing AI-generated content, think about who might be affected and whether the content could cause harm.
  5. Stay informed. AI capabilities and limitations change rapidly. What is true today might not be true next month. Keep learning.

🛠️ Build Your Verification Checklist

Create a personal AI verification checklist that you will use going forward. Here is a starting template — customize it for your needs:

  1. Facts and statistics: Have I verified any specific claims with an independent source?
  2. Citations and references: If the AI cited sources, have I confirmed they exist?
  3. Code: Have I tested the code and reviewed it for security issues before using it?
  4. Personal information: Does this contain claims about real people that need verification?
  5. Sensitive data: Did I accidentally include any private or confidential information in my prompt?
  6. Bias check: Does the output make assumptions about gender, culture, or demographics that should be questioned?
  7. Tone check: Does this sound like something I would actually say or write?

Save this checklist somewhere accessible. Use it every time you are about to share or publish AI-generated content.
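If you want the checklist in executable form, it can be sketched as a small script. The item wording below mirrors the template above; the function and its pass/fail logic are our own illustration.

```python
# The seven checklist items from the template above, abbreviated.
CHECKLIST = [
    "Facts and statistics: verified specific claims with an independent source?",
    "Citations and references: confirmed that any cited sources exist?",
    "Code: tested and reviewed for security issues before using it?",
    "Personal information: verified any claims about real people?",
    "Sensitive data: confirmed no private info went into the prompt?",
    "Bias check: questioned assumptions about gender, culture, demographics?",
    "Tone check: does this sound like something I would actually write?",
]

def ready_to_publish(answers: list) -> bool:
    """Return True only if every checklist item has been answered yes."""
    return len(answers) == len(CHECKLIST) and all(answers)

print(ready_to_publish([True] * 7))               # True: all items pass
print(ready_to_publish([True] * 6 + [False]))     # False: one item failed
```

The all-or-nothing rule is deliberate: a single unchecked item (one unverified citation, one untested code block) is exactly where mistakes slip through.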

Congratulations: Rangitoto Complete

You have made it through Rangitoto. Let us recap what you now know:

  • What AI is: A pattern-matching engine trained on massive amounts of text, not a sentient being or an infallible oracle.
  • Which tools to use: Claude as your primary, with Gemini and ChatGPT filling specific niches.
  • How to talk to AI: The 8-part prompt framework for structured, effective communication.
  • How to do real work: The interview-first technique, context files, iterative refinement, and meta-prompts.
  • How to stay safe: Hallucination awareness, bias management, privacy considerations, and a verification mindset.

You have a solid foundation. Now it is time to put it into practice with real tools.

Paw Print Check

Before moving on, make sure you can answer these:

  • 🐾 Can you describe what a hallucination is and give an example of one?
  • 🐾 Do you know three strategies for protecting yourself from hallucinations?
  • 🐾 Can you explain how bias shows up in AI outputs?
  • 🐾 Can you name three situations where you should NOT rely on AI without verification?
  • 🐾 Have you created your own AI verification checklist?

Next Up

Setting Up Claude Desktop

Install Claude Desktop, create your account, and start your first real AI workflow.
