Deploying a Free Tier VM on Oracle Cloud with Terraform: A Debugging Story
A step-by-step walkthrough of every error encountered, why it happened, and how it was fixed — written for those new to Oracle Cloud Infrastructure (OCI) and Terraform.
🗺️ Background: What Are We Even Doing?
Before diving into the errors, let's establish what all the moving pieces are.
What is Oracle Cloud Infrastructure (OCI)?
OCI is Oracle's cloud computing platform — think of it like renting space in a giant, remote data centre. Instead of buying a physical server, you ask Oracle to spin up a virtual machine (VM) for you. Oracle's Always Free Tier allows you to run two small VMs at no cost, forever. That's what we're trying to create here.
What is Terraform?
Terraform is an Infrastructure-as-Code (IaC) tool made by HashiCorp. Instead of clicking through a web UI to create resources, you write configuration files (.tf files) that describe what you want, and Terraform figures out how to build it. Think of it like writing a recipe — Terraform is the chef that follows it.
What is the OCI Terraform Provider?
Terraform needs a translator to talk to each cloud. The OCI provider (oracle/oci) is that translator for Oracle Cloud. It takes your Terraform instructions and converts them into OCI API calls behind the scenes.
How Does Terraform Talk to OCI?
OCI uses API key authentication. Think of it like a physical key and a lock:
- You generate a private key (kept secret on your machine; this is the key)
- You upload a public key to OCI (the lock)
- OCI also stores a fingerprint, a short unique identifier of the lock, so you can reference it without exposing the full key
When Terraform wants to create something in OCI, it signs the request with your private key. OCI checks the signature against the registered public key. If they match, the request is allowed.
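The sign-and-verify idea can be tried locally with openssl. This is a simplified sketch, not OCI's actual request format: it signs a plain text file rather than an HTTP request, and the key pair and file names are throwaway values invented for the demonstration.

```shell
#!/bin/sh
# Illustration only: sign a message with a private key, then verify it with
# the matching public key. Uses a throwaway key pair, NOT a real OCI API key.
set -eu

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Generate the private key and derive its public half.
openssl genrsa -out "$workdir/demo_private.pem" 2048 2>/dev/null
openssl rsa -in "$workdir/demo_private.pem" -pubout \
  -out "$workdir/demo_public.pem" 2>/dev/null

# A stand-in for the request that would be sent.
printf 'launch one VM, please\n' > "$workdir/request.txt"

# The caller signs with the private key...
openssl dgst -sha256 -sign "$workdir/demo_private.pem" \
  -out "$workdir/request.sig" "$workdir/request.txt"

# ...and the receiver checks the signature with the public key.
verify_result=$(openssl dgst -sha256 -verify "$workdir/demo_public.pem" \
  -signature "$workdir/request.sig" "$workdir/request.txt")
echo "$verify_result"
```

When the signature matches, openssl prints Verified OK. Verifying against a different public key fails, which is the same situation as the fingerprint mismatch described later.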
🗂️ Project Structure
The Terraform project had the following files:
terraform_learn/
├── compute.tf # Defines the VM instance
├── variables.tf # Declares input variable names
├── terraform.tfvars # Provides the actual values (credentials, OCIDs)
├── iam_policy.tf # IAM permission policies
└── outputs.tf # What to print after a successful apply
The workflow was:
terraform plan -out=tfplan.out # Dry-run: what WOULD be created?
terraform apply tfplan.out # Actually create it
🐛 Error 1 — Wrong Authentication Method
The Error
401-NotAuthenticated, The required information to complete authentication was not provided
What is a 401 Error?
A 401 Unauthorized error means "I don't know who you are." OCI rejected the request entirely before even checking permissions. It's like arriving at a hotel and not having any ID at all — you never even get to the front desk.
What Caused It?
The Terraform provider was configured with:
auth = "CloudShell"
OCI Cloud Shell is Oracle's browser-based terminal. It has its own built-in authentication — when you use it interactively, OCI automatically knows who you are through your browser session. But when Terraform runs, it isn't a browser session. Setting auth = "CloudShell" tells Terraform to use that browser-session method, which simply doesn't work for programmatic API calls.
The Fix
Change the auth method to APIKey, which uses the cryptographic key pair:
auth = "APIKey"
🐛 Error 2 — Private Key Path Not Resolving
The Error
401-NotAuthenticated, could not find private key
What is a Private Key Path?
When using API key auth, Terraform needs to find the private key file on disk. The path in terraform.tfvars was:
private_key_path = "~/.oci/oci_api_key.pem"
The ~ symbol is a shell shorthand for your home directory (e.g., /home/fongyang). It's like writing "my house" instead of "123 Main Street" — everyone in a conversation understands it, but a computer program needs the full address.
What Caused It?
The OCI Terraform provider does not expand the ~ shorthand. It tried to find a file literally named ~/.oci/oci_api_key.pem — which doesn't exist. The tilde was never substituted.
The Fix
Use the absolute path:
private_key_path = "/home/fongyang/.oci/oci_api_key.pem"
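If you would rather not type the absolute path by hand, a small shell helper can do the substitution before the value goes into terraform.tfvars. This is a minimal sketch: expand_tilde is a hypothetical helper written for this article, not part of Terraform or the OCI CLI, and it only handles a leading ~ or ~/ for the current user.

```shell
#!/bin/sh
# expand_tilde: hypothetical helper that turns a leading "~" into $HOME.
# Paths that do not start with "~" are printed unchanged.
expand_tilde() {
  case "$1" in
    "~")
      printf '%s\n' "$HOME" ;;
    "~/"*)
      # Strip the literal "~/" prefix (quoted so the shell does not expand it).
      rest=${1#"~/"}
      printf '%s\n' "$HOME/$rest" ;;
    *)
      printf '%s\n' "$1" ;;
  esac
}

key_path=$(expand_tilde "~/.oci/oci_api_key.pem")
echo "private_key_path = \"$key_path\""

# Warn early if the resolved path does not point at a file.
if [ ! -f "$key_path" ]; then
  echo "warning: no file at $key_path" >&2
fi
```

The printed line can be pasted into terraform.tfvars as-is.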
🐛 Error 3 — Fingerprint Mismatch
The Error
401-NotAuthenticated, The fingerprint claimed by the user does not match
What is a Fingerprint?
A fingerprint is a short hash derived from your public key — a condensed identifier. Think of it like a person's face vs. their driver's license number. The face (public key) contains all the information, but the license number (fingerprint) is a quick reference. OCI uses the fingerprint to look up which public key to use when verifying a request.
What Caused It?
The fingerprint in terraform.tfvars:
fingerprint = "16:30:10:a2:b7:3f:a7:19:67:b6:d5:f4:8b:0d:d4:b1"
...did not match the fingerprint of the private key file being used (/home/fongyang/.oci/oci_api_key.pem). This happens when:
- Multiple API keys have been generated over time
- The .tfvars file was copied from a different setup
- The wrong key file is referenced
It's like trying to open a lock with the right brand of key, but it's the key to a different lock.
The Fix
Re-derive the correct fingerprint directly from the key file on disk and update the config:
openssl pkey -in /home/fongyang/.oci/oci_api_key.pem -pubout -outform DER \
| openssl dgst -md5 -c | awk '{print $2}'
Then paste the result into terraform.tfvars.
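To avoid comparing two long hex strings by eye, the same openssl pipeline can be wrapped in a check. This is a rough sketch: the function names are hypothetical, and it assumes the fingerprint is written in terraform.tfvars as a single fingerprint = "..." line, as in this project.

```shell
#!/bin/sh
# Hypothetical helpers for comparing a key file against terraform.tfvars.

# Print the MD5 fingerprint of the public half of a PEM private key.
key_fingerprint() {
  openssl pkey -in "$1" -pubout -outform DER 2>/dev/null \
    | openssl dgst -md5 -c | awk '{print $2}'
}

# Print the quoted value assigned to "fingerprint" in a .tfvars file.
tfvars_fingerprint() {
  sed -n 's/^[[:space:]]*fingerprint[[:space:]]*=[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}

# Compare the two; succeeds only when both are non-empty and identical.
check_fingerprint() {
  actual=$(key_fingerprint "$1")
  configured=$(tfvars_fingerprint "$2")
  if [ -n "$actual" ] && [ "$actual" = "$configured" ]; then
    echo "match"
  else
    echo "MISMATCH: key=$actual tfvars=$configured"
    return 1
  fi
}
```

A typical call would be check_fingerprint /home/fongyang/.oci/oci_api_key.pem terraform.tfvars. A match only shows that the config agrees with the key on disk; the public key uploaded to OCI must still be the one derived from that same file.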
🐛 Error 4 — The Main Culprit: Shape Quota of Zero in the Wrong Availability Domain
This was the trickiest bug, hidden behind a misleading error message.
The Error
404-NotAuthorizedOrNotFound, Authorization failed or requested resource not found.
Operation: LaunchInstance
Why This Error is Misleading
A 404 normally means "the thing you're looking for doesn't exist" — like a broken web link. OCI uses the same 404 code for both authorization failures and resource-not-found situations. This made the error look like a permissions problem when it was actually something else entirely.
The Investigation Path
Because the error said "authorization", we first investigated IAM:
What is IAM?
Identity and Access Management (IAM) is OCI's permission system. It answers: "Is this user allowed to do this action on this resource?" Think of it like a building's security badge system — different badges grant access to different floors.
In OCI, IAM is controlled by Policy statements like:
ALLOW GROUP Administrators TO MANAGE all-resources IN TENANCY
What Did We Find?
- The user fongyang is in the Administrators group ✅
- The Administrators group has full tenancy-level access ✅
- IAM was never the problem
We then investigated other possible causes:
| Check | Result |
|---|---|
| Shape VM.Standard.E2.1.Micro exists? | ✅ Available |
| Subnet exists and is AVAILABLE? | ✅ Regional subnet, no AD restriction |
| Image Oracle-Linux-8.10 found? | ✅ Image ID resolved correctly |
| Compartment is ACTIVE? | ✅ Active |
| Shape quota per Availability Domain? | ❌ QUOTA = 0 in AD-1! |
The Real Root Cause: Service Limits per Availability Domain
What is an Availability Domain (AD)?
An Availability Domain is one or more isolated data centres within an OCI region. us-ashburn-1 (Northern Virginia) has three ADs:
us-ashburn-1
├── Mbag:US-ASHBURN-AD-1 ← Terraform was targeting this
├── Mbag:US-ASHBURN-AD-2 ← Only AD with capacity
└── Mbag:US-ASHBURN-AD-3
Think of ADs like three separate buildings in the same city. Each building has its own inventory. If Building 1 is out of stock of a particular item, you have to go to Building 2.
What is a Service Limit?
OCI puts caps on how many of each resource type you can create in each location. For the Free Tier shape VM.Standard.E2.1.Micro, here were the limits per AD:
| Availability Domain | Shape | Limit | Note |
|---|---|---|---|
| Mbag:US-ASHBURN-AD-1 | vm-standard-e2-1-micro | 0 | Limit is zero, launch rejected |
| Mbag:US-ASHBURN-AD-2 | vm-standard-e2-1-micro | 2 | 2 free VMs allowed |
| Mbag:US-ASHBURN-AD-3 | vm-standard-e2-1-micro | 0 | Limit is zero, launch rejected |
Only AD-2 had a non-zero limit. The Terraform config used availability_domains[0], the first element of the list, which resolved to AD-1. OCI rejected the launch there because the limit was zero, but it reported the rejection as the generic 404 above rather than as a limit problem.
The Fix
A one-line change in compute.tf:
# BEFORE — targets AD-1 (quota = 0)
availability_domain = data.oci_identity_availability_domains.ads.availability_domains[0].name
# AFTER — targets AD-2 (quota = 2)
availability_domain = data.oci_identity_availability_domains.ads.availability_domains[1].name
Why index [1] and not [2]? Terraform lists are zero-indexed: they start counting at 0, not 1. So index [0] = AD-1, [1] = AD-2, [2] = AD-3.
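Rather than working out the index by hand, it can be derived from the per-AD limits. This is a small sketch under two assumptions: the limits have already been collected as "AD-name limit" pairs, and they are listed in the same order the availability_domains list returns them. The helper name first_ad_with_quota is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read "AD-name limit" pairs on stdin and print the
# zero-based index and name of the first AD whose limit is above zero.
# Exits non-zero when no AD has any quota.
first_ad_with_quota() {
  awk '$2 + 0 > 0 { print NR - 1, $1; found = 1; exit }
       END { if (!found) exit 1 }'
}

# The limits from the table above, in AD order.
printf '%s\n' \
  "Mbag:US-ASHBURN-AD-1 0" \
  "Mbag:US-ASHBURN-AD-2 2" \
  "Mbag:US-ASHBURN-AD-3 0" \
  | first_ad_with_quota
```

With the limits from this tenancy it prints 1 Mbag:US-ASHBURN-AD-2, which matches the index used in the fix.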
✅ Final Result
After the fix:
oci_core_instance.free_vm: Creating...
oci_core_instance.free_vm: Still creating... [10s elapsed]
oci_core_instance.free_vm: Still creating... [20s elapsed]
oci_core_instance.free_vm: Still creating... [30s elapsed]
oci_core_instance.free_vm: Creation complete after 36s
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
Outputs:
instance_public_ip = "132.145.171.179"
The VM was live in 36 seconds. 🎉
📚 Key Lessons Learned
1. OCI's 404 is a Dual-Purpose Error
404-NotAuthorizedOrNotFound covers both missing resources and permission failures. Don't assume it's always an IAM problem; check resource existence, quotas, and availability domain constraints too.
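One way to keep that checklist handy is a tiny triage function. This is only a sketch built from the four errors in this article: triage_oci_error is a hypothetical helper, and the suggestions it prints are reminders, not a diagnosis.

```shell
#!/bin/sh
# Hypothetical triage helper: map an OCI error string to the checks that
# resolved it in this article. Unknown errors fall through to a generic hint.
triage_oci_error() {
  case "$1" in
    *401-NotAuthenticated*)
      echo "authentication: check auth method, private_key_path, fingerprint" ;;
    *404-NotAuthorizedOrNotFound*)
      echo "ambiguous: check IAM policy, resource OCIDs, service limits per AD" ;;
    *)
      echo "unrecognised: read the full provider error" ;;
  esac
}

triage_oci_error "404-NotAuthorizedOrNotFound, Authorization failed or requested resource not found."
```

The useful part is the second branch: a 404 sends you to three different places, not just IAM.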
2. Always Check Service Limits Per AD
When a shape isn't launching, run:
oci limits value list \
--compartment-id <tenancy_ocid> \
--service-name compute \
--all \
--query "data[?name=='<shape-limit-name>']" \
--output table
Pay attention to the availability-domain column: a limit of 0 means your tenancy cannot launch that shape in that AD.
3. The ~ Shorthand Doesn't Always Work
In configuration files processed by applications (not the shell itself), always use absolute paths:
# Find your home directory
echo "$HOME"
4. Keep Fingerprints in Sync
If you regenerate or rotate API keys, always update all three together in terraform.tfvars:
- fingerprint
- private_key_path
- user_ocid
They are a matched set — like a lock, key, and serial number.
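A quick presence check can catch a half-updated file before Terraform does. This is a minimal sketch: missing_credential_keys is a hypothetical helper, the key names are the ones used in this article's terraform.tfvars, and it only confirms that each key is assigned, not that the three values belong to the same API key.

```shell
#!/bin/sh
# Hypothetical check: print each credential key that has no assignment
# in the given .tfvars file. Prints nothing when all three are present.
missing_credential_keys() {
  for key in fingerprint private_key_path user_ocid; do
    if ! grep -Eq "^[[:space:]]*$key[[:space:]]*=" "$1"; then
      echo "$key"
    fi
  done
}
```

Running missing_credential_keys terraform.tfvars after rotating a key lists anything that was left out.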
5. Bypass Terraform to Isolate Errors
When Terraform gives a vague error, replicate the exact same call using the OCI CLI directly. The CLI often returns the same error but in a context that's easier to reason about:
oci compute instance launch \
--compartment-id <compartment_ocid> \
--availability-domain "<ad_name>" \
--shape "<shape_name>" \
--subnet-id <subnet_ocid> \
--image-id <image_ocid> \
--display-name "test-vm-cli" \
--auth api_key
🔍 Full Error Resolution Timeline
terraform apply
│
├── ❌ Error 1: auth = "CloudShell" → Fix: auth = "APIKey"
│
├── ❌ Error 2: ~ not expanded in key path → Fix: use absolute path
│
├── ❌ Error 3: Fingerprint mismatch → Fix: re-derive from key file
│
└── ❌ Error 4: 404 on LaunchInstance
│
├── 🔎 Checked: IAM policy ✅ Admin group, full access
├── 🔎 Checked: Subnet ✅ AVAILABLE, regional
├── 🔎 Checked: Image ✅ Oracle Linux 8.10 found
├── 🔎 Checked: Compartment ✅ ACTIVE
└── 🔎 Checked: Shape quota per AD
├── AD-1: quota = 0 ← Terraform targeted here ❌
├── AD-2: quota = 2 ← Fix: use index [1] ✅
└── AD-3: quota = 0 ❌
Debugging cloud infrastructure often feels like peeling an onion — each layer reveals a new issue underneath. The key is to systematically eliminate possibilities, trust your diagnostic commands, and never assume an error message means exactly what it says.