Reset OpenStack Volume Host After Ceph Configuration Issues
A deep dive into how to fix OpenStack volume deletion failures caused by a downed or incorrectly configured backend host
Learn how to fix OpenStack volume deletion failures caused by Ceph configuration issues using cinder-manage commands from the controller node.
- Problem Overview - Understanding the issue
- Prerequisites - What you need before starting
- Solution Steps - The fix using cinder-manage
- Verification - Confirm the fix worked
- Prevention - How to avoid this issue in the future
Overview
I ran into a situation where OpenStack volumes failed to delete because the 'host' set on each volume referenced an incorrect backend host, and I was unable to delete the volumes, even forcibly. This guide is based on a Kolla-Ansible deployment. The problem typically happens when:
- Ceph pools are misconfigured, reconfigured, or renamed
- Storage backends are changed
- Host information becomes inconsistent between Cinder and Ceph
- Volume host references point to non-existent or misconfigured storage locations
The solution involves using cinder-manage from within the cinder-volume container on the controller node to update the volume host information.
1. PROBLEM OVERVIEW
Symptoms:
- OpenStack volume deletion commands fail with errors like:
ERROR: Volume deletion failed
- Volumes appear stuck in "deleting" state
- Cinder logs show host-related errors
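In a Kolla-Ansible deployment, the relevant messages usually appear in the cinder-volume log on the controller node (the path below assumes Kolla's default logging layout):
# Follow the cinder-volume log while reproducing the failure
tail -f /var/log/kolla/cinder/cinder-volume.log | grep <volume-id>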
Root Cause: The volume's host information in the Cinder database doesn't match the actual Ceph configuration. Because no running cinder-volume service identifies itself with the stale host string, delete requests are never picked up by a backend and the volume cannot be removed properly.
2. PREREQUISITES
Before attempting this fix, ensure you have:
- Controller Node Access: SSH access to your OpenStack controller node
- Admin Privileges: OpenStack admin credentials
- Volume Information: The volume ID that's failing to delete
- Ceph Pool Details: Current Ceph pool configuration information
Required Commands:
# Check current volume status
openstack volume show <volume-id>
# Verify Ceph pool configuration
ceph osd pool ls
ceph osd pool ls detail
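The pool names from this listing are what you will use when building the corrected host value later (in this guide's example, the pool volumes_tier2 ends up in the rbd:volumes_tier2 portion of the new host). The output is simply one pool name per line, for example (illustrative names only):
volumes_tier2
vms
images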
3. SOLUTION STEPS
Follow these steps to fix the volume deletion issue:
Step 1: Access the Cinder Volume Container
From your controller node, access the cinder-volume container:
# List running containers to find cinder-volume
docker ps | grep cinder-volume
# Access the cinder-volume container
docker exec -it <cinder-volume-container-id> bash
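Recent Kolla-Ansible releases can also deploy with Podman as the container engine; if yours does, substitute podman for docker in these commands:
# Same step when Podman is the container engine
podman ps | grep cinder-volume
podman exec -it <cinder-volume-container-id> bash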
Step 2: Identify Current Host Information
Check the host currently recorded on the volume. You can do this with the OpenStack client and admin credentials (it does not need to be run inside the container):
# Check the host recorded on the volume
openstack volume show <volume-id> -c os-vol-host-attr:host
The value will look something like:
controller03@rbd-2#rbd-2
Step 3: Update Volume Host Information
Use cinder-manage to update the volume's host information to match your current Ceph configuration:
# Update the volume host information
cinder-manage volume update_host \
--currenthost controller03@rbd-2#rbd-2 \
--newhost rbd:volumes_tier2@rbd-2#rbd-2
Command Breakdown:
- --currenthost: The current (incorrect) host information
- --newhost: The correct host information matching your Ceph configuration
- Format: <backend_host>@<backend_name>#<pool>. In a Kolla-Ansible Ceph deployment the backend host is typically set to rbd:<pool_name>, which is why the corrected value looks like rbd:volumes_tier2@rbd-2#rbd-2
- Note: update_host changes every volume whose host matches --currenthost, not just the volume you are troubleshooting, which is useful when a whole backend was renamed
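If you are unsure what the new host should be, the pieces come from the backend section of cinder.conf inside the cinder-volume container (the section name rbd-2 follows this guide's example; adjust for your configuration):
# Inspect the backend definition that should own the volume
grep -A 10 '^\[rbd-2\]' /etc/cinder/cinder.conf
Look for backend_host (if set), volume_backend_name, and rbd_pool; together they determine the host string Cinder expects.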
Step 4: Exit Container
exit
4. VERIFICATION
After updating the host information, verify the fix:
Test Volume Deletion
# Attempt to delete the volume
openstack volume delete <volume-id>
# Check volume status
openstack volume show <volume-id>
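If the deletion succeeds, openstack volume show will report that no volume with that ID exists. You can also confirm the backing RBD image is gone from Ceph; Cinder names RBD images volume-<volume-id> (the pool name below follows this guide's example):
# Confirm the backing image was removed from the pool
rbd -p volumes_tier2 ls | grep volume-<volume-id>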
Verify Database Update
If you want to confirm the host change itself (for example, before deleting, or if deletion still fails), check the host attribute again:
# Check the updated host on the volume (admin credentials required)
openstack volume show <volume-id> -c os-vol-host-attr:host
5. PREVENTION
To avoid this issue in the future:
Best Practices
- Document Configuration Changes: Keep detailed records of Ceph pool changes
- Test in Staging: Always test storage configuration changes in a staging environment first
- Monitor Volume Operations: Set up monitoring for volume deletion failures
- Regular Audits: Periodically audit volume host information consistency
Configuration Management
#!/bin/bash
# Regular health check: flag volumes whose host still points at an individual
# controller instead of the shared rbd:<pool> backend host (queries the Cinder
# database directly; assumes the Kolla-Ansible mariadb container and the
# database password from /etc/kolla/passwords.yml)
docker exec mariadb mysql -uroot -p'<database-password>' cinder -N \
-e "SELECT id, host FROM volumes WHERE deleted = 0;" | grep -E "controller[0-9]+@"
COMMON SCENARIOS
Scenario 1: Pool Rename
If you renamed a Ceph pool from volumes to volumes_tier2:
cinder-manage volume update_host \
--currenthost rbd:volumes@rbd-2#rbd-2 \
--newhost rbd:volumes_tier2@rbd-2#rbd-2
Scenario 2: Backend Change
If you changed from an lvm backend to an rbd backend:
cinder-manage volume update_host \
--currenthost controller03@lvm#lvm \
--newhost controller03@rbd-2#rbd-2
TROUBLESHOOTING
If the Command Fails
- Check Volume Status: Ensure the volume isn't in use or stuck in a transitional state (a state-reset example follows this list)
openstack volume show <volume-id> | grep status
- Verify Host Format: Double-check the host format matches your Ceph configuration
ceph osd pool ls detail
- Check Cinder Logs: Review logs for additional error details
docker logs <cinder-volume-container-id>
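If the volume is stuck in a deleting or error_deleting state after a failed attempt, an administrator can reset its status and retry (a minimal example using the standard OpenStack client with admin credentials):
# Reset the volume to an actionable state, then retry the delete
openstack volume set --state error <volume-id>
openstack volume delete <volume-id>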
Alternative Approaches
If cinder-manage doesn't work, you can also:
- Direct Database Update: Update the Cinder database directly (use with caution)
- Volume Migration: Migrate the volume to a working backend
- Force Delete: As a last resort, force delete from Ceph directly
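For the last option, this is roughly what a force delete from Ceph looks like; once the backing image is gone, the RBD driver treats a retry of the normal delete as successful. This is a sketch only, using the pool name from this guide's example, and should genuinely be a last resort:
# Remove the backing image directly from Ceph, then retry the normal delete
rbd -p volumes_tier2 rm volume-<volume-id>
openstack volume delete <volume-id>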
SUMMARY
This guide walked you through fixing OpenStack volume deletion issues caused by Ceph configuration problems. The key steps were:
- Identify the problematic volume and its current host information
- Access the cinder-volume container on the controller node
- Update the volume host using cinder-manage volume update_host
- Verify the fix by attempting volume deletion
- Prevent future issues through proper configuration management
The cinder-manage volume update_host command is a powerful tool for resolving host-related volume issues in OpenStack environments with Ceph storage backends.
Next Steps
After fixing the volume deletion issue, consider:
- Implementing monitoring for similar issues
- Documenting your Ceph configuration changes
- Setting up automated health checks
- Reviewing your storage configuration management processes
For more information about OpenStack Cinder management, visit the official OpenStack documentation.
Need help with your OpenStack deployment? Get $200 in free credits and start hosting your applications on Gozunga Cloud today!