Skip to content

Skip stale destroyed volumes in local storage checks#13125

Draft
andrijapanicsb wants to merge 1 commit intomainfrom
fix/skip-stale-destroyed-volumes-local-storage-check
Draft

Skip stale destroyed volumes in local storage checks#13125
andrijapanicsb wants to merge 1 commit intomainfrom
fix/skip-stale-destroyed-volumes-local-storage-check

Conversation

@andrijapanicsb
Copy link
Copy Markdown
Contributor

@andrijapanicsb andrijapanicsb commented May 7, 2026

Description

During VM migration and host maintenance flows, UserVmManagerImpl.isAnyVmVolumeUsingLocalStorage checks VM volumes to decide whether the VM uses local storage. If a stale destroyed volume row still references a storage pool that has since been removed, the method can dereference a null StoragePoolVO and fail with an NPE.

This change skips non-active volume rows while evaluating local-storage usage:

  • null volume entries
  • volumes with removed set
  • volumes in Destroy state
  • volumes in Expunged state

For active volumes, the method now fails with a clear CloudRuntimeException if the referenced storage pool is missing or removed instead of throwing a null-pointer exception. This keeps stale destroyed metadata from blocking unrelated VM/host operations while still surfacing real active-volume inconsistency.

Fixes #13124

Tests

Added unit coverage in UserVmManagerImplTest for:

  • skipping destroyed volumes with missing pools
  • skipping removed volumes
  • failing clearly for active volumes with missing pools
  • failing clearly for active volumes with removed pools




##########################################################
Generated by AI - do you REALLY think I became a Java developer overnight?
So, REVIEW PROPERLY and feel free to decline the PR if it's bunch of junk !
#########################################################

@andrijapanicsb andrijapanicsb marked this pull request as draft May 7, 2026 15:50
@andrijapanicsb andrijapanicsb requested review from DaanHoogland and harikrishna-patnala and removed request for DaanHoogland May 7, 2026 15:50
@codecov
Copy link
Copy Markdown

codecov Bot commented May 7, 2026

Codecov Report

❌ Patch coverage is 53.84615% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 18.09%. Comparing base (f6efda5) to head (a429813).

Files with missing lines Patch % Lines
.../src/main/java/com/cloud/vm/UserVmManagerImpl.java 53.84% 2 Missing and 4 partials ⚠️
Additional details and impacted files
@@            Coverage Diff            @@
##               main   #13125   +/-   ##
=========================================
  Coverage     18.09%   18.09%           
- Complexity    16721    16725    +4     
=========================================
  Files          6037     6037           
  Lines        542546   542557   +11     
  Branches      66431    66436    +5     
=========================================
+ Hits          98159    98168    +9     
+ Misses       433368   433366    -2     
- Partials      11019    11023    +4     
Flag Coverage Δ
uitests 3.52% <ø> (ø)
unittests 19.25% <53.84%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@vishesh92 vishesh92 requested a review from Copilot May 7, 2026 16:46
@nvazquez
Copy link
Copy Markdown
Contributor

nvazquez commented May 7, 2026

@blueorangutan package

@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud Bot commented May 7, 2026

@sureshanaparti
Copy link
Copy Markdown
Contributor

@blueorangutan package

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a NullPointerException in UserVmManagerImpl.isAnyVmVolumeUsingLocalStorage that can occur during VM migration/host maintenance when stale destroyed volume rows reference storage pools that have been removed. The change filters out non-active volume rows and replaces the prior implicit NPE with an explicit CloudRuntimeException for inconsistent active volumes.

Changes:

  • Skip null/removed/destroyed/expunged volume rows when checking whether a VM uses local storage.
  • For active volumes, throw a clear CloudRuntimeException when the referenced storage pool is missing or removed.
  • Add unit tests covering the skip behavior and the new failure modes for active volumes with missing/removed pools.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
server/src/main/java/com/cloud/vm/UserVmManagerImpl.java Adds filtering for non-active volumes and explicit error handling when an active volume references a missing/removed pool.
server/src/test/java/com/cloud/vm/UserVmManagerImplTest.java Adds unit tests for skipping stale volume rows and for clear failures on active volumes with invalid pool references.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 7669 to +7673
DiskOfferingVO diskOffering = _diskOfferingDao.findById(vol.getDiskOfferingId());
if (diskOffering.isUseLocalStorage()) {
if (diskOffering != null && diskOffering.isUseLocalStorage()) {
return true;
}
StoragePoolVO storagePool = _storagePoolDao.findById(vol.getPoolId());
Long poolId = vol.getPoolId();
Comment on lines +7663 to +7668
if (vol == null || vol.getRemoved() != null ||
Volume.State.Destroy.equals(vol.getState()) ||
Volume.State.Expunged.equals(vol.getState())) {
logger.debug("Skipping non-active volume while checking local storage usage: {}", vol);
continue;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if it would rather be better to have a method is the VolumeDao to only return the active / non-removed volumes of a VM. To avoid the filtering here. Just a thought - that way we can reuse the logic in future as well.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are the Java guru, not me :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

CloudStack 4.22: prepareHostForMaintenance throws NPE when stale destroyed volume references removed storage pool

5 participants