Skip to content

Static Routes with nexthop non-functional for private gateways#12859

Open
bhouse-nexthop wants to merge 3 commits intoapache:4.22from
bhouse-nexthop:fix-static-route-pbr
Open

Static Routes with nexthop non-functional for private gateways#12859
bhouse-nexthop wants to merge 3 commits intoapache:4.22from
bhouse-nexthop:fix-static-route-pbr

Conversation

@bhouse-nexthop
Copy link
Copy Markdown

@bhouse-nexthop bhouse-nexthop commented Mar 18, 2026

Description

Static routes were only being added to the main routing table, but policy-based routing (PBR) is active on VPC routers. This caused traffic coming in from specific interfaces to not find the static routes, as they use interface-specific routing tables (Table_ethX).

This fix:

  • Adds a helper method to find which interface a gateway belongs to by matching the gateway IP against configured interface subnets
  • Modifies route add/delete operations to update both the main table and the appropriate interface-specific PBR table
  • Inserts iptables FORWARD rules for static routes that don't have a nexthop of a private gateway.
  • Uses existing CsAddress databag metadata to avoid OS queries
  • Handles both add and revoke operations for proper cleanup
  • Adds comprehensive logging for troubleshooting

Fixes #12857

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)
  • Build/CI
  • Test (unit or integration test code)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

How Has This Been Tested?

See #12857 for the tested scenario

How did you try to break this feature and the system with this change?

@boring-cyborg
Copy link
Copy Markdown

boring-cyborg bot commented Mar 18, 2026

Congratulations on your first Pull Request and welcome to the Apache CloudStack community! If you have any issues or are unsure about any anything please check our Contribution Guide (https://github.com/apache/cloudstack/blob/main/CONTRIBUTING.md)
Here are some useful points:

@bhouse-nexthop bhouse-nexthop changed the title Fix static routes to be added to PBR tables in VPC routers Static Routes with nexthop non-functional for private gateways Mar 18, 2026
@bhouse-nexthop bhouse-nexthop force-pushed the fix-static-route-pbr branch 2 times, most recently from f51b1ab to 1141a7a Compare March 18, 2026 18:35
@bhouse-nexthop bhouse-nexthop marked this pull request as ready for review March 18, 2026 19:03
@bhouse-nexthop bhouse-nexthop changed the base branch from main to 4.22 March 18, 2026 19:13
@bhouse-nexthop bhouse-nexthop changed the base branch from 4.22 to main March 18, 2026 19:14
Static routes were only being added to the main routing table, but
policy-based routing (PBR) is active on VPC routers. This caused
traffic coming in from specific interfaces to not find the static
routes, as they use interface-specific routing tables (Table_ethX).

This fix:
- Adds a helper method to find which interface a gateway belongs to
  by matching the gateway IP against configured interface subnets
- Modifies route add/delete operations to update both the main table
  and the appropriate interface-specific PBR table
- Uses existing CsAddress databag metadata to avoid OS queries
- Handles both add and revoke operations for proper cleanup
- Adds comprehensive logging for troubleshooting

Fixes apache#12857
@bhouse-nexthop
Copy link
Copy Markdown
Author

@weizhouapache can you approve the workflow runs and review this. I've just confirmed this fixes both of my issues.

@rajujith rajujith moved this from Todo to In Progress in Apache CloudStack 4.22.1 Mar 26, 2026
@DaanHoogland
Copy link
Copy Markdown
Contributor

@blueorangutan package

@blueorangutan
Copy link
Copy Markdown

@DaanHoogland a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@blueorangutan
Copy link
Copy Markdown

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 17284

@DaanHoogland DaanHoogland requested a review from Copilot March 30, 2026 11:36
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot wasn't able to review any files in this pull request.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@weizhouapache weizhouapache requested a review from Copilot March 31, 2026 07:04
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot wasn't able to review any files in this pull request.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@weizhouapache weizhouapache requested a review from Copilot April 2, 2026 06:24
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot wasn't able to review any files in this pull request.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Copy Markdown
Member

@weizhouapache weizhouapache left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bhouse-nexthop
overall lgtm, not tested tet
left two minor comments

Comment thread systemvm/debian/opt/cloud/bin/cs/CsHelper.py
Comment thread systemvm/debian/opt/cloud/bin/cs/CsStaticRoutes.py Outdated
@sureshanaparti
Copy link
Copy Markdown
Contributor

@blueorangutan package

@blueorangutan
Copy link
Copy Markdown

@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

@sureshanaparti
Copy link
Copy Markdown
Contributor

@blueorangutan package

@blueorangutan
Copy link
Copy Markdown

@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

Copy link
Copy Markdown
Member

@weizhouapache weizhouapache left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code lgtm

not tested yet

@weizhouapache weizhouapache requested a review from Copilot April 15, 2026 13:37
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot wasn't able to review any files in this pull request.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

"-A PREROUTING -m state --state NEW -i %s -s %s ! -d %s/32 -j ACL_OUTBOUND_%s" %
(self.dev, guestNetworkCidr, self.address['gateway'], self.dev)])

# Process static routes for this interface
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bhouse-nexthop , can we make the follwing code a method? i.e process_static_routes_for_Interface(…)

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a vast majority of this block was just moved up outside of the if self.is_private_gateway():, I don't think it warrants a helper function tbh. Plus, the code as it is right now has been well tested, so we'd have to go through the validation process again.

If you feel really strongly about it I'll do it, but I'd rather not :)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel strongly but that is not absolute. I don’t think you should force yourself or that it is a requirement for merging (at all) or that you should do it if you feel it is a risk. I do think that such extractions make code more readable and maintainable, hence my question.

@DaanHoogland
Copy link
Copy Markdown
Contributor

code looks good generally, what can we do in terms of testing, other than trusting @bhouse-nexthop ? @sureshanaparti @rajujith (cc @weizhouapache @winterhazel )

@DaanHoogland
Copy link
Copy Markdown
Contributor

btw, seems we have merged lint-errors. the errors in https://github.com/apache/cloudstack/actions/runs/24455968567/job/71456964714?pr=12859 refer files not touched in this PR!

@blueorangutan
Copy link
Copy Markdown

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ el10 ✔️ debian ✔️ suse15. SL-JID 17505

@DaanHoogland
Copy link
Copy Markdown
Contributor

@blueorangutan test

@blueorangutan
Copy link
Copy Markdown

@DaanHoogland a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

@blueorangutan
Copy link
Copy Markdown

[SF] Trillian test result (tid-15888)
Environment: kvm-ol8 (x2), zone: Advanced Networking with Mgmt server ol8
Total time taken: 51022 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr12859-t15888-kvm-ol8.zip
Smoke tests completed. 148 look OK, 1 have errors, 0 did not run
Only failed and skipped tests results shown below:

Test Result Time (s) Test File
ContextSuite context=TestClusterDRS>:setup Error 0.00 test_cluster_drs.py

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: In Progress

Development

Successfully merging this pull request may close these issues.

Private Gateway static route with nexthop is not modifying PBR routing table

9 participants