Although the title is a bit overdramatic, at first we did think we had some pretty hard issues to tackle before we could even think about migrating. But after systematically working through the list, we solved them all.
One of our first steps was to get the current contracts and EPGs out of ACI. Like almost every product nowadays, ACI has an API. I did not have rights to the ACI environment, but the customer was able to pull the requested data with moquery (more information here: ACI object moquery Cheat Sheet – Cisco Community). We imported the data into Excel and produced several CSV and JSON files that we could use to automatically create security groups, services, and rules. I won’t discuss building the actual CSV, but it’s pretty straightforward once you have the export.
Mo problems;
In my previous post, I outlined a few issues we saw for this migration and I promised you the solution. So here we go!
VMs within the same EPG can communicate with each other (without contract)
The most important difference is that NSX applies firewall rules to every individual VM, whereas in ACI contracts only come into play when traffic leaves the EPG. In fact, this was pretty easy to fix. We made an export of the port groups that contain the VMs. We took this info as the starting point for our tag creation and made sure that the VMs that are now in the same EPG ended up in one (new) security group. Within NSX we then created an allow-any-any rule from the security group to itself for each EPG.
By doing so we recreated the default behavior that VMs that reside in the same EPG can freely communicate with each other.
Production and Acceptance have their own VRF thus their own contracts and EPGs
What is important to understand is that these environments can have different contracts, because each exists within its own VRF. (Of course, there should be no difference; if you have a production and an acceptance EPG, you would expect both to have the same rules. But assumption is the mother of all fuckups, so we had to double-check this.)
After we exported the contracts of production and acceptance, we did indeed see some differences between the contracts for the same application in the two environments. As stated before, we needed to make sure that the configuration in NSX was equal to the configuration in ACI, so we created some contracts twice, once for Acceptance and once for Production, and made a list of the deviations so we can take this up with the application owner after migration.
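Building the list of deviations comes down to a set comparison per application. The snippet below is only a minimal sketch of that idea: the application names, the filter strings, and the shape of the two objects are made up for illustration and do not reflect the real export.

```javascript
// Hypothetical input: per application, the contract filters found in each environment.
// Names and values are illustrative only, not taken from the real export.
const production = {
  "app-web": ["tcp/443", "tcp/80"],
  "app-db": ["tcp/1433"],
};
const acceptance = {
  "app-web": ["tcp/443"],
  "app-db": ["tcp/1433", "tcp/3389"],
};

// Return, per application, the filters that exist in only one of the two environments.
function findDeviations(prod, acc) {
  const deviations = [];
  const apps = new Set([...Object.keys(prod), ...Object.keys(acc)]);
  for (const app of [...apps].sort()) {
    const p = new Set(prod[app] || []);
    const a = new Set(acc[app] || []);
    const onlyProduction = [...p].filter((f) => !a.has(f)).sort();
    const onlyAcceptance = [...a].filter((f) => !p.has(f)).sort();
    if (onlyProduction.length > 0 || onlyAcceptance.length > 0) {
      deviations.push({ app, onlyProduction, onlyAcceptance });
    }
  }
  return deviations;
}

console.log(JSON.stringify(findDeviations(production, acceptance), null, 2));
```

Applications that are identical in both environments do not show up in the result, which keeps the list for the application owner short.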
If the traffic is not sent to an EPG in the same bridge domain the traffic is sent to the firewall (l3-out EPG “External”)
Although it’s not really hard to understand what happens to traffic that is sent to networks outside NSX/ACI, it causes some problems because NSX puts a firewall on every VM. We fixed this with a north-south policy containing some negated rules that allow traffic whose destination is not an NSX logical segment. As a result, all traffic that is not sent to an NSX segment is allowed and left to the upstream firewall.
So far, so good. But as I mentioned, traffic from bridge domain to bridge domain was also handled by the firewall (because these are different subnets in ACI). All these subnets now live in the same NSX environment, so they are routed locally within NSX and never reach the firewall. To fix this, we recreated in NSX the rules that were on the external firewall for bridge-domain-to-bridge-domain traffic, so this traffic was still possible and secure.
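To give an idea of what such a negated rule could look like as an API body, here is a rough sketch. It assumes the `destinations_excluded` flag of the NSX-T Policy API rule object, and the group path and rule name are hypothetical; we built our own rules by hand, so treat this as an illustration rather than the configuration we used.

```javascript
// Sketch of a north-south rule that relies on negation.
// destinations_excluded: true means "every destination that is NOT in destination_groups".
// The group path passed in is hypothetical: a group that contains all NSX segments.
function buildNorthSouthRule(nsxSegmentsGroupPath) {
  return {
    display_name: "Allow-to-non-NSX-destinations",
    description: "Allow traffic whose destination is not an NSX segment",
    sequence_number: 1,
    source_groups: ["ANY"],
    destination_groups: [nsxSegmentsGroupPath],
    destinations_excluded: true,
    services: ["ANY"],
    scope: ["ANY"],
    action: "ALLOW",
  };
}

const rule = buildNorthSouthRule("/infra/domains/default/groups/SG-All-NSX-Segments");
console.log(JSON.stringify(rule, null, 2));
```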
A few contracts which apply to all EPGs
This was not a big issue; we just created the appropriate rules for these contracts and applied them to every EPG. In the near future we are going to check whether they are really used.
There is a DHCP relay configured for two bridge domains
In NSX you can also create a DHCP relay, and there are even default firewall rules in place to allow this traffic. We tested this before migration with a test subnet, without any problems.
A separate Heartbeat EPG for Windows clusters
This one was a bit tricky. Within ACI some servers were connected with a second interface to the heartbeat EPG. Because of the default behavior, all these servers were able to “talk” to each other. In NSX we could recreate this and add every second NIC (interface) to a security group “SG-Heartbeat”. But this would mean we had to tag these interfaces, and all the VMs with a NIC in SG-Heartbeat would be able to see each other. Although this was the current situation in ACI, we didn’t think recreating it would be such a great idea. So we decided to make an exception and redesigned the way heartbeat was implemented.
We created 2 separate heartbeat logical segments (not connected to a Tier-1) and placed every heartbeat NIC in one of these segments. Because of the default SG-to-SG-Any firewall rule, only the heartbeat NICs within the same security group were able to communicate with each other.
We did encounter a problem with one application after migration which misused the configuration in ACI. They connected the second heartbeat interface from the DB server to the application EPG. Therefore there was no contract needed for communication. Because of the way we redesigned the heartbeat this was not possible anymore. So we had to create a rule to allow this traffic.
Preparing for migration
In my next post, I will elaborate on the migration itself, but we needed to have some things in place before migration day. Because the new NSX environment was built on completely new hardware and a new vCenter, we did not have the VMs at our disposal before migration, so we had to be a bit creative to get all the preliminary work done.
As I told you, the EPGs can be translated into security groups because, in the end, they serve the same purpose: grouping a bunch of VMs. To fill these security groups we created a tagging system in which each VM gets 5 tags, and based on these tags it is placed in a certain security group. This gave us 2 things to automate:
- Creating the groups based on the tags
- Tagging the VMs based on a predefined JSON file
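As a rough illustration of the second point, the sketch below turns one record from such a JSON file into scope/tag pairs. The scope names are the five that appear in the group definition further down in this post; the record layout and the example values are assumptions made for this sketch.

```javascript
// The five tag scopes used in the security group expression in this post.
const TAG_SCOPES = ["Applicatie", "Zone", "AppTier", "Omgeving", "Toepassing"];

// Turn one record of the (hypothetical) JSON input into NSX-style scope/tag pairs.
// Throws if a scope is missing, so an incomplete record is noticed before tagging.
function buildTags(record) {
  return TAG_SCOPES.map((scope) => {
    if (!record[scope]) {
      throw new Error("Record for " + record.vm + " is missing tag scope " + scope);
    }
    return { scope: scope, tag: String(record[scope]) };
  });
}

// Example record; the VM name and tag values are made up.
const exampleRecord = {
  vm: "web01",
  Applicatie: "webshop",
  Zone: "dmz",
  AppTier: "web",
  Omgeving: "productie",
  Toepassing: "frontend",
};
console.log(buildTags(exampleRecord));
```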
But first, back to the beginning. We first had to create the items below:
- Tags
- Security groups based on these tags
- Services
- Firewall rules
We primarily used Postman to create these objects. In one of my previous posts (NSX-T RestAPI – Adding Multiple Segments) I outlined how you can talk to NSX-T via Postman. One thing we added to all the Postman requests was a basic test to validate that the request was successful. (Keep in mind that you sometimes get a response code other than 200, so change it accordingly. This is all well documented in the NSX-T Data Center REST API – VMware API Explorer – VMware {code}.)
pm.test("Status test", function () {
    pm.response.to.have.status(200);
});
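If a request can legitimately answer with more than one status code, the same test can be written against a list of codes. This is a small sketch rather than what we ran: the helper name `registerStatusTest` is made up, and which codes to accept depends on the API call.

```javascript
// Register a Postman test that passes for any of the accepted status codes.
// "pm" is the object Postman exposes inside its test scripts; it is passed in
// as a parameter here so the helper does not depend on a global.
function registerStatusTest(pm, acceptedCodes) {
  pm.test("Status is one of " + acceptedCodes.join(", "), function () {
    pm.expect(pm.response.code).to.be.oneOf(acceptedCodes);
  });
}

// In a Postman test script you would call, for example:
// registerStatusTest(pm, [200, 201]);
```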
Creating Tags
You cannot create tags in NSX without assigning them to a VM. We created a few tagging VMs and generated all the tags on these VMs, so that the tags are in place and we can use and test them with our security groups. We created these tags manually, but in my next blog post about the migration I will outline the Postman scripts we used to set the tags on the VMs after migration. These can be reused to set the tags on the test/tagging VMs.
Keep in mind that a tag that is not assigned to any VM is automatically removed from NSX-T.
Creating security groups
We could have created the security groups without first creating the tags, because the groups look for VMs with the tag and not for the tag itself. Still, I like to have a few VMs with tags to validate that the groups are populated correctly based on the tags.
We created a new collection for the security groups, defined the variables, started the “Postman Runner”, and selected the JSON as input. After a few seconds, the groups were created.
{
    "expression": [
        {
            "member_type": "VirtualMachine",
            "key": "Tag",
            "operator": "EQUALS",
            "scope_operator": "EQUALS",
            "value": "Applicatie|{{Applicatie}}",
            "resource_type": "Condition"
        },
        {
            "conjunction_operator": "AND",
            "resource_type": "ConjunctionOperator"
        },
        {
            "member_type": "VirtualMachine",
            "key": "Tag",
            "operator": "EQUALS",
            "scope_operator": "EQUALS",
            "value": "Zone|{{Zone}}",
            "resource_type": "Condition"
        },
        {
            "conjunction_operator": "AND",
            "resource_type": "ConjunctionOperator"
        },
        {
            "member_type": "VirtualMachine",
            "key": "Tag",
            "operator": "EQUALS",
            "scope_operator": "EQUALS",
            "value": "AppTier|{{AppTier}}",
            "resource_type": "Condition"
        },
        {
            "conjunction_operator": "AND",
            "resource_type": "ConjunctionOperator"
        },
        {
            "member_type": "VirtualMachine",
            "key": "Tag",
            "operator": "EQUALS",
            "scope_operator": "EQUALS",
            "value": "Omgeving|{{Omgeving}}",
            "resource_type": "Condition"
        },
        {
            "conjunction_operator": "AND",
            "resource_type": "ConjunctionOperator"
        },
        {
            "member_type": "VirtualMachine",
            "key": "Tag",
            "operator": "EQUALS",
            "scope_operator": "EQUALS",
            "value": "Toepassing|{{Toepassing}}",
            "resource_type": "Condition"
        }
    ],
    "description": "{{SecurityGroups}}",
    "display_name": "{{SecurityGroups}}"
}
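The body above is the same condition repeated five times with an AND in between, so it can also be generated instead of templated. The sketch below shows one way to do that outside Postman; the function name and the example row are made up, and the row keys simply mirror the Postman variables used above.

```javascript
// Tag scopes in the order they appear in the request body above.
const GROUP_SCOPES = ["Applicatie", "Zone", "AppTier", "Omgeving", "Toepassing"];

// Build the group body for one input row: one Condition per scope,
// joined by AND ConjunctionOperators.
function buildGroupBody(row) {
  const expression = [];
  GROUP_SCOPES.forEach(function (scope, index) {
    if (index > 0) {
      expression.push({ conjunction_operator: "AND", resource_type: "ConjunctionOperator" });
    }
    expression.push({
      member_type: "VirtualMachine",
      key: "Tag",
      operator: "EQUALS",
      scope_operator: "EQUALS",
      value: scope + "|" + row[scope],
      resource_type: "Condition",
    });
  });
  return {
    expression: expression,
    description: row.SecurityGroups,
    display_name: row.SecurityGroups,
  };
}

// Example row with made-up values.
const exampleRow = {
  SecurityGroups: "SG-webshop-web-prd",
  Applicatie: "webshop",
  Zone: "dmz",
  AppTier: "web",
  Omgeving: "productie",
  Toepassing: "frontend",
};
console.log(JSON.stringify(buildGroupBody(exampleRow), null, 2));
```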
Services
We tried automating this from the export from ACI, but after a day of hard work, we decided it was faster to do this manually. The main reason was that ACI has some layers in its contracts with subcontracts etc. Although we did it by hand, we kept the contract names the same between NSX and ACI so we can easily determine after migration if there is a difference between a service and a contract.
Firewall Rules
Eventually, we had to create 3 types of firewall rules.
- To replicate the ACI behavior where VMs in the same SG can communicate with each other without any contract/rule
- Consumed-Provided EPGs and their assigned contracts
- A firewall rule for our north-south traffic
We decided to make a firewall policy per application, so we had a logical grouping of the rules per application. We also chose the display names so that we can correlate the rules between ACI and NSX after migration.
Security group to security group
With the REST API call below we created all the corresponding policies and the default SG-to-SG rules.
{
    "description": "{{applicatie}}",
    "display_name": "{{applicatie}}",
    "category": "Application",
    "rules": [
        {
            "description": "{{displayname}}",
            "display_name": "{{displayname}}",
            "sequence_number": 1,
            "source_groups": [
                "{{securitygroup}}"
            ],
            "destination_groups": [
                "{{securitygroup}}"
            ],
            "services": [
                "any"
            ],
            "action": "ALLOW"
        }
    ]
}
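For readers who prefer a script over the Postman Runner, the same body can be produced per input row. This is a minimal sketch under assumptions: the row keys mirror the Postman variables above, and the group path and names in the example are made up.

```javascript
// Build one policy per application, containing the intra-group allow rule.
// Source and destination are the same group, which recreates the EPG behavior.
function buildIntraGroupPolicy(row) {
  return {
    description: row.applicatie,
    display_name: row.applicatie,
    category: "Application",
    rules: [
      {
        description: row.displayname,
        display_name: row.displayname,
        sequence_number: 1,
        source_groups: [row.securitygroup],
        destination_groups: [row.securitygroup],
        services: ["ANY"],
        action: "ALLOW",
      },
    ],
  };
}

// Example rows with made-up values.
const policyRows = [
  {
    applicatie: "webshop",
    displayname: "SG-webshop-web-to-self",
    securitygroup: "/infra/domains/default/groups/SG-webshop-web",
  },
];
const policies = policyRows.map(buildIntraGroupPolicy);
console.log(JSON.stringify(policies, null, 2));
```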
Consumed-Provided EPGs and their assigned contracts
These are the actual rules, in which we define which security groups can communicate with each other over which ports (services).
{
    "description": "{{displayname}}",
    "display_name": "{{displayname}}",
    "sequence_number": 1,
    "source_groups": [
        "{{SG}}"
    ],
    "logged": false,
    "destination_groups": ["{{OtherSGs}}"],
    "scope": [
        "ANY"
    ],
    "action": "ALLOW",
    "services": [
        "{{Service}}"
    ]
}
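When one contract has several provider groups, the destination list has to become a real JSON array. The sketch below shows one possible way to handle that, assuming a semicolon-separated `OtherSGs` column; that column format, the spaced-out sequence numbers, and the example paths are all assumptions of this sketch and not how our input file was necessarily built.

```javascript
// Turn contract rows into rule bodies.
// Assumption: OtherSGs holds one or more group paths separated by ";".
// Sequence numbers are spaced by 10 so rules can be inserted in between later.
function buildContractRules(rows) {
  return rows.map(function (row, index) {
    const destinations = row.OtherSGs.split(";")
      .map(function (path) { return path.trim(); })
      .filter(function (path) { return path !== ""; });
    return {
      description: row.displayname,
      display_name: row.displayname,
      sequence_number: (index + 1) * 10,
      source_groups: [row.SG],
      logged: false,
      destination_groups: destinations,
      scope: ["ANY"],
      action: "ALLOW",
      services: [row.Service],
    };
  });
}

// Example rows with made-up names and paths.
const contractRows = [
  {
    displayname: "web-to-app-https",
    SG: "/infra/domains/default/groups/SG-webshop-web",
    OtherSGs: "/infra/domains/default/groups/SG-webshop-app; /infra/domains/default/groups/SG-webshop-api",
    Service: "/infra/services/HTTPS",
  },
  {
    displayname: "app-to-db-sql",
    SG: "/infra/domains/default/groups/SG-webshop-app",
    OtherSGs: "/infra/domains/default/groups/SG-webshop-db",
    Service: "/infra/services/SQL-Example",
  },
];
const contractRules = buildContractRules(contractRows);
console.log(JSON.stringify(contractRules, null, 2));
```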
A firewall rule for our north-south traffic
We created these rules by hand. Below is a screenshot from my lab environment to give you an idea of how this looks in the DFW
Migration
In the next post I will tell you more about the migration and the steps and scripts we used to automate it as far as we could.
The original article was posted on: www.ruudharreman.nl