Azure OpenAI Network Hardening

In my last two posts, I shared how to load balance multiple Azure OpenAI instances using Azure API Management and Azure Application Gateway. The reality of enterprise requirements, however, is more complicated. The key challenge is that enterprise organizations generally do not like exposing public endpoints. This short post covers reference architectures that meet network hardening requirements.

Network Hardening using Private Endpoints

A private endpoint is a network interface that uses a private IP address from your virtual network. This network interface connects you privately and securely to a service that’s powered by Azure Private Link. By enabling a private endpoint, you’re bringing the service into your virtual network (source: Microsoft’s Azure Private Link documentation).

Azure Cognitive Services, including Azure OpenAI, support private endpoints. This allows us to, say, deploy a private endpoint in a virtual network in the Southeast Asia region that is linked to an Azure OpenAI resource in West Europe. With private endpoints, the following network architectures are possible:

flowchart LR
    Client -.- WAF
    subgraph VirtualNetwork[ ]
        subgraph DMZ
            WAF --- FE
        end
        subgraph BE[Backend]
            FE[FrontEnd] --- O
            O[Orchestrator<br />Service] --- AppGW[Application<br />Gateway]
            AppGW --- A((Private<br />Endpoint))
            AppGW --- B((Private<br />Endpoint))
            AppGW --- C((Private<br />Endpoint))
            AppGW --- D((Private<br />Endpoint))
        end
    end
    A --- AOAI[OpenAI-East US]
    B --- BOAI[OpenAI-West Europe]
    C --- COAI[OpenAI-Japan East]
    D --- DOAI[OpenAI-...]
Network Hardened Azure OpenAI with AppGW Load Balancer


flowchart LR
    Client -.- WAF
    subgraph VirtualNetwork[ ]
        subgraph DMZ
            WAF --- FE
        end
        subgraph BE[Backend]
            SPACEBE[<br />]
            FE[FrontEnd] --- GW
            subgraph APIM[API Management]
                SPACEAPIM
                GW[Gateway] --- OAPI
                GW --- Others
                GW --- OAIAPI
                subgraph APIs
                OAPI[Orchestrator API]
                OAIAPI[OpenAI API]
                Others[Other APIs]
                end
            end
            OAPI --- O[Orchestrator<br />Service]
            OAIAPI --- A((Private<br />Endpoint))
            OAIAPI --- B((Private<br />Endpoint))
            OAIAPI --- C((Private<br />Endpoint))
            OAIAPI --- D((Private<br />Endpoint))
        end
    end
    A --- AOAI[OpenAI-East US]
    B --- BOAI[OpenAI-West Europe]
    C --- COAI[OpenAI-Japan East]
    D --- DOAI[OpenAI-...]

style SPACEBE fill:transparent,stroke:transparent,color:transparent;
style SPACEAPIM fill:transparent,stroke:transparent,color:transparent,height:1px;
Network Hardened Azure OpenAI with APIM Load Balancer

Isolated OpenAI Virtual Network

In some cases, an organization’s policy might only allow Azure resources in specific regions (e.g. restricted to Southeast Asia only).

Since deploying Azure OpenAI to multiple regions is the recommended way to address tokens-per-minute limitations, the cloud administration team may consider an exception for Azure OpenAI through an exempted (separate) Azure subscription with a narrowly tailored Azure Policy. The main subscription can then connect to that separate subscription via Private Link to an Azure Application Gateway:

flowchart LR
    Client -.- WAF
    subgraph VirtualNetwork[ ]
        subgraph DMZ
            WAF --- FE
        end
        subgraph BE[Backend]
            FE[FrontEnd] --- O
            O[Orchestrator<br />Service] --- PE((Private<br />Endpoint))
        end
    end
    subgraph S[Separate Subscription/VNET]
        PE -- private link --- AppGW[Application<br />Gateway]
        AppGW --- A((Private<br />Endpoint))
        AppGW --- B((Private<br />Endpoint))
        AppGW --- C((Private<br />Endpoint))
        AppGW --- D((Private<br />Endpoint))
    end
    A --- AOAI[OpenAI-East US]
    B --- BOAI[OpenAI-West Europe]
    C --- COAI[OpenAI-Japan East]
    D --- DOAI[OpenAI-...]
Network Hardened Azure OpenAI with AppGW in a Separate Subscription


flowchart LR
    Client -.- WAF
    subgraph VirtualNetwork[ ]
        subgraph DMZ
            WAF --- FE
        end
        subgraph BE[Backend]
            SPACEBE[<br />]
            FE[FrontEnd] --- GW
            subgraph APIM[API Management]
                SPACEAPIM
                GW[Gateway] --- OAPI
                GW --- Others
                GW --- OAIAPI
                subgraph APIs
                OAPI[Orchestrator API]
                OAIAPI[OpenAI API]
                Others[Other APIs]
                end
            end
            OAPI --- O[Orchestrator<br />Service]
            OAIAPI --- PE((Private<br />Endpoint))
        end
    end
    subgraph S[Separate Subscription/VNET]
        PE -- private link --- AppGW[Application<br />Gateway]
        AppGW --- A((Private<br />Endpoint))
        AppGW --- B((Private<br />Endpoint))
        AppGW --- C((Private<br />Endpoint))
        AppGW --- D((Private<br />Endpoint))
    end
    A --- AOAI[OpenAI-East US]
    B --- BOAI[OpenAI-West Europe]
    C --- COAI[OpenAI-Japan East]
    D --- DOAI[OpenAI-...]

style SPACEBE fill:transparent,stroke:transparent,color:transparent;
style SPACEAPIM fill:transparent,stroke:transparent,color:transparent,height:1px;
Network Hardened Azure OpenAI with APIM and AppGW in a Separate Subscription
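
To make the multi-region argument concrete, here is a minimal Python sketch of the arithmetic behind it. The quota figures are hypothetical placeholders (your actual tokens-per-minute quota depends on your subscription, model, and region), and the pooled number is only an upper bound that assumes the load balancer spreads traffic well across the regions.

```python
def pooled_tpm(quota_by_region: dict[str, int]) -> int:
    """Upper bound on tokens-per-minute when traffic is spread across all regions."""
    return sum(quota_by_region.values())


def regions_needed(target_tpm: int, tpm_per_region: int) -> int:
    """Smallest number of equally sized regional deployments that covers target_tpm."""
    if tpm_per_region <= 0:
        raise ValueError("tpm_per_region must be positive")
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-target_tpm // tpm_per_region)


# Hypothetical quotas for illustration only; region names taken from the diagrams.
quotas = {"East US": 240_000, "West Europe": 240_000, "Japan East": 120_000}

print(pooled_tpm(quotas))                # 600000
print(regions_needed(500_000, 240_000))  # 3
```

In other words, a workload that needs more throughput than a single region's quota allows has to be spread over several regional deployments, which is why the region-restriction exception above is worth asking for.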

Additional Components

In implementing the above, do not forget the other components that need to be considered, such as:

  • Ports (mainly port 443), which are configured in your NSG or firewall,
  • Private DNS for name resolution,
  • Application identity: keyless authentication is required when using an Azure Application Gateway.
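
The first two items can be sanity-checked from a machine inside the virtual network. The following is a rough sketch using only the Python standard library, not an official tool: it assumes that, with Private DNS set up correctly, the Azure OpenAI hostname resolves to the private endpoint's private IP address, and that port 443 is reachable through the NSG or firewall. The function names are my own, and the hostname in the usage note is a made-up placeholder.

```python
import ipaddress
import socket


def is_private_address(address: str) -> bool:
    """Return True if the IP address falls in a private (non-public) range."""
    return ipaddress.ip_address(address).is_private


def resolve_addresses(hostname: str, port: int = 443) -> list[str]:
    """Return the sorted, de-duplicated IP addresses that DNS gives for hostname."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def can_connect(hostname: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to hostname:port succeeds within timeout."""
    try:
        with socket.create_connection((hostname, port), timeout=timeout):
            return True
    except OSError:
        return False


def preflight(hostname: str) -> dict:
    """Check private name resolution and port 443 reachability for one endpoint."""
    try:
        addresses = resolve_addresses(hostname)
    except socket.gaierror:
        addresses = []  # The name did not resolve at all.
    return {
        "hostname": hostname,
        "addresses": addresses,
        # True only if the name resolved and every address is private.
        "resolves_privately": bool(addresses)
        and all(is_private_address(a) for a in addresses),
        "port_443_open": can_connect(hostname),
    }
```

For example, calling `preflight("my-openai-resource.openai.azure.com")` (a hypothetical resource name) from a VM in the backend subnet should report `resolves_privately` as true; if it shows a public address instead, the Private DNS zone is probably not linked to that virtual network.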

Hope this helps! I’d also love to learn about other scenarios, so please share them in the comments section below.

This post is licensed under CC BY 4.0 by the author.