Resilience in Security

Something is resilient if it can withstand, or recover quickly from, difficult conditions.

This document describes whether each of the Security services is resilient when a service it depends on is unavailable, offline or simply not responding.

Security Admin

Authentication not available: RESILIENT (page refresh required)

Security Admin stays on the loading page "Authenticating, please wait..." indefinitely. When Authentication comes back online we have to refresh the page for Security Admin to redirect us correctly.

Security API not available: RESILIENT (page refresh required)

Security Admin redirects us to the Authentication page with no problem, although trying to log in gives us an error message at the top of the screen. If we are already logged in to Security Admin while the Security API is offline, navigating to certain sections of the application shows loading animations that never finish. When the Security API is back online, we have to refresh the page to get the correct behavior.

Authorization

Authentication not available: RESILIENT

If we are already authenticated and Authentication is down, the Authorization server still works without any issues. If we have to authenticate, a "Service unavailable" error message is returned. As soon as Authentication is back online, we can authenticate with no problems.

Security API not available: RESILIENT

Authorization simply returns the values from its cache if the Security API is down. If Authorization is notified that permissions have changed and has to go back to the Security API to refresh its cache, it returns an Internal Server Error. Once the Security API is back online, retrying the call returns an updated response immediately.
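The cache behavior described above can be sketched as follows. This is only an illustration in Python, not the actual Authorization implementation: the PermissionCache class, the fetch callable and the SecurityApiUnavailable exception are hypothetical names introduced for the example.

```python
class SecurityApiUnavailable(Exception):
    """Hypothetical error raised by the fetch callable when the Security API is down."""


class PermissionCache:
    """Serves permissions from memory and only contacts the API when it must."""

    def __init__(self, fetch):
        self._fetch = fetch    # hypothetical callable: user -> permissions
        self._entries = {}     # user -> cached permissions
        self._stale = set()    # users whose cached entry was invalidated

    def invalidate(self, user):
        # Called when Authorization is notified of a permission change.
        self._stale.add(user)

    def get(self, user):
        if user in self._entries and user not in self._stale:
            # Fresh cache entry: the Security API is not contacted at all.
            return self._entries[user]
        # Missing or invalidated entry: the Security API has to be reachable.
        # If it is down, the error propagates to the caller.
        permissions = self._fetch(user)
        self._entries[user] = permissions
        self._stale.discard(user)
        return permissions
```

With this shape, an outage is invisible to callers until an entry is invalidated, and the first successful call after the Security API recovers refreshes the cache without any manual step.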

Message bus not available: RESILIENT

Authorization is completely unavailable while the message bus is not available, but it recovers immediately when the message bus comes back online.

Logging database offline: RESILIENT

Authorization still returns the correct information even if the logging database is offline.

Authentication

Security API not available: RESILIENT

If the Security API is down, the Authentication login page loads fine, but we cannot authenticate: an unexpected error message is shown at the top of the screen. As soon as the Security API is back, login works perfectly.

Logging database offline: RESILIENT

Authentication still returns the correct information even if the logging database is offline.

Security API

Authentication not available: RESILIENT

If Authentication is down, the Security API cannot authenticate calls, but as soon as Authentication comes back online, retrying the call is enough to continue as normal.

Message bus not available: RESILIENT

When the message bus is not available, API calls that don't publish anything to the bus still work fine, but the ones that do publish messages get stuck with no response. When the message bus comes back online, simply retrying the API call should work correctly and publish the expected messages without any manual intervention.
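From the caller's side, the reattempt could look like the sketch below. It assumes the client applies its own timeout, so that a stuck call surfaces as a TimeoutError instead of hanging forever. The call_with_retry helper and its parameters are hypothetical and are not part of the Security API.

```python
import time


def call_with_retry(operation, attempts=3, delay_seconds=1.0, sleep=time.sleep):
    """Run operation(), retrying after a pause each time it times out.

    operation is a hypothetical zero-argument callable that performs one API
    call and raises TimeoutError when no response arrives in time.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TimeoutError as error:
            last_error = error
            if attempt < attempts:
                sleep(delay_seconds)  # wait before the next attempt
    # Every attempt timed out: surface the last error to the caller.
    raise last_error
```

The sleep parameter is injectable only so the helper can be exercised without real waiting. In normal use the default time.sleep applies.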

Logging database offline: RESILIENT

Just like the rest of the applications, the Security API works perfectly fine when the logging database is offline; it just doesn't write to the logging database. When the logging database is back online, the Security API writes logs again without any intervention.
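One way to obtain this behavior is to treat each log write as best effort, so that a failed write never fails the request that triggered it. The sketch below illustrates the idea in Python. It is an assumption about how such behavior can be achieved, not a description of how the Security services are implemented, and BestEffortLogger and its write callable are hypothetical names.

```python
class BestEffortLogger:
    """Writes log entries when possible and drops them when the store is down."""

    def __init__(self, write):
        self._write = write  # hypothetical callable that inserts one log entry
        self.dropped = 0     # number of entries lost while the store was offline

    def log(self, message):
        try:
            self._write(message)
            return True
        except ConnectionError:
            # The logging database is unreachable: count the loss and carry on
            # so the calling service keeps working.
            self.dropped += 1
            return False
```

Because every call tries the write again from scratch, logging resumes on its own as soon as the database is reachable. Entries produced during the outage are lost in this sketch.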

Multitenant database offline: RESILIENT

The database info is cached, so the Security API runs normally and is not affected at all when the multitenant database is offline. However, if the Security API service is restarted, which clears the cache, the API doesn't respond until the multitenant database is back online.

Tenant database offline: RESILIENT

If the main tenant database is offline, the Security API cannot authenticate calls. Even if we are already authenticated, none of the data in the database can be accessed, which makes the API unusable. When the database is back online, data can be retrieved immediately.

SMTP server not available: RESILIENT

With no SMTP server available, the "Reset Password" functionality does not work, but when the SMTP server recovers, email sending from Security should work again without intervention.

Security Sync Service

Message bus not available: NOT RESILIENT

With no message bus available, the Sync service is unable to publish messages, so the sync process cannot be triggered. After the message bus recovers, we have to manually restart the Sync service for it to resume its expected behavior.

Logging database offline: RESILIENT

Just like the rest of the applications, the Sync service works perfectly fine when the logging database is offline; it just doesn't write to the logging database. When the logging database is back online, the Sync service writes logs again without any intervention.

LDAP Sync Service

Message bus not available: RESILIENT

When the message bus is not available and an LDAP sync is attempted, nothing visible happens: no entry is created in the LDAP Monitoring tab, and only an error is recorded in the database. Once the message bus is up and running again, if we click the button to synchronize manually, the Security Admin UI indicates that the sync has been requested, and the new entry for the sync attempt is created successfully.

Logging database offline: RESILIENT

Just like the rest of the applications, the LDAP Sync service works perfectly fine when the logging database is offline; it just doesn't write to the logging database. When the logging database is back online, the LDAP Sync service writes logs again without any intervention.

Security API not available: RESILIENT

If the Security API is down, the LDAP Sync Service cannot complete any kind of sync, even when it runs automatically, because it depends on the API to create the new entry in the database and to update or create users. Once the Security API is back online, synchronization is immediately available again.

Azure AD Sync Service

Message bus not available: RESILIENT

When the message bus is not available, any sync process that has already been scheduled is still executed. If a new Azure AD sync is attempted from the UI, nothing visible happens: no entry is created in the Azure AD Monitoring tab, and only an error is recorded in the database. Once the message bus is up and running again, if we click the button to synchronize manually, the Security Admin UI indicates that the sync has been requested, and the new entry for the sync attempt is created successfully.

Logging database offline: RESILIENT

Just like the rest of the applications, the Azure AD Sync service works perfectly fine when the logging database is offline; it just doesn't write to the logging database. When the logging database is back online, the Azure AD Sync service writes logs again without any intervention.

Security API not available: RESILIENT

If the Security API is down, the Azure AD Sync Service cannot complete any kind of sync, even when it runs automatically, because it depends on the API to create the new entry in the database and to update or create users. Once the Security API is back online, synchronization is immediately available again.
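Summary

The findings in this document can be collected into a small lookup table, which makes it easy to list the combinations that need manual intervention. The sketch below transcribes the ratings stated in each section. The FINDINGS name, the helper functions and the short notes are hypothetical and exist only for this illustration.

```python
# (service, dependency) -> (resilient, note), transcribed from the sections above.
FINDINGS = {
    ("Security Admin", "Authentication"): (True, "page refresh required"),
    ("Security Admin", "Security API"): (True, "page refresh required"),
    ("Authorization", "Authentication"): (True, ""),
    ("Authorization", "Security API"): (True, ""),
    ("Authorization", "Message bus"): (True, ""),
    ("Authorization", "Logging database"): (True, ""),
    ("Authentication", "Security API"): (True, ""),
    ("Authentication", "Logging database"): (True, ""),
    ("Security API", "Authentication"): (True, ""),
    ("Security API", "Message bus"): (True, ""),
    ("Security API", "Logging database"): (True, ""),
    ("Security API", "Multitenant database"): (True, ""),
    ("Security API", "Tenant database"): (True, ""),
    ("Security API", "SMTP server"): (True, ""),
    ("Security Sync Service", "Message bus"): (False, "manual restart required"),
    ("Security Sync Service", "Logging database"): (True, ""),
    ("LDAP Sync Service", "Message bus"): (True, ""),
    ("LDAP Sync Service", "Logging database"): (True, ""),
    ("LDAP Sync Service", "Security API"): (True, ""),
    ("Azure AD Sync Service", "Message bus"): (True, ""),
    ("Azure AD Sync Service", "Logging database"): (True, ""),
    ("Azure AD Sync Service", "Security API"): (True, ""),
}


def not_resilient(findings):
    """Return the (service, dependency) pairs rated NOT RESILIENT."""
    return sorted(pair for pair, (resilient, _note) in findings.items() if not resilient)


def dependencies_of(findings, service):
    """Return the dependencies that were tested for one service."""
    return sorted(dep for (svc, dep) in findings if svc == service)


if __name__ == "__main__":
    for service, dependency in not_resilient(FINDINGS):
        print(f"{service} does not recover on its own when {dependency} is unavailable")
```

Only one combination is rated NOT RESILIENT: the Security Sync Service when the message bus is unavailable.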