Housekeeping & maintenance¶
Introduction¶
This document provides guidelines for the housekeeping tasks (periodic maintenance) that should be performed to keep the Sequel Security services performing optimally.
General housekeeping¶
Like any sizeable software solution, Sequel Security benefits from periodic maintenance, both of application data and of the platform on which the solution components are installed. This includes maintenance of the application servers and SQL servers.
In addition to maintenance of application data, it is important to monitor the overall health of the platform. It is also important to monitor overall capacity to ensure service levels are met and service continues uninterrupted.
There are a number of best practice procedures that benefit any environment.
Security Audit logs¶
Security audits all update and delete actions made during day-to-day use of the application. Security does not consider this information business data and it is not used downstream, so it is possible to remove it without affecting the system. The audit information is stored in the [AuditLog] table of each schema, Authentication and Authorization.
The [AuditLog] tables grow constantly and tend to become quite large, holding hundreds of millions of records.
Clients can query the AuditLog tables in the database to find out which records have been updated or deleted.
We suggest that clients archive the Audit Log records into a different database after a period of time, moving data from the Security AuditLog tables into an archive database. This releases space in the Security database while the logs remain available in the archive AuditLog database, in case they are needed in the future.
It is down to each client to decide how much data to keep in the AuditLog table and how much to move into another database. For example, Sequel recommends that clients keep the last 6 months of audit logs and move everything older than 6 months into the archive database.
Sequel Security Services does not provide any scripts or tools to move data from the Security database into an archive AuditLog database.
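Although no tooling is provided, a minimal sketch of such an archive step is shown below. It assumes an archive database named SecurityAuditArchive containing an [Authentication].[AuditLog] table with the same structure, and a [CreatedDate] column to filter on; these names are illustrative and must be adapted to the actual schema:
DECLARE @Cutoff datetime = DATEADD(MONTH, -6, GETUTCDATE());
BEGIN TRANSACTION;
-- Copy rows older than the cutoff into the (hypothetical) archive database...
INSERT INTO [SecurityAuditArchive].[Authentication].[AuditLog]
SELECT *
FROM [Authentication].[AuditLog]
WHERE [CreatedDate] < @Cutoff;
-- ...then remove them from the live table.
DELETE FROM [Authentication].[AuditLog]
WHERE [CreatedDate] < @Cutoff;
COMMIT TRANSACTION;
For tables holding hundreds of millions of records, the copy and delete should be performed in batches (as the sp_LogTableCleanup procedure described later does for the logging database) to limit transaction log growth and blocking.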
Security clients table¶
Clients defined in the authentication schema store URLs, origins and secrets in different tables. It is possible to end up with duplicated entries in those tables, which can be safely removed using the stored procedures below:
[authentication].[cleanDuplicatedClientCorsOrigins]
[authentication].[cleanDuplicatedClientPostLogoutRedirectUris]
[authentication].[cleanDuplicatedClientRedirectUris]
[authentication].[cleanDuplicatedClientSecrets]
By default, all these stored procedures act as dry-run scripts: they show what would be done without actually doing anything. To actually remove entries, an argument with value 1 must be passed when calling the stored procedure.
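For example (the positional argument follows the behaviour described above; check the procedure definition in your database for the exact parameter name):
EXEC [authentication].[cleanDuplicatedClientSecrets];    -- dry run: only reports what would be removed
EXEC [authentication].[cleanDuplicatedClientSecrets] 1;  -- actually removes the duplicated entries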
Data Protection table¶
When Database is used as the Data Protection mode, all keys are stored in the authentication.DataProtection table. When a key expires, a new row is created and, from that moment, all new cookies generated by the Authentication server are encrypted only with the new key. Given the minimum expiration time allowed for these keys (7 days), a maximum of 52 keys per year can be created. Older cookies can still be decrypted using expired keys, but if those old keys are removed, users will be forced to enter their credentials again when they are redirected to the Authentication server during a login flow.
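If housekeeping of this table is required, that trade-off must be kept in mind. The sketch below is purely illustrative: the [ExpirationDate] column name is an assumption and must be checked against the actual table definition before use:
-- Review the stored keys first; removing expired keys forces users whose
-- cookies were encrypted with those keys to log in again.
SELECT *
FROM [authentication].[DataProtection]
ORDER BY [ExpirationDate];  -- assumed column name

-- Example: remove keys that expired more than 90 days ago.
DELETE FROM [authentication].[DataProtection]
WHERE [ExpirationDate] < DATEADD(DAY, -90, GETUTCDATE());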
Security Logs¶
This section covers the logs generated by the application as a result of different events or actions. The information stored is not intended for auditing (that is done by AuditLog); the aim is to trace how the application behaves and to help with monitoring and diagnostics tasks if required. The main log system used by Security is the logging database, Sequel.Core.Logging; although there are some scenarios where logs are persisted to file during the start-up of the applications, these are covered in the troubleshooting section as those files do not require housekeeping actions.
Logging database¶
Most of Sequel's web-based applications implement a logging mechanism (aka Sequel.Core.Logging) to log actions and exceptions that happen in the application. A separate database is recommended to store all the data generated by logging.
The logging verbosity is defined in levels:
- Fatal: Fatal represents truly catastrophic situations, as far as your application is concerned.
- Error: An error is a serious issue and represents the failure of something important in your application. Unlike fatal, the application itself isn't necessarily going down.
- Warning: Indicates that the application may have a problem and that an unusual situation has been detected.
- Information: Information messages correspond to normal application behaviour and milestones, such as a service starting or stopping.
- Debug: Includes more granular, diagnostic information. This is "noisy" territory, furnishing more information than is wanted in normal production situations, so it should be used only for diagnostic purposes and then moved back to a less verbose level.
The Information level is preferred in a production environment. Selecting a log level also includes all less verbose levels (i.e. Information will also log Warning, Error and Fatal entries).
During the installation process the Logging database connection string is required in order to set up the application. Logging data is stored in a table called Logs in the dbo schema. Once installed, you can see the logging database configuration by looking at the appsettings.json of each Security service; check the ConnectionString parameter in the LoggingSettings section. The verbosity level is defined by MinimumLogLevel, where a more verbose level includes all less verbose levels.
"LoggingSettings": {
"ConnectionString": "Server=...",
"MinimumLogLevel": "Information",
},
Growth Monitoring:
Growth of this database will depend on the minimum logging level selected (see the previous section): the more verbose the level, the higher the growth rate of the database. The less verbose levels (Information, Warning, Error and Fatal) are the most important and the least frequent, so if the logging database grows very fast these entries should be saved before deleting.
It is down to each client to decide how much data to keep in the Logging database and how much to move into another database. For example, Sequel recommends that clients keep the last 6 months of the logging database and remove, or move into an archive database, everything that is older than 6 months. We do not recommend keeping Debug entries under any circumstance, other than in the context of an issue investigation.
The following describes the mechanism to automatically delete all entries older than a given date.
The associated logging database (aka Sequel.Core.Logging) for Security can grow heavily depending on the configured log level and the usage, so it is important to follow the housekeeping recommendations for the Sequel.Core.Logging project and take proactive action on old data. When logging is configured to use the SQL repository, housekeeping can be done with the stored procedure dbo.sp_LogTableCleanup(@DaysToKeep INT, @DeleteBatchSize INT = 5000). Indicate the number of days to keep (all older entries will be removed) and the size of each delete batch; the deletion is performed in batches until no more rows meet the filter, reducing the impact of the operation on the database.
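For example, to keep the last 180 days of log entries, deleting in batches of 500 rows:
EXEC [dbo].[sp_LogTableCleanup] @DaysToKeep = 180, @DeleteBatchSize = 500;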
Ideally, this stored procedure should be configured in a daily job so that old data is removed and only a limited window of days is kept. More information about creating SQL job steps is available at: https://docs.microsoft.com/en-us/sql/ssms/agent/create-a-job?view=sql-server-ver15.
A T-SQL script for scheduling a housekeeping job that removes entries older than 180 days (6 months), using batches of 500 records, in the ##DATABASE## database could look like:
USE msdb;
GO
DECLARE @jobName nvarchar(50);
DECLARE @scheduleName nvarchar(50);
DECLARE @jobStepCommand nvarchar(2000);
set @jobName = N'Logging_Database_Housekeeping_Job' ;
set @scheduleName = N'Logging_Database_Housekeeping_Schedule';
set @jobStepCommand = N'EXEC [##DATABASE##].[dbo].[sp_LogTableCleanup] 180, 500';
EXEC dbo.sp_add_job
@job_name = @jobName;
EXEC sp_add_jobstep
@job_name = @jobName,
@step_name = N'Remove logs older than 180 days',
@subsystem = N'TSQL',
@command = @jobStepCommand,
@retry_attempts = 5,
@retry_interval = 5 ;
EXEC dbo.sp_add_schedule
@schedule_name = @scheduleName,
@freq_type = 4, -- Daily
@freq_interval = 1, -- Once a day
@freq_subday_type = 0x1, -- At a given time
@active_start_time = 010000; --at 01:00 AM. On a 24-hour clock, and must be entered using the form HHMMSS.
EXEC sp_attach_schedule
@job_name = @jobName,
@schedule_name = @scheduleName;
EXEC dbo.sp_add_jobserver
@job_name = @jobName;
GO
While this housekeeping job is not required, it is highly recommended.
Event logs¶
Regular checks of the Windows Application, System and Security event logs are recommended. In particular, the system log should be inspected for any sign of impending service interruption (such as low disk space warnings, net log-on warnings, disk timeouts etc). Additionally, the Windows Application event log should be inspected and any unexpected application errors analysed to assess the potential impact.
It is suggested that the event logs be checked on a regular basis e.g. weekly or monthly (and configured to roll over more frequently than this so events are not lost before they can be analysed).
Internet Information Services Logs¶
If IIS logs are enabled for diagnostics, ensure they are later turned off or configured to automatically delete old files. More information is available in the Microsoft documentation: https://docs.microsoft.com/en-us/iis/manage/provisioning-and-managing-iis/managing-iis-log-file-storage.
System capacity¶
As with any system, regular capacity checks should be made of the platform hosting the application environment. These checks should include:
- Checking that there is sufficient disk space for continued operation on all volumes used by the system (based on existing usage trends). Additionally, de-fragmentation of the file system requires a minimum amount of free disk space (typically 15%). De-fragmentation of large files (SQL database) requires significant free disk space.
- Checking that there is sufficient system processing resource to allow the system to continue to perform optimally. This typically involves ensuring that the system is not building processor queues, and that there are free CPU cycles available during normal use. Occasional peaks of 100% CPU utilisation may be acceptable but if servers are operating regularly at 80% or more during working hours further work should be carried out to analyse the cause and implement plans to mitigate the issue.
- Checking that there is sufficient system memory available to allow the system to perform optimally. This typically involves confirming that the system is not paging to disk excessively, and that ideally there is free memory available for the system to use as a disk cache and free page pool. A minimum of 5% of free memory should always be available on the application server.
System changes / software updates¶
Best practice dictates that all changes to the platform and application environment are change controlled. This provides a necessary degree of due diligence control over system alterations. All changes should be clearly planned with any possible rollback detailed and then peer reviewed before implementation. Additionally, change control logs can be reviewed when diagnosing issues with the system. These logs can also be shared with the vendor to assist in vendor diagnosis. It is also important to ensure that the environments always adhere to the platform specification provided as part of the release.
Database maintenance¶
The SQL databases used by Sequel Security Services will benefit from the majority of typical periodic database maintenance procedures. As with any maintenance, good backups and a roll-back plan are prerequisites to any activity (https://docs.microsoft.com/en-us/sql/relational-databases/performance/monitor-and-tune-for-performance?view=sql-server-2017). Maintenance operations should cover:
Index and statistics maintenance¶
SQL Server statistics are important for optimizing query speeds. By default SQL Server updates statistics automatically, but it is also recommended that you perform manual updates periodically. We recommend scheduling an overnight job to run the script below against the Security databases. This script will reindex all tables in the database, not just those under the Authentication and Authorization schemas.
/*
SCRIPT_NAME : Reindex_Script_AllTablesInADb.sql
PURPOSE : Re-indexing database - varies rebuild / reorganise based on fragmentation.
AUTHOR : Venkat
CREATED DATE : 02-Oct-2014
*/
DECLARE @IndexOption varchar(255),
@ObjectName varchar(255),
@sql nvarchar(500);
/*
Deal with Indexes first because rebuilding an index also updates the statistics.
We want to include both tables and Indexed views. Ignore small tables that are
contained within one Extent.
*/
DECLARE IndexCursor CURSOR LOCAL FAST_FORWARD
FOR
SELECT '[' + OBJECT_SCHEMA_NAME(O.[object_id]) + '].[' + O.NAME + ']' ObjectName,
CASE WHEN MAX(st.avg_fragmentation_in_percent) >= 30 THEN ' REBUILD;'
WHEN MAX(ST.avg_fragmentation_in_percent) > 5 THEN ' REORGANIZE;'
END AS IndexOption
FROM sys.objects AS O
INNER JOIN sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'DETAILED') AS ST ON O.[object_id] = ST.[object_id]
AND ST.avg_fragmentation_in_percent > 5
AND ST.page_count > 8
WHERE O.[Type] = 'U'
OR (
O.[type] = 'V'
AND EXISTS ( SELECT 1
FROM sys.indexes AS I
WHERE I.[object_id] = O.[object_id] )
)
GROUP BY '[' + OBJECT_SCHEMA_NAME(O.[object_id]) + '].[' + O.NAME + ']'
ORDER BY ObjectName;
OPEN IndexCursor;
FETCH NEXT FROM IndexCursor INTO @ObjectName, @IndexOption;
WHILE @@FETCH_STATUS = 0
BEGIN
PRINT 'Index:' + @ObjectName + ' ' + @IndexOption
SET @sql = 'SET QUOTED_IDENTIFIER ON; ALTER INDEX ALL ON ' + @ObjectName + @IndexOption;
EXEC (@sql);
FETCH NEXT FROM IndexCursor INTO @ObjectName, @IndexOption;
END;
CLOSE IndexCursor;
DEALLOCATE IndexCursor;
/*
Now sort out Statistics, if we rebuilt the indexes the statistics would have already been
updated and will not be picked up by this script as we filter for tables and views that
have had data modifications since the Statistics were last updated.
*/
DECLARE StatsCursor CURSOR LOCAL FAST_FORWARD
FOR
SELECT 'SET QUOTED_IDENTIFIER ON; UPDATE STATISTICS ' + D.TableName + ' ' + D.Statistic + ' WITH FULLSCAN;' AS SQLStatement, D.TableName + ' ' + D.Statistic AS ObjectName
FROM (
SELECT QUOTENAME(SCHEMA_NAME(SCH.[schema_id]), '[') + '.' + QUOTENAME(O.name) AS TableName,
QUOTENAME(S.name) AS Statistic,
SP.last_updated AS StatsLastUpdated,
SP.[rows] AS RowsInTable,
SP.modification_counter AS RowModifications,
CAST(SP.modification_counter AS decimal(19, 2)) / NULLIF(SP.[rows], 0) * 100 AS PercentageChange
FROM sys.stats AS S
INNER JOIN sys.objects AS O ON S.[object_id] = O.[object_id]
INNER JOIN sys.schemas AS SCH ON O.[schema_id] = SCH.[schema_id]
CROSS APPLY sys.dm_db_stats_properties(O.[object_id], S.[stats_id]) AS SP
WHERE O.[type] IN ('U', 'V')
AND SP.modification_counter > 0
) AS D
WHERE (
PercentageChange > 1
OR (
RowModifications >= 1000
AND RowsInTable >= 100000
)
)
ORDER BY SQLStatement;
OPEN StatsCursor;
FETCH NEXT FROM StatsCursor INTO @sql, @ObjectName;
WHILE @@FETCH_STATUS = 0
BEGIN
PRINT 'Statistic:' + @ObjectName
EXEC (@sql);
FETCH NEXT FROM StatsCursor INTO @sql, @ObjectName;
END;
CLOSE StatsCursor;
DEALLOCATE StatsCursor;
GO
Integrity checking¶
It is prudent to periodically check database integrity, and again Sequel recommends that a regular set of checks be implemented. These should first be benchmarked on a production-like environment to gather the information needed to implement a realistic schedule in terms of content and timings.
Integrity checking (light)¶
For example, a good starting point would be to regularly run the following command:
DBCC CHECKDB (<DatabaseName>, NOINDEX) WITH PHYSICAL_ONLY
It is recommended that the command is run with the PHYSICAL_ONLY option to reduce the run time. Using the PHYSICAL_ONLY option limits the consistency check to the physical structure of the database: b-tree structure and allocation consistency. It can also detect torn pages, checksum failures and common hardware failures that compromise data. The NOINDEX option stops the CHECKDB command from performing the intensive checks of nonclustered indexes, which also significantly reduces execution time. Adding the NO_INFOMSGS option displays only error messages and suppresses additional informational messages, which makes the output of the integrity check quicker to analyse.
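Combining these options, a quick regular check could look like this (substitute the actual database name):
DBCC CHECKDB (<DatabaseName>, NOINDEX) WITH PHYSICAL_ONLY, NO_INFOMSGS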
Integrity checking (full)¶
A more comprehensive set of checks should be implemented to run less frequently depending on time and service constraints. An example of a suitable command is:
DBCC CHECKDB WITH NO_INFOMSGS
This is a full integrity check; adding the NO_INFOMSGS option displays only error messages and suppresses additional informational messages, which makes the output of the integrity check quicker to analyse.
Data Size / Space¶
It is very important to guarantee that your databases will not run out of space and are therefore sized appropriately. It is recommended to set the database logical files to auto grow when necessary. However, it is important to note that the auto grow feature should only be used as a safety net. You should not rely on auto grow to grow your databases in a production environment, as the allocation of extra space will impact disk performance, increase fragmentation and therefore may also degrade the overall user experience.
Ensure that the database is configured to auto grow:
ALTER DATABASE [database]
MODIFY FILE (NAME = [logical filename], FILEGROWTH = 10%)
This command should be executed for each logical file in the database. The names of the logical files can be retrieved by running sp_helpfile against the database. Best practice dictates periodic review of the free space in a database. Run the following command against the main database:
sp_spaceused
This will detail the free space in the database. If free space is below a safe threshold (based on current trends), the database size should be increased (outside of working hours) by using the command:
ALTER DATABASE [DatabaseName]
MODIFY FILE (NAME = [LogicalFileName], SIZE = [absolute new file size])
…for each file to be grown. The logical file names are those returned by the sp_helpfile command mentioned above.
The size is the total new size of the file, not the amount by which to grow.
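For example, to grow a hypothetical data file to a total size of 10 GB (the database and logical file names here are illustrative):
ALTER DATABASE [SequelSecurity]
MODIFY FILE (NAME = [SequelSecurity_Data], SIZE = 10240MB) -- the new total size, not an increment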
It is also recommended to periodically delete rows that are no longer relevant from the audit log tables in the Sequel Security databases, as they can easily grow very large.
SQL Alerting¶
SQL Agent alerting can be used as a final layer of monitoring to trap error states relating to low free space in the database and the inability to grow database files.
Error numbers for typical errors are reproduced below for convenience. Additionally, it is possible to retrieve a list of all error numbers and associated descriptions by running the query: SELECT * FROM sys.messages
Free space related errors:
| Error ID | Error Description |
|---|---|
| 1101 | Could not allocate new page for database '%.*ls'. There are no more pages available in file group %.*ls. Space can be created by dropping objects, adding additional files, or allowing file growth. |
| 1105 | Could not allocate space for object '%.*ls' in database '%.*ls' because the '%.*ls' file group is full. |
| 1703 | Could not allocate disk space for a work table in database '%.*ls'. You may be able to free up space by using BACKUP LOG, or you may want to extend the size of the database by using ALTER DATABASE. |
| 9002 | The log file for database '%.*ls' is full. Back up the transaction log for the database to free up some log space. |
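As an illustration, a SQL Agent alert for error 9002 (transaction log full) could be created as follows; the alert name is hypothetical, and an operator notification still needs to be attached for the alert to be actionable:
USE msdb;
GO
EXEC dbo.sp_add_alert
    @name = N'Transaction log full',   -- hypothetical alert name
    @message_id = 9002,
    @severity = 0,                     -- must be 0 when @message_id is supplied
    @enabled = 1,
    @include_event_description_in = 1; -- include the error text in e-mail notifications
GO
Note that SQL Agent alerts only fire for errors that are written to the Windows application event log.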