
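Understanding HealthCheck in .NET Core 3.0 Through Its Source Code (Part 1)

Preface

Our system may be in an unhealthy state because it is being deployed, because a service terminated abnormally, or for some other reason. When that happens we need to know how the system is doing, and health checks help us quickly determine whether it is working normally. Typically we expose a public HTTP endpoint dedicated to health checks.

The health check libraries provided by .NET Core are Microsoft.Extensions.Diagnostics.HealthChecks.Abstractions and Microsoft.Extensions.Diagnostics.HealthChecks. Together they provide the most basic health check solution. The main extension packages built on top of them are listed below; this article does not discuss them further.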

AspNetCore.HealthChecks.System
AspNetCore.HealthChecks.Network
AspNetCore.HealthChecks.SqlServer
AspNetCore.HealthChecks.MongoDb
AspNetCore.HealthChecks.Npgsql
AspNetCore.HealthChecks.Redis
AspNetCore.HealthChecks.AzureStorage
AspNetCore.HealthChecks.AzureServiceBus
AspNetCore.HealthChecks.MySql
AspNetCore.HealthChecks.DocumentDb
AspNetCore.HealthChecks.SqLite
AspNetCore.HealthChecks.Kafka
AspNetCore.HealthChecks.RabbitMQ
AspNetCore.HealthChecks.IdSvr
AspNetCore.HealthChecks.DynamoDB
AspNetCore.HealthChecks.Oracle
AspNetCore.HealthChecks.Uris

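Exploring the source code

Microsoft.Extensions.Diagnostics.HealthChecks.Abstractions is the abstraction layer for .NET Core health checks, and it shows the design intent behind the library. It provides one unified interface, IHealthCheck, for checking the state of each monitored component in an application, such as background services and databases. The interface has a single method, CheckHealthAsync.

The method takes a HealthCheckContext, the context object associated with the current health check execution, plus an optional CancellationToken. Its return value, a HealthCheckResult, describes the state of the monitored component once the check has finished.

The source code is as follows: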

    public interface IHealthCheck
    {
        Task<HealthCheckResult> CheckHealthAsync(HealthCheckContext context, CancellationToken cancellationToken = default);
    }
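
To make the contract concrete, here is a minimal sketch of an implementation. The class name PingHealthCheck and the ping delegate are hypothetical and exist only for illustration; the sketch assumes the Microsoft.Extensions.Diagnostics.HealthChecks.Abstractions package is referenced.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical check: reports Healthy while the supplied ping delegate returns true.
public sealed class PingHealthCheck : IHealthCheck
{
    private readonly Func<CancellationToken, Task<bool>> _ping;

    public PingHealthCheck(Func<CancellationToken, Task<bool>> ping)
    {
        _ping = ping ?? throw new ArgumentNullException(nameof(ping));
    }

    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        try
        {
            bool ok = await _ping(cancellationToken);
            return ok
                ? HealthCheckResult.Healthy("Ping succeeded.")
                : HealthCheckResult.Unhealthy("Ping failed.");
        }
        catch (Exception ex) when (!(ex is OperationCanceledException))
        {
            // Attach the exception so that whoever reads the result can inspect it.
            return HealthCheckResult.Unhealthy("Ping threw an exception.", ex);
        }
    }
}
```

Taking the probe as a delegate keeps the sketch independent of any particular dependency; a real check would more likely call a database client or an HTTP client directly.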

HealthCheckRegistration

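HealthCheckContext has a single member: a HealthCheckRegistration instance.

HealthCheckRegistration is an important object because it captures what a health check has to take care of. It exposes five properties, which are used for:

  • the name that identifies the health check
  • the factory that creates the IHealthCheck instance
  • the timeout of the health check (so that a check cannot tie up resources for too long)
  • the status to report when the check fails
  • a set of tags (which can be used to filter health checks)

The source code for these five properties is as follows: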

    public Func<IServiceProvider, IHealthCheck> Factory
    {
        get => _factory;
        set
        {
            if (value == null)
            {
                throw new ArgumentNullException(nameof(value));
            }

            _factory = value;
        }
    }

    public HealthStatus FailureStatus { get; set; }

    public TimeSpan Timeout
    {
        get => _timeout;
        set
        {
            if (value <= TimeSpan.Zero && value != System.Threading.Timeout.InfiniteTimeSpan)
            {
                throw new ArgumentOutOfRangeException(nameof(value));
            }

            _timeout = value;
        }
    }

    public string Name
    {
        get => _name;
        set
        {
            if (value == null)
            {
                throw new ArgumentNullException(nameof(value));
            }

            _name = value;
        }
    }

    public ISet<string> Tags { get; }
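
As a rough illustration of how these properties fit together, the sketch below builds a registration by hand and then trips the Timeout validation shown above. It assumes the five-argument constructor (name, factory, failure status, tags, timeout) available in the 3.0 package; AlwaysHealthyCheck and RegistrationSketch are hypothetical names.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

public static class RegistrationSketch
{
    // Hypothetical check that always reports Healthy.
    private sealed class AlwaysHealthyCheck : IHealthCheck
    {
        public Task<HealthCheckResult> CheckHealthAsync(
            HealthCheckContext context, CancellationToken cancellationToken = default)
            => Task.FromResult(HealthCheckResult.Healthy());
    }

    public static void Main()
    {
        // Arguments in order: name, factory, failure status, tags, timeout.
        var registration = new HealthCheckRegistration(
            "always-healthy",
            _ => new AlwaysHealthyCheck(),
            HealthStatus.Degraded,
            new[] { "ready" },
            TimeSpan.FromSeconds(5));

        Console.WriteLine(registration.Name);
        Console.WriteLine(registration.FailureStatus);
        Console.WriteLine(registration.Tags.Contains("ready"));

        try
        {
            // The setter rejects zero and negative values other than the infinite timeout.
            registration.Timeout = TimeSpan.Zero;
        }
        catch (ArgumentOutOfRangeException)
        {
            Console.WriteLine("A zero timeout is rejected.");
        }
    }
}
```

In an ASP.NET Core application you would normally not construct registrations yourself; the AddCheck extension methods on the health checks builder create them for you.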

 

HealthCheckResult

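HealthCheckResult is a struct, which suggests that the design was driven mainly by its role as a plain data carrier and by performance considerations.

HealthCheckResult represents the outcome of a health check. As before, the type tells us which aspects of a health check matter:

  • the current status of the component
  • exception information
  • a human-readable description (whether the result is a failure or a success)
  • additional key-value pairs describing the component; this is an open-ended property that makes it easy to record extra information

The struct has four public properties and three static factory methods. The source code is as follows: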

    public struct HealthCheckResult
    {
        private static readonly IReadOnlyDictionary<string, object> _emptyReadOnlyDictionary = new Dictionary<string, object>();

        public HealthCheckResult(HealthStatus status, string description = null, Exception exception = null, IReadOnlyDictionary<string, object> data = null)
        {
            Status = status;
            Description = description;
            Exception = exception;
            Data = data ?? _emptyReadOnlyDictionary;
        }

        public IReadOnlyDictionary<string, object> Data { get; }

        public string Description { get; }

        public Exception Exception { get; }

        public HealthStatus Status { get; }

        public static HealthCheckResult Healthy(string description = null, IReadOnlyDictionary<string, object> data = null)
        {
            return new HealthCheckResult(status: HealthStatus.Healthy, description, exception: null, data);
        }

        public static HealthCheckResult Degraded(string description = null, Exception exception = null, IReadOnlyDictionary<string, object> data = null)
        {
            return new HealthCheckResult(status: HealthStatus.Degraded, description, exception: exception, data);
        }

        public static HealthCheckResult Unhealthy(string description = null, Exception exception = null, IReadOnlyDictionary<string, object> data = null)
        {
            return new HealthCheckResult(status: HealthStatus.Unhealthy, description, exception, data);
        }
    }

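All three methods create a HealthCheckResult whose status comes from the HealthStatus enum. The enum expresses the states a health check cares about: healthy, degraded, and unhealthy.

The source code of HealthStatus is as follows: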

    public enum HealthStatus
    {
        Unhealthy = 0,

        Degraded = 1,

        Healthy = 2,
    }
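
The following sketch exercises the three factory methods and the ordering of the enum. The descriptions, the "queueLength" key, and the class name ResultSketch are made up for illustration. Note that because Unhealthy is 0, a default-initialized HealthCheckResult reports Unhealthy rather than Healthy.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Extensions.Diagnostics.HealthChecks;

public static class ResultSketch
{
    public static void Main()
    {
        // Extra key-value data that travels with the result (hypothetical key).
        var data = new Dictionary<string, object> { ["queueLength"] = 42 };

        HealthCheckResult healthy = HealthCheckResult.Healthy("All good.");
        HealthCheckResult degraded = HealthCheckResult.Degraded("Queue is backing up.", data: data);
        HealthCheckResult unhealthy = HealthCheckResult.Unhealthy(
            "Database unreachable.", new TimeoutException("Connection timed out."));

        Console.WriteLine(healthy.Data.Count);            // no data supplied, so the shared empty dictionary is used
        Console.WriteLine(degraded.Data["queueLength"]);
        Console.WriteLine(unhealthy.Exception.Message);

        // A lower numeric value means a worse state.
        Console.WriteLine(HealthStatus.Unhealthy < HealthStatus.Degraded);
        Console.WriteLine(HealthStatus.Degraded < HealthStatus.Healthy);

        // The zero value of the enum is Unhealthy, so default(HealthCheckResult) is Unhealthy.
        Console.WriteLine(default(HealthCheckResult).Status);
    }
}
```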

IHealthCheckPublisher

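A health check is essentially a polling operation that has to run periodically. .NET Core abstracts the periodic reporting step as the IHealthCheckPublisher interface, which we can implement and combine with our own scheduling logic.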
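
The interface has one method, PublishAsync, which receives a HealthReport (the report type is described further below). As a minimal sketch, a publisher that just writes each report to the console could look like this; the class name ConsoleHealthCheckPublisher is hypothetical.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical publisher that prints every report it is handed.
public sealed class ConsoleHealthCheckPublisher : IHealthCheckPublisher
{
    public Task PublishAsync(HealthReport report, CancellationToken cancellationToken)
    {
        cancellationToken.ThrowIfCancellationRequested();

        Console.WriteLine($"Overall: {report.Status} ({report.TotalDuration.TotalMilliseconds} ms)");
        foreach (var entry in report.Entries)
        {
            // The key is the health check name; the value is that check's report entry.
            Console.WriteLine($"  {entry.Key}: {entry.Value.Status} - {entry.Value.Description}");
        }

        return Task.CompletedTask;
    }
}
```

One common way to wire this up is to register the class as an IHealthCheckPublisher singleton in the service collection, so that the hosted service in Microsoft.Extensions.Diagnostics.HealthChecks calls it on a schedule. Treat that wiring as an assumption to verify against the version you use.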

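Each health check run also produces a report. Which aspects of it should we care about?

  • additional key-value pairs describing the component; this is an open-ended property that makes it easy to record extra information
  • a human-readable description (whether the result is a failure or a success)
  • the current status of the component
  • exception information
  • the time the check took
  • the associated tags

HealthReportEntry represents the report for a single health check, and HealthReport represents a group of them. Internally, HealthReport keeps a dictionary of HealthReportEntry values. The source code of HealthReport is as follows: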

    public sealed class HealthReport
    {
        public HealthReport(IReadOnlyDictionary<string, HealthReportEntry> entries, TimeSpan totalDuration)
        {
            Entries = entries;
            Status = CalculateAggregateStatus(entries.Values);
            TotalDuration = totalDuration;
        }

        public IReadOnlyDictionary<string, HealthReportEntry> Entries { get; }

        public HealthStatus Status { get; }

        public TimeSpan TotalDuration { get; }

        private HealthStatus CalculateAggregateStatus(IEnumerable<HealthReportEntry> entries)
        {
            var currentValue = HealthStatus.Healthy;
            foreach (var entry in entries)
            {
                if (currentValue > entry.Status)
                {
                    currentValue = entry.Status;
                }

                if (currentValue == HealthStatus.Unhealthy)
                {
                    // Game over, man! Game over!
                    // (We hit the worst possible status, so there's no need to keep iterating)
                    return currentValue;
                }
            }

            return currentValue;
        }
    }
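
CalculateAggregateStatus takes the worst status among all entries, so a single degraded check degrades the whole report. The sketch below shows this with two hand-built entries; the entry names, descriptions, and durations are invented for illustration, and ReportSketch is a hypothetical class name.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Extensions.Diagnostics.HealthChecks;

public static class ReportSketch
{
    public static void Main()
    {
        // Arguments in order: status, description, duration, exception, data.
        var entries = new Dictionary<string, HealthReportEntry>
        {
            ["database"] = new HealthReportEntry(
                HealthStatus.Healthy, "OK", TimeSpan.FromMilliseconds(12), null, null),
            ["cache"] = new HealthReportEntry(
                HealthStatus.Degraded, "Slow responses", TimeSpan.FromMilliseconds(250), null, null),
        };

        var report = new HealthReport(entries, TimeSpan.FromMilliseconds(262));

        // The aggregate is the minimum of the entry statuses.
        Console.WriteLine(report.Status); // prints "Degraded"

        // With no entries the loop never runs, so the report stays Healthy.
        var empty = new HealthReport(
            new Dictionary<string, HealthReportEntry>(), TimeSpan.Zero);
        Console.WriteLine(empty.Status); // prints "Healthy"
    }
}
```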

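Summary

From the above we have learned that a complete health check has to deal with the health check context, the maintenance of health status, the health check result, and the health check report. In addition, to make health checks easier to maintain, we can abstract the publishing of health checks and combine it with an external timer, so that together they safeguard the health check...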