# monitor

**Repository Path**: tars-node/monitor

## Basic Information

- **Project Name**: monitor
- **Description**: Service monitoring and property monitoring reporting for the TARS framework
- **Primary Language**: JavaScript
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2019-09-13
- **Last Updated**: 2020-12-19

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Monitor

`Monitor` is a `TARS(TUP)` service and property monitoring and reporting module. It consists of 3 sub-modules:

* Service monitoring (stat)
* Property monitoring (property)
* PP monitoring (propertyplus)

## Installation

`npm install @tars/monitor`

## Initialization

__If the service runs through [node-agent](https://github.com/tars-node/node-agent "node-agent") (or on the TARS platform), you do not need to call this method.__

Initialization is done by calling the `init(data)` method of the specific sub-module.

__data__: either the path of the TARS configuration file or a configured `(@tars/utils).Config` instance.

## Service monitoring (stat)

```js
var stat = require('@tars/monitor').stat;
```

Service monitoring counts (reports) the `successful, failed, and timed-out invocations` of each request, and collects the `invocation time` when the invocation succeeds.

Because other modules already integrate this module, __the service script generally does not need to use it explicitly.__ The integrating modules are:

* TUP Client & Server: reported by [@tars/rpc](https://github.com/tars-node/rpc "@tars/rpc").
* HTTP(S) Server: reported by [node-agent](https://github.com/tars-node/node-agent "node-agent"). Due to the characteristics of the `HTTP(S)` protocol:
  * Timed-out calls are not counted; the timeout count of every request is reported as 0.
  * [response.statusCode >= 400](http://www.nodejs.org/api/http.html#http_response_statuscode "http_response_statuscode") is counted as a failed call; otherwise it is counted as a successful call.

If you need to report manually, use:

```js
stat.report(headers, type[, timeout]);
```

__headers__:

* __masterName__: name of the calling module
* __slaveName__: name of the called module
* __interfaceName__: interface name of the called module
* __masterIp__: IP of the caller
* __slaveIp__: IP of the callee
* __slavePort__: port of the callee
* __bFromClient__: `true` if reported by the client, `false` if reported by the server
* __returnValue__: return value, *default is 0*

If the called service is deployed with `set`, `headers` also needs to include:

* __slaveSetName__: `set` name of the callee
* __slaveSetArea__: `set` area of the callee
* __slaveSetID__: `set` ID of the callee

If the calling service is deployed with `set`, `headers` also needs to include:

* __masterSetInfo__: `set` information of the caller (in the form `setName.setArea.setID`)

The value of the `type` parameter is one of `stat.TYPE`, as shown below:

__stat.TYPE__:

* __SUCCESS__: success
* __ERROR__: failure
* __TIMEOUT__: timeout

If `type === stat.TYPE.SUCCESS`, the response time `timeout` _(integer)_ must also be reported.
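For illustration, a manual report might look like the sketch below. The module names, IPs, and port are hypothetical placeholders; only the `headers` fields and `stat.TYPE` values documented above are assumed.

```js
var stat = require('@tars/monitor').stat;

// All identifiers below are illustrative placeholders
var headers = {
  masterName: 'MyApp.WebServer',    // calling module
  slaveName: 'MyApp.HelloServer',   // called module
  interfaceName: 'hello',           // called interface
  masterIp: '10.0.0.1',
  slaveIp: '10.0.0.2',
  slavePort: 10000,
  bFromClient: true,                // reported from the client side
  returnValue: 0
};

// Successful call: pass the response time (integer) as the third argument
stat.report(headers, stat.TYPE.SUCCESS, 15);

// Failed call: no response time is required
stat.report(headers, stat.TYPE.ERROR);
```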
After the data is reported, you can view the reported data in the service monitoring page.

## Property monitoring (property)

```js
var property = require('@tars/monitor').property;
```

Property monitoring reports "custom properties" of the service script. A report consists of the property name, the property value, and the statistical methods _(key/value pairs)_.

### property.create(name, policies)

Calling the `create` method returns (or creates) a report object, whose `report(value)` method is used to report data. `name` is the name of the reported property, and `policies` is an array of statistical method instances (specifying how the data is aggregated).

```js
property.create('name', [new property.POLICY.Count, new property.POLICY.Max]);
```

_The instances in the `policies` array must not contain duplicate statistical methods._

__Please note: Do not call the `create` method every time you report; doing so causes unnecessary performance loss.__

### obj.report(value)

`property.create` returns a report object whose `report` method is used to report data. Each call to `report` reports a single data point, and `value` should generally be a numeric value.
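As a usage sketch (the property name `requestCost` and the reported values are hypothetical), create the report object once and reuse it for every report:

```js
var property = require('@tars/monitor').property;

// Create the report object once, e.g. at module load time
var requestCost = property.create('requestCost', [
  new property.POLICY.Avg,
  new property.POLICY.Max,
  new property.POLICY.Count
]);

// Later, report one value per handled request
requestCost.report(35);
requestCost.report(120);
```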
After the data is reported, you can view the reported data in the property monitoring page.

## PP monitoring (propertyplus)

```js
var pp = require('@tars/monitor').propertyplus;
```

PP monitoring lets users report data with `custom dimensions` and `custom metrics`; a report consists of the dimension values, the metric values, and the statistical method of each metric.

Compared with `property monitoring`, PP monitoring supports more dimensions and wider customization, and it can produce multi-dimensional monitoring output similar to `service monitoring`.

### pp.create(name [, policies, options])

Calling the `create` method returns (or creates) a report object, whose `report(keys, values)` method is used to report data.

* __name__: name of the reported log
* __policies__: an array of statistical method classes (specifying, in order, the statistical method of each metric value). *By default every metric is aggregated with `POLICY.Sum`.*
* __options__:
  * __notTarsLog__: whether the report comes from a non-TARS service (the log name is assembled differently and there are no default dimension values). *Default: `false`*
  * __cacheKeyPolicy__: whether to enable the local cache to improve performance. *Default: `false`*

```js
pp.create('name', [pp.POLICY.Count, pp.POLICY.Max]);
```

__The statistical method at each position of the `policies` array specifies the statistical policy of the metric value at the same position in the reported values array.__ Therefore, the number of statistical methods must equal the number of metrics reported each time, that is, `policies.length === values.length`.

All statistical methods except `POLICY.Distr` can be used with PP monitoring.

If the number of dimension combinations reported by the service script (the cardinality of the dimension values) is very large, it is recommended to enable `cacheKeyPolicy` to improve performance and avoid memory overflow.

Do not call the `create` method every time you report; doing so causes unnecessary performance loss.

### obj.report(keys, values)

`pp.create` returns a report object whose `report` method is used to report data.

Each item in the `keys` array must be a __string__, representing a _dimension_. Each item in the `values` array must be a __number__, representing a _metric value_.

__For the same report object, the number and order of dimensions and metrics must be the same on every call, and the order of the metric values must match the order of the `policies` statistical methods.__

### Examples

When the service calls a DB, the DB calls need to be monitored, where:

* Dimensions: DB name and the corresponding IP
* Metrics: number of calls and average elapsed time

```js
var obj = pp.create('db_status', [pp.POLICY.Sum, pp.POLICY.Avg]);
```

A call to DB `abc@192.168.1.1` takes 12.2 ms:

```js
obj.report(['abc', '192.168.1.1'], [1, 12.2]);
```

A call to DB `test@127.0.0.1` takes 25.6 ms:

```js
obj.report(['test', '127.0.0.1'], [1, 25.6]);
```

## Statistical methods

The data reported by `property monitoring` (that is, when `create` is called) needs one or more statistical methods specified so that the values can be aggregated over a period of time.

These methods are defined in `POLICY`:

* POLICY.Max: maximum value
* POLICY.Min: minimum value
* POLICY.Count: number of reported values
* POLICY.Sum: sum of all reported values
* POLICY.Avg: average of the reported values
* POLICY.Distr: distribution over intervals

__Except for `property.POLICY.Distr`, none of them take constructor parameters.__

### property.POLICY.Distr(ranges)

`Distr` aggregates values into intervals. The intervals are defined in advance, and the number of reported `value`s falling into each interval is counted automatically. When the data is displayed, the per-interval statistics are shown as a _pie chart_.

The parameter `ranges` is an array; each item is an integer, and the items must be in ascending order. For example:

```js
var monitor = property.create('name', [new property.POLICY.Distr([0, 10, 100, 1000])]);

monitor.report(2);
monitor.report(20);
monitor.report(200);
```

The statistics of the above example are:

> [0-10] = 1
> (10-100) = 1
> (100-1000) = 1

_Each interval includes its right endpoint and excludes its left endpoint (except the first interval)._

## Reporting interval

Data is not reported each time the `report` method is called. The module collects the data submitted within a certain period of time, aggregates it, and reports it in one batch (as a one-way call).

The module automatically reads the `tars.application.client.report-interval` configuration section (unit: ms) in the TARS configuration file to configure the reporting interval.

__Please note: The configured reporting interval cannot be lower than 10s, nor higher than the `TARS master refresh time` (that is, the `tars.application.client.refresh-endpoint-interval` configuration section).__

To prevent cyclic calls, the call status of the reporting module itself is reported by the called party (that is, it follows the reporting logic of one-way calls).
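For reference, a minimal sketch of the relevant client settings in a TARS configuration file is shown below; the values are illustrative only, and the exact layout depends on your deployment.

```
<tars>
    <application>
        <client>
            # reporting interval for monitoring data, in ms (illustrative value)
            report-interval = 60000
            # registry (master) endpoint refresh interval, in ms (illustrative value)
            refresh-endpoint-interval = 60000
        </client>
    </application>
</tars>
```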