There are a wide variety of solutions on the market all claim to have various levels of machine learning, adaptive baselines, market driven analytics, learning algorithms etc… By and large they all collect the data the same way, poll vCenter for performance data via real-time APIs and on at the same time or less frequently poll all the other aspects of vSphere (inventory, hierarchy, tasks & events, etc…) and then store that data at varying levels of data granularity.
With every solution collecting the data from the same place, it should make it a straight out one algorithm vs. another, but it turns out that is not the case. There are two aspects that greatly influence the analytics – Frequency of polling and data retention.
Frequency of polling is pretty straight forward. Pull data faster equates to more data points and a better chance of catching peaks, valleys and general usage. However, with faster polling it comes with a cost of performance to poll the data every X minutes/seconds (on vCenter, the data collector and the solutions database) and a huge impact longer term on storing that data. Ideally, there should be some middle ground on collecting the data.
Read the entire article here, Why data granularity matters in monitoring
via the fine folks at VMware!