New updates and improvements to Kerno.
In this release, we've introduced several new features and enhancements to make troubleshooting faster and more efficient, along with bug fixes to improve the overall stability of the platform. From interactive service maps to a more intuitive issue resolution process, our goal is to provide you with the tools needed to resolve issues more accurately. Key updates include improved stack trace support, infrastructure monitoring at the cluster level, and an integration with Slack. We’ve also significantly improved the user interface and experience across various pages, ensuring a cleaner, more informative, and easier-to-use interface.
Interactive service maps for issue resolution – A real-time visual representation of service interactions, showing request flows, error rates, and performance metrics across your system so you can quickly pinpoint issues and identify affected services for faster troubleshooting.
Added native stack trace support for Java, JavaScript, and Golang. This feature provides language-specific error tracing, allowing developers to quickly identify and resolve issues in their codebases with detailed, language-tailored stack information.
Endpoint Grouping Enhancement: Introduced a feature to consolidate and group similar endpoint URLs by abstracting away dynamic parameters. This reduces noise in the system, allowing developers to focus on identifying problematic endpoints without being overwhelmed by unnecessary detail.
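As a rough illustration of the idea, grouping can be sketched as replacing path segments that look like dynamic parameters with placeholders. This is not Kerno's implementation; the function name `group_endpoint`, the placeholder names, and the two patterns (numeric IDs and UUIDs) are assumptions made for the example.

```python
import re

# Hypothetical patterns for path segments that look like dynamic parameters.
_NUMERIC = re.compile(r"^\d+$")
_UUID = re.compile(
    r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
    r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
)


def group_endpoint(path: str) -> str:
    """Return the path with dynamic-looking segments replaced by placeholders."""
    path = path.split("?", 1)[0]  # ignore the query string
    segments = []
    for segment in path.split("/"):
        if _NUMERIC.match(segment):
            segments.append("{id}")
        elif _UUID.match(segment):
            segments.append("{uuid}")
        else:
            segments.append(segment)
    return "/".join(segments)
```

With this sketch, `/users/42/orders/7` and `/users/99/orders/1` both map to `/users/{id}/orders/{id}`, so their metrics can be aggregated under a single endpoint.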
Normal State Metric Baseline: Introduced a new feature that calculates a "normal state" for each metric based on a service or endpoint's historical behavior. This helps developers quickly determine whether observed metrics, like error rates, are within normal ranges or cause for concern. Instead of guessing if a 15-20% error rate is problematic, developers will now have clear context based on past performance to assess and act on potential issues faster.
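One simple way to think about a baseline is a band around the historical mean. The sketch below is only an assumption for illustration: the release notes do not describe how Kerno computes the normal state, and the function names and the `k` parameter (number of standard deviations) are hypothetical.

```python
from statistics import mean, stdev


def normal_range(history, k=3.0):
    """Return (low, high) bounds: the historical mean +/- k standard deviations."""
    if len(history) < 2:
        raise ValueError("need at least two historical samples")
    mu = mean(history)
    sigma = stdev(history)
    # Rates cannot be negative, so clamp the lower bound at zero.
    return (max(0.0, mu - k * sigma), mu + k * sigma)


def is_within_normal(value, history, k=3.0):
    """Check whether a new observation falls inside the historical band."""
    low, high = normal_range(history, k)
    return low <= value <= high
```

For a service whose error rate has historically hovered between 15% and 19%, a new reading of 17% falls inside the band, while 45% does not.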
K8s Infrastructure Metrics: Added Kubernetes infrastructure metrics at the cluster, node, and pod levels to provide better context during issue resolution. This feature helps developers quickly differentiate between infrastructure-related and application-specific issues, speeding up the troubleshooting process.
Slack Integration for New Issues: Introduced an initial Slack integration that sends notifications for new issues directly to a dedicated "Kerno" channel. This allows teams to stay informed about critical problems in real time. Future updates will expand notification options for more event types.
Log Capture for Issues: Introduced log capture functionality for issues, allowing detailed logs to be automatically collected and attached to issue reports. This gives developers immediate access to relevant logs during troubleshooting, speeding up the identification and resolution of problems.
Cluster View: Introduced a new cluster view feature, allowing users to assess the overall health of their Kubernetes cluster at a glance. This feature provides a comprehensive overview of every component running within the cluster, helping developers quickly identify any issues affecting the system.
Consistent 'Empty Data' Messages—Standardized the 'Empty Data' messages across the app for a more uniform and cohesive user experience.
Infrastructure Color Coding—Updated bullet points in the infrastructure details to reflect status-based colors. The colors now indicate the health or status of components, providing clearer insight into system conditions.
Interactive Graphs—Enhanced graphs to be interactive, allowing for better filtering of errors and events. This provides users with more control and deeper insights while troubleshooting.
Issue Resolution Page Rework—Reworked the Issue Resolution page with a redesigned UX/UI for better clarity and usability. Unnecessary data has been removed, new informative visualizations have been added, and the layout has been cleaned up to improve issue troubleshooting.
Persistent Table Sorting—Table sorting is now persisted throughout a session. When users navigate away and return to the components table, their previous sorting preferences are retained for a smoother experience.
K8s Metrics—Fixed an issue with null usage metric values. Null values are now filtered out before node and pod usage metrics are sent, ensuring more accurate and reliable metric reporting.
Missing Resources in Infrastructure Details—Fixed an issue where resources were not listed in the infrastructure details section if they weren't created or updated within the selected timeframe. Now, all relevant resources are displayed regardless of their creation/update date, ensuring comprehensive visibility.
Issue Details Window Resize Glitch—Fixed a glitch where the chart on the Issue Details page would behave erratically during window resizing. The chart now adjusts smoothly and without glitches when the window is resized.
UI Improvements and Minor Fixes—Addressed various small bugs, including UI inconsistencies, incorrect labels, and graph sizing issues. These fixes enhance the overall user experience by improving visual alignment and accuracy across the interface.
This month, we prioritized resolving issues and enhancing features based on the valuable feedback from our Public Beta launch.
Cost—We optimized our infrastructure and refactored our data handling to keep offering you flat and predictable pricing.
Dashboard Performance—We optimized queries and table partition keys, resulting in a more than 5x improvement in loading performance for metrics graphs.
Service Map—Enhanced navigability with improved controls (Zoom in/out and recenter), auto-zoom on component selection, and path highlights to illustrate inter-service communications.
Dockerized Installer—Kerno can now be installed using Docker, simplifying the setup process.
Global Search—Pagination has been added to improve data management and user experience when using global search.
User list - Pagination has been added to improve data management and user experience when managing users and teams.
Stack Traces—Resolved an issue where stack traces either failed to appear or took several seconds to load in the portal.
Metrics Graphs—Fixed an issue causing metrics graphs to fail to load.
Kafka Consumer Lag—Addressed a problem where Kafka consumers were falling behind.
User Invites—Corrected an issue that prevented users from being reinvited.
This is our first early alpha release of Kerno’s discovery module! In this release you get an out-of-the-box service map outlining the distributed components in your microservices architecture and their relationships, with visual hints and context drawn from component monitoring. Say goodbye to manually updating IDPs or service catalogs!
Coming soon to this view: routes, events, change history, errors, and documentation!
Out of the box, Kerno monitors HTTP activity and detects issues from your application stack traces.
Assign branches, people and transition states from within Kerno.
Coming soon: desktop notifications and assignment suggestions based on Git and CI activity!
Kerno helps you cut through the noise: our new time-series event timeline correlates events of interest with sudden spikes in errors and rejections. Stay focused on what matters instead of getting distracted by dozens of alerts.
Kerno 2.0 now installs all the necessary components, both in the cluster and in the cloud resources required for in-cloud storage, with one friendly script! The script source code and Helm charts are available on GitHub!
Our node agent, preon, currently runs at ~5% CPU utilization. We measure this against aggressively saturated nodes (near 100% CPU utilization) running dummy services that only respond immediately to requests (at >3000 requests per second per core). This places preon in the range of ~7000 messages analyzed per second per 0.1 core, with memory utilization rarely crossing the 60 MiB baseline (we still reserve more than that to prevent OOM kills). We expect overhead in real-world scenarios to be considerably lower.
Kerno captures and uniquely identifies error samples to avoid over-transmitting and over-storing error trace data. Stack traces and metrics are sent to our cloud to guarantee a best-in-class user experience, but critical log and trace data is stored encrypted in your cloud, with very low storage consumption thanks to our at-source sample deduplication techniques.
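The general idea of at-source deduplication can be sketched as fingerprinting each error sample and transmitting it only the first time that fingerprint is seen. This is an assumed illustration, not preon's actual technique; `fingerprint`, `SampleDeduplicator`, and the choice of SHA-256 over whitespace-normalized frames are all hypothetical.

```python
import hashlib


def fingerprint(stack_trace: str) -> str:
    """Hash the trace after stripping blank lines and indentation."""
    frames = [line.strip() for line in stack_trace.splitlines() if line.strip()]
    return hashlib.sha256("\n".join(frames).encode("utf-8")).hexdigest()


class SampleDeduplicator:
    """Tracks which error samples were already seen on this node."""

    def __init__(self):
        self.counts = {}  # fingerprint -> number of occurrences

    def observe(self, stack_trace: str) -> bool:
        """Record a sample; return True only the first time it is seen."""
        key = fingerprint(stack_trace)
        first_time = key not in self.counts
        self.counts[key] = self.counts.get(key, 0) + 1
        return first_time
```

In this sketch a caller would send the full trace only when `observe` returns `True` and otherwise report just the updated occurrence count, which keeps repeated errors from being transmitted and stored again.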