Commit Graph

227 Commits

Author SHA1 Message Date
Brian Gates 074595c0a9
Fix remote API (Fabric/API key): 429 handling, NVR filter, updateWeb nil panic (#958) 2026-02-18 06:34:04 -05:00
Brian Gates 40e2a7703f
Fix panic when remote discovery fails and no controllers configured (fixes #953) (#957)
* Fix panic when remote discovery fails and no controllers are configured

Call setDefaults(&u.Default) before logController(&u.Default) when
len(u.Controllers) == 0 so HashPII, DropPII, etc. are initialized
and logController does not dereference nil pointers.

Co-authored-by: Cursor <cursoragent@cursor.com>

* chore: trigger CI re-run

Co-authored-by: Cursor <cursoragent@cursor.com>

* ci: use golangci-lint v2.9 for Go 1.26-compatible deps

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-17 18:13:25 -06:00
Brian Gates b4fa16b2fd
fix(influxunifi): use CelsiusSafe() for temp fields to fix InfluxDB type conflict (#944) (#945)
* fix(influxunifi): use CelsiusSafe() for temp fields to fix InfluxDB type conflict

Write temp_* fields as float64 instead of int64 so InfluxDB does not
report 'field type conflict' when the measurement already has float.

Requires github.com/unpoller/unifi/v5 with CelsiusSafe() (unpoller/unifi#195).
Fixes #944.

Co-authored-by: Cursor <cursoragent@cursor.com>

* deps: unifi v5.17.0; nil guards and 429 retry (unpoller#943)

- Bump github.com/unpoller/unifi/v5 to v5.17.0 (CelsiusSafe, ErrNilUnifi, RateLimitError)
- inputunifi: guard pollController for nil c.Unifi; controllerID(c) in formatSites/Clients/Devices
- inputunifi: getUnifi retry with backoff on 429 (up to 5 attempts, Retry-After or exponential backoff)

Co-authored-by: Cursor <cursoragent@cursor.com>

* test(influxunifi): expect temp_* as float after CelsiusSafe() (fix #944)

Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-03 20:12:26 -06:00
Brian Gates 5ea7fcf736
feat: UPS battery metrics, example Prometheus/Loki alerts (unpoller#930) (#941) 2026-01-31 20:25:58 -06:00
brngates98 ca568384d1 feat: add controller sysinfo metrics (unpoller#927)
- Add Sysinfo collection from stat/sysinfo endpoint
- Export controller_info, uptime, update_available, data retention, ports
- Hostname fallback: name, then site_name when API omits hostname
- Apply site name override to Sysinfo for remote/cloud
- Add Discover/Discoverer for endpoint discovery
- Require unpoller/unifi v5.15.0

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-01-31 20:25:56 -05:00
brngates98 9cfb732c11 Replace Python endpoint-discovery with --discover flag (replaces #936)
- Add --discover and --discover-output to unpoller; uses first unifi
  controller from config to probe known API endpoints and write a
  shareable markdown report.
- Add Discoverer interface and RunDiscover(); inputunifi implements
  Discoverer via unifi.DiscoverEndpoints.
- Remove tools/endpoint-discovery/ (Python/Playwright).
- Add docs/PR_936_REPLACEMENT.md. .gitignore: test config and report.

Requires unpoller/unifi with DiscoverEndpoints (use a replace directive in
go.mod until a unifi release includes it).
2026-01-30 20:17:00 -05:00
brngates98 b96606128d chore: Update go.sum for unifi v5.11.0 and fix formatting
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-01-29 17:37:33 -05:00
brngates98 b8519ca058 feat: Add WAN metrics to InfluxDB and Datadog exporters
Add comprehensive WAN metrics support to InfluxDB and Datadog exporters:

InfluxDB Metrics (measurement: wan):
- Configuration: failover_priority, load_balance_weight, provider_download_kbps,
  provider_upload_kbps, smartq_enabled, magic_enabled, vlan_enabled
- Statistics: uptime_percentage, peak_download_percent, peak_upload_percent,
  max_rx_bytes_rate, max_tx_bytes_rate
- Service Provider: service_provider_asn
- Metadata: creation_timestamp

Tags: wan_id, wan_name, wan_networkgroup, wan_type, wan_load_balance_type,
      isp_name, isp_city

Datadog Metrics (namespace: unpoller.wan.*):
- Same metrics as InfluxDB with gauge type
- All metrics tagged with WAN and ISP information

Changes:
- pkg/influxunifi/wan.go: New WAN exporter for InfluxDB
- pkg/influxunifi/influxdb.go: Add WAN to loopPoints and switchExport
- pkg/datadogunifi/wan.go: New WAN exporter for Datadog
- pkg/datadogunifi/datadog.go: Add WAN to loopPoints and switchExport

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-01-29 17:27:28 -05:00
brngates98 aac4917da7 feat: Add WAN metrics export to Prometheus
Add comprehensive WAN metrics support to unpoller:

WAN Configuration Metrics:
- wan_failover_priority: WAN failover priority
- wan_load_balance_weight: Load balancing weight
- wan_provider_download_kbps: Configured ISP download speed
- wan_provider_upload_kbps: Configured ISP upload speed
- wan_smartq_enabled: SmartQueue QoS status
- wan_magic_enabled: Magic WAN status
- wan_vlan_enabled: VLAN configuration status

WAN Statistics Metrics:
- wan_uptime_percentage: WAN uptime percentage
- wan_peak_download_percent: Peak download utilization
- wan_peak_upload_percent: Peak upload utilization
- wan_max_rx_bytes_rate: Maximum receive rate
- wan_max_tx_bytes_rate: Maximum transmit rate

WAN Service Provider Metrics:
- wan_service_provider_asn: ISP autonomous system number

Labels include:
- wan_id, wan_name, wan_networkgroup
- wan_type (dhcp, static, pppoe)
- wan_load_balance_type (weighted, failover-only)
- isp_name, isp_city (service provider metrics)
- site_name, source

Changes:
- pkg/poller/config.go: Add WANConfigs field to Metrics struct
- pkg/poller/inputs.go: Append WAN configs in metric aggregation
- pkg/inputunifi/input.go: Add WANConfigs field to Metrics struct
- pkg/inputunifi/collector.go: Fetch WAN enriched configuration
- pkg/promunifi/wan.go: New WAN metrics exporter
- pkg/promunifi/collector.go: Initialize and export WAN metrics

Depends on: unpoller/unifi PR (WAN API support)

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-01-29 17:24:12 -05:00
brngates98 86bc1c9d6d fix: rename unused exportWithTags param to _ to satisfy revive
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-01-29 17:07:43 -05:00
brngates98 74c30eabe6 feat: Add DHCP lease metrics export to Prometheus
- Add DHCP lease fetching in inputunifi collector
- Create promunifi/dhcp_leases.go with network-level and per-lease metrics
- Network-level metrics: pool_size, active_leases, utilization_percent, free_percent, available_ips
- Per-lease metrics: is_static, lease_end, lease_start, lease_time
- Separate network-level pool metrics from per-lease metrics
2026-01-28 21:42:44 -05:00
brngates98 6d85ea76ab Add device tag support to Prometheus metrics
- Add 'tag' label to all device metric descriptors
- Update exportWithTags helper to create separate metric series per tag
- Update all device export functions (UAP, USW, UDM, USG, UXG, PDU, UBB, UCI) to include tags
- Update all label arrays (VAP, Radio, Port, etc.) to include tag label
- Devices with multiple tags create multiple metric series (one per tag)
- Devices without tags export with tag=""

Requires unpoller/unifi#92
2026-01-28 20:48:10 -05:00
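The per-tag behaviour described in the commit above (one series per tag, `tag=""` for untagged devices) can be illustrated with a small sketch. `tagValues` and `seriesLabels` are hypothetical helpers, not the project's `exportWithTags` implementation.

```go
package main

import "fmt"

// tagValues returns the values of the "tag" label for one device:
// one entry per tag, or a single empty string when the device has no tags.
func tagValues(tags []string) []string {
	if len(tags) == 0 {
		return []string{""}
	}
	return tags
}

// seriesLabels builds one label-value set per tag by appending the tag
// value to a copy of the base label values.
func seriesLabels(base, tags []string) [][]string {
	values := tagValues(tags)
	out := make([][]string, 0, len(values))
	for _, tag := range values {
		labels := make([]string, 0, len(base)+1)
		labels = append(labels, base...)
		labels = append(labels, tag)
		out = append(out, labels)
	}
	return out
}

func main() {
	base := []string{"office-ap", "default"}
	fmt.Println(seriesLabels(base, []string{"floor1", "critical"}))
	fmt.Println(seriesLabels(base, nil))
}
```

One consequence of this design is that a device with N tags produces N series for every metric, so sums across the `tag` label count such a device N times.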
Cody Lee 97d3f995b1
Enrich alarms with device names for Loki logs
Added device name enrichment to alarms so that Loki logs show
human-readable device names instead of just MAC addresses.

Changes:
- Modified collectAlarms to fetch devices and build MAC-to-name lookup
- Added extractDeviceNameFromAlarm helper to extract MAC addresses from
  alarm messages and lookup corresponding device names
- Device names are extracted from messages like "AP[fc:ec:da:89:a6:91]"
  or from SrcMAC/DstMAC fields
- Added go.mod replace directive to use local unifi library with new
  DeviceName field

The device_name field will now be included in the JSON output sent to
Loki, making it easier to identify which device triggered an alarm.

Fixes #415

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-25 12:17:12 -06:00
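A minimal sketch of the message-based lookup described above, assuming the MAC address appears in square brackets as in "AP[fc:ec:da:89:a6:91]". The function name, the regular expression, and the lowercase map keys are assumptions made for illustration. The SrcMAC/DstMAC fallback is not shown.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// macInBrackets matches a colon-separated MAC address inside square
// brackets, e.g. "AP[fc:ec:da:89:a6:91]".
var macInBrackets = regexp.MustCompile(`\[((?:[0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2})\]`)

// deviceNameFromMessage extracts the first bracketed MAC from msg and
// looks it up in names (keyed by lowercase MAC). It returns "" when the
// message has no MAC or the MAC is unknown.
func deviceNameFromMessage(msg string, names map[string]string) string {
	m := macInBrackets.FindStringSubmatch(msg)
	if m == nil {
		return ""
	}
	return names[strings.ToLower(m[1])]
}

func main() {
	names := map[string]string{"fc:ec:da:89:a6:91": "Office AP"}
	fmt.Println(deviceNameFromMessage("AP[fc:ec:da:89:a6:91] was disconnected", names))
}
```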
Cody Lee ae1ab40386
Populate num_user field for VPN subsystem metrics
Fixes #417

UniFi controllers populate RemoteUserNumActive for VPN connections but
leave NumUser at 0 for the VPN subsystem. This caused dashboard queries
looking for num_user in the VPN subsystem to always show 0 active users,
even when VPN connections were active.

Root Cause:
For most subsystems (wlan, lan, www), the controller populates NumUser
directly. However, for the VPN subsystem, the controller uses the
RemoteUserNumActive field instead, leaving NumUser at 0.

The Prometheus exporter had special handling for VPN (lines 148-156 in
pkg/promunifi/site.go) and exported RemoteUserNumActive, but did not
export NumUser. The InfluxDB and Datadog exporters exported all fields
for all subsystems without special handling, resulting in num_user
always being 0 for VPN.

Existing Grafana dashboards query:
  SELECT "num_user" FROM "subsystems" WHERE subsystem='vpn'

This always returned 0 even with active VPN users.

Solution:
For all three exporters (InfluxDB, Datadog, Prometheus), when the
subsystem is 'vpn' and NumUser is 0 but RemoteUserNumActive has a
value, populate num_user with RemoteUserNumActive.

Changes:
- pkg/influxunifi/site.go: Add VPN-specific num_user fallback logic
- pkg/datadogunifi/site.go: Add VPN-specific num_user fallback logic
- pkg/promunifi/site.go: Add NumUser metric to VPN case with fallback

This maintains backward compatibility - existing queries for num_user
will now work correctly, and the remote_user_num_active field is still
available for those who updated their dashboards.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-25 12:09:01 -06:00
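The fallback rule from the Solution paragraph above, written as a standalone sketch. The `subsystem` struct and `effectiveNumUser` are hypothetical and only mirror the field names mentioned in the commit message. They are not the library's types.

```go
package main

import "fmt"

// subsystem is a hypothetical, trimmed-down view of a site subsystem.
type subsystem struct {
	Name                string
	NumUser             int
	RemoteUserNumActive int
}

// effectiveNumUser returns the value to export as num_user. For the VPN
// subsystem it falls back to RemoteUserNumActive when NumUser is 0.
func effectiveNumUser(s subsystem) int {
	if s.Name == "vpn" && s.NumUser == 0 && s.RemoteUserNumActive > 0 {
		return s.RemoteUserNumActive
	}
	return s.NumUser
}

func main() {
	vpn := subsystem{Name: "vpn", NumUser: 0, RemoteUserNumActive: 3}
	wlan := subsystem{Name: "wlan", NumUser: 12}
	fmt.Println(effectiveNumUser(vpn), effectiveNumUser(wlan))
}
```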
Cody Lee f51a0c7202
Allow polling to continue when individual controllers fail
Fixes #425

When polling multiple controllers, if one controller was down or
unreachable, unpoller would stop collecting data from ALL controllers.
This caused complete data loss across all sites when just one was down.

Root Cause:
Both Metrics() and Events() methods would immediately return an error
when any controller failed, skipping all remaining controllers in the
loop.

Changes:
- Log errors from failed controllers but continue to next controller
- Track collection errors separately from successful data collection
- Only return error if ALL controllers failed and no data was collected
- Return success if at least one controller provided data

This allows unpoller to continue monitoring healthy controllers even
when some are temporarily unreachable due to network issues, timeouts,
or maintenance.

Example behavior:
- Controller 1: Down (timeout) - logs error, continues
- Controller 2: Up - collects data successfully
- Controller 3: Up - collects data successfully
- Result: Returns data from controllers 2 and 3

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-25 12:01:37 -06:00
Cody Lee a1a8963159
Fix authentication retry to prevent data gaps after re-auth
Fixes #904

When a poll fails (typically with 401 Unauthorized after ~2 hour token
expiration), the code would re-authenticate but then return the original
poll error without retrying. This caused a one-minute data gap every
2 hours.

Changes:
- After successful re-authentication, retry the poll operation
- Add 500ms delay before retry to allow controller to process new auth
- Rename error variable to avoid shadowing during re-auth attempt

This ensures that transient authentication failures during the re-auth
window don't cause data gaps.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-25 11:53:31 -06:00
Cody Lee 9e3debd58a
Allow PoE-providing ports to be scraped even when disabled
Ports providing PoE power are no longer considered "dead" even when
disabled or down. This allows users to collect PoE metrics from ports
that are disabled for security reasons but still providing power.

Fixes #910

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-25 11:31:39 -06:00
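One way to express the rule above is shown below. The `port` struct, the `PoEPowerWatts` field, and the "power greater than zero" test are assumptions for illustration. The commit message does not say how the exporter detects that a port is providing PoE.

```go
package main

import "fmt"

// port is a hypothetical, simplified switch port.
type port struct {
	Up            bool
	Enabled       bool
	PoEPowerWatts float64 // assumed indicator that the port delivers PoE
}

// isDead reports whether a port should be skipped when exporting.
// A port that is delivering PoE power is never treated as dead.
func isDead(p port) bool {
	if p.PoEPowerWatts > 0 {
		return false
	}
	return !p.Up || !p.Enabled
}

func main() {
	disabledPoE := port{Up: false, Enabled: false, PoEPowerWatts: 4.2}
	disabledIdle := port{Up: false, Enabled: false}
	fmt.Println(isDead(disabledPoE), isDead(disabledIdle))
}
```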
Cody Lee 07781214c3
Add config option to suppress unknown device type messages
Adds log_unknown_types config option (default: false) to control logging
of unknown UniFi device types. When disabled (default), unknown devices
are silently ignored to reduce log volume. When enabled, they are logged
as DEBUG messages instead of ERROR. Addresses issue #912.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-25 11:24:33 -06:00
brngates98 1235430478 Update to unifi library v5.6.0 and fix linter errors
- Update go.mod to use unifi library v5.6.0 (includes remote API support)
- Remove temporary replace directive now that v5.6.0 is published
- Fix empty-block linter errors in input.go by removing empty if blocks
2026-01-25 10:58:08 -05:00
brngates98 e17d8bf62e move remote.go to use unifi library functions 2026-01-25 08:59:11 -05:00
brngates98 0cb331a745 Fix golangci-lint empty-block errors in input.go
Remove empty if blocks by inverting conditions:
- Line 289: Invert Remote check for URL default
- Line 303: Invert APIKey check in Remote mode
- Line 401: Invert Remote check for URL default in setControllerDefaults
2026-01-25 08:34:06 -05:00
brngates98 28e77d1ac5 Fix site name override for DPI clients, anomalies, and site metrics
- Apply site name override to DPI clients (ClientsDPI) in augmentMetrics
- Apply site name override to client anomalies when collecting events
- Apply site name override to sites (both Name and SiteName fields) when adding to metrics
- Apply site name override to DPI sites, speed tests, and country traffic
- Move applySiteNameOverride call to end of augmentMetrics to ensure all metrics are processed
- This ensures all Prometheus metrics use console names instead of 'Default (default)' for Cloud Gateways
2026-01-24 22:26:49 -05:00
brngates98 3996fd8683 Format code with gofmt 2026-01-24 18:22:40 -05:00
brngates98 d0abba6ddb Improve site name override to handle all default site name variations
- Add isDefaultSiteName helper to match any site name containing 'default' (case-insensitive)
- Handles variations like 'Default', 'default', 'Default (default)', etc.
- Ensures site_name in metrics shows console names instead of generic 'Default' values
- Makes metrics more compatible with existing dashboards that expect meaningful site names
- Also checks SiteName field on sites in addition to Name field
2026-01-24 18:22:34 -05:00
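The matching rule described above (any site name containing 'default', case-insensitive) is short enough to sketch directly. The body of `isDefaultSiteName` here is a guess at the simplest implementation, and `displaySiteName` is a hypothetical helper showing how the check might be applied.

```go
package main

import (
	"fmt"
	"strings"
)

// isDefaultSiteName reports whether name contains "default" in any case,
// which covers "Default", "default", and "Default (default)".
func isDefaultSiteName(name string) bool {
	return strings.Contains(strings.ToLower(name), "default")
}

// displaySiteName returns consoleName when the site still carries a
// generic default name; otherwise it keeps the site's own name.
func displaySiteName(siteName, consoleName string) string {
	if consoleName != "" && isDefaultSiteName(siteName) {
		return consoleName
	}
	return siteName
}

func main() {
	fmt.Println(displaySiteName("Default (default)", "HQ Gateway"))
	fmt.Println(displaySiteName("Warehouse", "HQ Gateway"))
}
```

A substring match is deliberately broad: a site that a user named something like "Default Warehouse" would also be overridden under this rule.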
brngates98 1440f1426e Fix site name override for remote API Cloud Gateways
- Keep actual site name 'default' for API calls to prevent 404 errors
- Apply site name override only in metrics for display purposes
- Fixes issue where console names were used in API paths causing 404s
- Site name override now correctly applied to devices, clients, sites, and rogue APs in metrics only
2026-01-24 17:46:32 -05:00
brngates98 5f76c59fa2 Fix duplicate controllers caused by the Cloud Gateway site being named default 2026-01-24 17:42:54 -05:00
brngates98 28eae6ab22 Add remote API support for UniFi Site Manager
- Add remote API mode with automatic controller discovery
- Discover consoles via /v1/hosts endpoint
- Auto-discover sites for each console via integration API
- Use console name from hosts response as site name override for Cloud Gateways
- Support both config-level and per-controller remote mode
- Add example configs for YAML, JSON, and TOML formats
- Remote API uses api.ui.com with X-API-Key authentication
- Automatically discovers all consoles when remote=true and remote_api_key is set

This enables monitoring multiple UniFi Cloud Gateways through a single
API key without requiring direct network access to each controller.
2026-01-24 17:32:36 -05:00
aharper343 25ba0bd14a Fix incorrect initialization of SaveTraffic 2025-12-24 14:08:47 -05:00
aharper343 f7d488a887 Lint and format cleanup 2025-12-24 12:09:19 -05:00
aharper343 9b62519bfe Rebasing 2025-12-24 00:25:09 -05:00
aharper343 6205900446 Adding constants for periods and debug logs for retrieved counts 2025-12-24 00:23:05 -05:00
aharper343 ab7073d63d Added support for regions and sub-regions 2025-12-24 00:23:05 -05:00
aharper343 22dfc25801 Temp fix for test cases and warning from Dockerfile 2025-12-24 00:23:05 -05:00
aharper343 0b9d3de5cc First working version: DPI metrics and traffic exported 2025-12-24 00:23:00 -05:00
Sven Grossmann 7e59c4883b fix: add HTTP timeout configuration to prevent indefinite hangs
The UniFi controller HTTP client was created without a timeout, causing
unpoller to hang indefinitely when the controller becomes unresponsive.
This resulted in random stops where polling would cease until the
container was restarted.

Changes:
- Add Timeout field to Controller struct (cnfg.Duration)
- Set default timeout of 60 seconds
- Pass timeout to unifi.Config when creating the client
- Log timeout value on startup for visibility

The timeout can be configured via:
- Config file: timeout = "60s"
- Environment: UP_UNIFI_DEFAULT_TIMEOUT=60s

Fixes issue where container would hang overnight:
  2025/12/22 22:29:27 - Requesting https://unifi/.../stat/sta
  [~2 hour gap - request hung indefinitely]
  2025/12/23 00:17:57 - Unmarshalling Device Type: udm...
2025-12-23 11:13:54 +01:00
Sven Grossmann 07e1e5bc4d feat: add UniFi Protect logs support with Loki integration
- Add SaveProtectLogs config option to enable Protect log collection
- Add ProtectThumbnails config option to fetch event thumbnails
- Add collectProtectLogs function with 24h default fetch window
- Add ProtectLogEvent for Loki reporting with separate thumbnail log lines
- Add PII redaction for Protect log entries
- Filter thumbnail fetching to camera events only (motion, smartDetect*, etc.)
- Update log output to show Protect logs status
2025-12-22 22:55:30 +01:00
Sven Grossmann a3dc4cd0b2 feat: add save_syslog option for v2 system-log API
Add new save_syslog config option to collect events from the v2 UniFi
system-log API (/v2/api/site/{site}/system-log/all).

Changes:
- Add SaveSyslog field to Controller struct
- Add collectSyslog() function using v2 API
- Keep collectEvents() using v1 API for backwards compatibility
- Add RedactIPPII() helper for PII redaction
- Update lokiunifi to log raw JSON (parseable with Loki | json)
- Reduce indexed labels to low-cardinality fields only
- Add SystemLogEntry handler in lokiunifi report

Config: save_syslog (v2 API) vs save_events (v1 API)
Env: UP_UNIFI_DEFAULT_SAVE_SYSLOG=true
2025-12-22 17:23:53 +01:00
Cody Lee a00aeb2eb5
Add byte counters for InfluxDB and Prometheus outputs (issue #350)
Track the number of bytes written per request for both InfluxDB and Prometheus outputs.

InfluxDB:
- Added bytesT counter constant
- Implemented calculateMetricBytes() to estimate line protocol size
- Updated batchV1() and batchV2() to count bytes per point
- Updated log output to display bytes written

Prometheus:
- Added Bytes field to Report struct
- Updated export() to calculate approximate metric byte size
- Updated log output to display bytes written

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-11 10:55:33 -06:00
Cody Lee 0ffe6152ab
Fix multi-WAN speed test reporting (issue #841)
Speed tests were not being reported correctly for multi-WAN setups
because the device-level speedtest-status field was returning zeros.
The data has moved to a new aggregated dashboard API endpoint.

Changes:
- Add GetSpeedTests() and GetSiteSpeedTests() methods to fetch from
  /v2/api/site/{site}/aggregated-dashboard endpoint
- Create SpeedTestResult data structures to capture per-WAN metrics
- Update Prometheus exporter with new speedtest_* metrics per interface
- Update InfluxDB exporter to write speedtest measurements per WAN
- Update Datadog exporter with unifi.speedtest.* metrics per WAN
- Update metrics collection to include speed test data for all sites

Metrics now include labels/tags for:
- wan_interface: Physical interface (eth8, eth9, etc.)
- wan_group: Logical WAN name (WAN, WAN2, etc.)
- site_name: Site identifier
- source: Controller URL

Gracefully handles older controllers without the new API endpoint.

Fixes #841

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-09 16:46:14 -06:00
Cody Lee 8000597fce
Refactor Prometheus UBB label construction to use append
Replace manual array indexing (labels[1], labels[2], labels[3]) with
cleaner append syntax using slice notation (labels[1:]...).

This makes the code more maintainable and idiomatic Go.

Before: labelTotal := []string{"total", labels[1], labels[2], labels[3]}
After:  labelTotal := append([]string{"total"}, labels[1:]...)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-09 11:12:00 -06:00
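The Before and After lines above produce the same result when `labels` has exactly four elements, which the following sketch checks. The function names and sample label values are made up for the comparison. With a longer `labels` slice the append form would carry the extra elements along, whereas the indexed form would drop them.

```go
package main

import "fmt"

// totalLabelsIndexed is the "before" form: fixed positions 1 through 3.
func totalLabelsIndexed(labels []string) []string {
	return []string{"total", labels[1], labels[2], labels[3]}
}

// totalLabelsAppend is the "after" form: everything from index 1 onward.
func totalLabelsAppend(labels []string) []string {
	return append([]string{"total"}, labels[1:]...)
}

// equal reports whether two string slices have the same contents.
func equal(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

func main() {
	labels := []string{"wifi0", "bridge-1", "default", "controller"}
	fmt.Println(equal(totalLabelsIndexed(labels), totalLabelsAppend(labels)))
}
```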
Cody Lee c61d2651a2
Enhance InfluxDB and Datadog UBB outputs with comprehensive metrics
This change significantly expands the metrics exported for UBB devices
to InfluxDB and Datadog, matching the comprehensive coverage added to
the Prometheus output.

Changes to InfluxDB (pkg/influxunifi/ubb.go):
- Added batchUBBstats() to export comprehensive statistics separated
  by radio (total, wifi0, terra2, user-wifi0, user-terra2)
- Added VAP table export via processVAPTable()
- Added Radio table export via processRadTable()
- Added P2P stats (rx_rate, tx_rate, throughput)
- Added link quality metrics (link_quality, link_quality_current,
  link_capacity)
- Comprehensive stats exported to new "ubb_stats" table with full
  breakdown of traffic per radio

Changes to Datadog (pkg/datadogunifi/ubb.go):
- Added batchUBBstats() to export comprehensive statistics separated
  by radio (total, wifi0, terra2, user-wifi0, user-terra2)
- Added VAP table export via processVAPTable()
- Added Radio table export via processRadTable()
- Added P2P stats (rx_rate, tx_rate, throughput)
- Added link quality metrics (link_quality, link_quality_current,
  link_capacity)
- Comprehensive stats exported with namespace "ubb.stats"

All implementations now fully support:
- 5GHz radio (wifi0) metrics
- 60GHz radio (terra2/ad) metrics - Full 802.11ad support!
- Per-radio RX/TX packets, bytes, errors, dropped, retries
- User-specific metrics for each radio
- Interface-specific metrics (ath0 for 5GHz, wlan0 for 60GHz)
- Point-to-point link statistics and quality metrics

Fixes: #409

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-09 11:04:35 -06:00
Cody Lee 6a135c60a7
Enhance UBB device support with comprehensive Prometheus metrics
This change significantly improves UniFi Building Bridge (UBB) device
support by adding comprehensive Prometheus metric exports.

UBB devices are point-to-point wireless bridges with dual radios:
- wifi0: 5GHz radio (802.11ac)
- terra2/wlan0/ad: 60GHz radio (802.11ad - Terragraph/WiGig)

Changes:
- Added exportUBBstats() to export UBB-specific statistics separated
  by radio (total, wifi0, terra2, user-wifi0, user-terra2)
- Added exportP2Pstats() to export point-to-point link metrics
  (rx_rate, tx_rate, throughput)
- Added VAP (Virtual Access Point) table export via existing exportVAPtable()
- Added Radio table export via existing exportRADtable() to capture
  60GHz radio metrics
- Added link quality metrics (link_quality, link_quality_current,
  link_capacity)
- Added comprehensive comments documenting UBB device characteristics
  and 60GHz band support

The implementation reuses existing UAP metric descriptors where
appropriate, allowing UBB metrics to be collected alongside UAP metrics
in Prometheus with proper labeling for differentiation.

Requires: unpoller/unifi#169 (UBB type definition fixes)
Fixes: #409

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-09 10:20:23 -06:00
Cody Lee 832334655c
Fix health check port binding conflict (issue #892)
The Docker health check was attempting to bind to ports already in use by
the running application, causing "address already in use" errors. This fix
adds a health check mode that skips network binding operations while still
validating output configuration (listen addresses, paths, etc.).

Changes:
- Add health check mode flag in pkg/poller/outputs.go
- Update prometheus and webserver DebugOutput() to skip port binding in health check mode
- Maintain full configuration validation without network conflicts

Fixes #892

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-09 08:11:21 -06:00
Cody Lee b960695f3b
Add Docker health check support
Implements #406 by adding a --health CLI flag and HEALTHCHECK instruction
to the Dockerfile. This allows Docker and container orchestration platforms
to monitor container health automatically.

Changes:
- Added --health flag that validates configuration and plugin connectivity
- Implemented HealthCheck() method in pkg/poller/commands.go
- Updated Dockerfile with HEALTHCHECK instruction (30s interval, 10s timeout)
- Updated MANUAL.md with --health flag documentation
- Added health check documentation to Docker README
- Added comments to docker-compose examples about built-in health check

The health check:
- Validates configuration file is found and parseable
- Ensures at least one input and one enabled output are configured
- Performs basic validation on enabled outputs
- Returns exit code 0 (healthy) or 1 (unhealthy)
- Runs silently for Docker compatibility

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-08 13:09:10 -06:00
Cody Lee 7e2fb0135e
Fix Datadog client interface change; update deprecated context library in InfluxDB 2025-12-03 11:51:40 -06:00
Cody Lee 6f4384c18d
fix linting 2025-12-03 11:40:21 -06:00
Cody Lee c3126d27e3
interface change updates 2025-08-20 11:36:29 -05:00
Traxmaxx 8fb9c3cb40 fix: skip loki reporting if streams is empty 2025-07-20 13:18:58 +02:00
Sofiane A 10ccd0c2d7 Correct logic for default site condition 2025-04-29 19:34:52 +02:00
Sofiane A 5a89a4634a Add default_site_name_override to support customizable default site names 2025-04-29 16:12:32 +02:00