Compare commits

..

21 Commits

Author SHA1 Message Date
J. Nick Koston b2257caeb7 touch ups 2026-05-22 09:11:01 -05:00
J. Nick Koston 0ec0ea30ac single pass 2026-05-22 09:06:15 -05:00
J. Nick Koston 584b32c8b3 address copilot, cleanups 2026-05-22 09:01:39 -05:00
J. Nick Koston 4033a8b83a Apply suggestions from code review
Co-authored-by: J. Nick Koston <nick+github@koston.org>
2026-05-22 08:53:47 -05:00
J. Nick Koston add8a5f799 Merge branch 'dev' into cache-split-tests 2026-05-22 08:53:27 -05:00
DeerMaximum 40c0d79d1d Replaced duplicate constant with homeassistant.const in NINA (#171852) 2026-05-22 15:41:04 +02:00
J. Nick Koston 7c137b5c73 cleanup 2026-05-22 08:34:32 -05:00
Franck Nijhof bef8632d78 Fix OpenHome config flow crash when UDN is a list (#171841) 2026-05-22 15:23:42 +02:00
Duco Sebel f00decfaa3 Use uptime device class for HomeWizard uptime sensor (#171830) 2026-05-22 15:23:09 +02:00
Manu 42e7add026 Add selector options translations to System Bridge integration (#171771) 2026-05-22 15:22:22 +02:00
Franck Nijhof 263aa3f16e Fix Hue device trigger crash for devices removed from bridge (#171844) 2026-05-22 15:18:00 +02:00
J. Nick Koston 4a6c5b5a22 cleanups 2026-05-22 07:46:31 -05:00
J. Nick Koston 1009ce4180 Merge branch 'dev' into cache-split-tests 2026-05-21 23:09:44 -05:00
J. Nick Koston 22fb68b7a1 Revert "DNM: test cache, touch cloud manifest only"
This reverts commit a8bc244a7a.
2026-05-21 16:53:19 -05:00
J. Nick Koston 81e06539e6 Revert "DNM: test cache bust, touch cloud conftest"
This reverts commit 7c18b67b2e.
2026-05-21 16:53:19 -05:00
J. Nick Koston 7c18b67b2e DNM: test cache bust, touch cloud conftest 2026-05-21 16:48:05 -05:00
J. Nick Koston a8bc244a7a DNM: test cache, touch cloud manifest only 2026-05-21 16:44:43 -05:00
J. Nick Koston 5975f4b179 Skip cache walking when --cache is not passed
Address Copilot review feedback on the cache PR:

* Split collect_tests into _collect_tests_uncached (the original
  directory-based pre-cache flow) and _collect_tests_cached (walks
  the tree to build per-file hashes).  Without --cache we now skip
  the walk + per-file hash entirely.
* A single-file root has no ancestor conftests to walk, so the
  conftest_hash would always be empty and stale counts could survive
  a real conftest change; bypass the cache for the file-root case.
* Save the cache file with explicit utf-8 encoding and
  ensure_ascii=False.
2026-05-21 16:08:44 -05:00
J. Nick Koston 9ed16b63a3 Cache per-file test counts in split_tests
Persist the result of pytest --collect-only between CI runs as a JSON
file keyed by content hash, so unchanged test files are served from
cache and only edited or new files are re-collected.  The cache is
self-healing:

* Missing, corrupt, or wrong-version files fall back to a full collect.
* Any conftest.py change anywhere under the test root invalidates the
  whole cache, so fixture parametrization shifts cannot silently skew
  counts.
* Files pytest returns nothing for (helper modules named test_*.py with
  no test functions) are cached as zero so they don't get re-collected
  forever.

Walking is done once with os.walk (~2x faster than Path.rglob) and
collects test files plus conftests in a single pass.  When the cache
is fully cold we feed pytest top-level directories rather than
thousands of file paths so cold runs stay as fast as before the cache
landed.

Wire the new --cache flag through the prepare-pytest-full job and back
the cache file with actions/cache so PRs can pick up the latest dev
snapshot via restore-keys.  Local timings: cold 11s, warm with no diff
0.4s, warm with one file edited 2.3s.
2026-05-21 15:56:08 -05:00
J. Nick Koston 8dadaa2f9e Filter fan-out children and fail fast on empty batch list
Only pass directories and test_*.py files to pytest --collect-only so
helpers like tests/components/conftest.py and tests/components/common.py
are not treated as explicit collection targets, and bail out with a
clear error if no eligible paths are found instead of running pytest
with no arguments.
2026-05-21 15:17:42 -05:00
J. Nick Koston 4f98c71586 Run pytest --collect-only in parallel batches in split_tests
cProfile showed 99.6% of split_tests.py wall time was spent in the
single pytest --collect-only subprocess.  Fan out the collection across
``os.cpu_count()`` workers; round-robin chunking keeps each batch
roughly equal, and tests/components is expanded one level deeper so
the ~1000 integration subdirectories distribute evenly.  Local wall
time dropped from ~132s to ~11s on an 18-core box.  Bucket output is
unchanged because we still parse the same pytest -qq output, just
aggregated from multiple invocations.
2026-05-21 15:10:01 -05:00
17 changed files with 902 additions and 121 deletions
+27 -1
View File
@@ -917,12 +917,38 @@ jobs:
key: >-
${{ runner.os }}-${{ runner.arch }}-${{ steps.python.outputs.python-version }}-${{
needs.info.outputs.python_cache_key }}
- name: Restore pytest test counts cache
id: cache-pytest-counts
uses: actions/cache/restore@27d5ce7f107fe9357f9df03efb73ab90386fccae # v5.0.5
with:
path: pytest_test_counts.json
key: >-
pytest-counts-${{ runner.os }}-${{ runner.arch }}-${{
steps.python.outputs.python-version }}-${{
needs.info.outputs.python_cache_key }}-${{ github.sha }}
restore-keys: |
pytest-counts-${{ runner.os }}-${{ runner.arch }}-${{ steps.python.outputs.python-version }}-${{ needs.info.outputs.python_cache_key }}-
- name: Run split_tests.py
env:
TEST_GROUP_COUNT: ${{ needs.info.outputs.test_group_count }}
run: |
. venv/bin/activate
python -m script.split_tests ${TEST_GROUP_COUNT} tests
python -m script.split_tests \
--cache pytest_test_counts.json \
${TEST_GROUP_COUNT} tests
- name: Save pytest test counts cache
# Only the canonical dev push writes the cache, otherwise every PR
# build would create a new entry and the actions/cache quota fills
# up with near-duplicate snapshots. PRs and feature branches still
# restore from dev's most recent cache via restore-keys.
if: |
github.event_name == 'push'
&& github.ref == 'refs/heads/dev'
&& steps.cache-pytest-counts.outputs.cache-hit != 'true'
uses: actions/cache/save@27d5ce7f107fe9357f9df03efb73ab90386fccae # v5.0.5
with:
path: pytest_test_counts.json
key: ${{ steps.cache-pytest-counts.outputs.cache-primary-key }}
- name: Upload pytest_buckets
uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7.0.1
with:
-43
View File
@@ -1,43 +0,0 @@
#!/usr/bin/env python3
"""Add @override decorator to methods listed in explicit_override_errors.txt."""
from __future__ import annotations
import re
import subprocess
from collections import defaultdict
from pathlib import Path
INPUT = Path("explicit_override_errors.txt")
ERROR_RE = re.compile(r"^(.+?):(\d+): error:.*\[explicit-override\]")
def decorator_stack_top(lines: list[str], def_idx: int) -> int:
"""Return the index of the topmost decorator above the def at def_idx."""
i = def_idx - 1
while i >= 0 and lines[i].lstrip().startswith("@"):
i -= 1
return i + 1
by_file: dict[Path, set[int]] = defaultdict(set)
for line in INPUT.read_text().splitlines():
if m := ERROR_RE.match(line):
by_file[Path(m.group(1))].add(int(m.group(2)))
for path, line_nums in by_file.items():
lines = path.read_text().splitlines(keepends=True)
for lineno in sorted(line_nums, reverse=True):
insert_idx = decorator_stack_top(lines, lineno - 1)
target = lines[insert_idx]
indent = target[: len(target) - len(target.lstrip())]
lines.insert(insert_idx, f"{indent}@override\n")
first_import = next(
i for i, ln in enumerate(lines) if ln.startswith(("import ", "from "))
)
lines.insert(first_import, "from typing import override\n")
path.write_text("".join(lines))
print(f"Updated {path} ({len(line_nums)} methods)")
if by_file:
subprocess.run(["ruff", "check", "--fix", *map(str, by_file)], check=False)
-21
View File
@@ -1,21 +0,0 @@
#!/usr/bin/env python3
"""Run mypy on a directory and write `[explicit-override]` errors to a file."""
from __future__ import annotations
import subprocess
import sys
from pathlib import Path
OUTPUT = Path("explicit_override_errors.txt")
target = sys.argv[1]
result = subprocess.run(
["mypy", "--enable-error-code=explicit-override", target],
capture_output=True,
text=True,
check=False,
)
matches = [line for line in result.stdout.splitlines() if "[explicit-override]" in line]
OUTPUT.write_text("\n".join(matches) + ("\n" if matches else ""))
print(f"Wrote {len(matches)} errors to {OUTPUT}")
+2 -10
View File
@@ -35,7 +35,6 @@ from homeassistant.helpers.device_registry import DeviceInfo
from homeassistant.helpers.entity_platform import AddConfigEntryEntitiesCallback
from homeassistant.helpers.typing import StateType
from homeassistant.util.dt import utcnow
from homeassistant.util.variance import ignore_variance
from .const import DOMAIN
from .coordinator import HomeWizardConfigEntry, HWEnergyDeviceUpdateCoordinator
@@ -66,13 +65,6 @@ def to_percentage(value: float | None) -> float | None:
return value * 100 if value is not None else None
def uptime_to_datetime(value: int) -> datetime:
"""Convert seconds to datetime timestamp."""
return utcnow().replace(microsecond=0) - timedelta(seconds=value)
uptime_to_stable_datetime = ignore_variance(uptime_to_datetime, timedelta(minutes=5))
SENSORS: Final[tuple[HomeWizardSensorEntityDescription, ...]] = (
HomeWizardSensorEntityDescription(
key="smr_version",
@@ -643,7 +635,7 @@ SENSORS: Final[tuple[HomeWizardSensorEntityDescription, ...]] = (
HomeWizardSensorEntityDescription(
key="uptime",
translation_key="uptime",
device_class=SensorDeviceClass.TIMESTAMP,
device_class=SensorDeviceClass.UPTIME,
entity_category=EntityCategory.DIAGNOSTIC,
entity_registry_enabled_default=False,
has_fn=(
@@ -651,7 +643,7 @@ SENSORS: Final[tuple[HomeWizardSensorEntityDescription, ...]] = (
),
value_fn=(
lambda data: (
uptime_to_stable_datetime(data.system.uptime_s)
utcnow() - timedelta(seconds=data.system.uptime_s)
if data.system is not None and data.system.uptime_s is not None
else None
)
@@ -87,6 +87,8 @@ def async_get_triggers(
# Get Hue device id from device identifier
hue_dev_id = get_hue_device_id(device_entry)
if hue_dev_id is None or hue_dev_id not in api.devices:
return []
# extract triggers from all button resources of this Hue device
triggers: list[dict[str, Any]] = []
model_id = api.devices[hue_dev_id].product_data.product_name
@@ -6,6 +6,7 @@ from homeassistant.components.binary_sensor import (
BinarySensorDeviceClass,
BinarySensorEntity,
)
from homeassistant.const import ATTR_ID
from homeassistant.core import HomeAssistant
from homeassistant.helpers.entity_platform import AddConfigEntryEntitiesCallback
@@ -14,7 +15,6 @@ from .const import (
ATTR_DESCRIPTION,
ATTR_EXPIRES,
ATTR_HEADLINE,
ATTR_ID,
ATTR_RECOMMENDED_ACTIONS,
ATTR_SENDER,
ATTR_SENT,
-2
View File
@@ -29,8 +29,6 @@ ATTR_SEVERITY: str = "severity"
ATTR_RECOMMENDED_ACTIONS: str = "recommended_actions"
ATTR_AFFECTED_AREAS: str = "affected_areas"
ATTR_WEB: str = "web"
# pylint: disable-next=home-assistant-duplicate-const
ATTR_ID: str = "id"
ATTR_SENT: str = "sent"
ATTR_START: str = "start"
ATTR_EXPIRES: str = "expires"
@@ -37,11 +37,15 @@ class OpenhomeConfigFlow(ConfigFlow, domain=DOMAIN):
_LOGGER.debug("async_step_ssdp: Incomplete discovery, ignoring")
return self.async_abort(reason="incomplete_discovery")
_LOGGER.debug(
"async_step_ssdp: setting unique id %s", discovery_info.upnp[ATTR_UPNP_UDN]
)
udn = discovery_info.upnp[ATTR_UPNP_UDN]
if isinstance(udn, list):
if not udn:
return self.async_abort(reason="incomplete_discovery")
udn = udn[0]
await self.async_set_unique_id(discovery_info.upnp[ATTR_UPNP_UDN])
_LOGGER.debug("async_step_ssdp: setting unique id %s", udn)
await self.async_set_unique_id(udn)
self._abort_if_unique_id_configured({CONF_HOST: discovery_info.ssdp_location})
_LOGGER.debug(
@@ -89,3 +89,4 @@ power_command:
- "restart"
- "shutdown"
- "sleep"
translation_key: "power_command"
@@ -178,6 +178,18 @@
"title": "System Bridge upgrade required"
}
},
"selector": {
"power_command": {
"options": {
"hibernate": "Hibernate",
"lock": "Lock",
"logout": "Logout",
"restart": "[%key:common::action::restart%]",
"shutdown": "Shutdown",
"sleep": "Sleep"
}
}
},
"services": {
"get_process_by_id": {
"description": "Gets a process by the ID.",
Generated
+1 -1
View File
@@ -17,7 +17,7 @@ no_implicit_optional = true
warn_incomplete_stub = true
warn_redundant_casts = true
warn_unused_ignores = true
enable_error_code = deprecated, explicit-override, ignore-without-code, redundant-self, truthy-iterable
enable_error_code = deprecated, ignore-without-code, redundant-self, truthy-iterable
disable_error_code = annotation-unchecked, import-not-found, import-untyped
extra_checks = false
check_untyped_defs = true
-1
View File
@@ -54,7 +54,6 @@ GENERAL_SETTINGS: Final[dict[str, str]] = {
"enable_error_code": ", ".join( # noqa: FLY002
[
"deprecated",
"explicit-override",
"ignore-without-code",
"redundant-self",
"truthy-iterable",
+316 -35
View File
@@ -4,6 +4,8 @@
import argparse
from concurrent.futures import ProcessPoolExecutor
from dataclasses import dataclass, field
import hashlib
import json
from math import ceil
import os
from pathlib import Path
@@ -15,13 +17,15 @@ from typing import Final
# place to subdivide to keep each pytest invocation roughly equal in size.
_FAN_OUT_DIRS: Final = frozenset({"components"})
# Cache file format version; bump on any incompatible schema change so old
# caches are ignored rather than misread.
_CACHE_VERSION: Final = 2
class Bucket:
"""Class to hold bucket."""
def __init__(
self,
):
def __init__(self) -> None:
"""Initialize bucket."""
self.total_tests = 0
self._paths: list[str] = []
@@ -81,9 +85,9 @@ class BucketHolder:
if not test_folder.added_to_bucket:
raise ValueError("Not all tests are added to a bucket")
def create_ouput_file(self) -> None:
def create_output_file(self) -> None:
"""Create output file."""
with Path("pytest_buckets.txt").open("w") as file:
with Path("pytest_buckets.txt").open("w", encoding="utf-8") as file:
for idx, bucket in enumerate(self._buckets):
print(f"Bucket {idx + 1} has {bucket.total_tests} tests")
file.write(bucket.get_paths_line())
@@ -184,9 +188,10 @@ def _collect_batch(paths: list[Path]) -> tuple[str, str, int]:
def _iter_eligible_children(path: Path) -> list[Path]:
"""Return immediate children of ``path`` that pytest should collect.
Filters out hidden/dunder entries, non-``test_*.py`` files (so helper
modules like ``conftest.py`` and ``common.py`` are not passed as
explicit collection targets), and pycache-style directories.
Skips entries whose name starts with ``.`` or ``_`` (hidden dirs,
``__pycache__``, private helpers), and non-``test_*.py`` files (so
helper modules like ``conftest.py`` and ``common.py`` are not passed
as explicit collection targets).
"""
children: list[Path] = []
for entry in sorted(path.iterdir()):
@@ -216,44 +221,314 @@ def _enumerate_batch_paths(path: Path) -> list[Path]:
return paths
def collect_tests(path: Path) -> TestFolder:
"""Collect all tests."""
batch_paths = _enumerate_batch_paths(path)
if not batch_paths:
print(f"No eligible test paths found under {path}")
sys.exit(1)
workers = min(len(batch_paths), os.cpu_count() or 1) or 1
# Round-robin chunking keeps batches roughly balanced when path
# ordering correlates with test size.
batches = [batch_paths[i::workers] for i in range(workers)]
def _hash_file(path: Path) -> str:
"""Return a short content hash for ``path``."""
return hashlib.sha256(path.read_bytes()).hexdigest()[:16]
def _walk_test_tree(root: Path) -> tuple[list[Path], list[Path]]:
"""Walk ``root`` once and return (test files, fixture files).
Test files are the ``test_*.py`` modules that pytest will collect.
Fixture files are every other ``.py`` under ``root`` — ``conftest.py``
plus helper modules like ``common.py``. Helpers go into the
invalidation hash because they often hold the ``VALUES`` lists that
test files import for ``@pytest.mark.parametrize``: editing one
changes a test's collected count even though the test file itself is
untouched.
Uses ``os.walk`` rather than ``Path.rglob`` because it's ~2x faster on
a 5000-file tree, and subdirectories whose names start with ``.`` or
``_`` are pruned instead of visited (hidden dirs, ``__pycache__``,
private helpers). Doing both walks in one pass keeps total tree I/O
down.
"""
test_files: list[Path] = []
fixtures: list[Path] = []
for dirpath, dirnames, filenames in os.walk(root):
dirnames[:] = [d for d in dirnames if not d.startswith((".", "_"))]
base = Path(dirpath)
for name in filenames:
if not name.endswith(".py"):
continue
if name.startswith("test_"):
test_files.append(base / name)
else:
fixtures.append(base / name)
test_files.sort()
fixtures.sort()
return test_files, fixtures
def _find_ancestor_conftests(root: Path) -> list[Path]:
"""Return ancestor ``conftest.py`` files that pytest would still apply.
Pytest walks up from each test file looking for conftests; when
``root`` is a subtree (eg ``tests/components``) the conftests above
it (eg ``tests/conftest.py``) still affect parametrization, so they
must contribute to the invalidation hash too. Stops at the first
ancestor without a ``conftest.py``.
"""
ancestors: list[Path] = []
current = root.resolve().parent
while True:
conftest = current / "conftest.py"
if not conftest.is_file():
break
ancestors.append(conftest)
if current == current.parent:
break
current = current.parent
return ancestors
def _compute_invalidation_hash(root: Path, fixtures: list[Path]) -> str:
"""Return a hash that changes whenever any file in ``fixtures`` changes.
Any change to a fixture file (conftests, helper modules like
``common.py``, ancestor conftests) invalidates the entire test-count
cache. This is coarse but safe: any of these can shift fixture
parametrization in ways the cache cannot otherwise detect, so we
just re-collect everything.
Paths are encoded with ``os.path.relpath`` so the hash stays stable
across machines and also covers ancestor conftests above ``root``
(whose ``relative_to(root)`` would fail).
"""
digest = hashlib.sha256()
for fixture in fixtures:
digest.update(os.path.relpath(fixture, root).encode())
digest.update(b"\0")
digest.update(fixture.read_bytes())
digest.update(b"\0")
return digest.hexdigest()
@dataclass
class _CacheEntry:
"""Cached test count for a single file."""
hash: str
count: int
@dataclass
class _Cache:
"""Mapping of test file path → cached entry, plus invalidation key."""
invalidation_hash: str
entries: dict[str, _CacheEntry]
@classmethod
def empty(cls, invalidation_hash: str = "") -> _Cache:
"""Return a new empty cache."""
return cls(invalidation_hash=invalidation_hash, entries={})
@classmethod
def load(cls, path: Path, current_invalidation_hash: str) -> _Cache:
"""Load cache from ``path`` and invalidate it on schema/fixture drift.
Any failure (missing file, bad JSON, version drift, fixture drift)
returns an empty cache so the script just falls back to a full
collection. This is the self-healing path.
"""
try:
raw = json.loads(path.read_bytes())
except OSError, ValueError:
return cls.empty(current_invalidation_hash)
if not isinstance(raw, dict) or raw.get("version") != _CACHE_VERSION:
return cls.empty(current_invalidation_hash)
if raw.get("invalidation_hash") != current_invalidation_hash:
return cls.empty(current_invalidation_hash)
files = raw.get("files")
if not isinstance(files, dict):
return cls.empty(current_invalidation_hash)
entries: dict[str, _CacheEntry] = {}
for key, value in files.items():
if (
not isinstance(value, dict)
or not isinstance(value.get("hash"), str)
or not isinstance(value.get("count"), int)
):
# Skip malformed entries instead of discarding the whole cache.
continue
entries[key] = _CacheEntry(hash=value["hash"], count=value["count"])
return cls(invalidation_hash=current_invalidation_hash, entries=entries)
def save(self, path: Path) -> None:
"""Write the cache to ``path``."""
path.write_text(
json.dumps(
{
"version": _CACHE_VERSION,
"invalidation_hash": self.invalidation_hash,
"files": {
key: {"hash": entry.hash, "count": entry.count}
for key, entry in sorted(self.entries.items())
},
},
indent=2,
ensure_ascii=False,
)
+ "\n",
encoding="utf-8",
)
def _resolve_from_cache(
test_files: list[Path],
cache: _Cache,
root: Path,
) -> tuple[dict[Path, _CacheEntry], dict[Path, str]]:
"""Split ``test_files`` into ``(cached_entries, miss_hashes)``.
A file is served from cache when its content hash matches what we
previously stored; otherwise it is queued for re-collection. Each
file is hashed exactly once: hits carry the stored hash forward,
misses carry the just-computed hash so the rebuild step doesn't
re-read the same bytes a second time.
"""
hits: dict[Path, _CacheEntry] = {}
miss_hashes: dict[Path, str] = {}
for file in test_files:
file_hash = _hash_file(file)
entry = cache.entries.get(str(file.relative_to(root)))
if entry is not None and entry.hash == file_hash:
hits[file] = entry
else:
miss_hashes[file] = file_hash
return hits, miss_hashes
def _run_collect_batches(paths: list[Path]) -> list[tuple[str, str, int]]:
"""Run pytest --collect-only across ``paths`` using a process pool."""
workers = min(len(paths), os.cpu_count() or 1) or 1
batches = [paths[i::workers] for i in range(workers)]
if workers == 1:
results = [_collect_batch(batches[0])]
else:
with ProcessPoolExecutor(max_workers=workers) as executor:
results = list(executor.map(_collect_batch, batches))
return [_collect_batch(batches[0])]
with ProcessPoolExecutor(max_workers=workers) as executor:
return list(executor.map(_collect_batch, batches))
folder = TestFolder(path)
for stdout, stderr, returncode in results:
def _parse_collect_output(stdout: str) -> dict[Path, int]:
"""Parse ``pytest --collect-only -qq`` output into ``{path: count}``."""
counts: dict[Path, int] = {}
for line in stdout.splitlines():
if not line.strip():
continue
file_path, _, total_tests = line.partition(": ")
if not file_path or not total_tests:
raise ValueError(f"Unexpected line: {line}")
counts[Path(file_path)] = int(total_tests)
return counts
def _run_pytest_collect(paths: list[Path]) -> dict[Path, int]:
"""Run pytest --collect-only across ``paths`` and parse the output."""
counts: dict[Path, int] = {}
for stdout, stderr, returncode in _run_collect_batches(paths):
if returncode != 0:
print("Failed to collect tests:")
print(stderr)
print(stdout)
sys.exit(1)
for line in stdout.splitlines():
if not line.strip():
continue
file_path, _, total_tests = line.partition(": ")
if not file_path or not total_tests:
print(f"Unexpected line: {line}")
sys.exit(1)
try:
counts.update(_parse_collect_output(stdout))
except ValueError as err:
print(err)
sys.exit(1)
return counts
file = TestFile(int(total_tests), Path(file_path))
folder.add_test_file(file)
def _build_folder(root: Path, counts: dict[Path, int]) -> TestFolder:
"""Build a ``TestFolder`` from a flat ``{path: count}`` mapping.
Files reported with zero tests are skipped so they don't enter
bucketing (helper modules named ``test_*.py`` with no test functions
look like test files to the walker but pytest returns nothing for
them).
"""
folder = TestFolder(root)
for file_path, count in counts.items():
if count:
folder.add_test_file(TestFile(count, file_path))
return folder
def _exit_if_empty(paths: list[Path], root: Path) -> None:
"""Exit with a clear message when no eligible test paths were found."""
if not paths:
print(f"No eligible test paths found under {root}")
sys.exit(1)
def _collect_tests_uncached(path: Path) -> TestFolder:
"""Collect tests by handing pytest the top-level directories.
Skips the tree walk and per-file hashing; used when no cache file is
requested so the script behaves like the pre-cache implementation.
"""
batch_paths = _enumerate_batch_paths(path)
_exit_if_empty(batch_paths, path)
return _build_folder(path, _run_pytest_collect(batch_paths))
def _collect_tests_cached(path: Path, cache_path: Path) -> TestFolder:
"""Collect tests using an on-disk cache for incremental updates."""
all_test_files, fixtures = _walk_test_tree(path)
_exit_if_empty(all_test_files, path)
# Include ancestor conftests so a subtree run (eg tests/components)
# still invalidates when tests/conftest.py changes.
all_fixtures = _find_ancestor_conftests(path) + fixtures
invalidation_hash = _compute_invalidation_hash(path, all_fixtures)
cache = _Cache.load(cache_path, invalidation_hash)
hits, miss_hashes = _resolve_from_cache(all_test_files, cache, path)
print(
f"Cache: {len(hits)} hits / {len(miss_hashes)} misses"
f" / {len(all_test_files)} total"
)
new_counts: dict[Path, int] = {}
if miss_hashes:
# On a full cold-cache run, hand pytest the top-level directories
# instead of 5000+ individual file paths: pytest walks dirs much
# faster than it resolves each file argument. Once any cache hits
# exist, use file-level collection so we only re-collect the diff.
collect_paths = _enumerate_batch_paths(path) if not hits else list(miss_hashes)
new_counts = _run_pytest_collect(collect_paths)
# Walk the full set of test files once and decide each file's entry:
# hits keep their stored entry (and verified hash), misses build a
# fresh entry from the resolve-time hash plus the freshly collected
# count. Files in misses that pytest returned no count for are
# stored as 0 so they stop re-collecting on the next run.
entries: dict[str, _CacheEntry] = {}
counts: dict[Path, int] = {}
for file in all_test_files:
if (entry := hits.get(file)) is None:
entry = _CacheEntry(hash=miss_hashes[file], count=new_counts.get(file, 0))
entries[str(file.relative_to(path))] = entry
counts[file] = entry.count
_Cache(invalidation_hash=invalidation_hash, entries=entries).save(cache_path)
return _build_folder(path, counts)
def collect_tests(path: Path, cache_path: Path | None = None) -> TestFolder:
"""Collect all tests, using an on-disk cache when ``cache_path`` is set."""
if cache_path is None:
return _collect_tests_uncached(path)
if path.is_file():
# The cache keys on conftest_hash, but a single file root has no
# ancestor conftests to walk and the hash would always be empty,
# which would let stale counts survive conftest edits. Skip the
# cache for the file-root case rather than silently mis-caching.
print(f"--cache ignored: {path} is a single file")
return _collect_tests_uncached(path)
return _collect_tests_cached(path, cache_path)
def main() -> None:
"""Execute script."""
parser = argparse.ArgumentParser(description="Split tests into n buckets.")
@@ -276,11 +551,17 @@ def main() -> None:
help="Path to the test files to split into buckets",
type=Path,
)
parser.add_argument(
"--cache",
help="Path to a JSON file used to cache per-file test counts",
type=Path,
default=None,
)
arguments = parser.parse_args()
print("Collecting tests...")
tests = collect_tests(arguments.path)
tests = collect_tests(arguments.path, arguments.cache)
tests_per_bucket = ceil(tests.total_tests / arguments.bucket_count)
bucket_holder = BucketHolder(tests_per_bucket, arguments.bucket_count)
@@ -290,7 +571,7 @@ def main() -> None:
print(f"Total tests: {tests.total_tests}")
print(f"Estimated tests per bucket: {tests_per_bucket}")
bucket_holder.create_ouput_file()
bucket_holder.create_output_file()
if __name__ == "__main__":
@@ -798,7 +798,7 @@
'object_id_base': 'Uptime',
'options': dict({
}),
'original_device_class': <SensorDeviceClass.TIMESTAMP: 'timestamp'>,
'original_device_class': <SensorDeviceClass.UPTIME: 'uptime'>,
'original_icon': None,
'original_name': 'Uptime',
'platform': 'homewizard',
@@ -813,7 +813,7 @@
# name: test_sensors[HWE-BAT-entity_ids10][sensor.device_uptime:state]
StateSnapshot({
'attributes': ReadOnlyDict({
'device_class': 'timestamp',
'device_class': 'uptime',
'friendly_name': 'Device Uptime',
}),
'context': <ANY>,
@@ -116,3 +116,30 @@ async def test_get_triggers(
]
assert triggers == unordered(expected_triggers)
async def test_get_triggers_for_removed_device(
hass: HomeAssistant,
mock_bridge_v2: Mock,
v2_resources_test_data: JsonArrayType,
device_registry: dr.DeviceRegistry,
) -> None:
"""Test triggers for a device removed from the bridge.
Regression test for https://github.com/home-assistant/core/issues/152937
"""
await mock_bridge_v2.api.load_test_data(v2_resources_test_data)
await setup_platform(
hass, mock_bridge_v2, [Platform.BINARY_SENSOR, Platform.SENSOR]
)
# Create a device entry with a Hue ID that doesn't exist on the bridge
orphaned_device = device_registry.async_get_or_create(
config_entry_id=mock_bridge_v2.config_entry.entry_id,
identifiers={(hue.DOMAIN, "non-existent-hue-device-id")},
)
triggers = await async_get_device_automations(
hass, DeviceAutomationType.TRIGGER, orphaned_device.id
)
assert triggers == []
@@ -116,3 +116,54 @@ async def test_host_updated(hass: HomeAssistant) -> None:
assert result["reason"] == "already_configured"
assert entry.data[CONF_HOST] == MOCK_SSDP_LOCATION
async def test_ssdp_udn_as_list(hass: HomeAssistant) -> None:
"""Test SSDP discovery when UDN is a list instead of a string.
Regression test for https://github.com/home-assistant/core/issues/171837
"""
list_udn_discovery = SsdpServiceInfo(
ssdp_usn="usn",
ssdp_st="st",
ssdp_location=MOCK_SSDP_LOCATION,
upnp={
ATTR_UPNP_FRIENDLY_NAME: MOCK_FRIENDLY_NAME,
ATTR_UPNP_UDN: [MOCK_UDN, "uuid:other"],
},
)
result = await hass.config_entries.flow.async_init(
DOMAIN,
context={CONF_SOURCE: SOURCE_SSDP},
data=list_udn_discovery,
)
assert result["type"] is FlowResultType.FORM
assert result["step_id"] == "confirm"
assert result["description_placeholders"] == {CONF_NAME: MOCK_FRIENDLY_NAME}
result2 = await hass.config_entries.flow.async_configure(result["flow_id"], {})
assert result2["type"] is FlowResultType.CREATE_ENTRY
assert result2["title"] == MOCK_FRIENDLY_NAME
assert result2["data"] == {CONF_HOST: MOCK_SSDP_LOCATION}
async def test_ssdp_udn_as_empty_list(hass: HomeAssistant) -> None:
"""Test SSDP discovery when UDN is an empty list."""
empty_udn_discovery = SsdpServiceInfo(
ssdp_usn="usn",
ssdp_st="st",
ssdp_location=MOCK_SSDP_LOCATION,
upnp={
ATTR_UPNP_FRIENDLY_NAME: MOCK_FRIENDLY_NAME,
ATTR_UPNP_UDN: [],
},
)
result = await hass.config_entries.flow.async_init(
DOMAIN,
context={CONF_SOURCE: SOURCE_SSDP},
data=empty_udn_discovery,
)
assert result["type"] is FlowResultType.ABORT
assert result["reason"] == "incomplete_discovery"
+452
View File
@@ -0,0 +1,452 @@
"""Tests for the split_tests cache logic."""
from collections.abc import Callable
import json
from pathlib import Path
from unittest.mock import patch
import pytest
from script import split_tests
@pytest.fixture
def tree(tmp_path: Path) -> Path:
"""Build a small test tree on disk.
Returns the root path containing one root conftest, two integrations,
and a ``common.py`` helper that participates in cache invalidation
but is not a pytest collection target.
"""
(tmp_path / "conftest.py").write_text("# tests/conftest.py\n")
(tmp_path / "common.py").write_text("# helper module\n")
alpha_dir = tmp_path / "components" / "alpha"
alpha_dir.mkdir(parents=True)
(alpha_dir / "conftest.py").write_text("# alpha conftest\n")
(alpha_dir / "test_one.py").write_text("def test_a():\n pass\n")
(alpha_dir / "test_two.py").write_text("def test_b():\n pass\n")
beta_dir = tmp_path / "components" / "beta"
beta_dir.mkdir()
(beta_dir / "test_x.py").write_text("def test_x():\n pass\n")
return tmp_path
def test_iter_eligible_children_filters_helpers(tree: Path) -> None:
"""Helper files like conftest.py and common.py are not collection targets."""
children = split_tests._iter_eligible_children(tree)
names = {p.name for p in children}
assert "common.py" not in names
assert "conftest.py" not in names
# components/ is a dir, gets included.
assert "components" in names
def test_enumerate_batch_paths_fans_out_components(tree: Path) -> None:
"""tests/components fans out one level deeper into per-integration paths."""
paths = split_tests._enumerate_batch_paths(tree)
rel = {p.relative_to(tree).as_posix() for p in paths}
assert rel == {"components/beta", "components/alpha"}
def test_enumerate_batch_paths_for_single_file(tmp_path: Path) -> None:
"""A test file passed directly is returned as-is."""
file = tmp_path / "test_solo.py"
file.write_text("def test_x(): pass\n")
assert split_tests._enumerate_batch_paths(file) == [file]
def _invalidation_hash_for(tree: Path) -> str:
"""Compute the invalidation hash for ``tree`` (helper for the tests below)."""
_, fixtures = split_tests._walk_test_tree(tree)
return split_tests._compute_invalidation_hash(tree, fixtures)
def _prime_cache(
cache_path: Path,
tree: Path,
hits: dict[Path, int] | None = None,
extra_entries: dict[str, split_tests._CacheEntry] | None = None,
) -> None:
"""Save a cache for ``tree`` keyed on real file hashes.
``hits`` maps an on-disk test file to its cached count; the helper
computes the file's real hash so the cache will resolve as a hit on
next run. ``extra_entries`` lets a test inject entries whose path
does not exist on disk (e.g. ghost files).
"""
entries: dict[str, split_tests._CacheEntry] = {
str(file.relative_to(tree)): split_tests._CacheEntry(
hash=split_tests._hash_file(file), count=count
)
for file, count in (hits or {}).items()
}
if extra_entries:
entries.update(extra_entries)
split_tests._Cache(
invalidation_hash=_invalidation_hash_for(tree),
entries=entries,
).save(cache_path)
def _echo_one_test_each(
skip: set[Path] | None = None,
) -> Callable[[list[Path]], list[tuple[str, str, int]]]:
"""Build a fake ``_run_collect_batches`` that returns 1 test per path.
Any path in ``skip`` is silently omitted from the output (simulating
pytest finding no tests under it).
"""
skip = skip or set()
def fake(paths: list[Path]) -> list[tuple[str, str, int]]:
emitted = [p for p in paths if p not in skip]
return [("\n".join(f"{p}: 1" for p in emitted) + "\n", "", 0)]
return fake
def test_compute_invalidation_hash_changes_when_conftest_changes(tree: Path) -> None:
"""Editing any conftest changes the global cache key."""
before = _invalidation_hash_for(tree)
(tree / "components" / "alpha" / "conftest.py").write_text("# changed\n")
after = _invalidation_hash_for(tree)
assert before != after
def test_compute_invalidation_hash_changes_when_helper_changes(tree: Path) -> None:
"""Editing a non-conftest helper (eg common.py imported for parametrize) busts the cache.
Test files often import VALUES from common.py for
@pytest.mark.parametrize; a change there shifts collected counts
even though no test file or conftest was touched, so it has to
participate in the invalidation hash.
"""
before = _invalidation_hash_for(tree)
(tree / "common.py").write_text("# helper changed\n")
after = _invalidation_hash_for(tree)
assert before != after
def test_compute_invalidation_hash_stable_for_test_changes(tree: Path) -> None:
"""Test-file edits do not invalidate the global cache key."""
before = _invalidation_hash_for(tree)
(tree / "components" / "alpha" / "test_one.py").write_text(
"def test_a():\n pass\n\ndef test_c():\n pass\n"
)
after = _invalidation_hash_for(tree)
assert before == after
def test_find_ancestor_conftests_walks_up_until_gap(tmp_path: Path) -> None:
"""Ancestor conftests are collected up to the first dir without one."""
nested = tmp_path / "a" / "b" / "c"
nested.mkdir(parents=True)
# No conftest in tmp_path → walk stops there.
(tmp_path / "a" / "conftest.py").write_text("# a\n")
(tmp_path / "a" / "b" / "conftest.py").write_text("# b\n")
ancestors = split_tests._find_ancestor_conftests(nested)
assert [p.relative_to(tmp_path).as_posix() for p in ancestors] == [
"a/b/conftest.py",
"a/conftest.py",
]
def test_compute_invalidation_hash_changes_on_ancestor_change(tmp_path: Path) -> None:
"""An ancestor conftest edit must invalidate a subtree run's cache."""
(tmp_path / "conftest.py").write_text("# parent\n")
subtree = tmp_path / "components"
subtree.mkdir()
(subtree / "test_x.py").write_text("def test_x(): pass\n")
def _hash() -> str:
_, descendant = split_tests._walk_test_tree(subtree)
ancestors = split_tests._find_ancestor_conftests(subtree)
return split_tests._compute_invalidation_hash(subtree, ancestors + descendant)
before = _hash()
(tmp_path / "conftest.py").write_text("# parent changed\n")
assert _hash() != before
def test_walk_test_tree_separates_tests_from_fixtures(tree: Path) -> None:
"""The walker returns test_*.py files and every other .py as fixtures."""
test_files, fixtures = split_tests._walk_test_tree(tree)
test_names = {p.name for p in test_files}
fixture_paths = {p.relative_to(tree).as_posix() for p in fixtures}
assert test_names == {"test_one.py", "test_two.py", "test_x.py"}
assert fixture_paths == {
"conftest.py",
"common.py",
"components/alpha/conftest.py",
}
def test_walk_test_tree_skips_hidden_and_dunder_dirs(tmp_path: Path) -> None:
"""Hidden/dunder directories are pruned from the walk."""
(tmp_path / "__pycache__").mkdir()
(tmp_path / "__pycache__" / "test_ghost.py").write_text("def test_g(): pass\n")
(tmp_path / ".hidden").mkdir()
(tmp_path / ".hidden" / "test_invisible.py").write_text("def test_h(): pass\n")
(tmp_path / "test_real.py").write_text("def test_r(): pass\n")
test_files, _ = split_tests._walk_test_tree(tmp_path)
assert {p.name for p in test_files} == {"test_real.py"}
def test_collect_tests_skips_cache_for_single_file_root(tmp_path: Path) -> None:
"""A single-file root cannot validate conftest drift, so caching is disabled.
_walk_test_tree returns no conftests for a file root, which would make
the invalidation_hash a constant — letting a stale entry survive a real
conftest change. Better to bypass the cache than mis-cache silently.
"""
cache_path = tmp_path / "cache.json"
file = tmp_path / "test_solo.py"
file.write_text("def test_x(): pass\n")
with (
patch.object(split_tests, "_collect_tests_uncached") as uncached,
patch.object(split_tests, "_collect_tests_cached") as cached,
):
split_tests.collect_tests(file, cache_path)
uncached.assert_called_once_with(file)
cached.assert_not_called()
assert not cache_path.exists()
def test_cache_roundtrip(tmp_path: Path) -> None:
"""A cache survives save → load when the conftest hash matches."""
cache_path = tmp_path / "cache.json"
cache = split_tests._Cache(
invalidation_hash="abc",
entries={"tests/alpha/test_a.py": split_tests._CacheEntry(hash="h1", count=5)},
)
cache.save(cache_path)
loaded = split_tests._Cache.load(cache_path, "abc")
assert loaded.entries == cache.entries
assert loaded.invalidation_hash == "abc"
def test_cache_load_missing_returns_empty(tmp_path: Path) -> None:
"""A missing cache file degrades gracefully to an empty cache."""
cache = split_tests._Cache.load(tmp_path / "missing.json", "abc")
assert cache.entries == {}
assert cache.invalidation_hash == "abc"
def test_cache_load_invalid_json_returns_empty(tmp_path: Path) -> None:
"""Corrupt JSON is treated as a cache miss instead of crashing."""
path = tmp_path / "broken.json"
path.write_text("{not json")
cache = split_tests._Cache.load(path, "abc")
assert cache.entries == {}
def test_cache_load_wrong_version_returns_empty(tmp_path: Path) -> None:
"""An older cache schema is discarded rather than misread."""
path = tmp_path / "old.json"
path.write_text(json.dumps({"version": 0, "invalidation_hash": "abc", "files": {}}))
cache = split_tests._Cache.load(path, "abc")
assert cache.entries == {}
def test_cache_load_conftest_drift_returns_empty(tmp_path: Path) -> None:
"""A conftest change invalidates the entire cached set."""
path = tmp_path / "cache.json"
path.write_text(
json.dumps(
{
"version": split_tests._CACHE_VERSION,
"invalidation_hash": "old",
"files": {"test_a.py": {"hash": "h1", "count": 3}},
}
)
)
cache = split_tests._Cache.load(path, "new")
assert cache.entries == {}
def test_cache_load_drops_malformed_entries(tmp_path: Path) -> None:
"""Malformed per-file entries are skipped, valid ones are kept."""
path = tmp_path / "cache.json"
path.write_text(
json.dumps(
{
"version": split_tests._CACHE_VERSION,
"invalidation_hash": "abc",
"files": {
"good.py": {"hash": "h1", "count": 3},
"bad_count.py": {"hash": "h2", "count": "three"},
"missing_hash.py": {"count": 4},
"not_dict.py": 5,
},
}
)
)
cache = split_tests._Cache.load(path, "abc")
assert set(cache.entries) == {"good.py"}
def test_resolve_from_cache_hits_and_misses(tree: Path) -> None:
"""Files with matching hashes are hits; edited or new files are misses."""
alpha_one = tree / "components" / "alpha" / "test_one.py"
alpha_two = tree / "components" / "alpha" / "test_two.py"
beta_x = tree / "components" / "beta" / "test_x.py"
alpha_one_hash = split_tests._hash_file(alpha_one)
cache = split_tests._Cache(
invalidation_hash="dummy",
entries={
str(alpha_one.relative_to(tree)): split_tests._CacheEntry(
hash=alpha_one_hash, count=1
),
str(alpha_two.relative_to(tree)): split_tests._CacheEntry(
hash="stale", count=99
),
},
)
hits, miss_hashes = split_tests._resolve_from_cache(
[alpha_one, alpha_two, beta_x], cache, tree
)
assert hits == {alpha_one: split_tests._CacheEntry(hash=alpha_one_hash, count=1)}
assert miss_hashes == {
alpha_two: split_tests._hash_file(alpha_two),
beta_x: split_tests._hash_file(beta_x),
}
def test_collect_tests_hashes_each_file_once(tree: Path) -> None:
"""Hits reuse the stored hash, misses reuse the resolve-time hash.
Guards against regressing the double-read on cache-miss rebuilds:
each test file should pass through _hash_file at most once per run.
"""
cache_path = tree / "cache.json"
alpha_one = tree / "components" / "alpha" / "test_one.py"
# Prime with one hit so we exercise the file-level (not directory-level) miss path.
_prime_cache(cache_path, tree, hits={alpha_one: 1})
real_hash = split_tests._hash_file
counts: dict[Path, int] = {}
def counting_hash(path: Path) -> str:
counts[path] = counts.get(path, 0) + 1
return real_hash(path)
with (
patch.object(split_tests, "_hash_file", side_effect=counting_hash),
patch.object(
split_tests, "_run_collect_batches", side_effect=_echo_one_test_each()
),
):
split_tests.collect_tests(tree, cache_path)
assert all(n == 1 for n in counts.values()), counts
def test_collect_tests_warm_cache_skips_pytest(tree: Path) -> None:
"""A warm cache with no diffs should skip the pytest subprocess entirely."""
cache_path = tree / "cache.json"
alpha_one = tree / "components" / "alpha" / "test_one.py"
alpha_two = tree / "components" / "alpha" / "test_two.py"
beta_x = tree / "components" / "beta" / "test_x.py"
_prime_cache(cache_path, tree, hits={alpha_one: 1, alpha_two: 2, beta_x: 3})
with patch.object(split_tests, "_run_collect_batches") as run_batches:
folder = split_tests.collect_tests(tree, cache_path)
run_batches.assert_not_called()
assert folder.total_tests == 6
def test_collect_tests_cold_cache_collects_only_missing(tree: Path) -> None:
"""A partial cache should only re-collect the files that changed."""
cache_path = tree / "cache.json"
alpha_one = tree / "components" / "alpha" / "test_one.py"
alpha_two = tree / "components" / "alpha" / "test_two.py"
beta_x = tree / "components" / "beta" / "test_x.py"
_prime_cache(cache_path, tree, hits={alpha_one: 1})
with patch.object(
split_tests, "_run_collect_batches", side_effect=_echo_one_test_each()
) as run_batches:
folder = split_tests.collect_tests(tree, cache_path)
assert run_batches.call_count == 1
requested = set(run_batches.call_args.args[0])
assert requested == {alpha_two, beta_x}
assert folder.total_tests == 3
# Cache should now contain entries for every test file.
saved = json.loads(cache_path.read_text())
assert set(saved["files"]) == {
str(alpha_one.relative_to(tree)),
str(alpha_two.relative_to(tree)),
str(beta_x.relative_to(tree)),
}
def test_collect_tests_caches_files_with_no_collected_tests(tree: Path) -> None:
"""Files pytest returns nothing for are cached as 0 so we stop re-collecting them.
Helper modules named test_*.py with no actual test functions look like
test files to the walker but pytest reports no tests for them. We
want the cache to remember that and skip them on subsequent runs.
"""
cache_path = tree / "cache.json"
alpha_one = tree / "components" / "alpha" / "test_one.py"
alpha_two = tree / "components" / "alpha" / "test_two.py"
beta_x = tree / "components" / "beta" / "test_x.py"
# Prime the cache with one hit so collect_tests takes the file-level
# diff path; the cold-cache path hands pytest top-level directories
# rather than individual file paths.
_prime_cache(cache_path, tree, hits={alpha_one: 1})
with patch.object(
split_tests,
"_run_collect_batches",
side_effect=_echo_one_test_each(skip={alpha_two}),
):
split_tests.collect_tests(tree, cache_path)
saved = json.loads(cache_path.read_text())
assert saved["files"][str(alpha_two.relative_to(tree))]["count"] == 0
assert saved["files"][str(alpha_one.relative_to(tree))]["count"] == 1
assert saved["files"][str(beta_x.relative_to(tree))]["count"] == 1
# Re-running with the same content should now be a full cache hit
# even though alpha_two has no tests.
with patch.object(split_tests, "_run_collect_batches") as run_batches:
folder = split_tests.collect_tests(tree, cache_path)
run_batches.assert_not_called()
# alpha_two contributes 0, only alpha_one + beta_x count.
assert folder.total_tests == 2
def test_collect_tests_drops_deleted_files_from_cache(tree: Path) -> None:
"""Files that disappear from disk are dropped from the saved cache."""
cache_path = tree / "cache.json"
alpha_one = tree / "components" / "alpha" / "test_one.py"
ghost_rel = "components/alpha/test_ghost.py"
_prime_cache(
cache_path,
tree,
hits={alpha_one: 1},
extra_entries={ghost_rel: split_tests._CacheEntry(hash="dead", count=42)},
)
with patch.object(
split_tests, "_run_collect_batches", side_effect=_echo_one_test_each()
):
split_tests.collect_tests(tree, cache_path)
saved = json.loads(cache_path.read_text())
assert ghost_rel not in saved["files"]