mirror of
https://github.com/home-assistant/core.git
synced 2025-06-25 01:21:51 +02:00
Tweak non-English issue detection (#146636)
This commit is contained in:
23
.github/workflows/detect-non-english-issues.yml
vendored
23
.github/workflows/detect-non-english-issues.yml
vendored
@ -64,16 +64,19 @@ jobs:
|
||||
You are a language detection system. Your task is to determine if the provided text is written in English or another language.
|
||||
|
||||
Rules:
|
||||
1. Analyze the text and determine the primary language
|
||||
1. Analyze the text and determine the primary language of the USER'S DESCRIPTION only
|
||||
2. IGNORE markdown headers (lines starting with #, ##, ###, etc.) as these are from issue templates, not user input
|
||||
3. IGNORE all code blocks (text between ``` or ` markers) as they may contain system-generated error messages in other languages
|
||||
4. Consider technical terms, code snippets, and URLs as neutral (they don't indicate non-English)
|
||||
5. Focus on the actual sentences and descriptions written by the user
|
||||
6. Return ONLY a JSON object with two fields:
|
||||
- "is_english": boolean (true if the text is primarily in English, false otherwise)
|
||||
4. IGNORE error messages, logs, and system output even if not in code blocks - these often appear in the user's system language
|
||||
5. Consider technical terms, code snippets, URLs, and file paths as neutral (they don't indicate non-English)
|
||||
6. Focus ONLY on the actual sentences and descriptions written by the user explaining their issue
|
||||
7. If the user's explanation/description is in English but includes non-English error messages or logs, consider it ENGLISH
|
||||
8. Return ONLY a JSON object with two fields:
|
||||
- "is_english": boolean (true if the user's description is primarily in English, false otherwise)
|
||||
- "detected_language": string (the name of the detected language, e.g., "English", "Spanish", "Chinese", etc.)
|
||||
7. Be lenient - if the text is mostly English with minor non-English elements, consider it English
|
||||
8. Common programming terms, error messages, and technical jargon should not be considered as non-English
|
||||
9. Be lenient - if the user's explanation is in English with non-English system output, it's still English
|
||||
10. Common programming terms, error messages, and technical jargon should not be considered as non-English
|
||||
11. If you cannot reliably determine the language, set detected_language to "undefined"
|
||||
|
||||
Example response:
|
||||
{"is_english": false, "detected_language": "Spanish"}
|
||||
@ -122,6 +125,12 @@ jobs:
|
||||
return;
|
||||
}
|
||||
|
||||
// If language is undefined or not detected, skip processing
|
||||
if (!languageResult.detected_language || languageResult.detected_language === 'undefined') {
|
||||
console.log('Language could not be determined, skipping processing');
|
||||
return;
|
||||
}
|
||||
|
||||
console.log(`Issue detected as non-English: ${languageResult.detected_language}`);
|
||||
|
||||
// Post comment explaining the language requirement
|
||||
|
Reference in New Issue
Block a user