AI Chat Moderation

AI-Powered Chat Moderation Plugin, Prevent your players to send harmful content.

2,93k

AI Chat Moderation

<center>

AI Chat Moderation logo banner

</center>

Keep your Minecraft server chat clean and safe with advanced AI technology!

AI Chat Moderation is a powerful plugin that uses artificial intelligence to detect and filter harmful messages in your Minecraft server chat. With customizable detection categories and response actions, you can create a welcoming environment for all players.

The plugin uses an OPENAI or Mistral AI model, no API key required

Features

- Advanced AI Detection: Identifies 12 categories of harmful content including harassment, hate speech, threats, violence, and inappropriate sexual content FOR all languages. - Customizable Confidence Levels: Fine-tune detection sensitivity for each category - Flexible Response System: Configure unique responses for different violation types - Real-time Moderation: Option to check messages before or after they're sent - Comprehensive Logging: Track flagged messages in your server console - Easy Configuration: Simple YAML file with detailed options

Installation

Just drag and drop the plugin jar into your plugins folder and it is ready ! No need to create an OPENAI key. To not to leave any doubt, message may be cached in my programs for performance purpose only.

How It Works

The plugin analyzes each chat message using AI technology to determine if it contains harmful content. When potentially harmful content is detected, the plugin can:

- Hide the message from other players - Send warning messages to the offending player - Execute custom commands based on the violation type - Kick or take other administrative actions for severe violations

Give your moderators the tools they need to maintain a positive community without constantly monitoring chat. AI Chat Moderation works silently in the background, allowing you to focus on building and playing while keeping your server safe for everyone.

Commands

- /acm reload : To reload the plugin - /acm debug : To see the categories confidence of each message sent. It allows to adjust your categories confidence easily.

Permissions

- acm.reload : To be able to use /acm reload - acm.debug : To be able to use /acm debug - acm.bypass : To bypass the checks (By default OP bypass)

AI Chat Moderation - Smart protection for your Minecraft community

The configuration file: ```yaml

Enable or disable the plugin

enabled: true

---- NO AI PART ----

To blacklist some words without using AI

blacklisted-words: - fuck you - retarded

Whether or not the message will be hidden when it contains a blacklisted word

blacklisted-words-hide-message: true

The commands to run if the message contains a blacklisted word

blacklisted-words-commands: - "SEND_MESSAGE &6%player% &cUse a correct language."

---- AI PART ----

Choose your provider OPENAI or MISTRAL or DETOXIFY

+ Mistral is better for arabic and Russian languages

+ OpenAI responses are faster

+ Detoxify Very fast, good for IT, FR, RU, PT, ES, TR but not very detection when the insult is splitted / not well written / acronym

providers: [DETOXIFY, OPENAI]

True: the plugin will log the flagged messages in the console

False: the plugin will not log the flagged messages in the console

logs-flagged-messages: true logged-message-format: "Flagged message: Player:[%player%] - Message:[%message%] - Category:[%category%] - Confidence[%confidence%]"

True: the plugin will check the message before it is sent (the message will be a bit delayed, approximately 0.3s)

False: the plugin will check the message after it has been sent (the message will not be delayed) but if the message is flagged, the message will not be deleted

check-after-message-has-been-sent: false

The different harmful categories that the plugin can detect

categories:

OPENAI

illicit/violent:

True: the plugin will detect the messages in this category

False: the plugin will not detect the messages in this category

detection: true

The actions that the plugin will do when a message is detected in this category

actions: warn:

Adjust the confidence level to detect the messages in this category

When it is higher the plugin will detect only the messages with height intensity of harmfulness

When it is lower the plugin will detect the messages with low intensity of harmfulness

From 0.0 to 1.0, 0.85 is a good value

confidence: 0.80

Whether the plugin will hide the message from the player

hideMessage: true

The commands that the plugin will execute when a message is detected in this category

Placeholders: %player%: the player who sent the message

%player_uuid%: the UUID of the player who sent the message

commands: - "SEND_MESSAGE &6%player% &cYou are not allowed to use violent language." other: confidence: 0.95 hideMessage: true commands: - "kick %player%"

OPENAI

self-harm/instructions: detection: true actions: warn: confidence: 0.80 hideMessage: true commands: - "SEND_MESSAGE &6%player% &cYou are not allowed to use self-harm instructions."

OPENAI

harassment: detection: true actions: warn: confidence: 0.80 hideMessage: true commands: - "SEND_MESSAGE &6%player% &cYou are not allowed to harass other players."

OPENAI

violence/graphic: detection: true actions: warn: confidence: 0.80 hideMessage: true commands: - "SEND_MESSAGE &6%player% &cYou are not allowed to use violence."

OPENAI

illicit: detection: true actions: warn: confidence: 0.80 hideMessage: true commands: - "SEND_MESSAGE &6%player% &cYou are not allowed to use illicit language."

OPENAI

self-harm/intent: detection: true actions: warn: confidence: 0.80 hideMessage: true commands: - "SEND_MESSAGE &6%player% &cYou are not allowed to use self-harm intent."

OPENAI

hate/threatening: detection: true actions: warn: confidence: 0.80 hideMessage: true commands: - "SEND_MESSAGE &6%player% &cYou are not allowed to use threatening language."

OPENAI

sexual/minors: detection: true actions: warn: confidence: 0.80 hideMessage: true commands: - "SEND_MESSAGE &6%player% &cYou are not allowed to use sexual language."

OPENAI

harassment/threatening: detection: true actions: warn: confidence: 0.80 hideMessage: true commands: - "SEND_MESSAGE &6%player% &cYou are not allowed to use threatening language."

OPENAI

hate: detection: true actions: warn: confidence: 0.80 hideMessage: true commands: - "SEND_MESSAGE &6%player% &cYou are not allowed to use hate language."

OPENAI

self-harm: detection: true actions: warn: confidence: 0.80 hideMessage: true commands: - "SEND_MESSAGE &6%player% &cYou are not allowed to use self-harm language."

OPENAI and MISTRAL

sexual: detection: true actions: warn: confidence: 0.80 hideMessage: true commands: - "SEND_MESSAGE &6%player% &cYou are not allowed to use sexual language."

OPENAI

violence: detection: true actions: warn: confidence: 0.80 hideMessage: true commands: - "SEND_MESSAGE &6%player% &cYou are not allowed to use violent language."

MISTRAL

Content that expresses prejudice, hostility, or advocates discrimination against individuals or groups based on protected characteristics such as race, ethnicity, religion, gender, sexual orientation, or disability. This includes slurs, dehumanizing language, calls for exclusion or harm targeted at specific groups, and persistent harassment or bullying of individuals based on these characteristics.

hate_and_discrimination: detection: true actions: warn: confidence: 0.80 hideMessage: true commands: - "SEND_MESSAGE &6%player% &cYou are not allowed to use hate language."

MISTRAL

Content that describes, glorifies, incites, or threatens physical violence against individuals or groups. This includes graphic depictions of injury or death, explicit threats of harm, and instructions for carrying out violent acts. This category covers both targeted threats and general promotion or glorification of violence.

violence_and_threats: detection: true actions: warn: confidence: 0.80 hideMessage: true commands: - "SEND_MESSAGE &6%player% &cYou are not allowed to use violent language."

MISTRAL

Content that promotes or provides instructions for illegal activities or extremely hazardous behaviors that pose a significant risk of physical harm, death, or legal consequences. This includes guidance on creating weapons or explosives, encouragement of extreme risk-taking behaviors, and promotion of non-violent crimes such as fraud, theft, or drug trafficking.

dangerous_and_criminal_content: detection: true actions: warn: confidence: 0.80 hideMessage: true commands: - "SEND_MESSAGE &6%player% &cYou are not allowed to use violent language."

MISTRAL

Content that promotes, instructs, plans, or encourages deliberate self-injury, suicide, eating disorders, or other self-destructive behaviors. This includes detailed methods, glorification, statements of intent, dangerous challenges, and related slang terms

selfharm: detection: true actions: warn: confidence: 0.80 hideMessage: true commands: - "SEND_MESSAGE &6%player% &cYou are not allowed to use violent language."

MISTRAL

Content that contains or tries to elicit detailed or tailored medical advice

health: detection: true actions: warn: confidence: 0.80 hideMessage: true commands: - "SEND_MESSAGE &6%player% &cYou are not allowed to use tailored medical language."

MISTRAL

Content that contains or tries to elicit detailed or tailored financial advice.

financial: detection: true actions: warn: confidence: 0.80 hideMessage: true commands: - "SEND_MESSAGE &6%player% &cYou are not allowed to use financial language."

MISTRAL

Content that contains or tries to elicit detailed or tailored legal advice.

law: detection: true actions: warn: confidence: 0.80 hideMessage: true commands: - "SEND_MESSAGE &6%player% &cYou are not allowed to use elicit detailed language."

MISTRAL

Content that requests, shares, or attempts to elicit personal identifying information such as full names, addresses, phone numbers, social security numbers, or financial account details.

pii: detection: true actions: warn: confidence: 0.80 hideMessage: true commands: - "SEND_MESSAGE &6%player% &cYou are not allowed to leak personal information."

DETOXIFY

toxicity: detection: true actions: warn: confidence: 0.80 hideMessage: true commands: - "SEND_MESSAGE &6%player% &cYou are not allowed to use toxic language."

DETOXIFY

sexual_explicit: detection: true actions: warn: confidence: 0.80 hideMessage: true commands: - "SEND_MESSAGE &6%player% &cYou are not allowed to use sexual explicit language."

DETOXIFY

severe_toxicity: detection: true actions: warn: confidence: 0.80 hideMessage: true commands: - "SEND_MESSAGE &6%player% &cYou are not allowed to use severe toxic language."

DETOXIFY

obscene: detection: true actions: warn: confidence: 0.80 hideMessage: true commands: - "SEND_MESSAGE &6%player% &cYou are not allowed to use obscene language."

DETOXIFY

threat: detection: true actions: warn: confidence: 0.80 hideMessage: true commands: - "SEND_MESSAGE &6%player% &cYou are not allowed to use threat language."

DETOXIFY

identity_attack: detection: true actions: warn: confidence: 0.80 hideMessage: true commands: - "SEND_MESSAGE &6%player% &cYou are not allowed to use threat language."

DETOXIFY

insult: detection: true actions: warn: confidence: 0.80 hideMessage: true commands: - "SEND_MESSAGE &6%player% &cYou are not allowed to use insult language."

print-api-errors: false # For debugging purposes, I advice you to set it to false in production ```

ADS