This document shows you how to use Amazon Rekognition and Amazon AppFlow to build a fully serverless content moderation pipeline for messages posted in a Slack channel.
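The content moderation strategy identifies images that violate two sample guidelines, both configured in the Lambda code later in this document: images that contain Tobacco or Alcohol themes, and images that contain the disallowed words "medical" or "private".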
Amazon Rekognition content moderation is a deep learning-based service that detects inappropriate or offensive images and videos, making it easier to find and remove such content at scale.
It provides a detailed taxonomy of moderation categories, such as Explicit Nudity, Suggestive, Violence, and Visually Disturbing.
You can now detect six new categories: Drugs, Tobacco, Alcohol, Gambling, Rude Gestures, and Hate Symbols.
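To make the taxonomy concrete, the sketch below groups moderation labels under their top-level category. It assumes the response shape of the Rekognition DetectModerationLabels API (a ModerationLabels list whose entries carry Name, ParentName, and Confidence). The helper name group_labels_by_category and the sample response are hypothetical and were written by hand for illustration; they are not real service output.

```python
from collections import defaultdict


def group_labels_by_category(response, min_confidence=80.0):
    """Group moderation label names under their top-level category.

    Top-level labels have an empty ParentName, so they are filed
    under their own name.
    """
    grouped = defaultdict(list)
    for label in response.get("ModerationLabels", []):
        if label["Confidence"] < min_confidence:
            continue  # Ignore low-confidence detections
        category = label.get("ParentName") or label["Name"]
        grouped[category].append(label["Name"])
    return dict(grouped)


# Hypothetical response, hand-written for illustration only.
sample_response = {
    "ModerationLabels": [
        {"Name": "Alcohol", "ParentName": "", "Confidence": 97.1},
        {"Name": "Drinking", "ParentName": "Alcohol", "Confidence": 97.1},
        {"Name": "Tobacco", "ParentName": "", "Confidence": 42.0},
    ]
}

print(group_labels_by_category(sample_response))
```

With the default threshold of 80, the low-confidence Tobacco entry is dropped and the two remaining labels are grouped under Alcohol.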
Amazon AppFlow is a fully managed integration service that enables you to securely transfer data between Software-as-a-Service (SaaS) applications like Salesforce, Marketo, Slack, and ServiceNow, and AWS services like Amazon S3 and Amazon Redshift, in just a few clicks.
This solution leverages Amazon AppFlow to capture the content posted in Slack channels for analysis using Amazon Rekognition.
This document assumes that you have an AWS account, a Slack workspace, and access to an S3 bucket.
Client credentials for these services will also be used.
This solution doesn't require any prior machine learning (ML) expertise, or development of your own custom ML models.
A Choose Slack connection dropdown list appears. From this list, choose Create new connection:
Enter your Slack workspace address (for example, testingslackdevgroup.slack.com), and the Client ID and Client Secret that were generated when you created the Slack app.
Give your connection a name on the Connect to Slack popup window.
Choose Continue.
On the next screen, name your new function process-new-messages, and create a new IAM role called process-new-messages-lambda-role using the available “Amazon S3 object read-only permissions” template. This role will need to be customized in a later step.
After the function has been created, choose the Permissions tab.
Choose the role name to open a second window where you can view the two policies applied to this role.
Expand each policy to view the permissions details. The policy named AWSLambdaBasicExecutionRole-* grants the necessary permissions for the function to log information in CloudWatch. The policy named AWSLambdaS3ExecutionRole-* provides S3 permissions and needs to be modified. To modify the policy, choose Edit Policy and switch to the JSON view to customize this policy. The final permissions statement should appear as follows:
"Statement": [{
    "Action": [
        "s3:GetObject*",
        "s3:GetBucket*",
        "s3:List*"
    ],
    "Resource": [
        "arn:aws:s3:::slack-moderation-output",
        "arn:aws:s3:::slack-moderation-output/*"
    ],
    "Effect": "Allow"
}]
{
    "Version": "2012-10-17",
    "Statement": [{
        "Action": [
            "sqs:SendMessage",
            "sqs:GetQueueAttributes",
            "sqs:GetQueueUrl"
        ],
        "Resource": "arn:aws:sqs:us-east-1:111111111111:new-image-findings",
        "Effect": "Allow"
    }]
}
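The account number and Region recur in every policy and queue URL in this walkthrough, so it can help to generate them from one place. The following is a minimal sketch, assuming the standard SQS ARN layout (arn:aws:sqs:region:account-id:queue-name). The helper names sqs_arn and send_only_policy are hypothetical and not part of any AWS SDK.

```python
import json


def sqs_arn(region, account_id, queue_name):
    """Build an SQS queue ARN from its three variable parts."""
    return f"arn:aws:sqs:{region}:{account_id}:{queue_name}"


def send_only_policy(queue_arn):
    """Return a policy document that only allows sending to one queue."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Action": [
                "sqs:SendMessage",
                "sqs:GetQueueAttributes",
                "sqs:GetQueueUrl",
            ],
            "Resource": queue_arn,
            "Effect": "Allow",
        }],
    }


# 111111111111 is the placeholder account number used throughout this document.
arn = sqs_arn("us-east-1", "111111111111", "new-image-findings")
print(json.dumps(send_only_policy(arn), indent=4))
```

The printed JSON can be pasted into the policy editor after you replace the placeholder account number with your own.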
import json
from urllib.parse import unquote_plus

import boto3

s3 = boto3.resource('s3')
sqs = boto3.client('sqs')


def sendToSqS(attributes, queueurl):
    # Queue one image for analysis; the image URL and Slack message ID
    # travel as message attributes
    sqs.send_message(
        QueueUrl=queueurl,
        MessageBody='Image to Check',
        MessageAttributes={
            "url": {"StringValue": attributes["image_url"], "DataType": 'String'},
            "slack_msg_id": {"StringValue": attributes["client_msg_id"], "DataType": 'String'}
        }
    )


def lambda_handler(event, context):
    image_processing_queueurl = "https://queue.amazonaws.com/111111111111/new-image-findings"
    for record in event['Records']:
        bucket = record['s3']['bucket']['name']
        key = unquote_plus(record['s3']['object']['key'])
        # Each line of the file written by AppFlow is one JSON-encoded Slack message
        file_lines = s3.Object(bucket, key).get()['Body'].read().decode('utf-8').splitlines()
        attachment_list = []
        for line in file_lines:
            if line:  # Skip blank lines
                jsonline = json.loads(line)
                if "attachments" in jsonline:  # Only messages with attachments
                    for attachment in jsonline["attachments"]:
                        if "image_url" in attachment:
                            attachment_list.append({
                                "image_url": attachment["image_url"],
                                # Some messages carry no client_msg_id
                                "client_msg_id": jsonline.get("client_msg_id", "None Found")
                            })
        for item in attachment_list:
            sendToSqS(item, image_processing_queueurl)
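The parsing step in this function can be tried locally without any AWS access. The sketch below isolates it as a pure function that takes the text of a newline-delimited JSON file. The function name extract_image_attachments and the sample records are hypothetical; the records only mimic the fields the Lambda code reads (attachments, image_url, client_msg_id), and real Slack exports contain many more.

```python
import json


def extract_image_attachments(file_text):
    """Return one dict per image attachment found in newline-delimited JSON."""
    found = []
    for line in file_text.splitlines():
        if not line.strip():
            continue  # Skip blank lines
        message = json.loads(line)
        for attachment in message.get("attachments", []):
            if "image_url" in attachment:
                found.append({
                    "image_url": attachment["image_url"],
                    "client_msg_id": message.get("client_msg_id", "None Found"),
                })
    return found


# Hypothetical records, hand-written for illustration only.
sample = "\n".join([
    json.dumps({"client_msg_id": "abc-123",
                "attachments": [{"image_url": "https://example.com/a.jpg"}]}),
    "",
    json.dumps({"text": "no attachment here"}),
    json.dumps({"attachments": [{"image_url": "https://example.com/b.jpg"},
                                {"title": "link only"}]}),
])

for item in extract_image_attachments(sample):
    print(item)
```

Only the two attachments that carry an image_url are returned, and the message without a client_msg_id falls back to "None Found", matching the behavior of the Lambda function.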
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "sqs:ReceiveMessage",
                "sqs:ChangeMessageVisibility",
                "sqs:GetQueueUrl",
                "sqs:DeleteMessage",
                "sqs:GetQueueAttributes"
            ],
            "Resource": "arn:aws:sqs:us-east-1:111111111111:new-image-findings",
            "Effect": "Allow"
        },
        {
            "Action": [
                "sqs:SendMessage",
                "sqs:GetQueueAttributes",
                "sqs:GetQueueUrl"
            ],
            "Resource": "arn:aws:sqs:us-east-1:111111111111:new-violation-findings",
            "Effect": "Allow"
        }
    ]
}
import urllib.request

import boto3

sqs = boto3.client('sqs')
rekognition = boto3.client('rekognition')


def analyze_themes(file, min_confidence=80):
    # Return the names of all moderation labels found at or above min_confidence
    with open(file, 'rb') as document:
        imageBytes = bytearray(document.read())
    response = rekognition.detect_moderation_labels(Image={'Bytes': imageBytes}, MinConfidence=min_confidence)
    found_high_confidence_labels = []
    for label in response['ModerationLabels']:
        found_high_confidence_labels.append(str(label['Name']))
    return found_high_confidence_labels


def analyze_text(file):
    # Return all text detected in the image as one concatenated string
    with open(file, 'rb') as document:
        imageBytes = bytearray(document.read())
    response = rekognition.detect_text(Image={'Bytes': imageBytes})
    textDetections = response['TextDetections']
    found_text = ""
    for text in textDetections:
        found_text += text['DetectedText']
    return found_text


def sendToSqS(words, attributes, queueurl):
    # Report a violation; the boto3 method is send_message, not sendMessage
    sqs.send_message(
        QueueUrl=queueurl,
        MessageBody='Image with "' + words + '" found',
        MessageAttributes={
            "url": {
                "StringValue": attributes["image_url"],
                "DataType": 'String'
            },
            "slack_msg_id": {
                "StringValue": attributes["slack_msg_id"],
                "DataType": 'String'
            }
        }
    )


def lambda_handler(event, context):
    violations = "https://queue.amazonaws.com/111111111111/new-violation-findings"
    disallowed_words = ["medical", "private"]
    disallowed_themes = ["Tobacco", "Alcohol"]  # Case Sensitive
    file_name = "/tmp/image.jpg"
    for record in event['Records']:
        print(record)
        receiptHandle = record["receiptHandle"]
        image_url = record["messageAttributes"]["url"]["stringValue"]
        slack_msg_id = record["messageAttributes"]["slack_msg_id"]["stringValue"]
        # Derive the source queue URL from the event source ARN
        # (arn:aws:sqs:region:account-id:queue-name)
        eventSourceARN = record["eventSourceARN"]
        arn_elements = eventSourceARN.split(':')
        img_queue_url = sqs.get_queue_url(
            QueueName=arn_elements[5],
            QueueOwnerAWSAccountId=arn_elements[4]
        )
        # Remove the message from the queue before processing it
        sqs.delete_message(
            QueueUrl=img_queue_url["QueueUrl"],
            ReceiptHandle=receiptHandle
        )
        # Download the image to local storage for analysis
        urllib.request.urlretrieve(image_url, file_name)

        # Check 1: disallowed words in text detected in the image
        detected_text = analyze_text(file_name)
        print("Detected Text: " + detected_text)
        found_words = []
        for disallowed_word in disallowed_words:
            if disallowed_word.lower() in detected_text.lower():
                found_words.append(disallowed_word)
                print("WORD VIOLATION: " + disallowed_word.lower() + " found in " + detected_text.lower())
        violating_words = ", ".join(found_words)
        if not violating_words == "":
            attributes_json = {}
            attributes_json["slack_msg_id"] = slack_msg_id
            attributes_json["image_url"] = image_url
            sendToSqS(violating_words, attributes_json, violations)

        # Check 2: disallowed moderation themes in the image
        detected_themes = analyze_themes(file_name)
        print("Detected Themes: " + ", ".join(detected_themes))
        found_themes = []
        for disallowed_theme in disallowed_themes:
            if disallowed_theme in detected_themes:
                found_themes.append(disallowed_theme)
                print("THEME VIOLATION: " + disallowed_theme + " found in image")
        violating_themes = ", ".join(found_themes)
        if not violating_themes == "":
            attributes_json = {}
            attributes_json["slack_msg_id"] = slack_msg_id
            attributes_json["image_url"] = image_url
            sendToSqS(violating_themes, attributes_json, violations)
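The two checks in the handler use different matching rules: words are matched case-insensitively as substrings of the detected text, while themes must match a label name exactly. The sketch below isolates that logic so it can be tested without calling Rekognition. The function name find_violations and the sample inputs are hypothetical.

```python
def find_violations(detected_text, detected_themes, disallowed_words, disallowed_themes):
    """Return (violating_words, violating_themes) using the handler's rules."""
    text = detected_text.lower()
    # Words: case-insensitive substring match against the detected text
    words = [word for word in disallowed_words if word.lower() in text]
    # Themes: exact, case-sensitive match against the label names
    themes = [theme for theme in disallowed_themes if theme in detected_themes]
    return words, themes


# Hypothetical inputs standing in for Rekognition results.
words, themes = find_violations(
    detected_text="PRIVATE medical record",
    detected_themes=["Alcohol", "Drinking"],
    disallowed_words=["medical", "private"],
    disallowed_themes=["Tobacco", "Alcohol"],
)
print(words, themes)
```

Because word matching is substring-based, a word such as "privateer" would also trigger the "private" rule. A theme written as "alcohol" would not match the label "Alcohol".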
After you have pasted the code, update the violations variable in the function handler with the correct URL for the new-violation-findings SQS queue, which contains your actual account number.
Choose Deploy.
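If you prefer not to hard-code the queue URL, it can be looked up from the queue ARN, in the same way the handler resolves its source queue. The following is a minimal sketch that takes the SQS client as a parameter. The helper name queue_url_from_arn is hypothetical, and StubSqsClient is a stand-in used here only so the example runs without AWS access; in a Lambda function you would pass boto3.client('sqs') instead.

```python
def queue_url_from_arn(sqs_client, queue_arn):
    """Resolve a queue URL from an ARN of the form
    arn:aws:sqs:<region>:<account-id>:<queue-name>."""
    parts = queue_arn.split(":")
    if len(parts) != 6 or parts[2] != "sqs":
        raise ValueError(f"Not an SQS queue ARN: {queue_arn!r}")
    response = sqs_client.get_queue_url(
        QueueName=parts[5],
        QueueOwnerAWSAccountId=parts[4],
    )
    return response["QueueUrl"]


class StubSqsClient:
    """Hypothetical stand-in for boto3.client('sqs'), for local runs only."""

    def get_queue_url(self, QueueName, QueueOwnerAWSAccountId):
        # Mirrors the URL format used elsewhere in this document
        return {"QueueUrl": f"https://queue.amazonaws.com/{QueueOwnerAWSAccountId}/{QueueName}"}


url = queue_url_from_arn(
    StubSqsClient(),
    "arn:aws:sqs:us-east-1:111111111111:new-violation-findings",
)
print(url)
```

The Lambda role needs the sqs:GetQueueUrl permission on the queue for this lookup, which the policy shown earlier already grants for new-violation-findings.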
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "QueueOwnerOnlyAccess",
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::111111111111:root"
            },
            "Action": [
                "sqs:DeleteMessage",
                "sqs:ReceiveMessage",
                "sqs:SendMessage",
                "sqs:GetQueueAttributes",
                "sqs:RemovePermission",
                "sqs:AddPermission",
                "sqs:SetQueueAttributes"
            ],
            "Resource": "arn:aws:sqs:us-east-1:111111111111:new-violation-findings"
        },
        {
            "Sid": "HttpsOnly",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "SQS:*",
            "Resource": "arn:aws:sqs:us-east-1:111111111111:new-violation-findings",
            "Condition": {
                "Bool": {
                    "aws:SecureTransport": "false"
                }
            }
        }
    ]
}
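The HttpsOnly statement is what rejects requests made over plain HTTP. As a quick sanity check before saving an edited policy, the sketch below scans a policy document for a Deny statement conditioned on aws:SecureTransport being "false". The function name denies_insecure_transport is hypothetical, and the check is intentionally simple: it does not evaluate the policy the way IAM does.

```python
def denies_insecure_transport(policy):
    """Return True if any statement denies requests where
    aws:SecureTransport is "false"."""
    for statement in policy.get("Statement", []):
        bool_condition = statement.get("Condition", {}).get("Bool", {})
        if (statement.get("Effect") == "Deny"
                and bool_condition.get("aws:SecureTransport") == "false"):
            return True
    return False


# Trimmed-down copy of the queue policy above, for illustration.
queue_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "HttpsOnly",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "SQS:*",
            "Resource": "arn:aws:sqs:us-east-1:111111111111:new-violation-findings",
            "Condition": {"Bool": {"aws:SecureTransport": "false"}},
        }
    ],
}

print(denies_insecure_transport(queue_policy))
```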
Post sample images that will be used to trigger violations for the current configured policies:
After creating those posts, wait 2-3 minutes and then navigate to the SQS Console. View the queues and choose the new-violation-findings queue.
Choose the Send and receive messages button.
At the bottom of the screen, choose the Poll for messages button.
After a few seconds you should see two messages appear. You can choose each message to inspect its contents.
Choose the Message ID. The body of the message contains information about what violation was triggered. The Attributes show the image URL and “slack_msg_id” for the offending item.
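The same information can be read programmatically. The sketch below flattens one message, in the shape returned by the boto3 SQS receive_message call (Body plus MessageAttributes with StringValue entries), into a small summary. The function name summarize_finding and the sample message are hypothetical; the sample only mirrors the body format and attribute names that the Lambda function in this document writes.

```python
def summarize_finding(message):
    """Flatten one received SQS message into a violation summary."""
    attributes = message.get("MessageAttributes", {})
    return {
        "violation": message["Body"],
        "image_url": attributes.get("url", {}).get("StringValue"),
        "slack_msg_id": attributes.get("slack_msg_id", {}).get("StringValue"),
    }


# Hypothetical message, hand-written for illustration only.
sample_message = {
    "MessageId": "00000000-0000-0000-0000-000000000000",
    "Body": 'Image with "Alcohol" found',
    "MessageAttributes": {
        "url": {"StringValue": "https://example.com/a.jpg", "DataType": "String"},
        "slack_msg_id": {"StringValue": "abc-123", "DataType": "String"},
    },
}

print(summarize_finding(sample_message))
```

When polling with boto3, pass MessageAttributeNames=["All"] to receive_message; otherwise the attributes are not included in the response and both fields come back as None.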