Microsoft

How can we improve Content Moderator?

Need better way to distinguish offensive content

Consider the following two tweets. The first is benign. The second is offensive. How can I know that programmatically? Both have the same ReviewRecommended value and the same Category3 score, both have low Category1 and Category2 scores, and neither has any Terms flagged. This is useless.

{
"OriginalText": "Wanna win tix to our @RoughTradeNYC show NEXT MON 5/14? @Thrillcall's got you covered. Enter to win here >… https://t.co/s0KT4JDCB1",
"NormalizedText": "Wanna win tix to our @ RoughTradeNYC show NEXT MON 5/ 14? @ Thrillcall' s got you covered. Enter to win here & gt; … https://t.co/s0KT4JDCB1",
"Misrepresentation": null,
"Classification":
{
"ReviewRecommended": true,
"Category1":
{
"Score": 0.001084162387996912
},
"Category2":
{
"Score": 0.13056860864162445
},
"Category3":
{
"Score": 0.98799997568130493
}
},
"Language": "eng",
"Terms": null,
"Status":
{
"Code": 3000,
"Description": "OK",
"Exception": null
},
"TrackingId": "b1f79713-f84f-47b5-b4d8-0a0d0aec3d73"
}

Compare that to:

{
"OriginalText": "@omgimwigs @ me bro fckin @ me 😤🔫🙅🏻",
"NormalizedText": "@ omgimwigs @ me bro fckin @ me 😤🔫🙅🏻",
"Misrepresentation": null,
"Classification":
{
"ReviewRecommended": true,
"Category1":
{
"Score": 0.28458982706069946
},
"Category2":
{
"Score": 0.19746741652488708
},
"Category3":
{
"Score": 0.98799997568130493
}
},
"Language": "eng",
"Terms": null,
"Status":
{
"Code": 3000,
"Description": "OK",
"Exception": null
},
"TrackingId": "30bf4c39-47ae-4047-8ebb-863486bbabab"
}
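
To make the comparison concrete, here is a minimal Python sketch that parses trimmed copies of the two responses above and lists which of the fields I mentioned actually differ. It does not call the Content Moderator API; it only works on the JSON already pasted here, and the helper names (`summarize`, `differing_fields`) are made up for illustration.

```python
import json

# Trimmed copies of the two responses above; only the compared fields are kept.
BENIGN_JSON = """
{
  "Classification": {
    "ReviewRecommended": true,
    "Category1": {"Score": 0.001084162387996912},
    "Category2": {"Score": 0.13056860864162445},
    "Category3": {"Score": 0.98799997568130493}
  },
  "Terms": null
}
"""

OFFENSIVE_JSON = """
{
  "Classification": {
    "ReviewRecommended": true,
    "Category1": {"Score": 0.28458982706069946},
    "Category2": {"Score": 0.19746741652488708},
    "Category3": {"Score": 0.98799997568130493}
  },
  "Terms": null
}
"""


def summarize(response):
    """Flatten the fields of interest into a single dict (hypothetical helper)."""
    classification = response["Classification"]
    return {
        "ReviewRecommended": classification["ReviewRecommended"],
        "Category1": classification["Category1"]["Score"],
        "Category2": classification["Category2"]["Score"],
        "Category3": classification["Category3"]["Score"],
        "Terms": response["Terms"],
    }


def differing_fields(a, b):
    """Return the names of summarized fields whose values are not equal."""
    summary_a, summary_b = summarize(a), summarize(b)
    return [key for key in summary_a if summary_a[key] != summary_b[key]]


benign = json.loads(BENIGN_JSON)
offensive = json.loads(OFFENSIVE_JSON)

# Print each differing field with both values side by side.
for field in differing_fields(benign, offensive):
    print(field, summarize(benign)[field], summarize(offensive)[field])
```

Only Category1 and Category2 differ between the two responses, and both scores stay below 0.3 in either case, so nothing here gives an obvious signal to act on.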

1 vote

    Jeff Lit shared this idea

    2 comments
