Tisane Block for Natural Language Processing in 27 Languages
Natural language processing (NLP) is one of the most impactful branches of artificial intelligence already in action today. In a nutshell, it analyzes massive sets of structured or unstructured data and derives insights from them: gauging sentiment, parsing text, extracting topics, tagging hate speech, identifying criminal activity. It’s all in the text, and NLP is the key to understanding it at a rate and scale far beyond the capabilities of human operators.
That’s why we’re excited to add the Tisane Block for Natural Language Processing from Tisane Labs to our ever-growing list of AI/machine learning services that can be directly integrated into PubNub with Functions and the Blocks Catalog. The Block brings NLP capabilities to your data streams, allowing you to analyze and process the data flowing through with no additional services or infrastructure required. Blocks are 100% serverless, and all the business logic lives directly inside the PubNub network.
The Tisane Block provides:
- Granular (aspect-based) sentiment analysis: rather than a single figure, the result is a vector of values for specific targets.
- Detection of abuse: hate speech / personal attacks / sexual advances / profanities
- Topic modeling: IAB and IPTC standards are supported, as well as native topic Tisane IDs.
- Part-of-speech tagging with additional tags. Supported standards: glossing abbreviations, Penn Treebank tags, Universal Dependencies, or native Tisane features
- Entity extraction
- Sense disambiguation
The NLP service itself supports 27 languages, spanning European, Asian, and Middle Eastern languages.
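To make the aspect-based sentiment item above more concrete, here is a small standalone Python sketch contrasting a single overall figure with per-target values. The `snippets` data, the field names `target` and `polarity`, and the scoring table are assumptions made up for illustration; they are not the Tisane response schema.

```python
from collections import defaultdict

# Hypothetical aspect-level results for the review
# "The room was spotless, but the breakfast was cold."
# Field names are illustrative, not the Tisane response schema.
snippets = [
    {"target": "room", "polarity": "positive"},
    {"target": "breakfast", "polarity": "negative"},
]

# Assumed mapping from polarity label to a numeric score.
SCORE = {"positive": 1, "neutral": 0, "negative": -1}


def sentiment_by_target(snippets):
    """Group polarity scores by the target they refer to."""
    by_target = defaultdict(list)
    for s in snippets:
        by_target[s["target"]].append(SCORE[s["polarity"]])
    return dict(by_target)


def overall_score(snippets):
    """Collapse everything into one figure (the mean polarity score)."""
    scores = [SCORE[s["polarity"]] for s in snippets]
    return sum(scores) / len(scores) if scores else 0.0


per_target = sentiment_by_target(snippets)
single = overall_score(snippets)
```

In this made-up example the single figure averages out to neutral, while the per-target view keeps the praise for the room separate from the complaint about the breakfast.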
Sample Use Cases
The following are a couple of examples of how you can use the Tisane Block for real-time natural language processing in your app or product.
NLP allows you to moderate content or data streams on the fly. In real-time chat environments, for example, human moderators cannot step in between posting and publishing; they can only react after abusive content has been published. Depending on how frequent and severe the incidents are, and how the community reacts to them, you can mix and match the following options:
- Alert the poster, suggesting that they rethink what they are about to publish (“was that really necessary?”).
- Keep track of the frequency of abuse with an automatic multiple-strikes system, after which the account is temporarily suspended pending review by a human moderator.
- Quietly alert the moderators.
- Provide an option for the other participants to hide content with particular types of abuse, or even to blacklist the abusive user unless that user is explicitly whitelisted. This is likely to be the most annoying option for the trolls, as they can no longer complain about the authorities misusing their power; in this scenario, it’s the other users who voluntarily took action against them.
- Censor the message completely or hold it until it is reviewed by a human moderator.
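Several of the options above can be combined into one policy. The following is a minimal Python sketch of such a policy, not code from the Block itself. It assumes each detected abuse item arrives as a dictionary with a `severity` label; the `StrikeTracker` class, the `moderate` function, the severity ranking, and the action names are all hypothetical.

```python
from dataclasses import dataclass, field

# Assumed severity ranking; the labels are illustrative.
SEVERITY_RANK = {"low": 1, "medium": 2, "high": 3, "extreme": 4}


@dataclass
class StrikeTracker:
    """Counts abusive messages per user (in memory, for illustration only)."""
    max_strikes: int = 3
    strikes: dict = field(default_factory=dict)

    def record(self, user_id):
        self.strikes[user_id] = self.strikes.get(user_id, 0) + 1
        return self.strikes[user_id]


def moderate(user_id, abuse, tracker, hold_at="high"):
    """Return one action for a message, given its detected abuse items."""
    if not abuse:
        return "publish"
    count = tracker.record(user_id)
    if count >= tracker.max_strikes:
        # Too many strikes: suspend until a human moderator reviews.
        return "suspend_pending_review"
    worst = max(SEVERITY_RANK[item["severity"]] for item in abuse)
    if worst >= SEVERITY_RANK[hold_at]:
        # Severe abuse: hold the message instead of publishing it.
        return "hold_for_moderator"
    # Mild abuse: ask the poster to reconsider.
    return "warn_poster"
```

With `max_strikes=3`, a user's first mild offence yields `warn_poster`, a later severe one yields `hold_for_moderator`, and the third abusive message yields `suspend_pending_review`. A production system would keep the strike counts in persistent storage and probably let them expire over time.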
Particulars such as dates, times, locations, prices, or contact details are often mentioned in a chat. ‘Smart’ chat apps can save these and pass them to other applications, such as calendars, scheduling software, or a notebook for comparing offers when shopping around. You can use Tisane to extract entities and cleanly store them alongside the original document.
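As a rough illustration of storing entities next to the original text, the Python sketch below builds one record per message. It assumes the extracted entities arrive as dictionaries with `type`, `offset`, and `length` fields; that shape, the type names, and the `attach_entities` helper are hypothetical and chosen only for this example.

```python
import json


def attach_entities(message, entities,
                    wanted=("date", "time", "place", "price", "phone")):
    """Return a record holding the original text plus selected entities."""
    kept = [
        {
            "type": e["type"],
            # Recover the entity text from its position in the message.
            "text": message[e["offset"]:e["offset"] + e["length"]],
        }
        for e in entities
        if e["type"] in wanted
    ]
    return {"text": message, "entities": kept}


message = "Let's meet on Friday at 6pm."
# Hypothetical extraction result for the message above.
entities = [
    {"type": "date", "offset": 14, "length": 6},
    {"type": "time", "offset": 24, "length": 3},
]

record = attach_entities(message, entities)
print(json.dumps(record, indent=2))
```

Keeping the untouched message in the same record means a calendar or scheduling integration can use the structured fields while the original wording stays available for display or auditing.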
Statistics and Content Aggregation
Tisane’s topic extraction can be used for contextual ad display. Topics, entities, concepts, and sentiment analysis 2.0 snippets can also be aggregated into statistics, giving community managers concrete data points on what their community is excited about and what holds it back.
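The aggregation step could look something like the Python sketch below, which tallies topics and sentiment polarity across a batch of analyzed messages. The per-message structure (a `topics` list and a `sentiment` list with `polarity` labels) and the `aggregate` helper are assumptions made for illustration, not the Block's actual output format.

```python
from collections import Counter


def aggregate(analyses):
    """Tally topics and sentiment polarity across analyzed messages."""
    topic_counts = Counter()
    sentiment_counts = Counter()
    for analysis in analyses:
        topic_counts.update(analysis.get("topics", []))
        sentiment_counts.update(
            s["polarity"] for s in analysis.get("sentiment", [])
        )
    return topic_counts, sentiment_counts


# Hypothetical analysis results for three chat messages.
analyses = [
    {"topics": ["travel", "food"],
     "sentiment": [{"target": "breakfast", "polarity": "negative"}]},
    {"topics": ["travel"],
     "sentiment": [{"target": "room", "polarity": "positive"}]},
    {"topics": ["food"], "sentiment": []},
]

topics, sentiment = aggregate(analyses)
```

From counters like these, a dashboard can list the most discussed topics with `topics.most_common()` or track how the share of negative snippets changes over time.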