Natural Language Processing (NLP) is a field of study focused on developing techniques and algorithms for processing and analyzing human language. It is an interdisciplinary field that draws on computer science, linguistics, and artificial intelligence, among other areas, and covers tasks such as text classification, sentiment analysis, language translation, speech recognition, and text summarization. In DEEP we use this technology to support the humanitarian sector with classification and categorization for analysis tasks and with report generation for crises. It is embedded in the platform to categorize, extract, summarize, and analyze texts quickly, so that reports can be generated more efficiently. DEEP is the first tool in the humanitarian sector to put artificial intelligence to practical use, making it a pioneer in bringing NLP to the humanitarian and development sectors.
A multi-label classification model tailored to common humanitarian analysis frameworks and trained on DEEP data, with direct access from within DEEP and a pluggable interface for other applications.
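The sketch below illustrates how a multi-label classifier of this kind can be called with the Hugging Face Transformers library. The base checkpoint, label set, and decision threshold are placeholders for illustration, not DEEP's actual model or taxonomy.

```python
# Minimal multi-label classification sketch using Hugging Face Transformers.
# The model name and label set below are placeholders, not DEEP's own model.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "xlm-roberta-base"  # placeholder; DEEP uses its own fine-tuned model
LABELS = ["food_security", "health", "protection", "shelter"]  # illustrative labels

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME,
    num_labels=len(LABELS),
    problem_type="multi_label_classification",
)

text = "Flooding has displaced thousands of families who now lack clean water."
inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

# Multi-label: apply a sigmoid per label and keep every label above a threshold.
probs = torch.sigmoid(logits).squeeze(0)
predicted = [label for label, p in zip(LABELS, probs) if p > 0.5]
print(predicted)
```

Unlike single-label classification, each label gets its own sigmoid score, so a passage can be assigned several framework categories at once.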
Optimizing a model that extracts geographic information from text.
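As an illustration of geographic extraction, the following sketch runs a publicly available named-entity-recognition model and keeps only location entities. The chosen model is a generic English stand-in, not the model used in DEEP.

```python
# Illustrative geolocation-extraction sketch using a public NER model as a
# stand-in; DEEP's actual geolocation model is not shown here.
from transformers import pipeline

ner = pipeline(
    "token-classification",
    model="dslim/bert-base-NER",  # generic public NER model, English only
    aggregation_strategy="simple",
)

text = "Heavy rains caused landslides near the provincial capital, displacing families."
entities = ner(text)

# Keep only location entities (the tag set depends on the chosen model).
locations = [e["word"] for e in entities if e["entity_group"] == "LOC"]
print(locations)
```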
Automating the summarization step of report creation using annotated data.
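The following is a minimal summarization sketch using a generic pretrained summarizer from the Hugging Face Hub. DEEP's own summarization pipeline, trained on annotated data, is not reproduced here; the model name and length limits are illustrative.

```python
# Minimal summarization sketch with a generic pretrained summarizer.
from transformers import pipeline

summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")

report_text = (
    "The drought has severely affected crop production across the region. "
    "Households report reduced food stocks, rising market prices, and "
    "increasing reliance on negative coping strategies such as selling assets."
)

summary = summarizer(report_text, max_length=40, min_length=10, do_sample=False)
print(summary[0]["summary_text"])
```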
This model selects a subset of passages from a given document that contain relevant information; these entries do not necessarily follow common units of text such as sentences or paragraphs and can vary in length.
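A hedged sketch of this kind of passage selection is shown below: candidate spans are scored against a query with sentence embeddings and the highest-scoring ones are kept. The embedding model, query, and threshold are assumptions made for illustration, and the candidate spans are whole sentences only for simplicity; they are not DEEP's entry-extraction model.

```python
# Sketch of relevant-passage selection: score candidate spans against a query
# with sentence embeddings and keep the best ones. Illustrative only.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # small public embedding model

document_spans = [
    "Heavy rainfall continued throughout the week.",
    "An estimated 12,000 people have been displaced and require shelter assistance.",
    "The report was compiled by the assessment team.",
    "Health facilities in the affected area report shortages of essential medicines.",
]
query = "humanitarian needs and affected population"  # hypothetical query

span_emb = model.encode(document_spans, convert_to_tensor=True)
query_emb = model.encode(query, convert_to_tensor=True)

# Cosine similarity between the query and every candidate span.
scores = util.cos_sim(query_emb, span_emb)[0]
relevant = [s for s, score in zip(document_spans, scores) if score > 0.3]
print(relevant)
```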
HumBert (Humanitarian Bert) is an XLM-RoBERTa model trained on humanitarian texts: approximately 50 million textual examples (roughly 2 billion tokens) from public humanitarian reports, legal cases, and news articles.
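Assuming HumBert is published on the Hugging Face Hub, it could be loaded for downstream fine-tuning roughly as follows; the identifier used below is an assumption and should be replaced with the model's actual published name.

```python
# Sketch of loading HumBert for downstream fine-tuning. The Hub identifier
# "nlp-thedeep/humbert" is an assumed placeholder; verify the published name.
from transformers import AutoModelForMaskedLM, AutoTokenizer

MODEL_ID = "nlp-thedeep/humbert"  # assumed identifier, check before use

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForMaskedLM.from_pretrained(MODEL_ID)

# As an XLM-RoBERTa checkpoint, HumBert can be fine-tuned with the usual
# AutoModelForSequenceClassification / Trainer workflow on humanitarian text.
```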
Using it to identify the topics present in a large corpus of text and to surface meaningful patterns and insights from large volumes of text data.
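As one possible illustration of identifying topics in a corpus, the sketch below runs a simple scikit-learn LDA topic model over a handful of toy documents; the topic-modelling method actually used in DEEP is not specified here, so LDA, the document set, and the number of topics are all assumptions.

```python
# Illustrative topic-modelling sketch with scikit-learn's LDA (an assumption,
# not necessarily the approach used in DEEP).
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

documents = [
    "Cholera cases are rising in displacement camps with poor sanitation.",
    "Food prices increased sharply after the harvest failed.",
    "Clinics report shortages of vaccines and medical staff.",
    "Markets remain closed and households cannot afford staple foods.",
]

vectorizer = CountVectorizer(stop_words="english")
doc_term = vectorizer.fit_transform(documents)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(doc_term)

# Print the top words for each discovered topic.
terms = vectorizer.get_feature_names_out()
for idx, topic in enumerate(lda.components_):
    top_terms = [terms[i] for i in topic.argsort()[-5:][::-1]]
    print(f"Topic {idx}: {', '.join(top_terms)}")
```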