Soundouss Messoudi, Sylvain Rousseau, Sebastien Destercke
Deep networks, like some other learning models, can assign high confidence to unreliable
predictions. Making these models robust and reliable is therefore essential, especially for
critical decisions. This paper shows that the density-based conformal prediction approach offers
a convincing solution to this challenge. Conformal prediction consists of predicting a set of
classes that covers the true class with a user-defined frequency. For atypical examples,
conformal prediction predicts the empty set. Experiments show the good behavior of the
conformal approach, especially when handling noisy and outlier examples.
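As a rough illustration of the idea, the sketch below builds a class-wise density-based conformal predictor on toy 1-D data. It is not the authors' implementation: the Gaussian kernel density estimate, the bandwidth, the toy samples, and the names `kde`, `class_thresholds`, and `predict_set` are all assumptions made for this example. A class enters the predicted set only if the point's density under that class reaches a threshold calibrated on held-out examples, so a point far from every class receives the empty set.

```python
import math

def kde(x, samples, bandwidth=0.5):
    """Gaussian kernel density estimate at x from 1-D samples (assumed density model)."""
    norm = len(samples) * bandwidth * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / norm

def class_thresholds(train, calib, alpha=0.1):
    """Per-class density cutoff so that about 1 - alpha of that class is covered."""
    thresholds = {}
    for label, points in calib.items():
        scores = sorted(kde(p, train[label]) for p in points)
        k = math.floor(alpha * (len(scores) + 1))  # 1-indexed rank of the cutoff
        # With too few calibration points (k == 0) the class is always included.
        thresholds[label] = scores[k - 1] if k >= 1 else 0.0
    return thresholds

def predict_set(x, train, thresholds):
    """Set of classes whose estimated density at x reaches the class threshold."""
    return {label for label, t in thresholds.items() if kde(x, train[label]) >= t}

# Toy data (hypothetical): class "a" lives near 0, class "b" near 5.
train = {"a": [-0.4, -0.2, 0.0, 0.2, 0.4], "b": [4.6, 4.8, 5.0, 5.2, 5.4]}
calib = {"a": [i / 10 for i in range(-5, 5)], "b": [5 + i / 10 for i in range(-5, 5)]}
thresholds = class_thresholds(train, calib, alpha=0.1)

print(predict_set(0.0, train, thresholds))    # typical point of class "a"
print(predict_set(100.0, train, thresholds))  # outlier, far from both classes
```

In a deep learning setting the density would typically be estimated on learned features rather than raw inputs, which is outside the scope of this sketch.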
Bouaaddi Ayoub, Artibi Yasser, Hajji Hicham, Mharzi Alaoui Hicham
Deep learning architectures have proven to be effective in geospatial analysis, especially in
image segmentation. However, for conventional building extraction only traditional machine
learning methods are used. In this paper, we explore newer models, including the U-Net
architecture, Attention U-Net, and TransUNet, for building detection, and assess their
performance in extracting building footprints.
Amzoug Zohra, Essqalli Chaimae, Hajji Hicham, Bouarfa Soufiane
Computer vision and deep learning are frequently used in automated detection and inspection. In
aircraft inspection, they save a great deal of time and improve the accuracy of dent detection,
which reduces the risk of aircraft accidents. The novelty of this work is the use of three deep
learning architectures ranked as the most powerful in segmentation: U-Net, Attention U-Net,
which reflects the effect of attention, and TransU-Net, which is characterized by the presence
of transformers. Despite the small dataset available, the results are encouraging and
demonstrate the power of these architectures in aircraft inspection and dent detection,
especially TransU-Net, which achieved a precision of 83.18%, an IoU of 87.27%, and a recall of
50.27%. Based on these results, we conclude that transformers proved their strength in our case
study.
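For readers unfamiliar with the three metrics quoted above, the sketch below shows their standard pixel-wise definitions for a single binary mask. The abstract does not say how its figures were computed or averaged, so this is only an assumed, textbook formulation; the function names and the toy masks are hypothetical.

```python
def confusion_counts(pred, truth):
    """Count true positives, false positives and false negatives over two binary masks."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    return tp, fp, fn

def segmentation_metrics(pred, truth):
    """Pixel-wise precision, recall and IoU for the foreground class."""
    tp, fp, fn = confusion_counts(pred, truth)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return precision, recall, iou

# Toy 3x3 masks (hypothetical): 1 marks a dent pixel.
truth = [[1, 1, 0],
         [1, 1, 0],
         [0, 0, 0]]
pred = [[1, 1, 1],
        [0, 1, 0],
        [0, 0, 0]]

precision, recall, iou = segmentation_metrics(pred, truth)
print(precision, recall, iou)  # 0.75 0.75 0.6
```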
Aissam Outchakoucht and Hamza Es-samaali
Darija Open Dataset (DODa) is an open-source project for the Moroccan dialect. With more than
13,000 entries, DODa is arguably the largest open-source collaborative project for Darija <=>
English translation built for Natural Language Processing purposes. In fact, besides semantic
categorization, DODa also adopts a syntactic one, presents words under different spellings,
offers verb-to-noun and masculine-to-feminine correspondences, contains the conjugation of
hundreds of verbs in different tenses, as well as more than 2,500 translated sentences. This data
paper presents a description of DODa. This collaborative project is hosted on GitHub and aims to
be a standard resource for researchers, students, and anyone who is interested in Moroccan
Dialect.
Bousselham El Haddaoui, Raddouane Chiheb, Rdouan Faizi, and Abdellatif El Afia
Deep learning techniques have proven their effectiveness for Sentiment Analysis (SA) related
tasks. Recurrent neural networks (RNN), especially Long Short-Term Memory (LSTM) and
Bidirectional LSTM, have become a reference for building accurate predictive models. However,
the models' complexity and the number of hyperparameters to configure raise several questions
related to their stability. In this paper, we present various LSTM models and their key
parameters, and we perform experiments to test the stability of these models in the context of
Sentiment Analysis.
Ouboti Djaneye-Boundjou
x86 opcodes extracted from disassembly files, which are provided for each malware program in the
imbalanced, labeled subset of the BIG 2015 dataset, are used to classify the said malware
programs. More specifically, Non-Negative Matrix Factorization (NMF) is utilized to model
documents of opcodes representing the malware programs as weighted mixtures of the generated NMF
topics. A k Nearest Neighbors model, a Random Forest model, an XGBoost model, and an ensemble of
the aforementioned models are each used to classify the malware programs based on NMF topic
weight features. The proposed approach is promising as, on an adequately sampled and held-out
test dataset, it yields minimums of 98.49% classification accuracy and 97.38% macro F1 score.
Ismail KICH, El Bachir AMEUR, Youssef TAOUIL
In this paper, a steganographic model hiding a color image into another
color image of the same size is presented. The use of a deep auto-encoder network architecture
and a loss function focusing on the quality of the stego image are investigated. Experiments on
different image databases demonstrate the ability of the proposed architecture to hide one color
image within another regardless of their sources and sizes.
Randa Zarnoufi, Walid Bachri, Hamid Jaafar, Mounia Abik
The Moroccan Arabic (MA) dialect is a low-resource language. To perform any NLP task, we have to
develop the necessary resources from scratch. This paper presents our work on the first MA
dataset for violent content detection in user-generated text. The dataset will serve to build
predictive models of the violent content widely present in social media and thus help to ensure
online safety.
Mohamed Zouidine, Mohammed Khalil
This paper presents a new deep reinforcement learning based method for Arabic Sentiment Analysis
using a policy gradient algorithm. To show the effectiveness of deep reinforcement learning
techniques, an RNN-based model was trained with a combination of binary cross-entropy and policy
gradient losses. Experiments on the Large-Scale Arabic Book Reviews (LABR) dataset show that our
method helps to improve the performance of the trained model for Arabic Sentiment Analysis.
Salma El Anigri, Abdelhak Mahmoudi, Saad Slimani, Salaheddine Hounka, Taha Rehah, Youssef Bouyakhf,
Mustapha Akiki, El Houssine Bouyakhf
Medical reports record the patient's personal data as well as their medical consultations and
clinical, biological, and radiological examinations. Thus, before processing and/or publicly
sharing these medical reports for scientific research purposes, sensitive data must be deleted
for legal and ethical reasons. In this article, we present our ongoing work on a rule-based
de-identification process for French medical reports of fetal echography.
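To give a sense of what a rule-based de-identification pass can look like, here is a minimal sketch built on regular expressions. The abstract does not describe the actual rules, so everything below is an assumption for illustration: the three patterns (dates, French-style phone numbers, names following a title), the placeholder tokens, the `deidentify` helper, and the sample sentence are all hypothetical.

```python
import re

# Hypothetical rules: each pattern is paired with the placeholder that replaces it.
RULES = [
    # Dates written as dd/mm/yyyy.
    (re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"), "<DATE>"),
    # Ten-digit phone numbers starting with 0, optionally grouped by spaces or dots.
    (re.compile(r"\b0\d(?:[ .]?\d{2}){4}\b"), "<PHONE>"),
    # A capitalized name following a title; the title itself is kept.
    (re.compile(r"\b(Mme|Mlle|M\.|Dr)\s+[A-ZÀ-Ý][\w-]+"), r"\1 <NAME>"),
]

def deidentify(text):
    """Apply every rule in order and return the text with placeholders."""
    for pattern, placeholder in RULES:
        text = pattern.sub(placeholder, text)
    return text

sample = "Mme DUPONT, née le 03/04/1985, tel 06 12 34 56 78, vue par Dr Martin le 12/01/2021."
print(deidentify(sample))
```

A real system would need far broader rules (addresses, identifiers, names without titles) and an evaluation against manually annotated reports; the sketch only shows the general pattern-and-placeholder mechanism.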
Wissal MARHIT
In this paper, a computational predictive model for innovative marketing is presented. Text
Mining, NLP and Machine Learning are used to target consumers in a sustainable, personalized
way, giving rise to the “LCF One to One Marketing Strategy” innovation in Loyalty Marketing
Strategies.
Khalid TNAJI, Karim BOUZOUBAA, Lhoussain Aouragh
Part-of-speech (POS) tagging is the task of computationally determining which POS a word takes
in a particular context. A POS tagger is a useful preprocessing tool in many natural language
processing (NLP) applications. In this paper, we present a new Arabic POS tagger based on the
combination of two main modules, one using a first-order Markov model and the other a decision
tree model. The tagger uses an elementary tag set composed of 4 tags {noun, verb, particle,
punctuation}, which is sufficient for some NLP applications and allows much greater accuracy
and speed.
Mariame Ouamer, Karim Bouzoubaa, and Rachida Tajmout
This paper attempts to describe our own Arabic broken plural list and to handle the problem of
broken plurals by developing a system that extracts the plural and its singular form using
several machine learning classifiers. The results obtained show that the Random Forest classifier
outperforms the other statistical classifiers with an accuracy of approximately 98%.
Amina Alaoui Soulimani
Digitising Morocco’s health infrastructure has entailed the adoption of foreign algorithms
across hospitals and microbiology laboratories. While cancer diagnosis through artificially
intelligent machines is at the heart of contemporary discourses on technology advancement ‘for’
health, the ethics of its deployment and cancer patients’ consent to data usage are not
echoed enough, especially as public hospitals are posited within an imaginary that can be devoid
of hope for non-middle class populations due to dilapidated infrastructures.