MoroccoAI Conference Keynotes
Empower AI transformation in Morocco and connect today’s and tomorrow’s AI leaders
At the end of the 1600s, Leibniz started thinking about building a machine that would answer legal questions. However, the first systems applying Artificial Intelligence (AI) to the law did not appear until the 1970s. One famous example is TAXMAN, which built a formal model of US tax law. The first legal chatbot, DoNotPay, was launched in 2015; it is an online chatbot originally created to help appeal parking tickets. Since then, various approaches have been proposed to automate legal tasks. In this talk, we will give an overview of how recent advances in AI and Natural Language Processing (NLP) allow the analysis of large numbers of legal documents:
Much work has been done recently to make neural networks more interpretable, and one approach is to arrange for the network to use only a subset of the available features. In linear models, Lasso (or L1-regularized) regression assigns zero weights to the most irrelevant or redundant features, and is widely used in data science. However, the Lasso only applies to linear models. In this webinar, Mr. Lemhadri will introduce LassoNet, a neural network framework with global feature selection. The proposed approach enforces a hierarchy: specifically, a feature can participate in a hidden unit only if its linear representative is active. Unlike other approaches to feature selection for neural nets, the method uses a modified objective function with constraints, and so integrates feature selection directly with parameter learning. As a result, it delivers an entire regularization path of solutions with a range of feature sparsity. In systematic experiments, LassoNet significantly outperforms state-of-the-art methods for feature selection and regression. The LassoNet method uses projected proximal gradient descent, and generalizes directly to deep networks.
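To make the hierarchy idea concrete, here is a minimal sketch in Python with NumPy. It is not the LassoNet algorithm itself: it only combines the standard Lasso shrinkage step (soft-thresholding) with a simple clipping rule in which the first-layer weights of feature j are bounded by a multiple M of the magnitude of its linear weight. The function names, the bound M, and the toy values are assumptions made for illustration.

```python
import numpy as np

def soft_threshold(x, lam):
    """Proximal operator of lam * ||x||_1: shrink toward zero, zeroing small entries."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def enforce_hierarchy(theta, W, M=1.0):
    """Clip row j of W so that max_k |W[j, k]| <= M * |theta[j]|.

    Illustrative simplification (an assumption, not the method's actual
    proximal step): a feature whose linear weight theta[j] is zero gets
    all of its hidden-layer weights set to zero as well.
    """
    bound = M * np.abs(theta)[:, None]  # one bound per input feature
    return np.clip(W, -bound, bound)

# Toy values: 3 input features, 2 hidden units (hypothetical numbers).
theta = np.array([0.9, -0.05, 0.3])   # linear ("skip") weights
W = np.array([[0.5, -2.0],
              [0.4,  0.1],
              [-0.2, 0.6]])           # first-layer weights, one row per feature

theta_sparse = soft_threshold(theta, lam=0.1)      # feature 1 is shrunk to zero
W_feasible = enforce_hierarchy(theta_sparse, W)    # so its row of W is zeroed too
```

In this toy run the second feature's linear weight falls below the threshold, so the feature is removed from both the linear part and the hidden layer. Sweeping `lam` from small to large would trace out a range of sparsity levels, in the spirit of the regularization path mentioned above.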
Accurate image segmentation is crucial for medical imaging applications, which typically rely on high-quality manual annotations that are tedious and time-consuming for clinical experts to produce. In this talk, Dr. Gridach will present his recent work on the Densely Oriented Pooling Network (DOPNet) for capturing variation in feature size and preserving spatial interconnection in medical image segmentation. Dr. Gridach will start with a brief computer vision review before moving on to his proposed work.
The presence of pollutants in the air has a direct impact on our health and causes detrimental changes to our environment. Air quality monitoring is therefore of paramount importance. The high cost of acquiring and maintaining accurate air quality stations means that only a small number of these stations can be deployed in a country. This presentation is about a low-cost approach to monitoring air quality in urban areas. By combining Artificial Intelligence (AI) and the Internet of Things (IoT), we can improve the spatial resolution of the air monitoring process, and successfully predict air quality based on readily available data.
Attention mechanisms have improved the performance of NLP tasks while allowing models to remain explainable. Self-attention is currently widely used; however, interpretability is difficult due to the numerous attention distributions. Recent work has shown that model representations can benefit from label-specific information while facilitating interpretation of predictions. We introduce the Label Attention Layer, a new form of self-attention where attention heads represent labels. We test our approach by running constituency and dependency parsing experiments on the Penn Treebank (PTB) and the Chinese Treebank (CTB) datasets. The proposed model achieves state-of-the-art results for both tasks. Moreover, our model requires fewer self-attention layers than existing models. Finally, we share our findings that Label Attention heads can learn relations between syntactic categories and show pathways to analyze errors.
Recently, by relying only on self-attention blocks, the transformer mechanism has taken many AI fields by storm. For example, in NLP, several transformer-based architectures such as BERT, GPT-2 and GPT-3 were proposed, outperforming classical NLP approaches such as RNNs and LSTMs. In biology, AlphaFold 2 was proposed as a transformer-based model that better predicts the structures of proteins from their genetic sequences. More recently, many researchers have tried to apply the same transformer recipe to computer vision tasks such as classification, semantic segmentation and object detection.
The aim of this webinar is to give an overview of the Vision Transformer (ViT) and how it has changed the computer vision landscape by replacing the widely used convolution operator with self-attention alone. The webinar will discuss how the proposed architecture succeeded in outperforming CNN-based architectures like ResNet simply by stacking transformer layers and treating input images as sequences of patch tokens. Considerations such as scalability, complexity and interpretability, as well as other ViT variants, will also be discussed.
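As an illustration of what treating an image as patch tokens can look like, the following NumPy sketch cuts an image into non-overlapping square patches and flattens each one into a vector. The function name, the 224x224 image size and the 16-pixel patch size are assumptions chosen for the example; a full ViT would additionally apply a learned linear projection, add position embeddings and feed the tokens to transformer layers, none of which is shown here.

```python
import numpy as np

def image_to_patch_tokens(image, patch_size):
    """Split an (H, W, C) image into flattened, non-overlapping square patches.

    Returns an array of shape (num_patches, patch_size * patch_size * C),
    with patches ordered row by row across the image.
    """
    h, w, c = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("image height and width must be divisible by patch_size")
    gh, gw = h // patch_size, w // patch_size
    # (gh, p, gw, p, c) -> (gh, gw, p, p, c) -> (gh * gw, p * p * c)
    patches = image.reshape(gh, patch_size, gw, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)
    return patches.reshape(gh * gw, patch_size * patch_size * c)

# Hypothetical input: a 224x224 RGB image filled with increasing values.
image = np.arange(224 * 224 * 3, dtype=np.float32).reshape(224, 224, 3)
tokens = image_to_patch_tokens(image, patch_size=16)
print(tokens.shape)  # one row per patch, one column per pixel value in the patch
```

Each row of `tokens` plays the role that a word embedding plays in NLP, which is why the same stack of self-attention layers can be reused for images.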
High-resource languages have plenty of frameworks to choose from when developing language technology. For low-resource languages, either no frameworks exist, as is the case for the Amazigh language, or very few components are integrated into well-known, large frameworks. We present a comparative study of frameworks in order to clarify which ones can handle Arabic suitably, and report on best practices to be applied for low-resource languages.
In this project, a Reinforcement Learning (RL) agent is trained to obtain a safe policy, yielding a risk-averse agent that prioritizes avoiding worst-case scenarios.
The model leverages distributional RL (e.g. Deep Quantile Regression) and optimizes the Conditional Value at Risk (CVaR), thus providing the user with an adjustable level of risk aversion.
Many applications can benefit from this approach, especially in fields where worst-case scenarios are inadmissible, such as security or medicine.
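The role of CVaR as an adjustable risk level can be sketched as follows. The example assumes that a distributional RL model has already produced, for each action, a set of equally weighted quantile estimates of the return; the function names and the toy numbers are hypothetical, and this is a simplified illustration rather than the project's implementation.

```python
import numpy as np

def cvar_from_quantiles(quantile_values, alpha):
    """Approximate CVaR at level alpha from equally weighted return quantiles.

    CVaR_alpha is taken here as the mean of the worst alpha fraction of
    outcomes. alpha = 1.0 recovers the ordinary mean (risk-neutral);
    smaller alpha focuses on the worst cases (more risk-averse).
    """
    q = np.sort(np.asarray(quantile_values, dtype=float))
    k = max(1, int(np.ceil(alpha * q.size)))  # number of worst quantiles kept
    return q[:k].mean()

def risk_averse_action(quantiles_per_action, alpha):
    """Pick the action whose return distribution has the highest CVaR."""
    scores = [cvar_from_quantiles(q, alpha) for q in quantiles_per_action]
    return int(np.argmax(scores))

# Hypothetical quantile estimates for two actions.
quantiles = np.array([
    [-10.0, 4.0, 5.0, 6.0],  # action 0: higher mean, but a severe worst case
    [1.0, 1.0, 1.0, 1.0],    # action 1: lower mean, no bad outcomes
])

neutral_choice = risk_averse_action(quantiles, alpha=1.0)    # compares means
cautious_choice = risk_averse_action(quantiles, alpha=0.25)  # compares worst quarter
```

With these numbers the risk-neutral setting prefers action 0 and the cautious setting prefers action 1, which is the kind of behaviour change the adjustable risk level is meant to provide.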
Plastic is one of the most widespread types of debris contributing to marine pollution. It threatens food safety and quality, human health and coastal tourism, and contributes to climate change. Remote sensing has shown great effectiveness in locating this type of debris. By leveraging AI/ML and hydrodynamic ocean models, we will demonstrate how to detect, quantify and track plastic debris in the marine environment.
The problem of human motion (face and body) prediction and generation is at the core of many applications in computer vision and robotics, such as human-robot interaction, autonomous driving and computer graphics. In this talk I will present some of our recent achievements addressing these specific aspects: 1) generating videos of facial expressions given a neutral face image, 2) dynamic 3D expression generation from an expression label, 3) human motion prediction and generation of 3D skeletons. We model the temporal evolution of 3D human motion and facial expression as a trajectory, which allows us to map human motions to single points on a sphere manifold. We propose a manifold-aware Wasserstein generative adversarial model that captures the temporal and spatial dependencies of facial expression and human motion through different losses. Our solutions score best on diverse benchmarks.
How do we compare hypotheses that are entirely consistent with observations? The marginal likelihood (aka Bayesian evidence), which represents the probability of generating our observations from a prior, provides a distinctive approach to this foundational question, automatically encoding Occam’s razor. Although it has been observed that the marginal likelihood can overfit and is sensitive to prior assumptions, its limitations for hyperparameter learning and discrete model comparison have not been thoroughly investigated. We revisit the appealing properties of the marginal likelihood for learning constraints and hypothesis testing. Then, we highlight the conceptual and practical issues in using the marginal likelihood as a proxy for generalization. Namely, we show how the marginal likelihood can be negatively correlated with generalization, with implications for neural architecture search, and can lead to both underfitting and overfitting in hyperparameter learning. We provide a partial remedy through a conditional marginal likelihood, which we show is more aligned with generalization, and practically valuable for large-scale hyperparameter learning, such as in deep kernel learning.
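For readers who want the quantities written out, the marginal likelihood described above can be stated as follows. The notation is an assumption introduced here: data $\mathcal{D} = \{\mathcal{D}_1, \dots, \mathcal{D}_n\}$, model $\mathcal{M}$ and parameters $\theta$. The last line is one plausible way to write a conditional marginal likelihood, in which the first $m$ points are conditioned on rather than scored; the exact form used in the talk is not given in the abstract.

```latex
% Marginal likelihood: the likelihood averaged over the prior
p(\mathcal{D} \mid \mathcal{M})
  = \int p(\mathcal{D} \mid \theta, \mathcal{M})\, p(\theta \mid \mathcal{M})\, d\theta

% Chain-rule decomposition over the data points
\log p(\mathcal{D} \mid \mathcal{M})
  = \sum_{i=1}^{n} \log p(\mathcal{D}_i \mid \mathcal{D}_{<i}, \mathcal{M})

% Conditional marginal likelihood (assumed form): drop the first m terms
\log p(\mathcal{D}_{m+1:n} \mid \mathcal{D}_{1:m}, \mathcal{M})
  = \sum_{i=m+1}^{n} \log p(\mathcal{D}_i \mid \mathcal{D}_{<i}, \mathcal{M})
```

The decomposition shows why the two can disagree: the first terms of the sum score predictions made from the prior alone or from very little data, whereas the conditional version only scores predictions made after some data has been seen.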