Walkability of Vulnerable Populations

This project develops person-centered models of walkability for vulnerable urban populations. The research combines a large-scale survey (about 1,000 participants) with spatial mapping of neighborhood characteristics and individual-level attributes, and aims to apply AI methods to predict the level of walkability that a specific person is likely to experience in a given neighborhood. Results will be cross-referenced with existing activity datasets to validate and refine predictions.

Goal

The project aims to build personalized walkability models that estimate, for each individual, the likelihood that they will walk rather than use other transport modes under specific environmental conditions. To do this, it maps spatial neighborhood characteristics alongside individual-level features and applies artificial-intelligence methods to predict a person's walkability level. A comprehensive survey of roughly 1,000 participants will be cross-referenced with existing activity datasets to evaluate and refine the predictive models and to identify the environmental and personal drivers of walkability for vulnerable populations.
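The kind of person-conditioned prediction described above can be illustrated with a minimal sketch. The feature names and weights below are purely hypothetical placeholders, not the project's actual model; the point is only that individual-level and neighborhood-level features are combined into a single walking probability.

```python
import math

def walk_probability(person, neighborhood, weights, bias):
    """Toy logistic model: combine individual and neighborhood features into P(walk)."""
    features = {**person, **neighborhood}
    z = bias + sum(w * features[name] for name, w in weights.items())
    return 1.0 / (1.0 + math.exp(-z))

# All feature names and weights are illustrative assumptions.
weights = {"age": -0.02, "mobility_limitation": -1.0,
           "sidewalk_quality": 0.8, "intersection_density": 0.5}
person = {"age": 70, "mobility_limitation": 1}
neighborhood = {"sidewalk_quality": 0.9, "intersection_density": 0.6}
p = walk_probability(person, neighborhood, weights, bias=1.0)
```

Under such a model, the same person gets a different predicted walking probability in each neighborhood, which is the core of the person-centered framing.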

Researchers

What we did

In progress.

Results

Forthcoming.

PHLAME – PHenotype prediction Live Assessment from Metagenomic sequencing data

The human gut microbiome contains DNA that can help predict host traits such as disease status, but prediction tools are usually evaluated on different datasets, making fair comparison difficult. PHLAME is a live benchmarking platform that runs standardized phenotype-prediction tasks on the same gut-metagenome datasets, computes uniform scores, and displays a public leaderboard so researchers can compare methods directly. The platform includes real-world tasks (colorectal cancer, IBD, delivery mode, general health, schizophrenia) and promotes reproducibility and practical tool development. PHLAME is publicly available for ongoing benchmarking at PHLAME.

Goal

Provide a unified, continuously updated framework for evaluating methods that predict host phenotypes from gut metagenomes, so researchers can identify approaches that generalize across realistic, real-world scenarios.

Researchers

What we did

We built a web application that hosts prediction tasks, accepts submissions, computes standardized metrics, and shows a live leaderboard. We curated diverse training and test sets from public cohorts (tasks include CRC, IBD, Delivery Mode — western & non-western tests, General Health Status, Schizophrenia) and designed tasks to reflect common real-world challenges: varying dataset sizes, imbalanced classes, longitudinal sampling, batch effects, and cross-cohort differences. We provided baseline workflows and reference implementations so new methods can be compared to standard pipelines.
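A leaderboard of the kind described above needs every submission scored with the same metric. The source does not specify which metrics PHLAME computes, but a standard choice for binary phenotype tasks is AUROC; a minimal, dependency-free sketch of the rank-based formulation:

```python
def auroc(labels, scores):
    """Rank-based AUROC: the probability that a randomly chosen positive
    sample receives a higher score than a randomly chosen negative one
    (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: two cases (label 1) scored above two controls (label 0).
score = auroc([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1])
```

Computing the metric in one shared place, rather than trusting each submission's self-reported numbers, is what makes the leaderboard comparisons fair.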

Results

Systematic evaluations using PHLAME show that microbiome-based phenotype prediction remains challenging. In many cases, classic machine-learning methods (e.g., Random Forest) perform competitively with more complex approaches while being simpler to run and reproduce. PHLAME highlights current limitations and creates a stable environment for developing and testing improved, more practical methods.

Anomaly detection using set representations and density estimations

Anomaly detection aims to automatically identify samples that exhibit unexpected behavior. We tackle the challenging task of detecting anomalies that consist of an unusual combination of normal elements (`logical anomalies`). For example, consider the case where normal images contain two screws and two nuts, but anomalous images may contain one screw and three nuts. Because the individual elements (nuts or screws) also occur in normal images, previous anomaly detection methods that rely on detecting anomalous patches would not succeed; a more holistic understanding of the image is required. We propose to detect logical anomalies using set representations, scoring anomalies with density estimation on the set of representations of local elements. Our simple-to-implement approach outperforms the state-of-the-art in image-level logical anomaly detection and sequence-level time series anomaly detection. You can check out the preprint at: https://arxiv.org/pdf/2302.12245.pdf

Goal

Set Features for Fine-grained Anomaly Detection

Researchers

What we did

Fine-grained anomaly detection has recently been dominated by segmentation-based approaches. These approaches first classify each element of the sample (e.g., image patch) as normal or anomalous and then classify the entire sample as anomalous if it contains anomalous elements. However, such approaches do not extend to scenarios where the anomalies are expressed by an unusual combination of normal elements. We overcome this limitation by proposing set features that model each sample by the distribution of its elements. We compute the anomaly score of each sample using a simple density estimation method. Our simple-to-implement approach outperforms the state-of-the-art in image-level logical anomaly detection (+3.4%) and sequence-level time series anomaly detection (+2.4%).
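The screws-and-nuts example makes the idea concrete. The sketch below is an assumption-laden toy, not the paper's implementation: each sample's "set feature" is simply a count histogram over element types, a regularized Gaussian is fitted to the features of normal samples, and the anomaly score is the squared Mahalanobis distance under that density.

```python
import numpy as np

TYPES = ["screw", "nut"]  # illustrative element vocabulary

def set_feature(elements):
    """Count histogram over element types: the sample's set representation."""
    return np.array([elements.count(t) for t in TYPES], dtype=float)

def fit_gaussian(samples):
    """Fit a (regularized) Gaussian density to the set features of normal samples."""
    X = np.stack([set_feature(s) for s in samples])
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False) + 1e-3 * np.eye(X.shape[1])
    return mu, np.linalg.inv(cov)

def anomaly_score(elements, mu, prec):
    """Squared Mahalanobis distance of a sample's set feature under the density."""
    d = set_feature(elements) - mu
    return float(d @ prec @ d)

normal = [["screw", "screw", "nut", "nut"]] * 5      # two screws + two nuts each
mu, prec = fit_gaussian(normal)
s_normal = anomaly_score(["screw", "screw", "nut", "nut"], mu, prec)
s_logical = anomaly_score(["screw", "nut", "nut", "nut"], mu, prec)  # 1 screw, 3 nuts
```

Note that no individual patch of the anomalous sample is unusual; only the distribution of elements is, which is exactly what the set-level density captures.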

Results

Preprint: https://arxiv.org/pdf/2302.12245.pdf

What do AI Models Know? A Case Study on Visual Question Answering

A major challenge in recent AI literature is understanding why state-of-the-art deep learning models show great success on a range of datasets while they severely degrade in performance when presented with examples that vary only slightly from their training distribution. In this proposal, we will examine this question in the context of visual question answering, a challenging task that requires models to jointly reason over images and text. We will start our exploration with the GQA dataset, which, along with images and text, also includes a rich semantic scene graph representing the spatial relations between objects in the image, and thus lends itself to probing through high-quality automatic manipulation. In preliminary work we have augmented GQA with examples that vary slightly from the original questions, and shown that here, too, high-performing models perform much worse on the augmented questions than on the original ones. Our proposal will analyze our results, exploring the reasons for the drop in performance and what makes our new questions more challenging. We also plan to generalize the reasons we find to other visual question answering datasets, and more broadly to other AI datasets. Given those insights, we will augment the training set with instances that capture model “blind spots”, in an attempt to improve the model’s generalization ability. Our results will improve our understanding of what state-of-the-art AI models know, what they are still missing, and how we can improve them based on this new understanding.
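The degradation described above comes down to comparing exact-match accuracy on the original questions against the augmented variants. A minimal sketch with made-up toy answers (none of these values come from the actual experiments):

```python
def accuracy(preds, golds):
    """Fraction of exact-match answers."""
    return sum(p == g for p, g in zip(preds, golds)) / len(golds)

# Hypothetical toy outputs: the same model answering original GQA questions
# and slightly varied (augmented) versions of them.
orig_preds, orig_golds = ["yes", "red", "two"], ["yes", "red", "two"]
aug_preds, aug_golds = ["no", "red", "two"], ["yes", "blue", "two"]
drop = accuracy(orig_preds, orig_golds) - accuracy(aug_preds, aug_golds)
```

A consistently positive drop across models is the signal that the augmented questions probe something the models have not actually learned, rather than mere noise.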

Goal

To achieve a better understanding of the limitations of current state-of-the-art models and datasets, as well as ways to improve them.

Researchers

What we did

In progress.

Results

Forthcoming.