Mapping Chronic Pain Hotspots

This project aims to map and characterize “hot spots” of chronic pain using self-reported body-map data. The dataset includes about 200 community participants who meet chronic-pain inclusion criteria. Each participant marks the painful segments among 74 body segments and rates the intensity of each marked segment on a 0–10 scale; unmarked segments are also recorded as 0.

Goal

1) To characterize "hot spots" of pain within patterns of pain distribution and intensity (self-reported on body maps). 2) To identify clusters of people with chronic pain who share similar patterns of pain "hot spots" and/or similar distributions of pain intensity across the body. 3) To examine whether these clusters are meaningful with respect to demographic, clinical, and psychological measures, controlling for the known effect that a greater number of painful body regions is associated with worse pain outcomes.
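As an illustration only, the body-map data described above could be held as one 74-value intensity vector per participant. The sketch below summarizes each segment across participants and compares two participants by the shape of their pain distribution rather than its overall extent. The function names, the rule that a segment counts as painful when its intensity is above 0, and the choice of an L1 distance on sum-normalized profiles are all assumptions made for this example, not the project's actual method.

```python
from statistics import mean

N_SEGMENTS = 74  # number of body-map segments, as stated in the project description


def segment_summary(maps):
    """Per-segment prevalence and mean intensity among participants who marked it.

    Assumption: a segment counts as painful when its intensity is > 0.
    """
    n = len(maps)
    prevalence, mean_intensity = [], []
    for seg in range(N_SEGMENTS):
        marked = [m[seg] for m in maps if m[seg] > 0]
        prevalence.append(len(marked) / n)
        mean_intensity.append(mean(marked) if marked else 0.0)
    return prevalence, mean_intensity


def normalized_profile(body_map):
    """Scale intensities to sum to 1 so profiles reflect pattern, not total burden."""
    total = sum(body_map)
    return [v / total for v in body_map] if total else list(body_map)


def profile_distance(a, b):
    """L1 distance between normalized profiles (0 = same shape, 2 = no overlap)."""
    pa, pb = normalized_profile(a), normalized_profile(b)
    return sum(abs(x - y) for x, y in zip(pa, pb))


def make_map(marks):
    """Build a 74-value map from a {segment_index: intensity} dict (toy helper)."""
    m = [0] * N_SEGMENTS
    for seg, intensity in marks.items():
        m[seg] = intensity
    return m


# Three hypothetical participants; segment indices are arbitrary.
p1 = make_map({10: 8, 11: 4})
p2 = make_map({10: 4, 11: 2})   # same pattern as p1 at half the intensity
p3 = make_map({40: 6})          # pain in a different segment
prevalence, mean_intensity = segment_summary([p1, p2, p3])
```

Here `profile_distance(p1, p2)` is 0 because the two maps differ only in overall intensity, while `profile_distance(p1, p3)` takes the maximum value of 2. A distance of this kind could feed a standard clustering routine; how best to control for the number of painful regions remains an open design choice.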

Researchers

What we did

In progress.

Results

Forthcoming.

Geographic Imbalance in the Distribution of Byzantine Coins

This project analyses large-scale coin data to reconstruct how Byzantine mints supplied different regions across time, while explicitly accounting for biases in archaeological discovery and recording. Using the FLAME database and focused pilot datasets, we combine maps by time period, grid-based visualizations, and statistical models to highlight meaningful changes in coin flows (for example, shifts in which mint supplies a region).

Goal

We want to (1) characterize how different mints distributed coins across regions and over time, (2) identify spatial patterns (which mints supplied which places) and temporal changes in those patterns, and (3) understand and document biases in the data so the analysis focuses on questions that can be answered with the available records.
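A minimal sketch of the grid-based view mentioned above: coin finds are binned into latitude/longitude cells per period, and each mint's share of the coins in a cell is computed so that shifts in supply between periods become visible. The record fields (`mint`, `lat`, `lon`, `period`, `count`), the one-degree cell size, and the toy numbers are hypothetical and do not reflect the FLAME schema or real finds. The shares are raw proportions and do not correct for discovery or recording bias.

```python
import math
from collections import Counter, defaultdict


def grid_cell(lat, lon, cell_deg=1.0):
    """Map a coordinate to the integer index of its grid cell."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def mint_shares(finds, cell_deg=1.0):
    """Return {(period, cell): {mint: share of coins}} from find records."""
    counts = defaultdict(Counter)
    for f in finds:
        key = (f["period"], grid_cell(f["lat"], f["lon"], cell_deg))
        counts[key][f["mint"]] += f["count"]
    shares = {}
    for key, per_mint in counts.items():
        total = sum(per_mint.values())
        shares[key] = {mint: n / total for mint, n in per_mint.items()}
    return shares


# Toy records (invented numbers) for one grid cell in two periods.
finds = [
    {"mint": "Constantinople", "lat": 41.2, "lon": 28.9, "period": "A", "count": 30},
    {"mint": "Antioch", "lat": 41.4, "lon": 28.3, "period": "A", "count": 10},
    {"mint": "Constantinople", "lat": 41.2, "lon": 28.9, "period": "B", "count": 5},
    {"mint": "Antioch", "lat": 41.4, "lon": 28.3, "period": "B", "count": 15},
]
shares = mint_shares(finds)
```

In this toy example the dominant mint in cell `(41, 28)` flips between periods A and B, which is the kind of change the maps and models are meant to surface.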

Researchers

What we did

In progress.

Results

Forthcoming.

LAMPP – Live Assessment of Metagenomics-based tools for host Phenotype Prediction

The human gut microbiome contains DNA that can help predict host traits such as disease status, but tools are usually tested on different data, making fair comparison difficult. LAMPP is a standardized and comprehensive benchmark designed to evaluate methods for predicting host phenotypes from gut metagenomic data. It offers a diverse suite of binary classification tasks, each comprising a labeled training set and a test set with hidden labels. LAMPP provides an open and fair platform for comparing predictive methods, with the goal of advancing the use of metagenomic data for disease diagnosis and monitoring. We encourage the development of innovative methods that not only advance state-of-the-art performance but also prioritize ease of use, ensuring that cutting-edge tools remain accessible to the broader research community. At the same time, we invite users to explore emerging tools that may better meet their needs. LAMPP is publicly available for ongoing benchmarking at LAMPP.

Goal

Provide a unified, continuously updated framework for evaluating methods that predict host phenotypes from gut metagenomes, so researchers can identify approaches that generalize across realistic, real-world scenarios.

Researchers

What we did

We built a web application that hosts prediction tasks, accepts submissions, computes standardized metrics, and shows a live leaderboard. We curated diverse training and test sets from public cohorts (tasks include CRC, IBD, Delivery Mode — western & non-western tests, General Health Status, Schizophrenia) and designed tasks to reflect common real-world challenges: varying dataset sizes, imbalanced classes, longitudinal sampling, batch effects, and cross-cohort differences. We provided baseline workflows and reference implementations so new methods can be compared to standard pipelines.
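To make the submission-and-leaderboard flow concrete, here is a small sketch that scores predicted probabilities against hidden test labels and ranks submissions. The text above does not say which metrics LAMPP computes; AUROC is used here as an assumption because it is common for binary classification with imbalanced classes. The function names and submission format are illustrative, not the platform's API.

```python
def auroc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) formulation.

    labels: 0/1 ground truth; scores: higher means more likely positive.
    Tied scores receive their average rank.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    n_pos = sum(label for _, label in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative label")
    pos_rank_sum = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def leaderboard(hidden_labels, submissions):
    """Rank submissions ({name: scores}) by AUROC, best first."""
    scored = [(name, auroc(hidden_labels, s)) for name, s in submissions.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy task with four hidden test labels and two hypothetical submissions.
hidden = [0, 0, 1, 1]
submissions = {
    "method_a": [0.1, 0.4, 0.35, 0.8],
    "method_b": [0.2, 0.1, 0.9, 0.7],
}
board = leaderboard(hidden, submissions)
```

Keeping the labels on the server and returning only the metric is what lets the test set stay hidden while still allowing direct comparison across methods.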

Results

Systematic evaluations using LAMPP show that microbiome-based phenotype prediction remains challenging. In many cases, classic machine-learning methods (e.g., Random Forest) perform competitively with more complex approaches while being simpler to run and reproduce. LAMPP highlights current limitations and creates a stable environment for developing and testing improved, more practical methods.

Anomaly detection using set representations and density estimations

Anomaly detection aims to automatically identify samples that exhibit unexpected behavior. We tackle the challenging task of detecting anomalies that consist of an unusual combination of normal elements ("logical anomalies"). For example, consider the case where normal images contain two screws and two nuts, but anomalous images may contain one screw and three nuts. Because the individual elements (nuts or screws) also occur in normal images, previous anomaly detection methods relying on anomalous patches would not succeed. Instead, a more holistic understanding of the image is required. We propose to detect logical anomalies using set representations, scoring anomalies by density estimation on the set of representations of local elements. Our simple-to-implement approach outperforms the state of the art in image-level logical anomaly detection and sequence-level time series anomaly detection. You can check out the preprint at: https://arxiv.org/pdf/2302.12245.pdf

Goal

Set Features for Fine-grained Anomaly Detection

Researchers

What we did

Fine-grained anomaly detection has recently been dominated by segmentation-based approaches. These approaches first classify each element of the sample (e.g., an image patch) as normal or anomalous and then classify the entire sample as anomalous if it contains anomalous elements. However, such approaches do not extend to scenarios where the anomalies are expressed by an unusual combination of normal elements. We overcome this limitation by proposing set features that model each sample by the distribution of its elements. We compute the anomaly score of each sample using a simple density estimation method. Our simple-to-implement approach outperforms the state of the art in image-level logical anomaly detection (+3.4%) and sequence-level time series anomaly detection (+2.4%).
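The idea can be illustrated with a toy version of the screws-and-nuts example. Each sample is reduced to a histogram over its element types (a very simple set feature), and its anomaly score is the distance to the nearest normal training sample, a crude nearest-neighbour stand-in for density estimation. This is a sketch under those assumptions, not the paper's implementation; real elements would be feature vectors of patches or time windows rather than string labels, and `ELEMENT_TYPES`, `set_feature`, and `anomaly_score` are names invented for this example.

```python
import math
from collections import Counter

ELEMENT_TYPES = ("screw", "nut")  # toy vocabulary of element types


def set_feature(elements):
    """Describe a sample by the counts of its element types (order is ignored)."""
    counts = Counter(elements)
    return [counts[t] for t in ELEMENT_TYPES]


def anomaly_score(feature, normal_features):
    """Distance to the nearest normal sample; larger means lower estimated density."""
    return min(math.dist(feature, f) for f in normal_features)


# Normal training samples all contain two screws and two nuts.
train = [
    ["screw", "nut", "screw", "nut"],
    ["nut", "nut", "screw", "screw"],
]
normal_features = [set_feature(s) for s in train]

normal_test = ["screw", "screw", "nut", "nut"]
logical_anomaly = ["screw", "nut", "nut", "nut"]  # every element is normal on its own

score_normal = anomaly_score(set_feature(normal_test), normal_features)
score_anomaly = anomaly_score(set_feature(logical_anomaly), normal_features)
```

A per-element detector would see nothing unusual in `logical_anomaly`, since each screw and nut looks normal. The set-level histogram differs from the training histograms, so `score_anomaly` is positive while `score_normal` is zero.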

Results

Preprint: https://arxiv.org/pdf/2302.12245.pdf

Infer, filter and enhance topological signals in single-cell data using spectral template matching

Single-cell RNA sequencing is a powerful technology that allows researchers to analyze gene expression in individual cells, providing insights into cellular processes and functions. However, analyzing this data can be challenging, as cells can simultaneously encode multiple, potentially cross-interfering, biological signals. A new computational method, scPrisma, was developed to address this challenge. scPrisma has the ability to uncover cellular spatiotemporal context and has the potential to drive further insights into cellular processes and functions, ultimately advancing our understanding of biology. You can check out the published article at: https://www.nature.com/articles/s41587-023-01663-5

Goal

Infer, filter and enhance topological signals in single-cell data using spectral template matching

Researchers

What we did

We apply scPrisma to the analysis of the cell cycle in HeLa cells, circadian rhythm and spatial zonation in liver lobules, diurnal cycle in Chlamydomonas and circadian rhythm in the suprachiasmatic nucleus in the brain. scPrisma can be used to distinguish mixed cellular populations by specific characteristics such as cell type and uncover regulatory networks and cell–cell interactions specific to predefined biological signals, such as the circadian rhythm. We show scPrisma’s flexibility in incorporating prior knowledge, inference of topologically informative genes and generalization to additional diverse templates and systems. scPrisma can be used as a stand-alone workflow for signal analysis and as a prior step for downstream single-cell analysis.

Results

Published article: https://www.nature.com/articles/s41587-023-01663-5