

Large-scale image sets acquired by automated microscopy of perturbed samples enable a detailed comparison of cell states induced by each perturbation, such as a small molecule from a diverse library. Highly multiplexed measurements of cellular morphology can be extracted from each image and subsequently mined for a number of applications. This microscopy data set includes 919,265 five-channel fields of view representing 30,616 tested compounds, available at The Cell Image Library repository. It also includes data files containing morphological features derived from each cell in each image, both at the single-cell level and at the population-averaged (i.e., per-well) level; the image analysis workflows that generated the morphological features are also provided. Quality-control metrics are provided as metadata, indicating fields of view that are out of focus or that contain highly fluorescent material or debris. Lastly, chemical annotations are supplied for the compound treatments. Because computational algorithms and methods for handling single-cell morphological measurements are not yet routine, the dataset serves as a useful resource for the wider scientific community applying morphological (image-based) profiling. The data set can be mined for many purposes, including small-molecule library enrichment and chemical mechanism-of-action studies, such as target identification. Integration with genetically-perturbed datasets could enable identification of small-molecule mimetics of particular disease- or gene-related phenotypes that could be useful as probes or potential starting points for development of future therapeutics.

The current version of the dataset was generated using updated CellProfiler pipelines to improve the quality of cell and nucleus segmentations. To evaluate the quality of segmentations, 30 wells were randomly sampled across seven plate maps from the bioactive compound collection, and one site was randomly sampled per well, producing a test set of 210 five-channel images. Based on this set, two expert CellProfiler users produced an improved segmentation pipeline that was used to reprocess all 406 plates. This version of the dataset contains seven fewer plates than the previous version (406 compared to 413). The updated pipeline also produces more measurements per cell (n=1783) than the previous version.
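
The test-set sampling used for the segmentation evaluation can be sketched as follows. This is a minimal illustration, not the authors' procedure: it reads the description as 30 wells drawn from each of the seven plate maps (the interpretation consistent with the stated total of 210 images), and the plate-map names, the 384-well layout, the number of sites per well, and the function name `sample_test_set` are all assumptions made for the example.

```python
import random


def sample_test_set(plate_maps, wells_per_map=30, sites_per_well=1,
                    n_sites=6, seed=0):
    """Randomly pick wells per plate map, then sites per picked well.

    plate_maps: dict mapping a plate-map name to its list of well IDs.
    n_sites:    assumed number of imaged sites per well (hypothetical).
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    picks = []
    for plate_map, wells in plate_maps.items():
        # Sample wells without replacement within this plate map.
        for well in rng.sample(wells, wells_per_map):
            # Sample site indices (1-based) without replacement.
            for site in rng.sample(range(1, n_sites + 1), sites_per_well):
                picks.append((plate_map, well, site))
    return picks


# Hypothetical layout: seven plate maps, each with 384 wells (A01..P24).
wells = [f"{row}{col:02d}" for row in "ABCDEFGHIJKLMNOP" for col in range(1, 25)]
plate_maps = {f"platemap_{i}": wells for i in range(1, 8)}

test_set = sample_test_set(plate_maps)
print(len(test_set))  # 7 plate maps x 30 wells x 1 site = 210
```

Each returned tuple identifies one five-channel field of view to inspect; fixing the seed keeps the evaluation set stable across runs.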
