Astra Zeneca Lung Tissue Data

Description: Lung tissue images containing the following channels: a nuclei stain, autofluorescence, and two fluorescent stains of different drugs. The tissue images are large, around 23,000 × 35,000 px. The dataset includes 2318 labelled ROI images and 40 annotated examples for segmentation.

Provider: Johan Karlsson at Astra Zeneca

Responsible person: Håkan Wieslander

Publications: Wieslander et al. Deep learning and conformal prediction for hierarchical analysis of large-scale whole slide tissue images. Under Review

Vironova Video Stream datasets

Description: Image frames captured while moving the microscope over a sample. For each field of view, corresponding high-resolution frames are acquired. Various areas were imaged; each area contains about 100 motion-degraded frames of size 1024 × 1024, with 40-70 corresponding high-resolution frames of size 2048 × 2048.

Provider: Ida-Maria Sintorn at Vironova

Responsible person: Håkan Wieslander

Publications: Wieslander et al. TEM Image Restoration From Fast Image Stream. (Accepted for poster presentation at Swedish Symposium for Deep Learning 2020) 

Astra Zeneca lipid-nanoparticle (LNP) drug delivery dataset

Provider: Alan Sabirsh at Astra Zeneca

Description: Images (each 2554 × 2154 pixels, with a pixel resolution of 0.1625 µm/pixel) were acquired every ten minutes for twelve hours, for nine wells and two fields of view. Images included four channels: brightfield; a cell counterstain (shown in purple above); LNP (yellow); and GFP (green). Three LNP doses were added (in triplicate) to the wells: 0, 31.6 and 316 ng mRNA/well (25 µL). If the drug is successfully taken up by the cell, GFP expression occurs.
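The acquisition figures above imply a physical field-of-view size and a number of sampling intervals. A minimal sketch of that arithmetic follows. It assumes the two fields of view are per well (the description does not say so explicitly), and all names are illustrative only. Whether the series has 72 or 73 time points depends on whether a frame at time zero is counted, so only the number of intervals is computed.

```python
# Acquisition parameters as stated in the dataset description.
WIDTH_PX, HEIGHT_PX = 2554, 2154
UM_PER_PX = 0.1625          # pixel resolution in micrometres per pixel
INTERVAL_MIN = 10           # one acquisition every ten minutes
DURATION_H = 12             # total imaging time in hours
N_WELLS, N_FOV = 9, 2       # assumption: two fields of view per well


def field_of_view_um(width_px, height_px, um_per_px):
    """Return the physical (width, height) of one image in micrometres."""
    return width_px * um_per_px, height_px * um_per_px


def n_intervals(duration_h, interval_min):
    """Return the number of whole sampling intervals in the time series."""
    return duration_h * 60 // interval_min


fov_w_um, fov_h_um = field_of_view_um(WIDTH_PX, HEIGHT_PX, UM_PER_PX)
intervals = n_intervals(DURATION_H, INTERVAL_MIN)
positions = N_WELLS * N_FOV  # imaged positions, under the per-well assumption

print(f"Field of view: {fov_w_um:.1f} x {fov_h_um:.1f} um")
print(f"Sampling intervals: {intervals}")
print(f"Imaged positions: {positions}")
```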

Responsible person: Phil Harrison

Publications: Deep learning models for lipid-nanoparticle-based drug delivery.

Provider: Spjuth research group

Description: The image data provided here is for U2OS cells treated with compounds belonging to ten MoA classes (MoAs that we believed would be reasonably separable and that had a sufficient number of compounds (n) associated with them in our assay). The ten MoAs were: ATPase inhibitors (ATPase-i, n = 18); Aurora kinase inhibitors (AuroraK-i, n = 20); HDAC inhibitors (HDAC-i, n = 33); HSP inhibitors (HSP-i, n = 24); JAK inhibitors (JAK-i, n = 21); PARP inhibitors (PARP-i, n = 21); protein synthesis inhibitors (Prot.Synth.-i, n = 23); retinoid receptor agonists (Ret.Rec.Ag, n = 19); topoisomerase inhibitors (Topo.-i, n = 32); and tubulin polymerization inhibitors (Tub.Pol.-i, n = 20). The compounds were administered at a dose of 10 µM with an exposure time of 48 h, in 384-well plates. Each compound-level experiment was replicated six times. The compounds were distributed across 18 microplates. Images (16-bit, 2160 × 2160 pixels) were captured with a 20× objective at five sites (fields of view) in each well, with five fluorescence channels for the Cell Painting fluorescence (FL) data and six evenly spaced z-planes for the brightfield (BF) data.

Dataset size: 76,032 images, 590 GB
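The per-class compound counts and the acquisition layout in the description can be tallied directly. The sketch below is a rough sanity check, not the providers' own accounting: it assumes one image file per fluorescence channel or brightfield z-plane at each site, and it counts compound-treated wells only. The nominal total it yields is close to, but not identical to, the listed dataset size, and the sketch does not attempt to explain the difference.

```python
# Number of compounds per MoA class, as listed in the description.
MOA_COMPOUNDS = {
    "ATPase-i": 18,
    "AuroraK-i": 20,
    "HDAC-i": 33,
    "HSP-i": 24,
    "JAK-i": 21,
    "PARP-i": 21,
    "Prot.Synth.-i": 23,
    "Ret.Rec.Ag": 19,
    "Topo.-i": 32,
    "Tub.Pol.-i": 20,
}
REPLICATES = 6      # replicates per compound
SITES = 5           # fields of view per well
FL_CHANNELS = 5     # Cell Painting fluorescence channels
BF_PLANES = 6       # brightfield z-planes

n_compounds = sum(MOA_COMPOUNDS.values())
compound_wells = n_compounds * REPLICATES
# Assumption: each channel or z-plane is stored as a separate image.
images_per_site = FL_CHANNELS + BF_PLANES
nominal_images = compound_wells * SITES * images_per_site

print(f"Compounds: {n_compounds}")
print(f"Compound-treated wells: {compound_wells}")
print(f"Nominal image count: {nominal_images}")
```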

Responsible person: Phil Harrison

Link to data on FigShare with additional descriptions:


Harrison PJ, Gupta A, Rietdijk J, Wieslander H, Carreras-Puigvert J, Georgiev P, Wählby C, Spjuth O, Sintorn IM.
Evaluating the utility of brightfield image data for mechanism of action prediction
PLOS Computational Biology 19, 7, e1011323 (2023). DOI: 10.1371/journal.pcbi.1011323

Tian G, Harrison PJ, Sreenivasan AP, Carreras-Puigvert J, Spjuth O
Combining molecular and cell painting image data for mechanism of action prediction
Artificial Intelligence in Life Science 3, 100060 (2023). DOI: 10.1016/j.ailsci.2023.100060