Northstar: leveraging cell atlases to identify healthy and neoplastic cells in transcriptomes from human tumors

Cell atlases are revolutionizing our understanding of tissue and disease heterogeneity, yet most single-cell transcriptomic analyses of tumors do not leverage atlases effectively. We developed northstar, a computational approach to classify cells in tumor datasets that is guided by, but not restricted to, previously annotated cell atlases. To benchmark northstar, we transferred annotations from a human brain atlas to a published glioblastoma dataset and recapitulated the tumor composition accurately and within seconds. We then collected 1,622 cells from 11 pancreatic tumors and robustly identified healthy pancreatic and immune cells, as well as neoplastic cell states. Three cell populations were shared across patients, while five were private to a single sample. northstar’s cell type classification offered rapid insight into the origins of neuroendocrine and exocrine tumors and fibromatosis. northstar is a useful tool for classifying single-cell transcriptomes into known and novel cell types in the age of cell atlases.
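The idea of a classification that is guided by an atlas yet still allows novel cell types can be illustrated with a toy sketch. This is not northstar's algorithm or API: it swaps in a much simpler nearest-average rule with a rejection threshold, and the function name `classify_with_atlas`, the `min_corr` cutoff, the cell type labels, and the expression values are all hypothetical, chosen only for illustration.

```python
import numpy as np


def standardize(profiles):
    """Center and scale each row so that row dot products equal Pearson correlations."""
    centered = profiles - profiles.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(centered, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for flat profiles
    return centered / norms


def classify_with_atlas(cells, atlas_averages, atlas_labels, min_corr=0.5):
    """Label each cell with its best-correlated atlas average, or 'novel'.

    Simplified stand-in (an assumption, not the published method): a cell whose
    best correlation falls below `min_corr` is left outside the atlas types.
    """
    cells = np.asarray(cells, dtype=float)
    atlas = np.asarray(atlas_averages, dtype=float)
    corr = standardize(cells) @ standardize(atlas).T  # shape: (n_cells, n_types)
    best = corr.argmax(axis=1)
    best_corr = corr[np.arange(cells.shape[0]), best]
    return [
        atlas_labels[i] if c >= min_corr else "novel"
        for i, c in zip(best, best_corr)
    ]


# Toy data: rows are expression profiles over four hypothetical genes.
atlas_averages = [[10, 0, 0, 1],   # hypothetical "alpha" average
                  [0, 10, 1, 0]]   # hypothetical "T cell" average
atlas_labels = ["alpha", "T cell"]
new_cells = [[9, 1, 0, 1],   # resembles the "alpha" average
             [0, 8, 2, 0],   # resembles the "T cell" average
             [1, 1, 9, 1]]   # resembles neither average

labels = classify_with_atlas(new_cells, atlas_averages, atlas_labels)
print(labels)
```

In this sketch the third cell correlates negatively with both atlas averages, so it is reported as novel instead of being forced into a known type, which is the behavior the abstract describes at a high level.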


bioRxiv preprint
Figure 3. Application of northstar to a new pancreatic cancer dataset. (A) t-SNE of 11 pancreatic tumors together with averages from two atlases by Baron et al. (2016) and Zanini et al. (2018) [18,19]. Stars: atlas averages. (B) Number of cells from each tumor and fractions of cells belonging to each cell type. (C) Top differentially expressed genes for each novel cluster (top) and some known markers for pancreatic cancers (bottom) [32]. Black indicates no expression; red to white indicates increasing expression levels. PTPRC (CD45) and IGKC are added to highlight the probable B cell phenotype of cluster 19. First bar: novel cluster (colored as in A and B). Second bar: patient of origin, with legend below.

Figure S4. Key marker genes for different pancreatic and blood cell types projected onto the t-SNE of Fig. 3A, colored by expression level (low: blue, intermediate: green, high: yellow). As in Fig. 3A, squares indicate atlas cell types and circles indicate new cells from pancreatic tumors.

