Abstract |
Single-cell gene expression data with positional information is critical for dissecting the mechanisms and architectures of multicellular organisms, but its potential is limited by the scalability of current data analysis strategies. Here, we present scGCO, a method based on fast optimization of hidden Markov Random Fields with graph cuts to identify spatially variable genes. Compared with existing methods, scGCO delivers superior performance, with a lower false positive rate and improved specificity, and is more robust in the presence of noise. Critically, scGCO scales near-linearly with input size and achieves orders-of-magnitude better running time and memory usage than existing methods, and could represent a valuable solution when spatial transcriptomics data grows to millions of data points and beyond. Here the authors develop a highly scalable method, scGCO, to identify genes whose expression values form spatial patterns from spatial transcriptomics data. |