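Big Data Challenges
The scale of connectomics data will continue to grow, posing challenges for data storage and access, management, and transfer.
Connectomics projects at the terabyte [1,2] and petabyte [3] scale already exist. Data throughput of this magnitude poses a severe test for connectomics image processing [4]. Several solutions address the challenges of large-scale data. For data management, DVID (Distributed Versioned Image-Oriented Dataservice) [5] and Boss (Brain Observatory Storage Service) [6] are commonly used; both support cloud storage of large-scale data and efficient access to it. For image reconstruction, ASAP (Assembly Stitching and Alignment Pipeline) [7] and SEAMLeSS (Siamese Encoding and Alignment by Multiscale Learning with Self Supervision) [8] both reconstruct images from very large datasets through parallel computation on distributed clusters. For data reconstruction, web-based annotation tools such as webKnossos [9] and CATMAID [10] support collaborative annotation of large-scale data by multiple users. However, deploying and maintaining all of these solutions requires additional specialized technical staff.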
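To give a sense of how image volumes reach these scales, the following is a back-of-the-envelope sketch in Python. The imaging parameters (a 1 mm cube, 4 x 4 x 40 nm voxels, one byte per voxel) are illustrative assumptions rather than figures taken from the cited projects, and the helper names are hypothetical.

```python
def raw_volume_bytes(size_um, voxel_nm, bytes_per_voxel=1):
    """Estimate the uncompressed size of an image volume in bytes.

    size_um:  physical extent (x, y, z) in micrometres.
    voxel_nm: voxel dimensions (x, y, z) in nanometres.
    """
    n_voxels = 1
    for extent_um, step_nm in zip(size_um, voxel_nm):
        # Number of voxels along this axis (1 um = 1000 nm).
        n_voxels *= round(extent_um * 1000 / step_nm)
    return n_voxels * bytes_per_voxel


def human_readable(n_bytes):
    """Format a byte count using decimal (SI) units."""
    units = ["B", "kB", "MB", "GB", "TB", "PB", "EB"]
    value = float(n_bytes)
    for unit in units:
        if value < 1000 or unit == units[-1]:
            return f"{value:.2f} {unit}"
        value /= 1000


# Assumed example: a 1 mm cube imaged at 4 x 4 x 40 nm, 8-bit grayscale.
cube = raw_volume_bytes((1000, 1000, 1000), (4, 4, 40))
print(human_readable(cube))
```

Under these assumptions the raw image data alone falls in the petabyte range, before any derived data such as segmentations or annotations are stored.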
1. Nicholas L Turner, Thomas Macrina, J Alexander Bae, Runzhe Yang, Alyssa M Wilson, Casey Schneider-Mizell, Kisuk Lee, Ran Lu, Jingpeng Wu, Agnes L Bodor, et al. Reconstruction of neocortex: organelles, compartments, cells, circuits, and activity. Cell, 185(6):1082–1100, 2022.
2. Zhihao Zheng, J Scott Lauritzen, Eric Perlman, Camenzind G Robinson, Matthew Nichols, Daniel Milkie, Omar Torrens, John Price, Corey B Fisher, Nadiya Sharifi, et al. A complete electron microscopy volume of the brain of adult Drosophila melanogaster. Cell, 174(3):730–743, 2018.
3. J Alexander Bae, Mahaly Baptiste, Agnes L Bodor, Derrick Brittain, JoAnn Buchanan, Daniel J Bumbarger, Manuel A Castro, Brendan Celii, Erick Cobos, Forrest Collman, et al. Functional connectomics spanning multiple areas of mouse visual cortex. bioRxiv, 2021.
4. Jeff W Lichtman, Hanspeter Pfister, and Nir Shavit. The big data challenges of connectomics. Nature Neuroscience, 17(11):1448–1454, 2014.
5. William T Katz and Stephen M Plaza. DVID: distributed versioned image-oriented dataservice. Frontiers in Neural Circuits, page 5, 2019.
6. Robert Hider Jr, Dean Kleissas, Timothy Gion, Daniel Xenes, Jordan Matelsky, Derek Pryor, Luis Rodriguez, Erik C Johnson, William Gray-Roncal, and Brock Wester. The Brain Observatory Storage Service and Database (BossDB): a cloud-native approach for petascale neuroscience discovery. Frontiers in Neuroinformatics, 2022.
7. Gayathri Mahalingam, Russel Torres, Daniel Kapner, Eric T Trautman, Tim Fliss, Sharmishtaa Seshamani, Eric Perlman, Rob Young, Samuel Kinn, J Buchanan, et al. A scalable and modular automated pipeline for stitching of large electron microscopy datasets. bioRxiv, 2021.
8. Sergiy Popovych, Thomas Macrina, Nico Kemnitz, Manuel Castro, Barak Nehoran, Zhen Jia, J Alexander Bae, Eric Mitchell, Shang Mu, Eric T Trautman, et al. Petascale pipeline for precise alignment of images from serial section electron microscopy. bioRxiv, 2022.
9. Kevin M Boergens, Manuel Berning, Tom Bocklisch, Dominic Bräunlein, Florian Drawitsch, Johannes Frohnhofen, Tom Herold, Philipp Otto, Norman Rzepka, Thomas Werkmeister, et al. webKnossos: efficient online 3D data annotation for connectomics. Nature Methods, 14(7):691–694, 2017.
10. Stephan Saalfeld, Albert Cardona, Volker Hartenstein, and Pavel Tomančák. CATMAID: collaborative annotation toolkit for massive amounts of image data. Bioinformatics, 25(15):1984–1986, 2009.