In late 2023, the first drug with potential to slow the progression of Alzheimer's disease was approved by the U.S. Food and Drug Administration. Alzheimer's is one of many debilitating neurological disorders that together affect one-eighth of the world's population, and while the new drug is a step in the right direction, there is still a long journey ahead to fully understanding Alzheimer's and other such diseases.
"Reconstructing the intricacies of how the human brain functions on a cellular level is one of the biggest challenges in neuroscience," says Lars Gjesteby, a technical staff member and algorithm developer from the MIT Lincoln Laboratory's Human Health and Performance Systems Group. "High-resolution, networked brain atlases can help improve our understanding of disorders by pinpointing differences between healthy and diseased brains. However, progress has been hindered by insufficient tools to visualize and process very large brain imaging datasets."
A networked brain atlas is, in essence, a detailed map of the brain that can help link structural information with neural function. To build such atlases, brain imaging data need to be processed and annotated. For example, each axon, or thin fiber connecting neurons, needs to be traced, measured, and labeled with information. Current methods of processing brain imaging data, such as desktop-based software or manual-oriented tools, are not yet designed to handle human brain-scale datasets. As a result, researchers often spend a lot of time slogging through an ocean of raw data.
Gjesteby is leading a project to build the Neuron Tracing and Active Learning Environment (NeuroTrALE), a software pipeline that brings machine learning, supercomputing, and ease of use and access to this brain mapping challenge. NeuroTrALE automates much of the data processing and displays the output in an interactive interface that allows researchers to edit and manipulate the data to mark, filter, and search for specific patterns.
Untangling a ball of yarn
One of NeuroTrALE's defining features is the machine-learning technique it employs, called active learning. NeuroTrALE's algorithms are trained to automatically label incoming data based on existing brain imaging data, but unfamiliar data can lead to errors. Active learning allows users to manually correct those errors, teaching the algorithm to improve the next time it encounters similar data. This mix of automation and manual labeling ensures accurate data processing with a much smaller burden on the user.
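To make the idea concrete, here is a minimal sketch of an active-learning loop in Python. It is not NeuroTrALE's code: the toy one-dimensional classifier, the `min_margin` threshold, the `ask_user` callback, and the sample values are all hypothetical, and the rule of asking the user only when the model is unsure is one common variant of active learning, assumed here for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class NearestCentroidLabeler:
    """Toy 1-D classifier (hypothetical): picks the class with the closest mean."""
    examples: dict = field(default_factory=dict)  # label -> list of values

    def add(self, value, label):
        self.examples.setdefault(label, []).append(value)

    def predict(self, value):
        """Return (label, margin); a small margin means the model is unsure."""
        centroids = {lab: sum(v) / len(v) for lab, v in self.examples.items()}
        ranked = sorted(centroids, key=lambda lab: abs(value - centroids[lab]))
        best, runner_up = ranked[0], ranked[1]
        margin = abs(value - centroids[runner_up]) - abs(value - centroids[best])
        return best, margin


def active_learning_pass(model, incoming, ask_user, min_margin=1.0):
    """Auto-label confident samples; send uncertain ones to the user."""
    labels, corrections = [], 0
    for value in incoming:
        label, margin = model.predict(value)
        if margin < min_margin:
            label = ask_user(value)   # manual label from the user
            model.add(value, label)   # the model learns from the correction
            corrections += 1
        labels.append(label)
    return labels, corrections


# Existing labeled data (made-up numbers standing in for image features).
model = NearestCentroidLabeler()
for v in (1.0, 2.0):
    model.add(v, "axon")
for v in (9.0, 10.0):
    model.add(v, "background")

# Simulated user who knows the true boundary (an assumption for the demo).
def ask_user(value):
    return "axon" if value < 6 else "background"

labels, corrections = active_learning_pass(model, [1.2, 5.4, 5.6, 9.8], ask_user)
print(labels, corrections)
```

In this toy run, the model is unsure about 5.4 and asks the user once. After that correction it labels the similar value 5.6 on its own, which is the behavior the paragraph above describes: a small amount of manual input reduces how often the user is needed later.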
"Imagine taking an X-ray of a ball of yarn. You'd see all these crisscrossed, overlapping lines," says Michael Snyder, from the laboratory's Homeland Decision Support Systems Group. "When two lines cross, does it mean one of the pieces of yarn is making a 90-degree bend, or is one going straight up and the other is going straight over? With NeuroTrALE's active learning, users can trace these strands of yarn one or two times and train the algorithm to follow them correctly moving forward. Without NeuroTrALE, the user would have to trace the ball of yarn, or in this case the axons of the human brain, every single time." Snyder is a software developer on the NeuroTrALE team along with staff member David Chavez.
Because NeuroTrALE takes the bulk of the labeling burden off the user, it allows researchers to process more data more quickly. Further, the axon tracing algorithms harness parallel computing to distribute computations across multiple GPUs at once, leading to even faster, scalable processing. Using NeuroTrALE, the team demonstrated a 90 percent decrease in the computing time needed to process 32 gigabytes of data compared with conventional AI methods.
The team also showed that a substantial increase in the volume of data does not translate to an equivalent increase in processing time. For example, in a recent study they demonstrated that a 10,000 percent increase in dataset size resulted in increases of only 9 percent and 22 percent in total data processing time on two different types of central processing units.
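The arithmetic behind that comparison can be worked through in a few lines of Python. This is only an illustration of the reported percentages, assuming a literal reading in which a 10,000 percent increase means 101 times the original dataset size; the study's absolute sizes and times are not given here, and the helper name `growth_factor` is hypothetical.

```python
def growth_factor(percent_increase):
    """Convert a percent increase into a multiplicative factor."""
    return 1 + percent_increase / 100


data_factor = growth_factor(10_000)  # dataset is 101 times the original size

for time_increase in (9, 22):        # the two reported processing-time increases
    time_factor = growth_factor(time_increase)
    per_unit = time_factor / data_factor  # time per unit of data vs. the baseline
    print(f"{time_increase}% more time: per-unit time is {per_unit:.2%} of baseline")
```

Under that reading, the time spent per unit of data falls to roughly 1 percent of the baseline in both cases, which is why the total processing time barely moves as the dataset grows.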
"With the estimated 86 billion neurons making 100 trillion connections in the human brain, manually labeling all the axons in a single brain would take lifetimes," adds Benjamin Roop, one of the project's algorithm developers. "This tool has the potential to automate the creation of connectomes for not just one individual, but many. That opens the door for studying brain disease at the population level."
The open-source road to discovery
The NeuroTrALE project was formed as an internally funded collaboration between Lincoln Laboratory and Professor Kwanghun Chung's laboratory on the MIT campus. The Lincoln Lab team needed to build a way for the Chung Lab researchers to analyze and extract useful information from the large amount of brain imaging data flowing into the MIT SuperCloud, a supercomputer run by Lincoln Laboratory to support MIT research. Lincoln Lab's expertise in high-performance computing, image processing, and artificial intelligence made it exceptionally well suited to tackling this challenge.
In 2020, the team uploaded NeuroTrALE to the SuperCloud, and by 2022 the Chung Lab was producing results. In one study, published in Science, they used NeuroTrALE to quantify prefrontal cortex cell density in relation to Alzheimer's disease, finding that brains affected by the disease had a lower cell density in certain regions than those without it. The same team also located where in the brain harmful neurofibers tend to get tangled in Alzheimer's-affected brain tissue.
Work on NeuroTrALE has continued with Lincoln Laboratory funding and funding from the National Institutes of Health (NIH) to build up NeuroTrALE's capabilities. Currently, its user interface tools are being integrated with Google's Neuroglancer program — an open-source, web-based viewer application for neuroscience data. NeuroTrALE adds the ability for users to visualize and edit their annotated data dynamically, and for multiple users to work with the same data at the same time. Users can also create and edit a number of shapes such as polygons, points, and lines to facilitate annotation tasks, as well as customize color display for each annotation to distinguish neurons in dense regions.
"NeuroTrALE provides a platform-agnostic, end-to-end solution that can be easily and rapidly deployed on standalone, virtual, cloud, and high-performance computing environments via containers," says Adam Michaleas, a high-performance computing engineer from the laboratory's Artificial Intelligence Technology Group. "Furthermore, it significantly improves the end user experience by providing capabilities for real-time collaboration within the neuroscience community via data visualization and simultaneous content review."
To align with NIH's mission of sharing research products, the team's goal is to make NeuroTrALE a fully open-source tool for anyone to use. And this type of tool, says Gjesteby, is what's needed to reach the end goal of mapping the entirety of the human brain for research, and eventually drug development. "It's a grassroots effort by the community where data and algorithms are meant to be shared and accessed by all."
The codebases for the axon tracing, data management, and interactive user interface of NeuroTrALE are publicly available via open-source licenses. Please contact Lars Gjesteby for more information on using NeuroTrALE.