LOS ALAMOS, N.M., March 11, 2021 - Making sense of vast streams of big data is getting easier, thanks to an artificial-intelligence tool developed at Los Alamos National Laboratory. SmartTensors sifts through millions of millions of bytes of diverse data to find the hidden features that matter, with significant implications for fields from health care to national security, climate modeling to text mining, and many others.

"SmartTensors analyzes terabytes of diverse data to find the hidden patterns and features that make the data understandable and reveal its underlying processes or causes," said Boian Alexandrov, a scientist at Los Alamos National Laboratory, AI expert, and principal investigator on the project. "What our AI software does best is extracting the latent variables, or features, from the data that describe the whole data and the mechanisms buried in it, without any preconceived hypothesis."

SmartTensors also can identify the optimal number of features needed to make sense of enormous, multidimensional datasets. "Finding the optimal number of features is a way to reduce the dimensions in the data while being sure you're not leaving out significant parts that lead to understanding the underlying processes shaping the whole dataset," said Velimir ("Monty") Vesselinov, an expert in machine learning, data analytics, and model diagnostics at Los Alamos and also a principal investigator on the project. The nonnegativity of the latent features and determining their optimal number reduce a vast dataset to a scale that's manageable for computers to process and subject-matter experts to analyze. The extracted features are explainable and understandable.

Features are discrete, intelligible chunks of data. For instance, in a database of human faces, key features are noses, eyes, eyebrows, ears, mouths, and chins. SmartTensors can be pointed at a database of faces and, without human guidance, isolate those features. It also can determine how many of those features - the optimal number - are required to do the job accurately and reliably. For instance, maybe eyebrows and chins are unnecessary for facial recognition.

In other datasets, latent features may represent climate processes, watershed mechanisms, hidden geothermal resources, carbon sequestration processes, chemical reactions, protein structures, pharmaceutical molecules, cancerous mutations in human genomes, and so on. In very large datasets - measured in billions of millions of bytes - these features are frequently unknown and invisible to direct observation, obscured by a torrent of less-useful information and noise in the data.

SmartTensors works with the notion of a tensor, or multidimensional data cube. Each axis defining the cube represents a different dimension of the data. So, in a business example, information about customers might be on the X axis, information about annual sales on the Y axis, and information about manufacturing on the Z axis. As more of these features are added, the cube becomes more complex, with more dimensions than the simple 3D cube. If you think of the data cube as being made up of many small, stacked cubes, each one represents information about some or all of the features, or dimensions, of the data.

Our world is awash in a seemingly bottomless ocean of data pouring in from sources ranging from satellites and MRI scans to massive computer simulations and seismic-sensor networks, from electronic surveillance to smartphones, from genome sequencing of SARS-CoV-2 to COVID-19 test results, from social networks to vast numbers of texts. Making sense of this ever-increasing racket is vital to national security, economic stability, individual health, and practically every branch of science.

These vast datasets are formed exclusively of observable quantities, by definition - think of eyes, noses, and ears. But in big-data analytics, it is difficult to directly link these observables to the underlying processes that generate the data. These processes, or hidden features, remain latent: they're not directly observable and are confusingly mixed with each other, with unimportant features, and with noise. The problem is often likened to extracting the individual voices at a noisy cocktail party, with a set of microphones recording the chatter. Isolating a conversation, or several conversations, while individuals are walking and talking is a classic signal-processing blind-separation problem.
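The article mentions the nonnegativity of the latent features, which points at factorization methods in the family of nonnegative matrix and tensor factorization. As a rough illustration only - this is a minimal toy sketch, not the SmartTensors implementation - the code below factors a small synthetic data matrix into two nonnegative latent features using the classic multiplicative-update rule. The `rank` parameter plays the role of the number of features; SmartTensors' distinguishing capability, selecting that number automatically, is not shown here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic nonnegative data: 20 "samples" built from 2 hidden features.
true_W = rng.random((20, 2))   # how strongly each sample expresses each feature
true_H = rng.random((2, 8))    # what each feature looks like across 8 observables
V = true_W @ true_H            # the observed data matrix (hidden structure mixed in)

def nmf(V, rank, iters=1000, eps=1e-9):
    """Factor V ~= W @ H with W, H >= 0 via Lee-Seung multiplicative updates."""
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update feature patterns
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # update feature weights
    return W, H

W, H = nmf(V, rank=2)
error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {error:.4f}")
```

Because the updates only ever multiply nonnegative quantities, `W` and `H` stay nonnegative throughout, which is what makes the recovered features additive and interpretable rather than cancel-each-other-out combinations.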