Developing tools to help scientists automate their scientific data management and analysis workflows is the aim of a new $1.7 million, three-year grant from the National Science Foundation to UC Davis, UC Santa Barbara and UC San Diego. The project will develop Kepler/CORE, a Comprehensive, Open, Reliable, and Extensible Scientific Workflow Infrastructure.
From bioinformatics and ecology to nuclear physics and astronomy, scientists can employ Kepler scientific workflows for desktop data analysis, remote execution monitoring, or reliably moving huge volumes of data, leaving them more time to think and do science.
Kepler has grown out of a grassroots collaboration between research projects funded by the National Science Foundation and the U.S. Department of Energy, based on a UC Berkeley project and system called PTOLEMY II, said Bertram Ludaescher, an associate professor in the UC Davis Department of Computer Science and at the Genome Center, and principal investigator on the grant.
"In the last few years, Kepler has been used and extended in various ways, but different projects tend to pull the system in different directions," Ludaescher said. The new project will develop a software core that facilitates independent extensions to support wider adoption of the system, he said.
Ludaescher describes Kepler as a tool for developing and managing scientific workflows, which take raw or derived data, process it further and put it into forms that are easier to analyze and work with.
Computers have been used to automate routine data-processing tasks since the earliest days of computing. Short programs or scripts written for a specific task are widely used in science. But these scripts may need to be rewritten for new or slightly different tasks, or to communicate with each other.
Instead, a workflow system like Kepler uses a simple, intuitive graphical interface of draggable modules and connections that allows users to quickly assemble a workflow that meets their needs, without having to understand all of the underlying computer code.
The goal is to help optimize scientists' brain cycles, not just CPU cycles, by automating scientific workflows whenever possible, Ludaescher said.
The other investigators on the grant are Shawn Bowers and Timothy McPhillips, UC Davis Genome Center; Ilkay Altintas, San Diego Supercomputer Center; and Matthew Jones, UC Santa Barbara.
Media Resources
Andy Fell, Research news (emphasis: biological and physical sciences, and engineering), 530-752-4533, ahfell@ucdavis.edu
Bertram Ludaescher, Computer Science, (530) 754-8576, ludaesch@ucdavis.edu