Zaid Harchaoui – UW News

UW launches Institute for Foundations of Data Science
September 1, 2020
The UW will host the Institute for Foundations of Data Science to develop the theoretical foundations of a fast-growing field: data science. Maryam Fazel, shown here in a 2015 photo, will lead the institute. Photo: Patrick Bennett/University of Washington

The University of Washington will lead a team of institutions in establishing an interdisciplinary research institute that brings together mathematicians, statisticians, computer scientists and engineers to develop the theoretical foundations of a fast-growing field: data science.

The Institute for Foundations of Data Science (IFDS) is a collaboration between the UW and the Universities of Wisconsin-Madison, California Santa Cruz, and Chicago, with a mission to develop a principled approach to the analysis of ever-larger, more complex and potentially biased data sets that play an increasingly important role in industry, government and academia.

Support for the IFDS comes from a $12.5 million grant from the National Science Foundation and its Transdisciplinary Research in Principles of Data Science, or TRIPODS, program. TRIPODS is tied to an NSF program that aims to accelerate discovery and innovation in data science algorithms, data cyberinfrastructure and education and workforce development.

“With NSF’s $25 million investment, these interdisciplinary teams will be able to tackle some of the most important theoretical and technical questions in data science,” said NSF Division Director for the Division of Mathematical Sciences Juan Meza.

IFDS research will lead to algorithmic decision-making processes that tackle incomplete or ambiguous datasets and are better able to respond and act in changing environments. The team will also study some of the ethical implications of data-driven algorithms.

The UW team, clockwise from top left: Maryam Fazel, Zaid Harchaoui, Kevin Jamieson, Yin Tat Lee, Abel Rodriguez and Dmitriy Drusvyatskiy. Photo: University of Washington

“As data science is increasingly incorporated in all facets of our lives, its success is uncovering pressing challenges that call for new theories,” said Maryam Fazel, a UW electrical and computer engineering professor and the lead principal investigator for the IFDS. “We need the expertise of all core disciplines to understand the mysteries and to address the pitfalls of data science and artificial intelligence algorithms.”

“The success of the UW team in establishing the IFDS stems from having fantastic faculty from four departments, representing both arts and sciences and engineering, working collaboratively on the most important foundational questions of data science,” said Nancy Allbritton, dean of the College of Engineering. Dan Pollack, dean for the Natural Sciences, added, “We are confident that this multi-institutional, multi-disciplinary effort will shape the future of the field.”

The UW team of investigators has been laying the groundwork for IFDS during the past three years. UW’s TRIPODS Phase I institute was established in 2017 with a $1.5 million award from the NSF. Since then, the team has collaborated across disciplinary boundaries to address reliability and scalability of data science algorithms, and has also forged new partnerships.

“The strategic partnership between Washington and Wisconsin was crucial to the success of IFDS in the Phase II competition, and we are excited to build on this relationship over the next five years,” said Stephen Wright, a professor of computer science who headed the TRIPODS Phase I effort at the University of Wisconsin.

In 2018, the UW team received three additional awards from the NSF’s new TRIPODS+X program, through which members of the team partnered with other researchers to address data science challenges in fields such as robotics and epidemiology.

“IFDS is an exciting culmination of these Phase I efforts,” said Fazel, who is also the Moorthy Family Professor in the electrical and computer engineering department. “It opens the door to further collaborations across our partner institutions and with practitioners in academia and industry, and helps place the UW and Seattle prominently in the national data science research effort.”

IFDS research addresses new fundamental problems that echo classical results in mathematical optimization, robust statistics, statistical inference and decision theory.

“The team adopts a neoclassical viewpoint in order to define notions of optimality, robustness and calibration, that is relevant for modern day data science. These new notions will shape the research in order to develop new theories, methods and algorithms to be used by scientists and engineers,” said co-principal investigator Zaid Harchaoui, an associate professor of statistics.

The five-year funding plan for the IFDS Phase II includes support for new research projects, workshops, a partnership across the four research sites and students and postdoctoral scholars co-advised by faculty from different fields. Plans for education and outreach will draw on previous experience of IFDS members and leverage institutional resources at all four sites.

“A central goal of IFDS is to develop algorithms with best-in-class performance for data scientific tasks. Recent breakthroughs in this area (in part by UW investigators) have benefitted from combining techniques across computer science, mathematics and statistics. An interdisciplinary approach to data science will be a key ingredient of the future work at IFDS,” said co-principal investigator Dmitriy Drusvyatskiy, an associate professor of mathematics.

IFDS will cultivate existing ties with the , as well as work with the newly-announced NSF AI Institute, in which UW also participates.

In addition to Fazel, Harchaoui and Drusvyatskiy, the UW IFDS team includes Kevin Jamieson and Yin Tat Lee, assistant professors in the Paul G. Allen School of Computer Science & Engineering. The original UW team was recently joined by Abel Rodriguez, professor and chair of the statistics department, who comes to the UW from the University of California, Santa Cruz and serves as the diversity liaison for the institute.

For more information, contact Fazel at mfazel@uw.edu.

Faculty from Allen School, Evans School tapped for NSF institutes on artificial intelligence
August 26, 2020

University of Washington faculty are part of two new National Science Foundation institutes devoted to artificial intelligence research.

Ann Bostrom, a professor in the Evans School of Public Policy and Governance, will be part of the AI Institute for Research on Trustworthy AI in Weather, Climate, and Coastal Oceanography, led by the University of Oklahoma. Faculty in the Paul G. Allen School of Computer Science & Engineering, including associate professor Sewoong Oh, along with Zaid Harchaoui, associate professor of statistics, will be part of the AI Institute for Foundations of Machine Learning, led by the University of Texas at Austin.

The NSF on Wednesday announced five institutes in all, based at research universities around the country and part of a collaboration among the U.S. departments of Agriculture, Homeland Security and Transportation. The institutes aim to accelerate research, expand America’s workforce and transform society in the coming decades. Each institute receives $20 million in NSF funding over five years.

The National Science Foundation has announced new AI institutes at universities around the country. University of Washington faculty are affiliated with institutes based at the University of Texas and the University of Oklahoma. Photo: National Science Foundation

The NSF AI Institute for Research on Trustworthy AI in Weather, Climate, and Coastal Oceanography assembles researchers in machine learning, atmospheric and ocean science and risk communication to develop user-driven, trustworthy AI that addresses pressing concerns in weather, climate and coastal hazards prediction.

“In collaboration with our colleagues in this new institute, the risk communication research team will examine how AI information influences trust and use of AI over time by decision makers in ecological and water resource management, weather forecasting and emergency management,” Bostrom said. “It’s an exciting opportunity to advance fundamental research on mental models and perceptions of AI in environmental science contexts that have critical consequences for all of us.”

In addition to the UW and the University of Oklahoma, other participating institutions are Colorado State University, the University at Albany, North Carolina State University, Texas A&M University-Corpus Christi, Del Mar College and the National Center for Atmospheric Research, along with private industry partners including Google, IBM, NVIDIA and Disaster Tech.

Amy McGovern, a professor of computer science and meteorology at the University of Oklahoma and lead researcher for this NSF institute, said the long-term goal is to apply AI to a broad array of environmental challenges.

“This institute is a convergent center that will create trustworthy AI for environmental science, revolutionize prediction and understanding of high-impact weather and ocean hazards, and benefit society by protecting lives and property,” McGovern said. “Leading experts from AI, atmospheric and ocean science, risk communication, and education, will work synergistically to develop and test trustworthy AI methods that will transform our understanding and prediction of the environment.”

The NSF Institute for Foundations of Machine Learning will focus on major theoretical challenges in AI, including next-generation algorithms for deep learning, neural architecture optimization, and efficient robust statistics.

At their core, tools from machine learning still rely on models and algorithms that are often ill-equipped to process dynamic, complex datasets. For example, algorithms designed to help machines recognize, categorize and label images can’t keep up with the massive amount of video data people upload to the internet every day.

“This institute tackles the foundational challenges that need to be solved to keep AI on its current trajectory and maximize its impact on science and technology,” said Oh, an associate professor in the Allen School. “We plan to develop a toolkit of advanced algorithms for deep learning, create new methods for coping with the dynamic and noisy nature of training datasets, learn how to exploit structure in real-world data, and target more complex and real-world objectives. These four goals will help solve research challenges in multiple areas, including medical imaging and robot navigation.”

Wichita State University and Microsoft Research are also participating in this institute.

NSF’s history of investment in AI research and workforce development “paved the way for many of the breakthrough commercial technologies permeating and driving society today,” said NSF Director Sethuraman Panchanathan. “NSF invests more than $500 million in AI research annually. We are supporting five NSF AI Institutes this year, with more to follow, creating hubs for academia, industry, and government to collaborate on profound discoveries and develop new capabilities to advance American competitiveness for decades to come.”

The other NSF institutes announced Wednesday are the AI Institute for Student-AI Teaming, led by the University of Colorado Boulder; the AI Institute for Molecular Discovery, Synthetic Strategy and Manufacturing, led by the University of Illinois Urbana-Champaign; and the AI Institute for Artificial Intelligence and Fundamental Interactions, led by the Massachusetts Institute of Technology.

For more information on the NSF AI institutes, visit www.nsf.gov.

 

Adapted from press releases from the National Science Foundation, the University of Oklahoma and the University of Texas at Austin.

Three UW teams receive TRIPODS+X grants for research in data science
September 12, 2018

The National Science Foundation announced on Sept. 11 that it is awarding grants totaling $8.5 million to 19 collaborative projects at 23 universities for the study of complex and entrenched problems in data science. Three of these projects will be based at the University of Washington and led by researchers in the College of Engineering and the College of Arts & Sciences.

The grants build on 2017 awards in the Transdisciplinary Research in Principles of Data Science, or TRIPODS, program. These new grants make up the TRIPODS+X program, which expands these big-data projects into broader areas of science, engineering and mathematics. The lead faculty on these new projects are among the core founding faculty of the UW’s TRIPODS institute.

“The multidisciplinary approach for addressing the increasing volume and complexity of data enabled through the TRIPODS+X projects will have a profound impact on the field of data science and its use,” said Jim Kurose, NSF assistant director for Computer and Information Science and Engineering. “This impact will be sure to grow as data continues to drive scientific discovery and innovation.”

The TRIPODS program’s convergent and interdisciplinary approach emerged from the 2016 NSF TRIPODS workshop. Since then, the program has evolved into a community of institutes that share expertise and work together to advance the three NSF priorities central to TRIPODS: research, visioning and education. Research-track projects aim to develop new algorithms and fundamental approaches to data-driven challenges. Visioning projects focus on fostering collaboration across disciplines and help spawn well-integrated research teams that yield truly new perspectives. Education projects are pilot efforts that aim to drive workforce development in multiple disciplines and at multiple education levels. Each TRIPODS institute will have three years to use its award to expand efforts in one of these program tracks.

The first UW-led project, a research-track endeavor, is called “Safe Imitation Learning for Robotics” and is led by Zaid Harchaoui, an assistant professor of statistics. This project will focus on imitation learning in robotics, a form of learning in which a system learns through demonstration. Researchers will design trust-building learning algorithms and lay the groundwork for safe imitation-learning approaches for beneficial human-machine interaction. Additional UW researchers on this project include Maryam Fazel, an associate professor of electrical engineering; Sham Kakade, an associate professor in both the Department of Statistics and the Paul G. Allen School of Computer Science & Engineering; and another professor in the Allen School.

Fazel will lead the second TRIPODS+X project at the UW: “Foundational Training in Neuroscience and Geoscience via Hack Weeks.” This project will enhance the successful “hack week” model as a tool for data science education and collaboration. Hack weeks blend elements of traditional lecture-style pedagogy and participant-driven projects. Two hack week formats, one for neuroscience and one for the geosciences, have already been organized and held by researchers at the eScience Institute. For this project, hack week leaders will work to incorporate training on core methods in statistics and optimization in order to promote a deeper understanding of methodologies along with hands-on experience with data-driven problems in the geosciences and in neuroscience. Additional UW researchers on this project include colleagues at the eScience Institute and the Applied Physics Laboratory, an assistant professor of applied mathematics, and Harchaoui.

The third UW-led TRIPODS+X project, “Scaling Up Descriptive Epidemiology and Metabolic Network Models via Faster Sampling,” is led by an assistant professor in the Allen School. This research-track project will focus on developing and disseminating practical analysis tools for public health and biological studies that involve large datasets and rely on accurate “sampling,” a principle of randomly drawing a subset of cases from a larger dataset in order to identify trends quickly and speed up analysis. To develop these tools, this project will evaluate current big-data projects in health metrics and systems biology. Additional researchers on this project include a UW associate professor and a professor of computing at Georgia Tech.
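The basic idea of sampling described above can be illustrated with a minimal Python sketch. It is not the project's method, which targets far harder sampling problems; the function name `estimate_rate`, the synthetic records and the sample size are all assumptions made for illustration.

```python
import random


def estimate_rate(records, sample_size, seed=0):
    """Estimate the fraction of positive records from a uniform random sample."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    sample = rng.sample(records, sample_size)  # draw without replacement
    return sum(sample) / sample_size


# Hypothetical dataset: 1 marks a case, 0 a non-case; the true rate is 20%.
records = [1] * 20_000 + [0] * 80_000

full_rate = sum(records) / len(records)  # scans all 100,000 records
sampled_rate = estimate_rate(records, sample_size=5_000)  # scans only 5,000
print(full_rate, sampled_rate)
```

The sampled estimate reads 5 percent of the records and should land close to the full-scan rate; how close depends on the sample size and on how the sample is drawn, which is where faster and more accurate samplers matter.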

“TRIPODS+X is exciting not only for its near-term impact addressing some of society’s most important scientific challenges, but because of its potential for developing tools for future applications,” said Anne Kinney, NSF assistant director for Mathematical and Physical Sciences.

###

For more information, contact Joshua Chamot with the NSF at 703-292-4489 or jchamot@nsf.gov.

Adapted from a press release by the National Science Foundation.

What makes Bach sound like Bach? New dataset teaches algorithms classical music
November 30, 2016
MusicNet is a new publicly available dataset from UW researchers that labels each note of 330 classical compositions in ways that can teach machine learning algorithms about the basic structure of music. Photo: flickr

The composer Johann Sebastian Bach left behind an incomplete fugue upon his death, either as an unfinished work or perhaps as a puzzle for future composers to solve.

A classical music dataset released Wednesday by University of Washington researchers, which enables machine learning algorithms to learn the features of classical music from scratch, raises the likelihood that a computer could expertly finish the job.

MusicNet is the first publicly available large-scale classical music dataset with curated fine-level annotations. It’s designed to allow machine learning researchers and algorithms to tackle a wide range of open challenges, from note prediction to automated music transcription to offering listening recommendations based on the structure of a song a person likes, instead of relying on generic tags or what other customers have purchased.

“At a high level, we’re interested in what makes music appealing to the ears, how we can better understand composition, or the essence of what makes Bach sound like Bach. It can also help enable practical applications that remain challenging, like automatic transcription of a live performance into a written score,” said Sham Kakade, a UW associate professor of computer science and engineering and of statistics.

“We hope MusicNet can spur creativity and practical advances in the fields of machine learning and music composition in many ways,” he said.

Described in a paper published Nov. 30 in the arXiv pre-print repository, MusicNet is a collection of 330 freely licensed classical music recordings with annotations of each individual note, what instrument plays the note and its position in the composition’s metrical structure. It includes more than 1 million individual labels from 34 hours of chamber music performances that can train computer algorithms to deconstruct, understand, predict and reassemble components of classical music.

“The music research community has been working for decades on hand-crafting sophisticated audio features for music analysis. We built MusicNet to give researchers a large labelled dataset to automatically learn more expressive audio features, which show potential to radically change the state-of-the-art for a wide range of music analysis tasks,” said Zaid Harchaoui, a UW assistant professor of statistics.

It’s similar in design to ImageNet, a public dataset that revolutionized the field of computer vision by labeling basic objects, from penguins to parked cars to people, in millions of photographs. This vast repository of visual data that computer algorithms can learn from has enabled huge strides in everything from image searching to self-driving cars to algorithms that recognize your face in a photo album.

“An enormous amount of the excitement around artificial intelligence in the last five years has been driven by supervised learning with really big datasets, but it hasn’t been obvious how to label music,” said lead author John Thickstun, a UW computer science and engineering doctoral student.

“You need to be able to say from 3 seconds and 50 milliseconds to 78 milliseconds, this instrument is playing an A. But that’s impractical or impossible for even an expert musician to track with that degree of accuracy.”

The UW research team overcame that challenge by applying a technique called dynamic time warping, which aligns similar content happening at different speeds, to classical music performances. This allowed them to sync a real performance, such as Beethoven’s ‘Serioso’ string quartet, to a synthesized version of the same piece that already contained the desired musical notations and scoring in digital form.

Time warping and mapping that digital scoring back onto the original performance yields the precise timing and details of individual notes that make it easier for machine learning algorithms to learn from musical data.
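The alignment step can be pictured with classic dynamic time warping on two short pitch sequences. The sketch below is an illustration only, not the research team's code: the function names `dtw_path` and `transfer_labels`, the absolute-difference cost and the toy MIDI pitch values are all assumptions, and real recordings would be compared on audio features, not on clean pitch numbers.

```python
from math import inf


def dtw_path(performance, score):
    """Align two numeric sequences with classic dynamic time warping.

    Returns (total_cost, path), where path is a list of
    (performance_index, score_index) pairs from start to end.
    """
    n, m = len(performance), len(score)
    # cost[i][j] = cheapest alignment of the first i performance frames
    # with the first j score notes.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(performance[i - 1] - score[j - 1])  # assumed local cost
            cost[i][j] = step + min(
                cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1]
            )

    # Walk back from the end, always moving to the cheapest predecessor.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


def transfer_labels(path, score_labels):
    """Map each performance frame to the label of its aligned score note."""
    return {frame: score_labels[note] for frame, note in path}


# Toy data (hypothetical): MIDI pitches per frame; the performance is slower.
score = [60, 62, 64]
score_labels = ["C4", "D4", "E4"]
performance = [60, 60, 62, 62, 62, 64]

total_cost, path = dtw_path(performance, score)
frame_labels = transfer_labels(path, score_labels)
print(total_cost, frame_labels)
```

Here the slower performance holds each note for several frames, and the warping path stretches the three score notes across the six frames so every frame inherits a label from the score.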

In their arXiv paper, the UW research team tested the ability of some common end-to-end deep learning algorithms used in speech recognition and other applications to predict missing notes from compositions. The dataset is publicly available so machine learning researchers and music hobbyists can adapt or develop their own algorithms to advance music transcription, composition, research or recommendations.

“No one’s really been able to extract the properties of music in this way, which opens so many opportunities for creative play,” said Kakade.

For instance, one could imagine asking your computer to make up a performance that’s similar to songs you’ve listened to, or to hum a melody and tell it to make a fugue on command.

“I’m really interested in the artistic opportunities. Any composer who crafts their art with the assistance of a computer, which includes many modern musicians, could use these tools,” said Thickstun. “If the machine has a higher understanding of what they’re trying to do, that just gives the artist more power.”

This research was funded by the Washington Research Foundation and the Canadian Institute for Advanced Research (CIFAR), where Harchaoui is an associate fellow.

For more information, contact the research team at musicnet@cs.washington.edu.
