683985
The Utility of Data Dictionaries in Reconciling and Analyzing Salmon Management Action Data
Katie Barnas, David Hamm
Northwest Fisheries Science Center, Seattle, WA, USA.
The listing of Pacific Salmon has necessitated the implementation of many types of management actions, for instance habitat restoration and hatchery reform, to redress known threats or limiting factors to salmonid populations. This effort requires the compilation of disparate historic and current datasets over spatial distances never imagined at the time of data collection. We have created a protocol for dealing with diverse datasets of management actions and resolving them into a single common framework using data dictionaries.
As one example, we present a census of restoration projects affecting salmonid habitat in the states of Washington, Oregon, Idaho, and Montana. This was accomplished by: (1) surveying the universe of available project data by acquiring data from known sources; (2) parsing the data based on their relevance to our goals; and (3) defining the information and resolving projects into type and subtype categories by creating a common language using a data dictionary.
This process has enabled us to reconcile diverse data sets with varying formats held by private, local, state and federal entities into a single, queryable format, the Pacific Northwest Salmon Habitat Project Tracking Database. This spatially referenced database contains project-level data on 29,000 restoration actions initiated at 47,000 locations. Concurrently, at least two other projects have assembled restoration data using alternate data dictionaries. These comparable efforts collected overlapping restoration data but differ in their goals. We compared project type and project cost across the three data dictionaries to assess how data dictionary definitions change the relationship between data categories. We further demonstrate how our bottom-up methodology for creating dictionaries can be applied to salmon management actions beyond those focused on habitat.
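The reconciliation step described above can be sketched in a few lines of Python. This is only an illustration of the general idea of a data dictionary used as a crosswalk, not the authors' protocol: the labels, type and subtype names, source names, and the helper functions `classify` and `reconcile` are all hypothetical.

```python
from collections import Counter

# Hypothetical data dictionary: maps each source's free-text project label
# to a common (type, subtype) pair. All entries are illustrative only.
DATA_DICTIONARY = {
    "riparian planting": ("Riparian", "Planting"),
    "tree planting along stream": ("Riparian", "Planting"),
    "culvert replacement": ("Fish Passage", "Culvert Improvement"),
    "culvert upgrade": ("Fish Passage", "Culvert Improvement"),
    "lwd placement": ("Instream", "Large Wood"),
}


def classify(label):
    """Return the common (type, subtype) for a source label, or None if unmapped."""
    return DATA_DICTIONARY.get(label.strip().lower())


def reconcile(records):
    """Split records into classified rows and labels that still need a dictionary entry."""
    classified, unmapped = [], []
    for rec in records:
        category = classify(rec["label"])
        if category is None:
            unmapped.append(rec["label"])
        else:
            classified.append({**rec, "type": category[0], "subtype": category[1]})
    return classified, unmapped


# Made-up records from three hypothetical data providers.
records = [
    {"source": "state_a", "label": "Riparian Planting"},
    {"source": "federal_b", "label": "culvert upgrade "},
    {"source": "local_c", "label": "Tree planting along stream"},
    {"source": "local_c", "label": "beaver reintroduction"},
]

classified, unmapped = reconcile(records)
type_counts = Counter(row["type"] for row in classified)
print(type_counts)
print(unmapped)
```

Keeping unmapped labels in a separate list, rather than forcing them into an "other" category, reflects a bottom-up approach in which the dictionary grows from the terms actually found in the source data. Swapping in a different dictionary over the same records is one way to see how definitions change the resulting category counts.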
652424
Approaches to Data Integration and Fusion for Integrated Ecosystem Assessments
Bosch, Julie; Cross, Scott; Roby, Eric; Parsons, Arthur R.
NOAA, National Coastal Data Development Center, Beaverton, OR, USA.
According to a recent definition (Levin et al., 2008), an Integrated Ecosystem Assessment (IEA) is “…a synthesis and quantitative analysis of information on relevant physical, chemical, ecological, and human processes in relation to specified management objectives.” Data discovery is foundational to the IEA process. Though some IEAs may involve the collection of new data, most will rely on making the best use of data and information that already exist in numerous, distributed databases housed at government agencies (local, state, and federal), academic institutions, and non-governmental organizations. Simple keyword searches on existing metadata resources will likely not yield the best results; however, applying a semantic understanding of the data to the searches will optimize data discovery and the assessment of its usefulness across multiple data management systems. In addition to being a first step in assembling data for an IEA, a thorough data discovery phase also supports data gap analysis. Evaluating data usefulness and providing access follow discovery, and will be a particular challenge for IEAs, owing both to the distributed nature of the data and to the heterogeneity of data types and access methods. Quality metadata is critical to evaluating the value, usefulness, and accessibility of the data for the purpose of conducting an IEA. Efforts within NOAA on technologies and techniques for maximizing data discovery will be demonstrated. Data integration and fusion form the core of an IEA. Data integration describes the process whereby multiple data streams are brought together into the analysis environment in useful formats, while data fusion involves the additional step of bringing disparate data together into common applications for analysis, visualization, model construction and operation, or product generation. Techniques for data fusion that may be especially useful within components of the IEA process will be discussed.
683980
Dynamic Mapping and Data Access to Salmonid Data
Jeff Cowen
NOAA Northwest Fisheries Science Center, Seattle, WA, USA.
The Scientific Data Management (SDM) team at the Northwest Fisheries Science Center provides data management infrastructure and develops web-based applications to provide access to data and information in support of salmon recovery efforts in the Northwest. Our team focuses on Oracle database products and ESRI GIS Server technologies to create interactive websites that allow researchers to query, visualize and access data and information. Recently the SDM team has published websites that are built on Oracle’s Application Express platform and incorporate ESRI’s JavaScript Mapping API framework and Google’s Charting API. One of the applications that will be presented provides dynamic mapping and graphing of Salmonid Population Summary data. Another application dynamically maps PITTAG release sites based on where tags have been recovered from bird colonies on the Columbia River. These applications have both a robust query interface and a dynamic mapping front end to access the information. The SDM team has also created a query and mapping interface built on Java and JavaServer Page technologies to provide access to Status and Trend Evaluation and Monitoring (STEM) data in support of the Integrated Status and Effectiveness Monitoring Program (ISEMP). This presentation will focus on the data delivery technologies used to develop these applications.
Bio: I have a Master’s degree in geography from the University of North Carolina at Charlotte. I worked at ESRI from 1993-1995 and was a GIS analyst and application developer at the NOAA Coastal Services Center in Charleston, SC from 1996-2001. Since 2001, I have been a part of the Scientific Data Management program at the NOAA Northwest Fisheries Science Center in Seattle, WA. My work emphasis is on data quality and management, and data visualization and analysis through web mapping tools.
696919
Data Conservancy: A Library-Based Data Cyberinfrastructure Paradigm
Melissa H. Cragin
Center for Informatics Research in Science and Scholarship, University of Illinois at Urbana-Champaign, Champaign, IL, USA.
The Data Conservancy (DC) is one of two current awards through the US National Science Foundation’s DataNet program which will generate “a set of exemplar national and global data research infrastructure organizations (DataNet Partners).” DataNet projects are intended to combine “library and archival sciences, cyberinfrastructure, computer and information sciences, and domain science expertise to provide reliable digital preservation, access, integration, and analysis capabilities … over a decades-long timeline.” (NSF, 2007). Long-term access and preservation of data are needed to broaden participation in science, encourage reuse of valuable data assets, and provide data resources for longitudinal studies and synthesis across studies and disciplines.
The Data Conservancy is developing a comprehensive data curation strategy, building on previous accomplishments at Johns Hopkins University providing data services for the Sloan Digital Sky Survey and the National Virtual Observatory. To further advance and extend data support to the life sciences, earth sciences, and social sciences, the DC team is focusing on the equally demanding sociological and technological dimensions of managing, integrating, sharing, discovering, analyzing, and preserving multidisciplinary digital data, leveraging the initial architectural design, data models and metadata profiles, and organizational models developed for astronomy.
The DC initiative has a strong information science research component that includes development of a cross-disciplinary data model for observational data, data mining techniques for extracting and mapping diverse data to the model, as well as research on data collection description requirements, metadata granularity and relationships, and comparative analysis of data practices across the sciences to be supported by the DC.
Bio: Melissa Cragin is the lead data curation researcher at the Center for Informatics Research in Science and Scholarship at the University of Illinois at Urbana-Champaign, and an investigator on the Data Conservancy project, funded by the NSF DataNet initiative. She is currently directing a research project developing data curation profiles for a broad range of scientific disciplines and coordinating the Data Curation Education Program (DCEP) at the Graduate School of Library and Information Science (GSLIS). Cragin’s research area is scholarly and scientific communication, and she specializes in scientific data practices and their relationship to the emerging field of data curation. Melissa has a Ph.D. from the University of Illinois at Urbana-Champaign, where her dissertation work investigated the functions of shared data collections in neuroscience.
652663
The Semantic Web, Linked Data and How It Can Help
DeVries, Peter J.; Young, Daniel K.
Entomology, University of Wisconsin - Madison, Madison, WI, USA.
Today's researchers are confronted with a wealth of relevant data, but it is often difficult to integrate these large and disparate data sources in a useful way. New tools and techniques that are being developed for the Semantic Web can help deal with this information overload. The Semantic Web is a part of the larger World Wide Web. Semantic Web sites contain additional information that helps clarify the meaning of the information on a web page or web-accessible resource. This additional semantic information is also being applied to Internet data repositories, allowing them to share their data in a machine-understandable way. The ability to clearly attach meaning to text and numbers has fostered the creation of a world-wide web of data. Data repositories that are linked to other data repositories in a standard way are part of the Linked Data Cloud. The Linked Data Cloud can be thought of as a large data set that spans repositories across the globe. A paper in a publication repository may contain links to data in gene and specimen repositories maintained elsewhere. This paper is also linked to other papers and the data that they contain. The Linked Data Cloud provides a rich set of information that can be queried and analyzed. If your research involves species collected at a particular location, you may be able to link those collection records to soil, weather, and other relevant data. The Linked Data Cloud will allow researchers to incorporate more related information into their analysis than they would have been able to collect and curate themselves. The talk will provide an introduction to Semantic Web concepts and methods using data from a Lake Michigan coastal dune habitat. Topics will include how to mark up your data using semantic identifiers, where to look for related data sets on the Linked Data Cloud, and how to create your own knowledge base of semantic facts that you can analyze and query.
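To make the idea of a knowledge base of semantic facts more concrete, the sketch below stores facts as subject-predicate-object triples in plain Python and follows links from a specimen record to soil data. It is a toy illustration under stated assumptions: the `ex:` identifiers, the predicates, the data values, and the helpers `match` and `soil_for_specimen` are invented for this example and do not come from the talk or from any real repository. A real application would more likely use RDF tooling and a SPARQL query.

```python
# Each fact is a (subject, predicate, object) triple. The identifiers are
# made-up placeholders, not real URIs from any repository.
triples = {
    ("ex:specimen42", "ex:taxon", "ex:SpeciesA"),
    ("ex:specimen42", "ex:collectedAt", "ex:siteDune1"),
    ("ex:specimen77", "ex:taxon", "ex:SpeciesB"),
    ("ex:specimen77", "ex:collectedAt", "ex:siteDune2"),
    ("ex:siteDune1", "ex:soilType", "sand"),
    ("ex:siteDune2", "ex:soilType", "loam"),
}


def match(s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def soil_for_specimen(specimen):
    """Follow specimen -> site -> soil type links, as if joining two repositories."""
    soils = []
    for _, _, site in match(s=specimen, p="ex:collectedAt"):
        soils.extend(obj for _, _, obj in match(s=site, p="ex:soilType"))
    return soils


print(soil_for_specimen("ex:specimen42"))
```

Because the specimen facts and the site facts share the identifier `ex:siteDune1`, the two sets of records can be joined without either one having been designed with the other in mind. That shared identifier is the essential mechanism behind linking repositories.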
683977
Informatic Tools to Assess Habitat Restoration in the Pacific Northwest
David Hamm, Katie Barnas
Hamm Consulting, Seattle, WA, USA.
A prominent method used to help recover endangered salmon has been, and remains, the restoration of the habitat on which these fish depend. However, evaluating the effectiveness of restoration efforts has been extremely challenging, and the causal connections between restoration actions and fish population response remain poorly understood. A mechanistic understanding of these linkages first requires some fundamental pieces of information such as data on project implementation and the condition of salmonid habitat. This requires collecting data from diverse sources, standardizing features of the environment across datasets and developing an explicit classification scheme that accurately reflects ecological relationships. With the creation of new technical tools, large data sets can uncover associations and improve our understanding of how habitat restoration is implemented.
Here we present a set of novel informatics tools. We have developed a Habitat Limiting Factors Data Dictionary that enables us to compile and standardize habitat assessments, which identify those features of the habitat in need of restoration. We compared data crosswalks that connect restoration actions to independent assessments of salmonid habitat needs and performed a sensitivity analysis that informs critical steps in making this analysis. We also used a novel metric to assess and visualize the results.
652379
From observatories to collaboratories: the SATURN cyber-infrastructure
Hansen, David M.2; Seaton, Charles M.1; Jaramillo, Alex V.1; Turner, Paul J.1; Schilling, Jeffrey1; Maier, David4; Freire, Juliana3; Silva, Claudio3; Baptista, Antonio M.1
1 Center for Coastal Margin Observation & Prediction, Oregon Health & Science University, Beaverton, OR, USA.
2 Computer & Information Science, George Fox University, Newberg, OR, USA.
3 School of Computing, University of Utah, Salt Lake City, UT, USA.
4 Department of Computer Science, Portland State University, Portland, OR, USA.
We define a coastal margin collaboratory as (a) a networked integration of sensors, platforms, models, data, analyses and collaboration, and social processes, (b) which enables diverse stakeholders to interact without geographic, disciplinary or institutional barriers and (c) towards the understanding, operation and sustainability of coastal margins. This definition is serving as guidance for the development of SATURN, a collaboratory for the Columbia River coastal margin. Like many contemporary ocean observatories, SATURN includes an observation network, a modeling system, and a cyber-infrastructure. Here, we will focus on the SATURN cyber-infrastructure, for which we provide both a description of current functionality and a gap analysis relative to the functionality implicit in the guiding definition. We will demonstrate useful proficiency in the ability to move, store and retrieve data from fixed stations and numerical models, and emerging proficiency in the same functionality for vessels and mobile platforms. We will also show progress in using various web, visualization and provenance tools as vehicles for communication, analyses and retained memory, with increasing buy-in from multiple scientific and non-scientific stakeholders. However, we will also identify important gaps between cyber-enabled capabilities and effective SATURN-enabled collaboration, which might suggest the need for formal research on social processes, collaboration techniques and communication strategies. This work is supported by the NSF cooperative agreement OCE-0424602. SATURN observations and simulations are integral to the Northwest Association of Networked Ocean Observing Systems (NANOOS).
696975
Information Management in LTER: Moving beyond site-based approaches to accommodate network-scale interdisciplinary research
Don Henshaw
U.S. Forest Service Pacific Northwest Research Station, Corvallis, OR, USA.
The H.J. Andrews Experimental Forest is a member site of the National Science Foundation Long-Term Ecological Research (LTER) network that includes 26 sites representing diverse ecosystems. The Andrews Forest LTER has conducted intensive forest ecosystem research since the 1950’s resulting in many diverse, long-term ecological databases and a strong commitment to information management. The Andrews LTER has developed an information management system which supports the collection, quality control, archival and long-term accessibility of collected data and associated metadata. The reuse of long-term study data and the synthesis of cross-site data in large-scale collaborative efforts have necessitated sophisticated information systems to allow easy discovery and web access of these data collections and complete, structured metadata for these resources. Data access policies are established including agreements for data release and data use to assure that data resources are available and to assure data providers of the ethical use of their data. Involvement of the research community in the management of this resource has been critical to success.
Current LTER Network planning activities emphasize new network science approaches that demand collaboration and integration across broader spatial and temporal scales and require improvements in the flow of data and the synthesis of information. A key challenge to the LTER Network has been the transition to large-scale, interdisciplinary environmental issues. The LTER Network has been the first and largest adopter of metadata standards in the ecological community, and has set standards for site information management systems that have been peer-reviewed and vetted by the ecological community. Several network-level products have emerged as part of a Network Information System (NIS) to facilitate integrative, cross-site research.
Bio: Don Henshaw is an Information Technology Specialist with the U.S. Forest Service Pacific Northwest Research Station. Don is Information Manager for the H.J. Andrews Experimental Forest LTER site and director of the Forest Science Data Bank at Oregon State University in partnership with the PNW Station, which is a data repository for Andrews Forest LTER data and specific campaign data for Oregon State University and USFS Research (examples are Mt. St. Helens, Research Natural Areas (RNA), Cascade Head Exp. Forest, DEMO). Don has chaired the Network Information System Advisory Committee (NISAC) for LTER and was Team Leader for development of the ClimDB/HydroDB data harvester and warehouse.
683991
Developing Information Systems for Salmon Science, Management and Marketing
Peter Lawson
Hatfield Marine Science Center, Newport, OR, USA.
Pacific Northwest commercial ocean troll fisheries for Chinook salmon were closed in 2008 and 2009 due to low abundance of Sacramento River fall runs. Fisheries in 2006 and 2007 were severely restricted to protect spawning escapements of Klamath River fall Chinook. In 2005, anticipating the Klamath River fishery restrictions, a collaboration of fishermen, scientists, and seafood marketers initiated Project CROOS (Collaborative Research on Oregon Ocean Salmon) to explore the potential of genetic stock identification (GSI) to provide fisheries managers with better data to manage harvest. The hope was that fine-scale aggregations of weak stocks in the ocean could be identified and avoided. Fishermen bar-coded each fish caught, recorded the location using GPS, collected fin clips (for GSI) and scales (for aging), along with fish length and depth caught. Data were used to map changing distributions of Chinook, by stock, throughout the fishery. A broader range of potential applications quickly became evident. The bar code enables samples to be linked to individual fish and permits tracking of each fish through the processor chain to market. Data are assembled in a central data base where they can be associated with supporting data sets including oceanographic data, satellite observations, and coded-wire tag data. Potential users of these data are scientists, fishermen, fishery managers, processors, marketers, and the general public. A web site, www.pacificfishtrax.org, is being developed as a portal to these data. This web site is designed to provide access tailored to the needs of specific user groups, and to be extended to accommodate new species, data types, and users. The ultimate goal is to develop a coast-wide data network with flexible tools to serve the full spectrum of needs and services supporting a variety of West Coast fisheries.
Bio: Dr. Peter Lawson is currently a research fishery biologist at the Northwest Fisheries Science Center of the National Marine Fisheries Service (NMFS). He received an M.S. in 1984 and Ph.D. in stream ecology from Idaho State University in 1986. He then took a position as biometrician and modeler for the ocean salmon harvest team of the Oregon Department of Fish and Wildlife. In 1997, after ten years with ODFW, Pete joined NMFS at the Northwest Fisheries Science Center. He has served on technical advisory committees to the Pacific Fishery Management Council and the Pacific Salmon Commission since 1987. Pete's models have been used to predict salmon runs, estimate harvest impacts, elucidate non-landed mortality in selective fisheries, and explore coho salmon population dynamics with a fine-grained, habitat-based life-cycle model. The habitat model is currently being expanded to include a dynamic landscape and spatially explicit stream network. Recent publications have treated climate effects on coho salmon survival in both freshwater and marine environments, with the goal of building a model that integrates across freshwater and marine phases of the life cycle. Pete is a key player in Project CROOS (Collaborative Research on Oregon Ocean Salmon) and the West Coast GSI Collaboration. He is currently helping to design the data system that will move information from fishing boats at sea into a shoreside data base that can then be used to support research, fisheries management, marketing, and education.
683994
The Washington State Salmonid Stock Inventory: aggregation, standardization, and dissemination of sensitive, potentially contentious data
Dayv Lowry, Gil Lensegrav, and Brodie Cox
The Salmon and Steelhead Stock Inventory (SASSI) was created by the Washington Departments of Fisheries and Game, and the Western Washington Treaty Tribes in 1992. Its purposes were to catalogue stock assessment methods being used for natural populations of salmon and steelhead throughout the state, consolidate annual stock abundance estimates into a compendium, and evaluate stock status based on historic data trends. Intermittent updates and additions of other salmonid species led the inventory to be renamed the Salmonid Stock Inventory (SaSI) in 2002. In 2004 the SaSI database was linked to the Washington Department of Fish and Wildlife’s (WDFW’s) SalmonScape, a web-based GIS utility, and a web portal was developed to allow field biologists to submit stock abundance estimates remotely. Though these combined data submittal and sharing elements laid a solid groundwork for broad data distribution, the system’s piecemeal development led to data integrity and quality issues that compromised the ability of users to understand current stock status. The need for these data to be freely available to the public and scientific users in their most complete, well-annotated, and accurate form has risen in recent years with the Endangered Species Act (ESA) listing of several salmonid Evolutionarily Significant Units (ESUs) in the Pacific Northwest.
Recently the SaSI database underwent a series of major changes, including migration from a static MS Access database to a centralized SQL Server database. Other major fixes included a complete audit of all database entries, normalization of redundant data, a redesign of the Java-programmed, web-enabled data entry tool (“SaSI Web Funnel”), the integration of historic stock report text blocks into the database, and a redesign of SQL queries. SalmonScape now presents a detailed, dynamic, annotated report output for each stock. The SalmonScape report can now be updated on the fly by the data provider using the SaSI Web Funnel when substantial content changes occur. Additionally, this dynamic content is now accessible to other data portal efforts still in development. This content is accessible through the WDFW SalmonScape website: http://wdfw.wa.gov/mapping/salmonscape
Bios: Dayv Lowry is a Fishery Biologist and the SaSI Data Coordinator for the Washington Department of Fish and Wildlife (WDFW) Fish Science Division. He moved into this position after serving one year as the Region 6 Puget Sound Dungeness Crab Biologist for the WDFW. Prior to working for the WDFW he earned his doctorate at the University of South Florida where he studied the ontogeny of feeding biomechanics in sharks, and before that he earned his bachelor’s degree from Hawai’i Pacific University studying coral reef fish ecology and shark movement patterns. • Gil Lensegrav is an Information Technology Specialist with the WDFW Fish Science Division where he develops and manages research and production data systems for use by regional fish managers and data analysts. His projects include the anadromous fish Spawning Ground Survey (SGS) system, the Salmonid Stock Assessment Inventory (SaSI), and the salmon and steelhead age data system. He began with the agency in 2000, and he has also worked with the agency’s Habitat Division on GIS mapping and application development for fish barrier assessments. Gil has a Bachelor of Science degree in Environmental Sciences from The Evergreen State College. • Brodie Cox is the manager of the WDFW’s Biological Data Systems Unit. This unit serves the Fish Science Division and houses a diverse number of datasets adaptable to corporate, regional, laboratory, and project purposes. Beyond the SaSI, notable projects within the BDS Unit include, but are not limited to: web-based statewide fish hatchery management system: FishBooks, coded wire tag application and recovery accounting, web-based catch reporting and accounting in Washington, Washington State In-stream Atlas project, GIS based Ocean GSI and the statewide high level indicator (HLI) web reporting data portal: ‘H2WS’.
652334
Concepts for Bringing Land and Marine Data Together
Mark MacKenzie
CARIS, Fredericton, NB, Canada.
This paper will focus on addressing issues encountered when attempting to merge land and marine geospatial datasets together, which is essential for effective Coastal Zone Management. There has been a great deal of focus on the management of topographic, and cadastral data within National Spatial Data Infrastructures, but unfortunately the marine data components have received less attention. Combining land and marine geospatial datasets is challenging because of different data standards, disparity between data scales, symbology and coordinate reference system / datum differences. Bringing this data together can occur through the web from disparate sources or alternatively through harmonizing the data in centralized spatial databases. Both scenarios will be discussed. A benefit of the central database approach that will be emphasized is the ability to store a fully de-conflicted set of geospatial features that represents the littoral zone and therefore allows subsequent analysis and decision-making. Successfully addressing the issues associated with merging land and marine data results in more efficient implementation of initiatives such as coastal flood visualization, disaster management and response, and Integrated Coastal Zone Management (ICZM). It also allows expensive marine survey data to be collected, processed and managed once and used many times.
684927
The HUBzero Platform for Scientific Collaboration
Michael McLennan
Purdue University, West Lafayette, IN, USA.
HUBzero is a cyberinfrastructure for scientific simulation and modeling activities. It was created at Purdue University by the NSF-sponsored Network for Computational Nanotechnology to support their web site at nanoHUB.org. Over the years, it has been adapted and expanded to support a dozen more sites in a variety of disciplines, including healthcare, pharmaceuticals, microelectronics, and education.
Some people compare HUBzero to the highly successful Open Courseware Initiative from MIT. But a hub is more than just a repository for course materials. It is a place where researchers and educators can meet and accomplish real work. For example, nanoHUB.org offers more than 140 simulation tools that users can access instantly via an ordinary Web browser, and not only launch jobs, but also visualize and analyze the results — without having to download, compile, or install any code. Simulation jobs can be dispatched on national Grid resources, including the NSF TeraGrid and the Open Science Grid. The HUBzero middleware hides much of the complexity of Grid computing, handling authentication, authorization, file transfer, and visualization — thereby making the tools more accessible not only to computational scientists, but to experimentalists and educators as well. In 2008, more than 89,000 unique users browsed nanoHUB.org from 172 countries worldwide. Of these, 6,700 users accessed simulation tools and launched some 380,000 simulation jobs. Other hubs, such as pharmaHUB.org and thermalHUB.org, have been online for just 18 months, but already have more than 2,500 users.
HUBzero combines its seamless access to simulation power with social networking features, allowing users to share their models, help one another, and collaborate online, thereby accelerating the process of scientific discovery.
Bio: Dr. McLennan received his Ph.D. in 1990 from Purdue University for the study of quantum mechanical electron transport in mesoscopic devices, supported as an SRC Graduate Fellow. He went on to develop CAD software at companies including Bell Labs and Cadence Design Systems. He is well known in the open source community for developing [incr Tcl], an object-oriented extension of the popular Tcl scripting language. He is a coauthor of two books: "Effective Tcl/Tk Programming" and "Tcl/Tk Tools."
In 2004, Dr. McLennan returned to Purdue as a Senior Research Scientist in the Rosen Center for Advanced Computing, where he acts as Director of the HUBzero Platform for Scientific Collaboration. This platform has been used to create several scientific Web sites or "hubs" that currently support tens of thousands of users worldwide. As part of the HUBzero development, Dr. McLennan created the Rappture toolkit, which has been used to build graphical user interfaces for hundreds of tools deployed on the various hubs.
652719
Use of a Non-Supervised Neural Network and Hierarchical Clustering To Study Spatial Patterns of Fish Assemblages in Southeast Alaska
Miller, Katharine1; Norcross, Brenda1; Lorenz, Mitch2
1 School of Fisheries and Ocean Sciences, University of Alaska Fairbanks, Fairbanks, AK, USA.
2 Alaska Fisheries Science Center, Juneau, AK, USA.
Cluster analysis is commonly used to reduce the dimensionality of multivariate data to create groups of similar samples. One of the most commonly used clustering methods in ecology is hierarchical clustering which involves constructing a species resemblance matrix and then fusing the data to create a hierarchy of clusters of increasing similarity. For species biological data, the Bray-Curtis coefficient is commonly used to construct the resemblance matrix because, unlike other similarity measures, its value is not changed by joint absences of species. Community data are composed of a large number of rare species: species that occur in only a few samples or occur in low numbers. Resemblance measures, such as the Bray-Curtis coefficient, do not cope well with rare species because they are constrained to vary between 0 and 100, and the fewer the species in the samples the less latitude there is for variation. There is general agreement that rare species are important components of the ecosystem; however there is less agreement as to whether they can contribute meaningfully to identifying patterns of species distribution or environmental gradients. This research compared the geographic distribution of fish assemblages from 44 estuaries in Southeast Alaska using hierarchical clustering methods and a Kohonen Self-Organizing Map (SOM). The SOM is an unsupervised neural network that is used to classify multidimensional data into a two-dimensional map. The importance of rare species to biogeographic patterns was evaluated.
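The claim that joint absences do not change the Bray-Curtis coefficient can be checked with a short Python sketch. It uses the standard Bray-Curtis similarity, scaled to 0-100 as in the abstract. The abundance counts and site names are invented for illustration and are not data from the study.

```python
def bray_curtis_similarity(x, y):
    """Bray-Curtis similarity (0-100) between two abundance vectors."""
    if len(x) != len(y):
        raise ValueError("samples must list the same species")
    total = sum(a + b for a, b in zip(x, y))
    if total == 0:
        raise ValueError("undefined when both samples are empty")
    diff = sum(abs(a - b) for a, b in zip(x, y))
    return 100.0 * (1.0 - diff / total)


# Illustrative counts for three species at two hypothetical sites.
site_a = [10, 0, 5]
site_b = [6, 2, 5]
base = bray_curtis_similarity(site_a, site_b)

# Append two species that are absent from both samples (joint absences).
padded = bray_curtis_similarity(site_a + [0, 0], site_b + [0, 0])

print(base, padded)
```

A species absent from both samples adds zero to both the numerator and the denominator, so the two printed values are identical. The same structure suggests why samples with few species leave little room for variation: with only a handful of non-zero terms, a difference of one or two individuals shifts the ratio substantially.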
683994
Vision for Unified Alaskan State Commercial Fishing Data
Tracy Olson
Commercial Fisheries, Alaska Department of Fish and Game, Juneau, AK, USA.
Computer Information Services (CIS) of the Alaska Department of Fish and Game (ADF&G), Division of Commercial Fisheries is creating a vision to standardize statewide data collection and streamline processing. The Commercial Fisheries Division is structured as four distinct regions where processing, applications, databases, reporting and development are largely carried out independently, with data stored in multiple places and formats. Applications are often region-specific, with programmers using different programming languages and styles. This results in multiple support, deployment, and access issues. A general lack of data accessibility affects internal staff as well as the public, creating cumbersome interactions whenever data are requested. This is frustrating for everyone involved.
To rectify these problems, CIS is embarking on a new direction to create a statewide data warehouse and related tools for the purpose of combining databases and other data sources. Applications will be built using a standardized framework and have a similar look and feel. By using the same programming languages and core approaches, programmers can efficiently address any regional application. Databases will connect to a centralized data warehouse and reports will be standardized and run from the data warehouse. This will allow programmers to build reports quickly and not have to redeploy an application every time a new report is needed. Users will ideally be able to easily create or modify reports as they need them.
This initiative will require data recovery, translation and an agreement throughout the state to centralize data from many divergent sources. While this is a huge undertaking, we believe this effort will enhance capabilities of ADF&G staff, protect irreplaceable data and provide for greater public access to important fisheries data.
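The idea of translating region-specific data into one warehouse and running reports as queries can be sketched as follows. This is an illustration under stated assumptions, not the CIS design: the regional field names, the `harvest` table, and the helper `to_warehouse_row` are all hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical regional extracts: each region uses its own field names.
region_a_rows = [{"spp": "sockeye", "lbs": 1200.0, "yr": 2009}]
region_b_rows = [{"species_name": "Sockeye", "pounds": 800.0, "season": 2009}]

# Per-region mappings from local field names to the shared warehouse columns.
FIELD_MAPS = {
    "A": {"spp": "species", "lbs": "pounds", "yr": "year"},
    "B": {"species_name": "species", "pounds": "pounds", "season": "year"},
}


def to_warehouse_row(region, row):
    """Rename a regional record's fields to the common schema."""
    mapped = {FIELD_MAPS[region][key]: value for key, value in row.items()}
    mapped["species"] = mapped["species"].lower()  # normalize case
    mapped["region"] = region
    return mapped


warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE harvest (region TEXT, species TEXT, pounds REAL, year INTEGER)"
)
for region, rows in (("A", region_a_rows), ("B", region_b_rows)):
    for row in rows:
        warehouse.execute(
            "INSERT INTO harvest VALUES (:region, :species, :pounds, :year)",
            to_warehouse_row(region, row),
        )

# A report is then a query against the warehouse rather than a new application.
statewide_totals = warehouse.execute(
    "SELECT species, year, SUM(pounds) FROM harvest GROUP BY species, year"
).fetchall()
```

Keeping the field mappings as data rather than code means a new regional source could be added by extending `FIELD_MAPS`, without touching the reporting queries.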
652224
Information Solutions for Watersheds: III. A service-oriented solution for managing and analyzing data across the Mississippi Atchafalaya River Basin and Northern Gulf of Mexico
Parker, Amanda K.1; Bourne, Stephen2; Hampson, John C.3
1 PBS&J, Atlanta, GA, USA.
2 PBS&J, Smyrna, GA, USA.
3 PBS&J, Tampa, FL, USA.
The Mississippi/Atchafalaya River Basin (MARB) covers 1.245M mi2 of the continental U.S. Phosphorus and nitrogen from cities and agricultural fields in the MARB are transported through a hydraulically modified system to the Gulf of Mexico and cause the formation of a large area of hypoxia. At 7,988 mi2 in 2008, it ranks second in the world in size and comprises the most complex water quality problem in the U.S. In June 2008, the Mississippi River/Gulf of Mexico Watershed Nutrient Task Force signed the 2008 Action Plan and released FY operating plans to address the problem. To ensure successful implementation of the Action Plan, we suggest an information solution using web services that are now mature enough to federate volumes of freshwater, estuarine, and marine water quality data, GIS data, aerial imagery, gage data, hydrodynamic data, weather data, and gridded re-analysis data, etc., and to wrap mechanistic modeling and statistical analysis tools in a web-based visualization platform that encourages discussion, consensus building, and ownership. Our solution for community-based watershed planning and management comprises 1) a cyberinfrastructure using web services and catalogues to federate existing databases in multiple agencies and make data sharing open and secure, 2) a web-based GIS tool for exploring a) the data, b) available analysis tools, c) conclusions made, and d) the resulting management policies and implementation results, and 3) a GIS-based analysis workbench that connects directly to the cyberinfrastructure and extracts data according to analytical intent. The papers within this conference, “Information Solutions for Watersheds: I. An analysis workbench for simple, GIS-based, estuary analysis,” and “Information Solutions for Watersheds: II. The Northslope Decision Support System for Arctic water resources planning and management,” illustrate different applications of the information solution presented here.
684929
Behind the Scenes of the Sockeye IUCN Visualization
Kim Rees
Periscopic, Portland, OR, USA.
Interactive data visualizations are immersive tools that allow people to explore information in ways that produce unexpected insight and revelation. Often developed around a specific metaphor or point of view, these tools require manipulation of data in real-time and interfaces that can adjust to changing results.
Recently the State of the Salmon commissioned a visualization of their Sockeye salmon escapement data. The online tool, an interactive complement of the IUCN assessment, is intended to both educate the public and be a scientific research tool.
Providing multiple interfaces for exploring this data, from maps to scatterplots to abstract views, the visualization offers many interactive ways of looking at the raw data using different methodologies. Additionally, it allows for new interpretation of the data, and provides an excellent way to explore, compare, and graph information in ways that have previously been unavailable.
Two members of the development team will give a behind-the-scenes look at what it took to create the visualization, including identifying key audiences, exploring data visually, uncovering hidden story lines, creating themes, and creating user-focused interaction methods.
Bio: Kim Rees has 15 years of experience in the multimedia industry. She has a history of software development in alternative technologies through her efforts with physical computing, embedded systems, and media programming. Kim received her BA in Computer Science from NYU. Periscopic is an award-winning interactive design and development firm specializing in user-centric design with a strong focus on information visualization. The company's work has appeared in several publications, including the 2009 Communication Arts Interactive Annual and in the Information Design Sourcebook.
Bio: Dino Citraro is a 15-year veteran of the multimedia industry. His work has spanned interactive motion pictures, multi-player online games, immersive data visualizations, and interactive hardware installations. Periscopic is an award-winning interactive design and development firm specializing in user-centric design with a strong focus on information visualization. The company's work has appeared in several publications, including the 2009 Communication Arts Interactive Annual and in the Information Design Sourcebook.
683957
The Aquatic Resources Schema
Steve Rentmeester
Environmental Data Services, Portland, OR, USA.
The Aquatic Resources Schema (ARS) was developed to help data collectors in the Pacific Northwest manage, document, and analyze aquatic resources data. In developing the ARS, the primary objectives were to develop a schema that is robust against variations in data collection protocols, that supports procedures for ensuring data integrity at the time of data entry, and that supports efficient analysis and submission of aquatic resources data. The schema provides a proof of concept for two main ideas: discipline-specific data templates and protocol management. Discipline-specific data templates aim to identify unique groupings of objects, attributes, and relationships that are relevant to aquatic ecology as a discipline. Protocol management assumes that data collection protocols are a variable in natural resource monitoring and therefore should be managed as variables within the data management system. Storing metadata about the protocol directly in the database allows those values to be used for customizing data entry forms, for data validation, and to ensure data integrity. These two concepts will be discussed in greater detail.
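The protocol management idea, in which protocol metadata drives validation at data entry, can be illustrated with a short sketch. Nothing here is taken from the ARS itself: the protocol identifier, attribute names, limits, species codes, and the function `validate_record` are hypothetical, and a dictionary stands in for the protocol tables that would live in the database.

```python
# Hypothetical protocol metadata, as it might be read from database tables.
PROTOCOLS = {
    "snorkel_survey_v1": {
        "fish_length_mm": {"type": float, "min": 20.0, "max": 1500.0},
        "species_code": {"type": str, "allowed": {"CO", "CH", "ST"}},
    },
}


def validate_record(protocol_id, record):
    """Return a list of problems found when checking a record against its protocol."""
    problems = []
    for attribute, rule in PROTOCOLS[protocol_id].items():
        if attribute not in record:
            problems.append(f"{attribute}: missing")
            continue
        value = record[attribute]
        if not isinstance(value, rule["type"]):
            problems.append(f"{attribute}: expected {rule['type'].__name__}")
            continue
        if "min" in rule and value < rule["min"]:
            problems.append(f"{attribute}: below minimum {rule['min']}")
        if "max" in rule and value > rule["max"]:
            problems.append(f"{attribute}: above maximum {rule['max']}")
        if "allowed" in rule and value not in rule["allowed"]:
            problems.append(f"{attribute}: not an allowed value")
    return problems


good = validate_record(
    "snorkel_survey_v1", {"fish_length_mm": 85.0, "species_code": "CO"}
)
bad = validate_record(
    "snorkel_survey_v1", {"fish_length_mm": 5.0, "species_code": "XX"}
)
```

Because the rules are data, a change of protocol would mean adding a new entry rather than rewriting the validation code, which is the sense in which the protocol is treated as a variable.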
652206
Ecosystem Informatics for Natural History Data — Developing an integrated framework for biological and geographic data
Reusser, Deborah A.2; Lee, Henry1
1 U.S. Environmental Protection Agency, Newport, OR, USA.
2 U.S. Geological Survey, Newport, OR, USA.
Threats to the ecological integrity of marine and estuarine systems operate over many spatial scales, from nutrient enrichment at watershed/estuarine linkages to invasive species and climate change at regional/global scales. Decision support tools and information systems needed to identify and address issues such as these across multiple spatial scales are lacking. To address this need, we have designed an integrated framework for capturing life history characteristics, environmental preferences and geographic distribution data for marine and estuarine biota called the Pacific Coast Ecosystem Information System (PCEIS). The key aspects of PCEIS include: 1) consistent terminology; 2) translation of numerical habitat and physiological requirements into classes; and 3) classification schemas for natural history, environmental attributes, and geographic distributions. When possible, a hierarchical classification typology was developed to capture information at multiple levels of detail. For example, reproduction, feeding, life style, salinity, and geographic distributions all fit well into hierarchical schemas. In some cases where the structure of the data was not hierarchical (e.g., wave energy), a multidimensional typology was designed. Another component of the framework includes hierarchical spatial topology for connecting watershed characteristics to estuarine environments. PCEIS is a decision support tool containing queryable biological and environmental data that allows users to extract information on multiple species and/or their natural history attributes across a variety of spatial scales. It is currently implemented in a stand-alone, user-friendly Access database for researchers and managers.
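Translating a numerical requirement into a class within a hierarchical typology might look like the sketch below. The class names follow common salinity terminology, but the breakpoints, the two-level grouping, and the function names are placeholders chosen for illustration and are not values or structures taken from PCEIS.

```python
# Illustrative salinity classes as (upper bound, class name) pairs.
# The breakpoints are placeholders, not PCEIS definitions.
SALINITY_CLASSES = [
    (5.0, "oligohaline"),
    (18.0, "mesohaline"),
    (30.0, "polyhaline"),
    (float("inf"), "euhaline"),
]

# Hypothetical second level of the hierarchy: each class rolls up to a parent.
PARENT_CLASS = {
    "oligohaline": "brackish",
    "mesohaline": "brackish",
    "polyhaline": "marine",
    "euhaline": "marine",
}


def salinity_class(value):
    """Map a numeric salinity to the first class whose upper bound exceeds it."""
    if value < 0:
        raise ValueError("salinity cannot be negative")
    for upper_bound, name in SALINITY_CLASSES:
        if value < upper_bound:
            return name
    raise ValueError("salinity is not a finite number")


def classify(value):
    """Return (parent class, detailed class) so queries can use either level."""
    detailed = salinity_class(value)
    return PARENT_CLASS[detailed], detailed


estuary_site = classify(12.0)
open_coast_site = classify(33.0)
```

Storing both levels lets a query match species at the coarse level when only general tolerance is known, or at the detailed level when the data support it.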
683997
A Web-based Data Serving and Visualization Tool for Oregon Coastal Coho Salmon and Aquatic Habitat Information
Jeff Rodgers (substitute: Julie Firman)
Oregon Department of Fish and Wildlife, Corvallis, OR, USA.
Throughout much of their range along the west coast of the contiguous United States, salmon populations are listed as threatened or endangered under the U.S. Endangered Species Act (ESA). One of the requirements of the ESA is to conduct periodic assessments of the status of these populations. Another is to develop recovery plans designed to return listed populations to viability and to monitor progress towards achieving recovery plan goals. Across the state of Oregon, numerous recovery plans are being developed that have specific benchmarks or “measurable criteria” that will be used to track progress towards meeting recovery goals. Along the Oregon coast, the Oregon Department of Fish and Wildlife (ODFW) has implemented its version of a federal ESA recovery plan for coho salmon. To provide information on progress towards achieving recovery goals for coastal Oregon coho, provide access to the data used in ESA viability and threats assessments, and develop a template that can serve as a model for information sharing in other ESA recovery plans being developed in Oregon, ODFW and the State of the Salmon Program have partnered to develop a web-based data serving and visualization tool for Oregon coastal coho salmon and aquatic habitat information. This presentation will provide an overview of this web-based tool.
Bio: Jeff Rodgers has a Bachelor of Science degree in Biology from the University of Oregon (1978) and a Master of Science degree in Fisheries and Wildlife from Oregon State University (1985). Since 1978, Jeff has worked for the Oregon Department of Fish and Wildlife (ODFW) conducting research on the behavior, estuarine use, and freshwater habitat requirements of anadromous salmonids. He has also conducted research on the effects of habitat restoration on salmonid production, and the relative precision and bias of fish population estimate methods. From 1994 to 1996, Jeff was the GIS analyst for a joint Oregon State University and ODFW project devoted to developing alternate conservation strategies for salmon throughout the North Pacific Rim. Since 2003, Jeff has been ODFW's Conservation and Recovery Monitoring Coordinator. He lives in Corvallis, Oregon and works out of ODFW's Fish Research Laboratory.
697484
The realist approach to building ontologies for science
Alan Ruttenberg
Science Commons, Boston, MA, USA.
Building an ontology to be used for large scale data integration is an effort that requires extensive communication and agreement among stakeholder groups. But on what basis are groups to come to agreement? The realist approach establishes such criteria by aiming to ensure that terms from ontologies clearly represent entities in the world, and so establishes a link from scientific agreement about what exists to decision making about what terms should exist in ontologies — they should be in correspondence. In this talk I will elaborate some consequences of taking such an approach and demonstrate, from experience working in a number of collaborative ontology projects, how and why the approach is successful.
684930
Kepler: A scientific work flow support tool
Mark Schildhauer (substitute: Matt Jones)
National Center for Ecological Analysis and Synthesis, Santa Barbara, CA, USA.
Kepler, a free and open source scientific work flow application, is designed to help scientists, analysts, and computer programmers create, execute, and share models and analyses across a broad range of scientific and engineering disciplines. Kepler can operate on data stored in a variety of formats, locally and over the internet, and is an effective environment for integrating disparate software components, such as merging "R" scripts with compiled "C" code, or facilitating remote, distributed execution of models. Using Kepler's graphical user interface, users simply select and then connect pertinent analytical components and data sources to create a "scientific work flow," an executable representation of the steps required to generate results. The Kepler software helps users share and reuse data, work flows, and components developed by the scientific community to address common needs. This presentation will introduce key features and functions of Kepler and provide working examples of its use.
684947
The Scientific Observations Network and semantic tools for ecological data management
Michael Schildhauer
National Center for Ecological Analysis and Synthesis, Santa Barbara, CA, USA.
Advances in environmental science increasingly depend on information from multiple disciplines to tackle broader and more complex questions about the natural world. Such advances, however, are hindered by data heterogeneity, which impedes the ability of researchers to discover, interpret, and integrate relevant data that have been collected by others. A recent NSF-funded workshop on multidisciplinary data management concluded that interoperability can be significantly improved by better describing data at the level of observation and measurement, rather than the level of the data set. Drawing upon work from the genomics community, the Scientific Observations Network (SONet) effort brings together a community of experts from multiple fields to define and develop the necessary specifications and technologies to facilitate the interpretation and integration of observational data.
648197
The Chesapeake Bay Program’s Chesapeake Information Management System (CIMS)
Shenk, Gary; Burch, Brian
EPA / Chesapeake Bay Program Office, Annapolis, MD, USA.
The Chesapeake Bay Program’s Chesapeake Information Management System (CIMS) is an organized, distributed library of information and software tools designed to increase basin-wide public access to Chesapeake Bay information. CIMS partners are those states, federal agencies, academic institutions, and others who signed a Memorandum of Agreement to provide public access to their Chesapeake Bay watershed information. Information is available through CIMS in several gradations of synthesis, from raw data through dynamically generated maps and charts to statically presented reports. CIMS provides benefits to the community through ease of access to information and tools, reduction of cost by eliminating duplicative data and information handling, improvement in data quality, direct data source to data product workflow mapping, and the ability to evolve quickly to be responsive to users’ needs.
683998
Open Science and Data Sharing
Kaitlin Thaney
Science Commons, San Francisco, CA, USA.
With more and more content moving to a digital form, the traditional method of scientific publishing and knowledge sharing is changing. Web technology has brought tremendous efficiency gains to commerce, to entertainment, to culture — but the potential has not yet been seized for scientific research.
We now have the tools and understanding to bring together open research and data on a global scale, embedded with the freedoms necessary to fully utilize them. The first step lies in making scholarly content, both the literature and the research data, legally and technically available. This requires an approach that addresses not only the copyright and rights ownership issues surrounding scholarly literature but also the legal, technical and social barriers to sharing data. The trend towards applying licenses, click-wrap agreements and other sorts of restrictions to scientific data is also increasing, limiting the downstream use of this information. The costs are high, the terms are not always clear, and the protections are not always legally sound. The result is a high barrier to meaningful analysis, annotation, and search across a mass of data that continues to grow exponentially, and to integrating those data with the available literature.
Science Commons, a project of Creative Commons, helps build some of this open infrastructure — crafting policy and legal tools to lower those barriers, and developing technology to make research data and materials easier to find and use. The goal of Science Commons is to speed the translation of data into discovery and to unlock the value of research so more people can benefit from the work scientists are doing.
This talk will explore the various tools and policy Science Commons provides to enable this, and look at the issues surrounding and infrastructure needed to make data sharing more efficient and scalable.
Bio: A Rochester, New York native, Kaitlin comes to Science Commons with a background deeply rooted in news and policy. Prior to Science Commons, she worked as the communications coordinator for MIT iCampus, a research alliance between the university and Microsoft, centered on education technology. She also spent time working as a journalism intern for Reporters Committee for Freedom of the Press in Arlington, VA. Prior to that, Kaitlin worked as a correspondent for The Boston Globe’s City/Region section. Kaitlin did her undergraduate work at Northeastern University, where she received two degrees — one in journalism and the other in political science. Her interests lie in open access publishing, data sharing and licensing issues, and the burgeoning open science movement. She is based in Boston, Massachusetts.
683981
Capitalizing on Data Visualization Technologies: An introduction to Tableau software
Josh Vitello
Tableau Software, Inc., Seattle, WA, USA.
Scientists, managers, database administrators, and others working at the intersection of primary research and natural resources management are faced with a barrage of challenges, not the least of which are various people clamoring for more data, more reports, more information — creating mountains of requests. The science of data visualization has evolved to meet these challenges. More rapid, visual, summarized and appropriate information can be available on demand. In this presentation we will find out how organizations are using data visualization software to get rapid results without breaking their budgets. You’ll learn: How rapid-fire visual analysis helps drive results; How to focus more on analysis and less on formatting data; How to cut the backlog of analytical requests by building an environment where users get what they want, when they want it; How to deliver a more transparent, actionable view of the research and ensure faster, smarter responses to changing conditions.
683981
National Environmental Information Exchange Network: Sharing data for better watershed management
Mitch West, Executive Coordinator
National Environmental Information Exchange Network, Portland, OR, USA.
Effective watershed management efforts rest on accurate measurement of ambient conditions and results. Access to necessary information in usable and consistent format has been a challenge. Traditional monitoring project design has led to a report and to a set of raw data, which is frequently inaccessible to other projects or analyses.
The National Environmental Information Exchange Network (the Network) is a partnership between USEPA, states, tribes, and territories designed to promote access to like data from multiple sources, and to promote the use of standard definitions and formats to promote comparability. When multiple Network partners offer like data collections, they do so in a commonly adopted form. Past Network projects that provide potential value to salmon restoration are outlined:
In 2003, the four states of EPA’s Region 10 established the “Pacific Northwest Water Quality Exchange,” defining access protocols and formats to access ambient water monitoring data. The USEPA used this project as a template for the current Water Quality Exchange (WQX). This project has already more than doubled the record count in a decades-old data store.
Chesapeake Bay Project analysts were frustrated by lack of access to information about restoration projects undertaken by various authorities. Scientists were unable to link observed changes in ambient conditions to the projects being undertaken in the name of restoration. The state members of the Chesapeake Bay Project published information on their projects in a common format linkable to other data sources.
This presentation will focus on these efforts, and show how salmon restoration efforts could reuse Exchange Network technology and processes to support new data access projects.
Bio: Mitch has been involved in the creation and growth of the Exchange Network from its inception in 1999. From 1994 to 2007, Mitch worked at the Oregon Department of Environmental Quality as a budget manager and later as an information systems manager. While working for the State of Oregon, Mitch participated in the Information Management Workgroup, a States/USEPA partnership focused on improving the quality, accessibility, and use of environmental data for decision-making. As a part of that effort, he helped draft the blueprint for the National Environmental Information Exchange Network. Prior to joining state government, Mitch completed a 22-year career with the United States Coast Guard, retiring as a Lieutenant Commander in 1994. While in the Coast Guard, he obtained his information systems background with a master’s degree in business from the University of Maryland.
We're working with the ODFW Research Lab in Corvallis to create a web and database system to:
• Make it easier for ODFW staff to track their progress towards meeting conservation goals for coastal coho.
• Provide public access to frequently requested data and information on salmon and aquatic habitat in coastal Oregon and the Lower Columbia region.
We're working with DFO staff at the Pacific Biological Station in Nanaimo to develop a summarized catch and escapement data set by Conservation Unit (CU) to:
• Ensure DFO researchers have ready access to standard, core information needed to assess biological status of CUs.
• Establish the groundwork for eventual public access to escapement, catch rate, and CU status information in BC and Yukon.
We're working with ADF&G's Copper River and Prince William Sound Commercial Fisheries staff in Cordova to create web and database systems to:
• Make it easier for ADF&G staff to enter, edit, retrieve, and analyze escapement, age, sex, size and harvest data.
• Provide public access to frequently requested data and information.