Student Abstracts: Computer Science at ANL

A Survey of Web Services Interoperability. SCOTT PRICE (University of Illinois at Chicago Chicago, IL 60614) IVAN JUDSON (Argonne National Laboratory, Argonne, IL, 60439)

Web Services have become the lingua franca for building service-oriented systems. The development of tools for web service developers takes a significant amount of time, negatively affecting the deployment, and thus the adoption, of web services as a real technology. With the advent of the Web Services Interoperability Organization and its Basic Profile specification, which outlines guidelines for interoperable services, toolkits can adhere to a set of standards that enables interoperability. The toolkits selected for this survey are Microsoft's Visual Studio.NET (VS.NET), Apache Axis for Java (Axis), SOAP::Lite (S::L) for Perl, ZSI for Python and gSoap for C/C++. All of the toolkits require auxiliary software in order to work. This includes Microsoft's Windows XP Professional operating system and Internet Information Services (IIS) web server, Sun Microsystems' Java 2 Platform, Enterprise Edition (J2EE), the Apache Tomcat web server, and ActiveState's ActivePerl and ActivePython. A set of simple web services was first designed by constructing a Web Services Description Language (WSDL) file. VS.NET was selected to be the provider of these services, while the remaining toolkits were used to construct clients that could consume the services. VS.NET, Axis, ZSI and gSoap provided tools within the toolkit to read the WSDL and automatically generate client and server code stubs. This allows developers to focus solely on writing their specific logic, as opposed to having to meddle with the details of web services. S::L did not provide such a tool; however, the toolkit proved to be easily customizable and achieved interoperability with little fuss. The next step was to create servers using the remaining toolkits and to ensure that all clients and servers were interoperable. The tools provided by VS.NET, Axis, ZSI and gSoap simplified this process. S::L was also easily customizable when acting as a server.
While each of these toolkits has documentation, it is often incomplete or lacking in some respect. This survey was aimed at demystifying the process of developing interoperable services by providing ample documentation of our methods.

Configuration File Generation with Python and XML. KENNETH RAFFENETTI (University of Illinois at Urbana Champaign Urbana, IL 61801) CRAIG STACEY (Argonne National Laboratory, Argonne, IL, 60439)

Servers that provide network services to a large number of machines need configuration files that can take a long time to create and update. Systems administrators would do well to devise a system that automatically generates and updates these files from information gathered about the network. When this information is created or updated, a single script can be run to put the updated files in place and ensure the correct functioning of the network and its hosts. The Python programming language, combined with the data representation abilities of the Extensible Markup Language (XML), greatly aids the generation of such files.
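
A minimal sketch of the approach described above, assuming a hypothetical XML inventory format: the `network` and `host` elements, their attributes, and the function names are illustrative assumptions, not the project's actual schema or code. It parses the network description once and renders a hosts-style file from it.

```python
import xml.etree.ElementTree as ET

# Hypothetical inventory; element and attribute names are assumptions.
NETWORK_XML = """
<network domain="example.org">
  <host name="beta" ip="10.0.0.2"/>
  <host name="alpha" ip="10.0.0.1"/>
</network>
"""


def load_hosts(xml_text):
    """Parse the inventory and return (domain, list of host attribute dicts)."""
    root = ET.fromstring(xml_text.strip())
    hosts = [dict(host.attrib) for host in root.iter("host")]
    return root.get("domain"), hosts


def render_hosts_file(domain, hosts):
    """Render hosts-file lines: IP, fully qualified name, short name."""
    lines = ["127.0.0.1\tlocalhost"]
    for host in sorted(hosts, key=lambda h: h["name"]):
        fqdn = f"{host['name']}.{domain}"
        lines.append(f"{host['ip']}\t{fqdn}\t{host['name']}")
    return "\n".join(lines) + "\n"


domain, hosts = load_hosts(NETWORK_XML)
hosts_file_text = render_hosts_file(domain, hosts)
print(hosts_file_text, end="")
```

Because every generated file is derived from the same XML source, adding a host means editing one record and rerunning the script, rather than touching each configuration file by hand.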

Density Functional Theory-Based Nanostructure Investigation. ADRIAN KOPACZ (Northwestern University Evanston, IL 60208) MIHAI ANITESCU (Argonne National Laboratory, Argonne, IL, 60439)

The development of software for the investigation of chemical and mechanical properties of nanostructures promises to elucidate phenomena not observed in bulk materials. The method formulates a two-step approach to compute the electronic density distribution in and around a nanostructure and then the displacement of its nuclei. The Electronic Problem employs interpolation and coupled cross-domain optimization techniques through a process called electronic reconstruction. The Ionic Problem, within a quasicontinuum framework, relocates the nuclei of the nanostructure given the electronic density in the domain. The goal of this work is to implement an object-oriented framework that will provide testing mechanisms of the evolving code. Future work will focus on further enhancements to substantially increase the dimension of the nanostructures that can be simulated by using approaches that include accurate density functional theory (DFT) computation.

Design and Development of a PDA Interface for Cloning and Purification Database Systems. RAKEYA SMITH (Governors State University University Park, IL 60466) DR. SOON-OK PARK (Argonne National Laboratory, Argonne, IL, 60439)

The objective of the research being conducted in the Midwest Center for Structural Genomics (MCSG) and the Bioscience Division is to develop and optimize new, rapid, integrated methods for highly cost-effective determination of protein structures through x-ray crystallography. Our near-term goal is to improve a user application that biologists use, in order to advance research data storage and mobility. We faced the task of modifying the cloning application so that it is usable on all PDAs, providing some of the application's desktop functionality there. We took several precautions while developing the application: we made it user friendly and added data validation to reduce errors in the data that is entered and stored in the database. The application has several advantages. It increases the quality and efficiency of data collection and analysis. It minimizes data entry errors. Data is collected electronically from various sources. It scans devices with a scanner on the PDA and imports the data into the database. It increases accessibility and mobility.

Design and Development of a PDA Interface for Cloning and Purification Database Systems. SOON PARK (Governors State University University Park, IL 60466) DR. ANDRZEJ JOACHIMIAK (Argonne National Laboratory, Argonne, IL, 60439)

One of the research goals of the Midwest Center for Structural Genomics (MCSG) is to develop and optimize new, rapid, integrated methods for highly cost-effective determination of protein structures through x-ray crystallography. The cloning database system has been developed for automated gene cloning, gene alteration, and expression using a commercial liquid handling robot system. The purification database system has been developed for protein purification by implementing a single process stream from cell growth to protein delivery (concentration, characterization). Web applications are available for scientists to access the cloning and purification database systems and store their research data through a network browser in the MCSG. Mobile computing offers the possibility of dramatically expanding the versatility of computers by bringing them off the desktop and onto PDAs like the Pocket PC, Palm, and Blackberry. This project designs a PDA application for the cloning and purification database systems. The benefits include improved operational efficiency, fewer human errors, time savings, and increased accessibility and mobility. The PDA cloning and purification systems make it convenient for scientists to access the databases in the lab while preserving the sophistication of the web application.

Developing Information Technology Data Management Applications using PHP and MySQL. ERIC ANDERS (Iowa State University Ames, IA 50014) MIKE SKWAREK (Argonne National Laboratory, Argonne, IL, 60439)

This project involved developing several systems for managing information technology data required for everyday use. PHP and MySQL provide a framework that allows for rapid development and easy integration with HTML elements. Accessible and efficient systems can be developed in a short amount of time. The PHP and MySQL online manuals provided all of the necessary information to complete the project. I developed several different systems during the course of the project including an application that tracks wireless access point information gathered with Kismet and an application that manages cyber security incident information. While involved in these projects, I investigated new techniques in web design developed to present data as efficiently as possible. One such technique is known as AJAX (Asynchronous JavaScript And XML). AJAX allows data to be displayed to the user's web browser without reloading the page. As new information becomes available, it is instantly displayed to the user.

From Concept to Creation and Implementation: The Web-Based Hour and Dose Tracking and Monitoring System for the Alpha Gamma Hot-Cell Facility. TYLER DILEO (Pennsylvania State University University Park, PA 16802) PAUL DOMAGALA (Argonne National Laboratory, Argonne, IL, 60439)

Due to the dangerous nature of working with and handling radioactive material, mandatory regulations are set forth by the governing authorities. Adhering to such strict policies demands the utmost attention to detail, not only for the facility to remain in operation but also for the ongoing safety of those working directly and indirectly in and around the hot cells. Though a monitoring system is currently in place, its shortcomings require that a more effective system be implemented to ensure that safety regulations and protocols are met on a daily basis. Knowing only the purpose and needs of the original spreadsheet, and using a combination of Hypertext Markup Language (HTML), PHP: Hypertext Preprocessor (PHP), and a Structured Query Language database (MySQL) managed with the aid of Navicat, a new web-based system was created from scratch. Those who used the original spreadsheet before the web-based database was created are now able to use the new system in a limited manner. Until this method of data recording has met the satisfaction of all involved, it will continue to be used alongside the spreadsheet. This database is the first of many steps toward creating web-driven records within the Alpha-Gamma Hot Cell Facility (AGHCF). Among those records are web-based workslips and real-time radiation dose recordings for the facility.

Keeping System Administrators Informed: Building a Reporting Mechanism for the BCFG2 Configuration Management System. JOSEPH HAGEDORN (University of Illinois at Urbana-Champaign Urbana, IL 61801) CRAIG STACEY (Argonne National Laboratory, Argonne, IL, 60439)

Bcfg2 is a client-server configuration management tool used to maintain networks of computers. To overcome some of the challenges in maintaining a diverse group of computer configurations, Bcfg2 implements a comprehensive reporting system. This system provides a feedback loop that gives system administrators vital information for effectively managing a network of computers. The uniqueness of a report allows it to highlight otherwise unobtainable information about possible problems. Reports can also present a variety of other data in an easy-to-read format to help administrators gain information about the configuration and state of their managed computers, further easing management. Useful information to include in a report might consist of overall system statistics, discrepancies between specified and actual configuration, invalid configuration, and auditing information. Requirements for the implementation of the reporting system include extensibility of reports, flexibility of report delivery mechanisms, and the ability to deliver the information administrators need. The reporting system implemented in Bcfg2 meets many of these requirements, yet can improve with time due to its flexible nature.

Modeling Dynamics in the Kah Framework. CHARLES TREATMAN (Oberlin College Oberlin, OH 44074) ED FRANK (Argonne National Laboratory, Argonne, IL, 60439)

Kah is a framework for large-scale systems biology applications. However, Kah does not provide facilities for simulating reaction dynamics. An extensible object model is developed for representing reaction dynamics in the Kah framework, and a number of dynamics types are implemented. Converters are developed to connect the new object model to external generators for simulation of reaction systems. The dynamics model is used to reproduce results from a paper on modeling the MAP Kinase pathway in order to verify the validity of the object model. The ModelEditor, an existing Kah application, is modified to allow interaction with the new dynamics simulation capabilities of the Kah framework.

Network Enabled Optimization Server (NEOS) introduces a novel method of optimizing problems. The NEOS Server provides solutions to find best values under linear and/or non-linear constraints which have derivatives and sparsity patterns computed with diff. JOSE MARTINEZ (Governors State University University Park, IL 60466) J. MORE (Argonne National Laboratory, Argonne, IL, 60439)

The diet problem is one of the first optimization problems to be studied, dating back to the 1930s and 1940s. It was first motivated by the Army's desire to meet the nutritional requirements of GIs in the field while minimizing cost. George Stigler, a researcher, arrived at an estimate of $39.93 for the optimal solution to the linear program using a heuristic method. In 1947, Jack Laderman of the Mathematical Tables Project of the National Bureau of Standards undertook Stigler's problem. The diet case study is one of the most visited sites for linear programming. According to past and recent surveys, many students, teachers, professors, and businesses have used this site to teach linear programming. The goal of the diet problem is to find the cheapest combination of foods that satisfies a person's daily nutritional requirements. The case study is currently down because of a server change at Argonne and the loss of some of the source code. To fix the problem, I had to learn Linux, the operating system that hosts the diet case study. The case study was written in Perl and C++, and that code no longer works. The strategy for fixing it is to rewrite the nph-Input script file, along with other files, in Python. To convert the code, I am studying the existing Perl code and learning Python. In particular, I am writing one Python module to read users' web input, such as the foods they like to eat, and another Python module to flush buffered output, such as the solution from NEOS, to the web.
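
The goal stated above corresponds to a standard linear program. The formulation below is a textbook sketch rather than the case study's own model; the symbols are assumptions introduced here: F is the set of foods, N the set of nutrients, c_j the cost per serving of food j, a_{ij} the amount of nutrient i in one serving of food j, x_j the number of servings chosen, and n_i^{min}, n_i^{max} the daily bounds on nutrient i (the upper bound may be dropped where no maximum applies).

```latex
\begin{aligned}
\min_{x}\quad & \sum_{j \in F} c_j x_j \\
\text{subject to}\quad & n_i^{\min} \le \sum_{j \in F} a_{ij} x_j \le n_i^{\max}, \qquad i \in N, \\
& x_j \ge 0, \qquad j \in F.
\end{aligned}
```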

Scalable Systems Software. ZACH LOWRY (MTSU Murfreesboro, TN 37130) NARAYAN DESAI (Argonne National Laboratory, Argonne, IL, 60439)

System software on high performance systems consists of everything below the user applications and above the kernel. When designing and writing system software, a number of different problems are encountered. Scalability is one such problem, and a key concern because system size continues to grow quickly. We discuss the Cobalt system software suite and the techniques used to overcome these scalability concerns.

Supporting Information Technology to take workload from System Administrators in order to maximize the volume of projects accomplished by the Systems department of Argonne National Laboratory. JOE KESTEL (Governors State University University Park, IL 60466) CRAIG STACEY (Argonne National Laboratory, Argonne, IL, 60439)

The goal of my visit to Argonne National Laboratory's Mathematics and Computer Science Division was to work in a real-world IT department, doing hands-on the kinds of assignments currently carried out in IT departments around the world, while gaining new skills during the appointment. These included a starter course in programming high-level languages such as Python and PHP, providing helpdesk support, and relieving administrators of the overhead of some assignments. I also had the privilege of building a server and performing a basic installation of its operating system. I gained knowledge of cluster computer hardware and how a cluster system operates by reassembling one in a more effective deployment. Experience with such a machine is obtainable in most IT systems departments.

The Cost of Using MPICH, MPICH2, and the Ganglia GMOND. BRIAN BAKER (Olivet Nazarene University Bourbonnais, IL 60914) SUSAN COGHLAN (Argonne National Laboratory, Argonne, IL, 60439)

Scientists use clusters and supercomputers to solve complex problems with far greater precision, and in a fraction of the time, than would be possible on a single workstation. Scientists want to ensure that each node of the cluster is focusing all of its attention, or computing power, on their codes. Other processes on the compute nodes, disk I/O, and network I/O all cause interference with the scientist's code. Pete Beckman, along with help from Kazutomo Yoshii, has designed and implemented a program called Selfish Detour that calculates the amount of interference, or "noise" (also latency), that these external processes are causing. This program, however, only calculates the latencies for the system as a whole, and cannot pinpoint the amount of latency caused by a specific application. Comparisons of latencies caused by using MPICH and MPICH2 will be discussed in this paper, along with a conclusion on whether MPICH2 is able to lower latency levels as compared with MPICH. Latencies caused by running daemons such as the Ganglia daemon will also be discussed, and a conclusion will be drawn on whether the daemon is fit to be used in a production environment.

The Diet Problem Case Study Using Network Enabled Optimization Systems (NEOS). MALCOLM KNOX (Governors State University University Park, IL 60466) XUEQING TANG (Governors State University University Park, IL 60466) JORGE MORE (Argonne National Laboratory, Argonne, IL, 60439)

The Diet Problem Case Study is a real-world application that utilizes the Network Enabled Optimization System (NEOS). The Diet Problem is the most popular case study in industry and universities for learning about linear programming. When the menu is accessed and the user makes selections for a day's meals, the error "Error in write_data program: return code 32256" occurs, and the user does not get to see the least-cost menu for those selections. If the user wants to change the constraints on the menu, an error displays "Can't open file 'total_cost.out'". We studied C++, Perl, Hypertext Markup Language (HTML), A Mathematical Programming Language (AMPL), and linear programming to find out what is missing in the current system. We are rewriting all of the code in Python to get the case study running again. Python is an object-oriented programming language that is readable, maintainable, portable, and powerful. It can be combined with other languages if necessary, and it is easy to use and easy to learn. Because Python has these advantages over other programming languages, converting the Diet Problem code to Python saves valuable time and resources, reduces debugging time, and provides an easier working environment. I am working on the write_data module. The module has two inputs: the input.dat file and the user's food selection. The input.dat file lists each nutrient's name, measuring unit, and minimum and maximum requirements, along with the nutrient information for each food. The write_data module will generate an AMPL data file called diet.dat, based on the user's selection and the input.dat file.
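
As an illustration of what such a module might do, the sketch below renders AMPL data text from in-memory tables. It is not the project's code: the layout of input.dat is not given in the abstract, so the tables here stand in for its parsed contents, and the set and parameter names (NUTR, FOOD, cost, n_min, n_max, amt), the sample values, and the per-food cost field are all assumptions. The output is assumed to match an AMPL model that declares `param amt {NUTR, FOOD}`.

```python
# Stand-in for the parsed contents of input.dat (all values are made up).
NUTRIENTS = {"A": (700, 20000), "C": (700, 20000)}  # name -> (min, max)
FOODS = {
    "BEEF": {"cost": 3.19, "amt": {"A": 60, "C": 20}},
    "CHK": {"cost": 2.59, "amt": {"A": 8, "C": 0}},
    "FISH": {"cost": 2.29, "amt": {"A": 8, "C": 10}},
}


def write_diet_data(nutrients, foods, selection):
    """Return AMPL data text restricted to the foods the user selected."""
    chosen = [name for name in foods if name in selection]
    out = []
    out.append("set NUTR := " + " ".join(nutrients) + " ;")
    out.append("set FOOD := " + " ".join(chosen) + " ;")
    out.append("param cost :=")
    for name in chosen:
        out.append(f"  {name} {foods[name]['cost']}")
    out.append(";")
    out.append("param: n_min n_max :=")
    for nutrient, (low, high) in nutrients.items():
        out.append(f"  {nutrient} {low} {high}")
    out.append(";")
    out.append("param amt :=")
    for nutrient in nutrients:
        for name in chosen:
            out.append(f"  {nutrient} {name} {foods[name]['amt'][nutrient]}")
    out.append(";")
    return "\n".join(out) + "\n"


diet_dat = write_diet_data(NUTRIENTS, FOODS, selection={"BEEF", "FISH"})
print(diet_dat, end="")
```

Returning the text instead of writing the file directly keeps the function easy to test; the caller can then write the result to diet.dat.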

The Versatility of Web Applications. CHRIS VULETICH (Northern Illinois University DeKalb, IL 60115) CRAIG STACEY (Argonne National Laboratory, Argonne, IL, 60439)

The use of web applications has become increasingly popular in business and scientific settings since the mid-1990s. As programming languages such as Perl, PHP and ASP become easier and more comprehensive, more and more people are using these languages to create web applications that process large amounts of data and store the data in a database.

Two Hybrid Approaches to Increase Activity Analysis Accuracy. SCOTT EASTERDAY (La Sierra University Riverside, CA 82515) BARBARA KREASECK (La Sierra University Riverside, CA 92515) LUIS RAMOS (La Sierra University Riverside, CA 92505) PAUL HOVLAND (Argonne National Laboratory, Argonne, IL, 60439)

Automatic Differentiation (AD) is the process of translating a program that computes a function f into a different program that computes the derivative of that function, f '. Activity analysis is important for AD. Our results show that a dynamic activity analysis, checking at run-time, incurs an average overhead of 27% when all independent variables are active. When as few as half of the independent variables are active, dynamic activity analysis enables an average speedup of 28%. We investigated two techniques aimed at reducing the overhead of dynamic activity analysis. One uses a profile-guided activity analysis and the other uses a less-conservative static activity analysis. The results from these techniques are mixed: some benchmarks showed improvement, others performed poorly.
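
To make the idea of a run-time activity check concrete, here is a minimal forward-mode sketch using operator overloading. It is an illustration only and not the tool or method evaluated in this work; the `Value` class and the function `f` are hypothetical. A value is treated as active when it carries a derivative, and each operation checks this at run time so that derivative arithmetic is skipped when both operands are inactive.

```python
class Value:
    """A number with an optional derivative; deriv is None when inactive."""

    def __init__(self, val, deriv=None):
        self.val = float(val)
        self.deriv = deriv

    @property
    def active(self):
        return self.deriv is not None

    def __add__(self, other):
        val = self.val + other.val
        if not (self.active or other.active):
            return Value(val)  # run-time check: no derivative work needed
        return Value(val, (self.deriv or 0.0) + (other.deriv or 0.0))

    def __mul__(self, other):
        val = self.val * other.val
        if not (self.active or other.active):
            return Value(val)  # run-time check: no derivative work needed
        # Product rule, treating an inactive operand's derivative as zero.
        deriv = (self.deriv or 0.0) * other.val + self.val * (other.deriv or 0.0)
        return Value(val, deriv)


def f(x, y):
    return x * y + x * x


x = Value(3.0, deriv=1.0)  # active: differentiate with respect to x
y = Value(4.0)             # inactive: treated as a constant
result = f(x, y)
print(result.val, result.deriv)
```

The check itself costs something on every operation, which is the kind of overhead the abstract reports when all independent variables are active; the savings come from operations whose operands are all inactive.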

Web Content Development and Management using DREAMWEAVER MX 2004 for the team of scientists working on the Linac Coherent Light Source (LCLS) project at Argonne National Laboratory (ANL). ALY DIAKHATE (Borough of Manhattan Community College New York, NY 10007) CHRISTOPHER KLAUS (Argonne National Laboratory, Argonne, IL, 60439)

Scientists at Argonne National Laboratory (ANL) and their colleagues at Stanford Linear Accelerator Center (SLAC) work together on joint projects. While SLAC has its own web site, the Argonne team does not. The purpose of the project is to build a web site similar to the SLAC web site, but with a design similar to the official ANL web site. To simplify the task, we use a software program known as DREAMWEAVER MX 2004 for its capability to create, maintain and update Hypertext Markup Language (HTML) pages. We focus on several major points: setting and stating our goals (why the site is needed), time management (how DREAMWEAVER helped us save time when designing the site), and the presentation, content management and navigation of the site.

WSND: an architecture for wireless sensor networks. ISAAC WASILESKI (University of Chicago Chicago, IL 60637) PETE BECKMAN (Argonne National Laboratory, Argonne, IL, 60439)

The rise in popularity of wireless sensor networks composed of small "motes" capable of gathering data, performing a small amount of processing, and communicating over ad-hoc wireless networks has led to the need for an integrated architecture, not only on the mote level, but on the level of data processing and user interaction. There already exist programs for controlling the motes, for visualizing data, and for debugging the network; however, what is needed is a modular architecture that allows built-in and custom components to co-operate in managing the network. The WSND system provides a low-overhead system to do just that, and incorporates a number of components to demonstrate the capabilities of the system. In its current incarnation, the system can gather data from a self-organizing ad-hoc network of motes, feed this data into event listeners or data handlers, detect and respond to events, collect data into a relational database, and visualize raw messages, graphical plots of sensor data, or routing graphs in near real-time. The key is that it is trivial to extend the system in any way, for instance by writing new event detectors, which can be developed in high-level scripting languages. In this manner, the system can be adapted easily to any desired task.

ZeptoOS and The IBM Blue Gene/L Supercomputer. CAMERON COOPER (Ohio State Columbus, OH 43210) PETE BECKMAN (Argonne National Laboratory, Argonne, IL, 60439)

The IBM Blue Gene/L supercomputer currently provides more computational power than any other supercomputer architecture in the world. However, without proper software support this system cannot be fully utilized. Many crucial software systems in the IBM Blue Gene/L system are closed-source, which prevents modification and fine tuning to maximize system performance. In our research, we have analyzed different systems in the Blue Gene/L and looked for ways to improve them. In some cases this has required us to replace them with our own open-source versions. In particular, we have done this with the I/O Node Kernel and Ramdisk, as well as the CIOD system call forwarder.

Zoid [Zepto OS IO Daemon]. IVAN BESCHASTNIKH (University of Chicago Chicago, IL 60637) PETER BECKMAN (Argonne National Laboratory, Argonne, IL, 60439)

The design of IBM's Blue Gene family of supercomputers scales to thousands of nodes by offloading all system calls made by a compute node to a designated IO node. To make the porting of software possible, the system call interface is transparent to software written for POSIX-compliant systems. The native mechanism responsible for servicing forwarded system calls is the Compute IO Daemon (CIOD). This paper introduces and describes the design of the ZeptoOS IO Daemon (Zoid), an alternative to CIOD. Zoid is an attempt to construct a scalable system call forwarding mechanism that is designed to emulate CIOD on the BG/L and integrate with ZeptoOS, a platform-independent research operating system meant to scale to systems with millions of CPUs. Zoid has a flexible design which allows user-defined behaviour per system call, varying consistency semantics and a choice of cache policies. Another critical feature of Zoid is that it is an open platform for research. Open source lets scientists explore advanced functionality and new algorithms for BG/L that would otherwise not be possible. Most importantly, Zoid and the ZeptoOS suite are the only available open source choice for the BG/L architecture.