Which of the following provides a uniform interface to the database for users and applications?

The Essential Resource Management

Thomas Sterling, ... Maciej Brodowicz, in High Performance Computing, 2018

5.1 Managing Resources

A supercomputer installation frequently represents a significant financial investment by the hosting institution. However, the expenses do not stop once the hardware has been acquired and deployed. The hosting data center needs to employ dedicated system administrators, pay for support contracts and/or a maintenance crew, and cover the cost of electricity used to power and cool the machine. Together these are referred to as the “cost of ownership”. The electricity cost is frequently overwhelming for large installations: a commonly quoted average in the United States is over US$1 million per year for each megawatt of power consumed, and in many other countries this figure is much higher. It is not surprising that institutions pay close attention to how supercomputing resources are used and how to maximize their utilization.

Addressing these concerns, resource management software plays a critical role in how supercomputing system resources are allocated to user applications. It not only helps to accommodate different workload sizes and durations, but also provides uniform interfaces across different machine types and their configurations, simplifying access to them and easing (at least some) portability concerns. Resource management middleware provides mechanisms by which computing systems may be made available to various categories of users (including those external to the hosting institution, for example via collaborative environments such as the National Science Foundation XSEDE [1]) with accurate accounting and charging for the resource use. Resource management tools are an inherent part of the high performance computing (HPC) software stack. They perform three principal functions: resource allocation, workload scheduling, and support for distributed workload execution and monitoring. Resource allocation takes care of assigning physical hardware, which may span from a fraction of the machine to the entire system, to specific user tasks based on their requirements. Resource managers typically recognize the following resource types.

Compute nodes. Increasing the number of nodes assigned to a parallel application is the simplest way to scale the size of the dataset (such as the number of grid points in a simulation domain) on which the work is to be performed, or to reduce the execution time for a fixed workload size. Node count is therefore one of the most important parameters requested when scheduling an application launch on a parallel machine. Even a single machine may include various node types, for example differing in memory capacity, central processing unit (CPU) type and clock frequency, local storage characteristics, available interconnects, etc. Properly configured resource managers permit selection of the right kind of node for the job, precluding the assignment of resources that would likely go unused.

Processing cores (processing units, processing elements). Most modern supercomputer nodes feature one or more multicore processor sockets, providing local parallelism to applications that support it through multithreading or by accommodating several concurrent single-threaded processes. For that reason, resource managers provide the option of specifying shared or exclusive allocation of nodes to workloads. Shared nodes are useful in situations where already assigned workloads would leave some of the cores unoccupied; by coscheduling different processes on the remaining cores, better utilization may be achieved. However, this comes at a cost: all programs executing on the shared node will also share access to other physical components, such as memory, network interfaces, and input/output (I/O) buses. Users who perform careful benchmarking of their applications are frequently better off allocating the nodes in exclusive mode to minimize the intrusions and resulting performance degradation caused by contention with unrelated programs. Exclusive allocation can also be used for programs that rely on the affinity of the executing code to specific cores to achieve good performance. For example, programs that are sensitive to communication latency may want to place their message-sending and message-receiving threads on cores close to the PCI Express bus that hosts the relevant network card. This may not be possible when multiple applications enforce their own, possibly conflicting, affinity settings at the same time.

Interconnect. While many systems are built with only one network type, some installations explicitly include multiple networks or have been expanded or modernized to take advantage of different interconnect technologies, such as Gigabit Ethernet (GigE) and InfiniBand in combination. Selection of the right configuration depends on the application's characteristics and needs. For example, is the program's execution more sensitive to communication latency, or does it need as much communication bandwidth as possible? Can it take advantage of channel bonding using different network interfaces? Often the answer is imposed by the available version of the communication library with which the application has been linked; for example, it is common to see message-passing interface (MPI) installations with separate libraries supporting InfiniBand and Ethernet if both network types are available. Selecting the wrong network type will likely result in less efficient execution.
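
How the preferred network is conveyed to the application is site and implementation specific. As a hedged sketch assuming an Open MPI installation, the transport can historically be selected at launch time through MCA parameters; the component names vary between versions and sites, and the application name my_app and process count are placeholders:

# Prefer the InfiniBand (openib) transport, plus the shared-memory and loopback components
mpirun --mca btl openib,vader,self -np 64 ./my_app
# Force plain TCP over Ethernet instead
mpirun --mca btl tcp,vader,self -np 64 ./my_app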

Permanent storage and I/O options. Many clusters rely on shared file systems that are exported to every node in the system. This is convenient, since storing a program compiled on the head node in such a file system makes it available to the compute nodes as well. Computations may also easily share a common dataset, with modifications visible to the relevant applications while they are still running. However, not all installations provide efficient high-bandwidth file systems that scale to all machine resources and can accommodate concurrent access by multiple users. For programs performing a substantial amount of file I/O, localized storage such as the local disks of individual nodes or burst buffers (fast solid-state device pools servicing I/O requests for predefined node groups) may be a better solution. Such local storage pools are typically mounted under a predefined directory path. The drawback is that datasets generated this way have to be explicitly moved to the front-end storage after job completion to permit general access (analysis, visualization, etc.). Since there is no single solution available, users should consult local machine guides to determine the best option for their application and how it can be conveyed to the resource management software.
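
A common pattern for such jobs, sketched below with generic shell commands, is to stage input onto node-local storage, compute against it, and copy the results back to shared storage before the job ends. The paths and the simulate binary are placeholders; the actual local-storage mount point is site specific:

# Stage input data onto fast node-local storage
cp /shared/project/input.dat /local/scratch/$USER/
# Run the application against the local copy at full local-I/O bandwidth
./simulate /local/scratch/$USER/input.dat -o /local/scratch/$USER/output.dat
# Stage results back to the shared file system so they remain accessible after the job
cp /local/scratch/$USER/output.dat /shared/project/results/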

Accelerators. Heterogeneous architectures that employ accelerators (graphics processing units (GPUs), Many Integrated Core (MIC) processors, field-programmable gate array (FPGA) modules, etc.) in addition to the main CPUs are a common way to increase aggregate computational performance while minimizing power consumption. However, this complicates resource management, since the same machine may consist of some nodes populated with accelerators of one type, some nodes populated with accelerators of a different type, and some nodes that contain no accelerating hardware at all. Modern resource managers permit users to specify parameters of their jobs so that the appropriate node types are selected for the application. At the same time, codes that do not need accelerators may be confined to regular nodes as much as possible for the best resource utilization over multiple jobs.

Resource managers allocate the available computing resources to jobs specified by users. A job is a self-contained work unit with associated input data that during its execution produces some output result data. The output may be as little as a line of text displayed on the console, or a multiterabyte dataset stored in multiple files, or a stream of information transmitted over the local or wide area network to another machine. Jobs may be executed interactively, involving user presence at the console to provide additional input at runtime as required, or use batch processing where all necessary parameters and inputs for job execution are specified before it is launched. Batch processing provides much greater flexibility to the resource manager, since it can decide to launch the job when it is optimal from the standpoint of HPC system utilization and is not hindered by the availability of a human operator, for example at night. For this reason, interactive jobs on many machines may be permitted to use only a limited set of resources.
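
The exact form of a batch job specification depends on the resource manager in use. As a hedged sketch assuming a SLURM-style system (the partition name, GPU request, and application name are placeholders), a batch script requesting several of the resource types discussed above might look roughly like this:

#!/bin/bash
#SBATCH --job-name=demo              # name shown in the job queue
#SBATCH --partition=production       # queue (partition) to submit to
#SBATCH --nodes=4                    # number of compute nodes
#SBATCH --ntasks-per-node=16         # processes to run on each node
#SBATCH --gres=gpu:2                 # request two accelerators per node, if configured
#SBATCH --exclusive                  # do not share the nodes with other jobs
#SBATCH --time=02:00:00              # estimated wall-clock time limit

srun ./my_parallel_app input.dat     # launch the parallel application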

Jobs may be monolithic or subdivided into a number of smaller steps or tasks. Typically each task is associated with the launch of a specific application program. In general, individual steps do not have to be identical in terms of resources used or duration of execution. Jobs may also mix parallel application invocations with instantiations of single-threaded processes, dramatically changing the required resource footprint. An example is a job that first preprocesses the input data, copying it to storage local to its execution nodes, then launches the application proper, which now enjoys high-bandwidth access to the data, and finally copies the output files to shared storage using shell commands.

Pending computing jobs are stored in job queues. The job queue defines the order in which jobs are selected by the resource manager for execution. As the computer science definition of the word suggests, in most cases it is “first in, first out” or “FIFO”, although good job schedulers will relax this scheme to boost machine utilization, improve response time, or otherwise optimize some aspect of the system as indicated by the operator (user or system administrator). Most systems typically use multiple job queues, each with a specific purpose and set of scheduling constraints. Thus one may find an interactive queue solely for interactive jobs. Similarly, a debug queue may be employed that permits jobs to run in a restricted parallel environment that is big enough to expose problems when running on multiple nodes using the same configuration as the production queue, yet small enough that the pool of nodes for production jobs may remain substantially larger. Frequently there are multiple production queues available, each with a different maximum execution time imposed on jobs or total job size (short versus long, large versus small, etc.). With hundreds to thousands of jobs with different properties pending in all queues of a typical large system, it is easy to see why scheduling algorithms are critical to achieving high job throughput. Common parameters that affect job scheduling include the following.

Availability of execution and auxiliary resources is the primary factor that determines when a job can be launched.

Priority permits more privileged jobs to execute sooner or even preempt currently running jobs of lower priority.

Resources allocated to the user determine the long-term resource pool a specific user may consume while his or her account on the machine remains active.

Maximum number of jobs that a user is permitted to execute simultaneously.

Requested execution time estimated by the user for the job.

Elapsed execution time may cause forced job termination or impact staging of pending jobs for upcoming execution.

Job dependencies determine the launching order of multiple related jobs, especially in producer–consumer scenarios.

Event occurrence, when the job start is postponed until a specific predefined event occurs.

Operator availability impacts the launch of interactive applications.

Software license availability if a job is requesting the launch of proprietary code.
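
As an illustration of the job-dependency parameter from the list above (again assuming a SLURM-style interface; the script names are placeholders), a consumer job can be held back until its producer job completes successfully:

# Submit the producer job and capture its job ID
jobid=$(sbatch --parsable preprocess.sh)
# The consumer job becomes eligible to start only after the producer finishes successfully
sbatch --dependency=afterok:$jobid analyze.sh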

Resource managers are equipped with optimized mechanisms that enable efficient launching of thousands or more processes across a comparable number of nodes. Naïve approaches, such as repeated invocation of a remote shell, will not yield acceptable results at scale due to high contention when transferring multiple programs' executables to the target nodes. Job launchers employ hierarchical mechanisms to alleviate the bandwidth requirements and exploit network topology to minimize the amount of data transferred and overall launch time. Resource managers must be able to terminate any job that exceeds its execution time or other resource limits, irrespective of its current processing status. Again, distributed termination should be efficient to release the allocated nodes to the pool of available nodes as quickly as possible. Finally, resource managers are responsible for monitoring application execution and keeping track of related resource usage. The actual resource utilization data is recorded to enable accounting and accurate charging of users for their cumulative system resource usage.
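
The user-facing commands for these functions differ between resource managers. Assuming SLURM as a representative example (the job ID below is a placeholder), monitoring, termination, and accounting queries look roughly like this:

squeue -u $USER                                       # monitor the state of your pending and running jobs
scancel 123456                                        # terminate a job, releasing its nodes back to the available pool
sacct -j 123456 --format=JobID,Elapsed,NNodes,State   # inspect the accounting record for the job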

A number of resource management suites have been created that differ in their features, capabilities, and adoption level. The software commonly used today includes the following.

Simple Linux Utility for Resource Management (SLURM) [2] is a widely used free open-source package.

Portable Batch System (PBS) [3] originated as proprietary code, but several open implementations with compatible application programming interfaces and commands are also available.

OpenLava [4] is an open-source scheduler based on the Platform Load Sharing Facility and originally developed at the University of Toronto.

Moab Cluster Suite [5], based on the open-source Maui Cluster Scheduler, is a highly scalable proprietary resource manager developed by Adaptive Computing Inc.

LoadLeveler [6], currently known as the Tivoli Workload Scheduler LoadLeveler, is a proprietary IBM product originally targeting systems running the AIX operating system (OS) but later ported to POWER and x86-based Linux platforms.

Univa Grid Engine [7] uses technology originally developed by Sun Microsystems and Oracle that supports multiple platforms and OSs.

HTCondor [8], formerly known just as Condor, is an open-source framework for coarse-grain high-throughput computing.

OAR [9] provides database-centered resource and task management for HPC clusters and some classes of distributed systems.

Hadoop Yet Another Resource Negotiator (YARN) [10] is a broadly deployed scheduler specifically tailored to MapReduce applications, discussed in detail in Chapter 19.

Unfortunately there is no common standard specifying the command format, language, and configuration of resource management. Every system mentioned above uses its own interface and supports a different set of capabilities, although the basic functionality is essentially similar. Two resource managers with particularly broad adoption in the HPC community, SLURM and PBS, are therefore described here in detail. These sections are presented in tutorial form to build the reader's skill set.


URL: https://www.sciencedirect.com/science/article/pii/B9780124201583000058

Distributed Systems

Richard John Anthony, in Systems Programming, 2016

6.11.5.1 Representational State Transfer (REST)

REST is a set of guidelines that are designed to ensure high quality in applications such as Web services, in terms of simplicity, performance, and scalability. REST-compliant (or RESTful) Web services must have a client-server architecture and use a stateless communication protocol such as HTTP.

The design of RESTful applications should respect the following four design principles:

1. Resource identification through URI: Resources that are accessible via a RESTful Web service should be identified by URIs. URIs provide a global, Web-wide address space for resources.

2. Uniform interface: A fixed set of four operations (PUT, GET, POST, and DELETE) is provided to create, read, update, and delete resources, respectively. This restriction ensures clean, uncluttered, and universally understood interfaces.

3. Self-descriptive messages: Resources need to be represented in various formats, depending on the way they are to be manipulated and how their content is to be accessed. This requires that the representation of a resource in a message is decoupled from the actual resource itself, and that request and response messages identify the resource itself and either the operation to be performed or the resulting value of the resource after the operation, respectively.

4. Stateful interactions through hyperlinks: The Web service itself, and thus each server-side interaction with a resource, should be stateless. This requires that request messages be self-contained (i.e., each request must contain sufficient information to contextualize it so that it can be satisfied by the service without any additional server-side state concerning the client or its specific request).
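
As a minimal sketch of the uniform interface described above (the host name and resource paths are hypothetical, and the operation-to-verb mapping follows the list above), the four operations translate into HTTP requests such as:

# Create a resource at a known URI
curl -X PUT    -d '{"name":"alice"}'       http://api.example.com/users/1
# Read the resource
curl -X GET    http://api.example.com/users/1
# Update the resource
curl -X POST   -d '{"name":"alice smith"}' http://api.example.com/users/1
# Delete the resource
curl -X DELETE http://api.example.com/users/1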


URL: https://www.sciencedirect.com/science/article/pii/B9780128007297000066

Queued Transaction Processing

Philip A. Bernstein, Eric Newcomer, in Principles of Transaction Processing (Second Edition), 2009

Broker-Based Architecture

In a broker-based architecture a message server provides a bridge between the heterogeneous applications (see Figure 4.10). Instead of communicating directly with the applications, a client communicates with the broker, which forwards the message to the desired application. The client can be one of the applications being integrated or an external program such as an end-user device.


Figure 4.10. Broker-based Application Integration. The Message Broker mediates message transfer from clients to TP applications.

The broker provides three functions, which correspond to the three differences to be reconciled. First, it supports all the communication protocols required to communicate with the applications. A client sends a message to the broker using any of the supported protocols. The broker can forward that message to the desired application using the protocol supported by that application.

Second, the broker supports the union of all the functions offered by the applications being integrated. Usually, the broker offers a uniform interface to these functions, such as a canonical message format defined by the broker. Thus, a client can call these functions using that uniform interface, independent of the message protocol, programming language, or other technologies used by the application that implements the function. Internally the broker stores a mapping that tells it how to translate each function into the form required by the application that implements the function. This mapping often is implemented as a set of protocol adaptors, one for each of the application environments being integrated. Some brokers can also support clients that use their own protocols and formats and don’t enforce the use of a single uniform interface.

Third, it offers tools for translating between different parameter and message formats. The translation may be based on a calculation (such as translating between date formats), a table (such as translating between country codes), or a lookup from an external source (such as an exchange rate server to translate a money field between currencies). Some applications import or export structured documents (e.g., in XML), rather than individual parameters. In this case document translation is used, such as an XSLT program that translates one XML document into another XML document having a different format.
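
As a small, hedged illustration of document translation (the stylesheet and message files named here are placeholders), an XSLT program can be applied with the standard xsltproc tool to map one XML message format onto another:

# Translate a partner's XML order message into the broker's canonical format
xsltproc partner-to-canonical.xsl partner_order.xml > canonical_order.xml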

Some brokers also offer routing functions. A message may be routed based on the contents of a request or by requirements that are set by the client or the server. Other broker functions include logging, auditing, performance monitors, and other system management functions.


URL: https://www.sciencedirect.com/science/article/pii/B9781558606234000044

Modeling RESTful Web of Things Services

Christian Prehofer, Ilias Gerostathopoulos, in Managing the Web of Things, 2017

3.2.1 RESTful Design

The REST architectural style is aligned with the concepts used in the HTTP protocol; the work by Roy Fielding has shaped the concepts of RESTful design [15]. Following [37], the main ingredients of RESTful design are as follows:

Identification of resources via Uniform Resource Identifiers (URI). These are hierarchically structured and each resource must have at least one URI.

Uniform interfaces to read and manipulate the resources. These are the four basic HTTP operations GET, POST, PUT and DELETE. Other operations, e.g. HEAD and OPTIONS, deal with metadata.

Self-descriptive messages. Representation of the resources can be accessed in different formats, e.g. HTML, JSON or XML. The messages, both requests and replies, contain the complete context and are self-descriptive in this sense.

Stateless interactions, i.e. the server does not maintain session state on the interactions with the clients. This means that all information to fulfill a request is included in the HTTP request, i.e. the resource name and message.
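
To make the self-descriptive-message and multiple-representation points above concrete, a hedged sketch using curl (the host and resource path are placeholders) retrieves the same resource in different formats via HTTP content negotiation:

# Request a JSON representation of the resource
curl -H "Accept: application/json" http://example.org/sensors/temp1
# Request an XML representation of the same resource
curl -H "Accept: application/xml"  http://example.org/sensors/temp1
# HEAD returns only the metadata (headers) describing the representation
curl -I http://example.org/sensors/temp1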


URL: https://www.sciencedirect.com/science/article/pii/B9780128097649000044

RESTful IoT Authentication Protocols

H.V. Nguyen, L. Lo Iacono, in Mobile Security and Privacy, 2017

2 REST Foundations

Roy Fielding introduced the architectural style REST (Fielding, 2000) in his doctoral dissertation. The basic idea behind this concept is to provide a guideline proposing architectural constraints for designing highly scalable distributed software systems. These constraints are illustrated in Fig. 1.


Fig. 1. REST constraints and principles (Gorski et al., 2014a).

The communication in REST is based on the client-server and request-response model. Therefore it is always the client who initiates the communication by issuing a request addressing a resource from a server. In the context of REST, a resource is an abstract definition of information intended for human interpretation or machine processing. Thus, a resource can have multiple representations. Moreover, a resource must be addressable by a unique resource identifier. Hence, each request must include a resource identifier. In conjunction with the requested action, both data elements define the intention and destination of a request. The resource identifier syntax and the request actions must be standardized and predefined by the uniform interface so that all components in a REST architecture can understand the purpose of a request. Fielding does not specify any concrete actions for REST-based systems; the definition of a fixed set of actions is rather a matter of the implementation of the uniform interface. REST-based systems mostly use actions to create, read, update and delete a resource. Depending on the action, a request can comprise a resource representation such as that for creating or updating. In addition to the resource identifier syntax and request action, the uniform interface also defines a fixed set of further metadata elements describing, for example, the size and the media type of a resource representation. Since REST messages are constrained to be stateless and cacheable, metadata can also define state information such as authentication or session data and caching information. As requests in REST contain all required data elements including the action, the resource identifier, state and cache information, and further metadata, their semantics are self-descriptive for each server. This means that every server can understand the intention of a request without maintaining any particular state and without knowing the client in advance, since all requests are self-descriptive and all data elements are standardized.

The stateless and self-descriptive nature of REST messages makes them well suited for intermediate processing. Thus in many cases the communication flow in REST-based systems is layered through multiple intermediate systems to ensure efficiency and scalability. For instance, intermediaries are utilized to cache messages, saving a server from replicated processing with the aim of reducing communication latency. A load balancer is another prevalent intermediate component, distributing workloads across multiple servers in order to provide scalability. Further intermediaries can be, for example, security gateways performing authentication and access control, or cross-protocol proxies encapsulating legacy or other related service systems.

Once a request reaches a server, the endpoint returns a response that includes a response meaning informing the client about the result of the request. As with requests, a response can contain further metadata, such as authentication or caching information, and a resource representation accompanied by resource representation metadata. Moreover, the metadata and resource representation of a REST response may contain hypermedia elements defining application control information; that is, descriptions of actions that can be applied to resource identifiers embedded in the metadata and resource representation.

The metadata and resource representation of the returned response trigger a state change inside the client. Based on the hypermedia information within the response, the client can choose its next desired request, or state change, and repeat the described cycle. This kind of application control concept is called hypermedia as the engine of application state, one of the key interface constraints of REST.
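
As a hedged sketch of this hypermedia-driven control flow (every URI and field name below is hypothetical), a client might issue a request and then choose its next state change from the links advertised in the response:

# Retrieve a resource whose representation embeds hypermedia links
curl -s -H "Accept: application/json" https://api.example.com/orders/42
# Illustrative response body:
#   { "id": 42, "status": "processing",
#     "links": [ { "rel": "cancel", "href": "/orders/42/cancel", "method": "POST" } ] }
# The client follows an advertised link to trigger the next state change
curl -s -X POST https://api.example.com/orders/42/cancel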

All of these aforementioned constraints and principles describe a RESTful architecture that promotes scalability, generality of interfaces, and independent deployment of components, as well as reduces latency, enforces security, and encapsulates legacy and related systems.

Hypertext Transfer Protocol (HTTP) (Fielding and Reschke, 2014) is one protocol that conforms to the REST constraints and principles, as it is based on the client-server and request-response model. Moreover, it specifies a set of request actions (i.e., HTTP methods) and a set of further metadata such as header fields and status codes. Resources in HTTP can be addressed by a standardized resource identifier syntax, namely the URI (Berners-Lee et al., 2005) syntax. Also, HTTP messages can include a resource representation such as JSON (Crockford, 2006), HTML (Hickson et al., 2014), or XML (Bray et al., 2008). The metadata and resource representation may contain descriptions of hypermedia relationships (i.e., links or resource identifiers) that describe the next possible state changes or requests for the client. Additionally, HTTP messages are stateless and cacheable, so they can be processed in intermediate systems, such as proxies, cache servers, or load balancers, without saving any contextual information. HTTP was originally invented as the technical foundation of the web, the world's largest distributed system.


URL: https://www.sciencedirect.com/science/article/pii/B9780128046296000109

Cloud Computing Infrastructure for Data Intensive Applications

Yuri Demchenko, ... Charles Loomis, in Big Data Analytics for Sensor-Network Collected Intelligence, 2017

8.3 SlipStream: Cloud Application Management Platform

SlipStream is an open source cloud application management platform that provides software developers and service operators with the necessary functionality to manage the complete lifecycle of their cloud applications. Through its plugin architecture, SlipStream supports most major cloud service providers (CSPs) and the primary open source cloud distributions. By exposing a uniform interface that hides differences between cloud providers, SlipStream facilitates application portability across the supported cloud infrastructures.

To take advantage of cloud portability, developers define “recipes” that transform preexisting “base” VMs into the components that they need for their application. By reusing these base VMs, developers can ensure uniform behavior of their application components across clouds without having to deal with the time-consuming and error-prone transformation of VM images. Developers bundle the defined components into complete cloud applications using SlipStream facilities for passing information between components and for coordinating the configuration of services.

Once a cloud application has been defined, the operator can deploy the application in “one click,” providing values for any defined parameters and choosing the cloud infrastructure to use. With SlipStream, operators may choose to deploy the components of an application in multiple clouds, for example, to provide geographic redundancy or to minimize latencies for clients. In order to respond to changes in load, operators may adjust the resources allocated to a running application by scaling the application horizontally (changing the number of VMs) or vertically (changing the resources of a VM).

SlipStream combines its deployment engine with an “App Store” for sharing application definitions with other users and a “Service Catalog” for finding appropriate cloud service offers, providing a complete engineering PaaS supporting DevOps processes. All of the features are available through its web interface or RESTful API.

8.3.1 Functionality used for applications deployment

The bioinformatics use cases described above are implemented using SlipStream's facilities and tools for defining applications, together with its deployment engine accessed through the RESTful API [44].

The definition of an application component actually consists of a series of recipes that are executed at various stages in the lifecycle of the application. The main recipes, in order, are as follows:

Preinstall: Used principally to configure and initialize the OS package management.

Install packages: A list of packages to be installed on the machine. SlipStream supports the package managers for the RedHat and Debian families of OS.

Postinstall: Can be used for any software installation that cannot be handled through the package manager.

Deployment: Used for service configuration and initialization. This script can take advantage of SlipStream's “parameter database” to pass information between components and to synchronize the configuration of the components.

Reporting: Collects files (typically log files) that should be collected at the end of the deployment and made available through SlipStream.

There are also a number of recipes that can be defined to support horizontal and vertical scaling, but they are not used in the use cases defined here. The applications are defined using SlipStream's web interface; the bioinformatics portal then triggers the deployment of these applications using the SlipStream RESTful API.

8.3.2 Example recipes

The application for the bacterial genomics analysis consisted of a compute cluster based on Sun Grid Engine with a Network File System (NFS) exported from the master node of the cluster to all of the slave nodes. The master node definition was combined into a single “deployment” script that performed the following actions:

1. Initialize the yum package manager.

2. Install bind utilities.

3. Allow SSH access to the master from the slaves.

4. Collect IP addresses for the batch system.

5. Configure the batch system admin user.

6. Export NFS file systems to the slaves.

7. Configure the batch system.

8. Indicate that the cluster is ready for use.

The deployment script extensively uses the parameter database that SlipStream maintains for each application to correctly configure the master and slaves within the cluster. A common pattern is the following:

ss-display "Exporting SGE_ROOT_DIR…"
echo -ne "$SGE_ROOT_DIR\t" > $EXPORTS_FILE
for ((i=1; i<=`ss-get Bacterial_Genomics_Slave:multiplicity`; i++ ));
do
  node_host=`ss-get Bacterial_Genomics_Slave.$i:hostname`
  echo -ne $node_host >> $EXPORTS_FILE
  echo -ne "(rw,sync,no_root_squash) " >> $EXPORTS_FILE
done
echo "\n" >> $EXPORTS_FILE # last for a newline
exportfs -av


URL: https://www.sciencedirect.com/science/article/pii/B9780128093931000027

Testing of Intelligent Vehicles Using Virtual Environments and Staged Scenarios

Arda Kurt, ... Ümit Özgüner, in Advances in Intelligent Vehicles, 2014

2.2.1 Simulation

The initial stages of the overall testing procedure take place solely in the virtual world, as high-fidelity computer simulations are very useful in detecting and correcting initial design and implementation errors rapidly.

Although there are many simulators available for ITS simulations, depending on the specific issues studied, the examples in the remainder of this chapter will focus on the Gazebo [7] and Stage [8] simulators used at CITR. These two simulation environments use the Player [8] interface to communicate with and control the simulated robots. The Player interface is also available for actual robots, so control and sensing software can be written against a uniform interface, without knowledge of the robot's specific hardware or of whether the robot is real or simulated.

The Stage simulator provides a lightweight environment that focuses on the interaction of a vast number of autonomous agents. These vehicles navigate on the stacked two-dimensional planes that create Stage’s 2.5-dimensional simulation region. This setup constitutes the first process in our proposed procedure.

One particular vehicular autonomy study [9], which will be visited throughout the narrative in the following sections, is based on a collaborative convoying scenario, where a three-vehicle convoy consisting of two autonomous vehicles led by a manual vehicle continuously circles the figure-8 route shown in the diagram in Figure 2.3. They obey the traffic light, accelerating and decelerating simultaneously to maintain set inter-vehicle distances and to allow the convoy to clear the intersection as smoothly and rapidly as possible. They also obey the stop sign and intersection precedence rules, determining the presence of a manually driven fourth vehicle using information obtained from its DSRC transmissions.


Figure 2.3. Schematic Sketch of the Collaborative Experiment Layout.

This scenario, seen in Figure 2.4 as the initial Stage simulation, shows four vehicles interacting with one another at an intersection.


Figure 2.4. Stage Simulation of Four-Robot Experiment Interacting Through an Intersection.

In contrast, Gazebo is a fully three-dimensional simulator with higher-fidelity physics, capturing the roll and tilt of simulated vehicles, and fully 3D terrain profiles, neither of which is available in Stage. The same Player interface is provided for the robots simulated in Gazebo.

An example simulation for a fully autonomous vehicle coming to a stop at an intersection and detecting the other vehicles with an LIDAR can be seen in Figure 2.5. This particular simulation example was based on the DARPA Urban Challenge 2007 [10] autonomous vehicle competition.


Figure 2.5. Simulation of an Intersection Scenario with Simulated LIDAR Rays Visible.

© 2008 IEEE. Reprinted with permission from “Simulation and testing environments for the DARPA Urban Challenge”, IEEE International Conference on Vehicular Electronics and Safety, 2008. ICVES 2008, 22–24 September 2008, pp. 222–226.


URL: https://www.sciencedirect.com/science/article/pii/B9780123971999000021

Literature Review

Dong Yuan, ... Jinjun Chen, in Computation and Storage in the Cloud, 2013

2.1.3 Data Management in Other Distributed Systems

Many technologies are utilised for computation and data-intensive scientific applications in distributed environments and have their own specialties. They could be important references for our work. A brief overview is shown below [78]:

Distributed database (DDB) [68]. A DDB is a logically organised collection of data stored at different sites on a computer network. Each site has a degree of autonomy, which is capable of executing a local application, and also participates in the execution of a global application. A DDB can be formed either by taking an existing single site database and splitting it over different sites (top-down approach) or by federating existing database management systems so that they can be accessed through a uniform interface (bottom-up approach). However, DDBs are mainly designed for storing the structured data, which is not suitable for managing large generated data sets (e.g. raw data saved in files) in scientific applications.

Content delivery network (CDN) [38]. A CDN consists of a ‘collection of (non-origin) servers that attempt to offload work from origin servers by delivering content on their behalf’. That is, within a CDN, client requests are satisfied by other servers distributed around the Internet (also called edge servers) that cache the content originally stored at the source (origin) server. The primary aims of a CDN are, therefore, load balancing to reduce effects of sudden surges in requests, bandwidth conservation for objects such as media clips and reducing the round-trip time to serve the content to the client. However, CDNs have not gained wide acceptance for data distribution because of the restricted model that they follow.

P2P Network [66]. The primary aims of a P2P network are to ensure scalability and reliability by removing the centralised authority and also to ensure redundancy, to share resources and to ensure anonymity. Such networks have mainly focused on creating efficient strategies to locate particular files within a group of peers, to provide reliable transfers of such files in the face of high volatility and to manage high load caused by the demand for highly popular files. Currently, major P2P content-sharing networks do not provide an integrated computation and data distribution environment.


URL: https://www.sciencedirect.com/science/article/pii/B9780124077676000020

GPGPU: General-Purpose Computing on the GPU

Ying Tan, in Gpu-Based Parallel Implementation of Swarm Intelligence Algorithms, 2016

2.4 Open Computing Language (OpenCL)

OpenCL is an open, royalty-free specification aimed at heterogeneous programming platforms. OpenCL is portable and supported by diverse hardware: GPUs (both NVIDIA and AMD), CPUs, accelerators (e.g., Intel Xeon Phi), FPGAs, and so forth [199]. As smartphones, tablets, and wearable devices become increasingly popular, there are good prospects for OpenCL on these novel embedded compute devices [67, 114].

Both rooted in the Brook proof-of-concept platform [21], OpenCL and CUDA share very similar platform, execution, memory, and programming models. Most concepts in the two platforms can be mapped onto one another [90].

2.4.1 Device Hierarchy

OpenCL defines a hierarchical heterogeneous parallel model.

As illustrated in Fig. 2.10, the system defined by OpenCL includes a host and multiple devices. Each device contains several compute units (CUs). A CU can be a compute unit in an AMD GPU or a streaming multiprocessor (SM) in an NVIDIA GPU; it can also be a CPU core or part of another computing device such as a DSP or FPGA. Each CU is in turn made up of several processing elements (PEs).


Fig. 2.10. OpenCL Device Model

OpenCL also defines a memory hierarchy. Global and constant memory can be accessed globally, while local memory and private memory can only be accessed locally.

Table 2.1 lists the mapping of OpenCL’s memory types to CUDA’s memory types.

Table 2.1. Mapping Between OpenCL and CUDA Memory Types

OpenCL Memory Type    CUDA Memory Type
Global                Global
Local                 Shared
Private               Local
Constant              Constant

2.4.2 Data Parallel Mode

OpenCL defines the execution model for programs. Fig. 2.11 demonstrates the data-parallel model in OpenCL. Execution units are organized hierarchically: work items make up a work group, and work groups constitute an index space called an NDRange. The mapping of concepts between OpenCL and CUDA is listed in Table 2.2.


Fig. 2.11. OpenCL Abstract Parallel Model

Table 2.2. The Mapping Between OpenCL and CUDA’s Parallel Model

OpenCL        CUDA
Kernel        Kernel
Host          Host
Work item     Thread
Work group    Block

Like CUDA threads and blocks, work items and work groups are identified by indices.

2.4.2.1 Device Management and Kernel Launch

Kernels in OpenCL are largely identical to those in CUDA.

When a kernel is launched, each work item executes the kernel's instructions concurrently. As OpenCL is hardware independent, launching a kernel is somewhat more complex than in CUDA, where this work is carried out by the CUDA driver and runtime.

OpenCL requires a context to manage the devices, with each device associated with a context (see Fig. 2.12). The context provides a uniform interface to the host. When the host launches a kernel, the kernel is placed into a command queue for execution. When a kernel finishes executing, the next kernel in the queue is fetched and executed. A more detailed description can be found in [63, 123].


Fig. 2.12. OpenCL Manage Kernel Execution Through Command Queue and Context

2.4.3 Libraries

Many libraries are available for OpenCL.

2.4.3.1 Bolt

Bolt is an STL-compatible library of high-level constructs for creating accelerated data-parallel applications. Code written using STL or other STL-compatible libraries (e.g., TBB) can be converted to Bolt in minutes. With Bolt, the kernel code to be accelerated is written inline in the C++ source file. No OpenCL API calls are required, since all initialization and communication with the OpenCL or C++ AMP device is handled by the library. Bolt includes common compute-optimized routines such as sort, scan, transform, and reduce operations.

2.4.3.2 Math Lib

clMath is the open-source project for OpenCL-based BLAS and FFT libraries. clMath provides the complete set of BLAS level 1, 2, and 3 routines, and FFTs with 1D, 2D, and 3D support for both real and complex values.

Based on clMath, MAGMA provides high-level matrix operations, such as LU, QR, and Cholesky decompositions and eigenvalue calculations.

2.4.4 Profiling and Analysis Tools

There are many excellent tools for debugging and optimizing OpenCL programs. CodeXL, distributed with the AMD APP SDK, contains a set of software tools to help developers make full use of the GPU. CodeXL includes a powerful debugger, which is capable of monitoring both CPU and GPU code and analyzing kernels dynamically. CodeXL can be integrated into Visual Studio, or it can be used alone.


URL: https://www.sciencedirect.com/science/article/pii/B9780128093627500029

Human Resource Information Systems

Michael Bedell, in Encyclopedia of Information Systems, 2003

IV An Overview of the Human Resource Information System

The information systems that have historically been used by HR functions were designed as administrative compensation and benefits systems. These systems were designed to track employee benefit choices and related costs. Some of these systems also handled the data entry for payroll since a portion of the benefit cost would be deducted from the employee's paycheck. These systems were mainframe based and came in nonrelational database form, flat files, spreadsheets, or proprietary databases. Access was usually through simple “dumb” terminals or terminal emulators. Major limitations of these systems were that (1) they had limited reporting capabilities; (2) they only tracked employee compensation and benefits history; (3) the database could not easily be modified to track additional information; (4) the database was limited to the information within and could not be integrated with other data sources; (5) they often did not have a uniform interface, thus requiring specialized knowledge just to access the data; and (6) information retrieval could only be performed by employees with database expertise.

The biggest liability of these systems was their inability to keep pace with the increasing need for information within the HR function. As HR evolved from a purely administrative role and took on a competitive and strategic role, additional HR information requirements, such as competency management, developed. The inability to modify the database compounded the information requirement issue and led organizations to develop supplemental databases to store the information that was needed to be competitive. The problem with using multiple databases was, at a minimum, twofold. First, there was the obvious information consistency issue. Second, electronic methods of integrating data were nonexistent. Data integration was completed by querying each database separately and then combining the data by hand.

The modern HRIS differs from older systems in that these systems were designed to meet the needs of the entire HR function and not just the compensation and benefits department. New technology in database systems enabled information generated by the HR function about each employee, applicant, or position to be tracked and, more importantly, integrated with other data for decision-making purposes.

Take, for example, the HR generalist trying to determine which employee to transfer into a marketing job. The generalist will use the performance appraisal table, the training and development table, and the employee interest data table to develop a short list of employees that are eligible for the position. Rather than performing a query on each table and then manually combining the data to determine which employee to transfer, the generalist can use key employee data (e.g., employee number) to link tables and query all of the relevant information at once.
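
A hedged sketch of that kind of integrated query (the database file, table names, and column names are purely illustrative) might link the tables on the shared employee number like this:

# Combine appraisal, training, and interest data in a single query keyed on employee number
sqlite3 hr.db "
  SELECT e.employee_id, e.name, a.rating, t.course, i.interest_area
  FROM employees e
  JOIN performance_appraisal a ON a.employee_id = e.employee_id
  JOIN training_development  t ON t.employee_id = e.employee_id
  JOIN employee_interests    i ON i.employee_id = e.employee_id
  WHERE i.interest_area = 'Marketing';"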

When these systems are integrated with Internet/intranet technology, HR data becomes available to HR staff anywhere in the world. As HR information needs grow, the modern HRIS is easily modified and more flexible than previous generations of HRISs.


URL: https://www.sciencedirect.com/science/article/pii/B0122272404000861

What provides the interface between the application program and the DBMS?

The database management system acts as the interface between the application programs and the data.

What is database and database management system?

A database typically requires a comprehensive database software program known as a database management system (DBMS). A DBMS serves as an interface between the database and its end users or programs, allowing users to retrieve, update, and manage how the information is organized and optimized.

What are programs that help a database to be maintained by creating editing and deleting data records and files?

A database management system (DBMS) is system software for creating and managing databases. A DBMS makes it possible for end users to create, protect, read, update and delete data in a database.

Is defined to be a portion of a row used to uniquely identify a row in a table?

A key is an attribute or set of attributes that helps us uniquely identify the rows of a table. It also helps in establishing relationships among tables.