| SRB Glossary
SRB can seem complex. In order to help those new to the industry or Nirvana's products, we've provided a comprehensive glossary of terms.
For an overview of SRB and to put these terms into context, please peruse the Nirvana whitepapers. Please contact us for further explanation of concepts or terminology.
A
Active Directory - Active Directory (AD) is a Microsoft technology to unify the management of an organization's users, departments, and IT resources. As of SRB 2007, the SRB Windows Gateway is integrated with AD so that AD users can be passed through into SRB upon authentication.
ACL - An Access Control List (ACL) is an object that tells an SRB Server which access rights each SRB User, Group, or Domain has to a particular system object, such as a Collection or an individual Data Object. ACLs also exist for Resources, Users, Groups, Domains, Locations, and Metadata attributes.
AD - see Active Directory.
Agent - see SRB Agent.
API - The SRB Application Program Interface (API - also referred to as Application Programming Interface or Advanced Programming Interface) is the specific method prescribed by the SRB System by which a programmer writing an SRB Client can make requests of the SRB System. The SRB API can be contrasted with SRB's graphical user interface, command line interfaces (Scommands, Acommands, Mcommands), or Gateways as an interface to the SRB System.
Audit Trail - A record containing detailed information on all SRB transactions involving Data Objects, Collections, Users, and Resources. Information recorded includes timestamp, user, action, result, and comments.
Authentication - The process of identifying an individual, usually based on a user name and password. In security systems, authentication is distinct from authorization, which is the process of giving individuals access to system objects based on their identity. Authentication merely ensures that individuals are who they claim to be, but says nothing about the access rights of the individual. See GSI and Kerberos.
B
BLOB - A BLOB (binary large object) is an unstructured file, typically an image or sound file, that is stored in a relational database table potentially along with columns of other data types. A BLOB on Centera is also used for storing unstructured data. The BLOBs in Centera are not part of a relational database table and are referenced by their Content Address (see CAS).
C
CAS - A Content Addressed Storage system is usually used to store fixed-content data. CAS systems (like Centera) make it transparent to the application where data is physically stored and do not natively provide applications with a file system view of the data. Data is referenced by a unique content address (typically a 27 or 53 character string) which returns the object to the application. Centera features additional services like self-healing and self-managing capabilities so that data is always protected from corruption.
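The idea of referencing data by content rather than by location can be sketched as follows. This is only an illustration and not Centera's interface: the `ContentStore` class is hypothetical, and a SHA-256 hex digest stands in for the content address (real CAS systems use their own address formats).

```python
import hashlib


class ContentStore:
    """Toy content-addressed store (hypothetical; not the Centera API)."""

    def __init__(self):
        self._blobs = {}  # content address -> bytes

    def put(self, data: bytes) -> str:
        # Assumption: the address is the SHA-256 hex digest of the content.
        address = hashlib.sha256(data).hexdigest()
        self._blobs[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blobs[address]
        # Re-hashing on read detects corruption of the stored bytes.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("stored content does not match its address")
        return data


store = ContentStore()
addr = store.put(b"fixed-content record")
```

Because the address is derived from the content, storing the same bytes twice yields the same address, and the application never needs to know where the bytes physically live.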
Collection - An SRB Collection is (much like a folder or directory in a file system) an object that contains other Collections or Data Objects. A Collection is used to organize Data Objects into a logical hierarchy that is easily accessible and understood by both user and administrator. For example, a Collection named "Project X" can be used to store all the Data Objects that are related to project X, independent of where that data is physically stored or how it is physically structured. This logical Collection view is maintained transparently through SRB so that users do not need to keep track of the physical locations of their data. Collections in SRB possess the same attributes as Data Objects.
Container - In order to reduce latencies in accessing and transferring data, SRB uses patented Container technology. Due to the relatively high overhead of creating or opening files in archival storage systems, it is not practical to store large numbers of small files, as typically found in information management systems like SRB, individually. Containers overcome this limitation by packing multiple files into one larger file. The archive then only has to deal with the large file. SRB features caching, staging, and streaming of Containers. SRB has the ability to associate Containers with Collections and hence speed up data migration into Containers and improve their ease of use.
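The packing idea behind Containers can be illustrated with a minimal sketch. It does not reflect SRB's actual Container format; the functions `pack` and `read_member` and the (offset, length) index are assumptions made for illustration.

```python
import io


def pack(files: dict) -> tuple:
    """Pack small files into one byte string plus an index of (offset, length)."""
    buffer = io.BytesIO()
    index = {}
    for name, data in files.items():
        index[name] = (buffer.tell(), len(data))
        buffer.write(data)
    return buffer.getvalue(), index


def read_member(container: bytes, index: dict, name: str) -> bytes:
    """Extract one member by slicing; the other members are not parsed."""
    offset, length = index[name]
    return container[offset:offset + length]


container, index = pack({"a.txt": b"alpha", "b.txt": b"beta", "c.txt": b"gamma"})
```

The archive sees a single 14-byte object, while individual members remain retrievable through the index.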
Client API - see API
Cluster Resource - Cluster Resources enable the efficient management of clustered file systems and tape media that can be accessed through several tape drives (e.g. in a tape library). To any SRB Client a Cluster Resource shows up as a Physical Resource. Behind the scenes, the Cluster Resource might in fact have many Physical Resources attached. When data is read or written on the Cluster Resource, the cluster will iterate through its attached Physical Resources until it finds a working Resource or the right tape in a drive that is not busy. Cluster Resources provide SRB with an elegant automatic fail-over from one server to the next if those servers are "seeing" the same data (see also Resource).
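The fail-over behavior described above might look roughly like the following sketch. All class names are hypothetical and the "busy tape drive" case is reduced to a simple online/offline flag; this is not SRB code.

```python
class ResourceUnavailable(Exception):
    """Raised when a resource cannot serve a request."""


class PhysicalResource:
    def __init__(self, name, data, online=True):
        self.name = name
        self.data = data      # path -> bytes; shared when servers see the same data
        self.online = online

    def read(self, path):
        if not self.online:
            raise ResourceUnavailable(self.name)
        return self.data[path]


class ClusterResource:
    """Looks like one resource to clients; tries each attached member in turn."""

    def __init__(self, members):
        self.members = members

    def read(self, path):
        for member in self.members:
            try:
                return member.read(path)
            except ResourceUnavailable:
                continue  # fail over to the next attached resource
        raise ResourceUnavailable("no attached resource could serve " + path)


shared = {"/projectX/report.doc": b"report"}
cluster = ClusterResource([
    PhysicalResource("server1", shared, online=False),
    PhysicalResource("server2", shared),
])
```

Here `cluster.read("/projectX/report.doc")` is served by `server2` because `server1` is offline; the client only ever addresses the cluster.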
D
DAI - The Database Access Interface (DAI) provides access to tabular data stored in databases, using SRB to ingest data into, and retrieve it from, a database table. The DAI includes mechanisms for tailoring the input and output streams. This is done by associating a Template with the input; the Template is then used to interpret the data in the input file and convert it into SQL statements that insert the data into the database. On the way out, the Template can be used to construct forms and marked-up documents from the tabular data produced by SQL statements. These conversion operations occur on-the-fly inside the SRB and are transparent to the user. A template language called T-language is used for building customizable Templates.
Data Object - Every piece of data managed or accessed through SRB is represented as a Data Object within SRB. Examples of Data Objects are: images, metadata files, databases, spreadsheets, office documents, database queries, URLs, or others. Data Objects can physically reside anywhere within the SRB Federation including file systems, tape drives, tape libraries, relational databases, or archives. A Data Object is not necessarily the same as a file although it can be. Furthermore, Data Objects do not necessarily have a one-to-one relationship to the underlying data. A Data Object can in fact point to several (replicated) pieces of data (one-to-many relationship). Therefore, a Data Object can be viewed as a logical entity whereas a file refers to a physical entity. There are a number of attributes automatically associated with each Data Object in SRB: name, data type, size, physical path, creation timestamp, modification timestamp, last access timestamp, custom attributes through Metadata Schemes, etc.
Data Replication - see Replication
Database Shadow Object - A Database Shadow Object (DSO) is a Data Object that is not actually ingested into an SRB Vault. It is attached to a database resource and can forward SQL queries to this database. Upon displaying the DSO the output from the database can be formatted according to a Template. Like a Data Object, a DSO has a path except that in the case of a DSO the path contains an SQL query.
Directory Shadow Object - A Directory Shadow Object is a Data Object that is not actually ingested into an SRB Vault. It is only registered to point to a directory on an SRB Agent. It can be opened up and browsed like a directory in the Windows Explorer. Directory Shadow Objects can be used for two purposes: (1) bulk registration and later ingestion of files into the SRB system; (2) temporary access to files or directories below the Directory Shadow Object's starting directory.
Directory - A directory is a named group of related files in a file system (e.g., NTFS or ext2) that are separated by the naming convention from other groups of files. Directories are usually organized in a hierarchical structure.
Domain - Every SRB User must be a member of one and only one SRB Domain. Domains can be used to group individuals together that reside on one physical site or office location. It is also useful to group users in an SRB Domain that do not necessarily share the same physical site but rather some common characteristics like Customer or Supplier. That way it is very easy to differentiate such SRB Users when looking at audit trails. SRB Domains are independent from Windows or NFS domains.
F
Federation - An SRB Federation is a usually distributed group of servers (i.e., SRB Agents) whose data is connected into a single Global Namespace. The management, organization, access, discovery of, and collaboration on data in an SRB Federation is greatly simplified. Each SRB Federation must contain at least one MCAT Server.
File Shadow Object - A File Shadow Object is a Data Object that is not actually ingested into an SRB Vault. It is only registered with (or pointed to by) an MCAT without a change in its local file system location. The File Shadow Object must be registered on a shadow resource. The deletion of shadow objects from SRB does not delete the physical file in the file system.
G
Gateway - see SRB Gateway
Global Namespace - The entire logical space comprising all Collections, Sub-Collections, Virtual Collections, Data Objects, and Links is called a Global Namespace, also referred to as a Logical Namespace. By definition, a Global Namespace can span multiple heterogeneous and distributed storage systems and data centers. In its simplest form, the Global Namespace is a hierarchy of Collections. Adding Virtual Collections makes the structure of the Global Namespace more flexible (e.g., by enabling the building of taxonomies). Collection Links create cross-references between previously unrelated Collections, similar to hyperlinks on a web page.
Grid Security Infrastructure (GSI) - The Globus Toolkit uses the GSI to enable secure authentication and communication over an open network. GSI provides a number of useful services for Grids, including mutual authentication and single sign-on. The primary motivations behind the GSI are:
- The need for secure communication (authenticated and perhaps confidential) between elements of a computational Grid.
- The need to support security across organizational boundaries, thus prohibiting a centrally-managed security system.
- The need to support "single sign-on" for users of the Grid, including delegation of credentials for computations that involve multiple Resources and/or sites.
GSI is based on public key encryption, X.509 certificates, and the Secure Sockets Layer (SSL) communication protocol. Extensions to these standards have been added for single sign-on and delegation. The Globus Toolkit implementation of the GSI adheres to the Generic Security Service API (GSS-API), which is a standard API for security systems promoted by the Internet Engineering Task Force (IETF).
Group - Groups in SRB are used to simplify the management of SRB Users and the access control mechanism. An SRB Group is a logical accumulation of SRB Users and can contain any number of Users. The Users in an SRB Group can span multiple SRB Domains. SRB Users can be part of multiple SRB Groups simultaneously. SRB Groups are used primarily for easing the granting of access permissions to Data Objects, Resources, Locations, Metadata, or other SRB Objects.
H
HSM - see Hierarchical Storage Management
Hierarchical Storage Management - Hierarchical Storage Management (HSM) is policy-based management of file backup and archiving in a way that uses storage devices economically and without the user needing to be aware of when files are being retrieved from backup storage media. SRB implements this concept using HSM Daemons (prior to SRB 2006) or in newer releases ILM Daemons.
HSM Daemon - (Deprecated and replaced by ILM Daemon) The HSM Daemon (see Hierarchical Storage Management) is a management daemon that routinely queries the MCAT at user-defined intervals. Policies can be set to migrate data from distributed locations to resources with specified criteria. In addition to migration, other actions can be performed on the data including replication, deletion, backup, or simply reporting.
I
Information Lifecycle Management (ILM) - ILM is a comprehensive approach to managing the flow of an information system's data and associated metadata from creation and initial storage to the time when it becomes obsolete and is deleted. Unlike earlier approaches to data storage management, ILM involves all aspects of dealing with data, starting with user practices, rather than just automating storage procedures, as for example, hierarchical storage management (HSM) does. Also in contrast to older systems, ILM enables more complex criteria for storage management than data age and frequency of access. SRB implements these concepts using its ILM Daemon.
Inheritance - The process of passing on one's properties to a child. Starting with SRB 2007, inheritance is implemented for Access Control Lists (ACLs) to simplify the management of authorization. Further, inheritance of ACLs greatly improves performance as SRB does not have to create and maintain ACLs for every Data Object and Collection.
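ACL inheritance can be sketched as a lookup that walks up the Collection hierarchy. The source does not specify SRB's exact inheritance rule, so the sketch assumes the simplest one: an object without an explicit ACL uses the ACL of its nearest ancestor. The `acls` table and `effective_acl` function are hypothetical.

```python
import posixpath

# Only explicit ACLs are stored; everything else inherits from an ancestor.
acls = {
    "/": {"admin": "all"},
    "/projects/x": {"alice": "write", "bob": "read"},
}


def effective_acl(path: str) -> dict:
    """Return the ACL of `path` or of its nearest ancestor that has one."""
    current = path
    while True:
        if current in acls:
            return acls[current]
        if current == "/":
            return {}  # nothing defined anywhere up the tree
        current = posixpath.dirname(current)
```

Only two ACLs are stored here, yet every object under `/projects/x` resolves to the same rights, which is the maintenance saving the entry describes.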
ILM Daemon - The ILM Daemon (see Information Lifecycle Management) performs a number of actions on Data Objects and Collections using customizable policies. Data Objects or Collections can be backed up, deleted, migrated, replicated, synchronized, or reported on based upon policies given through the Java Admin or the Acommands. Policies can be scheduled to run with very flexible schedules and recurrence patterns.
Ingestion - The process of physically bringing data into an SRB Federation; in contrast to Registration, Ingestion involves the physical transfer of data, whereas Registration only creates a pointer to an existing data structure. Files that are ingested not only get registered within MCAT but also get physically transferred into an SRB Vault.
K
Kerberos - Kerberos is a network authentication protocol. It is designed to provide strong authentication for client/server applications by using secret-key cryptography. A free implementation of this protocol is available from Massachusetts Institute of Technology. The primary motivations behind Kerberos are:
- The Internet is an insecure place where hackers can "sniff" passwords off the network without appropriate encryption.
- Firewalls only secure a network from the outside whereas most attacks are performed from inside the firewall.
- Kerberos is a solution to network security problems. It provides the tools of authentication and strong cryptography over the network to help secure information systems across the entire enterprise.
L
Logical Namespace - see Global Namespace.
Logical Resource - Logical Resources are used to group one or more Physical Resources together, making it transparent to the SRB User where data is physically stored. Logical Resources are useful for several reasons: (a) data can be transparently stored to several underlying Resources simultaneously (Replication) without involving the SRB Users, (b) Resources can be switched out or added behind the scenes without affecting the SRB Users, (c) load can be balanced between several Resources transparently to the SRB Users, and (d) SRB Containers can use them as a mechanism to automatically archive and stage data in the correct archival or cache Resource (see also Resource).
Location - SRB stores data in storage devices that are called Resources. Several of these Resources can be grouped together and reside under a single SRB Location. An SRB Location is uniquely identified by its host name and port number and always resides on a single machine - an SRB Agent. There can be multiple Locations per SRB Agent.
M
Master Scheme - The equivalent of a Metadata Scheme before SRB 2007, a Master Scheme's attributes are what users in SRB get to see, query, and associate with Data Objects or Collections. Master Schemes can contain several View Schemes or Table Schemes as long as the attributes from the View or Table Schemes are only used once in the Master Scheme. Of those View or Table attributes, only the linking attribute/column can be changed within the Master Scheme.
MCAT - The Metadata Catalog (MCAT) is the heart of an SRB Federation. It consists of two components - the MCAT Database and the MCAT Server. All the metadata that SRB Servers need to access is stored in the MCAT. The MCAT stores and provides access to data about SRB Users, Data Objects, Collections, Resources, Locations, and other objects. Furthermore, the MCAT contains ACLs, metadata, and token mappings.
MCAT Database - The actual physical storage location for all the metadata stored in MCAT. The MCAT Database is supported on several relational database systems such as Oracle, Postgres, Microsoft SQL Server, IBM DB2, or Sybase ASE. The MCAT Database is directly connected to the MCAT Server.
MCAT Server - The MCAT Server communicates with the MCAT Database server in an SRB Federation. For performance reasons the MCAT Server can be installed on a separate host machine from the MCAT Database server. The SRB Agents make calls to the MCAT Server to authorize and authenticate client machines, query for SRB Objects (Data Objects, Collections, Resources, Containers, Tickets, etc.), store audit information, and manage Data Objects.
Metadata - Data about data. Metadata in SRB is maintained for Data Objects, Collections, and Resources. An example of a Metadata attribute on a Data Object is an 'author' or a 'department'. SRB allows only Administrators to define the Metadata attributes that are needed for the organization. Metadata is grouped into Metadata Schemes. Every Metadata attribute can have access restrictions for certain SRB Users, Groups, or Domains.
Metadata Daemon - A Metadata Daemon automatically parses files of various types for Metadata, extracts such Metadata, and associates it with SRB Data Objects through Metadata Schemes. The Metadata Daemon accomplishes this with the help of Templates written for the various file types.
Metadata Scheme - A logical grouping of Metadata attributes that eases the administration of attributes and the data entry for end users. Those attributes can then be used to organize and discover data and information. There can be any number of Metadata Schemes in the MCAT and every Data Object or Collection can be associated with one or more Metadata Schemes. Starting with SRB 2007, there were a number of changes related to Metadata Schemes: 1) there are now three types of Metadata Schemes: Master Schemes, View Schemes, and Table Schemes; 2) there can be multiple rows or pages of attributes per Data Object or Collection; 3) attributes can be nullable or not nullable; 4) attributes can be NULL or have a value; 5) attributes can have default values; and 6) attributes can have a constraint to be unique.
P
Persistent Archive - Persistent Archives by definition are designed to persist over a very long period of time. The archivist of such archives needs to be able to prove that the documents and records inside the archive are authentic. Furthermore, one has to be able to prove that they have not been modified.
In order to create a Persistent Archive there have to be many features in place that are managed by a central authority. Those features include auditing, global persistent identifier, vault management, continuous and uninterrupted data migration to the latest storage technology, as well as centralized ACLs. SRB contains all of these features and is therefore able to create and manage Persistent Archives.
Physical Resource - Physical Resources represent abstractions of places where Data Objects are physically stored. Such "places" include file systems on UNIX, Linux, or Windows; relational databases like Oracle, MS SQL Server, or Postgres; tape drives and libraries; Content Addressable Storage (CAS) systems; web or FTP servers etc. (see also Resource).
PKI - Public Key Infrastructure. The set of hardware, software, people, policies and procedures needed to create, manage, store, distribute, and revoke Public Key Certificates based on public-key cryptography.
Proxy Operation - Proxy Operations take place when an SRB Server performs operations on behalf of a remote SRB Server that was instructed by an SRB Client to perform such operations.
R
Replication - Replication is the process of making a replica, or copy, of something. Replication in SRB does not distinguish between the original and the copy. Therefore it is possible to delete the original and continue working with the copy (also called Migration). Replication in SRB serves a number of purposes: Disaster protection and recovery, migration to new storage technologies, and load balancing.
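The relationship between one logical Data Object and its interchangeable replicas can be sketched as follows. The `DataObject` class, resource names, and vault paths are hypothetical and only illustrate the concept.

```python
from dataclasses import dataclass, field


@dataclass
class DataObject:
    """Logical entity; replicas are (resource, physical_path) pairs."""
    logical_path: str
    replicas: list = field(default_factory=list)

    def replicate(self, resource: str, physical_path: str) -> None:
        # No replica is marked as "the original"; all are peers.
        self.replicas.append((resource, physical_path))

    def remove_replica(self, resource: str) -> None:
        self.replicas = [r for r in self.replicas if r[0] != resource]


obj = DataObject("/projects/x/report.doc")
obj.replicate("disk-a", "/vault/a/0001")
obj.replicate("tape-b", "/vault/b/0001")  # replication
obj.remove_replica("disk-a")              # dropping the first copy amounts to migration
```

The logical path never changes, so users keep working with the same name while the data has moved from one Resource to another.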
Registration - In contrast to Ingestion, files that are registered do not have to be physically transferred to another Resource. The registration of a file is equivalent to the creation of a pointer to the local file in MCAT. A Database Shadow Object is an example of a registered object. During registration the SQL Query is stored in MCAT. Furthermore, registered objects can contain Metadata and are actually treated like any other Data Object in SRB. If a Data Object was registered on a shadow resource, the physical file in the local file system cannot be deleted through SRB.
Resource - Every piece of data must be, ultimately, on a physical storage system. In SRB, the mapping between those storage systems and the Data Objects is done using Resources. There are three types of Resources in SRB: Physical Resources, Logical Resources, and Cluster Resources.
S
Security - Refers to techniques for ensuring that data stored in a computer cannot be read or compromised by unauthorized users. Most security measures involve data encryption and passwords. Data encryption is the translation of data into a form that is unintelligible without a deciphering mechanism. A password is a secret word or phrase that gives a user access to a particular program or system.
SRB - Storage Resource Broker - In large organizations, a central challenge is making data - in complex environments - easily accessible. SRB provides data abstraction, ending path name dependency, and radically simplifying data access. In an SRB Federation, data is accessible, secure, persistent, and easily managed.
SRB 2007 offers improved access management, secure authentication, transfer encryption, logging and audit trails, and is scalable for massive archives and data grid projects where data persistence and secure data sharing capabilities are requirements.
SRB Agent - SRB Agents are servers that are part of an SRB Federation. They interface between the Metadata Catalog (MCAT) and the Physical Storage Resources that they are attached to. All the physical data managed by SRB either resides on or is directly or remotely connected to an SRB Agent. The interfacing between the physical storage system and the rest of the SRB Federation is accomplished using drivers that can be added to an SRB Agent on-the-fly.
SRB Gateway - A Gateway is a network point that acts as an entrance to another network. An SRB Gateway is a network point that acts as an entrance to the SRB Global Namespace. The role of SRB Gateways is to translate existing commonly used protocols (such as SMB, CIFS, NFS, FTP, GridFTP, HTTP, HTTPS, or WebDAV) into the SRB protocol and vice versa. SRB Gateways can make the communication with SRB transparent to existing applications and give those applications the impression that they are communicating with a native server (e.g., a Windows server, NFS server, FTP server, or web server). SRB Gateways are therefore extremely easy to use because they do not require any behavior modification on the part of end users or applications. Applications do not need to be re-compiled or modified in any way to work with SRB Gateways. SRB currently has the following Gateways: Windows Gateway, UNIX/Linux Gateway, Grid Gateway, and Internet Gateway.
Storage Resource - see Resource
Sync Daemon - The Sync Daemon keeps local directories and data repositories synchronized with SRB Collections. The Sync Daemon is used for two purposes: Bringing existing directory structures or data repositories into the SRB, and synchronizing a local directory structure or data repository with SRB on an ongoing basis (i.e., if third party applications create, modify, or delete files or directories in a local directory).
T
Table Scheme - A Table Scheme is another type of Metadata Scheme. With the creation of a Table Scheme, SRB automatically creates a new table in the MCAT Database. This table will have columns and column types as specified during the Table Scheme creation. Table Scheme tables are not directly associated with SRB Data Objects or Collections. They can, however, be linked to a Master Scheme through a linking column. The Master Scheme, in turn, has the association with Data Objects and Collections. A similar linkage can be accomplished for external views through the usage of View Schemes.
Template - A Template (or T Language Template) is a text file that contains <TL...> tags which describe how to format data being read from or written to a relational database. A Template allows complete customization of database output and input. This enables administrators to create their own customized reports from database queries. Furthermore, one can parse a local file according to the Template's rules and load any discovered metadata into the database.
Ticket - SRB employs an additional authorization mechanism where data-sharing Tickets can be sent out to internal or external SRB Users. The Ticket then grants controlled access to Data Objects or entire Collections. Additional restrictions such as time limits and limits on the number of accesses can be built into every Ticket.
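A Ticket with a time limit and an access-count limit can be sketched as below. The `Ticket` class and its fields are hypothetical; SRB's real Ticket format and validation rules are not described in this glossary.

```python
from dataclasses import dataclass


@dataclass
class Ticket:
    target: str        # Data Object or Collection the Ticket grants access to
    expires_at: float  # expiry time, in seconds since the epoch
    uses_left: int     # remaining number of permitted accesses

    def redeem(self, path: str, now: float) -> bool:
        """Grant access once if `path` is in scope and the Ticket is still valid."""
        prefix = self.target.rstrip("/") + "/"
        in_scope = path == self.target or path.startswith(prefix)
        if not in_scope or now >= self.expires_at or self.uses_left <= 0:
            return False
        self.uses_left -= 1
        return True


ticket = Ticket(target="/projects/x", expires_at=1000.0, uses_left=2)
```

Passing `now` explicitly keeps the check deterministic; a real system would read the clock itself.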
Token - Tokens in SRB provide an extensible mechanism for various SRB Objects. Examples of such objects are data types, errors, icon types, and user types.
U
User - The MCAT contains its own authentication and authorization services and therefore manages its own directory of SRB Users. SRB Users can be created, managed and associated with the ACLs stored in MCAT. SRB Users are uniquely identified by their user name and SRB Domain. SRB Users can be grouped into SRB Groups.
V
Vault - A Vault (or SRB Vault) is a space on the associated Resource where SRB ingests its Data Objects. The Vault is a protected space that can only be accessed by the SRB System, the local system user running the SRB Servers, and the local root/administrator user. This way modification or deletion of Data Objects can be controlled and should only happen through SRB. The structure of a Vault can initially be determined by the administrators when they create a Physical Resource.
Vdisk - A native Windows virtual drive interface into the SRB Global Namespace. This enables fast access into SRB from any Windows desktop, making SRB look and work like a local Windows disk. A Vdisk can also be shared over the network from a Windows server so that remote clients are able to connect to SRB without the need to install any local software.
View Scheme - A type of Metadata Scheme that allows attributes/columns from views that are outside of the SRB MCAT Schema to be associated with Data Objects or Collections. This is accomplished through a linking attribute/column that joins the external view with a Master Scheme table. A similar linkage can be accomplished for "non Data Object or Collection related" tables through the usage of Table Schemes.
Virtual Collection - In new versions of SRB, users have the ability to create Virtual Collections. Their contents are, as the name says, "virtually defined" through a query (also called a policy). When users view the contents of a Virtual Collection, the query is executed and its results make up the contents of the Virtual Collection. This automatically precludes a user or application from ingesting data directly into a Virtual Collection. The only way to get new data into a Virtual Collection is by adding data, with attributes that match the Virtual Collection's query, to a "real" Collection.
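A query-defined collection can be sketched as follows. The catalog entries, attribute names, and the `VirtualCollection` class are hypothetical stand-ins; in SRB the query would run against the MCAT rather than a Python list.

```python
# Hypothetical catalog of Data Objects stored in "real" Collections.
catalog = [
    {"path": "/projects/x/scan1.tif", "author": "alice", "type": "image"},
    {"path": "/projects/x/notes.doc", "author": "bob", "type": "document"},
    {"path": "/projects/y/scan2.tif", "author": "alice", "type": "image"},
]


class VirtualCollection:
    def __init__(self, name, query):
        self.name = name
        self.query = query  # predicate over a Data Object's attributes

    def contents(self, objects):
        # The query runs at view time, so results always reflect the catalog.
        return [obj["path"] for obj in objects if self.query(obj)]


alice_images = VirtualCollection(
    "Alice's images",
    lambda obj: obj["author"] == "alice" and obj["type"] == "image",
)
```

Nothing is stored in `alice_images` itself: adding a matching object to a real Collection is enough for it to appear the next time the contents are viewed.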
top |