The most important difference is that unlike GFS, Hadoop DFS files have strictly one writer at any one time. Bytes are always appended to the end of the writer's stream. There is no notion of "record appends" or "mutations" that are then checked or reordered. Writers simply emit a byte stream. That byte stream is guaranteed to be stored in the order written.
]]>The client will then have to contact one of the indicated DataNodes to obtain the actual data. @param src file name @param offset range start offset @param length range length @return file length and array of blocks with their locations @throws IOException @throws UnresolvedLinkException if the path contains a symlink. @throws FileNotFoundException if the path does not exist.]]>
Once created, the file is visible and available for reading by other clients. However, other clients cannot {@link #delete(String, boolean)}, re-create, or {@link #rename(String, String)} it until the file is completed, or until its lease expires.
Blocks have a maximum size. Clients that intend to create multi-block files must also use {@link #addBlock(String, String, Block, DatanodeInfo[])}. @param src path of the file being created. @param masked masked permission. @param clientName name of the current client. @param flag indicates whether the file should be overwritten if it already exists, created if it does not exist, or appended to. @param createParent create missing parent directory if true @param replication block replication factor. @param blockSize maximum block size. @throws AccessControlException if permission to create the file is denied by the system. As usual, on the client side the exception will be wrapped into {@link org.apache.hadoop.ipc.RemoteException}. @throws QuotaExceededException if the file creation violates any quota restriction @throws IOException if other errors occur. @throws UnresolvedLinkException if the path contains a symlink. @throws AlreadyBeingCreatedException if the file is already being created. @throws NSQuotaExceededException if the namespace quota is exceeded.]]>
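The RPC interface itself is not runnable outside a cluster, but the block-splitting arithmetic it implies can be sketched. The following helper is hypothetical (the class name `BlockMath` is not part of HDFS); it only illustrates how many `addBlock` calls a multi-block file implies given a maximum block size.

```java
// Hypothetical helper (not HDFS code): how many blocks a file of a given
// length occupies when each block holds at most blockSize bytes.
public class BlockMath {
    public static long blockCount(long fileLength, long blockSize) {
        if (fileLength == 0) {
            return 0;                      // an empty file occupies no blocks
        }
        // ceiling division: the last block may be only partially filled
        return (fileLength + blockSize - 1) / blockSize;
    }
}
```

For example, a 200 MB file with a 64 MB block size occupies four blocks, the last of which holds only 8 MB.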
Without the OVERWRITE option, rename fails if dst already exists. With the OVERWRITE option, rename overwrites dst if it is a file or an empty directory; rename fails if dst is a non-empty directory.
This implementation of rename is atomic.
@param src existing file or directory name. @param dst new name. @param options Rename options @throws IOException if rename failed @throws UnresolvedLinkException if the path contains a symlink.]]>
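The rename rules above can be captured in a small decision table. The sketch below is a hypothetical model (the class and enum names are our own, not HDFS code) that encodes the outcome for each kind of destination:

```java
// Hypothetical model (not HDFS code) of the rename rules described above.
public class RenameRules {
    enum Dst { ABSENT, FILE, EMPTY_DIR, NONEMPTY_DIR }

    /** Returns true if rename(src, dst) would succeed for the given dst state. */
    static boolean renameSucceeds(Dst dst, boolean overwrite) {
        switch (dst) {
            case ABSENT:       return true;        // nothing in the way
            case FILE:
            case EMPTY_DIR:    return overwrite;   // succeeds only with OVERWRITE
            case NONEMPTY_DIR: return false;       // always fails
            default:           return false;
        }
    }
}
```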
Safe mode is entered automatically at name node startup. Safe mode can also be entered manually using {@link #setSafeMode(FSConstants.SafeModeAction) setSafeMode(SafeModeAction.SAFEMODE_ENTER)}.
At startup the name node accepts data node reports, collecting information about block locations. In order to leave safe mode it needs to collect a configurable percentage, called the threshold, of blocks that satisfy the minimal replication condition. The minimal replication condition is that each block must have at least dfs.namenode.replication.min replicas. When the threshold is reached the name node extends safe mode for a configurable amount of time to let the remaining data nodes check in before it starts replicating missing blocks. Then the name node leaves safe mode.
If safe mode is turned on manually using {@link #setSafeMode(FSConstants.SafeModeAction) setSafeMode(SafeModeAction.SAFEMODE_ENTER)} then the name node stays in safe mode until it is manually turned off using {@link #setSafeMode(FSConstants.SafeModeAction) setSafeMode(SafeModeAction.SAFEMODE_LEAVE)}. The current state of the name node can be verified using {@link #setSafeMode(FSConstants.SafeModeAction) setSafeMode(SafeModeAction.SAFEMODE_GET)}.
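The exit test described above reduces to a fraction check. The following is a minimal sketch, not NameNode code; the class name `SafeModeCheck` and the parameter names are our own:

```java
// Hypothetical sketch (not NameNode code) of the safe-mode exit test: safe
// mode may be left once the fraction of blocks that satisfy the minimal
// replication condition reaches the configured threshold.
public class SafeModeCheck {
    /**
     * @param blocksTotal total number of blocks known to the name node
     * @param blocksSafe  blocks with at least dfs.namenode.replication.min replicas
     * @param threshold   configured fraction, e.g. 0.999
     */
    static boolean thresholdReached(long blocksTotal, long blocksSafe, double threshold) {
        if (blocksTotal == 0) {
            return true;  // an empty namespace is trivially safe
        }
        return (double) blocksSafe / blocksTotal >= threshold;
    }
}
```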
SYNOPSIS
To start:
    bin/start-balancer.sh [-threshold]
    Example: bin/start-balancer.sh
                 start the balancer with a default threshold of 10%
             bin/start-balancer.sh -threshold 5
                 start the balancer with a threshold of 5%
To stop:
    bin/stop-balancer.sh
DESCRIPTION
The threshold parameter is a fraction in the range of (0%, 100%) with a default value of 10%. The threshold sets a target for whether the cluster is balanced. A cluster is balanced if, for each datanode, the utilization of the node (ratio of used space at the node to total capacity of the node) differs from the utilization of the cluster (ratio of used space in the cluster to total capacity of the cluster) by no more than the threshold value. The smaller the threshold, the more balanced a cluster will become. It takes more time to run the balancer for small threshold values. Also, for a very small threshold the cluster may not be able to reach the balanced state when applications write and delete files concurrently.
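The balance criterion above is a simple comparison of two utilization ratios. The sketch below is a hypothetical illustration, not Balancer code; the class name and parameters are our own. Utilizations and the threshold are expressed as fractions in [0, 1]:

```java
// Hypothetical illustration (not Balancer code) of the balance criterion:
// a datanode is balanced when its utilization is within `threshold` of the
// cluster-wide utilization.
public class BalanceCheck {
    static boolean isBalanced(double nodeUsed, double nodeCapacity,
                              double clusterUsed, double clusterCapacity,
                              double threshold) {
        double nodeUtil = nodeUsed / nodeCapacity;         // node's used/capacity ratio
        double clusterUtil = clusterUsed / clusterCapacity; // cluster's used/capacity ratio
        return Math.abs(nodeUtil - clusterUtil) <= threshold;
    }
}
```

For example, with the cluster 50% utilized and the default 10% threshold, a node at 55% utilization is balanced but a node at 75% is not.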
The tool moves blocks from highly utilized datanodes to poorly utilized datanodes iteratively. In each iteration a datanode moves or receives no more than the lesser of 10 GB or the threshold fraction of its capacity. Each iteration runs no more than 20 minutes. At the end of each iteration, the balancer obtains updated datanode information from the namenode.
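The per-iteration cap described above can be sketched as a one-line computation. This is a hypothetical helper, not Balancer code; the names are our own:

```java
// Hypothetical sketch (not Balancer code): per iteration, a datanode moves or
// receives no more than min(10 GB, threshold * capacity) bytes.
public class IterationLimit {
    static final long TEN_GB = 10L * 1024 * 1024 * 1024;

    static long maxBytesPerIteration(long capacityBytes, double threshold) {
        long thresholdBytes = (long) (threshold * capacityBytes);
        return Math.min(TEN_GB, thresholdBytes);
    }
}
```

With the default 10% threshold, a 1 TB datanode is capped at 10 GB per iteration (the fixed limit dominates), while a 50 GB datanode is capped at 5 GB (the threshold fraction dominates).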
A system property that limits the balancer's use of bandwidth is defined in the default configuration file:
<property>
  <name>dfs.balance.bandwidthPerSec</name>
  <value>1048576</value>
  <description>Specifies the maximum bandwidth that each datanode can utilize
  for the balancing purpose in terms of the number of bytes per second.</description>
</property>
This property determines the maximum speed at which a block will be moved from one datanode to another. The default value is 1MB/s. The higher the bandwidth, the faster a cluster can reach the balanced state, but with greater competition with application processes. If an administrator changes the value of this property in the configuration file, the change is observed when HDFS is next restarted.
MONITORING BALANCER PROGRESS
After the balancer is started, an output file name where the balancer progress will be recorded is printed on the screen. The administrator can monitor the running of the balancer by reading the output file. The output shows the balancer's status iteration by iteration. In each iteration it prints the starting time, the iteration number, the total number of bytes that have been moved in the previous iterations, the total number of bytes that are left to move in order for the cluster to be balanced, and the number of bytes that are being moved in this iteration. Normally "Bytes Already Moved" is increasing while "Bytes Left To Move" is decreasing.
Running multiple instances of the balancer in an HDFS cluster is prohibited by the tool.
The balancer automatically exits when any of the following five conditions is satisfied:
Upon exit, the balancer returns an exit code and prints one of the following messages to the output file, corresponding to the above exit reasons:
The administrator can interrupt the execution of the balancer at any time by running the command "stop-balancer.sh" on the machine where the balancer is running.]]>
false otherwise. @throws IOException @see StorageDirectory#lock()]]>
Local storage can reside in multiple directories. Each directory should contain the same VERSION file as the others. During startup Hadoop servers (name-node and data-nodes) read their local storage information from them.
The servers hold a lock for each storage directory while they run, so that other nodes are not able to start up sharing the same storage. The locks are released when the servers stop (normally or abnormally).]]>
If locking is supported, we guarantee exclusive access to the storage directory. Otherwise, no guarantee is given. @throws IOException if locking fails]]>
For the metrics that are sampled and averaged, one must specify a metrics context that does periodic update calls. Most metrics contexts do. The default Null metrics context, however, does NOT. So if you aren't using any other metrics context, you can turn on the viewing and averaging of sampled metrics by specifying the following two lines in the hadoop-metrics.properties file:
dfs.class=org.apache.hadoop.metrics.spi.NullContextWithUpdateThread
dfs.period=10
Note that the metrics are collected regardless of the context used. The context with the update thread is used only to average the data periodically. Impl details: we use a dynamic mbean that gets the list of metrics from the metrics registry passed as an argument to the constructor.]]>
{@link #blocksRead}.inc()]]>
Finally, the namenode returns its namespaceID as the registrationID for the datanodes. namespaceID is a persistent attribute of the name space. The registrationID is checked every time the datanode communicates with the namenode. Datanodes with an inappropriate registrationID are rejected. If the namenode stops and then restarts, it can restore its namespaceID and will continue serving the datanodes that have previously registered with it, without restarting the whole cluster. @see org.apache.hadoop.hdfs.server.datanode.DataNode#register()]]>
ugi=<ugi in RPC>
ip=<remote IP>
cmd=<command>
src=<src path>
dst=<dst path (optional)>
perm=<permissions (optional)>
]]>
zero in the conf. @param conf configuration @throws IOException]]>
{@link #filesTotal}.set()]]>
dfs.class=org.apache.hadoop.metrics.spi.NullContextWithUpdateThread
dfs.period=10
Note that the metrics are collected regardless of the context used. The context with the update thread is used only to average the data periodically. Impl details: we use a dynamic mbean that gets the list of metrics from the metrics registry passed as an argument to the constructor.]]>
{@link #syncs}.inc()]]>
size.
@see org.apache.hadoop.hdfs.server.balancer.Balancer
@param datanode a data node
@param size requested size
@return a list of blocks & their locations
@throws RemoteException if size is less than or equal to 0 or datanode does not exist]]>