This is a collection of miscellaneous info regarding EC2. Special thanks to the past TAs who compiled this information.
When Things Go Wrong
Deleting Old Output Directories
Because the local disks on these instances are very large, the simplest way to avoid the "output directory already exists" error is to give each run a new output name on HDFS. If, however, you want to delete an old output from HDFS, you can do it with
$ hc large dfs -rmr hdfs:///NAME
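As a minimal sketch of inventing a fresh output name for each run: the helper name unique_output_name and the base name wordcount-out are hypothetical, and the timestamp-plus-PID suffix is only one possible convention.

```shell
# Hypothetical helper: build an output name that is unlikely to already exist
# by appending the current date/time and this shell's process ID.
unique_output_name() {
    base="$1"
    printf '%s-%s-%s\n' "$base" "$(date +%Y%m%d-%H%M%S)" "$$"
}

# "wordcount-out" is an illustrative base name, not one the course defines.
out="$(unique_output_name wordcount-out)"
echo "hdfs:///$out"
```

The printed path can then be used wherever your job expects its HDFS output location.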
Stopping running Hadoop jobs
Often it is useful to kill a job. Stopping the Java program that launches the job is not enough; Hadoop on the cluster will continue running the job without it. The command to tell Hadoop to kill a job is:
$ hc large job -kill JOBID
where JOBID is a string like "job_201101121010_0002" that you can get from the output of your console log or the web interface.
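If you saved the console log, the JOBID can be pulled out with a pattern match. This is a sketch under assumptions: the helper name extract_job_ids and the sample input line are made up for illustration, and the pattern is inferred only from the example ID above.

```shell
# Hypothetical helper: read log text on stdin and print each distinct job ID once.
# Assumes IDs look like job_<digits>_<digits>, as in job_201101121010_0002.
extract_job_ids() {
    grep -oE 'job_[0-9]+_[0-9]+' | sort -u
}

# Illustrative input only; a real console log will contain other text.
printf 'Running job: job_201101121010_0002\n' | extract_job_ids
```

Each ID printed this way can be passed to the kill command shown above.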
Proxy problems
"12/34/56 12:34:56 INFO ipc.Client: Retrying connect to server: ec2-123-45-67-89.amazonaws....."
or
"Exception in thread "main" java.io.IOException: Call to ec2-123-45-67-89.compute-1.amazonaws.com/123.45.67.89:8020 failed on local exception: java.net.SocketException: Connection refused"
If you get either of these errors from hc, try running
$ hadoop-ec2 proxy large
again. If you continue getting this error from hc after doing that, check that your cluster is still running using
$ hadoop-ec2 list large
and by making sure the web interface is accessible.
Deleted configuration files
If you've accidentally deleted one of the configuration files created by bash ec2-init.sh, you can recreate it by rerunning bash ec2-init.sh.
Last resort
It's okay to stop and restart a cluster if things are broken, but doing so wastes time and money.
Miscellaneous EC2-related Commands
We don't think you'll need any of these...
Terminating/Listing Instances Manually
You can get a raw list of all virtual machines you have running using
$ ec2-my-instances
This will include the public DNS name (starts with "ec2-" and ends with "amazonaws.com") and the private name (starts with "ip-...") of each virtual machine you are running, as well as whether it is running or shutting down or recently terminated, its type, the SSH key associated with it (probably USERNAME-default) and the instance ID, which looks like "i-abcdef12". You can use this instance ID to manually shut down an individual machine:
$ ec2-terminate-instances i-abcdef12
Note that this command will not ask for confirmation. ec2-terminate-instances comes from the EC2 command line tools. ec2-my-instances is an alias for the command line tools' ec2-describe-instances command with options to only show your instances rather than the whole class's.
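Because termination is not confirmed, it can help to list the instance IDs on their own and review them first. The sketch below is an assumption-laden illustration: extract_instance_ids is a hypothetical helper, the sample input is made up rather than real ec2-my-instances output, and the pattern is inferred only from the example ID above.

```shell
# Hypothetical helper: print each distinct instance ID found in text on stdin.
# Assumes IDs look like i-<hex digits>, as in i-abcdef12. The -w flag keeps
# longer tokens such as ami-12345678 from matching.
extract_instance_ids() {
    grep -owE 'i-[0-9a-f]+' | sort -u
}

# Made-up input for illustration; this is not real ec2-my-instances output.
printf 'i-abcdef12 ami-12345678 running\n' | extract_instance_ids
```

Check the printed IDs by eye before handing any of them to ec2-terminate-instances.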
Logging into your EC2 virtual machines
$ hadoop-ec2 login large
# or using a machine name listed by ec2-my-instances or hadoop-ec2 list
$ ssh-nocheck -i ~/USERNAME-default.pem root@ec2-....amazonaws.com
The cluster you start is composed of ordinary Linux virtual machines. The file ~/USERNAME-default.pem is the private part of an SSH keypair for the machines you have started.
Viewing/changing your AWS credentials
You can view your AWS access key and secret access key using:
$ ec2-util --list
If you somehow lose control of your AWS credentials, you can get new AWS access keys using:
$ ec2-util --rotate-secret
$ new-ec2-certificate
This is likely to break any running instances you have.