Wednesday, March 21, 2018

AWS EMR:: Access S3 bucket from Hue Editor

Context:

I created an S3 bucket as shown below. (Although the console states that S3 does not require region selection, the bucket page shows the region as US West (Oregon).)




Next, I created an EMR cluster (release 5.12.0) in the US West (Oregon) region with the following software configuration.






I also added inbound rules to allow access to the Master & Slave nodes from the public internet, and then enabled traffic on ports 8888 (Hue/Hive) & 8787 (RStudio).





Issue:

Once the EMR cluster was ready and in the Waiting state, I opened the Hue Editor and tried to browse the S3 bucket.

I received the error below.

"Failed to access path: "s3a://<s3-bucket-name>" Check that you have access to read this bucket and that the region is correct: Bad Request"
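Since the error message asks you to check the region, it can help to confirm which region the bucket actually lives in before editing any configuration. The following is a minimal sketch, assuming the AWS CLI is installed and has credentials that can read the bucket's location; `bucket_region` is a hypothetical helper name, not something from Hue or EMR.

```shell
#!/usr/bin/env bash

# bucket_region: print the region of the given S3 bucket.
# Hypothetical helper; relies on `aws s3api get-bucket-location`.
bucket_region() {
  local bucket="$1"
  local region
  region="$(aws s3api get-bucket-location \
    --bucket "$bucket" \
    --query LocationConstraint \
    --output text)" || return 1

  # For buckets in us-east-1 the API returns a null LocationConstraint,
  # which the CLI prints as "None" in text output.
  if [ -z "$region" ] || [ "$region" = "None" ]; then
    region="us-east-1"
  fi
  echo "$region"
}

# Usage (replace the placeholder with your bucket name):
#   bucket_region <s3-bucket-name>
```

For a bucket created in US West (Oregon), the region code to expect is us-west-2.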




Solution:

I followed the steps below to fix this issue.

  1. Log in to the EC2 instance (master node) as ec2-user using your key pair



  2. Go to /etc/hue/conf.empty/ or /usr/lib/hue/desktop/conf

  3. Ref command: cd /etc/hue/conf.empty/ or cd /usr/lib/hue/desktop/conf

  4. Take a backup of the hue.ini file

  5. Ref command: sudo cp hue.ini hue.ini_<ddmmyyyy>

  6. Edit the hue.ini file

  7. Ref command: sudo vi hue.ini

    Uncomment the lines below (for me, they appeared between line numbers 1300 & 1325) and update the values for your region and security credentials.
    For the access key ID and secret access key, refer to the link.

    Before:



    After:



    Note: Storing the access key ID & secret access key in the hue.ini file is not recommended for production environments. Use the properties below instead of access_key_id & secret_access_key.

    access_key_id_script=/path/to/access_key_script
    secret_access_key_script=/path/to/secret_key_script

  8. Save the changes.

  9. Press Escape and type :wq to write the changes to the hue.ini file and close the vi editor.

  10. Restart the Hive and Hue services.

  11. Run the commands below to check the status of the Hive and Hue services on the master node.

    initctl list | grep hive
    initctl list | grep hue

    Run the commands below to stop the services.

    sudo stop hue
    sudo stop hive-server2


    Run the commands below to start the services.

    sudo start hive-server2
    sudo start hue


  12. Open the Hue Editor again and click on the S3 browser. It should now display the files in your S3 bucket.
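To locate the lines edited in step 7 without scrolling through vi, the sketch below prints only the [aws] section of a hue.ini file. `show_aws_section` is a hypothetical helper name. The layout shown in the comments is an assumption based on Hue's stock hue.ini template and may differ from the file on your cluster; the credential values are placeholders.

```shell
#!/usr/bin/env bash

# show_aws_section: print the [aws] section of the given hue.ini file.
# Hypothetical helper. Printing starts at the top-level "[aws]" header and
# stops at the next top-level "[section]" header in column 0.
show_aws_section() {
  local ini_file="$1"
  awk '
    /^\[aws\]/  { in_aws = 1; print; next }  # start of the [aws] section
    /^\[[^[]/   { in_aws = 0 }               # any other top-level section
    in_aws      { print }
  ' "$ini_file"
}

# Usage:
#   show_aws_section /etc/hue/conf.empty/hue.ini
#
# Assumed shape of the section once uncommented (placeholder values;
# us-west-2 is the region code for US West (Oregon)):
#
#   [aws]
#     [[aws_accounts]]
#       [[[default]]]
#         access_key_id=<your-access-key-id>
#         secret_access_key=<your-secret-access-key>
#         region=us-west-2
```

Lines that are still commented out in the section are printed as well, which makes it easy to see what remains to be uncommented.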





Note: If you terminate the cluster, you need to repeat the same steps on the new cluster to enable access.
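Because the steps have to be repeated on every new cluster, it may be convenient to keep the backup and restart commands in a small script. This is only a sketch built from the commands in the steps above. The function names `backup_hue_ini` and `restart_hue_services` are hypothetical, and the `start`/`stop` commands assume the upstart-based service management this post uses on EMR 5.12.0. Editing hue.ini itself is still a manual step.

```shell
#!/usr/bin/env bash

# backup_hue_ini: copy hue.ini to hue.ini_<ddmmyyyy> in the same directory.
# Hypothetical helper; pass the Hue configuration directory as an argument.
backup_hue_ini() {
  local conf_dir="$1"
  local stamp
  stamp="$(date +%d%m%Y)"
  sudo cp "$conf_dir/hue.ini" "$conf_dir/hue.ini_$stamp"
}

# restart_hue_services: stop Hue and HiveServer2, then start them again
# in the same order as the manual steps in this post.
restart_hue_services() {
  sudo stop hue || true            # ignore the error if already stopped
  sudo stop hive-server2 || true
  sudo start hive-server2
  sudo start hue
}

# Usage, on the master node:
#   backup_hue_ini /etc/hue/conf.empty
#   sudo vi /etc/hue/conf.empty/hue.ini   # make the edits from step 7
#   restart_hue_services
```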