• Big Data Developer – Data Lake // Cambridge, MA

    From shankar3sbc@gmail.com@21:1/5 to All on Thu Apr 25 08:14:27 2019
    Big Data Developer – Data Lake
    Cambridge, MA
    6 Months with Possible extension

    Description:
    • Implementation and Administration of On-prem Data lake environment
    • Monitoring and managing the Hadoop services on 3 clusters
    • Installing new hosts (head nodes, compute nodes and worker nodes) into the existing cluster and decommissioning hosts from the cluster
    • Maintenance and monitoring of jobs in the Production, UAT and Development environments
    • Code changes and updated code deployments in the UAT and Production environments
    • Deploying code changes on the R Shiny server and RStudio server per user requests
    • Implementation and monitoring of Oozie scheduled jobs
    • Implementation of patching activities and applying the fixes provided by Hortonworks to the data lake environment
    • Working on job failures, mostly Hive and Spark jobs, across the data lake environment
    • Onboarding the new users to the Hadoop data lake environment
    • Requirements gathering for creating databases in Hive and providing policy-based access management through Ranger for new Proofs of Concept (POCs) like Veeva Insights
    • Supporting developers in executing ad hoc jobs in Hive environments for existing POCs like enrollment_forecaster, etc.
    • Managing and enforcing access-based policies from Ranger on HDFS home directories and at the Hive schema, table and column level
    • Implementation of Security and management of Active Directory based Kerberos authentication across data lake clusters
    • Implementation of SSL for Ambari and other HDP services in the Hortonworks environment across the data lake clusters
    • Management of encryption and decryption of users' data using Ranger-KMS across the data lake clusters
    • Installation and upgrade of JupyterHub and Python packages to support developers implementing code in on-prem environments
    • Working with the HPC team on hardware issues and allocation of physical resources for the data lake environment
    • Hail-Spark implementation and analysis of UK Biobank genotype and phenotype datasets
    • Installation of the latest versions of Spark and Hail and optimization of resources for handling very large datasets
    • Working with the Hortonworks team on the planned upgrade of HDP from version 2.6 to 3.0
    • Support and maintenance of MongoDB servers in data lake
    • Source code Repository maintenance in Bitbucket
    In addition to the above tasks, the resource will also perform the following AWS activities:
    • Support of the Cloudbreak server in AWS for the Hortonworks CB deployment
    • Support of software upgrades for Cloudbreak and HDP package installation in the AWS cluster
    • Supporting data scientists with any technical issues during execution of Spark-Hail jobs in the Cloudbreak AWS cluster
    • Setup of the latest versions of Spark and Hail in the AWS Spark cluster


    --
    Thanks and regards
    Shankar Allamsetti
    Phone: 281-823-9222 Ext: 517 | Fax: 281-823-9225
    Email: shankar.allamsetti@3sbc.com || G-Talk: shankar3sbc
    ****The best way to reach me is through email****
