Forum: >>> Magnum BBS <<<

Big Data Developer – Data Lake // Cambridge, MA

From shankar3sbc@gmail.com@21:1/5 to All on Thu Apr 25 08:14:27 2019

Big Data Developer – Data Lake
Cambridge, MA
6 months with possible extension

Description:
• Implementation and administration of the on-prem data lake environment
• Monitoring and managing the Hadoop services on 3 clusters
• Installing new hosts (head nodes, compute nodes and worker nodes) in the existing cluster and decommissioning hosts from the cluster
• Maintenance and monitoring of jobs in the Production, UAT and Development environments
• Code changes and deployment of updated code in the UAT and Production environments
• Deploying code changes on the R Shiny server and RStudio server as per user requests
• Implementation and monitoring of Oozie scheduled jobs
• Patching the data lake environment and applying fixes provided by Hortonworks
• Troubleshooting job failures, mostly Hive and Spark jobs, across the data lake environment
• Onboarding new users to the Hadoop data lake environment
• Gathering requirements for creating databases in Hive and providing policy-based access management through Ranger for new Proofs of Concept (POCs) such as Veeva Insights
• Supporting developers in running ad hoc jobs in Hive environments for existing POCs such as enrollment_forecaster
• Managing Ranger access policies for HDFS home directories and for Hive at the schema, table and column level
• Implementation and management of Active Directory based Kerberos authentication across the data lake clusters
• Implementation of SSL for Ambari and other HDP services in the Hortonworks environment across the data lake clusters
• Managing encryption and decryption of user data using Ranger KMS across the data lake clusters
• Installing and upgrading JupyterHub and Python packages to support developers implementing code in on-prem environments
• Working with the HPC team on hardware issues and allocation of physical resources for the data lake environment
• Hail on Spark implementation and analysis of UK Biobank genotype and phenotype datasets
• Installing the latest versions of Spark and Hail and optimizing resources for loading very large datasets
• Working with the Hortonworks team on the planned upgrade of HDP from version 2.6 to 3.0
• Support and maintenance of MongoDB servers in the data lake
• Source code repository maintenance in Bitbucket
In addition to the above tasks, the resource will also perform the following AWS activities:
• Support of the Cloudbreak server in AWS for the Hortonworks CB deployment
• Support of software upgrades for Cloudbreak and HDP package installation in the AWS cluster
• Supporting data scientists with technical issues during the execution of Spark-Hail jobs in the Cloudbreak AWS cluster
• Setting up the latest versions of Spark and Hail in the AWS Spark cluster

--
Thanks and regards
Shankar Allamsetti
Phone: 281-823-9222 Ext: 517 | Fax: 281-823-9225
Email: shankar.allamsetti@3sbc.com | G-Talk: shankar3sbc
****Best way to reach me is through email****

--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)
