Sr. Big Data Engineer Resume
Minneapolis, MN
SUMMARY:
- 11+ years of experience in Information Technology across various business domains
- Working experience building and supporting large-scale Hadoop environments, including design, configuration, installation, performance tuning, and monitoring
- Hands-on experience with Big Data and Hadoop ecosystem components (HDFS, MapReduce, Pig, Hive, HBase, Sqoop, Flume, Oozie, ZooKeeper, Ambari, Ranger, Hue, Kafka, Storm, MongoDB, R).
- Experience adding, configuring, and deploying new services in the Hadoop ecosystem.
- Deep understanding of schedulers, workload management, availability, scalability, and distributed data platforms.
- Expertise in data lake processes: acquiring, preprocessing, and ingesting data in varied formats from clients.
- Experience building commercially viable, scalable solutions on the BigInsights, Hortonworks, and Cloudera distributions of Hadoop using Hive, HBase, Hue, YARN, Spark, and Sentry
- Proficient in the MapReduce Java and streaming APIs with Ruby and Python (a minimal streaming sketch follows this summary)
- Experience with Hadoop & Big Data architecture
- Experience working in AWS environments leveraging EC2, S3, etc.
- Configured and monitored Hadoop clusters with the Cloudera Enterprise distribution (CDH4)
- Configured and monitored Hadoop clusters with the Hortonworks Enterprise distribution.
- Experience with the AWS (Amazon Web Services) cloud, Docker containers, and MapR
- Experience with Spark, Spark Streaming and Spark SQL
- Expert understanding of ETL principles and how to apply them within Hadoop
- Experience with messaging and collection frameworks such as Flume and Kafka
- Experience with the Oozie workflow engine to automate and parallelize MapReduce, Hive, and Pig jobs.
- Experience with IBM Big SQL (massively parallel processing) and IBM Streams
- Experienced in developing applications using the R and Big R programming languages.
- Experience with R and statistical modeling with RStudio.
- Experience with Source control and Configuration Management tools and technologies (Git/GitHub)
- Experience building business applications on RDBMSs such as Oracle, DB2, MS SQL Server, MySQL, and PostgreSQL, including the foreign data wrapper for PostgreSQL
- Importing/exporting data between MySQL/Teradata/Netezza/DB2/Oracle and HDFS (Hadoop) using Sqoop
- Good understanding of performance tuning with both NoSQL and SQL technologies.
- Good understanding of file formats including JSON, Parquet, Avro, and others.
- Extensive experience building and designing large-scale distributed applications
- Experience with agile/scrum methodologies to iterate quickly on product changes, developing user stories and working through backlogs
- Experience with the Talend RTX ETL tool; developed and scheduled jobs in the Talend Integration Suite
- Experience with ETL development using Syncsort, NiFi, Talend, and TAC
- Experience in building self-contained applications using Docker containers.
- Experience with Linux distributions, including CentOS and Ubuntu.
- Web development experience using Flask-SQLAlchemy and other frameworks
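The MapReduce streaming work above follows the usual mapper/reducer contract over stdin/stdout. Below is a minimal, illustrative Hadoop Streaming pair in Python; the word-count task and the `wc.py` file name are placeholders, not any specific client job:

```python
#!/usr/bin/env python
"""Minimal Hadoop Streaming word count; run as `wc.py map` or `wc.py reduce`."""
import sys

def mapper():
    # Emit a tab-separated (word, 1) pair per token; Hadoop Streaming
    # shuffles and sorts these by key before the reduce phase.
    for line in sys.stdin:
        for word in line.strip().split():
            print(f"{word}\t1")

def reducer():
    # Input arrives sorted by key, so a running total per key suffices.
    current_key, total = None, 0
    for line in sys.stdin:
        key, _, value = line.rstrip("\n").partition("\t")
        if key != current_key:
            if current_key is not None:
                print(f"{current_key}\t{total}")
            current_key, total = key, 0
        total += int(value)
    if current_key is not None:
        print(f"{current_key}\t{total}")

if __name__ == "__main__":
    mapper() if sys.argv[1] == "map" else reducer()
```

Such a script would be submitted with the distribution's streaming JAR, e.g. `hadoop jar hadoop-streaming.jar -input <in> -output <out> -mapper "wc.py map" -reducer "wc.py reduce" -file wc.py`; the exact JAR path varies by distribution.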
EXPERIENCE:
Confidential, Minneapolis, MN
Sr. Big Data Engineer
Responsibilities:
- Developed a Sqoop framework to source historical data from Oracle, DB2, and SQL Server
- Ingested flat files received via the ECG FTP tool, along with Sqoop extracts, into the UHG data lake (Hive and HBase) using Data Fabric functionality.
- Integrated, extracted, and transformed data from heterogeneous sources and loaded it into Hadoop using NiFi and the Hadoop foreign data wrapper for PostgreSQL
- Validated transactional data files arriving from the IBM CDC (Change Data Capture) tool
- Used a Splunk dashboard to record and monitor the frequency of incoming files from the CDC tool
- Developed and automated jobs in Talend Open Studio to validate the ingested data.
- Performed address standardization per business requirements and captured golden records.
- Ingested XML files captured from RabbitMQ and stored them in HBase tables.
- Developed HBase tables to monitor ingestion logs and snapshot logs.
- Loaded data into HBase using bulk loads and the HBase API (see the client sketch after this list).
- Performed analytics on insurance claims data using Talend Spark components.
- Developed a data transformation module using Hive and MapReduce.
- Involved in new development as well as bug fixing and performance tuning in the existing application.
- Wrote data transformation scripts using Hive and MapReduce (Python)
- Scheduled backup jobs using the Talend TAC scheduler
- Integrated, extracted, and transformed mainframe data and loaded it into Hadoop using Talend
- Configured BI tools to access the Hadoop cluster from Windows.
- Designed and developed frameworks/APIs to leverage platform capabilities using MapReduce/HDFS
- Experience with PostgreSQL and Oracle databases and related tools such as SQLAlchemy.
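As a concrete illustration of the HBase API loads above, here is a minimal sketch using the happybase Python client against HBase's Thrift gateway (one of several ways to write through the HBase API; the Java client is another). Host, table, and column-family names are hypothetical, not the actual data-lake schema:

```python
import happybase  # Python client for the HBase Thrift gateway

# Hypothetical host, table, and column-family names; the real
# ingestion-log schema is not shown in this resume.
connection = happybase.Connection("hbase-thrift-host", port=9090)
table = connection.table("ingestion_logs")

# Batched puts amortize round trips; a true bulk load would instead
# write HFiles directly, but the row/column model is the same.
with table.batch(batch_size=500) as batch:
    batch.put(b"file-20190101-0001", {
        b"log:source": b"oracle",
        b"log:status": b"ingested",
        b"log:rows": b"182344",
    })

connection.close()
```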
Environment: MapR, Hive, HBase, Sqoop, Pig, Talend/TAC, MapReduce, Splunk, Eclipse, Maven, Shell Script, JSP, JSON, JDBC, XML, MyBatis, UNIX, Linux, DB2, Oracle, Teradata, SVN (Subversion), JBoss, MyEclipse, JIRA, Jenkins, Codehub, Autosys.
Confidential
Sr. Big Data Engineer
Responsibilities:
- Developed a data transformation module using Hive and MapReduce.
- Geospatial modeling, scripting, and geostatistical application development for a cloud-based solution using Hadoop MapReduce
- Set up the Esri Spatial Framework for Hadoop
- Added and configured the Esri JARs to execute geospatial queries in Hive and Beeline.
- Involved in new development as well as bug fixing and performance tuning in the existing application
- Imported/exported data between MySQL/Oracle and Azure and AWS clusters using Sqoop (see the Sqoop sketch after this list).
- Worked with the Hive JSON SerDe to parse spatial data in both enclosed and unenclosed JSON formats.
- Analyzed data using Hive and Pig Latin scripting
- Wrote data transformation scripts using Hive and MapReduce (Python)
- Scheduled backup jobs via cron
- Integrated, extracted, and transformed mainframe data and loaded it into Hadoop using NiFi (HDF)
- Configured BI tools to access the Hadoop cluster from Windows.
- Designed and developed frameworks/APIs to leverage platform capabilities using MapReduce/HDFS
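The Sqoop imports above reduce to a parameterized CLI call; one way to script them is to wrap the command in Python, as in the minimal sketch below. All connection details, table names, and paths are placeholders:

```python
import subprocess

# Placeholder JDBC URL, credentials, table, and HDFS target directory.
sqoop_import = [
    "sqoop", "import",
    "--connect", "jdbc:mysql://db-host:3306/sales",
    "--username", "etl_user",
    "--password-file", "/user/etl/.db_password",  # keeps the password off the CLI
    "--table", "orders",
    "--target-dir", "/data/raw/orders",
    "--num-mappers", "4",  # degree of parallelism for the import
]

# check=True surfaces a non-zero Sqoop exit code as an exception,
# so a scheduler (cron, Oozie, TAC) sees the failure.
subprocess.run(sqoop_import, check=True)
```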
Environment: AWS (Amazon Web Services), Microsoft Azure, Hive, Sqoop, crontab, MapReduce, Eclipse, Maven, Shell Script, JSP, JSON, JDBC, XML, MyBatis, UNIX, Linux, DB2, Teradata, SVN (Subversion), JBoss, MyEclipse, JIRA, Control-M/Autosys, NiFi (HDF).
Confidential
Sr. Big Data Engineer
Responsibilities:
- Configured and monitored Hadoop clusters with the IBM BigInsights Enterprise distribution (3.0/4.0)
- Configured and monitored Hadoop clusters with the Cloudera Enterprise distribution (CDH4/CDH5)
- Reviewed big data architecture, designs, and releases to maintain platform integrity
- Installed and configured Syncsort
- Developed a data transformation module using Hive and MapReduce
- Developed user-defined functions (UDFs) for Hive.
- Loaded data into HBase using bulk loads and the HBase API.
- Developed APIs to remotely access the Hadoop cluster
- Configured BI tools to access the Hadoop cluster from Windows desktops
- Architected and implemented Hadoop ecosystem components such as HDFS, MapReduce, HBase, ZooKeeper, Pig, Hadoop Streaming, Sqoop, Oozie, Hive, and HiveServer2.
- Designed applications that stage data to HDFS through Kafka for better performance (see the producer sketch after this list).
- Involved in new development as well as bug fixing and performance tuning in the existing application
- Mentored other team members in developing Hive/UDF/MapReduce scripts
- Designed and developed the presentation and web layers based on Java/J2EE
- Developed RESTful web services using Jersey and JBoss
- Demonstrated strong working knowledge of global delivery models comprising onsite and offshore staffing.
- Integrated, extracted, and transformed mainframe data and loaded it into Hadoop using Syncsort and Talend Open Studio.
- Assisted with data capacity planning and node forecasting.
- Imported/exported heterogeneous data (EBCDIC, ASCII, etc.) using Sqoop (native/fast connectors), custom APIs (MapReduce SFTP push and pull), and Flume
- Proactively identified potential performance improvements and mitigated potential issues.
- Worked closely with the IBM BigInsights team on future updates and recommendations.
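For the Kafka-to-HDFS staging pattern above, the produce side can be as small as the sketch below, using the kafka-python package; the broker address, topic name, and record fields are illustrative only. A separate consumer, or a Flume/NiFi sink, would land the records in HDFS:

```python
import json
from kafka import KafkaProducer  # kafka-python package

# Illustrative broker and serializer; records are sent as UTF-8 JSON.
producer = KafkaProducer(
    bootstrap_servers="broker-1:9092",
    value_serializer=lambda record: json.dumps(record).encode("utf-8"),
)

producer.send("claims-events", {"claim_id": "C-12345", "status": "received"})
producer.flush()  # block until buffered records are sent
producer.close()
```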
Environment: Java/J2EE, Hadoop, HBase, Hive, MapReduce, Eclipse, Maven, JavaScript, JSP, Python, HTML, JDBC, XML, MyBatis, UNIX, Linux, DB2, Teradata, SVN (Subversion), JBoss, MyEclipse, JIRA, Control-M/Autosys, Syncsort, Talend, R/Big R.
Confidential, IL
Hadoop Developer (Product Recommendation Engine)
Responsibilities:
- Developed a data transformation module using Hive and MapReduce
- Developed user-defined functions (UDFs) for Hive in Java
- Loaded data into HBase using bulk loads and the HBase API.
- Accessed data in HBase through the HBase REST interface (see the REST sketch after this list)
- Mentored other team members in developing Hive/UDF/MapReduce scripts
- Designed and developed the presentation and web layers based on Java/J2EE
- Developed RESTful web services using Jersey and JBoss
- Demonstrated strong working knowledge of global delivery models comprising onsite and offshore staffing.
- Configured and monitored Hadoop clusters with the Cloudera Enterprise distribution (CDH4)
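The HBase REST access noted above goes through HBase's REST gateway (Stargate), which returns row keys, column names, and cell values base64-encoded. A minimal read sketch, with a hypothetical endpoint, table, and row key:

```python
import base64
import requests

# Hypothetical REST gateway host, table name, and row key.
resp = requests.get(
    "http://hbase-rest-host:8080/recommendations/user-42",
    headers={"Accept": "application/json"},
)
resp.raise_for_status()

# The gateway base64-encodes column names and values; "$" holds the cell value.
for row in resp.json()["Row"]:
    for cell in row["Cell"]:
        column = base64.b64decode(cell["column"]).decode("utf-8")
        value = base64.b64decode(cell["$"]).decode("utf-8")
        print(column, "=", value)
```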
Environment: Java/J2EE, Hadoop, HBase, Hive, MapReduce, Eclipse, Maven, JavaScript, JSP, HTML, JDBC, XML, MyBatis, MySQL, DB2, Teradata, SVN (Subversion), JBoss, MyEclipse, JIRA, Control-M/Autosys
Confidential
Software Developer
Responsibilities:
- Designed and developed the application architecture, use cases, and flowcharts using Microsoft Visio
- Designed and developed the presentation and web layers of a transactional application based on Java/J2EE (Struts, Spring, Hibernate, Velocity) and web services (Axis2) using Eclipse 3.0, plus Visual Basic and ASP using Visual Studio.
- Worked with technologies such as JavaScript, XML, AJAX, HTML, and CSS.
- Developed a web services architecture supporting common business functions, with direct access to web services from PL/SQL using Oracle 9i and Axis2
- Developed an internal tracking system for marketing channels, including organic and paid search, banner ads, referral links, and email newsletters
- Developed and deployed applications in UNIX and Windows environments using Ant and shell scripts.
Environment: J2EE, Axis2, AJAX, JavaScript, Struts, Tiles, Ant, JSP, HTML, JDBC, XML, Hibernate 3.0, Spring, Oracle 9i, WebSphere 5, JBoss 3/4
XEsoft Software (Pvt.) Ltd., Mar 2008 - Sep 2009
Software Developer
Responsibilities:
- Coded and implemented client-server and web-based desktop applications
- Involved in routine maintenance and updated the documentation of functional and technical designs.
- Experience with Responsive Web Design (RWD) techniques.
- Expert HTML, JavaScript, and CSS skills.
- Designed and implemented new components and features for a rapidly growing application
Environment: Java, JDBC, Tomcat, JSP, Servlets, Oracle 7, PL/SQL, HTML, DHTML, JavaScript, ASP, UML, XML, XPath, Rational Robot, JBoss.