Big Data Engineer (Hadoop Developer) Resume Chicago - Hire IT People

SUMMARY

Over 8 years of IT experience in software design, development, implementation, and support of business applications for the telecom, health, and insurance industries
Worked extensively on installing and configuring Hadoop ecosystem components: Hive, Sqoop, Pig, HBase, ZooKeeper, and Flume
Played a key role in understanding the business requirements for migrating data to the data warehouse.
Performed unit testing at various levels of the ETL and was actively involved in team code reviews.
Performed data extraction, aggregation, and consolidation of Adobe data within AWS Glue using PySpark.
Designed and implemented effective analytics solutions and models with Snowflake.
Compared the performance of the Hadoop based system to the existing processes used for preparing the data for analysis
Worked on real-time data integration using Kafka, Spark Streaming, and HBase.
Implemented ETL operations using Big Data platform
Good knowledge of Spark, Scala, and SQL queries, and of creating database objects such as stored procedures and triggers to implement business logic.
Implemented a CI/CD pipeline involving Bitbucket, Jenkins, Chef, and Docker for complete automation from commit to deployment.
Hands-on experience with build and development tools such as Maven, Ant, Log4j, and JUnit
Worked on data extraction, transformation, and load in Hive, Pig, and HBase
Hands-on experience in designing and developing Spark applications in Scala to compare the performance of Spark with Hive and SQL/Oracle.
Experience in understanding the security requirements for Hadoop and integrating with Kerberos authentication and authorization infrastructure.
Hands-on experience with streaming data ingestion and processing
Experience in designing different time driven and data driven automated workflows using Oozie.
Expertise in writing real-time processing applications using spouts and bolts in Storm.
Experience in configuring ZooKeeper to coordinate the servers in clusters and to maintain data consistency
Used Python and Django to interface with the jQuery UI and manage the storage and deletion of content.
Proficient in data migration from relational databases to the Hadoop platform using Sqoop.
Installed and configured Hadoop ecosystem components
Wrote Flume configuration files for importing streaming log data into HBase.
Imported several transactional logs from web servers with Flume to ingest the Data into HDFS.
Installed and configured Pig and wrote Pig Latin scripts to convert data from text files to Avro format.
Created partitioned Hive tables and worked on them using HiveQL.
Experienced in migrating ETL transformations using Pig Latin scripts, including transformations and join operations.
Good understanding of MPP databases such as HP Vertica and Impala.
Hands-on experience in configuring and working with Flume to load data from multiple sources directly into HDFS
Experience in using design patterns, Java, JSP, Servlets, JavaScript, HTML, jQuery, AngularJS, jQuery Mobile, JBoss 4.2.3, XML, WebLogic, SQL, PL/SQL, JUnit, Apache Tomcat, and Linux.
Expertise in relational databases such as Oracle, MySQL, and SQL Server.
Experience in implementing projects both in Agile and Waterfall methodologies.
Good experience with cloud technologies: AWS, Azure, and GCP
Strong experience with data warehousing ETL concepts using Informatica PowerCenter, OLAP, OLTP, and AutoSys.
Migrated HiveQL queries into Spark SQL to improve performance.
Performed Data integrity, validation and testing on the data migrated into the data warehouse.

TECHNICAL SKILLS

Bigdata/Hadoop
Oracle 12c
Sqoop
Pig
Hive
SQL
PL/SQL
API
HBase
NoSQL
Python
PySpark
ADF
SaaS
Erwin
Kafka
Spark
SSIS
MapReduce
ETL
SSRS
Tableau
Oozie
Teradata

PROFESSIONAL EXPERIENCE

Big Data Engineer (Hadoop Developer)

Confidential

Responsibilities:

Expertise in designing and deploying Hadoop clusters and Big Data analytic tools including Hive, AutoSys, Spark, and Impala with the Cloudera distribution
Tuned the performance of long-running jobs and added parallelism to jobs
Created views on top of existing data to transform with HiveQL
Reviewed Hadoop cluster and application configuration and configured the required parameters
Built processing flows with the UC4 scheduler and worked on data cleansing routines
Supported production applications for daily loads and automated them using AutoSys
Applied Hive query optimization techniques for integration
Responsible for running benchmarks and fine-tuning the applications
Performed monthly refresh activities and produced data for MBRs
Executed SQL statements to validate the application data and update the data layouts
Collaborated with other teams such as Product, Technology, Risk, Compliance, Legal, and Operations
Worked with teams in different geographical regions to deliver solutions
Managed and deployed applications as Docker containers on AWS cloud infrastructure
Deployed Elastic Load Balancing for scalability and high availability on AWS
Implemented and managed continuous automation on AWS with CloudFormation and Terraform
Good knowledge of cloud technologies such as AWS, Azure, and GCP
Extracted, transformed, and loaded data from source systems to Azure Data Storage.
Responsible for keeping up to date on new and updated tools in the Big Data environment
Designed and implemented Hive queries and functions for evaluation, filtering, loading, and storing of data
Implemented a risk assessment process and presented the global risk program to the Board along with senior management
Automated jobs using AutoSys for all environments with regression testing
Worked on solution requirements, including transformations of the given data sources
Developed mechanisms for data ingestion and for loading data from the sources
Worked on complex and mission-critical data analysis for a wide range of applications using data in different formats, volumes, and source systems at company scale
Provided production support for Sqoop jobs running in production.
Experience creating databases, tables, and views using HiveQL and Impala
Worked with the data delivery team to set up new Hadoop users and Linux users, set up Kerberos principals, and test HDFS, Hive, Pig, and MapReduce access for the new users on Hortonworks
Good knowledge of using Hibernate for mapping Java classes to database tables and of Hibernate Query Language (HQL).
Involved in unit testing, integration testing, and deployment

Hadoop Developer

Confidential, Chicago

Responsibilities:

Involved in migrating applications from Cloudera to Hortonworks. Experience working with various Cloudera distributions (CDH4/CDH5) and knowledge of Hortonworks
Expertise in designing and deploying Hadoop clusters and Big Data analytic tools including Pig, Hive, HBase, Oozie, Sqoop, Flume, Spark, and Impala.
Responsible for data extraction and data ingestion from different data sources into Hadoop Data Lake by creating ETL pipelines using Sqoop and Hive
Professional Java developer with strong expertise in data engineering and big data technologies.
Hands on experience in programming using Java, Scala and SQL.
Analyzed the data using HiveQL to identify the different correlations and used core Java technologies to create Hive/Pig UDFs to use in the project.
Implemented complex MapReduce programs to perform joins on the Map side using Distributed Cache in Java.
Experience in performance tuning the Hadoop cluster by gathering and analyzing the existing infrastructure.
Continuously monitored and managed the Hadoop cluster using Cloudera Manager and Ambari
Created multiple MapReduce jobs using the Java API, Pig, and Hive for data extraction
Wrote ETL jobs to read from web APIs using REST and HTTP calls and loaded the data into HDFS using Java and Talend.
Worked on Java/J2EE systems with different databases, including Oracle, MySQL, and DB2.
Developed multiple MapReduce jobs in Java for data cleaning and preprocessing.
Worked with the Talend ETL tool, using features such as context variables and components such as tOracleInput, tOracleOutput, tFileCompare, tFileCopy, and tOracleClose
Created ETL Mapping with Talend Integration Suite to pull data from Source, apply transformations, and load data into target database.
Worked with D-Series for Scheduling Sqoop jobs
Used HiveQL for data analysis, such as creating tables and importing structured data into specified tables for reporting.
Hands on experience in application development using Java, RDBMS, and Linux shell scripting.
Designed and Implemented Azure architectures, environments, and resources
Implemented migrations of applications and databases onto Azure
Contributed to or authored blogs, whitepapers, and presentations on Azure technical and strategic topics
Experience creating databases, tables, and views using HiveQL and Impala
Worked with the data delivery team to set up new Hadoop users and Linux users, set up Kerberos principals, and test HDFS, Hive, Pig, and MapReduce access for the new users on Hortonworks
Good knowledge of using Hibernate for mapping Java classes to database tables and of Hibernate Query Language (HQL).
Extensive hands on experience in writing complex MapReduce jobs and Hive data modeling.
Involved in Agile methodologies, daily scrum meetings, and sprint planning.

Hadoop Developer

Confidential

Responsibilities:

Developed schedulers that communicated with cloud-based services (AWS) to retrieve the data.
Designed and implemented Hive queries and functions for evaluation, filtering, loading, and storing of data.
Created Hive tables and worked on them using HiveQL.
Developed a data pipeline using Kafka and Storm to store data into HDFS.
Continuously monitored and managed the Hadoop cluster through Cloudera Manager.
Involved in review of functional and non-functional requirements.
Implemented frameworks using Java and Python to automate the ingestion flow.
Developed views and templates with Python and Django's view controller and templating language to create a user-friendly website interface.
Experience using Python with various libraries, such as graphing libraries, MySQLdb for database connectivity, and python-twitter.
Developed a fully automated continuous integration system using Git, Gerrit, Jenkins, MySQL and custom tools developed in Python and Bash.
Responsible for managing data coming from different sources.
Loaded CDRs from relational databases using Sqoop, and from other sources into the Hadoop cluster using Flume.
Experience in processing large volumes of data and in parallel execution of processes using Talend functionality.
Involved in loading data from UNIX file system and FTP to HDFS.
Developed Hive queries to analyze the output data.
Developed workflows in Oozie to automate the tasks of loading the data into HDFS and pre-processing with Pig.
Designed cluster coordination services through ZooKeeper.
Collected log data from web servers and integrated it into HDFS using Flume.
Used Hive to do transformations, event joins, and some pre-aggregations before storing the data onto HDFS.
Designed and implemented Spark jobs to support distributed data processing.
Supported the existing MapReduce programs running on the cluster.
Wrote shell scripts to monitor the health of Hadoop daemon services and respond to any warning or failure conditions.
Involved in migrating the Hive data to Google BigQuery
Implemented automated workflows with Apache Airflow and integrated the scripts with Jenkins
Used Jenkins to deploy code to Google Cloud with new namespaces, create Docker images, and push them to the Google Cloud container registry.
Expertise in documenting the deployment process and preparing release notes, checklists, quality process docs, analysis docs, and versioned configuration docs.
Led many formal and informal sessions to educate others on security issues and the importance of best practices in GCP.
Expertise in designing the Google Cloud architecture in line with financial regulations from a security point of view.
Expertise in several GCP services, focusing on security, Kubernetes, and BigQuery.
Expertise in automation of the infrastructure using Terraform for both AWS and GCP.
Created a Hive table to store the processed results in tabular format.
Involved in Building multitenant solutions using Python and internal tools, delivering complex cloud platforms
Worked with Spark to improve the performance and optimization of the existing Hadoop applications, using SparkContext, Spark SQL, DataFrames, pair RDDs, and Spark on YARN
Monitored Hadoop Jobs and Reviewed Logs of the failed jobs to debug the issues based on the errors
Fine-tuned Hadoop applications for high performance and throughput
Worked on a POC for real-time data streaming using Spark with Kafka.
Supported development with application architecture for both real-time and batch big data processing
Used the Spark API over Hadoop YARN as the execution engine for data analytics using Hive
Worked on different file formats (ORCFILE, TEXTFILE) and different Compression Codecs (GZIP, SNAPPY, LZO).

Java Developer

Confidential

Responsibilities:

Involved in Agile sprint methodologies of the SDLC for project management, design, and development
Designed and implemented the training and reports modules of the application using Servlets, JSP, and Ajax
Experience in using Spring Integration and RabbitMQ for creation of web services and communication.
Interacted with business users and developed custom reports based on the criteria defined.
Gathered requirements and collected information.
Analyzed the gathered information to prepare a detailed work plan and task breakdown structure
Developed custom JSP tags for the application
Involved in the phases of the SDLC (Software Development Life Cycle), including requirement collection, design and analysis of customer specifications, and development and customization of the application
Used Postman to send HTTP requests to SOAP- and REST-based APIs, speeding up API testing.
Created and consumed SOAP and REST services using CXF and used Mule ESB to route various calls to do validation of service input and to handle exceptions.
Used Quartz schedulers to run jobs sequentially within the given time window
Implemented the reports module using JasperReports for business intelligence
Good experience writing SQL/Transact-SQL (DDL, DML, and DCL)
Developing, Creating New Database and Database Objects Such as Tables, Views, Indexes, Complex Stored
Deployed the application on a Tomcat server at the client location
End-to-End System development and testing of Unit integration and System integration
Coordinated with onshore and offshore teams of 10+ members
Responsible for Effort estimation and timely production deliveries
Created and executed half-yearly and yearly load jobs that update new rates, discounts, etc. for the claim calculations in the database and files
Extensively used Java multi-threading to implement batch Jobs with JDK 1.5 features
Configured the project on WebLogic 10.3 application servers
Implemented the online application using Core Java, JDBC, JSP, Servlets, Spring, Hibernate, Web Services, SOAP, and WSDL.
