Big Data Engineer Resume
Chicago, IL
SUMMARY
- Big Data Engineer/Hadoop Developer with 6+ years of experience in business processes, design strategies, data analytics solution development, and workflow implementation.
- Hands-on experience with Hadoop and Spark Big Data technologies, covering storage, querying, processing, and analysis of data.
- Experienced in using Hadoop ecosystem tools such as MapReduce, Hive, Sqoop, Impala, Avro, and HDFS.
- Worked extensively with Python and Java, and with databases including MySQL, Oracle, PostgreSQL, and Microsoft SQL Server.
- Hands-on experience with Hadoop cluster managers and tools such as Cloudera Manager, Apache Ambari, and Hue.
- Experienced in developing SQL, Python, and shell scripts to schedule regularly running processes.
- Proficient with the Git version control system for code sharing and collaboration.
- Experienced in creating ad-hoc and summary reports using advanced Excel, SQL, and Tableau.
- Experienced in collecting log data from various sources and integrating it into HDFS using Flume.
- Experienced in testing data in HDFS and Hive for each data transaction.
- Experienced in importing and exporting data between HDFS and relational database systems using Sqoop.
- Extensive knowledge of programming with Resilient Distributed Datasets (RDDs); a brief PySpark sketch follows this summary.
- Experienced in using Flume to transfer log data files into the Hadoop Distributed File System (HDFS).
- Good experience in shell scripting.
- Knowledgeable in managing Cloudera's Hadoop platform and CDH clusters.
- Proficient in writing complex SQL queries against databases such as Oracle, SQL Server, PostgreSQL, and MySQL.
- Excellent technical and analytical skills with a clear understanding of the design goals of ER modeling for OLTP and dimensional modeling for OLAP.
- Experience working with operational data sources and migrating data from traditional databases to Hadoop.
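The RDD experience above can be illustrated with a minimal PySpark sketch; the HDFS path, log layout, and field positions below are hypothetical placeholders rather than details from any specific engagement.

```python
# Minimal RDD-based processing sketch in PySpark.
# The log path and field layout are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("rdd-sketch").getOrCreate()
sc = spark.sparkContext

# Load raw log lines from HDFS into an RDD (path is hypothetical).
lines = sc.textFile("hdfs:///data/logs/app.log")

# Count events per status code, assuming a space-delimited format
# where the status code is the third field.
status_counts = (
    lines.map(lambda line: line.split(" "))
         .filter(lambda fields: len(fields) >= 3)
         .map(lambda fields: (fields[2], 1))
         .reduceByKey(lambda a, b: a + b)
)

for status, count in status_counts.collect():
    print(status, count)

spark.stop()
```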
TECHNICAL SKILLS
Databases: Oracle 11g/12c, MySQL, MS SQL Server, PostgreSQL, Teradata, IBM DB2
Programming Languages: C, Java, Scala, Python, PL/SQL, Pig Latin, HiveQL
Python Libraries: Pandas, NumPy, NLTK, Scikit-learn
Hadoop/Big Data: HDFS, MapReduce, Spark, YARN, Kafka, Pig, Hive, Sqoop, Storm, Flume, Oozie, Impala, HBase, Hue, ZooKeeper
Java/J2EE & Web Technologies: J2EE, EJB, JSF, Servlets, JSP, JSTL, CSS, HTML, XHTML, XML, AngularJS, AJAX, JavaScript, jQuery
Development Tools: Eclipse, NetBeans, SVN, Git, Ant, Maven, SoapUI, JMX explorer, XML Spy, QC, QTP, Jira, SQL Developer, QTOAD
Methodologies: Agile/Scrum, UML, Rational Unified Process and Waterfall.
NoSQL Technologies: Cassandra, MongoDB, HBase.
Frameworks: Struts, Hibernate, And Spring MVC.
Scripting Languages: Unix Shell Scripting, Perl.
Distributed Platforms: Hortonworks, Cloudera, MapR
Operating Systems: Windows XP/Vista/7/8/10, UNIX, Linux
Software Packages: MS Office 2007/2010/2016
Web/Application Servers: WebLogic, WebSphere Application Server, Apache Tomcat
Visualization: Tableau, QlikView, MicroStrategy, and MS Excel
Version Control: CVS, SVN, Git, TFS
Web Technologies: HTML, XML, CSS, JavaScript, and jQuery, AJAX, AngularJS, SOAP, REST and WSDL.
PROFESSIONAL EXPERIENCE
Confidential, Chicago, IL
Big Data Engineer
Responsibilities:
- Developed analytical solutions, data strategies, tools, and technologies for the marketing platform using Big Data technologies.
- Implemented solutions for ingesting data from various sources using Big Data technologies such as Hadoop, the MapReduce framework, Sqoop, and Hive.
- Worked as a Hadoop consultant on technologies including MapReduce, Pig, Hive, and Sqoop.
- Worked with the PySpark API.
- Worked with Apache Hadoop ecosystem components such as HDFS, Hive, Sqoop, Pig, and MapReduce.
- Performed real-time and near-real-time analytics on big data platforms such as Hadoop and Spark using Python.
- Used Sqoop to efficiently transfer data between databases and HDFS, and used Flume to stream log data from servers.
- Wrote Hadoop jobs to analyze data in text, sequence, and Parquet file formats using Hive and Pig.
- Analyzed the Hadoop cluster and various Big Data components including Pig, Hive, Spark, Impala, and Sqoop.
- Developed Spark code using Python and Spark SQL for faster testing and data processing.
- Monitored metrics and created backend reports and dashboards in Tableau.
- Developed predictive analytics using PySpark APIs.
- Performed big data analysis using Pig and Hive.
- Created Hive external tables, loaded data into them, and queried the data using HiveQL.
- Imported millions of structured records from relational databases using Sqoop, stored them in HDFS as CSV files, and processed them with Spark (a sketch of this pattern follows the Environment line below).
- Used Spark SQL to process large volumes of structured data.
- Extracted data from MySQL and AWS Redshift into HDFS using Sqoop.
- Worked with tools such as Flume, Sqoop, Hive, and PySpark.
- Wrote business analytics scripts using HiveQL.
Environment: Big Data, Spark, YARN, Hive, Flume, Pig, Python, Hadoop, AWS, Databases, Redshift.
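A minimal sketch of the Sqoop-landing-then-Spark pattern described in this role: Sqoop writes relational data to HDFS as CSV, and PySpark/Spark SQL process it and declare a Hive external table over the same location. The paths, schema, and table/column names are hypothetical placeholders.

```python
# Sketch: process Sqoop-landed CSV data with PySpark and Spark SQL.
# Paths, columns, and table names are hypothetical.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("sqoop-landing-processing")
    .enableHiveSupport()   # assumes a Hive metastore is configured
    .getOrCreate()
)

# Read the CSV files written by a prior Sqoop import (path hypothetical).
orders = (
    spark.read
    .option("header", "false")
    .option("inferSchema", "true")
    .csv("hdfs:///landing/sqoop/orders/")
    .toDF("order_id", "customer_id", "order_date", "amount")
)

# Expose the DataFrame to Spark SQL and run an aggregation.
orders.createOrReplaceTempView("orders")
daily_totals = spark.sql("""
    SELECT order_date, COUNT(*) AS order_cnt, SUM(amount) AS total_amount
    FROM orders
    GROUP BY order_date
""")
daily_totals.show()

# A Hive external table over the same landing directory can be
# declared with HiveQL through the same session (DDL is illustrative).
spark.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS orders_ext (
        order_id INT, customer_id INT, order_date STRING, amount DOUBLE
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    LOCATION 'hdfs:///landing/sqoop/orders/'
""")
```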
Confidential, Malvern, PA
Big Data Developer
Responsibilities:
- Worked as a Big Data Developer on the team handling the firm's proprietary platform issues, providing data analysis for the team and developing enhancements.
- Worked with large sets of big data, dealing with various security logs.
- Loaded data from relational databases into HDFS using Sqoop, and handled data received from different vendors as flat files, text data, XML data, etc.
- Developed MapReduce jobs for data cleaning and manipulation.
- Migrated data from existing RDBMSs (MySQL and SQL Server) to Hadoop using Sqoop for processing and analysis.
- Good working knowledge of implementing solutions using AWS services such as EC2, S3, and Redshift.
- Performed file system management and monitoring on Hadoop log files.
- Wrote Pig and Hive jobs to extract data from MongoDB through Sqoop and place it in HDFS.
- Used Flume to collect, aggregate, and store web log data from sources such as web servers, mobile devices, and network devices, and pushed it to HDFS.
- Developed data frames using Spark SQL as needed.
- Wrote Hive join queries to fetch information from multiple tables, and MapReduce jobs to collect data from Hive.
- Used Hive to analyze partitioned and bucketed data and compute various metrics for dashboard reporting (a sketch of this pattern follows the Environment line below).
- Developed the code for importing and exporting data into HDFS and Hive using Sqoop.
- Involved in configuring and maintaining cluster, and managing & reviewing Hadoop log files.
- Continuously monitored and managed the Hadoop cluster through Cloudera Manager.
- Analyzed large amounts of data to determine the optimal way to aggregate them, and reported the findings.
- Explored the Spark framework and methods for improving the performance and optimization of existing Hadoop jobs using Spark Context, Spark SQL, DataFrames, and YARN.
Environment: MySQL, SQL Server, Python, Hadoop, HDFS, Hive, MapReduce, Cloudera, Pig, Sqoop, Impala, Flume, PySpark, Spark SQL.
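A minimal sketch of querying partitioned Hive data and computing join-based metrics with Spark SQL and the DataFrame API, as referenced above; the table names, columns, and partition values are hypothetical.

```python
# Sketch: join a partitioned Hive fact table with a dimension table.
# Table names, columns, and the partition value are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("hive-join-metrics")
    .enableHiveSupport()   # assumes the tables exist in the Hive metastore
    .getOrCreate()
)

# HiveQL join with a partition filter that prunes to a single day's data.
metrics = spark.sql("""
    SELECT w.event_dt, d.device_type, COUNT(*) AS hits
    FROM web_logs w
    JOIN devices d ON w.device_id = d.device_id
    WHERE w.event_dt = '2018-01-01'
    GROUP BY w.event_dt, d.device_type
""")
metrics.show()

# The same result expressed with the DataFrame API.
web_logs = spark.table("web_logs").where(F.col("event_dt") == "2018-01-01")
devices = spark.table("devices")
metrics_df = (
    web_logs.join(devices, "device_id")
            .groupBy("event_dt", "device_type")
            .count()
)
metrics_df.show()
```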
Confidential
Java Developer
Responsibilities:
- Developed the web interface using Struts, JavaScript, HTML, and CSS.
- Extensively used Struts controller component classes in developing the applications.
- Developed the business tier using stateless session beans (acting as a Session Facade) and message-driven beans.
- Used JDBC and Hibernate to connect to the Oracle database.
- Configured data sources in the application server and accessed them from the DAOs through Hibernate.
- Used the Business Delegate, Service Locator, and DTO design patterns in designing the web module of the application.
- Developed SQL stored procedures and prepared statements for updating and accessing data in the database.
- Developed database-specific data access objects (DAOs) for Oracle.
- Used CVS for source code control and JUnit for unit testing.
- Used Eclipse to develop entity and session beans.
- Deployed the entire application on WebSphere Application Server.
- Followed coding and documentation standards.
Environment: Java, J2EE, JDK, JavaScript, XML, Struts, JSP, Servlets, JDBC, EJB, Hibernate, Web services, JMS, JSF, JUnit, CVS, IBM WebSphere, Eclipse, Oracle 9i, Linux.
Confidential
Junior SQL Developer
Responsibilities:
- Involved in the life-cycle of the project, i.e., requirements gathering, design, development, testing and maintenance of the database.
- Created Database Objects like Tables, Stored Procedures, Views, Clustered and Non-Clustered indexes, Triggers, Rules, Defaults, User defined data types and functions.
- Fine-tuned stored procedures, SQL queries, and user-defined functions using execution plans for better performance.
- Created and scheduled SQL jobs to run SSIS packages daily using MS SQL Server Integration Services (SSIS).
- Performed query optimization and tuning, debugging, and maintenance of stored procedures.
- Created databases, assigned database security, and applied standard data modeling techniques.
- Performed troubleshooting operations on the production servers.
- Monitored, tuned, and analyzed database performance and allocated server resources to achieve optimum database performance.
- Created staging databases and import tables in MS SQL Server.
- Loaded data into the systems using loader scripts, cursors, and stored procedures.
- Tested data in the test environment, performed client validation, and resolved issues.
- Developed reports with SSRS on SQL Server 2012.