Hadoop/Data Analyst Resume
Hartford, CT
SUMMARY
- 7+ years of software development experience, with 2.5 years of experience in Hadoop and big data related technologies.
- Excellent knowledge of the internal workings of the HDFS filesystem and MapReduce
- In-depth knowledge of Hadoop ecosystem components such as Pig, Hive, Sqoop, Flume, Oozie, and ZooKeeper.
- Experience with the enterprise Hadoop distributions from Cloudera and Hortonworks.
- Built, deployed, and managed a 155-node Hadoop cluster based on Cloudera's Distribution of Hadoop
- Experience in configuring High Availability for Cloudera Manager 5 and its services
- Strong knowledge of configuring quorum-based NameNode high availability with automatic failover using ZooKeeper, and NameNode federation
- Experience in monitoring and managing large-scale, multi-node production Hadoop clusters on CDH4 and CDH5
- Experience in performing minor and major upgrades and in commissioning and decommissioning nodes on a Hadoop cluster
- Experience in managing Hadoop processes both manually and through init scripts
- Experience in HDFS and MapReduce maintenance tasks, including adding a DataNode/TaskTracker, checking filesystem integrity with fsck, and balancing HDFS block data
- Strong knowledge of Apache Hive and Pig administration and deployment
- Experience in working on production support and maintenance-related projects
- Hands-on experience in data mining, implementing complex business logic, and optimizing queries using HiveQL; controlled data distribution through partitioning and bucketing techniques to enhance performance (a HiveQL sketch follows this summary).
- Solid experience in Pig administration and development, including writing Pig UDFs (Eval, Filter, Load, and Store) and macros (a UDF sketch follows this summary)
- Experience in embedding Hive and Pig in Java
- Experience in using HCatalog with Hive, Pig, and HBase
- Exposure to the NoSQL databases HBase and Cassandra
- Developed ETL processes to load data into HDFS using tools such as Flume and Sqoop, performed structural modifications using MapReduce and Hive, and analyzed data using visualization/reporting tools
- Familiar with importing and exporting data using Sqoop
- Experience in writing MapReduce joins, such as map-side joins using the DistributedCache API (a mapper sketch follows this summary)
- Experience in planning, designing, and developing applications spanning the full software development life cycle: functional specification, design, implementation, documentation, unit testing, and support.
- Experience in working on production support environments for large applications, involving complex issues, bug fixes, daily/monthly/yearly maintenance activities, batch job monitoring, and troubleshooting.
- Excellent team player with good communication, interpersonal, and presentation skills.
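As a sketch of the partitioning and bucketing approach referenced above: the HiveQL below is illustrative only, and the table, column, and setting values are assumptions rather than taken from a specific project.

    -- Partition by load date and bucket by customer_id to spread data evenly across files
    CREATE TABLE complaints (
      complaint_id BIGINT,
      customer_id  BIGINT,
      severity     STRING,
      description  STRING
    )
    PARTITIONED BY (load_date STRING)
    CLUSTERED BY (customer_id) INTO 32 BUCKETS;

    -- Allow dynamic partitions and enforce bucketing when loading from a staging table
    SET hive.exec.dynamic.partition=true;
    SET hive.exec.dynamic.partition.mode=nonstrict;
    SET hive.enforce.bucketing=true;

    INSERT OVERWRITE TABLE complaints PARTITION (load_date)
    SELECT complaint_id, customer_id, severity, description, load_date
    FROM staging_complaints;

Pruning queries by load_date and joining on the bucketed customer_id column is what provides the performance gain.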
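A minimal sketch of a Pig Eval UDF of the kind mentioned above; the class name and field handling are hypothetical.

    import java.io.IOException;
    import org.apache.pig.EvalFunc;
    import org.apache.pig.data.Tuple;

    // Illustrative Eval UDF: upper-cases the first field of the input tuple.
    // Used from a Pig script after REGISTER myudfs.jar, e.g. B = FOREACH A GENERATE ToUpper(name);
    public class ToUpper extends EvalFunc<String> {
        @Override
        public String exec(Tuple input) throws IOException {
            if (input == null || input.size() == 0 || input.get(0) == null) {
                return null;
            }
            return ((String) input.get(0)).toUpperCase();
        }
    }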
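A sketch of a map-side join using the distributed cache, assuming the driver ships a small customer lookup file to every mapper; the file, field, and class names are illustrative.

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // Map-side join: a small lookup file shipped to every mapper through the distributed
    // cache is loaded into memory in setup() and joined against the large input in map(),
    // so no reduce phase is needed. Assumes the driver registered the file with
    //   job.addCacheFile(new URI("/lookup/customers.txt#customers.txt"));
    public class MapSideJoinMapper extends Mapper<LongWritable, Text, Text, Text> {

        private final Map<String, String> lookup = new HashMap<String, String>();

        @Override
        protected void setup(Context context) throws IOException, InterruptedException {
            // The "#customers.txt" fragment symlinks the cached file into the task directory
            BufferedReader reader = new BufferedReader(new FileReader("customers.txt"));
            String line;
            while ((line = reader.readLine()) != null) {
                String[] parts = line.split(",", 2);       // customer_id,customer_name
                lookup.put(parts[0], parts[1]);
            }
            reader.close();
        }

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String[] fields = value.toString().split(","); // complaint record keyed by customer_id
            String name = lookup.containsKey(fields[0]) ? lookup.get(fields[0]) : "UNKNOWN";
            context.write(new Text(fields[0]), new Text(name + "\t" + value.toString()));
        }
    }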
TECHNICAL SKILLS
Languages: C++, Java, VB, Shell Scripting, IBM AS/400 (RPG, CL, Subfiles, Display Files), PL/SQL, ASP.NET
Hadoop Ecosystem: HDFS, MapReduce, Hive, Pig, Sqoop, Oozie, Flume, ZooKeeper, HBase
Tools & Technologies: Rational Rose, Eclipse/SpringSource IDE, Microsoft Office Suite, MATLAB, Xcode, NetBeans, Microsoft Visual Studio
Databases: Oracle, MySQL, SQL Server, IBM DB2/400
Operating Systems: Windows XP/7/8, Linux RedHat/Ubuntu/CentOS
Monitoring Tools: Cloudera Manager, Ganglia, Nagios, Ambari
Version Control: Git, Microsoft Visual SourceSafe, Subversion (SVN)
PROFESSIONAL EXPERIENCE
Confidential, Hartford, CT
Hadoop/Data Analyst
Responsibilities:
- Actively participated with the development team to meet specific customer requirements and proposed effective Hadoop solutions.
- Installed and configured the cluster, including setup of the NameNode, DataNodes, JobTracker, and TaskTrackers.
- Worked closely with the complaint processing teams to determine the severity of each complaint.
- Collected the history of customers who had registered a complaint.
- Performed text mining on transcripts of phone logs from customers reaching out to customer care, using MapReduce with R.
- Aggregated all the data of a customer who had filed a complaint, using Pig and Hive with MySQL.
- Exported the data to an RDBMS using Sqoop (an example command follows this list).
- Identified patterns in customer complaints.
- Reached out to customers who had filed a complaint, or had a high tendency to file one, and resolved the issue within a week.
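An illustration of the Sqoop export step mentioned above; the connection string, credentials, table, and directory are placeholders.

    sqoop export \
      --connect jdbc:mysql://dbhost:3306/complaints_db \
      --username etl_user -P \
      --table complaint_summary \
      --export-dir /user/hive/warehouse/complaint_summary \
      --input-fields-terminated-by '\001' \
      --num-mappers 4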
Environment: HDFS, Pig, Hive, MapReduce, Linux, HBase, Flume, Sqoop, VMware, Eclipse, Cloudera, Hortonworks
Confidential, Hartford, CT
Hadoop Consultant
Responsibilities:
- Worked closely with the claims processing team to identify patterns in the filing of fraudulent claims.
- Performed a major upgrade of the cluster from CDH3u6 to CDH4.4.0.
- Developed MapReduce programs to extract and transform the data sets; results were exported back to an RDBMS using Sqoop.
- Observed patterns in fraudulent claims using text mining in R and Hive.
- Installed, configured, and managed the Flume infrastructure.
- Responsible for importing data (mostly log files) from various sources into HDFS using Flume (an agent configuration sketch follows this list).
- Created tables in Hive and loaded the structured data resulting from MapReduce jobs.
- Developed numerous HiveQL queries to extract the required information.
- Exported the required information to an RDBMS using Sqoop, making the data available to the claims processing team to assist in processing claims.
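A minimal sketch of a Flume agent configuration for landing log files in HDFS, as described above; the agent name, source command, and HDFS path are assumptions.

    # Illustrative Flume agent: tail an application log and write it to HDFS
    agent1.sources  = logsrc
    agent1.channels = memch
    agent1.sinks    = hdfssink

    agent1.sources.logsrc.type     = exec
    agent1.sources.logsrc.command  = tail -F /var/log/claims/app.log
    agent1.sources.logsrc.channels = memch

    agent1.channels.memch.type     = memory
    agent1.channels.memch.capacity = 10000

    agent1.sinks.hdfssink.type                   = hdfs
    agent1.sinks.hdfssink.channel                = memch
    agent1.sinks.hdfssink.hdfs.path              = hdfs://namenode:8020/data/logs/%Y-%m-%d
    agent1.sinks.hdfssink.hdfs.fileType          = DataStream
    agent1.sinks.hdfssink.hdfs.useLocalTimeStamp = true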
Environment: HDFS, Pig, Hive, MapReduce, Linux, HBase, Flume, Sqoop, VMware, Eclipse, Cloudera
Confidential, CA
Data Analyst
Responsibilities:
- Worked with users to identify the most appropriate source of record and to profile the data required for sales and service.
- Documented the complete process flow, describing program development, logic, testing, implementation, application integration, and coding.
- Involved in defining the business/transformation rules applied to sales and service data.
- Defined the list codes and code conversions between the source systems and the data mart.
- Worked with internal architects, assisting in the development of current- and target-state data architectures.
- Worked with project team representatives to ensure that logical and physical ER/Studio data models were developed in line with corporate standards and guidelines.
- Involved in defining the source-to-target data mappings, business rules, and business and data definitions.
- Responsible for defining the key identifiers for each mapping/interface.
- Responsible for defining the functional requirement documents for each source-to-target interface.
- Documented, clarified, and communicated change requests with the requestor and coordinated with the development and testing teams.
- Coordinated meetings with vendors to define requirements and system interaction agreement documentation between the client and vendor systems.
- Prepared data quality and traceability documents for each source interface.
- Established standard operating procedures.
- Generated weekly and monthly asset inventory reports.
- Evaluated data profiling, cleansing, integration, and extraction tools (e.g., Informatica).
- Coordinated with business users to provide an appropriate, effective, and efficient way to design new reporting based on user needs and the existing functionality.
- Remained knowledgeable in all areas of business operations in order to identify system needs and requirements.
- Implemented the metadata repository; maintained data quality, data cleanup procedures, transformations, data standards, and the data governance program; developed scripts, stored procedures, and triggers; and executed test plans.
Environment: SQL Server, Oracle 9i, MS Office, Teradata, Informatica, ER/Studio, XML, Business Objects
Confidential
Java Developer
Responsibilities:
- Used agile methodology in designing and developing the modules.
- Collected user stories to document the requirements of the product catalog, product ordering, and approval modules.
- Used the Struts Validator framework to validate user input.
- Developed an MVC-based user interface using JSP, XML, HTML, and Struts.
- Used the JSF framework to develop user interfaces with JSF UI components, validators, events, and listeners.
- Used Apache Axis to generate the order products web services module.
- Designed and implemented WSDL/SOAP web services to provide the interface to various clients running both Java and non-Java applications.
- Identified and implemented J2EE design patterns such as Service Locator, Business Delegate, and DAO.
- Used SoapUI to test the web services.
- Built the application using standard design patterns such as DAO, Abstract Factory, Session Facade, Business Delegate, and MVC.
- Used JUnit for unit testing and Log4j as the logging framework.
- Used Hibernate as the persistence mapping technology, mapping and configuring the POJO classes against database tables (an entity mapping sketch follows this list).
- Participated in and contributed to group sessions, design reviews, and code analysis.
- Used an SVN repository for version control.
- Used Eclipse IDE for development.
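A minimal sketch of the Hibernate POJO-to-table mapping mentioned above, assuming annotation-based mapping; the entity, table, and column names are illustrative.

    import javax.persistence.Column;
    import javax.persistence.Entity;
    import javax.persistence.GeneratedValue;
    import javax.persistence.GenerationType;
    import javax.persistence.Id;
    import javax.persistence.Table;

    // Illustrative POJO mapped to a catalog table via JPA/Hibernate annotations
    @Entity
    @Table(name = "PRODUCT_CATALOG")
    public class Product {

        @Id
        @GeneratedValue(strategy = GenerationType.AUTO)
        @Column(name = "PRODUCT_ID")
        private Long id;

        @Column(name = "PRODUCT_NAME", nullable = false)
        private String name;

        @Column(name = "UNIT_PRICE")
        private Double price;

        public Long getId() { return id; }
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public Double getPrice() { return price; }
        public void setPrice(Double price) { this.price = price; }
    }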
Environment: Java, J2EE, Struts, Hibernate, JSP, HTML, WebSphere, Oracle 10g, Apache Ant, Log4J, RAD, Eclipse IDE, JUnit, Subversion, Axis, WSDL, Web Services.