Sr. Big Data Infrastructure Engineer Resume
Washington, DC
SUMMARY
- 7+ years of intensive experience in development environments within cross-platform systems
- Extensive experience implementing and maintaining open-source distributed systems, Big Data, and search engine solutions
- Diverse experience includes developing reporting and procedural standards
- Experienced working with HL7 and NHIN in the healthcare industry
- Diagnosing and resolving complex technical problems in cross-platform systems
- Engineering and management, as well as performance monitoring and capacity planning, on distributed systems
- Extensive knowledge of the Software Development Life Cycle (SDLC)
- Strong technical knowledge of RDBMS, NoSQL, and object-oriented system design and implementation
TECHNICAL SKILLS
Languages: Bash/Shell Scripting, SQL, Java, JSON, CSS, XML/XSLT, XPath, XQuery
Platforms: UNIX, Linux, Windows, VMware, VirtualBox, Hyper-V
Database & Data Warehousing Tools: MS SQL, MySQL, HBase, Postgres, Redshift
Distributed Systems: Apache Hadoop ecosystem (HDFS, YARN, HBase, ZooKeeper, Pig, Hive, Sqoop, Flume, Spark, Falcon, Ambari), Apache Solr/Lucene (SolrCloud), Splunk
IT Infrastructure Frameworks & Monitoring: Cloudera Manager, AWS cloud infrastructure (EMR, EC2, S3), Lucidworks Fusion, Hortonworks HDP & HDF, Cloudera Distribution for Hadoop (CDH4, CDH3), Nagios, Kibana/Banana, Ambari
Web & Application Servers: Apache Web Server, Jetty, Tomcat, JBoss
Networking: TCP/IP, DNS, LAN/WAN, NAT, LDAP/AD
Analysis/Design Methodologies: Scrum Agile, UML, REST, J2EE
Others: MS Visio, Subversion, Selenium, Puppet, Kerberos, Knox, Ranger
PROFESSIONAL EXPERIENCE
Confidential, Washington DC
Sr. Big Data Infrastructure Engineer
Responsibilities:
- Designed, installed, and configured HDP 2.6.3 clusters, including Hadoop YARN, HDFS, ZooKeeper, Solr, Oozie, Flume, Hive, HBase, Kafka, and Spark
- Set up and configured Ambari 2.6 for provisioning and monitoring
- Set up an external SolrCloud 7.1 cluster alongside Ambari and configured backup and recovery
- Set up firewalld on Red Hat 7
- Set up Nagios to monitor 27 Hadoop clusters with over 217 hosts across virtual-machine and bare-metal Red Hat Linux 6 and 7 servers
- Set up and configured Puppet for automation
- Conducted knowledge transfer with team members
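The firewalld work above typically comes down to opening the cluster's service ports. A minimal sketch for RHEL 7, assuming the stock HDP default ports (the actual port layout would depend on the cluster):

```shell
# Open common HDP service ports with firewalld on RHEL 7.
# Ports shown are stock defaults (an assumption); adjust to the real cluster layout.
firewall-cmd --permanent --add-port=8020/tcp   # HDFS NameNode RPC
firewall-cmd --permanent --add-port=50070/tcp  # HDFS NameNode web UI (Hadoop 2.x)
firewall-cmd --permanent --add-port=8088/tcp   # YARN ResourceManager web UI
firewall-cmd --permanent --add-port=2181/tcp   # ZooKeeper client port
firewall-cmd --reload                          # apply the permanent rules
```

`--permanent` writes the rule to disk without applying it, which is why the final `--reload` is needed.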
Confidential, VA
Sr. Software Engineer
Responsibilities:
- Designed, provisioned, maintained, and upgraded the Lucidworks Fusion (SolrCloud) search application in AWS GovCloud on Red Hat Linux servers
- Set up and configured monitoring tools New Relic and Zabbix
- Provided data management and indexing solutions for governmental agencies
- Leveraged daily Scrum/Agile meetings to communicate and clarify issues and solutions
- Took the lead on migration of the application from GovCloud to a new AWS environment
- Designed, installed and configured HDP-Hadoop cluster in AWS on Linux hosts
- Leveraged Splunk for log aggregation and stack troubleshooting
Confidential, NYC
Big Data & Solr Engineer
Responsibilities:
- Designed and installed an HDP 2.2 cluster of 20 nodes in a Red Hat Linux environment
- Set up and configured HDP cluster high availability
- Set up Ambari 2.1 for provisioning, managing, and monitoring the Hadoop cluster
- Secured Hadoop clusters with Kerberos, LDAP, Ranger, and Knox
- Set up and auto-configured the Hadoop cluster in a Puppet environment
- Installed, configured, and administered the HDP stack, including the HBase, Solr, Hive, YARN, Oozie, Flume, Kafka, Spark, and HDFS services
- Set up and configured Nagios and Ambari for system health checks, metrics collection, and the alert framework
- Set up a backup system and HDFS snapshot data replication using Falcon
- Set up and configured a SolrCloud cluster with an ensemble ZooKeeper cluster and Kibana/Banana data visualization
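Standing up SolrCloud against an external ZooKeeper ensemble, as described above, can be sketched as follows; the hostnames, chroot path, and collection layout are illustrative assumptions:

```shell
# Point Solr at a 3-node ZooKeeper ensemble (hostnames are placeholders).
ZK="zk1:2181,zk2:2181,zk3:2181/solr"

# Start a Solr node in cloud mode, registering with the ensemble.
bin/solr start -cloud -z "$ZK" -p 8983

# Create a collection spread across the cluster (names/counts illustrative).
bin/solr create -c logs -shards 3 -replicationFactor 2
```

The `/solr` chroot keeps Solr's znodes separate from anything else stored in the same ZooKeeper ensemble.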
Confidential, Hamilton, MD
Software Search Engineer
Responsibilities:
- Designed and initiated a prototype for “FDA Drug Labels Indexing Solutions”
- Performed requirement gathering and data analysis
- Set up and configured Solr search engines in “Standalone” and “SolrCloud” modes
- Set up ETL tools, parsers, and transformers (DIH, Tika, XPath) to extract, transform, and load data into Solr nodes
- Defined customized Solr Schema for indexing different data formats (XML, HTML, PDF, JPEG, metadata…)
- Investigated integrating software, frameworks, and libraries to enhance system performance, including “OpenNLP” (machine-learning-based toolkit), “Logstash” (loading log files into Solr), and “Kibana/Banana” (data visualization)
- Set up and configured data visualization framework“Kibana/Banana”to visualize time-series and non-time-series data, indexed in Solr nodes
- Reconfigured and customized search engine and performed functional testing (both indexing and searching) for quality assurance
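Indexing mixed formats (PDF and similar) through Tika, as above, usually goes through Solr's extracting request handler. A minimal sketch, with the core name and file as placeholder assumptions:

```shell
# Push a PDF through Solr's extracting request handler (Tika).
# Core name "labels" and the file name are illustrative, not the original setup.
curl "http://localhost:8983/solr/labels/update/extract?literal.id=doc1&commit=true" \
  -F "myfile=@drug-label.pdf"
```

`literal.id` supplies the document's unique key, and `commit=true` makes the new document searchable immediately.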
Confidential, VA
Solr Search Engineer
Responsibilities:
- Set up and configured a SolrCloud search solution running on JBoss, comprising 3 clusters, 9 shards, and 9 replicas on Linux servers, to process TBs of data and provide near-real-time search
- Configured quorum ZooKeeper running as a central configuration for Solr clusters
- Installed and configured JBoss clusters
- Performed testing and debugging of different search components for high availability and performance enhancement
- Provided comprehensive documentation
Confidential, Washington DC
Healthcare Business Analyst & Application Engineer
Responsibilities:
- Set up and installed a Solr/Lucene search solution in a Tomcat/JBoss environment on Azure Cloud
- Provided replication configuration for Solr/Lucene indexes
- Defined data source connectivity from Solr to MS SQL & S3 on AWS
- Setup & installed SSL connectivity between Solr nodes
- Installed and configured SolrCloud cluster on Unix servers
- Performed requirements analysis
- Daily administration and management of multiple Linux servers
- Installed and configured Hadoop nodes/Cluster on Amazon Web Services
- Automated start-up script for pulling data from MySQL and ingesting into Hadoop Distributed File System (HDFS)
- Wrote shell scripts for day-to-day log-rolling processes
- Imported and exported data into HDFS and Hive using Sqoop
- Manipulated and managed data with Hive and Pig
- Managed the day-to-day operations of the cluster for backup and support
- Performed performance monitoring and capacity planning
- Performed reliability testing to reveal potential problems arising from extended runs
- Performed Solr performance testing to determine the response time of a given request
- Met expectations and general performance standards as set forth by the company
- Oversaw and directed the quality assurance review of monthly activity reports, including validation of results
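The log-rolling shell scripts mentioned above can be sketched roughly as follows; the function name, paths, and retention period are illustrative assumptions, not the original script:

```shell
#!/bin/sh
# Hypothetical log-rolling helper (names and retention window are assumptions).
rotate_log() {
  log="$1"
  keep_days="${2:-7}"
  [ -f "$log" ] || return 0                  # nothing to rotate
  stamp=$(date +%Y%m%d%H%M%S)
  mv "$log" "${log}.${stamp}"                # move the live log aside
  gzip "${log}.${stamp}"                     # compress the rotated copy
  : > "$log"                                 # start a fresh, empty log
  # prune compressed rotations older than the retention window
  find "$(dirname "$log")" -name "$(basename "$log").*.gz" \
    -mtime +"$keep_days" -delete
}
```

Run from cron once a day, this keeps the live log small and caps disk usage with the `find -mtime +N -delete` pruning step.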
