Company: Deinersoft, Inc.
Howard is a software consultant and educator who specializes in Agile processes and practices. He has a varied background spanning forty-six years in the industry, with extensive experience in commercial software, aerospace, financial services, and healthcare services. He has played many roles in software development, including developer, analyst, team lead, architect, project manager, and trusted advisor. He has applied the principles of Agile, Lean, and XP development in teams both large and small, in various environments. Howard has educated hundreds of teams and individuals, and is a long-standing member of the ACM and IEEE.
Developing for Big Data at the Intersection of Containerization and Infrastructure as Code
This presentation comes from a real-world problem I faced. Our group knew that it needed to start supporting Apache Hive on Hadoop clusters. Unfortunately, the group knew nothing about Hadoop or Apache Hive. One possibility was for people to use the AWS Console to create AWS EMR clusters (EMR is the Amazon managed cluster platform for Hadoop, MapReduce, Apache Hive, etc.). I was very much against that idea, since we would not learn anything and would be reduced to mere users of a vendor-centric solution that would increase our switching costs should we wish to change cloud providers. Instead, I showed the group how to learn by running on Docker Desktop instances orchestrated with docker-compose, and gave them Terraform scripts to create clusters in the AWS cloud, which could be easily customized to our needs using the IaC (Infrastructure as Code) concepts embodied by Terraform. This presentation will look at the issues involved and the code that does all of this. There will be full frontal code shown, but only in a respectful fashion. A GitHub repository accompanies the presentation.