Sr Splunk Infrastructure Engineer
Owings Mills, MD (2 days onsite, 3 days remote. Must be onsite day 1)
Compass Pointe has partnered with a global financial company in the Owings Mills, MD area that is looking for a Sr Splunk Infrastructure Engineer. The engineer will be part of the Reliability and Integrations Engineering team within the Technology Services Engineering group. The Reliability Ops team supports observability and developer productivity platforms. The Sr Splunk Infrastructure Engineer will be responsible for supporting Splunk Enterprise, including managing Windows and Linux servers, automating through Ansible, and coordinating the migration of on-premises hardware to AWS.
Support onboarding and maintenance of logs to Splunk from Windows, Linux, and cloud-based sources.
Support platform upgrades, including coordinating upgrade testing with users of the platform.
Automate manual platform management processes through Ansible or other scripting tools/languages.
Troubleshoot incidents impacting the Splunk platform.
Evaluate the use and integration of third-party add-ons.
Coordinate and collaborate with users of the platform.
Develop training and documentation materials.
3 years of experience managing and configuring Splunk Enterprise and/or Splunk Cloud.
Experience with Splunk clustered deployment topology.
Experience with Linux and Windows agents for Splunk administration.
Experience in designing, developing, and deploying cloud-based solutions using AWS.
Experience with onboarding new data, configuration, creating new dashboards, and extracting information through Splunk.
Experience with writing or modifying custom Splunk add-ons.
Demonstrated proficiency with scripting and automation (Bash, Python, other programming languages).
Familiarity with Splunk REST APIs.
Ability to troubleshoot and diagnose complex issues.
Demonstrated experience supporting technical users and conducting requirements analysis.
Can work independently with minimal guidance and oversight.
Experience with IT Service Management and familiarity with Incident & Problem management.
Highly skilled in identifying performance bottlenecks, identifying anomalous system behavior, and resolving the root cause of service issues.
Demonstrated ability to effectively work across teams and functions to influence design, operations, and deployment of highly available software.
Knowledge of standard methodologies related to security, performance, and disaster recovery.
Splunk Certification (Admin or Architect).
Experience with Ansible Tower automation.
Experience using GitLab.
Experience with large platform migration efforts.