Job Description

Job Title: Principal Systems Engineer, Enterprise Technology Operations

Location: Hyderabad

Department: Innovation Technology

Position Summary

The Cloud and Infrastructure Technology Team manages the design, installation, and maintenance of Berkadia’s technology infrastructure. This includes all physical and virtual servers, cloud platforms (Azure and AWS), Active Directory, Docker, Kubernetes, messaging, storage, alerting, service mesh, monitoring, and backups. This position requires a blend of on-premises and cloud knowledge.

At Berkadia, cloud computing continues to allow us to modernize and consolidate IT infrastructure, automate workloads, and pursue next-generation innovation. To continue this transformation, we’re seeking an experienced systems and/or Amazon Web Services (AWS) engineer with proficiency in the development, implementation, optimization, and maintenance of our cloud-based solutions, as well as the support and maintenance of our on-premises systems. The ideal candidate will have extensive experience implementing and using various monitoring and APM observability tools, both cloud-based and on-premises. This person will also be experienced in other cloud-based technologies, with a firm grasp of emerging technologies, platforms, and applications and the ability to customize them to help our business become more secure and efficient. Experience with on-premises and enterprise solutions, including server hardware, Windows Server, email, DNS, virtualization, and storage, is also expected. From day one, you’ll have an immediate impact on the day-to-day efficiency of our IT operations and an ongoing impact on our overall growth.

Responsibilities

Essential Duties (Primary Responsibilities) include the following.

  • Act as an SME and technical lead for our monitoring and observability practice.
  • Design and implement observability and monitoring solutions for systems and applications.
  • Transform our systems-based monitoring strategy and platform to an application-based environment focused on observability, application performance and user experience.
  • Analyze performance and observability data and logs, determine the root cause of application outages and issues, and implement needed solutions.
  • Work in collaboration with various teams (infrastructure, data, development) to understand their applications and end-user experience, and to ensure that proper monitoring and observability are in place.
  • Define SLOs (service-level objectives), SLAs (service-level agreements), OLAs (operational-level agreements), and SLIs (service-level indicators), and ensure that they are met.
  • Provide L2 support for production applications and workloads.
  • Manage cloud environments in accordance with company security guidelines.
  • Troubleshoot incidents, identify root causes, fix and document problems, and implement preventive measures.
  • Employ exceptional problem-solving skills, with the ability to see and solve issues before they affect business productivity.
  • Participate in the annual disaster recovery planning process to ensure critical systems are backed up appropriately and can recover from failures with minimal impact to the business.
  • Exercise effective judgment and follow established procedures in support of production, including participation in the on-call rotation for critical environments.
  • Assist with other IT projects as needed or directed.

Travel Expectations 

Minimal domestic and international travel required (10-20%)

Work Location

This role may be required to be onsite at a Berkadia office or designated location periodically, at the request of the manager, for meetings, training, or events.

OR

Due to the nature of the responsibilities, this role can primarily be done remotely, but is required to be onsite within a designated Berkadia office regularly (1-2 days/week).

 

Supervisory Responsibilities

This job has no supervisory responsibilities.

_________________________________________________________________________________________________

 

 

Qualifications

To perform this job successfully, an individual must be able to perform each essential duty satisfactorily. The requirements listed below are representative of the experience required.

7-10+ years of experience designing and implementing monitoring and APM observability tools, including but not limited to Datadog, SolarWinds, and New Relic

5+ years of experience developing and implementing cloud solutions on AWS and/or Azure platforms

Experience in several of the following areas: patching and lifecycle management, CI/CD toolsets and pipelines, SysOps, databases, Kubernetes (AWS EKS), Docker, and common networking protocols and services (DNS, HTTP(S), SSH, FTP, SMTP)

AWS certifications preferred

Proven ability to communicate and collaborate with multi-disciplinary teams of business analysts, developers, data engineers, data scientists, and subject matter experts

Preferred Education

Bachelor’s degree or equivalent

Preferred Previous Experience

Seven to ten years of prior experience in a similar position.

About Berkadia:

Berkadia, a joint venture of Berkshire Hathaway and Jefferies Financial Group, is an industry-leading commercial real estate company providing comprehensive capital solutions and investment sales advisory and research services for multifamily and commercial properties. Berkadia is among the largest, highest-rated, and most respected primary, master, and special servicers in the industry.

Berkadia is an equal opportunity employer and affords equal opportunity to all applicants and employees for all positions without regard to race, color, religion, gender, national origin, age, disability or any other status protected under the law.

Our people are our greatest strength and make Berkadia a great place to work, creating an environment of trust, mutual respect, innovation and collaboration. Our culture is driven by our core values: https://www.berkadia.com/about/vision-and-values.

To learn more about Berkadia, please visit our website: https://www.berkadia.com/aboutus/
