Requirements for an Automated Operations System

Given the current state of cloud computing and DevOps, I think a mature automated operations platform should have the following features:

1. A CMDB that supports hybrid cloud

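More and more servers are moving to the cloud, and the mainstream public and private cloud platforms all provide fairly complete resource management APIs. These APIs are the foundation for building an automated CMDB. A new-generation automated operations platform should be able to use them to automatically maintain and manage servers, storage, networks, load balancers, and other related resources. Every operation performed on a resource through the API should be recorded in an operation log, which becomes the base data for later audits. The CMDB may sound like a tired old topic, but it really is the foundational infrastructure for every operations tool. The biggest trouble with building an operations platform on open-source tools is unifying the CMDB across them. Without a unified CMDB, adding a single server may mean syncing it by hand into every operations tool. This matters even more after containerization, and defining a clear CMDB specification is the first step.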
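To make the idea concrete, here is a minimal sketch of a CMDB that reconciles itself against an inventory pulled from a cloud API and writes an audit entry for every change. The `Cmdb` class, its field names, and the shape of the inventory dicts are all my own assumptions for illustration; a real platform would call the provider's SDK and persist to a database.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Cmdb:
    """Hypothetical in-memory CMDB keyed by resource id."""
    resources: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    def _log(self, action, resource_id):
        # Every change is recorded so it can be audited later.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "action": action,
            "resource_id": resource_id,
        })

    def sync(self, inventory):
        """Reconcile the CMDB with a list of resource dicts from a cloud API."""
        seen = set()
        for item in inventory:
            rid = item["id"]
            seen.add(rid)
            if rid not in self.resources:
                self._log("create", rid)
            elif self.resources[rid] != item:
                self._log("update", rid)
            else:
                continue  # unchanged, nothing to record
            self.resources[rid] = dict(item)
        # Anything the cloud no longer reports is removed from the CMDB.
        for rid in list(self.resources):
            if rid not in seen:
                del self.resources[rid]
                self._log("delete", rid)
```

Calling `sync` on a schedule with the output of each provider's list API would keep one CMDB as the single source of truth that the other tools read from.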

2. Reasonably complete monitoring and application performance monitoring (APM)

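The platform should be able to monitor platform availability, server performance, and the performance of the various services (web, application, and database services). A better implementation goes further and offers deeper or correlated performance analysis. The market tends to talk about resource performance monitoring and application performance monitoring (APM) together, and many products do overlap and cover both. The mainstream open-source performance monitoring systems are Zabbix and Nagios, and among Chinese open-source monitoring platforms there is Xiaomi's Open-Falcon, but these mostly do only basic resource monitoring (servers, disks, networks, and so on) and simple performance monitoring of service software (middleware, databases, and so on). APM systems, by contrast, focus on application performance analysis: for example, pinpointing how fast or slow a particular application URL responds, how method-stack overhead changes, and how fast certain SQL statements run. This is very helpful for developers and operations staff who need to locate problems quickly. Among commercial APM tools, the mainstream ones abroad are New Relic and Dynatrace, and in China there are Toushibao (透视宝), OneAPM, and Tingyun (听云); they also provide APIs for integration. Open-source APM tools include Pinpoint (open-sourced by a Korean team), Zipkin (open-sourced by Twitter), and CAT (open-sourced by Dianping). One point worth stressing: monitoring the monitoring tools themselves is one of the hallmarks of a mature team.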
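As a rough illustration of the per-URL analysis an APM tool performs, the sketch below groups request timings by URL and reports the endpoints whose average latency exceeds a threshold. The function name and the `(url, milliseconds)` sample format are assumptions of mine, not the data model of any of the products named here.

```python
from collections import defaultdict
from statistics import mean


def slow_endpoints(samples, threshold_ms):
    """Return {url: average_ms} for URLs slower than threshold_ms, slowest first.

    `samples` is an iterable of (url, duration_ms) pairs, for example
    parsed from an access log or emitted by request middleware.
    """
    by_url = defaultdict(list)
    for url, duration_ms in samples:
        by_url[url].append(duration_ms)

    averages = {url: mean(durations) for url, durations in by_url.items()}
    slow = {url: avg for url, avg in averages.items() if avg > threshold_ms}
    # Sort so the worst offender comes first in the report.
    return dict(sorted(slow.items(), key=lambda kv: kv[1], reverse=True))
```

Real APM agents collect far more (method stacks, SQL timings, traces), but the aggregation step follows the same pattern.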

3. A batch operations tool with a passable UI

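When the business grows quickly, from a few servers to dozens and then to hundreds, the need for batch operations arises naturally, and the boss also wants fewer people to do more work. There are quite a few open-source batch operations tools now, and they are fairly mature: Puppet, Chef, Ansible (which, as I understand it, moved to an RPC-style model after 1.8), and SaltStack. Puppet and Chef are both written in Ruby, and frankly, experienced Ruby engineers are rare in the Chinese job market and considerably harder to hire than Python engineers. I personally recommend Ansible or SaltStack: both are written in Python, and their code quality and community activity are good. Ansible has an official web UI, Tower. Ansible was very popular two years ago. It mainly runs shell operations over SSH and is now being changed to make it faster, but in practice it is still not as pleasant to use as SaltStack, so we are also building our own web UI that suits us better.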
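The core of any batch operations tool is fanning one command out to many hosts and collecting per-host results. Here is a minimal sketch of that pattern using only the standard library. The `runner` callable is a stand-in I made up so the example stays self-contained; in practice it would wrap SSH, Ansible, or the SaltStack API.

```python
from concurrent.futures import ThreadPoolExecutor


def run_on_hosts(hosts, command, runner, max_workers=10):
    """Run `command` on every host in parallel and return {host: (status, output)}.

    `runner(host, command)` is a caller-supplied function that executes the
    command and returns its output, or raises on failure.
    """
    def safe(host):
        try:
            return host, ("ok", runner(host, command))
        except Exception as exc:  # one bad host must not abort the batch
            return host, ("failed", str(exc))

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(safe, hosts))


def fake_runner(host, command):
    """Hypothetical runner used only for demonstration."""
    if host == "web-2":
        raise ConnectionError("unreachable")
    return f"{host}: {command} done"
```

A web UI on top of this mostly needs to show the returned dictionary: which hosts succeeded, which failed, and why.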

4. Centralized log analysis tools

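The most routine way to locate problems in a production system is log analysis. As the number of servers grows, finding and analyzing logs becomes a difficulty and a pain point (imagine having to visit dozens or even hundreds of nodes to check logs after a failure). Log analysis is a hot area now and there are many open-source options, such as the well-known ELK Stack and the Flume + Kafka + Storm stack. These two are relatively heavy and complex to deploy. There are plenty of articles about them online, but read critically, because there are many pitfalls. A lighter-weight open-source option for centralized log collection is Sentry, written in Python. It works by hooking into the logging frameworks of various languages to collect logs centrally, and it supports the logging frameworks of all mainstream languages quite fully, for example log4j and logback for Java. Sentry: Track exceptions with modern error logging for JavaScript, Python, Ruby, Java, and Node.js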
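The basic operation behind centralized logging is merging per-node logs into one time-ordered stream. The sketch below does this with `heapq.merge`. It assumes, and these are my assumptions, that each node's log is already in time order and that every line starts with an ISO-8601 timestamp in the same timezone, so that string order equals chronological order.

```python
import heapq


def merge_node_logs(logs_by_node):
    """Merge per-node log lines into one list of (timestamp, node, message).

    `logs_by_node` maps a node name to its log lines, each formatted as
    "<ISO-8601 timestamp> <message>" and already sorted by time.
    """
    def parse(node, lines):
        for line in lines:
            timestamp, message = line.split(" ", 1)
            yield timestamp, node, message

    streams = [parse(node, lines) for node, lines in logs_by_node.items()]
    # heapq.merge lazily interleaves the already-sorted streams.
    return list(heapq.merge(*streams))
```

With the merged stream in hand, a failure on one node can be read alongside what the other nodes were doing at the same moment, which is the whole point of collecting logs in one place.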

5. Continuous integration and delivery (CI/CD) tools

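It is hard to state uniform requirements here, because companies differ a lot in how they integrate and release. For continuous integration, most people use Jenkins, and there are plenty of articles about it online. Getting the built package onto each server can then be done with the batch operations tool or with scripts. The release process involves many details, including uploading, distributing, versioning, and rolling back release files. For projects that are not too complex, my recommendation is to upload the packaged files to SVN and then run a script on each server to perform the release. This effectively uses SVN to handle upload, distribution, version management, and rollback. In real-world use, though, a Jenkins cluster + git/svn + multi-language support + self-written plugins is the answer that really shines.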
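Whatever tool stores the packages, the release script needs to know which version is live and which one to fall back to. Here is a minimal sketch of that bookkeeping. The `ReleaseHistory` class is hypothetical and only tracks version labels; fetching the files (from SVN or anywhere else) is left out.

```python
class ReleaseHistory:
    """Tracks deployed versions on one server so a release can be rolled back."""

    def __init__(self):
        self._versions = []

    @property
    def current(self):
        """The version that is live right now, or None before the first deploy."""
        return self._versions[-1] if self._versions else None

    def deploy(self, version):
        """Record `version` as the new live release."""
        self._versions.append(version)
        return version

    def rollback(self):
        """Drop the live release and return the previous one."""
        if len(self._versions) < 2:
            raise RuntimeError("no earlier version to roll back to")
        self._versions.pop()
        return self.current
```

When SVN holds the packages, the version label can simply be the revision number, which is why the SVN approach gives version management and rollback almost for free.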

6. Security vulnerability scanning tools

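This is the single most important area. Any system with even a little name recognition now suffers all kinds of security attacks. Most companies cannot afford a dedicated security engineer, so operations engineers should ideally use security scanning tools to find the vulnerabilities in their own systems. I do not know much about security tooling and am not familiar with the open-source tools in this area. WooYun (乌云网) previously launched a SaaS vulnerability scanning platform, Tangchao Xunhang (唐朝巡航), which exposed a vulnerability scanning API, but reportedly WooYun has been under upgrade for a while, so the API cannot be called for now. Personally, I think that once all the features above are in place, you still need security mechanisms built up layer by layer from the Ethernet protocol layer. Only that is controllable and rests on the most solid foundation. This is even more true in the IPv6 era, when every protocol node can become a point of failure.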

7. A real-time environment that can scale elastically

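This includes an automatic scaling cycle for services and applications, driven by monitoring data and load changes (destroy, cool down, start, join, reconfigure, relink, rb). More and more platforms now use containers and data warehouses. The workloads they handle are moving from simple interactive EJBs toward microservices and even serverless, and the business they support has shifted to OLAP. A controllable, cost-efficient, node-based, containerized environment is what everyone dreams of.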
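The heart of such a scaling loop is a small decision function: given a load metric and the time since the last change, decide how many replicas to run. Here is a minimal sketch with a cooldown. The thresholds, limits, and the choice of CPU as the metric are illustrative assumptions, not recommendations.

```python
def scaling_decision(cpu_percent, replicas, seconds_since_last_change,
                     scale_out_at=80.0, scale_in_at=30.0,
                     min_replicas=1, max_replicas=10, cooldown_seconds=300):
    """Return the desired replica count for the next cycle.

    During the cooldown window the count is left unchanged, which keeps
    the system from flapping between scale-out and scale-in.
    """
    if seconds_since_last_change < cooldown_seconds:
        return replicas
    if cpu_percent > scale_out_at and replicas < max_replicas:
        return replicas + 1
    if cpu_percent < scale_in_at and replicas > min_replicas:
        return replicas - 1
    return replicas
```

The remaining steps of the cycle (starting the node, joining it to the load balancer, updating configuration) would be carried out by the batch operations tool once this function reports a new target.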

8. Alerting and charts

9. Fast migration and fast provisioning ...

=========================================================================

DevOps (a clipped compound of “development” and “operations”) is a software engineering culture and practice that aims at unifying software development (Dev) and software operation (Ops). The main characteristic of the DevOps movement is to strongly advocate automation and monitoring at all steps of software construction, from integration, testing, releasing to deployment and infrastructure management. DevOps aims at shorter development cycles, increased deployment frequency, and more dependable releases, in close alignment with business objectives.
Venn diagram showing DevOps as the intersection of development (software engineering), operations and quality assurance (QA)
In 2009, Patrick Debois coined the term by naming a conference “devopsdays”.

The term DevOps has been used in multiple contexts.

A definition proposed by Bass, Weber, and Zhu, is:

DevOps is a set of practices intended to reduce the time between committing a change to a system and the change being placed into normal production, while ensuring high quality.

In recent years, more tangential DevOps initiatives have also evolved, such as OpsDev.

Toolchain
Illustration showing stages in a DevOps toolchain
As DevOps is intended to be a cross-functional mode of working, rather than a single DevOps tool there are sets (or “toolchains”) of multiple tools.

Code — code development and review, source code management tools, code merging
Build — continuous integration tools, build status
Test — continuous testing tools that provide feedback on business risks
Package — artifact repository, application pre-deployment staging
Release — change management, release approvals, release automation
Configure — infrastructure configuration and management, Infrastructure as Code tools
Monitor — applications performance monitoring, end–user experience
Note that there exist different interpretations of the DevOps toolchain (e.g. Plan, Create, Verify, Package, Release, Configure, and Monitor).

Some categories are more essential in a DevOps toolchain than others; especially continuous integration (e.g. Jenkins) and infrastructure as code (e.g. Puppet).

Relationship to other approaches
Agile
The need for DevOps arose from the increasing success of agile software development, as that led to organizations wanting to release their software faster and more frequently. As they sought to overcome the strain this put on their release management processes, they had to adopt patterns such as application release automation, continuous integration tools, and continuous delivery.

Continuous delivery
Continuous delivery and DevOps have common goals and are often used in conjunction, but there are subtle differences.

While continuous delivery is focused on automating the processes in software delivery, DevOps also focuses on the organizational change needed to support greater collaboration between the many functions involved.

DevOps and continuous delivery share a common background in agile methods and lean thinking: small and frequent changes with focused value to the end customer.

DataOps
The application of continuous delivery and DevOps to data analytics has been termed DataOps. DataOps seeks to integrate data engineering, data integration, data quality, data security, and data privacy with operations.

Site reliability engineering
In 2003, Google developed site reliability engineering, a new approach for releasing new features continuously into large-scale high-availability systems while maintaining high-quality end user experience.

Systems administration

DevOps is often viewed as an approach to applying systems administration work to cloud technology.

Goals
The goals of DevOps span the entire delivery pipeline. They include:

Improved deployment frequency;
Faster time to market;
Lower failure rate of new releases;
Shortened lead time between fixes;
Faster mean time to recovery (in the event of a new release crashing or otherwise disabling the current system).
Simple processes become increasingly programmable and dynamic, using a DevOps approach. DevOps aims to maximize the predictability, efficiency, security, and maintainability of operational processes. Very often, automation supports this objective.
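Several of the goals listed above can be measured directly from deployment records. The following is a minimal sketch, assuming each record is a dict with hypothetical `deployed_at`, `failed`, and `recovered_at` fields; real pipelines would pull these from the CI/CD system and the incident tracker.

```python
from datetime import datetime, timedelta


def delivery_metrics(deployments, period_days):
    """Compute deployment frequency, failure rate, and mean time to recovery.

    Each deployment is a dict with `deployed_at` (datetime), `failed` (bool),
    and, for failed deployments, `recovered_at` (datetime).
    """
    total = len(deployments)
    failures = [d for d in deployments if d["failed"]]
    recovery_times = [d["recovered_at"] - d["deployed_at"] for d in failures]

    return {
        "deployments_per_day": total / period_days,
        "failure_rate": len(failures) / total if total else 0.0,
        "mean_time_to_recovery": (
            sum(recovery_times, timedelta()) / len(recovery_times)
            if recovery_times else None
        ),
    }
```

Tracking these numbers over time is one way to check whether a DevOps initiative is actually moving toward the stated goals.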

DevOps integration targets product delivery, continuous testing, quality testing, feature development, and maintenance releases in order to improve reliability and security and provide faster development and deployment cycles. Many of the ideas (and people) involved in DevOps came from the enterprise systems management and agile software development movements.

Views on the benefits claimed for DevOps
Companies that practice DevOps have reported significant benefits, including: significantly shorter time to market, improved customer satisfaction, better product quality, more reliable releases, improved productivity and efficiency, and the increased ability to build the right product by fast experimentation.

However, a study released in January 2017 by F5 of almost 2,200 IT executives and industry professionals found that only one in five surveyed think DevOps had a strategic impact on their organization despite rise in usage. The same study found that only 17% identified DevOps as key, well below software as a service (42%), big data (41%) and public cloud infrastructure as a service (39%).

Cultural change
DevOps initiatives can create cultural change in companies.

DevOps as a job title
While DevOps describes an approach to work rather than a distinct role (like system administrator), job advertisements are increasingly using terms like “DevOps Engineer”.

While DevOps reflects complex topics, the DevOps community uses analogies to communicate important concepts, much like “The Cathedral and the Bazaar” from the open source community.

Cattle not Pets: the paradigm of disposable server infrastructure.
10 deployments per day: the story of Flickr adopting DevOps.
Building a DevOps culture

DevOps principles demand strong interdepartmental communication—team-building and other employee engagement activities are often used—to create an environment that fosters this communication and cultural change, within an organization.

Deployment
Companies with very frequent releases may require a DevOps awareness or orientation program. For example, the company that operates the image hosting website Flickr developed a DevOps approach to support a business requirement of ten deployments per day.

Architecturally significant requirements
To practice DevOps effectively, software applications have to meet a set of architecturally significant requirements (ASRs), such as: deployability, modifiability, testability, and monitorability. These ASRs require a high priority and cannot be traded off lightly.

Although in principle it is possible to practice DevOps with any architectural style, the microservices architectural style is becoming the standard for building continuously deployed systems.

Scope of adoption
Some articles in the DevOps literature assume, or recommend, significant participation in DevOps initiatives from outside an organization’s IT department, e.g.: “DevOps is just the agile principle, taken to the full enterprise.”

According to a survey published in January 2016 by the SaaS cloud-computing company RightScale, DevOps adoption increased from 66 percent in 2015 to 74 percent in 2016. Among larger enterprise organizations, DevOps adoption is even higher, at 81 percent.

Adoption of DevOps is being driven by many factors — including:

Use of agile and other development processes and methods;
Demand for an increased rate of production releases — from application and business unit stakeholders;
Wide availability of virtualized and cloud infrastructure — from internal and external providers;
Increased usage of data center automation and configuration management tools;
Increased focus on test automation and continuous integration methods;
A critical mass of publicly–available best practices.
