V0.3 #95

Open. Wants to merge 2 commits into base: master.

2 changes: 2 additions & 0 deletions zh_0_3/GLOSSARY.md
@@ -0,0 +1,2 @@
<!-- toc -->

29 changes: 29 additions & 0 deletions zh_0_3/README.md
@@ -0,0 +1,29 @@
<!-- toc -->

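[OpenFalcon](http://open-falcon.com) is an enterprise-grade, highly available, and scalable open-source monitoring solution.

With the community's enthusiastic support and help, OpenFalcon has become one of the most popular monitoring systems in China.

Currently:
- It has earned several thousand stars, several hundred forks, and over a hundred pull requests on [github](https://github.com/open-falcon/falcon-plus);
- The community has 6000+ users;
- More than 200 companies use open-falcon to varying degrees, in mainland China, Singapore, Taiwan, and elsewhere;
- The community has contributed plugin support for MySQL, redis, windows, switches, LVS, Mongodb, Memcache, docker, mesos, URL monitoring, and more;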

-----
**Acknowledgements**

- OpenFalcon was initially started by Xiaomi and we would also like to acknowledge contributions by engineers from [these companies](./contributing.html) and [these individual developers](./contributing.html).
- The OpenFalcon logo and website were contributed by Cepave Design Team.
- [Wei Lai](https://github.com/laiwei) is the founder of OpenFalcon software and community.
- The [English doc](http://book.open-falcon.com/en/index.html) was translated by [宋立岭](https://github.com/songliling).

-----
- QQ group 5: 42607978 (please join this group). Please file an [issue](https://github.com/open-falcon/falcon-plus/issues) on github first so that questions are kept on record; github issues are handled with the highest priority.
- QQ group 4: 697503992 (full)
- QQ group 1: 373249123 (full)
- QQ group 2: 516088946 (full)
- QQ group 3: 469342415 (full)


<img src="image/OpenFalcon_wechat.jpg" width = "180" height = "180" alt="WeChat official account" align=center />
87 changes: 87 additions & 0 deletions zh_0_3/SUMMARY.md
@@ -0,0 +1,87 @@
# SUMMARY

### Part Ⅰ: Community
* [Community introduction](README.md)
* [Contributors](contributing.md)
* [Project introduction](intro/README.md)

### Part Ⅱ: Installation
* [Single-host installation](quick_install/README.md)
* [Environment preparation](quick_install/prepare.md)
* [Starting the backend](quick_install/backend.md)
* [Installing the frontend](quick_install/frontend.md)
* [Smooth upgrade from v0.1 to v0.2](quick_install/upgrade.md)
* [Distributed installation](distributed_install/README.md)
* [Environment preparation](distributed_install/prepare.md)
* [Agent](distributed_install/agent.md)
* [Transfer](distributed_install/transfer.md)
* [Graph](distributed_install/graph.md)
* [API](distributed_install/api.md)
* [DashBoard](quick_install/frontend.md)
* [Mail/SMS/WeChat sending interface](distributed_install/mail-sms.md)
* [Heartbeat Server](distributed_install/hbs.md)
* [Judge](distributed_install/judge.md)
* [Alarm](distributed_install/alarm.md)
* [Task](distributed_install/task.md)
* [Gateway](distributed_install/gateway.md)
* [Nodata](distributed_install/nodata.md)
* [Aggregator](distributed_install/aggregator.md)
* [Agent-updater](distributed_install/agent-updater.md)

### Part Ⅲ: Manual
* [User manual](usage/README.md)
* [Getting started](usage/getting-started.md)
* [Nodata configuration](usage/nodata.md)
* [Cluster monitoring](usage/aggregator.md)
* [Alarm trigger functions](usage/func.md)
* [Pushing custom data](usage/data-push.md)
* [Querying historical data](usage/query.md)
* [Process and port monitoring](usage/proc-port-monitor.md)
* [MySQL monitoring](usage/mymon.md)
* [Redis monitoring](usage/redis.md)
* [MongoDB monitoring](usage/MongoDB.md)
* [Memcache monitoring](usage/memcache.md)
* [RabbitMQ monitoring](usage/rabbitmq.md)
* [Solr monitoring](usage/solr.md)
* [Switch monitoring](usage/switch.md)
* [ESXi monitoring](usage/esxi.md)
* [Windows host monitoring](usage/win.md)
* [HAProxy monitoring](usage/haproxy.md)
* [docker container monitoring](usage/docker.md)
* [Nginx monitoring](usage/ngx_metric.md)
* [JMX monitoring](usage/jmx.md)
* [Hardware monitoring](usage/hwcheck.md)
* [LVS monitoring](usage/lvs.md)
* [URL monitoring](usage/urlooker.md)
* [mesos](usage/mesos.md)
* [vSphere monitoring](usage/vsphere.md)
* [Flume monitoring](usage/flume.md)
* [Directory and process resource monitoring](usage/du-proc.md)
* [Fault self-healing](usage/fault-recovery.md)
* [vSphere and ESXi monitoring](usage/vsphere-esxi.md)

### Part Ⅳ: Philosophy
* [Design philosophy](philosophy/README.md)
* [Data model](philosophy/data-model.md)
* [On data collection](philosophy/data-collect.md)
* [Plugin mechanism](philosophy/plugin.md)
* [Tags and HostGroup](philosophy/tags-and-hostgroup.md)
* [Custom development](dev/README.md)
* [Community contributions](dev/community_resource.md)
* [Changing graph curve precision](dev/change_graph_rra.md)
* [Changing the NIC traffic unit](dev/change_net_unit.md)
* [Grafana view support](dev/support_grafana.md)
* [API](api/README.md)
* [Practical experience](practice/README.md)
* [Deployment](practice/deploy.md)
* [Self-monitoring](practice/monitor.md)
* [Notes on scaling Graph](practice/graph-scaling.md)

### Part Ⅴ: FAQ
* [FAQ](faq/README.md)
* [Collection](faq/collect.md)
* [Alarms](faq/alarm.md)
* [Graphing](faq/graph.md)
* [Common Linux monitoring metrics](faq/linux-metrics.md)
* [QQ group Q&A digest](faq/qq.md)
* [Changelog](changelog/README.md)
2 changes: 2 additions & 0 deletions zh_0_3/api/README.md
@@ -0,0 +1,2 @@
# open-falcon api
- [api v0.2](http://open-falcon.com/falcon-plus/)
1 change: 1 addition & 0 deletions zh_0_3/authors.md
@@ -0,0 +1 @@
../zh/authors.md
117 changes: 117 additions & 0 deletions zh_0_3/changelog/README.md
@@ -0,0 +1,117 @@
<!-- toc -->

## [v0.2.0] 2017-06-17

> https://github.com/open-falcon/falcon-plus/releases/tag/v0.2.0

> http://www.jianshu.com/p/6fb2c2b4d030

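> **A brand-new frontend**

- All Open-Falcon frontend components, including dashboard, screen, portal, alarm-dashboard, UIC, fe, and links, have been consolidated into the [dashboard](https://github.com/open-falcon/dashboard) component;
- Dashboard adds site-wide access control;
- Dashboard can now delete a specified endpoint or counter together with the corresponding rrd files;
- The Dashboard home page shows the endpoint list by default, and both the endpoint list and the counter list support pagination;
- Dashboard can now delete a top-level screen;
- Alarm callback parameters and content can be displayed on the Dashboard page;
- WeChat is supported as an alarm channel;
- Dashboard can display past alarm history;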

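> **A unified backend**

- alarm supports storing alarm history in the database and displaying it;
- The functionality of the "alarm merging" module `links` has been merged into the unified Dashboard frontend, reducing configuration and maintenance costs for users;
- The functionality of the "alarm sending" module `sender` has been merged into alarm, reducing configuration and maintenance costs for users;
- The functionality of query has been merged into the falcon-api component;
- Storage of non-periodically reported data is supported;
- agent supports a custom configuration that collects disk metrics only for specified mount points;
- agent supports configuring a default tag, which is automatically appended to all data reported through that agent;
- judge adds the alarm function `lookup(#num, limit)`, which raises an alarm if the condition was met limit times within the past num periods;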
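
The semantics of `lookup(#num, limit)` can be illustrated with a small sketch. This is not judge's actual implementation: the function signature, the oldest-to-newest ordering of `history`, and the reading of "limit times" as "at least limit times" are all assumptions made for the example.

```go
package main

import "fmt"

// lookup reports whether at least limit of the most recent num points
// satisfy cond. history is assumed to be ordered oldest to newest.
// Hypothetical helper, not the judge source code.
func lookup(history []float64, num, limit int, cond func(float64) bool) bool {
	if num < 0 {
		num = 0
	}
	if num > len(history) {
		num = len(history)
	}
	hits := 0
	for _, v := range history[len(history)-num:] {
		if cond(v) {
			hits++
		}
	}
	return hits >= limit
}

func main() {
	cpuBusy := []float64{10, 95, 20, 97, 30}
	over90 := func(v float64) bool { return v > 90 }

	// Last 3 points are 20, 97, 30: only one is above 90.
	fmt.Println(lookup(cpuBusy, 3, 2, over90)) // false
	// Last 5 points contain 95 and 97: two are above 90.
	fmt.Println(lookup(cpuBusy, 5, 2, over90)) // true
}
```

Compared with requiring every point in the window to match, this kind of check tolerates intermittent spikes while still firing when they recur often enough.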

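> **Long-awaited bugfixes**

- Fixed a bug where grafana did not support metrics containing uppercase letters;
- Fixed a bug where high availability did not take effect when the agent wrote to multiple transfers;
- Fixed the unreasonable timeout setting for the agent sending data to transfer;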

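> **A brand-new [RESTful API](http://open-falcon.com/falcon-plus)**: no operation in open-falcon is hard to automate

- Released the newly designed falcon-api component; every falcon-plus feature can be driven through the RESTful API;
- The vast majority of the unified Dashboard frontend's features are implemented through the [falcon-plus](https://github.com/open-falcon/falcon-plus) [api](http://open-falcon.com/falcon-plus);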


## [0.1.0] 2016-03-08

> https://github.com/open-falcon/of-release/releases/tag/v0.1.0

> http://www.jianshu.com/p/7751eb324a51

### highlights
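- `docs` API cleanup and documentation improvements http://docs.openfalcon.apiary.io `OpenFalcon-team @hitripod`
- `enhancement` Historical data is migrated automatically when the graph cluster is scaled out `OpenFalcon-team @yubo laiwei niean`
- `enhancement` The minimum data reporting interval can be changed in the configuration file `OpenFalcon-team @niean`
- `new feature` Monitoring data can be written to opentsdb `Meituan OpenFalcon-team @Charlesdong`
- `new feature` Grafana support `快网 OpenFalcon-team @hitripod`
- `new feature` Cluster monitoring `OpenFalcon-team @ulricqin`
- `new feature` Nodata monitoring `OpenFalcon-team @niean`
- `new feature` agent has built-in URL monitoring `@onlymellb`
- `enhancement` agent supports load balancing across multiple transfers `@cmgs`
- `enhancement` When a machine is added to a HostGroup and its hostname does not exist, it is inserted directly into the host table `@wkshare`
- `enhancement` dashboard chart lines use a darker color scheme `Meituan OpenFalcon-team @skyeydemon`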

### changelog
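- [agent] enhancement: agent supports configuring multiple transfer addresses @CMGS https://github.com/open-falcon/agent/pull/37
- [agent] enhancement: agent supports URL probing @onlymellb https://github.com/open-falcon/agent/pull/27
- [alarm] bugfix: fixed a compile error caused by an incompatible beego API @ulricqin
- [common] new feature: added tsdb support @Charlesdong https://github.com/open-falcon/common/pull/2
- [dashboard] enhancement: all checkboxes support shift-click multi-select @skyeydemon https://github.com/open-falcon/dashboard/pull/14
- [dashboard] enhancement: chart lines use a darker color scheme @skydemon https://github.com/open-falcon/dashboard/pull/13
- [dashboard] enhancement: the date picker highlights the current time to make selection easier @iambocai https://github.com/open-falcon/dashboard/pull/12
- [dashboard] bugfix: fixed charts occasionally failing to render when the charts page is refreshed @niean
- [fe] bugfix: fixed a compile error caused by an incompatible beego API @hitripod
- [gateway] enhancement: gateway supports load balancing across a configured list of transfers @niean
- [gateway] enhancement: gateway introduces perfcounter to collect its own stability metrics @niean
- [graph] the minimum data reporting interval can be changed in the configuration file @niean
- [graph] historical data is migrated automatically when the graph cluster is scaled out @yubo laiwei niean https://github.com/open-falcon/graph/pull/14
- [hbs] new feature: agents can be configured for URL monitoring [url.check.health] @onlymellb https://github.com/open-falcon/hbs/pull/4
- [nodata] new module: when a metric has not reported data before the timeout, a default data point is generated if a nodata policy is configured @niean
- [portal] API: added APIs to fetch expression and strategy details @modeyang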
- [portal] enhancement: when a machine is added to a HostGroup and its hostname does not exist, it is inserted directly into the host table @wkshare https://github.com/open-falcon/portal/pull/4
- [portal] new feature: cluster aggregation monitoring @ulricqin
- [portal] new feature: nodata support @niean
- [query] added an API to support display in Grafana @hitripod https://github.com/open-falcon/query/pull/5
- [transfer] data can be written to opentsdb @Charlesdong https://github.com/open-falcon/transfer/pull/5
- [transfer] the minimum data reporting period is configurable @niean https://github.com/open-falcon/transfer/pull/7
- [transfer] the migrating feature was removed from transfer @laiwei https://github.com/open-falcon/transfer/pull/8
- [aggregator] new module: cluster aggregation monitoring @ulricqin

----
## [0.0.5] 2015-08-20
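- [agent] new feature: added udp and du collection items
- [agent] bugfix: fixed a bug where the agent had to be restarted after a new plugin was configured
- [agent] bugfix: fixed an issue where a hostname change might not take effect after reloading the configuration file @oiooj
- [alarm] bugfix: fixed line breaks in alarm emails
- [alarm] bugfix: fixed incorrect handling of an empty alarm level configuration
- [alarm] enhancement: changed the default http port to 9912 (previously 6060)
- [transfer] new feature: added dual-write support, so transfer can send the same data to multiple backend graph or judge instances for disaster recovery
- [transfer] enhancement: data sent to judge is aligned and normalized by timestamp (consistent with data sent to graph)
- [transfer] bugfix: fixed the wrong latency unit in the result transfer returns to clients
- [graph] new feature: added the last API, which returns the latest point of a specified counter
- [query] new feature: added the last API, which returns the latest point of a specified counter
- [hbs] bugfix: together with agent, fixed the bug where the agent had to be restarted after a new plugin was configured
- [plugin] new feature: added the plugin project, which contains some commonly used plugin scripts
- [gateway] new component that solves the problem of sending monitoring data back across a network partition. For the code and a description of its features, see [here](https://github.com/open-falcon/gateway); the Gitbook does not describe this component.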


## [0.0.4] 2015-06-09
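- [alarm] bugfix: fixed line breaks in alarm emails
- [transfer] bugfix: fixed transfer's sending capacity degrading when one of the backend graph instances goes down
- [graph] bugfix: the program now exits when the directory for rrd data files does not exist or lacks read/write permission
- [fe] added a Golang version of the uic component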


## [0.0.3] 2015-06-02
- [dashboard] bugfix: search counters by tags in screen
- [judge] enhancement: clean stale data in memory


1 change: 1 addition & 0 deletions zh_0_3/contributing.md
@@ -0,0 +1 @@
../zh/contributing.md
23 changes: 23 additions & 0 deletions zh_0_3/dev/README.md
@@ -0,0 +1,23 @@
<!-- toc -->

# Environment preparation

See [Environment preparation](quick_install/prepare.md).
# Customizing the archive policy
Modify open-falcon/graph/rrdtool/rrdtool.go

![](https://raw.githubusercontent.com/open-falcon/doc/master/img/custom-rra-1.png)
![](https://raw.githubusercontent.com/open-falcon/doc/master/img/custom-rra-2.png)

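Recompile the graph component and replace the existing binary.

Remove all existing rrd files (by default under /home/work/data/6070/).

# Plugin mechanism
1. Set up a git repo to hold all of your company's plugins
2. Call the agent's /plugin/update endpoint to pull the plugin repo to the local machine
3. In portal, configure which machines may run which plugins
4. Plugin naming convention: $step_xx.yy. Plugins need execute permission and are stored by category in the repo's directories
5. Print the collected data to stdout
6. If the git approach is inconvenient, you can modify the agent to periodically download a packaged plugin.tar.gz from an http address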
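
As a rough illustration of steps 4 and 5, the sketch below parses the step from a file name that follows the `$step_xx.yy` convention and prints one data point to stdout. The helper `pluginStep`, the file name `60_demo.go`, the metric name, and the JSON field names are assumptions made for this example; this page does not specify the output format, so check the plugin documentation before relying on it.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
	"time"
)

// pluginStep extracts the step (in seconds) from a plugin path whose file
// name follows the $step_xx.yy convention, e.g. "60_ntp.py" yields 60.
// Hypothetical helper, not part of the agent.
func pluginStep(path string) (int, error) {
	name := filepath.Base(path)
	parts := strings.SplitN(name, "_", 2)
	if len(parts) != 2 {
		return 0, fmt.Errorf("plugin name %q does not match $step_xx.yy", name)
	}
	step, err := strconv.Atoi(parts[0])
	if err != nil || step <= 0 {
		return 0, fmt.Errorf("plugin name %q has an invalid step prefix", name)
	}
	return step, nil
}

// metric is an assumed output record; the field names are illustrative.
type metric struct {
	Endpoint    string  `json:"endpoint"`
	Metric      string  `json:"metric"`
	Timestamp   int64   `json:"timestamp"`
	Step        int     `json:"step"`
	Value       float64 `json:"value"`
	CounterType string  `json:"counterType"`
	Tags        string  `json:"tags"`
}

func main() {
	step, err := pluginStep("60_demo.go") // hypothetical plugin file name
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	host, _ := os.Hostname()
	points := []metric{{
		Endpoint:    host,
		Metric:      "demo.value", // hypothetical metric name
		Timestamp:   time.Now().Unix(),
		Step:        step,
		Value:       1,
		CounterType: "GAUGE",
	}}
	// Step 5: print the collected data to stdout.
	if err := json.NewEncoder(os.Stdout).Encode(points); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Encoding the step in the file name means the scheduler can learn the run interval without executing the plugin first.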

87 changes: 87 additions & 0 deletions zh_0_3/dev/change_graph_rra.md
@@ -0,0 +1,87 @@
<!-- toc -->

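## Changing graph curve precision

By default, Open-Falcon keeps only the most recent 12 hours of raw monitoring data; data older than 12 hours is downsampled and stored at reduced precision.

If the default precision does not meet your needs, you can change the storage precision of the graph curves with the following steps.

#### Step 1: modify the graph component's RRAs and recompile it
The graph component's RRAs are defined in graph/rrdtool/[rrdtool.go](https://github.com/open-falcon/graph/blob/master/rrdtool/rrdtool.go#L57). The defaults are: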

```golang
// RRA.Point.Size
const (
	RRA1PointCnt = 720 // one point per 1m, kept for 12h
	RRA5PointCnt = 576 // one point per 5m, kept for 2d
	// ...
)

func create(filename string, item *cmodel.GraphItem) error {
	now := time.Now()
	start := now.Add(time.Duration(-24) * time.Hour)
	step := uint(item.Step)

	c := rrdlite.NewCreator(filename, start, step)
	c.DS("metric", item.DsType, item.Heartbeat, item.Min, item.Max)

	// Set up the archive policies
	// one point per 1 minute, kept for 12 hours
	c.RRA("AVERAGE", 0.5, 1, RRA1PointCnt)

	// one point per 5m, kept for 2d
	c.RRA("AVERAGE", 0.5, 5, RRA5PointCnt)
	c.RRA("MAX", 0.5, 5, RRA5PointCnt)
	c.RRA("MIN", 0.5, 5, RRA5PointCnt)

	// ...

	return c.Create(true)
}

```

For example, if you only want to keep 90 days of raw data, you can change the code to:

```golang
// RRA.Point.Size
const (
	RRA1PointCnt = 129600 // one point per 1m, kept for 90d: 90*24*3600/60
)

func create(filename string, item *cmodel.GraphItem) error {
	now := time.Now()
	start := now.Add(time.Duration(-24) * time.Hour)
	step := uint(item.Step)

	c := rrdlite.NewCreator(filename, start, step)
	c.DS("metric", item.DsType, item.Heartbeat, item.Min, item.Max)

	// Set up the archive policies
	// one point per 1 minute, kept for 90d
	c.RRA("AVERAGE", 0.5, 1, RRA1PointCnt)

	return c.Create(true)
}
```
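
The point counts in the snippets above follow from simple arithmetic: retention divided by the time span covered by one stored point. A minimal sketch, assuming a 60-second reporting step (which is what the "1m" and "5m" comments imply); `rraPointCount` is a hypothetical helper, not part of graph:

```go
package main

import "fmt"

// rraPointCount returns how many points an RRA needs in order to cover
// retentionSeconds when each stored point consolidates stepsPerPoint
// primary steps of stepSeconds each.
func rraPointCount(retentionSeconds, stepSeconds, stepsPerPoint int) int {
	return retentionSeconds / (stepSeconds * stepsPerPoint)
}

func main() {
	const day = 24 * 3600

	fmt.Println(rraPointCount(12*3600, 60, 1)) // 720: 1m points for 12h
	fmt.Println(rraPointCount(2*day, 60, 5))   // 576: 5m points for 2d
	fmt.Println(rraPointCount(90*day, 60, 1))  // 129600: 1m points for 90d
}
```

If your metrics report at a different step, the same point count covers a proportionally different time span, so recompute the constants for your own step.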

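#### Step 2: clear graph's historical data
Clear the historical data of every metric that has already been reported, that is, delete all rrd files. If the historical data is not deleted, the precision change will not take effect for metrics that have already been reported.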
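
One possible way to do this cleanup is sketched below. It assumes the rrd files carry a `.rrd` extension and live under `/home/work/data/6070` (adjust both to match your graph configuration); the `-root` and `-delete` flags and the `removeRRDFiles` helper are made up for this example. It only lists files unless `-delete` is passed, and graph should be stopped before anything is removed.

```go
package main

import (
	"flag"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// removeRRDFiles walks root and collects every regular entry whose name
// ends in ".rrd". When dryRun is false it also deletes each file found.
func removeRRDFiles(root string, dryRun bool) ([]string, error) {
	var found []string
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err // e.g. root does not exist or is unreadable
		}
		if d.IsDir() || !strings.HasSuffix(d.Name(), ".rrd") {
			return nil
		}
		found = append(found, path)
		if dryRun {
			return nil
		}
		return os.Remove(path)
	})
	return found, err
}

func main() {
	root := flag.String("root", "/home/work/data/6070", "rrd storage directory (assumed default)")
	doDelete := flag.Bool("delete", false, "delete the files instead of only listing them")
	flag.Parse()

	files, err := removeRRDFiles(*root, !*doDelete)
	if err != nil {
		fmt.Fprintln(os.Stderr, "walk failed:", err)
		os.Exit(1)
	}
	for _, f := range files {
		fmt.Println(f)
	}
	fmt.Printf("%d rrd files, deleted=%v\n", len(files), *doDelete)
}
```

Running it once without `-delete` gives a chance to confirm the file list before any data is lost.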

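#### Step 3: redeploy the graph service
Compile the modified graph source code, stop the old graph service, and deploy the modified graph.

Only the graph component needs to change; no other Open-Falcon component has to be modified for the new precision to take effect. You can view curves at the new precision in Dashboard and Screen.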



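### Notes:

When changing the precision of monitoring graph curves, you need to:

+ Modify the graph source code to update the RRAs
+ Clear graph's historical data. If the historical data is not deleted, the precision change will not take effect for metrics that have already been reported
+ Leave every Open-Falcon component other than graph unmodified
+ Be aware that after the RRAs are changed, a curve may contain so many points that the browser hangs. Plan the number of points stored in each RRA sensibly, or adjust the time range selected when querying graph curves.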
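
To get a feel for the last point, the sketch below estimates how many points a query returns and how long a time range fits within a given point budget. The budget of 2000 points per curve is an arbitrary figure chosen for illustration, not a limit documented by Open-Falcon, and both helpers are hypothetical.

```go
package main

import "fmt"

// pointsInWindow returns how many points a query over windowSeconds
// yields when one point is stored every resolutionSeconds.
func pointsInWindow(windowSeconds, resolutionSeconds int) int {
	return windowSeconds / resolutionSeconds
}

// maxWindowSeconds returns the longest query window that stays within
// pointBudget points at the given resolution.
func maxWindowSeconds(resolutionSeconds, pointBudget int) int {
	return resolutionSeconds * pointBudget
}

func main() {
	const day = 24 * 3600

	// A 90-day query against 1-minute data returns every stored point.
	fmt.Println(pointsInWindow(90*day, 60)) // 129600

	// With an assumed budget of 2000 points per curve, a 1-minute
	// resolution supports a window of 120000 seconds (about 33 hours).
	fmt.Println(maxWindowSeconds(60, 2000)) // 120000
}
```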

