Cacti and 1-minute polling

You should really, really get your Cacti RRA settings right before you begin using it. Cacti defaults to polling every 5 minutes, but a lot of enterprise users change this to 1 minute in order to provide higher resolution for troubleshooting problems. Unfortunately, there is a lot of incomplete information on how to do this.

The first link is mostly right, but forgets about adjusting RRA steps. Twopacket’s guide is very nearly correct, but warns against changing cron’s poller interval, which is exactly what you should do.

This is another post in my recent series on Cacti. In fact, I’ve written quite a lot about Cacti.

The Four Changes

There are four things you must do to configure Cacti for 1-minute polling. You should do this before gathering any data.

  1. Create 1-minute RRA settings.
  2. Adjust “Step” and “Heartbeat” on all 1-minute Data Source templates.
  3. Change the poller frequency in cron.
  4. Change the poller frequency in Cacti settings.

I’m going to focus on the RRA settings first, because this is the one thing you must get right from the start. Once you start collecting data with bad RRA settings, it is extremely difficult to correct it. And by “extremely difficult”, I mean “just throw it away and start over”.

Cacti default 5-minute RRA settings

The default Cacti RRAs expect polling every 5 minutes, and use steps of 1, 6, 24, 288. Multiply by 5 minutes, and that gives RRAs with resolution of 5 minutes, 30 minutes, 2 hours, and 1 day. You can see this in Cacti, under “Console” -> “Management” heading -> “Data Sources” -> RRAs.

[Figure: Cacti RRA default settings for 5-minute polling]

1-minute Polling: The Wrong Way

It’s common to see 1-minute polling implemented by adding a 1-minute RRA like so:

[Figure: Bad Cacti RRA settings for 1-minute polling]

What’s the problem? When combined with a 60-second step size (see below), this defines a 1-minute RRA. But it doesn’t define the 5-minute, 30-minute, 2-hour, or 1-day RRAs. Instead it defines TWO 1-minute RRAs (the new one and the original 1-step RRA), along with RRAs for 6 minutes, 24 minutes, and 288 minutes (just under 5 hours). Remember, these are calculated as “(step size in data source) * (steps in RRA definition)”.
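To make that multiplication concrete, here is a minimal shell sketch. The function name `rra_resolution_seconds` is mine, not part of Cacti; it only reproduces the arithmetic described above for the stock steps at a 300-second and a 60-second step size.

```shell
# Resolution (seconds) = data source step (seconds) * steps in the RRA definition.
# rra_resolution_seconds is a hypothetical helper name used only for this illustration.
rra_resolution_seconds() {
    # $1 = data source step in seconds, $2 = steps in the RRA definition
    echo $(( $1 * $2 ))
}

# Compare the default 5-minute step with a 1-minute step, using the stock RRA steps.
for step in 300 60; do
    for rra_steps in 1 6 24 288; do
        secs=$(rra_resolution_seconds "$step" "$rra_steps")
        echo "step=${step}s x ${rra_steps} steps = $((secs / 60)) minutes"
    done
done
```

With a 60-second step the stock steps come out at 1, 6, 24, and 288 minutes, which is why adding another 1-step RRA on top of them only duplicates the 1-minute archive.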

1-minute Polling: The Right Way

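Here’s the correct way. Use RRA steps of 1, 5, 30, 120, and 1440. With a 60-second step size, these give resolutions of 1 minute, 5 minutes, 30 minutes, 2 hours, and 1 day.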

[Figure: Good Cacti RRA settings for 1-minute polling]
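Going the other way, the steps can be derived from the resolution you want. This is a small sketch that assumes every target resolution is a whole multiple of the data source step; `rra_steps_for` is a hypothetical helper name, not something Cacti provides.

```shell
# Steps for an RRA = desired resolution (seconds) / data source step (seconds).
# rra_steps_for is a hypothetical helper; it rejects resolutions that do not divide evenly.
rra_steps_for() {
    # $1 = desired resolution in seconds, $2 = data source step in seconds
    if [ $(( $1 % $2 )) -ne 0 ]; then
        echo "resolution $1 is not a multiple of step $2" >&2
        return 1
    fi
    echo $(( $1 / $2 ))
}

# Target resolutions: 1 minute, 5 minutes, 30 minutes, 2 hours, 1 day.
for resolution in 60 300 1800 7200 86400; do
    echo "${resolution}s at a 60s step -> $(rra_steps_for "$resolution" 60) steps"
done
```

Running the same helper with a 300-second step for the 5-minute, 30-minute, 2-hour, and 1-day targets gives back the default 1, 6, 24, and 288.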

1- and 5-minute Polling

This will give you correct 1-minute polling data while keeping the consolidated RRAs at the expected frequencies. But what if you want some graphs to keep 1-minute resolution (network statistics, CPU load) and others at 5-minute resolution (filesystem space)? To do that you must define two sets of RRAs. Leave the default RRAs alone or only rename them – their IDs are hard-coded into certain places in Cacti. Create new ones for 1-minute polling like so.

[Figure: Good RRA settings for 1- and 5-minute polling]

These are intended to be used in groups. Select only the “@ 1min” RRAs for 1-minute data, and only the “@ 5min” RRAs for 5-minute data.

Adjust “Step” and “Heartbeat” on all Data Source templates

In Cacti, navigate to “Console” -> “Templates” heading -> “Data Templates”. Click on a Data Template that you want to use 1-minute polling.

Under “Associated RRAs”, select the new 1-minute RRAs you created. Use the Ctrl key and scroll the tiny selection window to select all “@ 1min” RRAs. Deselect the “@ 5min” RRAs.

[Figure: Cacti associated RRA settings for Data Templates using 1-minute polling]

Then adjust the Step setting to 60.

[Figure: Cacti step settings for Data Templates using 1-minute polling]

Then set the Heartbeat to 120. This must always be twice the Step setting. It would be nice if it were automatically calculated in the next release.

[Figure: Cacti heartbeat settings for Data Templates using 1-minute polling]

At the bottom of the page, press Save. Repeat the “Heartbeat” step for each Data Source Item tab.

The final settings should look like this:

[Figure: Cacti settings for Data Templates using 1-minute polling]
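If it helps to see these settings in rrdtool terms, the sketch below assembles the kind of `rrdtool create` command they correspond to. It only prints the command and does not run it. The file name, data source name, GAUGE type, 0.5 xfiles factor, and all row counts are placeholder assumptions chosen for illustration, not values taken from this article or from Cacti.

```shell
step=60                     # "Step" from the data template, in seconds
heartbeat=$((step * 2))     # "Heartbeat": twice the step, per the rule above

# RRA steps are the 1-minute set from this article (1, 5, 30, 120, 1440).
# The row counts after each colon are hypothetical placeholders:
# 2880 x 1 min = 2 days, 2016 x 5 min = 7 days, 1488 x 30 min = 31 days,
# 1092 x 2 h = 91 days, 730 x 1 day = 2 years.
rras=""
for spec in 1:2880 5:2016 30:1488 120:1092 1440:730; do
    rras="$rras RRA:AVERAGE:0.5:$spec"
done

# example.rrd and example_ds are made-up names.
create_cmd="rrdtool create example.rrd --step $step DS:example_ds:GAUGE:$heartbeat:0:U$rras"
echo "$create_cmd"
```

The point is only the mapping: the template’s Step becomes `--step`, the Heartbeat becomes the fourth field of the `DS:` definition, and each associated RRA contributes one `RRA:` entry whose steps value is multiplied by the step.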

Repeat this step for every Data Template you wish to update. This is a tedious job. If you’re good with MySQL you can change these settings directly in the database, but I don’t recommend it.

Change the poller frequency in cron

Unfortunately it’s possible to configure cron several ways. Under Debian/Ubuntu, cacti installs a configuration file /etc/cron.d/cacti. From a terminal, edit it with:

sudo editor /etc/cron.d/cacti

Source installs recommend editing root’s crontab:

sudo crontab -e -u root

The entry will look something like this:

Change this to the following, then save and quit.
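As a rough illustration of the change, the sketch below rewrites the minute field of a cron entry from every 5 minutes to every minute. It assumes a Debian-style /etc/cron.d line that runs poller.php as www-data; the user, PHP binary, path, and redirection are all assumptions and will differ between installs (a line in root’s own crontab has no user field).

```shell
# Hypothetical 5-minute entry; the user, PHP binary, and poller.php path vary by install.
old_entry='*/5 * * * * www-data php /usr/share/cacti/site/poller.php >/dev/null 2>&1'

# Rewrite only the leading minute field so the poller runs every minute.
new_entry=$(printf '%s\n' "$old_entry" | sed 's|^\*/5 |*/1 |')

echo "before: $old_entry"
echo "after:  $new_entry"
```

Only the first field changes; everything after it should stay exactly as your install wrote it.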

Change the poller frequency in Cacti settings

Back in Cacti, navigate to “Console” -> “Configuration” heading -> “Settings”. Then select the “Poller” tab. Change “Poller Interval” and “Cron Interval” to “Every Minute”.

[Figure: Cacti poller settings for 1-minute polling]

Note: Some people recommend not changing the cron interval. See the first comment below for clarification.

Finally, rebuild the poller cache. Navigate to “Console” -> “Utilities” heading -> “System Utilities”. Then select “Rebuild Poller Cache”. Congratulations! Cacti is now polling every minute with correct RRA settings.

  1. Pico

    Everyone says to leave the cron poller at 5 minutes and just change the poller interval to 1. However, your guide states the opposite?

    1. Tyler Wagner

      Many people do indeed say that. This needs clarification:

      1. Always change the poller interval to 1 minute.

      2. If you change the crontab entry to 1 minute, change the Cacti cron setting to 1 minute.

      3. If you do not change the crontab entry (leave it at 5 minutes), then also leave the Cacti setting at 5 minutes.
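A small sketch of why the two settings need to agree, based on the explanation further down this thread that a 5-minute cron entry makes the poller process run five times, once per minute. The function name is hypothetical, and the division is my assumption about how the passes are counted, not a quote from Cacti’s code.

```shell
# Polling passes made per cron invocation, assuming passes = cron interval / poller interval.
# poller_passes is a hypothetical name; both arguments are in seconds.
poller_passes() {
    # $1 = cron interval, $2 = poller interval
    echo $(( $1 / $2 ))
}

echo "cron 300s, poller 60s: $(poller_passes 300 60) passes per cron run"
echo "cron 60s, poller 60s: $(poller_passes 60 60) pass per cron run"
```

If the Cacti cron setting says 5 minutes while crontab actually fires every minute, the two disagree about how many passes each run should make, which is the mismatch the rules above avoid.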

    2. Matt Richards

      Nice post Tyler, I’ve just configured two different installs for 1-minute polling (Cacti-0.8.7e on Ubuntu Server 10.04 and Cacti-0.8.8b on Scientific Linux 6.4). Thanks for taking the time to bring this information together in one place. Your response to Pico’s comment also helps to clear up any confusion about Cacti Poller settings.

      For those interested, more information can be found on the Cacti Forum about poller frequency and the values used for the “Rows” in the RRA configuration.

      Having a read of the RRD documentation helps too.

      Cheers.

      1. Tyler Wagner

        Congratulations on setting it all up! It’s not an easy task.

        I created a spreadsheet for this, in case you want it.

        1. Matt Richards

          Wonderful, thanks for sharing it.

        2. Martin J

          Thanks for the detailed guide. You recommend changing polling parameters before using Cacti. How would the changes affect graphs and overall operation on a production Cacti server? What would you advise?

          1. Tyler Wagner

            Hi Martin,

            If you make the above changes, all new data sources and graphs will work on 1-minute intervals, while all existing data sources and graphs will continue to use the old settings. They won’t, for instance, use 1-minute polling for new data. So if you want 1-minute polling, you have to bin the old 5-minute data.

            Technically, it’s possible to extract the old 5-minute data as XML using rrdtool and then manipulate it to create 1-minute data and import it again. In practice, that’s a lot of work and requires good knowledge of rrdtool.

          2. Drew

            Is it possible to have some hosts polled every 1 minute and some every 5 minutes? We only need the 1-minute granularity on some hosts, and don’t want to hammer some others (across a WAN link). Would just deselecting the 1-minute RRA in the data template take care of it?

            1. Tyler Wagner

              If you edit the data template, it will affect all new data sources/graphs created from that time. I have some graph types that use 5 min while others use 1 min, but if you wanted different hosts to do that, you’d have to make two complete sets of data source templates. And then either make two sets of graph templates (each referencing 1 or 5 min data sources), or spend a lot of time editing each graph as you create it.

              It’s probably not worth it, to be honest. Even my Raspberry Pi can handle being polled every minute, and it’s not an issue for the server either.

            2. Tosage

              Thanks for your article, it’s clear and well explained !

              David

            3. Winfried Maus

              Hi Tyler,

              how did you solve the problem with the “gaps” in the graphs – or didn’t you run into that issue? I have an Octo-Core Xeon server with 16 GB RAM and an SSD drive that uses one minute polling, and it shows a three minute “gap” in ALL graphs every 9.5 hours. I have a second non-SSD server with half that CPU power and only a quarter of that RAM running on five minute polling, and it does NOT show these gaps — it only showed them when it was still using one minute polling, too.

              The polling cycle finished in 22 seconds on the Octo Core machine (which also hosts a Smokeping server) and it needs around 45 seconds on the smaller server. So I don’t think the gaps come from unfinished polling cycles.

              What both Cactis have in common is their horrible slowness on the front end. The slower server now uses nginx instead of Apache and both machines have the php5-fpm package installed and running, but that didn’t produce any improvements. At all. Cacti’s web interface is a pain to use on both machines, and I really don’t know what screw I could still turn to make it faster.

              Do you have any suggestions for how to improve the responsiveness of the web interface?

              Thanks a lot in advance,
              Winni

              1. Tyler Wagner

                Hi Winni,

                It seems you have two problems:

                A. Gaps in your graphs of 3 minutes, every 9.5 hours.

                I have some questions:

                1. 9.5 hours is really odd. The 3-minute gap is really odd. Does this correspond to any load issues on the server, or any other background task?

                2. What is your poller interval in cron? 1 minute or 5? If set to 5, the PHP or spine process runs 5 times, once per minute, then terminates (and waits for the next cron run to start it again). If 1, it just runs when cron calls it. I always use a 1-minute interval.

                3. If you delete and re-create a data source using 1-minute RRAs, does the gap still exist?

                4. Are you graphing disk I/O of the Cacti server? That may help identify the problem. See templates here.

                Right now, I don’t believe your issue is related to poller cycles; I think it is something else. Which takes me to the second issue:

                B. Unresponsive GUI.

                I’m concerned these are related. What is your base OS? Web server? How does the web server launch PHP? Are you using PHP or spine for polling? I’m using Ubuntu 10.04 and 12.04, Apache, mod-php5, and spine.

                1. Winfried Maus

                  Hi Tyler, I could finally identify the cause of the unresponsive web interface: the superlinks plugin! As soon as I disabled the plugin, Cacti’s web interface became fast again.

                  To be sure that it’s the plugin itself that causes the slowdowns and not the external web pages that it loads, I re-enabled the plugin and just disabled all external web pages. The result was a sluggish Cacti web interface. Only disabling the plugin itself restores the full speed of the server.

                  I don’t think that this will “magically” remove the gaps in the graphs, but at least it explains why Cacti responded so slowly.

                  1. Tyler Wagner

                    Hi Winni,

                    That’s really strange. I’m using Superlinks 1.4 on my home server with Cacti 0.8.8b on Ubuntu 12.04, and it has no impact at all on the interface.

                    1. Winfried Maus

                      I haven’t looked at the source code of the plugin yet. The pages that superlinks is supposed to load contain Java applets and are hosted on a decimator appliance. Maybe superlinks fetches those pages every time Cacti’s user interface refreshes even when they are not going to be displayed?

                      1. Tyler Wagner

                        In my tests, it only loads the URL (or page) when you click on it. I monitored Apache’s access log while testing. My only Superlink is a static HTML page on the same server (which itself simply redirects to / on the same server).

                        Checking that log while clicking around the Cacti interface shows that the only non-standard access is the Superlinks-generated tab image. Perhaps there’s some issue with the tab image generation?

                      2. Winfried Maus

                        Hi Tyler,

                        Thanks for the reply! :)

                        I honestly have no idea why this happens every 9 1/2 hours (give or take a couple of minutes). Also, both Cacti servers “disagree” on the exact moment when the three-minute gap occurs. The second one usually shows the gap several minutes later. The only CRON job that in theory would explain the pause is the backup job. That usually launches between six and seven o’clock in the morning, and the gaps don’t happen when it runs. So it must be something else.

                        The OS: 64-Bit Ubuntu 12.04 LTS on both servers. Cacti1 uses Apache, Cacti2 currently has a test-run with nginx. Both originally were default Ubuntu LAMP stacks. When the performance got worse, I installed FPM/FastCGI on both servers, but that did not improve the performance at all. Both use Spine. One minute CRON intervals on the one-minute-polling Cacti1, five-minute CRON intervals on the five-minute polling Cacti2.

                        Disk I/O on Cacti1 normally is around 1.6MB/s (write) and 400k (read), if the graph is correct. When the backup runs, read goes up to around 47MB/s and ends with a short write peak of 22MB/s, if the graph is correct at all. The monthly averages peak at 8.3M (read) and 3.9M (write), so I don’t know which numbers I can really trust. But these peaks always happen during the backup window in the morning. I don’t have any disk I/O numbers for the HDD-based Cacti2; it’s a test machine that only monitors a fraction of the devices anyway.

                        All 8 cores of Cacti1 show a CPU load between 28 and 32 per cent. “top” shows the usual suspects: rrd-tool and php have the most hunger for CPU resources (with up to 86% for sometimes multiple rrd-tool processes).

                        Since this is a production server, I’m not sure if I can easily delete data sources. I’ll try to pick a few that won’t hurt anybody and will let you know if that helped. It’s also possible that the RRA values are not in order; some are different to those that you listed in your articles. I should also double check that.

                        Thanks,
                        Winni

                      3. Winfried Maus

                        BTW, I am also posting in this thread on Cacti.net; there you can also see some of the graphs I’m talking about.

                        http://forums.cacti.net/viewtopic.php?f=21&t=51796

                      4. Cam

                        OMG Thanks SO MUCH! There is so much WRONG information out there. Follow the instructions, folks, this just works.
                        Cacti 88b yum install
                        Centos 6.5

                      5. Cam

                        Posted wrong place…ugh.
                        OMG Thanks SO MUCH TYLER! There is so much WRONG information out there. Follow the instructions, folks, this just works.
                        Environment:
                        Cacti 88b yum install
                        Centos 6.5

                        Cam
                        (the guy with sweet 1 minute graphs that still has hair left)

                      6. sahar

                        How can you make these changes automatically, so that when you add another device you don’t need to repeat all the steps above for every chart?

                      7. Mehdy

                        Thank you very much man!

                        Just enabled it on 0.8.8b and it works like a charm !!!

                        Thanks a lot ^^ !

                      8. Rico

                        This is just amazing, it works for me. Cacti 0.8.8a on Windows Server 2012.
                        I did everything except changing the error-log.php setting.

                        Thanks
                        Rico

                      9. Ankan Bhowmik

                        Hi Tyler,

                        Thanks for the detailed post. I am very new to Cacti. I have followed every step to set up Cacti with 1-minute polling, but unfortunately the graphs are still refreshing every 5 minutes, whereas in /var/log/cacti/cacti.log the poller is showing a 1-minute interval. Maybe I did something wrong in the settings, and that’s why the graphs are not refreshing properly. Please help me fix the issue. If you need more info, let me know.

                        1. Tyler Wagner

                          Ankan,

                          Cacti’s page refresh is not related to its polling cycle. It is set per-user in User Management. Set it to 60 seconds for your user like so:

                          http://forums.cacti.net/about35996.html

                        2. Cars

                          Nice article. I have tried it, but I get an error like this:

                          10/24/2014 03:42:09 PM – CMDPHP: Poller[0] WARNING: SNMP Walk Timeout for Host:127.0.0.1

                          What happened? My graphs take a very long time and never show up.

                          1. Tyler Wagner

                            Your problem has nothing to do with 1-minute polling. Are you setting up Cacti for the first time? Start by verifying that you can do an SNMP walk at all.
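Something along these lines, as a sketch: it assumes SNMP v2c, the 127.0.0.1 host from the log line above, and that net-snmp’s `snmpwalk` is installed. The numeric OID 1.3.6.1.2.1.1 (the system group) is just a small, commonly available target chosen for illustration.

```shell
# Assumed values: the "public" community and the 127.0.0.1 host from the log line above.
community="public"
host="127.0.0.1"

# Walk only the system group (1.3.6.1.2.1.1) so the check stays short.
walk_cmd="snmpwalk -v 2c -c $community $host 1.3.6.1.2.1.1"
echo "$walk_cmd"

# Run the walk only if snmpwalk is actually installed; report a failure or timeout.
if command -v snmpwalk >/dev/null 2>&1; then
    $walk_cmd || echo "SNMP walk failed or timed out"
fi
```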

                            Replace “public” with your SNMP community (password). If that fails or times out, resolve that issue first.

                          2. Chris K. Brown

                            Tyler,

                            Extremely helpful. I was going the other way – I have a Script data source which only updates every 10 minutes. I left the poller interval in cacti and the cron entry alone at 5 minutes, and I noticed that the poller ran my script every other invocation, as expected! Thought I was home free – had no clue about the RRAs. So I set up 10 minute RRAs per your design.

                            What got me here were gaps in the graphs using the stock 5-minute RRAs. Gaps every 3 hours or so like clockwork. I know enough to make sure that the rrd updates were happening every 10 minutes and the new data was actually getting into my rrd files. Watching the poller output and querying the rrd directly told me this. So it was the graphing step that was leaving the gaps in the graph, since the 10-minute data is all there in the rrd. I am hoping the new rrd’s with the correct 10 minute step values fix it, I will let you know!

                            I have one question: you “left off” an hourly graph for the 5-minute data in your example above. I believe the correct way to create one is to have exactly the same settings as the 5-minute daily graph, but just change the display interval down to 14400. All you are doing here is “zooming in” the graph. Do you agree?

                            In any case your explanation was perfect. Thanks.

                            1. Tyler Wagner

                              No, you don’t need the “hourly” graph with 5-minute polling. If high-resolution data is unavailable, Cacti uses the lower-resolution data. So zooming the daily graph gives you the same effect. But collecting useless “hourly” data will take disk space for no reason.
