| technology | system admin - Jon Welling
How to Use Cron & Crontabs to Schedule Linux Jobs
Building web apps or automated infrastructures likely means you'll want to create repeated, automated tasks. All operating systems have a mechanism to do this, but in this post we are going to be focusing on Linux.
Linux is often the OS of choice for the backbone of the internet. It powers our web servers, switches, and most of our cloud infrastructure. So, it's super handy to know how to schedule repeated tasks in Linux. Let's take a look at using Cron to create and schedule tasks within Linux environments.
What is the difference between Cron, Crontabs, and a Cronjob?
Cron is the application in Linux that runs scheduled jobs. It works in the same way that Task Scheduler would in Windows but is in many ways easier to configure. Cron runs as a daemon, the terminology used in Linux to describe background services. The name Cron itself comes from chronos, the Greek word for time.
Crontabs are the configuration files used by Cron to run services. Crontabs hold the configurations for which service to run and when it should run. Services are nothing more than an execution path to a script or application with possible additional commands.
Cronjobs are the individual entries in that Crontab file. The Cronjob is the act of running the service itself. So, if someone tells you to adjust the Cronjob for a service to only run once a month, they are saying to edit the specific Crontab entry for that individual, scheduled job.
How Do You Configure a Crontab?
As we mentioned above, the Crontab is the configuration file used by Cron to run services. There is one crontab per profile in the Linux environment. Each line in a Crontab file is a different entry for a specific job. Let's discuss the parts of a Crontab entry and how to create one.
What does a Crontab Entry Look Like?
Each line in a Crontab file will look like this:
* * * * * /path/to/command
Each of those asterisks designates a marker in time. There are always five entries (five asterisks). Those five markers in time denote, in order:
- Minute of the hour
- Hour of the day
- Day of the month
- Month of the year
- Day of the week
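As a rough illustration of that ordering, the sketch below splits an entry into its five time fields and the command. The function name 'describe_entry' and the sample entry are made up for this example; they are not part of Cron.

```shell
# describe_entry ENTRY: label each part of a crontab entry.
# (hypothetical helper for illustration, not a Cron tool)
describe_entry() {
    printf '%s\n' "$1" | {
        # read splits on whitespace; the last variable receives the rest of the line
        read -r minute hour day_of_month month day_of_week job
        printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s command=%s\n' \
            "$minute" "$hour" "$day_of_month" "$month" "$day_of_week" "$job"
    }
}

# Sample entry with a different value in each field so the positions are easy to see.
describe_entry '5 4 3 2 1 /path/to/command'
# prints: minute=5 hour=4 day-of-month=3 month=2 day-of-week=1 command=/path/to/command
```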
Each of those spots must have an entry. For example, if you wanted to create a service that ran every single minute, your Crontab entry will look like this:
* * * * * /path/to/command
On the other hand, if you wanted a service that ran once per hour, your Crontab entry would look like this:
1 * * * * /path/to/command
That seems a bit backward, doesn't it? Why are we adding an entry in the minute mark instead of the hour mark if we want a service to run once per hour?
The answer is surprisingly simple. The asterisk is a wildcard: it matches every possible value for its field. By putting 1 in the minute field only, we are telling Cron to run that Cronjob at minute 1 of every hour (use 0 instead if you want the job to run exactly on the hour). Because the remaining fields are wildcards, any hour, day, and month will match, so the job runs once per hour. The same reasoning explains why the every-minute entry above has asterisks in all five fields.
For one more example, how would we create a Cronjob that runs at the 30-minute mark of every hour, but only on the 13th of each month? In that case, your Crontab entry will look like this:
30 * 13 * * /path/to/command
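To make the matching rule concrete, here is a simplified sketch of how a schedule could be checked against a point in time. It is not how Cron is actually implemented: the function names are hypothetical, it only understands an asterisk or a single number in each field (real Cron also accepts lists, ranges, and steps), and it requires all five fields to match, whereas real Cron treats day-of-month and day-of-week specially when both are restricted.

```shell
# field_matches FIELD VALUE: succeed when one time field matches VALUE.
# Only "*" and a single number are handled in this sketch.
field_matches() {
    [ "$1" = "*" ] || [ "$1" -eq "$2" ]
}

# schedule_matches "MIN HOUR DOM MONTH DOW" MIN HOUR DOM MONTH DOW
# Succeeds when every field of the schedule matches the given time.
schedule_matches() {
    printf '%s\n' "$1" | {
        read -r f_min f_hour f_dom f_month f_dow
        field_matches "$f_min" "$2" &&
            field_matches "$f_hour" "$3" &&
            field_matches "$f_dom" "$4" &&
            field_matches "$f_month" "$5" &&
            field_matches "$f_dow" "$6"
    }
}

# Hypothetical moment: 14:30 on day 13 of month 6, day-of-week 5.
if schedule_matches '30 * 13 * *' 30 14 13 6 5; then
    echo "30 * 13 * * runs at 14:30 on the 13th"
fi

# Same time of day, but on the 12th: the day-of-month field no longer matches.
if ! schedule_matches '30 * 13 * *' 30 14 12 6 5; then
    echo "30 * 13 * * does not run on the 12th"
fi
```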
How Do You Create a Cronjob That Runs Every 30 seconds?
I'm sure you've spotted an issue here. The smallest interval you can configure in a Crontab entry is one minute. What happens if you need to run a service every 30 seconds? A first attempt at the Crontab entry might look like this:
* * * * * (sleep 30; /path/to/command)
This brings up an interesting scenario here. Knowing how Crontab entries are entered using the time markers from the section above (there are asterisks in each time marker), we know that Cron will run this Cronjob every single minute of every hour on every day of the month. Cron jobs always start at the very beginning of the minute/hour/day/week, etc…
In this case, though, we are chaining an extra command into the command area of the Crontab entry. The command part of the Crontab entry works like a mini shell script. You can run multiple commands by wrapping the command area in parentheses and separating the commands with semicolons. Here, we are telling the shell to wait 30 seconds and then execute our command.
So, if you think through that, the Crontab entry above in actuality still only executes commands once a minute and not every 30 seconds as we wanted. That's because there is only one Crontab entry but it's telling Cron to wait 30 seconds before executing the command. To fix this, we need two Crontab entries in our configuration file. Our Crontab entry to execute a command every 30 seconds will look like this:
* * * * * /path/to/command
* * * * * (sleep 30; /path/to/command)
In this way, Cron will execute our command once at the very beginning of every minute while also waiting for 30 seconds and then executing it again.
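The same trick extends to other sub-minute intervals. The sketch below generates the entries for a given interval in seconds; 'print_subminute_entries' is a hypothetical helper name, and the sketch assumes the interval is a positive number that divides evenly into 60.

```shell
# print_subminute_entries INTERVAL COMMAND
# Print one every-minute crontab entry per offset within the minute.
# Assumes INTERVAL is a positive whole number that divides 60 evenly.
print_subminute_entries() {
    interval=$1
    job=$2
    # Refuse a zero or negative interval, which would loop forever.
    [ "$interval" -gt 0 ] || return 1
    offset=0
    while [ "$offset" -lt 60 ]; do
        if [ "$offset" -eq 0 ]; then
            # First run: start right at the beginning of the minute.
            printf '* * * * * %s\n' "$job"
        else
            # Later runs: wait for the offset, then run the command.
            printf '* * * * * (sleep %s; %s)\n' "$offset" "$job"
        fi
        offset=$((offset + interval))
    done
}

# Reproduces the two entries shown above.
print_subminute_entries 30 /path/to/command
```

Calling it with 20 instead of 30 would print three entries, with sleeps of 20 and 40 seconds on the second and third.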
As we mentioned above, each Linux profile has its own Crontab configuration file. Cron runs the Cronjobs listed in a Crontab file with the permissions of the user profile that owns it. So, if you need a Cronjob to run with root privileges, you need to edit root's Crontab file by running the Crontab command with elevated privileges, using the sudo or su command (depending on the Linux distribution you are using).
How do I Add a New Cronjob to Crontab files?
Now that we know what Cron, Crontab, and Cronjobs are, let's put it all together and add a new Cronjob to Cron.
For our example, we are going to run a shell script once every 30 seconds, and we are going to run a command that deletes all items from a temporary folder in our website directory for our web server once an hour at the 30-minute mark each hour. We are going to make a few assumptions for this example, though:
- The shell script will be called myScript.sh.
- That shell script will need root privileges to run.
- That shell script will be located in the root of the home folder for your Linux profile.
- We'll assume your Linux user profile name is Jon, and thus the home folder for that user account will be /home/Jon.
- We'll also assume we are using the standard 'html' directory for the Apache web server for our website.
- We'll also be using Nano to edit the Crontab configuration file as well. Nano can be swapped for your Linux command line text editor of choice.
First, let's open the Crontab file for editing. This is done with the 'crontab -e' command. Don't forget, we need these Cronjobs to run with root privileges, so we run the command with sudo to edit root's Crontab:
sudo EDITOR=nano crontab -e
Setting the EDITOR variable asks Crontab to open the file in Nano (assuming your sudo configuration allows variables to be passed this way), but you can swap it for Vi or any other text editor. Likewise, you can omit the EDITOR setting and run 'sudo crontab -e' on its own. Crontab then opens your default editor, and on some distributions it first asks which editor you want to use for Crontab files.
Once you open the Crontab file, you'll see its current contents in your editor.
Now we need to add our entries to Crontab. Don't forget that each entry is a new line in the Crontab file. We'll add the following three lines:
* * * * * /home/Jon/myScript.sh
* * * * * (sleep 30; /home/Jon/myScript.sh)
30 * * * * rm -R /var/www/html/temp/*
After you enter all three commands, double-check your work and then save and close the Crontab file. Once the Crontab file is saved, Cron will automatically pick up those changes.
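If you would rather add an entry without opening an editor, one possible approach is to print the current Crontab, append the new line, and load the result back in. The helper below is only a sketch: 'append_entry' is a hypothetical name, and the commented-out usage line assumes your version of Crontab accepts '-' to read the new file from standard input.

```shell
# append_entry ENTRY: copy the crontab text on stdin to stdout,
# then add ENTRY unless an identical line is already present.
append_entry() {
    entry=$1
    existing=$(cat)
    if [ -n "$existing" ]; then
        printf '%s\n' "$existing"
    fi
    # grep -F: treat the entry as plain text (asterisks are literal)
    # grep -x: only match whole lines, -q: print nothing
    if ! printf '%s\n' "$existing" | grep -Fxq -- "$entry"; then
        printf '%s\n' "$entry"
    fi
}

# Possible usage for root's Crontab (left commented out on purpose):
# sudo crontab -l 2>/dev/null | append_entry '30 * * * * rm -R /var/www/html/temp/*' | sudo crontab -
```

Because the helper skips lines that already exist, running it twice with the same entry should not create duplicates.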
Let's walk through those three lines quickly.
The first two lines are straightforward. Because there is a wildcard in each time marker in the Crontab entry, Cron will run each of those lines at the beginning of each possible minute. Because the second line has the sleep command we discussed above, it will pause for 30 seconds before executing the shell script. The shell script command itself also includes the path to the shell script. Cron runs jobs with a minimal environment rather than the one from your interactive shell, so you need to include the full path to the command or shell script you want it to run, unless that command lives in a directory listed in the PATH that Cron uses.
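One way to check whether a command needs its full path is to look it up using only a short PATH. The sketch below assumes Cron's PATH is /usr/bin:/bin, which is a common default but may differ on your system; 'found_with_path' is a hypothetical helper name.

```shell
# found_with_path SEARCH_PATH NAME
# Succeed when NAME can be found using only the directories in SEARCH_PATH.
found_with_path() {
    # The subshell keeps the PATH change from leaking into the calling shell.
    ( PATH=$1; command -v "$2" >/dev/null 2>&1 )
}

# Assumed Cron PATH; check your own system before relying on it.
cron_path=/usr/bin:/bin

for name in rm sleep myScript.sh; do
    if found_with_path "$cron_path" "$name"; then
        echo "$name can be called by name"
    else
        echo "$name needs its full path in the Crontab entry"
    fi
done
```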
The last line is straightforward as well, but let's talk about a couple of the "gotchas." Note that we entered 30 in the minute mark of that Crontab line. That tells Cron that this command should only run at the beginning of the 30-minute mark of each hour.
More importantly, note how we structured the command for this line. It's using the standard Linux remove (rm) command, so what's so special about it?
First, we added the recursive flag (-R). Though we aren't expecting sub-directories to be created under that temp folder, it's safer to add the flag just in case. Without it, rm refuses to delete directories and reports an error for each one it finds.
Likewise, we added the /* at the end of that command, too. If we omitted that (so the command only said rm -R /var/www/html/temp) then the rm command would delete the temp folder along with everything inside of it. We don't want that. The temp folder needs to stay there. By adding the /* to the end of that command, we are telling rm to only delete the files and folders inside of the temp folder and not the temp folder itself.
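You can try this behavior safely before pointing it at a real website directory. The sketch below builds a throwaway folder with mktemp (the file and folder names inside it are made up for the example) and runs the same shape of command against it.

```shell
# Work in a throwaway directory so nothing real is deleted.
demo=$(mktemp -d)
mkdir -p "$demo/temp/subdir"
touch "$demo/temp/a.txt" "$demo/temp/subdir/b.txt"

# Same shape as the Crontab command: remove everything inside temp,
# including the sub-directory, but keep the temp folder itself.
rm -R "$demo/temp"/*

# The folder is still there, and listing it shows no remaining entries.
if [ -d "$demo/temp" ]; then
    echo "temp folder still exists"
fi
ls -A "$demo/temp"

# Remove the throwaway directory when you are done experimenting:
# rm -R "$demo"
```

Note that the shell's * does not match names that begin with a dot, so hidden files inside the temp folder would be left in place.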
And that's it! Now you can confidently update and add jobs to Cron.