I was recently asked what a day in the life of a sysadmin looks like. The truth is far less exotic than most folks would expect. A “typical” day usually means spending 6+ hours staring at a computer screen trying to figure out why OSX Open Directory won’t authenticate to an Active Directory server. Or a 14-hour marathon of installing patches on individual workstations, because they’re network patches that fail when installed remotely. So rather than give an actual example of an average day, I decided to create an unusually busy and productive day, which includes tasks that happen regularly; they just don’t usually all happen on the same day. Keep in mind, some of the items below do take place every day, particularly the maintenance stuff.
There’s also the unfortunate truth that a day in the life of a system administrator isn’t necessarily the best and most efficient way to run a day. As I go through the “normal” day, I’ll make notes along the way. Even the most seasoned professional can improve their efficiency and workflow, so I’ll do my best to advise… myself!
Wake up. Still not adjusted to Daylight Saving Time ending. Waking up two minutes before the alarm goes off isn’t uncommon, but an hour and two minutes? Ugh.
Try to fall back asleep, but concern over potential overnight catastrophes keeps me awake.
Capitulate and grab cellphone from nightstand. Don’t have glasses on, so hold phone two inches from face, and get blinded by the screen brightness. Check emails for any automated failure notices. Lack of any email, including spam, means email isn’t loading. Force quit mail app and restart it: 47 new messages, but a quick scan of subjects shows no major failure notifications. Set phone down and try to go back to sleep.
Can’t sleep. Grab phone and check Twitter, Facebook, G+, and browse news sites. Load all websites from work to make sure network is working and servers are up. Check that network backups succeeded last night (at work).
NOTE: While I don’t know if I’ll ever be able to stop myself from manually checking servers and such, a far more efficient method is to automate checks from both internal and external sources, then have reports emailed to you. Be sure to read that daily report, and if there are errors (or it doesn’t arrive!) you’ll know exactly where to start. I will warn that if you set up too many daily email reports, they’ll start to seem like background noise and you will miss important information, or you might not notice if one of them fails to arrive.
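For what it’s worth, here is a minimal Python sketch of that kind of automated check. It only tests whether a TCP port answers and then assembles a plain-text report. The `check_tcp` and `build_report` helpers and the target list format are made up for illustration, and actually emailing the report (and noticing when it fails to arrive) is left out.

```python
import socket
from datetime import datetime, timezone


def check_tcp(host, port, timeout=3.0):
    """Return (ok, detail) for one TCP connection attempt to host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "reachable"
    except OSError as exc:
        # Covers refused connections, timeouts, and DNS failures.
        return False, f"unreachable ({exc.__class__.__name__})"


def build_report(targets, checker=check_tcp):
    """Run every check and return (all_ok, report_text).

    targets is a list of (name, host, port) tuples (a hypothetical format).
    """
    lines = []
    failures = 0
    for name, host, port in targets:
        ok, detail = checker(host, port)
        if not ok:
            failures += 1
        status = "OK" if ok else "FAIL"
        lines.append(f"[{status}] {name} {host}:{port} {detail}")
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    header = f"Daily check {stamp}: {failures} failure(s) of {len(targets)}"
    return failures == 0, "\n".join([header] + lines)
```

Calling `build_report([("web", "www.example.com", 443)])` from a scheduled job on a machine inside the network, and again from one outside it, would give you both views. Putting the failure count in the first line keeps the report from blending into the background noise.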
Give up and get out of bed. Put on glasses and wander to kitchen. Realize I forgot the dogs, go back and let them out of their crates and outside. Get coffee pot locked and loaded, and then measure the beans. I have a hand grinder, so I literally grind beans for five minutes or so with a crank. Press “brew” and wander into office.
Dogs barking at absolutely nothing in the front yard. Run to door and call them inside before the neighbors get angry.
Wish I’d forced myself to sleep at 4 a.m., but now it’s time to wake up the kids. Certainly could sleep now, but it’s not an option anymore.
Wake up kids for the first time.
Drink second cup of coffee, while reading online news sites. Check graphs for both home and work networks, including bandwidth usage overnight, CPU loads, memory usage, etc. Everything looks OK.
NOTE: Even if you have set up email reports, and they all appear OK, a cursory look at your systems is always a good idea. I once had perfectly good-looking reports, only to find that my web server had been attacked by an Internet worm, and had filled up all its RAM and swap file with requests. Reports only report what you ask them to report, so don’t get burned by your monitor program saying, “But you didn’t ASK me about swap file usage on the frontend web server…” Monitoring programs can be jerks.
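If you want your monitor to actually ask about swap, a rough Python sketch might look like the following. It assumes a Linux server that exposes `/proc/meminfo`. The function names and the 80 percent threshold are my own placeholders, not anything a particular monitoring product uses.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field_name: value_in_kB}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def swap_used_percent(meminfo):
    """Percentage of swap in use; 0.0 if the machine has no swap at all."""
    total = meminfo.get("SwapTotal", 0)
    if total == 0:
        return 0.0
    return 100.0 * (total - meminfo.get("SwapFree", 0)) / total


def swap_alert(text, threshold=80.0):
    """Return (should_alert, used_percent) for the given meminfo text."""
    used = swap_used_percent(parse_meminfo(text))
    return used >= threshold, used
```

On the server itself you would pass in the contents of `/proc/meminfo` (for example, read with `open("/proc/meminfo").read()`) and feed the result into whatever report you already send.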
Wake up kids for the second time.
Quickly get ready for work. Put in contacts, run fingers through thinning hair, get dressed. Once the three girls are up, Dad won’t get bathroom time.
Wake up kids for the third time, with more urgency and grumpy dad voice.
Make lunches for the family. (Actually rather proud of making lunch for the wife and kids every day.)
NOTE: I’m a huge fan of the LunchBlox kits for school lunches. But even though they claim to be microwavable containers, if you pack ravioli, it will destroy the container when they heat it.
Start urging kids to get out the door for school.
Urge kids more vociferously, stressing their potential tardiness.
Help youngest child find socks while everyone else is in the car waiting.
Kiss last child as they run to the car with one shoe on, one shoe off.
RUN out to the driveway with lunches the kids forgot on kitchen counter.
Put dogs in crate, grab laptop and jacket, go out to car, and head to work.
Red light. Check email.
NOTE: Slow traffic doesn’t count as a red light. Checking email in slow moving traffic is a horrible idea. Listen to music. Or an audiobook. Or talk to Siri via bluetooth. But don’t check email or texts.
Arrive at work, sneak into back door. Listen for angry chatter about servers being down — hear none.
Get stopped in hallway by coworker asking why changing passwords is mandatory every six months. Cringe at how lax every six months is, but try to explain rationale behind password security.
Get stopped outside copy room by coworker asking about their home computer.
NOTE: I encourage this. I don’t fix coworkers’ home computers, but I like to give them as much guidance as possible, and sometimes recommend they just go to a shop.
Get stopped by supervisor requesting logs of a particular employee’s web history. Ask her to email me a request so I don’t forget.
NOTE: I joke about getting stopped in the hallway, but it really does happen. I used to get frustrated by this, but really it’s an incredible opportunity to communicate with your users. Just plan for it, and such encounters can be incredibly beneficial to you and your users. In the world of digital communication, it’s amazing what a little face time can mean to a person.
Finally get to office. Make coffee.
Walk through server room, do a visual inspection of server racks. Notice failed power supply on one server (it’s a redundant power supply). Unplug and plug back in. Failure light stays off, but make a mental note to keep an eye on it.
Check email again. Reply to question about wireless access for guests here for a tour. They want to check email during their breaks. Create temporary account for them to all share, and remind person to arrange things like this in advance. Create reminder to delete temporary account at the end of the day and check the web logs for the account.
Administrative meeting. Mention the need for a system or at least a policy for guest users in our wireless system. Our goal is to provide service, but also to protect our system.
NOTE: This wireless situation is a perfect example of how a little time invested now will save hours and hours later. The solution will look different for every situation, but whether you choose to implement one-time passwords, or guest accounts, or a guest wireless network with limited access, having a way for guests to use wireless without contacting the operations department will be an incredible time saver. Plus, it will make your guests and those hosting guests very happy.
Get cellphone call. Entire wing’s network is down. Wireless is still working. Wireless is separate VLAN and in separate switch. Head to IDF to check for a network loop scenario.
NOTE: Spanning Tree helps stop this problem on nice switches, but cheap desktop switches in a single office can destroy network performance for an entire building.
Track down ethernet cord plugged into two ports on the same switch in an overcrowded office. Unplug cable, network normalized. Restart printers, because for some reason they don’t handle packet storms well.
NOTE: The first time this happened to me, it took over a day to figure out. I didn’t have the experience or training to know how to track something like this down. If you’re planning to be a system administrator, Cisco security training is incredibly beneficial. Even if you don’t plan to get certified, this training will help you troubleshoot networking issues and make you a much more valuable employee.
Stop in supervisor’s office before lunch to explain the network outage. Mention the need for more network drops in the crowded office space, and cite this outage as a reason.
NOTE: Communicate, communicate, communicate. The difference between a system administrator who can do their job and a system administrator who excels at their job comes down to communication. Don’t blame the network loop on the crowded office space, just communicate the situation so those in charge of purchasing are better informed. In my last sysadmin job, we could never afford the network drops required to stop situations like these, but informing everyone meant far less downtime when it did happen. My calls during an outage were usually something like, “Mr. Powers, I think there might be a network loop in our wing again. Everything went offline and the blinky lights are solid.” I call that sort of empowerment a win for everyone.
Lunch. Eat in office, and watch some streaming television while munching on sandwich.
Head to shipping/receiving to pick up packages. New monitors arriving today for second-floor managers. Could have monitors delivered, but prefer to go and touch base with the folks in shipping to make sure everything is going OK, especially after network outage earlier today.
NOTE: Again, communication. It’s not something many sysadmins do by nature, so force yourself!
Bring monitors up to second floor, and let managers know an intern will be installing them sometime today or tomorrow.
Head back to office. Brew new pot of coffee, and notice voicemail light blinking (I always forget to check voicemail). There are three messages about network outage earlier, safe to delete.
NOTE: Do yourself a favor, and try to get a system that emails voicemail messages to you. It will make your life so much easier!
Check email, follow up with a thank-you to all those who emailed from their cellphones about the network being down. Explain the situation without blaming anyone.
Restart web server which has been acting strangely since the packet storm hit earlier in the day.
NOTE: This is one of those things that you don’t normally find in books. While much of it is just learned by experience, take advantage of the “Real World” courses here at CBT Nuggets. The other trainers and I try really hard to give you as many of these troubleshooting tips as we can, because while the school of hard knocks is effective, it’s not very pleasant or efficient.
Check for security updates on servers. Two security updates on Internet-facing servers are available, plan to patch this evening.
Send email to department heads notifying them of system downtime for several minutes this evening after 8 p.m.
NOTE: Communicate… (Getting tired of me communicating the importance of communication?)
Tackle backlog of broken hardware in workroom. Two laptops need to be re-imaged. One has a clicking hard drive, so it gets replaced with a drive cannibalized from a laptop with a broken screen. One bad monitor, not worth fixing. One computer unable to authenticate. Suspect bad network hardware; log in with local admin account and can’t ping. Embedded ethernet port, not worth fixing. Pull out hard drive and put into freshly cloned machine. Works. Leave sticky note for intern to deliver and retrieve loaner computer.
NOTE: If at all possible, teach people while you do this. Your time is better spent elsewhere, yet all too often the system administrator is the only one who can do this sort of work. Teach an intern, or if you’re in a school, teach a student helper.
No time left for working on backlogged broken systems. Update trouble tickets for those systems repaired.
Meet with vendor about upcoming wireless infrastructure upgrade. Vendor wants to test wireless penetration between offices. Hand off vendor to intern, quietly tell intern to stay with vendor at all times, see him out of the building, and not to leave doors unlocked.
NOTE: This isn’t due to mistrust of a vendor (although that might be a valid concern, especially if your company handles sensitive information), it’s really just a safe practice that protects you and the vendor.
Drive to kids’ school for volleyball game. Pick up dollar-menu burgers on the way.
Head back to office, make sure intern locked doors, and that everything is functioning normally. Check graphs, emails, voicemails (I remembered!), and reply to important emails.
NOTE: This is probably not really necessary, but for me a 10-minute stop saves hours of wondering if everything is OK. As more and more automation and responsibility sharing take place (you know, like with a DevOps mindset!), stops like this should become silly. But I’m an old dog, and change is hard.
Go home. Eat dinner with family.
Watch some TV, and from recliner connect to office via VPN. Download server patches and prepare to install/reboot.
Email department heads again reminding them of server downtime for upgrades.
Take VM snapshots of servers in case updates b0rk servers. Launch updates. Restart servers.
Check servers, graphs, emails, and make sure updates are working well. Log into several workstations, testing updates. (Depending on the server updated, maybe log into web apps or connect network shares. Test whatever is appropriate.)
NOTE: Don’t do this anymore!!! Please learn to use DevOps tools like Chef, Puppet, Ansible, etc. If you’re not convinced you need that sort of automation, I urge you to watch my “Chef Fundamentals” course. DevOps is more than just a way to empower developers, it’s a godsend for system administrators too! Some of the steps for upgrades will be the same, especially the communication part, but system patches shouldn’t be stressful, they should be seamless and mostly automated.
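To show the shape of what those tools automate, here is a bare-bones Python sketch of the snapshot, patch, verify, and roll back sequence from tonight’s upgrade. It is not how Chef, Puppet, or Ansible work internally. The `snapshot`, `apply_updates`, `verify`, and `rollback` callables are hypothetical stand-ins for whatever your hypervisor and package manager actually provide.

```python
def patch_server(name, snapshot, apply_updates, verify, rollback):
    """Patch one server and roll back to the snapshot if anything fails.

    All four callables are hypothetical hooks supplied by the caller:
      snapshot(name) -> snapshot id
      apply_updates(name) -> None (raises on failure)
      verify(name) -> True if the server looks healthy afterwards
      rollback(name, snapshot_id) -> None
    """
    snap_id = snapshot(name)
    try:
        apply_updates(name)
        healthy = verify(name)
    except Exception as exc:
        rollback(name, snap_id)
        return f"{name}: rolled back ({exc})"
    if not healthy:
        rollback(name, snap_id)
        return f"{name}: rolled back (verification failed)"
    return f"{name}: patched"
```

The returned status lines can go straight into the follow-up email to department heads, so the communication step stays in the loop even when the patching itself is scripted.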
Spend some time with kids before bed.
NOTE: For me, it’s my family. For you it might be time with friends, or your dog, or playing an instrument, or just time walking outside. Even though system administration is a job that often has us mentally on call 24/7, it’s vitally important to take time for yourself. If you don’t love what you do, it’s not worth doing it, and not taking time for other things will make you hate your job.
Head to bed, checking Facebook/Twitter/online news from phone. Read a little from Kindle.
And that’s really about it. Sometimes coworkers get angry when I leave work early, but usually someone else explains to them that I work almost around the clock, so my “office hours” are flexible. The life of a system administrator is hectic, and can sometimes mean overnight stays in the server room when something fails — but with that responsibility comes flexibility. Apart from my current position at CBT Nuggets, being a system administrator is my favorite career!