Yet another infrastructure project

I recently got involved in a fun project. It involves building out a site-redundant infrastructure.

We have two sites (and a small third site to be used as a quorum). The datacenter network will be built on Nexus switches, the compute uses UCS servers and some NetApp arrays provide storage.

Now – while this sounds like the regular stack deployed by loads of companies all around the world, we don't have bottomless pockets, so there are a few challenges here.

  1. The network will be deployed with EVPN+VXLAN for stretching L2 between the sites. Yes…we can talk about how bad L2 stretching is until our faces are red – but we have multiple applications deployed in another environment that need to be migrated and not all of them have the luxury of being able to run in a redundant manner.
  2. Each site will only have two switches. Here is where the fun starts. Have you tried to find an EVPN+VXLAN design from Cisco that is not based on spine-leaf :-)?
  3. We have dual 100G interconnects.
  4. The third site (used as a quorum for the DC) has two Catalyst switches, and one dark fiber connection to each site.
  5. Since the third site is the quorum (i.e., the storage in the datacenters uses it to determine which site survives if one of the datacenters fails), routing needs to recover quickly in case of a failure.

So – we have the basic network requirements down.
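To make requirement 1 a bit more concrete, here is a rough sketch of what stretching a VLAN over VXLAN looks like on NX-OS. The VLAN/VNI numbers and the loopback are placeholders, and a real deployment also needs the underlay and the BGP EVPN address family configured – treat this as an illustration, not the actual design:

```
feature nv overlay
feature vn-segment-vlan-based
nv overlay evpn

! Map a stretched VLAN to a VXLAN VNI (numbers are made up)
vlan 100
  vn-segment 10100

! The NVE interface handles the VXLAN encapsulation,
! with BGP EVPN providing host reachability
interface nve1
  no shutdown
  host-reachability protocol bgp
  source-interface loopback1
  member vni 10100
    ingress-replication protocol bgp
```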

The compute part has fairly basic requirements:

  1. We need to have three virtualization clusters in the datacenters.
    • One cluster only lives in site 1 (dev/test, plus production workloads where the application itself provides redundancy across multiple servers instead of depending on hypervisor redundancy).
    • One cluster that is stretched between the sites for generic servers (production).
    • One cluster that is stretched between the sites for a certain application that has specific license requirements (test/staging/production – it can also run certain other applications if we have spare compute capacity in this cluster).
  2. There will be a single hypervisor server in site 3 running a management application and a quorum application. There is redundant power at site 3 (and at the datacenters).
    • I am aware that the hypervisor is non-redundant here, but the chance of the hardware crashing AND one of the datacenters going down at the same time is a risk we are willing to accept.

Alright – we have the basic compute requirements down as well. Understand that we are limited by a budget, so we have to make compromises in some places – not all physical servers can run all workloads due to software licensing, etc.

The storage – again – is pretty simple:

  1. The storage needs to provide a certain amount of IOPS/throughput.
  2. The storage needs to support stretched volumes between two sites with automatic recovery using a mediator/quorum over IP.
  3. The storage needs to provide a certain amount of capacity.
  4. The storage needs to support iSCSI, and have a clear roadmap for NVMe/TCP (if not already implemented).
  5. The storage solution must already have an install base here in Iceland – we have limited resources, so we need to be able to get local support in case of a crisis situation.
  6. We currently have some CIFS shares running on an array at a partner site, so either the new storage will provide the shares as well, or we need to migrate them to Windows file servers.

Alright – we have our basic requirements down for everything. I’m leaving out a lot of details here (software used, backup system design, etc) on purpose, I am fully aware of that! 🙂

Most of the hardware will be here early next week so we can start the deployment.

Hopefully we can start configuring the network in our lab no later than next Tuesday. The plan is to finish the basic configuration there (multi-site EVPN-VXLAN configuration, connectivity to the third site), and then go on and rack the equipment in the datacenters. We can get temporary internet connectivity through the third site while I deploy the hypervisors and storage. Then we can go ahead and connect the new datacenter network with our current networks and start to migrate the services between the sites.

In the past I have set up similar environments, but EVPN-VXLAN is somewhat new to me, so I have had to spend some time learning it. I need to do that anyway since I am going to be involved in a large project later this year, so I am very happy to get a head start with this project.

I’m hoping I can share a bit more detail during the implementation phase – this is a fun project and I’m happy I was able to take part in it!

A tale of a small business environment refresh – part 1

During my spare time I support the IT infrastructure for a small company located in the eastern part of Iceland. Late last year I decided it was time to refresh the infrastructure, so I spent some time figuring out the best way to do it.

We wanted to have the system hosted locally since they have, in the past, lost connectivity, so moving everything into a cloud-hosted environment wasn’t an option this time (although I expect that the next time we do a refresh we will move away from the on-premises setup). And since they are located far away from where I live, I wanted to move away from the single-server setup that had been in place for a long time. I might have gone a little overboard designing the environment for such a small business, but the results have exceeded my expectations.

We did a cost analysis of the current setup and calculated the cost of the new infrastructure over 5 years. It came to about the same as hosting the main business application with the application provider for three years. Had we gone that way, we would still have had to buy some infrastructure to host basic monitoring tools and supporting applications for the network environment – instead we can host those on the new environment as well.

I ended up going with the following specifications for the hardware and network infrastructure:

Network:

  • 2x Mikrotik CCR2004-1G-12S+2XS (Core routers)
  • 2x Mikrotik CRS518-16XS-2XQ-RM (Core switches)

Servers:

  • 3x SuperMicro CSE-116AC10-R706WB3 chassis, each with the following specs
    • Supermicro MBD-H12SSW-INR-B motherboard
    • AMD EPYC 7313 16C CPU
    • 128GB RAM
    • 2x Samsung PM9A1 512GB boot disks
    • 3x Micron 7450 Pro 1.92TB NVMe disks for Ceph
    • Supermicro AOC-S25GC-I4S-O (4x 25G ethernet adapter)
    • Dual PSU

The hypervisor I decided to use was Proxmox 7.4 (the latest release when I set up the environment). For backups I used Proxmox Backup Server, running on an old HPE ProLiant ML350 Gen8 server.

For the network I decided to set up the Mikrotik CRS518s in an MLAG configuration. The MLAG functionality in RouterOS isn’t perfect – connectivity is lost for about a minute if one of the switches goes down, since the system ID of the switch pair doesn’t stay static like it does on enterprise switches – but I am sure Mikrotik will fix that in a later release. The routers each have a 25G link to each switch, set up as a LACP bond. I created new VLANs for all networks (workstations, servers, infrastructure) and set up VRRP between the core routers on each VLAN for redundancy. A simple access list on the infrastructure VLANs limits access to the infrastructure. I have thought about adding a firewall running on the virtualization cluster, but I haven’t set one up yet.
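For reference, the MLAG and VRRP parts boil down to something like this in RouterOS 7. The interface names, bridge name, IDs and addresses are placeholders, so treat this as a sketch rather than a copy-paste config:

```
# On each CRS518: bond the peer links and mark the bond as the MLAG peer port
/interface bonding
add name=bond-peer slaves=qsfp28-1-1,qsfp28-1-2 mode=802.3ad
add name=bond-core1 slaves=sfp28-1 mode=802.3ad

/interface bridge
set bridge1 vlan-filtering=yes mlag-peer-port=bond-peer

# Downstream LACP bonds get a matching mlag-id on both switches
/interface bridge port
add bridge=bridge1 interface=bond-core1 mlag-id=10

# On the CCR2004 routers: VRRP per VLAN for the gateway address
/interface vrrp
add name=vrrp-servers interface=vlan20-servers vrid=20 priority=200
/ip address
add address=10.0.20.2/24 interface=vlan20-servers
add address=10.0.20.1/32 interface=vrrp-servers
```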

The main access switch has a 1G link to each core switch, configured as a LACP port-channel.

Backups are kept locally but also replicated to Tuxis, which provides a hosted PBS instance you can replicate your backups to. That way we can restore files/VMs quickly from the local copy when we need data in the short term, and if we have a disaster we still have a copy with longer retention at Tuxis. If you feel like it, you can also host your own PBS instance anywhere and store your backups there – it is amazing how easy PBS is to manage compared to some of the backup solutions I’ve seen in the past!

For Internet redundancy we have connections from two different providers. The main Internet connection is fronted by a Fortigate 40F firewall, which advertises the default route to the core routers through OSPF. Then we have a Mikrotik L41G-2AXD&FG621-EA 4G router on a different provider, configured on the core routers as a static default route with a higher administrative distance, which acts as a backup. This has proven to be a very stable setup for Internet connectivity so far.
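The failover logic relies purely on route preference – something along these lines on the core routers (the gateway address is made up):

```
# OSPF routes from the Fortigate (distance 110) win while it is up.
# The 4G path is a floating static default that only takes over on failure.
/ip route
add dst-address=0.0.0.0/0 gateway=192.168.200.1 distance=200 check-gateway=ping
```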

Here is a high level drawing of the infrastructure:

So far the performance has been great. My biggest worry was that Ceph performance would be bad enough that I would have to refactor everything and use ZFS with replication instead. Very limited testing has shown about 1-2GB/s in writes and 2+GB/s in reads. Each node has only 3 OSDs (I thought about partitioning the disks and using two OSDs per disk, but after my initial testing I was more than happy with the performance), so things are kept as simple as possible.
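For anyone wanting to repeat the test, the quick-and-dirty approach I know of is rados bench against a scratch pool. This obviously needs a live cluster, and the pool name and PG count here are just examples:

```
# Create a throwaway pool, benchmark writes, then sequential reads
ceph osd pool create bench 32
rados bench -p bench 30 write --no-cleanup
rados bench -p bench 30 seq

# Clean up the benchmark objects and the pool
rados -p bench cleanup
ceph osd pool delete bench bench --yes-i-really-really-mean-it
```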

There is an old APC Smart-UPS 1000 in place that can run the environment for about 7 minutes before it loses power. So far we have only had a single incident where all of the hosts lost power (the power can be somewhat unstable in the area). During the boot process two out of three hosts didn’t detect at least one of their three NVMe disks for Ceph, so the cluster didn’t have the minimum number of OSDs, and I had to manually restart the hosts to get the disks to appear again. This seems to be a bug in the SuperMicro BIOS; since upgrading to a newer version I haven’t seen it again (I had already seen it during the setup phase, so I wasn’t all that worried when we had the issue). If it keeps happening I will consider adding a PCIe adapter to handle the NVMe disks.

For the money, I think this environment is great. With the exception of the NVMe disk issue and the MLAG issue on the Mikrotik switches, I could not be happier with the result. I rarely have to touch the environment; as of now I still patch the Proxmox hosts and the Mikrotik infrastructure manually, but all of the server patching has been automated and I don’t think I will need to touch that any time soon.

The whole environment is monitored by CheckMK, which runs in a container on a Linux VM. CheckMK monitors the virtual guests, the Proxmox infrastructure, Ceph, the hardware and the network infrastructure.
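Running the CheckMK Raw edition in a container is roughly a one-liner, along these lines – the image tag, ports and volume name here are assumptions, so check the current Checkmk docs before using it:

```
# Persist sites in a named volume; the web UI ends up on port 8080
docker run -dit --name monitoring -p 8080:5000 \
  -v monitoring:/omd/sites \
  --restart always \
  checkmk/check-mk-raw:2.1.0-latest
```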

Lastly, I have been playing around with Security Onion to monitor the environment for security events, but I am still in the evaluation phase. It looks good as an open-source product and seems to have most of the features I would want for such a small environment – the only thing I would want to add is Qualys + Kenna for vulnerability scanning of both OS updates and third-party applications.

RC!

Last year I happened to stumble upon some videos of electric RC cars. I watched a couple of them and started reminiscing about the time back in ~2004 when a friend and I ordered a couple of Traxxas Revos with the TRX 3.3 engine. It was a blast, but the bad thing about living on this lovely island we call Iceland is that the temperature here is pretty low and the engine settings often needed adjusting, so we spent the better part of most sessions tuning before we could start bashing.

But watching those videos of electric cars was pretty interesting, as you could just charge up and start bashing right away! I started researching and ended up ordering a Traxxas Maxx v2. I had a blast bashing it a couple of times, but a little later I got the chance to get my hands on a Traxxas TRX4 crawler (Bronco 2021 body). Now….this is where things started to get interesting. I took it along when the family went on a hike and man….I realized I had so much more fun crawling than I did bashing.

So – since then I have built a Vanquish Phoenix VS4-10, and I just finished an Axial SCX10 Pro build (well, I still need to do some work on it, but it is in drivable condition). Building those kits has been the best hobby I can think of. I’ve searched for a hobby to spend my free time on for a long, long time and I think I have finally found it.

There are two issues though. Number 1 – this hobby is a money dump!…..however I am pretty sure it is a lot cheaper than if I had gone into hobbies like fly fishing or hunting. Number 2 – living on an island in the middle of nowhere with a population of ~400,000 means that access to crawler parts not made by Traxxas is pretty much non-existent. So I need to order pretty much everything except original TRX4 parts from abroad. And even if I can find cheap stuff, shipping plus duty fees always adds a premium to every part, so I have to think carefully before making any orders.

But nevertheless I am lucky enough that there are stores in Germany and Asia that ship things pretty cheaply (and some even ship pretty fast, thank god for cheap FedEx shipping!), so as long as I put together a sizeable order it won’t be anything crazy. I am going to create a page here sharing the stores I primarily use along with a list of my parts, just for fun, in case it helps anyone in a similar situation.

Yet another summer is coming to an end…

First post in a long time!

Yet another summer is coming to an end. Work starts again tomorrow and things get back to normal.

During the summer I realized I needed to rebuild a small SMB environment for a friend, and I decided on Mikrotik for the networking (switches, routers), SuperMicro for the servers and Proxmox for the virtualization layer. I’m going to document the process here and find out the good, the bad and the ugly about those three vendors. Can’t wait to get started – I expect to have the equipment in my hands in the next ~4 weeks or so.

This is going to be a somewhat over-designed environment, but I am excited to see how these vendors stack up against the enterprise vendors I work with most of the time.

On-premise Kubernetes

For the better part of the year I have been playing around with Kubernetes on-premises. When I started testing random solutions I didn’t realize what a can of worms I had just opened! ……Don’t get me wrong – the whole Kubernetes ecosystem is extremely fun to “play” in.

But after trying multiple solutions a colleague of mine pointed me to a project called Rancher. This project is pretty cool!

Rancher makes the installation extremely easy (yes yes, I sound like a salesperson), but this was the most straightforward product I had seen (and yes, I have seen a few) in this space.

Out of the box the project offers multi-cluster management, support for AKS, EKS and other managed solutions, as well as an on-premises installation using either RancherOS (a custom Linux distro for running Kubernetes) or roll-your-own VMs/bare-metal instances (using, for example, CentOS). It can integrate with vSphere to spin up instances…..and it has decent Active Directory integration for authentication/authorization.

Rancher is deployed on a dedicated Kubernetes cluster (if it is set up for HA) that should just be used for Rancher. Then you can go ahead and add your own clusters from AKS/EKS or on-premises. It is a nice single pane of glass for operating your Kubernetes clusters. If you have environments all over the place, it can help you gain better control of them as well as offer a single place to interact with for things like deployments.
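If you want to try it, the documented install path is a Helm chart on an existing cluster – roughly like this (the hostname is a placeholder, and cert-manager needs to be installed first unless you bring your own certificates):

```
# Add the Rancher chart repo and install into its own namespace
helm repo add rancher-latest https://releases.rancher.com/server-charts/latest
kubectl create namespace cattle-system
helm install rancher rancher-latest/rancher \
  --namespace cattle-system \
  --set hostname=rancher.example.com
```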

While I won’t go into details (the documentation simply speaks for itself) I recommend you take a look at this project if you plan to start using Kubernetes for your organization, or even just to play with your own stuff.

And the best part? The project is fully open source. Rancher is also working on a persistent storage solution (Longhorn), and they offer professional services/support if you need some help along the way.

They also have a mini Kubernetes distro called K3s – a (very) small Kubernetes distribution that you can run on pretty much anything that boots Linux and manage in the same way.
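Getting a single-node K3s instance up is about as simple as it gets – the documented install is a one-liner (it needs root and network access, and starts K3s as a systemd service):

```
curl -sfL https://get.k3s.io | sh -

# K3s bundles kubectl, so you can check the node right away
sudo k3s kubectl get nodes
```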

Simply put, this is an amazing project! 🙂

Openconnect and GlobalProtect VPN!

Hi!

Just tried the GlobalProtect support in OpenConnect 8 (8.02 in Fedora 29).

Very simplified version:

sudo openconnect --protocol=gp your.vpn.gw.com

Worked like a treat! Hopefully I can stop using the official Linux client now.

Now – hopefully NetworkManager-openconnect adds support for connecting to GlobalProtect VPNs soon! 🙂

Bgrds,
Finnur

Palo Alto GlobalProtect on Fedora

After spending some serious time trying to get GlobalProtect 4.1.2 to work on Fedora 28 (and probably 27 earlier this year) I finally managed to get it working. It is almost embarrassing how easy it was…

  1. Replace /etc/redhat-release and /etc/os-release with the contents from RHEL 7 or CentOS 7
  2. Profit.

Yep….it’s sucky….but at least it shows that this works. Maybe it is possible to modify some file that lists supported operating systems……I will have to look into that later on.
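The file swap itself can be sketched like this – I’m staging the fake contents in /tmp here, and the actual copy into /etc (commented out) needs root. The exact strings are just what a CentOS 7 box reports, adjust as needed:

```shell
# Stage CentOS-7-style release files (contents as reported by a CentOS 7 box)
cat > /tmp/os-release <<'EOF'
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
EOF

echo "CentOS Linux release 7.9.2009 (Core)" > /tmp/redhat-release

# Back up the originals and swap the files in (requires root):
# sudo cp /etc/os-release /etc/os-release.bak
# sudo cp /etc/redhat-release /etc/redhat-release.bak
# sudo cp /tmp/os-release /etc/os-release
# sudo cp /tmp/redhat-release /etc/redhat-release
```

Remember to restore the backups if an OS upgrade ever needs the real files.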

Always read the release notes….and the supported OS lists…..and the error logs. Even better if you do it all in the same evening to puzzle this amazing solution together……

FYI: The error I was getting was: Error: Gateway my.gateway.hostname: The server certificate is invalid. Please contact your IT administrator.

Cisco UCS: vHBA bandwidth

Coming from an FC background, I never really understood how Cisco UCS vHBAs are configured in regard to bandwidth.

Finally I got it spelled out for me like I was five…..IT IS JUST AN ETHERNET PORT (yes yes…I knew that, but I really thought there was more magic involved). IT JUST SYNCS AT THE SAME SPEED AS THE FEX PORT IT IS CONNECTED TO. That means if you have a blade with a VIC, a 6332 FI and a 2304 FEX, and you do not have the port expander, it will be configured as a 20Gbit port (2x10Gbit) with a single flow maxing out at 10Gbit/s (to the FI – not taking into account breakout speeds from the FI to your Ethernet and storage networks). If you have the port expander for the VIC, you get native 40Gbit/s with the 2304 FEX and the 6332 FI (a single flow can reach 40Gbit/s from the FEX to the FI).

I had a real duh! moment there.

Now it is out there! Hopefully this can help some poor soul. I googled my life away for a couple of days and did not find a real answer.

Bgrds,
Finnur

Trying out an iPad Pro 12.9″ for sketching and drawing….and it is awesome!

I recently got my hands on an iPad Pro 12.9″, which I wanted to use for drawing sketches when I am working on issues or designs, since I always seem to need to visualize stuff when I am working (and attach the sketches to OneNote). My desk often looks like there has been a mass murder of post-its and notebooks. Those end up in the trash, and I end up having to hammer down a drawing in Gliffy (or I don’t…..which isn’t exactly a good thing, since I often would like to remember later on what I was sketching).

So – I got an iPad Pro 12.9″, an Apple Smart Keyboard and an Apple Pencil.

Now I guess I have to admit I might have laughed hysterically at those crazy people buying into the iPad Pro + Apple Pencil hype. More than once. Probably more than three times, even….I have owned an iPad in the past (I think it was the iPad 3), but I mostly used it for watching TV episodes before going to sleep. And Skype, a couple of times.

So I spent last night setting everything up. Sat down in my La-Z-Boy and got down to business – enrolling the iPad in our MDM, installing the apps I normally use on my laptop (SSH client, RDP client, Outlook, Word, Excel, PowerPoint, VPN client, etc.). Played around with the pencil.

My SO was sitting in her chair with her Surface Pro 4, doing her nightly surfing. She has been using the SP4 for two years if I remember correctly. She loves that thing. Probably more than me. But less than her cats.

After playing with the iPad for about an hour I start to realize how cool this device actually is. And how useful it is (I had my doubts it would actually be this useful). Start mumbling something about how awesome it is to be able to sketch with the pencil (which is better than any stylus I have tried before). Keep using the iPad. Keep mumbling about how awesome it is. She suddenly looks at me and says: “I told you multiple times – having a tablet with a pencil is extremely useful”. While I have thought about getting a Surface, I never acted on it – they are pretty expensive here in the land of ice and snow.

I guess I have to eat my words. The iPad Pro is actually one of the most useful devices I have used for work-related computing. And I can even use it for more than I thought I could do on an iPad – a lot of researching, hammering at those pesky SSH terminals, and replying to emails. I might even stop taking my laptop to meetings. And let me tell you – before last night I would never have told you I would replace my laptop for any task.

Don’t get me wrong – this device will not replace my laptop. But I will probably use the laptop less than before.

Hopefully I can pair a Bluetooth mouse with it and use it with my RDP client. I haven’t tried that yet. If it works, I don’t think I will take my laptop with me on short weekend trips.

I just have to say it. Apple might have created a market of devices we don’t really need – but the iPad Pro is one brilliant device. While iOS is a little limited as a desktop OS, it has a couple of things going for it that Linux and Windows can’t keep up with. The battery life is awesome (at least on this thing, and so was the battery life on my old iPhone 6s Plus). I am still on the first charge on this bad boy, and I probably have 8 hours of screen-on time already.

My Lenovo T460s laptop chews through the battery in just 3-4 hours. And that thing is just a year old. But I can probably blame Chrome + Extensions for that 🙂

Bgrds,
Finnur

UniFi Network kit – awesome stuff!

Recently I got fed up with yet another router (with integrated wireless) provided by my ISP. In the last 5-6 years I have gone through about 6 of them, with a horrible experience on each and every one. Well…..to be fair – they all worked as expected as routers, but the wireless function was just a joke. Most of them were different versions of Thomson routers provided by Siminn (my ISP at the time). Before all this I had always used a Linux-based router (and even OpenBSD and FreeBSD at some point!) along with a standalone access point, but due to limited time and other things going on I didn’t feel like building yet another one (and my second trusty WRT54G had just died), so I just started using the router from my ISP to get things going again.

I moved to another apartment in November and got yet another router from my ISP (but this time I finally got a fiber connection!). Again – the wireless signal was horrible in some of the rooms, so I started checking out new equipment.

My friends have been raving about the stuff from Ubiquiti – the EdgeMax routers and the UniFi APs. So I decided to try it out.

This stuff is brilliant! Luckily I have an old Linux box hooked up in a corner where I can run the UniFi controller software for the wireless access point. The configuration is very simple, and the pricing of the APs (and the routers) is a joke. I got a single UniFi AP-AC-LR and it just rocks.

Then I configured the EdgeRouter….for a kit that only costs about $99, this thing is just awesome. I’m late to the game, but this little dude can route 1 Mpps, which you don’t normally find in such a small box (or at least not when it was released). Guess I won’t have to worry about that on my 100Mbit connection 😉

The GUI is pretty self-explanatory, and if you have ever worked with Cisco/Juniper kit you will find your way around the CLI quickly as well.

So – for around ~$200 I finally have a home network that I don’t have to worry about!

For the next phase I am thinking about getting an NGFW…..still debating whether I will go with a small Fortigate, Juniper, Palo Alto or just a UniFi Security Gateway – it would be awesome to be able to inspect SSL!

 

Bgrds,
Finnur